Welcome to Xplenty's Blog

All things data

5 Reasons to Use an ETL Tool rather than “Script Your Own”

Why you need a data integration layer (ETL), and why a SaaS-based ETL tool is a better choice than coding it yourself with scripts.

Data Hoarding: A Bad Habit Companies Need to Overcome

In an interview with Datanami, Xplenty CEO Yaniv Mor explains the pitfalls of data hoarding and how to overcome it.

How to Use Data to Create Personalized Marketing Campaigns and Increase ROI

This post takes a look at how you can use Xplenty, a data integration tool, to create personalized data-driven marketing campaigns to increase your ROI.

Recap: Xplenty Data Panel → Big Data's 'Janitor' Problem - Is it Killing ROI?

May was a crazy month here at Xplenty. We started our event tour at Collision in Las Vegas. Then we headed over to NYC for the Data Summit, where CEO Yaniv Mor had two speaking slots, both packed to capacity!

Cloud Data Integration - Myth vs. Reality

So you’re on the cloud or plan to move there very soon.

Top Big Data Sessions for Data Summit 2015

If you want to know what the big players are doing with Big Data, then the Data Summit is the place to be. This Big Data conference, which will take place at the New York Hilton Midtown on May 11-13, will be attended by giants like Amazon, eBay, and even Pfizer and Mastercard. Dozens of engaging workshops will take place during the conference, so here are our picks for the top 10 sessions you can’t miss at Data Summit 2015.

11 Recommended Sessions for Collision 2015

What do U2’s lead singer Bono, pro skater Tony Hawk, and Netflix founder Reed Hastings have in common? Collision. Not a literal crash, but the huge tech conference that will take place in downtown Las Vegas May 5-6, 2015. Collision is related to Europe’s Web Summit and aims to bring all kinds of technology professionals together. Since there will be over 500 speakers at this year’s conference, you can’t go to everything. To help you get the most out of the conference, here are our favorite picks.

6 Lessons for Big Data Startups: Xplenty's 2014

Last year, we raised $3 million, got featured on TechCrunch, found new customers, hired more employees, attended conferences around the globe, spent thousands of hours on R&D, and invested a lot more effort in sales and marketing. Not everything was perfect, though. Now that 2014 is over and we have gained some perspective, here are six lessons that we learned as a Big Data startup.

Top 5 Big Data Events

Every year, dozens of Big Data conferences take place all over the world, from San Francisco to Shanghai. Now that 2015 is finally here, it’s time to open up your smartphone calendar and mark in this year’s Big Data conferences. Here are our five favorite events.

Xplenty’s Data Processing Survey

We want to know more about the Big Data community: what causes them headaches? What makes them happy? Which tools and technologies do they use? When we had our booth at AWS re:Invent 2014, we met as many people as possible and talked about their data needs. To get an even better picture, we conducted a little survey on the side.

5 Reasons You Need to Process Small Data with Hadoop

Big Data is mostly famous, well, for being big. So if you have anything under a petabyte, why would you even think about using Apache Hadoop? But you should.

How To Offload Data Processing from Google BigQuery

Google BigQuery is a great Big Data warehouse on the cloud for the SQL-savvy. But it’s not right for everything. Google itself recommends using Hadoop’s MapReduce rather than BigQuery for certain cases.

How to connect Xplenty to Amazon S3

Screencast on how to connect Amazon S3 with Xplenty.

How to Count Page Views and Visitors in Web Server Logs

We’ve uploaded a brand new screencast which shows how to process web server logs with Xplenty.

Enter The GitHub Data Challenge without Coding

What if you have a killer idea for GitHub’s Data Challenge but no money, servers, or coders at your disposal? We have the solution for you. You can sign up to Xplenty for free, process the data via our visual editor, and run it on a cluster, all without any installations or code. Let’s look at an example project to show you how it’s done.

Mining Dark Data without Hadoop

Hadoop is definitely a great solution for processing dark data, but what if you don’t know how to use it? Hadoop requires you to buy new hardware, provide expert maintenance, and hire developers to program MapReduce jobs. Luckily, there is an alternative — Xplenty.

World Cup 2014 - Australia vs. Netherlands Twitter Analysis

The 2014 World Cup is the hottest World Cup ever. Not just because of the soaring temperatures in Brazil that send players begging for water breaks, but also because of the high activity on social networks. Curious to take an in-depth look at what happens on Twitter during a game, we collected World Cup tweets during the Australia-Netherlands match.

Designing a Big Data Warehouse on the Cloud

In our previous post, we discussed Mad Men and how to design a data warehouse in the space age of Big Data. This post will take another step forward, or rather up, and examine how to design a data warehouse on the cloud.

Designing a Data Warehouse in the Age of Big Data

In "The Monolith", the fourth episode in Mad Men’s final season, a huge computer is installed in the center of the floor. The computer was brought in because a competing ad agency had one, and being an innovation at the time, it was a competitive advantage that clients were looking for, and something the agency’s talented creative team could never do. Just as advertising went through major changes in Don Draper’s time, data warehousing is going through changes in our time.

Processing Unstructured Data 101

Unstructured data is big - according to IDC, about 90 percent of the storage in the world is used for unstructured data. It comes as no surprise considering the amount of photos, videos, documents, and emails being generated on the web by the minute.

Amazon EMR vs. Xplenty

Amazon launched Elastic MapReduce (EMR) to make Hadoop easier, but there were still too many Hadoop hoops to jump through before processing Big Data. That’s why we founded Xplenty. Since we both claim to make working with Big Data easier, we decided to run a quick comparison of Xplenty vs. EMR.

What's the Cheapest Way to Store Big Data in the Cloud?

There are three ways to collect data on the cloud: storing it directly in the database, uploading log files, or logging via S3/CloudFront. Although we reviewed the pros and cons of each method, there was one aspect we didn't mention: price. Let's try to estimate how much collecting data on the cloud actually costs.

Hadoop ETL with Apache Pig

What does it mean to be a pig? Well, according to the philosophers behind the Apache Pig project, pigs eat anything, live anywhere, and are domestic animals. They even claim that pigs can fly!

Use Data on the Cloud to Measure KPIs

Huge amounts of data are needed to calculate key performance indicators (KPIs), a luxury that only large enterprises were able to afford. This post series discusses how companies of all sizes can measure KPIs by collecting and processing Big Data on the cloud.

Inmon vs. Kimball - The Big Data Warehouse Duel

In his recent article "Turbocharge Your Porsche - Buy An Elephant", Bill Inmon, "the father of data warehousing", criticizes Cloudera for associating Big Data with the data warehouse, two totally unrelated terms according to him. This marks a new round in the fight between two academic geezers, a decades-long argument over what a data warehouse is and how it should be implemented.

Hadoop in the Streets of London

Last week I packed my suitcase and got on a plane to London. The agenda - presenting at the February Hadoop Users Group UK meetup. The meetup was supposed to take place two weeks ago, but it was delayed due to a Tube strike. Fortunately the strike was suspended after unions reached a deal with the London Underground and the rescheduled event took place on time.
Almost a hundred Hadoop enthusiasts turned up. From genius techies to data newbies, everyone came to network over pizza and beer. I was really excited to see such a vibrant Hadoop community in London; that is not trivial at all.

4 Tips on Collecting Streaming Data

Readers of our blog should know by now that Apache Hadoop is great for offline batch processing of Big Data. But what about online streaming data? What if you’re running a ticker for the stock exchange or a real-time analytics dashboard? You might think that collecting streaming data is only relevant for big enterprises, but you don’t have to be The New York Stock Exchange to collect real-time data. Before you jump into the stream, here are 4 tips to get you started.

7 Tips to Improve ETL Performance

Consider for a moment, if you will, plastic patio furniture. Plastic Fantastic is a global manufacturer with several factories, warehouses, and plenty of stores. One can only imagine the sheer amount of data resulting from sales, production, suppliers, and finances. Everything that happens, from purchase and onward, to these chairs, tables, and cupboards in all corners of the world is measured.

Why Santa needs Big Data

Now, Hadoop! Now, SQL! Now, NOSQL, and Opensource! On, Cloud technology! On, Apps! On, SaaS, and Infrastructure!

Xplenty's First Day at SAP TechEd Pt. 1

First day in Vegas!

Big Data: Not Just for Nerds Anymore

Peanut butter and jelly. Peas and carrots. Forrest and Jenny. Bert and Ernie. Abbott and Costello. Sports and data? Of course.

ETL - Is it Still Relevant?

Buzz about Big Data has been at fever pitch for over a year now. We hear a lot about how the insights we glean will propel businesses, about emerging technologies, and companies merging. But how often do we hear about the guts behind Big Data, what makes it actually work? Maybe I’m wrong, but from what I read, not often enough. So to buck that trend, let’s dive into one of the main building blocks of traditional data warehousing, ETL, and see how it fits in with current Big Data architecture.