Welcome to Xplenty's Blog

All things data

Improving Pig Data Integration Performance with Join

Did you know that choosing the right join type for your data can improve your data integration performance? Pig provides several join algorithms, each designed to process data sets with certain characteristics efficiently, thus saving you (X)plenty of time. This post reviews them.

Hadoop Data Integration 101

Last year, Cloudera published a blog post on Big Data’s new use cases: transformation, active archive, and exploration. There’s one more use case that isn’t explicitly mentioned: data integration.

Fear of a Hadoop Planet

Despite the Hadoop hype machine crunching away, not everyone is fond of that little yellow elephant. In fact, some fear it. But why should the cute mammal, and the innovative data processing technology that it represents, raise anxiety levels? Everyone has their reasons.

Hadoop vs. Redshift

Childhood dreams do come true: in 2015, "Batman vs. Superman" will bring the world’s biggest superheroes to battle on-screen, finally settling the eternal debate over who would prevail (I put my Bitcoins on Batman).

7 Tips to Improve ETL Performance

Consider for a moment, if you will, plastic patio furniture. Plastic Fantastic is a global manufacturer with several factories, warehouses, and plenty of stores. One can only imagine the sheer amount of data resulting from sales, production, suppliers, and finances. Everything that happens to these chairs, tables, and cupboards, from purchase onward and in all corners of the world, is measured.