Did you know that choosing the right join type for your data could improve your data integration performance? For certain kinds of data sets, Pig provides specialized join algorithms that process them more efficiently, saving you (X)plenty of time. This post reviews them.
Welcome to Xplenty's Blog
All things data
Last year Cloudera published a blog post on Big Data’s new use cases: transformation, active archive, and exploration. There’s one more use case that isn’t explicitly mentioned: data integration.
Despite the Hadoop hype machine crunching away, not everyone is fond of that little yellow elephant. In fact, some fear it. But why should the cute mammal and the innovative data processing technology that it represents raise anxiety levels? Everyone has their reasons.
Childhood dreams do come true: in 2015, "Batman vs. Superman" will bring the world’s biggest superheroes to battle on-screen, finally settling the eternal debate over who will prevail (I put my Bitcoins on Batman).
Consider for a moment, if you will, plastic patio furniture. Plastic Fantastic is a global manufacturer with several factories, warehouses, and plenty of stores. One can only imagine the sheer amount of data resulting from sales, production, suppliers, and finances. Everything that happens to these chairs, tables, and cupboards, from purchase onward, in all corners of the world, is measured.