Welcome to Xplenty's Blog

All things data

Spark, Impala, Tez and Hive: Interview with David Gruzman

Big Data consultant David Gruzman answered some of our burning questions about which Big Data platform to use, whether streaming is a must, and what the biggest issues with the cloud are.

Hive vs. HBase

Comparing Hive with HBase is like comparing Google with Facebook: although they compete over the same turf (our private information), they don’t provide the same functionality. But things can get confusing for the Big Data beginner trying to understand what Hive and HBase do and when to use each of them. Let’s try to clear it up.

Hadoop Data Integration 101

Last year Cloudera published a blog post on Big Data’s new use cases: transformation, active archive, and exploration. There’s one more use case that isn’t explicitly mentioned: data integration.

12 SQL-on-Hadoop Tools

An overview of 12 open-source and commercial SQL-on-Hadoop tools: Apache Hive, Apache Sqoop, Apache Phoenix, Impala, Presto, BigSQL, CitusDB, Hadapt, Jethro, Lingual, and HAWQ.

8 SQL-on-Hadoop Challenges

Introducing Apache Hadoop to an organization can be difficult: everyone is trained and experienced in the old ways of SQL, and all the analytics tools integrate with SQL. Certain technologies can ease the transition to Hadoop by providing support for SQL on Hadoop.