One of the most overdue Amazon Web Services (AWS) features is now available: Amazon Athena, which is bringing the data lake concept back into popular focus. Check out the long list of updates (and helpful blog posts) Amazon publishes on its dedicated Amazon Athena page. Amazon Athena is a groundbreaking change for the industry.

Amazon Athena

Athena is a service that lets you run SQL queries directly on data stored in Amazon S3. The distinctive benefit? It's serverless. There's nothing to install or deploy, and users pay only for what they query.
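As a rough illustration of what "running SQL directly on S3" can look like from code, here is a minimal sketch that uses the boto3 Athena client. The database name, table name, and results bucket are hypothetical placeholders, and the example assumes boto3 is installed and AWS credentials are configured before run_query is called.

```python
def build_query_request(sql, database, output_location):
    """Assemble the arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request):
    """Submit the query and return its execution ID (requires AWS access)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("athena")
    response = client.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical database, table, and bucket names, for illustration only.
request = build_query_request(
    sql="SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status",
    database="weblogs",
    output_location="s3://example-athena-results/output/",
)
print(request)
```

Calling run_query(request) only submits the query; the query runs asynchronously, so a real script would still need to poll for completion and then read the results from the output location.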

Presto powers Amazon Athena. Presto is a distributed SQL query engine for big data. It's fast, powerful, and scalable. We tested the service on a pile of data we had sitting on S3, and the performance was terrific.

Athena is comparable to Google BigQuery. You basically pay for storage (S3) and data queried. However, in Athena, you also have to handle the underlying data files, format, and directory structure. This makes it more complex to handle, but more flexible.
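To make the directory-structure point more concrete, here is a minimal sketch of a Hive-style partitioned layout (key=value folders), a common convention for data on S3 that is queried with SQL engines. The prefix, table name, partition columns, and helper function are all hypothetical and only illustrate the idea.

```python
from datetime import date


def partitioned_key(prefix, table, day, filename):
    """Build an S3 object key using Hive-style key=value partition folders.

    The prefix, table, and partition column names are illustrative assumptions.
    """
    parts = [
        prefix.strip("/"),
        table,
        f"year={day.year:04d}",
        f"month={day.month:02d}",
        f"day={day.day:02d}",
        filename,
    ]
    # Drop empty segments so an empty prefix does not produce a leading slash.
    return "/".join(p for p in parts if p)


key = partitioned_key("datalake", "events", date(2016, 12, 21), "part-0000.csv")
print(key)  # datalake/events/year=2016/month=12/day=21/part-0000.csv
```

Laying files out by date like this lets a query that filters on the partition columns read only the matching folders, which matters when you pay for the data queried.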

It's also helpful to compare Amazon Athena to Amazon Redshift. Redshift's performance is superior, but you're also bound to the Redshift cluster limitations and cost. Athena, on the other hand, is serverless.

Filling Up the Data Lakes

Amazon Athena goes a long way toward making Amazon S3 an ideal environment for an organizational data lake. With Athena, data from all sources goes into S3. It then gets queried on an ad-hoc basis in a performant, scalable, accessible manner. It's a data professional's dream come true.

A key thing to remember: you still need to get that data into Amazon S3. It can come from other services you use on Amazon, such as RDS or EMR; from services outside AWS (Salesforce, Mixpanel, Facebook, Google Analytics, etc.); or from other data stores on other platforms.

How Xplenty Can Help

The solution? Turn to Xplenty. Xplenty can write data to Amazon S3 in all the file formats Amazon Athena supports (CSV, TSV, JSON, and Parquet). Xplenty can also connect to more than 100 data sources and destinations so you can regularly pump data into your data lake from external data stores.

Another benefit? Xplenty is also a service, so companies using Xplenty don’t have to worry about maintenance or administration.

If you think that Xplenty could help you with Amazon Athena and your data lake, we’re happy to provide a demo, a seven-day free trial, and a free setup session with our implementation team. Just drop us a line at +1-888-884-6405 or set up a meeting with us here.

Originally Published: December 21, 2016