Way back when, I worked with business intelligence software that gave customers a platform for presenting data as interactive visualizations (think dashboards and pretty charts).
Through various conversations with customers and BI people, I first encountered the term "ETL" (Extract, Transform, Load), or what data people affectionately call "Extremely Tough to Load."
What I learned about ETL
ETL is a critical, necessary process for almost all analytics projects. To get a feel for it, I started with the basics: chatting with every data engineer I knew to hear about their challenges, and researching existing ETL tools in the marketplace.
My logic was that if I could find an easier way for my clients to handle ETL, I’d be able to implement my business intelligence software for them faster. Unfortunately, I ran into the same harsh reality they had described: ETL is complicated, with many challenges along the way, and implementation can be a daunting task.
Common ETL implementation challenges:
Lack of engineering resources to write custom code for everything, from establishing data connections and building a pipeline to scheduling and maintenance.
Steep learning curve for those that are able to invest in ETL software.
Uncertainty when it comes to scale. No one can predict the volume of data that will need to be processed down the line.
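For a sense of what "custom code for everything" can mean, here is a minimal sketch of a hand-written pipeline in Python. It is only an illustration: the sample data, the `orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for a real destination.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a real source system.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,carol,7.25
"""


def extract(text):
    """Extract: read rows from CSV text as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with no amount, then fix types and casing."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return cleaned


def load(conn, records):
    """Load: write the cleaned records to the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(conn, text):
    """Run the three stages in order."""
    load(conn, transform(extract(text)))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_pipeline(conn, RAW_CSV)
    print(conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall())
```

Even this toy version leaves out the hard parts from the list above: real connections and credentials, scheduling, error handling, and ongoing maintenance.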
During my research, I was introduced to Xplenty and my curiosity was piqued. The idea of simplified ETL, without any lines of code or anything to deploy, sounded too good to be true. I remained skeptical, so I went ahead with my own due diligence.
Data Processing without Programming
I was pleased to see that Xplenty had simplified the ETL process to the extent that someone without any programming background (ahem, me) could easily create an ETL pipeline in a matter of minutes.
I’ll take you through my process.
Step 1: Log in to Xplenty
Xplenty’s interface is cloud-based, so to get started, you sign up, get a free trial and go.
There is no software to set up or deploy because it is a SaaS product. How great is that?!
Step 2: Establish Source and Destination connection
Xplenty has a variety of connections you can choose from, including Amazon S3, Amazon Redshift, Google BigQuery, SFTP, Google AdWords/Analytics, SQL, Postgres and MongoDB, to name a few.
Step 3: Insert your credentials
Add the necessary credentials to your data source and get connected within seconds.
Step 4: Build Your ETL pipeline
(At Xplenty, these pipelines are called ‘packages’.)
The interface is incredibly easy to use. The development team invested a lot of time creating a product that would be user friendly for data engineers and BI users alike.
Step 5: Execute your ETL pipeline
Create a cluster (the infrastructure) to execute the job. As the end user, I decided how many resources to allocate by dragging a button. This is my favorite Xplenty feature because it makes it so easy to scale up or down depending on data volume and/or the complexity of the transformation.
Step 6 (Optional): Create a schedule for your ETL process
With the ability to read incremental data loads, why not automate the ETL process and simplify life? (So I did.)
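For readers curious about what an incremental load involves under the hood, one common approach is to remember a "watermark" (the latest timestamp already processed) and pick up only newer rows on each scheduled run. The sketch below illustrates that general idea in Python. It is not a description of how Xplenty implements it, and the `updated_at` field and `incremental_batch` function are hypothetical names.

```python
from datetime import datetime


def incremental_batch(rows, last_watermark):
    """Return rows changed after the watermark, plus the updated watermark."""
    new_rows = [row for row in rows if row["updated_at"] > last_watermark]
    # If nothing is new, keep the old watermark so the next run starts there.
    new_watermark = max(
        (row["updated_at"] for row in new_rows), default=last_watermark
    )
    return new_rows, new_watermark


# Hypothetical source rows, each carrying a last-modified timestamp.
SOURCE_ROWS = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 2)},
    {"id": 3, "updated_at": datetime(2024, 1, 3)},
]

if __name__ == "__main__":
    # First scheduled run: everything after January 1 is new.
    batch, watermark = incremental_batch(SOURCE_ROWS, datetime(2024, 1, 1))
    print([row["id"] for row in batch], watermark)

    # Second run with no new data: the batch is empty and the watermark holds.
    batch, watermark = incremental_batch(SOURCE_ROWS, watermark)
    print([row["id"] for row in batch], watermark)
```

A scheduler would call something like this on each run and store the returned watermark somewhere durable between runs.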
There were a few other steps in between, such as defining the business logic within each transformation component, but it typically took under an hour to set up a package and execute it. That is certainly more efficient than spending countless hours coding or reading through pages and pages of documentation on existing ETL platforms.
I was sold. So much so that I joined the Xplenty team.