Redshift is Amazon’s data warehouse-as-a-service, a scalable columnar DB based on PostgreSQL. Xplenty can be used to offload ETL from Redshift, saving resources and allowing each platform to do what it does best: Xplenty for batch processing and Redshift for real-time querying.
Loading Data from Redshift
Data can be loaded from Redshift and processed by Xplenty by creating a Redshift connection and using this connection in a database source component. Note that Xplenty reads data directly from Redshift just like all other DBs, so the reading part isn’t currently offloaded while the processing is offloaded.
To integrate Redshift with Xplenty, first get the relevant details about your Redshift cluster:
- Login to AWS and go to the Redshift page
- Go to the “Clusters” page and select the relevant cluster
- Keep a note of the endpoint, port, database name, username, and password (the password is not available on this page)
Back in Xplenty, create the Redshift connection:
Login to Xplenty or sign up for free
Go to the “Manage connections” page
Click “new database connection” and select “Redshift”
Enter all the relevant connection details.
Click “Test connection” to make sure that it works and then click “Create connection.”
Now you can create a dataflow that processes the data:
Go to the package editor by clicking the “my packages” menu
Open an existing package or create a new one
Click “new source” at the top and select “database”
Choose the database connection that you have just created and fill in all the relevant fields—please see the documentation for full details.
Click OK and continue editing your package
Storing Results in Redshift
Xplenty can also store results back to Redshift once the batch processing is complete. The data is stored via S3, which is the fastest way to load data into Redshift. You can use Redshift’s workload management to make sure that storing the data has a minimal effect on queries.
- Add a new connection for the Redshift output, as described in the previous section
- Open the relevant package and add an “Amazon Redshift” destination by clicking the plus sign below the last component
- Choose the connection that you created and fill in all the relevant fields—please see the documentation for full details.
You can easily offload your ETL from Redshift by integrating it with Xplenty. Xplenty can load data from Redshift, process it, and then store the results back to Redshift. This way you can keep your ETL and data warehouse at maximum warp speed and save plenty of time and money.