Snowflake, the Elastic Data Warehouse in the Cloud, has several exciting features such as Virtual Warehouse, Time Travel, Zero Copy cloning, Secure Views, and Undrop. Through these offerings, Snowflake has re-imagined and re-invented what a data warehouse can do. In this article, we will explore two of these - Time Travel & Zero Copy Cloning - from within Integrate.io. 

For a detailed view of Snowflake architecture, here’s a reference. To connect to Snowflake from Integrate.io, click here.

For more information on Integrate.io's native Snowflake connector, visit our Integration page.

Table of Contents

  1. Time Travel
  2. Zero Copy Cloning
  3. Summary

Time Travel 

Time Travel enables accessing historical data at any point within a defined period. By default, the maximum retention period is 1 day (i.e. one 24 hour period), as mentioned in the Snowflake documentation on Understanding & Using Time Travel. With Snowflake Enterprise Edition (and higher), this default can be set to any value up to 90 days.

For a quick demonstration, here is a sample daily recurring data pipeline loading one week’s data incrementally from a Google Analytics connection onto Snowflake.

thumbnail image

The table GA_USERS in Snowflake shows the current data (T2 from the image below): 

thumbnail image

thumbnail image

Let’s say at T2 (present-day)we need to Time-Travel to the table at T0 (yesterday). This can simply be done with a single query.

From Integrate.io, lets source data from the Snowflake table (Snowflake source is accessible via the DB source component) on a separate pipeline with the query

select * from GA_USERS at(timestamp => '2020-09-21 18:18:59.769 -0700'::timestamp)

thumbnail image

This gives the records with date ending at 20200921.


thumbnail image

Offset

Another option when using the Time Travel feature is the ‘offset’.

To Time-Travel from T2 to T1(say 60 min prior before T2), we can time offset from the present to source the data. Similar to the pipeline mentioned above, let’s source data from Snowflake (accessible via the DB source component) and run the query:

select * from GA_USERS at(offset => -60*60);  

thumbnail image

This sources the data from the table 60 mins ago (which had 3 records deleted) and shows the preview as below.

thumbnail image

Data_retention_time_in_days

As stated above, the standard retention period is 1 day and is automatically enabled for all Snowflake accounts. The DATA_RETENTION_TIME_IN_DAYS is an Object parameter that can be used by users with the ACCOUNTADMIN role to set or control the default retention for the Snowflake account. More information on this is available here.

The Time Travel feature can be a powerful tool for:

  1. Picking up and processing from the desired state on the table.
  2. Processing of historical data to make up for any accidental inclusions or deletions on the data. 

Zero Copy Cloning

Dubbed as one of the most powerful Snowflake features, Zero Copy Cloning offers an easy way to clone and test the database without physically copying the data. For more information, refer to the Snowflake documentation.

From Integrate.io, furthering our data processing from before, we can create a clone of the Prod Database.

thumbnail image

thumbnail image

Using CREATE DATABASE DEV CLONE PROD;  as a Pre-action SQL command creates a database clone DEV with the table GA_USERS;

(Note: This requires SYSADMIN privileges to create a Database)

Or we can clone a table with CREATE OR REPLACE TABLE GA_USERS_clone CLONE GA_USERS;

This would be a great way to take a snapshot before doing a truncate and using this to compare incremental loading.

Taking the process a step further, we could combine Time-Travel + Clone on the Integrate.io Snowflake destination to capture a historical snapshot of the table on Test DB (Post-action SQL) while creating a clone on Dev (Pre-action SQL).

thumbnail image

thumbnail image

This creates multiple copies of the tables without replicating the data itself, while also providing the ability to quickly make the data available to use for multiple user groups, without the additional cost and time spent in actually replicating the data.

Summary

Zero copy cloning and Time Travel are truly remarkable features that simplify data warehousing and keep it agile. These features and more can be leveraged from within Integrate.io. To explore more of Snowflake’s native integration with Integrate.io, contact us to book your demo today.