ELT Process With Snowflake Stored Procedure and Task
ELT Highlights
• Cloud-based data warehouses offer virtually unlimited storage as well as scalable compute power.
This makes ELT more viable on cloud-based platforms.
• In an ELT workflow, only the data required for business decisions is transformed.
• ELT lets you load all forms of data as soon as they become available.
ELT Process
This article discusses how to integrate data from a Snowflake staging area into a final table
using a Snowflake stored procedure, and how to schedule that procedure with a "Task".
Stored Procedure
A stored procedure is useful for running one or more SQL statements, data transformations and data validations.
A stored procedure may contain one or many statements and can even call other stored procedures, passing
parameters through as required.
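As a rough illustration of the paragraph above, the sketch below shows what the body of a JavaScript stored procedure might look like when it runs one statement and then calls a second procedure with a parameter. Inside Snowflake the `snowflake` object is supplied by the platform; here `makeFakeSnowflake` is a hypothetical stand-in so the logic can run outside Snowflake. The table `load_audit` and the procedure `load_one_table` are invented names, not part of the original article.

```javascript
// Hypothetical stand-in for the `snowflake` object that Snowflake injects
// into JavaScript stored procedures; it only records what it is asked to run.
function makeFakeSnowflake(log) {
  return {
    execute({ sqlText, binds = [] }) {
      log.push({ sqlText, binds });
      let read = false;
      // Mimic a one-row result set, such as the return value of a CALL.
      return {
        next() { if (read) return false; read = true; return true; },
        getColumnValue() { return "SUCCESS"; },
      };
    },
  };
}

// Body of a hypothetical parent procedure. In Snowflake this code would sit
// between $$ ... $$ in CREATE PROCEDURE ... LANGUAGE JAVASCRIPT.
function runParent(snowflake, tableName) {
  // Plain SQL statement; the value is passed as a bind, not concatenated.
  snowflake.execute({
    sqlText: "DELETE FROM load_audit WHERE table_name = ?",
    binds: [tableName],
  });
  // Call a second (hypothetical) procedure and pass the parameter through.
  const rs = snowflake.execute({
    sqlText: "CALL load_one_table(?)",
    binds: [tableName],
  });
  rs.next();
  return rs.getColumnValue(1);
}

const log = [];
const status = runParent(makeFakeSnowflake(log), "CUSTOMER");
console.log(status, log.length);
```

Passing values through `binds` instead of string concatenation keeps quoting out of the SQL text, which is usually the safer choice for parameters supplied by a caller.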
At the time of writing, Snowflake supports the following UDF types:
• SQL
• JavaScript
SQL UDF:
• A SQL UDF evaluates an arbitrary SQL expression and returns either scalar or tabular results.
JavaScript UDF:
• A JavaScript UDF lets one use the JavaScript programming language to manipulate data and return either scalar
or tabular results.
ELT Process
By the end of this article, the reader should have a basic understanding of:
• How to create a stored procedure in Snowflake
• How to call one stored procedure from another procedure
• Variable concatenation / binding
The Snowflake stored procedure reads the metadata table, executes the corresponding SQL and returns the status:
ELT_SCHEMA_DETAILS
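The metadata-driven flow described above could be sketched as follows. This is a minimal illustration, not the article's actual procedure: the column names `TABLE_NAME` and `SQL_TEXT` are assumptions about what ELT_SCHEMA_DETAILS might contain, and `makeFakeSnowflake` is a hypothetical stand-in for the `snowflake` object that Snowflake provides inside a JavaScript stored procedure.

```javascript
// Hypothetical stand-in for Snowflake's injected `snowflake` object.
// `metadataRows` plays the role of ELT_SCHEMA_DETAILS; statements listed in
// `failing` throw, the way a bad statement would inside Snowflake.
function makeFakeSnowflake(metadataRows, failing = new Set()) {
  return {
    execute({ sqlText }) {
      if (failing.has(sqlText)) throw new Error("SQL compilation error");
      const rows = sqlText.includes("ELT_SCHEMA_DETAILS") ? metadataRows : [];
      let i = -1;
      return {
        next() { i += 1; return i < rows.length; },
        getColumnValue(name) { return rows[i][name]; },
      };
    },
  };
}

// Read each metadata row, run its SQL, and report an overall status string.
function runElt(snowflake) {
  const rs = snowflake.execute({
    sqlText: "SELECT TABLE_NAME, SQL_TEXT FROM ELT_SCHEMA_DETAILS",
  });
  const failed = [];
  let total = 0;
  while (rs.next()) {
    total += 1;
    const table = rs.getColumnValue("TABLE_NAME");
    try {
      snowflake.execute({ sqlText: rs.getColumnValue("SQL_TEXT") });
    } catch (err) {
      // Record the failure and continue with the remaining tables.
      failed.push(table + ": " + err.message);
    }
  }
  return failed.length === 0
    ? "SUCCESS: " + total + " statement(s) executed"
    : "FAILED: " + failed.join("; ");
}

// Example metadata (invented for illustration).
const rows = [
  { TABLE_NAME: "CUSTOMER", SQL_TEXT: "INSERT INTO customer SELECT * FROM stg_customer" },
  { TABLE_NAME: "ORDERS", SQL_TEXT: "INSERT INTO orders SELECT * FROM stg_orders" },
];
const okStatus = runElt(makeFakeSnowflake(rows));
const badStatus = runElt(makeFakeSnowflake(rows, new Set([rows[1].SQL_TEXT])));
console.log(okStatus);
console.log(badStatus);
```

Collecting failures per table, instead of stopping at the first error, is one possible design; a real procedure might prefer to abort and roll back.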
Note: Data can be loaded from the stage table into the target table with the Snowflake MERGE statement. MERGE
matches stage rows to target rows through a join condition, so the tables need a primary key (or another set of
columns that identifies each row uniquely).
MERGE INTO <target_table>
USING <source>
ON <join_expr> { matchedClause | notMatchedClause } [ ... ]
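To make the syntax above concrete, here is a small sketch of how a stored procedure might assemble a MERGE statement from a table's key and data columns. The helper `buildMerge`, the aliases `t` and `s`, and the table and column names are all invented for illustration; identifiers are concatenated into the text here, so a real procedure should only do this with trusted metadata.

```javascript
// Build an upsert-style MERGE statement from table and column names.
// keyCols form the join condition; dataCols are updated on a match.
function buildMerge(target, source, keyCols, dataCols) {
  if (keyCols.length === 0) {
    throw new Error("MERGE needs at least one key column");
  }
  const on = keyCols.map((c) => `t.${c} = s.${c}`).join(" AND ");
  const allCols = [...keyCols, ...dataCols];
  const lines = [`MERGE INTO ${target} t`, `USING ${source} s`, `ON ${on}`];
  // Only emit the matched clause when there is something to update.
  if (dataCols.length > 0) {
    const set = dataCols.map((c) => `t.${c} = s.${c}`).join(", ");
    lines.push(`WHEN MATCHED THEN UPDATE SET ${set}`);
  }
  lines.push(`WHEN NOT MATCHED THEN INSERT (${allCols.join(", ")})`);
  lines.push(`VALUES (${allCols.map((c) => "s." + c).join(", ")})`);
  return lines.join("\n");
}

const mergeSql = buildMerge("customer", "stg_customer", ["id"], ["name", "email"]);
console.log(mergeSql);
```

The generated text follows the pattern MERGE INTO, USING, ON, then one matched and one not-matched clause, which corresponds to the `{ matchedClause | notMatchedClause } [ ... ]` part of the syntax.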
A Snowflake task schedules a SQL statement or a stored procedure call on a Snowflake instance. Snowflake supports
cron-based job schedules. cron is the Linux counterpart of the Windows Task Scheduler, and it offers a simple
mechanism for running jobs on a schedule.
*/10 * * * * Every 10 minutes
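The minute field is easy to misread, so the sketch below expands it to the minutes it would match under standard cron semantics (assumed here to carry over to Snowflake's cron schedules). A bare `10` fires once per hour at minute 10, while `*/10` fires every ten minutes. The helper `expandMinuteField` is hypothetical and handles only `*`, a single number and `*/n`; real cron accepts more forms.

```javascript
// Expand the minute field of a cron expression into the minutes it matches.
// Supports only "*", a single number (0-59), and "*/n".
function expandMinuteField(field) {
  const all = Array.from({ length: 60 }, (_, m) => m);
  if (field === "*") return all;
  const step = /^\*\/(\d+)$/.exec(field);
  if (step) {
    const n = Number(step[1]);
    if (n < 1) throw new Error("step must be >= 1");
    return all.filter((m) => m % n === 0);
  }
  if (/^\d+$/.test(field) && Number(field) < 60) return [Number(field)];
  throw new Error("unsupported minute field: " + field);
}

const oncePerHour = expandMinuteField("10");
const everyTenMinutes = expandMinuteField("*/10");
console.log(oncePerHour);      // [ 10 ]
console.log(everyTenMinutes);  // [ 0, 10, 20, 30, 40, 50 ]
```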
Create Task