0% found this document useful (0 votes)

5 views34 pages

Snowflake Data Storage and Management Guide

The document provides an overview of data storage and management in Snowflake, detailing its columnar storage format, micro-partitioning, and various data types. It covers table types, constraints, views, and access control mechanisms, including role-based access management. Additionally, it discusses features like Time Travel, zero-copy cloning, and dynamic tables for efficient data handling and querying.

Uploaded by

M jameel ahmed

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

5 views34 pages

Snowflake Data Storage and Management Guide

Uploaded by

M jameel ahmed

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

• Data Storage in Snowflake

• Supported Data Types

• Defining Constraints and Using Operators

Learning
• Database and Type with Features
Objectives

• Table Types with Features

• Types of Views with Features and Applications

• Role Based Access Control + Access Management

 When data is loaded into Snowflake, Snowflake reorganizes that data into its internal optimized,
compressed, columnar format. Snowflake stores this optimized data in cloud storage(AWS, GCP, AZURE).

 Row-based storage is ideal for transactional databases with frequent insertions and updates, while
column-based storage excels in analytical workloads involving large-scale data scans and aggregations

 Data files staged in Snowflake internal stages are not subject to the additional costs associated with Time
Travel and Fail-safe, but they do incur standard data storage costs
Feature (Row-Based) (Traditional RDBMS) (Columnar-Based) (Snowflake)

Storage Format Primarily row-based (row partitioning) Fully columnar storage

Data Retrieval Best for full record reads Best for accessing specific columns

Compression Moderate (block-level compression) High (columnar compression with encoding)

Slower for frequent small updates (uses micro-

Insert/Update Performance Faster for row-by-row inserts and updates
partitions)

Aggregation Queries Slower for large-scale aggregation Very fast due to columnar and MPP architecture

Storage Optimization Row-level locking and indexing Automatic clustering and partition pruning

Concurrency Good for mixed workloads (OLTP + OLAP) Excellent for concurrent analytic workloads

Best for operational systems, real-time Best for data warehousing, BI, and big data
Use Case Fit
transactions analytics
Cloud-native auto-scaling with separate
Scaling Shared-nothing, linear scaling
compute/storage
Minimal indexing, relies on columnar scan and
Indexing Heavily relies on indexes for performance
metadata

Example Scenario Bank transaction system, CRM systems Marketing analytics, reporting dashboards
Introduction
to Micro
partitions

Snowflake automatically organizes all data in tables into

micro-partitions, which are contiguous storage units
 Size: Each micro-partition contains between 50MB and 500MB of uncompressed data.

 Automatic Table Partitioning: Snowflake automatically creates and manages micro-partitions for all tables.

 Columnar Storage: Micro-partitions store data in a columnar format, enabling efficient compression and
query optimization.

 Metadata: Snowflake stores metadata for each micro-partition, including the range of values for each column,
distinct value counts, and other information that aids in query optimization.

 Query Performance: Improve query performance through horizontal and vertical query pruning, scanning only
the needed micro-partitions for better query performance.

 Immutable: Original micro-partitions in Snowflake remain immutable. When data changes new micro-
partitions are generated, and the existing ones are never altered.
Numeric Data Types
 DECIMAL, NUMERIC , NUMBER: Default precision and scale are (38,0).
 INT, INTEGER, BIGINT, SMALLINT, TINYINT, BYTEINT: Synonymous with NUMBER, except precision
and scale can’t be specified.
 FLOAT , FLOAT4 , FLOAT8: Snowflake treats all three as 64-bit floating-point numbers.

String & Binary Data Types

 VARCHAR: Default (and maximum) length is 16,777,216 bytes.
 CHAR, CHARACTER: Synonymous with VARCHAR, except the default length is VARCHAR(1).
 STRING, TEXT: Synonymous with VARCHAR.
 BINARY: The maximum length is 8 MB.

Logical Data Types

 BOOLEAN: Can have TRUE or FALSE values. BOOLEAN can also have an UNKNOWN value, which is
represented by NULL
Date and Time Data Types
 DATE: DATE accepts dates in the most common forms (YYYY-MM-DD, DD-MON-YYYY, and so on).
 TIME: Snowflake supports a single TIME data type for storing times in the form of HH:MI:SS
 TIMESTAMP_LTZ , TIMESTAMP_NTZ , TIMESTAMP_TZ.
a. TIMESTAMP_LTZ: TIMESTAMP with local time zone; time zone, if provided, isn’t stored.
b. TIMESTAMP_NTZ: TIMESTAMP with no time zone; time zone, if provided, isn’t stored.
c. TIMESTAMP_TZ : TIMESTAMP with time zone.
 DATETIME: It is an alias for TIMESTAMP_NTZ

Semi Structured Data Types

 VARIANT: A VARIANT value can have a maximum size of up to 128 MB of uncompressed data.
 OBJECT: An OBJECT value contains key-value pairs .
 ARRAY: An array contains 0 or more pieces of data. Each element is accessed by specifying its
position in the array.
 Snowflake supports the following constraint types : PRIMARY KEY, FOREIGN KEY, UNIQUE, NOT NULL but does
not enforce them, except for NOT NULL constraints, which are always enforced.

 Snowflake supports constraints on permanent, transient, temporary, and hybrid tables

Operators in
Snowflake
Types of
Tables in
Snowflake
 Standard table type in Snowflake. When you create a table without specifying the type, it becomes a permanent table
 Data stored in permanent tables persists indefinitely (subject to your Snowflake account retention policies) and is
available for querying until explicitly dropped.
 Permanent tables contribute to your overall storage costs.
 Use Cases: General-purpose data storage, data warehousing, business intelligence, long-term data retention
 Example: CREATE OR REPLACE TABLE PERMANENT_TBL (NAME VARCHAR(100))

 Temporary tables exist only for the duration of the session in which they were created.
 Once the session ends (e.g., you disconnect from Snowflake), the data and the table definition are automatically
dropped.
 Temporary tables do not contribute to your long-term storage costs.
 Use Cases: ETL/ELT staging, temporary workspaces for data transformation, storing intermediate query results.
 Example: CREATE OR REPLACE TEMPORARY TABLE TEMP_TABLE (NAME VARCHAR(100));
 Transient tables are an alternative to regular tables, designed for scenarios where data persistence isn’t a top
priority
 Transient tables are similar to permanent tables with the key difference that they do not have a Fail-safe period.
 Like permanent tables, transient tables add to the overall storage costs associated with your account.
 Use Cases: transient tables are specifically designed for transitory data that needs to be maintained beyond each
session
 Example: create transient table student (id number, name varchar(100));
 Snowflake also supports creating
transient databases and schemas.
 All tables created in a transient
schema, as well as all schemas
created in a transient database, are
transient by definition

 After creation, transient tables

cannot be converted to any other
table type.

 Temporary table can exist with

same name as permanent table
inside same schema however all
queries and other operations
performed in the session on the
table affect only the temporary
table.
Dynamic
Tables

 A dynamic table materializes the results of a query that you specify and is automatically kept up
to date on a schedule.

 Instead of creating a separate target table and writing code to transform and update the data in
that table, you can define the target table as a dynamic table, and you can specify the SQL
statement that performs the transformation.
 Dynamic Tables are refreshed at a frequency you specify, called “target lag”

 For example, you can specify that the data should be at most five minutes behind the updates to the base
table. Based on this target freshness, the automated process sets up refreshes so that the data in the
dynamic table is kept up to date within this target

 Dynamic tables support two refresh modes

a. Incremental refresh
b. Full refresh

 Dynamic table is used in below scenarios:

a. You don’t want to write code to track data dependencies and manage data refresh
b. You don’t need, or want to avoid, the complexity of streams and tasks.
c. You do need to materialize the results of a query of multiple base tables.
d. You don’t need fine-grained refresh schedule control and you just want to specify the target data freshness
for your pipelines.
 Database Object that contains SQL query built over one or more tables.

 Using a view, we can pull only selected columns from the underlying table and provide custom output to
users without letting them access table directly.
 Created to access data from one or more tables in a
custom manner compared to accessing individual
tables directly to avoid running complex SQL
Non – statements by business users.
materialized  Example: Employee table, department table, salary
Views table needs to be joined and provide comprehensive
view to manager to see the relevant columns only.
 Store the results of the underlying query as a physical table within
Snowflake.
 Joining of tables is not allowed.
 Snowflake automatically refreshes materialized views after changes are
made to the base tables they depend on.
Materialized  Example: Creating pre-aggregated data for dashboards and reporting

Views  Available in Enterprise Edition and Higher.

 Materialized view incurs both storage cost and compute cost.

• Limitations:
 Can query single table only
 Does not support normal joins or self joins
 Does not support all aggregate and window functions
 A secure view doesn’t allow users to see definition of view.

 Useful for concept of data sharing with secure views only.

 SECURE keyword needs to be used in view DDL to create

secure views.

 Definition is exposed only to the authorized users.

Secure Views
 The internals of a secure view are not exposed even in query
profile to any user.

 For queries containing secure views snowflake doesn’t expose

the amount of data scanned or the total amount of data.
 Zero-copy cloning in Snowflake creates a copy of a database, schema, or table without
physically duplicating the underlying data

 We can maintain multiple copies of data at no extra cost hence zero copy clone

 Changes made in the clone do not affect the original, and vice versa

 Zero-copy cloning is used for various purposes like testing, development, and data analysis.

 No additional storage costs are incurred until changes are made to the data

 Enables testing and development environments without impacting production data.

 Objects that can be cloned: Databases, Schemas , Tables, Streams , Stages , File Formats ,
Tasks.
Snowflake Time Travel enables accessing historical data (i.e. data that has been changed or deleted) at
any point within a defined period
 Using Time Travel, you can perform the following actions within a defined period of time:

a) Query data in the past that has since been updated or deleted.
b) Create clones of entire tables, schemas, and databases at or before specific points in the past.
c) Restore tables, schemas, and databases that have been dropped.

Retention Period
 Number of days for which historical data is preserved.
a) Standard Edition: 1 Day Retention Period (can be updated to 0)
b) Enterprise Edition and higher: 90 days retention period (can be updated b/w 0-90)

 Retention period of 0 days means TT is disabled for the object

 Higher the retention period, higher the storage cost

 Method 1: At Specified Timestamp
 SELECT * FROM my_table AT(TIMESTAMP => 'Wed, 26 Jun 2024 09:20:00 -0700'::timestamp_tz);

 Method 2: At Some Time Ago

 SELECT * FROM my_table AT(OFFSET => -60*5);

 Method 3: Before executing any statement or query

 SELECT * FROM my_table BEFORE(STATEMENT => '8e5d0ca9-005e-44e6-b858-a8f5b37c5726');

 Method 4: Undrop objects under retention period

 UNDROP TABLE ‘TABLE_NAME’;
 UNDROP SCHEMA ‘SCHEMA_NAME’;
 UNDROP DATABASE ‘DATABASE_NAME’;
 ACCOUNTADMIN: The god-mode role that can do anything on the account-level. ACCOUNTADMIN role inherits
all the privileges that SECURITYADMIN and SYSADMIN have.

 SECURITYADMIN - This role has the MANAGE GRANTS account-level privilege granted to it, allowing it to grant (or
revoke) grants on all objects in the account to (or from) other roles. Notice that SECURITYADMIN role inherits all
the privileges passed to it from the USERADMIN role.

 USERADMIN - As its name suggests, this role is used for creating and managing users (with the help of CREATE
USER and CREATE ROLE account-level privileges.

 SYSADMIN - This role is allowed to create all objects in the account (including databases, warehouses, and other
schema-level objects).

 PUBLIC - All users in the account are granted the PUBLIC role by default, but unless explicitly granted access to
objects, it can not do much. Still, be very careful not to accidentally grant it privileges on important data, as all
users in the account will be able to access it
 Discretionary Access Control (DAC)
Each object has an owner, who can in turn
grant access to that object.

 Role-based Access Control (RBAC)

Access privileges are assigned to roles, which
are in turn assigned to users.

 User-based Access Control (UBAC)

Access privileges are assigned directly to
users.
 User - An entity that enables a person (or service) to connect to Snowflake
 Object - An entity that a User can access (i.e. table, view, database), if they have the right privileges.
 Privilege - An operation that a User could execute on an Object, if their Role has been granted it.
 Role - A bridging entity between Users and Privileges. Privileges are granted to Roles, and Roles are granted
to Users.
 Secondary Roles: Additional roles that can be activated alongside a user's primary role within a session
 USE SECONDARY ROLES ALL: Activates all roles granted to the user as secondary roles, in addition to the
primary role.
 USE SECONDARY ROLES NONE: Disables all secondary roles.
 USE SECONDARY ROLES <role1>, <role2> : Activates a specific set of roles as secondary roles.

 Future Grants: As new objects are created, the defined privileges are automatically granted to a role, simplifying
grant management
Example:
-- Grant the SELECT privilege on all new (future) tables in a schema to role R1
GRANT SELECT ON FUTURE TABLES IN SCHEMA s1 TO ROLE r1;
-- Granting USAGE and MONITOR on all future schemas in a database
GRANT USAGE, MONITOR ON FUTURE SCHEMAS IN DATABASE my_database TO ROLE admin_role;

Snowflake Data Warehousing Essentials
100% (1)
Snowflake Data Warehousing Essentials
40 pages
Snowflake Data Access and Management
No ratings yet
Snowflake Data Access and Management
22 pages
Detailed Snowflake Notes
No ratings yet
Detailed Snowflake Notes
8 pages
Snowflake Architecture and Features Overview
No ratings yet
Snowflake Architecture and Features Overview
8 pages
Snowflakecheatsheetv 1
No ratings yet
Snowflakecheatsheetv 1
17 pages
Snowflake Data Warehouse Overview
No ratings yet
Snowflake Data Warehouse Overview
6 pages
Snowflake Components and SnowSQL Setup
No ratings yet
Snowflake Components and SnowSQL Setup
45 pages
Understanding Snowflake Architecture
100% (1)
Understanding Snowflake Architecture
35 pages
Snowflake
No ratings yet
Snowflake
7 pages
Snowflake Snowpro Core Ebook
No ratings yet
Snowflake Snowpro Core Ebook
12 pages
Understanding Snowflake Stages Explained
No ratings yet
Understanding Snowflake Stages Explained
51 pages
Snowflake Cloud Data Platform Overview
No ratings yet
Snowflake Cloud Data Platform Overview
108 pages
Introduction to Snowflake Data Warehouse
No ratings yet
Introduction to Snowflake Data Warehouse
32 pages
Snowflake Query Management Essentials
No ratings yet
Snowflake Query Management Essentials
122 pages
Snowflake Architecture and Data Management
100% (1)
Snowflake Architecture and Data Management
51 pages
Overcoming Data Warehouse Challenges with Snowflake
No ratings yet
Overcoming Data Warehouse Challenges with Snowflake
5 pages
Snowflake Database Overview and Features
No ratings yet
Snowflake Database Overview and Features
10 pages
Snowflake Cloud Data Warehouse Overview
No ratings yet
Snowflake Cloud Data Warehouse Overview
73 pages
Snowflake Types of Tables Guide
No ratings yet
Snowflake Types of Tables Guide
4 pages
Snowflake vs. Redshift vs. BigQuery
No ratings yet
Snowflake vs. Redshift vs. BigQuery
4 pages
Snowflake Data Migration and Architecture Guide
No ratings yet
Snowflake Data Migration and Architecture Guide
102 pages
Snowflake Data Warehouse Architecture
No ratings yet
Snowflake Data Warehouse Architecture
20 pages
Snowflake Programming Masterclass 2024
100% (1)
Snowflake Programming Masterclass 2024
342 pages
Snowflake Data Warehouse Overview
No ratings yet
Snowflake Data Warehouse Overview
16 pages
Snowflake Architecture Overview
No ratings yet
Snowflake Architecture Overview
220 pages
Snowflake NoSQL Column-Oriented Techniques
No ratings yet
Snowflake NoSQL Column-Oriented Techniques
26 pages
Snowflake Tutorial and Key Concepts Guide
91% (11)
Snowflake Tutorial and Key Concepts Guide
67 pages
Snowflake Data Warehouse Features Guide
No ratings yet
Snowflake Data Warehouse Features Guide
68 pages
Snowflake Architecture and Key Concepts
No ratings yet
Snowflake Architecture and Key Concepts
3 pages
Snowflake Architecture and Features Explained
No ratings yet
Snowflake Architecture and Features Explained
4 pages
Snowflake - Interview Questions
No ratings yet
Snowflake - Interview Questions
15 pages
Snowflake Data Management Features
No ratings yet
Snowflake Data Management Features
16 pages
Snowflake Views vs Tables Explained
No ratings yet
Snowflake Views vs Tables Explained
12 pages
Snowflake Training Overview and Features
No ratings yet
Snowflake Training Overview and Features
136 pages
Ilovepdf Merged
No ratings yet
Ilovepdf Merged
71 pages
Comprehensive Snowflake MasterClass Guide
33% (3)
Comprehensive Snowflake MasterClass Guide
192 pages
Snowflake Object Management Guide
No ratings yet
Snowflake Object Management Guide
10 pages
Snowflake Basics
No ratings yet
Snowflake Basics
11 pages
Snowflake 10 Page Study Guide
No ratings yet
Snowflake 10 Page Study Guide
10 pages
Understanding Snowflake Architecture
No ratings yet
Understanding Snowflake Architecture
5 pages
SnowPro Core Certification Cheat Sheet
No ratings yet
SnowPro Core Certification Cheat Sheet
2 pages
Snowflake: Company Origins and Features
0% (1)
Snowflake: Company Origins and Features
15 pages
Snowflake 03
No ratings yet
Snowflake 03
10 pages
Snowflake Interview Q&A Cheat Sheet
No ratings yet
Snowflake Interview Q&A Cheat Sheet
2 pages
Snowflake Architecture Deep Dive
No ratings yet
Snowflake Architecture Deep Dive
51 pages
Snowflake Data Warehousing Overview
No ratings yet
Snowflake Data Warehousing Overview
10 pages
Snowflake Architecture
No ratings yet
Snowflake Architecture
9 pages
Snowflake Data Cloud Overview
No ratings yet
Snowflake Data Cloud Overview
2,157 pages
Snowflake Database Architecture Overview
No ratings yet
Snowflake Database Architecture Overview
20 pages
E-Invoicing and FX Rate Data Pipelines
No ratings yet
E-Invoicing and FX Rate Data Pipelines
12 pages
Snowflake Overview and Key Features
No ratings yet
Snowflake Overview and Key Features
20 pages
Snowflake Connection and DDL Guide
No ratings yet
Snowflake Connection and DDL Guide
50 pages
Snowflake Interview Preparation Guide
No ratings yet
Snowflake Interview Preparation Guide
14 pages
Snowflake: Key Concepts Overview
No ratings yet
Snowflake: Key Concepts Overview
2 pages
Snowflake Cloud Database Overview
No ratings yet
Snowflake Cloud Database Overview
50 pages
Snowflake Course Overview and Guide
No ratings yet
Snowflake Course Overview and Guide
83 pages
Overview of AWS Redshift Features
No ratings yet
Overview of AWS Redshift Features
145 pages
DDoS Attack Patterns and Mitigation
No ratings yet
DDoS Attack Patterns and Mitigation
66 pages
Harley Benton fxl8 Pro
No ratings yet
Harley Benton fxl8 Pro
44 pages
Information Systems For Managers, CH 4
No ratings yet
Information Systems For Managers, CH 4
89 pages
Chapter 1.3 - Review of Elementary Data Structures
No ratings yet
Chapter 1.3 - Review of Elementary Data Structures
7 pages
Design and Development of Centralized Squid Proxy Management System
No ratings yet
Design and Development of Centralized Squid Proxy Management System
6 pages
Computer Networks and Data Science Overview
No ratings yet
Computer Networks and Data Science Overview
4 pages
Evolution of Information Technology
No ratings yet
Evolution of Information Technology
8 pages
Overview of SONET Technology
No ratings yet
Overview of SONET Technology
9 pages
Online Cake Order System Overview
No ratings yet
Online Cake Order System Overview
41 pages
Royal Greenland's SAP Digital Transformation
No ratings yet
Royal Greenland's SAP Digital Transformation
7 pages
ERP System Functional Requirements FRS
No ratings yet
ERP System Functional Requirements FRS
16 pages
Understanding Blockchain Technology Basics
No ratings yet
Understanding Blockchain Technology Basics
26 pages
EA-IO Addressable I/O Module Manual
No ratings yet
EA-IO Addressable I/O Module Manual
4 pages
Understanding IEEE 802.5 Token Ring
No ratings yet
Understanding IEEE 802.5 Token Ring
4 pages
SLC Facilitator Day-Of Guide
No ratings yet
SLC Facilitator Day-Of Guide
4 pages
Database and BI Developer Resume
No ratings yet
Database and BI Developer Resume
2 pages
Set Up Hadoop Single Node Cluster
No ratings yet
Set Up Hadoop Single Node Cluster
3 pages
CTF Problem Solving Strategies
No ratings yet
CTF Problem Solving Strategies
16 pages
MedWrench User Guide and Resources
No ratings yet
MedWrench User Guide and Resources
10 pages
KMIT Programming for Problem Solving Lab
No ratings yet
KMIT Programming for Problem Solving Lab
136 pages
Anand Raghunathan Raghunathan@purdue - Edu: ECE 695R: S - C D
No ratings yet
Anand Raghunathan Raghunathan@purdue - Edu: ECE 695R: S - C D
12 pages
Mesh-TensorFlow: Advanced DNN Training
No ratings yet
Mesh-TensorFlow: Advanced DNN Training
10 pages
Machine Learning & Robotics in Automation
No ratings yet
Machine Learning & Robotics in Automation
9 pages
Cryptography and Network Security Solutions
92% (13)
Cryptography and Network Security Solutions
196 pages
Individual Business Report
No ratings yet
Individual Business Report
13 pages
3D Printer Setup Guide in Serbian
No ratings yet
3D Printer Setup Guide in Serbian
2 pages
HiP19 Cracking Mifare Classic On The Cheap Workshop
100% (1)
HiP19 Cracking Mifare Classic On The Cheap Workshop
78 pages
E-commerce API Development Guide
100% (1)
E-commerce API Development Guide
4 pages
WiFi-Controlled Smart Quadcopter Design
No ratings yet
WiFi-Controlled Smart Quadcopter Design
10 pages
AI in Computer Vision: Applications & Impact
No ratings yet
AI in Computer Vision: Applications & Impact
27 pages

Snowflake Data Storage and Management Guide

Uploaded by

Snowflake Data Storage and Management Guide

Uploaded by

• Data Storage in Snowflake

• Supported Data Types

• Defining Constraints and Using Operators

• Table Types with Features

• Types of Views with Features and Applications

• Role Based Access Control + Access Management

Storage Format Primarily row-based (row partitioning) Fully columnar storage

Compression Moderate (block-level compression) High (columnar compression with encoding)

Slower for frequent small updates (uses micro-

Snowflake automatically organizes all data in tables into

String & Binary Data Types

Logical Data Types

Semi Structured Data Types

 Snowflake supports constraints on permanent, transient, temporary, and hybrid tables

 After creation, transient tables

 Temporary table can exist with

 Dynamic tables support two refresh modes

 Dynamic table is used in below scenarios:

Views  Available in Enterprise Edition and Higher.

 Useful for concept of data sharing with secure views only.

 SECURE keyword needs to be used in view DDL to create

 Definition is exposed only to the authorized users.

 For queries containing secure views snowflake doesn’t expose

 Enables testing and development environments without impacting production data.

 Retention period of 0 days means TT is disabled for the object

 Higher the retention period, higher the storage cost

 Method 2: At Some Time Ago

 Method 3: Before executing any statement or query

 Method 4: Undrop objects under retention period

 Role-based Access Control (RBAC)

 User-based Access Control (UBAC)

You might also like