Snowflake — Data on Cloud
Thanks for reading and subscribing to my blog. Alongside the GCP series, I am starting a Snowflake series covering a few topics. I believe this series will help you understand Snowflake and its key features.
This is the very first blog in the series, where we will learn what Snowflake is and how its architecture sets it apart from other data-on-cloud providers.
What is Snowflake?
Snowflake is a data-on-cloud platform offered as SaaS (Software as a Service). Snowflake is not built on any existing database or Hadoop/big data platform. It is a data offering built from scratch for the cloud.
Why is Snowflake SaaS?
Snowflake is a SaaS offering with features like:
1. There is no hardware to select, install, or configure for compute or storage
2. There is no software to install or maintain for Snowflake
3. As with any other managed service, ongoing patching, maintenance, and upgrades are taken care of by Snowflake
Where is Snowflake hosted?
Snowflake runs completely on cloud infrastructure. All Snowflake services (storage, compute, and metadata management) run on public cloud infrastructure such as AWS, Microsoft Azure, or Google Cloud. Snowflake cannot be run on private infrastructure.
How to install Snowflake?
Snowflake is not offered as packaged software that a user can install. Snowflake manages its own setup and delivers it to customers as a service.
What is the Snowflake Architecture?
Snowflake is a hybrid of shared-disk and shared-nothing architectures. Like a shared-disk architecture, Snowflake has a centralized storage layer that is accessible from all compute nodes/warehouses. Like a shared-nothing architecture, Snowflake processes queries using MPP (massively parallel processing) compute clusters. This combination offers the data management simplicity of a shared-disk architecture with the performance and scale-out benefits of a shared-nothing architecture.
What are the Snowflake Architecture Components?
The Snowflake architecture has three components. It is referred to as a three-layered architecture as well as a multi-cluster, shared data architecture. The three key layers are:
Database Storage
Query Processing
Cloud Services
The sections below describe each layer and how the layers integrate to form Snowflake's unique shared data architecture.
Database Storage
This is the layer where data is stored. Snowflake reorganizes data into its internal optimized, compressed, columnar format and stores it in cloud storage.
Snowflake manages every aspect of how this data is stored: the organization, file size, structure, compression, metadata, statistics, and other aspects of data storage.
The data objects stored by Snowflake are not directly visible or accessible to customers; they are accessible only through SQL query operations run using Snowflake.
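To make the idea of a compressed, columnar format a little more concrete, here is a minimal Python sketch. Please treat it as an illustration only: Snowflake's actual storage format is internal and not visible to customers, and the helpers to_columns, compress_column, and decompress_column are hypothetical names I made up, not Snowflake APIs. The sketch pivots rows into columns and compresses each column separately, so a query that needs one column only has to read that column.

```python
import json
import zlib


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def compress_column(values):
    """Serialize one column as JSON and compress it with zlib."""
    return zlib.compress(json.dumps(values).encode("utf-8"))


def decompress_column(blob):
    """Reverse compress_column and return the original list of values."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Toy data set (hypothetical): 1000 rows with three columns.
rows = [
    {"id": i, "country": "IN" if i % 2 else "US", "amount": 100}
    for i in range(1000)
]

columns = to_columns(rows)
stored = {name: compress_column(values) for name, values in columns.items()}

# A query that needs only "amount" reads only that column's blob.
amounts = decompress_column(stored["amount"])
print(sum(amounts))
```

Values within one column tend to be similar, which is why column-wise storage usually compresses well; here the repetitive amount column shrinks to a small fraction of its serialized size.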
Query Processing
This is the layer where query execution takes place. Snowflake runs queries using compute called “virtual warehouses”. Warehouses of different sizes can be spun up to meet customer needs. Each virtual warehouse is an MPP compute cluster that Snowflake allocates from the cloud provider.
Every virtual warehouse is independent and does not share compute resources with other warehouses. As a result, one warehouse has no impact on the performance of another.
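The relationship between shared storage and independent warehouses can be sketched in a few lines of Python. This is a toy model under my own assumptions, not how Snowflake is implemented: SharedStorage, VirtualWarehouse, and the warehouse names ETL_WH and BI_WH are all hypothetical, and threads stand in for the nodes of an MPP cluster. The point it shows is that both warehouses read the same single copy of the data, while each one owns its own pool of workers.

```python
from concurrent.futures import ThreadPoolExecutor


class SharedStorage:
    """Central storage layer: one copy of the data, visible to every warehouse."""

    def __init__(self):
        self.tables = {}

    def write(self, table, values):
        self.tables[table] = list(values)

    def read(self, table):
        return self.tables[table]


class VirtualWarehouse:
    """Independent compute cluster with its own private pool of workers."""

    def __init__(self, name, storage, nodes):
        self.name = name
        self.storage = storage  # shared with other warehouses
        self.nodes = nodes
        self.pool = ThreadPoolExecutor(max_workers=nodes)  # not shared

    def sum_column(self, table):
        data = self.storage.read(table)
        # MPP-style: split the data into one chunk per node,
        # compute partial sums in parallel, then combine them.
        chunks = [data[i::self.nodes] for i in range(self.nodes)]
        partials = list(self.pool.map(sum, chunks))
        return sum(partials)


storage = SharedStorage()
storage.write("sales", range(1, 101))

etl_wh = VirtualWarehouse("ETL_WH", storage, nodes=4)
bi_wh = VirtualWarehouse("BI_WH", storage, nodes=2)

print(etl_wh.sum_column("sales"), bi_wh.sum_column("sales"))  # 5050 5050
```

Because each warehouse has its own pool, a heavy job on ETL_WH does not take workers away from BI_WH in this model, which mirrors the independence described above.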
Cloud Services
This is the service layer of Snowflake, a collection of services that coordinate activities across Snowflake. These services process user requests end to end, from login to query execution. This layer also runs on compute that Snowflake provisions from the cloud provider.
Some of the key services managed in this layer include:
Authentication
Infrastructure management
Metadata management
Query parsing and optimization
Access control
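To show how these services could fit together for a single request, here is a small Python sketch. It is purely illustrative and based on my own assumptions: USERS, GRANTS, CATALOG, and the functions authenticate, parse, and handle_request are hypothetical names, and the parser understands only one query shape. The flow is authentication first, then parsing, then a metadata lookup, then access control.

```python
# Hypothetical toy data; a real system would never store plain-text passwords.
USERS = {"analyst": "s3cret"}            # user -> password
GRANTS = {"analyst": {"sales"}}          # user -> tables the user may read
CATALOG = {"sales": {"row_count": 100}}  # toy metadata store


def authenticate(user, password):
    """Authentication: check the supplied credentials."""
    return user in USERS and USERS[user] == password


def parse(sql):
    """Very naive parser: supports only 'SELECT COUNT(*) FROM <table>'."""
    tokens = sql.strip().rstrip(";").split()
    if len(tokens) != 4 or [t.upper() for t in tokens[:3]] != ["SELECT", "COUNT(*)", "FROM"]:
        raise ValueError("unsupported query")
    return tokens[3].lower()


def handle_request(user, password, sql):
    """Walk one request through the toy service layer."""
    if not authenticate(user, password):
        return "authentication failed"
    table = parse(sql)                          # query parsing
    if table not in CATALOG:                    # metadata management
        return "unknown table"
    if table not in GRANTS.get(user, set()):    # access control
        return "access denied"
    # In this toy model the row count is answered straight from metadata.
    return CATALOG[table]["row_count"]


print(handle_request("analyst", "s3cret", "SELECT COUNT(*) FROM sales;"))  # 100
print(handle_request("analyst", "wrong", "SELECT COUNT(*) FROM sales;"))   # authentication failed
```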
I believe that by now you know what Snowflake is and how it differs from other offerings. We have also learned about the different layers of the Snowflake architecture. This is just the beginning of the Snowflake journey. In the next blog, we will learn more about Snowflake integrations and load/unload utilities.
About Me:
I am a DWBI and Cloud Architect! I am currently working as a Senior Data Architect for GCP and Snowflake. I have worked with various legacy data warehouses, big data implementations, and cloud platforms/migrations. I am a SnowPro Core certified Data Architect as well as a Google certified Professional Cloud Architect. You can reach out to me on LinkedIn if you need any further help with certification, data solutions, and implementations!