Snowflake architecture (storage, compute, services layers) - Deep Dive

Overview - Snowflake architecture (storage, compute, services layers)

What is it?

Snowflake architecture is the way the Snowflake cloud data platform organizes how data is stored, processed, and managed. It separates storage, compute, and services into layers that work together but scale independently. This design lets users handle large amounts of data efficiently and run many queries at the same time without those queries competing for the same compute resources.

Why it matters

Platforms that couple storage and compute tightly force you to grow both together, which tends to mean either slow queries or paying for capacity you do not use. Snowflake lets storage grow separately from compute, so companies pay for each according to what they actually use and can add compute when queries need it. The result is better performance, flexibility, and cost control for businesses working with big data.

Where it fits

Before learning Snowflake architecture, you should understand basic cloud computing and data storage concepts. After this, you can explore how to write queries in Snowflake, optimize performance, and manage security. This architecture is a foundation for mastering Snowflake’s features and cloud data warehousing.

Mental Model

Core Idea

Snowflake architecture splits data storage, computing power, and management services into separate layers that scale independently yet cooperate to deliver fast, scalable, and cost-efficient data processing.

Think of it like...

Imagine a restaurant kitchen where the pantry (storage) holds all ingredients, the chefs (compute) prepare meals, and the managers (services) coordinate orders and quality. Each part works separately but must communicate smoothly to serve customers quickly and well.

┌───────────────┐   ┌───────────────┐   ┌───────────────┐
│   Services    │──▶│   Compute     │──▶│   Storage     │
│   Layer       │   │   Layer       │   │   Layer       │
│ (Coordination │   │ (Processing)  │   │ (Data held)   │
│  & Security)  │   │               │   │               │
└───────────────┘   └───────────────┘   └───────────────┘

Build-Up - 7 Steps

Foundation: Understanding Cloud Data Storage Basics

Concept: Learn what cloud data storage means and why it is important for modern data platforms.

Cloud data storage means saving data on remote servers accessed over the internet instead of local computers. This allows easy access, sharing, and scaling without buying physical hardware. Data is stored in files or tables and can grow as needed.

Result

You understand that data is kept safely and flexibly in the cloud, ready for processing.

Knowing cloud storage basics helps you grasp why separating storage from compute is powerful in Snowflake.

Foundation: What Compute Means in Data Platforms

Intermediate: Services Layer Role in Snowflake

Intermediate: How Storage Layer Works Independently

Intermediate: Compute Layer and Virtual Warehouses

Advanced: How Layers Communicate and Coordinate

Expert: Optimizations and Multi-Cluster Warehouses

Under the Hood

Snowflake’s architecture separates storage, compute, and services into distinct layers connected by secure APIs. Storage uses cloud object storage to hold compressed, columnar data files. Compute runs in virtual warehouses that read data from storage on demand. The services layer manages metadata, security, query parsing, and optimization, coordinating compute and storage without holding data itself. This separation allows independent scaling and efficient resource use.
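The division of responsibilities described above can be sketched as a toy model. This is only an illustration in plain Python, not Snowflake code: the class names `StorageLayer`, `Warehouse`, and `ServicesLayer`, the `orders` table, and the `analyst` role are all hypothetical. It shows the services layer checking access and routing work, a warehouse doing the computation, and storage holding the data regardless of which warehouses are running.

```python
from dataclasses import dataclass, field


@dataclass
class StorageLayer:
    """Stands in for cloud object storage: holds data, runs no queries."""
    tables: dict[str, list[dict]] = field(default_factory=dict)

    def read(self, table: str) -> list[dict]:
        return self.tables[table]


@dataclass
class Warehouse:
    """Stands in for a virtual warehouse: compute only, no permanent data."""
    name: str
    running: bool = True

    def total(self, storage: StorageLayer, table: str, column: str) -> int:
        if not self.running:
            raise RuntimeError(f"warehouse {self.name} is suspended")
        # Rows are pulled from storage on demand and summed here.
        return sum(row[column] for row in storage.read(table))


@dataclass
class ServicesLayer:
    """Stands in for the services layer: access checks and routing."""
    storage: StorageLayer
    warehouses: dict[str, Warehouse]
    grants: set[tuple[str, str]]  # (role, table) pairs allowed to read

    def run_total(self, role: str, warehouse: str, table: str, column: str) -> int:
        if (role, table) not in self.grants:
            raise PermissionError(f"role {role} cannot read {table}")
        # Coordination only: the chosen warehouse does the actual work.
        return self.warehouses[warehouse].total(self.storage, table, column)


storage = StorageLayer({"orders": [{"amount": 10}, {"amount": 32}]})
services = ServicesLayer(
    storage=storage,
    warehouses={"etl_wh": Warehouse("etl_wh"), "bi_wh": Warehouse("bi_wh")},
    grants={("analyst", "orders")},
)
result = services.run_total("analyst", "bi_wh", "orders", "amount")
print(result)
```

Setting `running = False` on one warehouse in this model leaves the rows in `storage` untouched and the other warehouse usable, which mirrors the independence of the layers.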

Why designed this way?

Snowflake was designed to overcome limits of traditional data warehouses that tightly couple storage and compute, causing bottlenecks and high costs. By separating layers, Snowflake enables elastic scaling, better concurrency, and cost savings. Alternatives like monolithic systems were less flexible and more expensive to operate at scale.

┌───────────────┐          ┌───────────────┐          ┌───────────────┐
│   Services    │─────────▶│   Compute     │─────────▶│   Storage     │
│   Layer       │          │   Layer       │          │   Layer       │
│ (Metadata,    │          │ (Virtual      │          │ (Cloud Object │
│  Security,    │          │  Warehouses)  │          │  Storage)     │
│  Coordination)│          │               │          │               │
└───────────────┘          └───────────────┘          └───────────────┘

Myth Busters - 4 Common Misconceptions

Quick: Do you think Snowflake stores data inside compute clusters? Commit to yes or no.

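Common Belief: Snowflake stores data inside its compute clusters for faster access.

Reality: No. Data lives in the storage layer, in cloud object storage. Virtual warehouses read it on demand, which is why a warehouse can be stopped without losing data.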

Quick: Do you think all users share the same compute resources in Snowflake? Commit to yes or no.

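Common Belief: All users run queries on the same compute resources, so heavy use slows everyone down.

Reality: No. Each virtual warehouse has its own compute resources, so workloads placed on separate warehouses do not contend with each other.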

Quick: Do you think the services layer does heavy data processing? Commit to yes or no.

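Common Belief: The services layer processes data and runs queries like compute does.

Reality: No. The services layer handles metadata, security, query parsing, and optimization. The data processing itself runs in virtual warehouses.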

Quick: Do you think storage and compute must always scale together? Commit to yes or no.

Common Belief: Storage and compute scale together because they are tightly linked.

Reality: No. Storage and compute are separate layers, so each scales on its own.

Expert Zone

Virtual warehouses can be suspended (paused) to save costs without losing data or metadata, because neither lives in the warehouse.
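As a minimal sketch of how pausing looks in practice, the helpers below build the Snowflake SQL statements for suspending, resuming, and auto-suspending a warehouse. The Python function names and the warehouse name `reporting_wh` are hypothetical; the code only produces strings and assumes you would run them through whatever Snowflake client you already use.

```python
import re

# Unquoted identifiers: a letter or underscore, then letters, digits, _ or $.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def _checked(name: str) -> str:
    """Reject anything that is not a simple unquoted identifier."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def suspend_sql(warehouse: str) -> str:
    """Statement that stops the warehouse's compute."""
    return f"ALTER WAREHOUSE {_checked(warehouse)} SUSPEND"


def resume_sql(warehouse: str) -> str:
    """Statement that starts the warehouse again."""
    return f"ALTER WAREHOUSE {_checked(warehouse)} RESUME"


def auto_suspend_sql(warehouse: str, idle_seconds: int) -> str:
    """Suspend after idle_seconds of inactivity and resume on the next query."""
    if idle_seconds < 0:
        raise ValueError("idle_seconds must not be negative")
    return (
        f"ALTER WAREHOUSE {_checked(warehouse)} SET "
        f"AUTO_SUSPEND = {idle_seconds} AUTO_RESUME = TRUE"
    )


print(suspend_sql("reporting_wh"))
print(auto_suspend_sql("reporting_wh", 60))
```

None of these statements mention tables or databases: they act on compute only, so the data in the storage layer is not affected.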

The services layer caches metadata aggressively to reduce latency and improve query planning speed.

Snowflake’s storage automatically divides tables into micro-partitions and keeps metadata about each one, which lets queries skip partitions that cannot contain matching rows, usually without manual tuning.
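One reason micro-partitions help is that each carries metadata such as the minimum and maximum value of a column, so a filter can rule out whole partitions before any data is read. The sketch below illustrates that idea only; the `MicroPartition` class, the `prune` function, and the value ranges are hypothetical and far simpler than what Snowflake actually tracks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroPartition:
    """Per-partition metadata: the min and max of one column."""
    name: str
    min_value: int
    max_value: int


def prune(partitions: list[MicroPartition], low: int, high: int) -> list[str]:
    """Names of partitions whose [min, max] range overlaps [low, high]."""
    return [
        p.name
        for p in partitions
        if p.max_value >= low and p.min_value <= high
    ]


partitions = [
    MicroPartition("p1", 1, 100),
    MicroPartition("p2", 101, 200),
    MicroPartition("p3", 201, 300),
]
# A filter such as "value BETWEEN 150 AND 220" only needs p2 and p3.
to_scan = prune(partitions, 150, 220)
print(to_scan)
```

The better the data is clustered on the filtered column, the narrower each partition's range and the more partitions a query can skip.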

When NOT to use

Snowflake architecture is not ideal for real-time transactional systems requiring millisecond latency. For such cases, specialized OLTP databases or streaming platforms are better. Also, if you need on-premises deployment, Snowflake’s cloud-only design is not suitable.

Production Patterns

In production, teams use multiple virtual warehouses sized and scheduled for different workloads, such as ETL, reporting, and ad-hoc queries. They use multi-cluster warehouses for concurrency and resource monitors to control costs. Access is governed by role-based access control, which the services layer enforces.
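A setup along these lines might be scripted as follows. This is a sketch under assumptions: the warehouse names, sizes, idle timeouts, and the 500-credit quota are invented for illustration, and the code only generates SQL strings rather than connecting to Snowflake. Multi-cluster settings are emitted only when requested, since they may not be available on every Snowflake edition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehouseSpec:
    """Hypothetical description of one workload's warehouse."""
    name: str
    size: str
    auto_suspend_seconds: int
    max_clusters: int = 1


def create_warehouse_sql(spec: WarehouseSpec) -> str:
    """Build a CREATE WAREHOUSE statement for one workload."""
    parts = [
        f"CREATE WAREHOUSE IF NOT EXISTS {spec.name} WITH",
        f"WAREHOUSE_SIZE = '{spec.size}'",
        f"AUTO_SUSPEND = {spec.auto_suspend_seconds}",
        "AUTO_RESUME = TRUE",
    ]
    if spec.max_clusters > 1:
        # Multi-cluster: scale out to more clusters under concurrent load.
        parts.append(f"MIN_CLUSTER_COUNT = 1 MAX_CLUSTER_COUNT = {spec.max_clusters}")
    return " ".join(parts)


def resource_monitor_sql(name: str, credit_quota: int) -> str:
    """Notify at 80% of the quota and suspend warehouses at 100%."""
    return (
        f"CREATE RESOURCE MONITOR {name} WITH CREDIT_QUOTA = {credit_quota} "
        "TRIGGERS ON 80 PERCENT DO NOTIFY ON 100 PERCENT DO SUSPEND"
    )


specs = [
    WarehouseSpec("etl_wh", "LARGE", 60),
    WarehouseSpec("reporting_wh", "MEDIUM", 300, max_clusters=3),
    WarehouseSpec("adhoc_wh", "XSMALL", 60),
]
statements = [create_warehouse_sql(s) for s in specs]
statements.append(resource_monitor_sql("monthly_budget", 500))
statements.append("ALTER WAREHOUSE adhoc_wh SET RESOURCE_MONITOR = monthly_budget")

for statement in statements:
    print(statement + ";")
```

Keeping the specs in one list makes it easy to review which workload gets which size and how quickly each warehouse suspends when idle.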

Connections

Microservices Architecture

Both separate concerns into independent layers or services that communicate via APIs.

Understanding Snowflake’s layered design helps grasp how microservices isolate functions for scalability and maintainability.

Operating System Kernel

The services layer acts like an OS kernel managing resources and coordinating tasks between hardware (storage) and applications (compute).

Seeing the services layer as a kernel clarifies its role in managing metadata, security, and resource allocation.

Restaurant Kitchen Workflow

Similar to how storage, compute, and services layers work, a kitchen separates ingredient storage, cooking, and order management.

This cross-domain connection shows how separating roles improves efficiency and scalability in complex systems.

Common Pitfalls

#1 Trying to scale compute and storage together manually.

Wrong approach: Manually increasing both storage size and compute warehouse size at the same time for every workload change.

Correct approach: Let storage grow on its own as data is loaded, and adjust compute warehouses separately based on query load.

Root cause: Not realizing that storage and compute are separate layers that scale independently.
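The independence of the two bills can be made concrete with a small cost model. The dollar prices below are hypothetical placeholders, since real rates depend on edition, region, and contract; the credits-per-hour table follows the usual doubling per warehouse size. The function name `monthly_cost` is likewise just for illustration.

```python
# Hypothetical prices, for illustration only.
STORAGE_USD_PER_TB_MONTH = 23.0
USD_PER_CREDIT = 3.0

# Credits consumed per running hour; each size step doubles the rate.
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8}


def monthly_cost(stored_tb: float, size: str, active_hours: float) -> tuple[float, float]:
    """Return (storage_cost, compute_cost) in USD for one month."""
    storage_cost = stored_tb * STORAGE_USD_PER_TB_MONTH
    compute_cost = CREDITS_PER_HOUR[size] * active_hours * USD_PER_CREDIT
    return storage_cost, compute_cost


base = monthly_cost(10, "SMALL", 100)
more_data = monthly_cost(20, "SMALL", 100)      # data doubled, same warehouse
more_compute = monthly_cost(10, "MEDIUM", 100)  # same data, bigger warehouse

print(base, more_data, more_compute)
```

Doubling the stored data changes only the first number, and resizing the warehouse changes only the second, which is why the two should be adjusted for different reasons.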

#2 Using a single virtual warehouse for all workloads.

Wrong approach: Running all queries on one virtual warehouse regardless of workload type or concurrency needs.

Correct approach: Create multiple virtual warehouses sized and scheduled for different workloads to avoid resource contention.

Root cause: Not knowing that virtual warehouses isolate compute resources for better performance.
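The effect of contention can be illustrated with a deliberately simplified simulation. It assumes each warehouse runs one query at a time and that all queries arrive together, which real warehouses do not do; the workload names, runtimes, and warehouse names are hypothetical. The point is only to compare queue waits when workloads share a warehouse versus when they are split.

```python
def wait_times(queries: list[tuple[str, int]], routing: dict[str, str]) -> dict[str, int]:
    """Total minutes each workload spends queued.

    queries: (workload, runtime_minutes) in arrival order, all arriving at time 0.
    routing: workload -> name of the warehouse that runs it.
    """
    busy_until: dict[str, int] = {}  # warehouse -> time it becomes free
    waits: dict[str, int] = {}       # workload -> accumulated wait
    for workload, runtime in queries:
        warehouse = routing[workload]
        start = busy_until.get(warehouse, 0)
        waits[workload] = waits.get(workload, 0) + start
        busy_until[warehouse] = start + runtime
    return waits


queries = [("etl", 30), ("reporting", 2), ("etl", 30), ("reporting", 2)]

shared = wait_times(queries, {"etl": "main_wh", "reporting": "main_wh"})
split = wait_times(queries, {"etl": "etl_wh", "reporting": "bi_wh"})

print("shared:", shared)
print("split: ", split)
```

In this toy setup the short reporting queries wait behind long ETL jobs on the shared warehouse, while on their own warehouse they barely wait at all.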

#3 Assuming the services layer processes data directly.

Wrong approach: Trying to speed up queries by tuning services layer settings, expecting the services layer to do the data processing.

Correct approach: Focus on compute warehouse sizing and query optimization, since the services layer manages coordination, not data processing.

Root cause: Confusing the role of the services layer with compute.

Key Takeaways

Snowflake architecture separates storage, compute, and services into independent layers for flexibility and efficiency.

Storage holds all data in cloud object storage, allowing it to scale without affecting compute resources.

Compute runs in virtual warehouses that can be started, stopped, and resized independently to handle different workloads.

The services layer manages metadata, security, and coordination but does not store or process data itself.

This design enables Snowflake to deliver fast, scalable, and cost-effective data processing in the cloud.

Practice

1. Which layer in Snowflake architecture is responsible for storing data securely in the cloud?

Difficulty: easy

A. Network layer

B. Storage layer

C. Services layer

D. Compute layer

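Solution

Step 1: Understand Snowflake layers. Snowflake has three layers: storage, compute, and services.

Step 2: Identify the storage role. The storage layer holds all data in cloud object storage.

Final Answer: B. Storage layer

Quick Check: Holding data is the job of the storage layer, not of compute or services.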
