
Blob storage (block, append, page) in Azure - Deep Dive

Overview - Blob storage (block, append, page)
What is it?
Blob storage is a way to store large amounts of unstructured data in the cloud as objects called blobs. There are three types of blobs: block blobs for regular files, append blobs for data that is only ever added at the end, and page blobs for random read/write access. Each type suits a different access pattern, so you can pick the one that matches how your data is written and read.
Why it matters
Without blob storage, saving and accessing large files in the cloud would be slow, complicated, or expensive. Blob storage makes it easy to store anything from images to logs to virtual machine disks, and to access or update them efficiently. This means apps and services can work faster and handle more data without hassle.
Where it fits
Before learning blob storage, you should understand basic cloud storage concepts like files and containers. After this, you can learn about managing access, security, and optimizing storage costs. Blob storage is a foundation for many cloud services like backups, streaming, and virtual machines.
Mental Model
Core Idea
Blob storage offers three types of blobs optimized for different ways of writing and reading data: block blobs for uploading files in pieces, append blobs for adding data only at the end, and page blobs for fast random access.
Think of it like...
Imagine a notebook: block blobs are like writing pages in order, append blobs are like adding sticky notes only at the end, and page blobs are like a binder where you can flip to and change any page quickly.
Blob Storage Types
┌───────────────┬───────────────┬───────────────┐
│ Block Blob    │ Append Blob   │ Page Blob     │
├───────────────┼───────────────┼───────────────┤
│ Files split   │ Logs or       │ Virtual disk  │
│ into blocks   │ audit trails  │ files needing │
│ uploaded      │ data added    │ random access │
│ in pieces     │ only at end   │ and updates   │
└───────────────┴───────────────┴───────────────┘
Build-Up - 7 Steps
1. Foundation: Understanding Blob Storage Basics
Concept: Learn what blob storage is and why it stores data as blobs.
Blob storage is a cloud service that stores data as blobs, which are like files. These blobs live inside containers, which organize them like folders. You can upload, download, and manage blobs easily over the internet.
Result
You know that blob storage holds files in containers and can be accessed remotely.
Understanding that blob storage is just a way to save files in the cloud helps you see how apps can use it to store data without worrying about physical disks.
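To make the account, container, and blob hierarchy concrete, here is a minimal sketch that builds the HTTPS address of a blob. The account, container, and blob names are made up for illustration, and the function adds no authentication; it only shows how the three levels map onto a URL.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the endpoint URL of a blob (illustrative; no credentials added)."""
    # Blob names may contain '/' to mimic folders, so '/' is left unescaped.
    encoded_name = quote(blob_name, safe="/")
    return f"https://{account}.blob.core.windows.net/{container}/{encoded_name}"


# Hypothetical names, used only to show the shape of the URL.
print(blob_url("mystorageacct", "photos", "2024/trip/beach 01.jpg"))
```

The "folders" in the blob name are only part of the name; the container is the real grouping level.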
2. Foundation: Introducing Blob Types: Block, Append, Page
Concept: Blob storage has three types of blobs, each for different use cases.
Block blobs store files as blocks, which you upload separately and then combine. Append blobs let you add data only at the end, perfect for logs. Page blobs store data in pages, allowing fast random read and write, useful for virtual disks.
Result
You can identify which blob type fits your data needs.
Knowing there are different blob types helps you choose the right one for your app's performance and update needs.
3. Intermediate: How Block Blobs Work with Blocks
Before reading on: do you think block blobs upload the whole file at once or in pieces? Commit to your answer.
Concept: Block blobs upload files in smaller pieces called blocks, which are combined later.
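When uploading a block blob, you split your file into blocks. The maximum block size depends on the service version: 100 MiB in older versions, and up to 4000 MiB in current ones. You upload each block separately, each with its own block ID. After all blocks are uploaded, you commit the list of block IDs to form the complete blob. Because blocks that were already uploaded do not need to be sent again, an interrupted upload can resume where it left off.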
Result
You can upload large files efficiently and recover from upload failures.
Understanding block blobs as pieces that combine later explains why uploads can be paused and resumed without starting over.
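The upload-then-commit flow described above can be sketched with a small in-memory model. This is a toy, not the Azure SDK: `ToyBlockBlob`, `make_block_id`, and `upload_in_blocks` are hypothetical names, and the model ignores details such as re-using previously committed blocks. It does reflect one real constraint: block IDs are Base64 strings of equal length within a blob.

```python
import base64


class ToyBlockBlob:
    """In-memory model of the stage-then-commit flow (not the Azure SDK)."""

    def __init__(self) -> None:
        self.staged: dict[str, bytes] = {}  # uploaded, but not yet visible
        self.content: bytes = b""           # what readers of the blob see

    def stage_block(self, block_id: str, data: bytes) -> None:
        self.staged[block_id] = data

    def commit_block_list(self, block_ids: list[str]) -> None:
        # The committed list decides the order; content changes in one step.
        self.content = b"".join(self.staged[block_id] for block_id in block_ids)
        self.staged.clear()


def make_block_id(index: int) -> str:
    # Zero-padding keeps every ID the same length before Base64 encoding.
    return base64.b64encode(f"{index:06d}".encode()).decode()


def upload_in_blocks(blob: ToyBlockBlob, data: bytes, block_size: int) -> list[str]:
    """Split data into blocks, stage each one, then commit them in order."""
    block_ids = []
    for index, offset in enumerate(range(0, len(data), block_size)):
        block_id = make_block_id(index)
        blob.stage_block(block_id, data[offset:offset + block_size])
        block_ids.append(block_id)
    blob.commit_block_list(block_ids)
    return block_ids


blob = ToyBlockBlob()
ids = upload_in_blocks(blob, b"hello block blobs", block_size=5)
print(len(ids), blob.content)
```

Until `commit_block_list` runs, `content` stays unchanged, which is the property that makes a half-finished upload invisible to readers.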
4. Intermediate: Append Blobs for Logging and Data Streams
Before reading on: do you think append blobs allow editing data in the middle or only adding at the end? Commit to your answer.
Concept: Append blobs only allow adding data at the end, making them ideal for logs.
Append blobs let you add new blocks only at the end of the blob. You cannot modify or delete existing blocks. This makes them perfect for scenarios like logging where data grows over time but old data stays unchanged.
Result
You can efficiently store growing data streams without overwriting.
Knowing append blobs only add data at the end prevents mistakes like trying to edit old log entries.
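A toy model can illustrate the append-only rule. `ToyAppendBlob` and its methods are hypothetical names for this sketch, not SDK classes; the point is simply that the only write operation adds data at the end, and any attempt to write elsewhere is rejected.

```python
class ToyAppendBlob:
    """In-memory model of an append-only blob (not the Azure SDK)."""

    def __init__(self) -> None:
        self._blocks: list[bytes] = []

    @property
    def size(self) -> int:
        return sum(len(block) for block in self._blocks)

    def append_block(self, data: bytes) -> int:
        """Add data at the end and return the offset where it was written."""
        offset = self.size
        self._blocks.append(bytes(data))
        return offset

    def read(self) -> bytes:
        return b"".join(self._blocks)

    def write_at(self, offset: int, data: bytes) -> None:
        # Existing data is immutable in this model, as it is for append blobs.
        raise PermissionError("append blobs cannot be modified in place")


log = ToyAppendBlob()
log.append_block(b"10:00 service started\n")
log.append_block(b"10:05 request handled\n")
print(log.read().decode(), end="")
```

Each call returns the offset at which the new data landed, so a writer can tell where its entry sits in the log.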
5. Intermediate: Page Blobs for Random Read/Write Access
Before reading on: do you think page blobs are better for sequential or random access? Commit to your answer.
Concept: Page blobs store data in fixed-size pages allowing fast random read and write.
Page blobs divide data into 512-byte pages. You can read or write any page independently without touching others. This makes them suitable for virtual machine disks or databases needing quick updates anywhere in the file.
Result
You can use page blobs for scenarios requiring fast, random data access.
Understanding page blobs as pages explains why they are chosen for virtual disks needing quick updates.
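The page model can be sketched in memory as well. `ToyPageBlob` is a hypothetical class for illustration, not an SDK type; it assumes a fixed-size blob whose length is a multiple of 512 bytes and rejects writes that are not aligned to page boundaries.

```python
PAGE_SIZE = 512  # bytes per page


class ToyPageBlob:
    """In-memory model of a fixed-size, page-addressed blob (not the Azure SDK)."""

    def __init__(self, size: int) -> None:
        if size % PAGE_SIZE:
            raise ValueError("page blob size must be a multiple of 512 bytes")
        self._data = bytearray(size)  # starts out zero-filled

    def write_pages(self, offset: int, data: bytes) -> None:
        """Overwrite whole pages in place, leaving all other pages untouched."""
        if offset % PAGE_SIZE or len(data) % PAGE_SIZE:
            raise ValueError("offset and length must be 512-byte aligned")
        if offset + len(data) > len(self._data):
            raise ValueError("write extends past the end of the blob")
        self._data[offset:offset + len(data)] = data

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self._data[offset:offset + length])


disk = ToyPageBlob(4 * PAGE_SIZE)
disk.write_pages(2 * PAGE_SIZE, b"\x01" * PAGE_SIZE)  # touch only the third page
print(disk.read(2 * PAGE_SIZE, 4), disk.read(0, 4))
```

Only the third page changes; the first page still reads back as zeros, which is the "update anywhere without touching the rest" behavior a virtual disk relies on.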
6. Advanced: Choosing Blob Types for Performance and Cost
Before reading on: do you think append blobs cost more or less than block blobs? Commit to your answer.
Concept: Each blob type has different performance and cost tradeoffs based on usage patterns.
Block blobs are best for general files and large uploads. Append blobs optimize for write-heavy, append-only workloads like logs. Page blobs support random access, but that capability typically costs more, so they pay off only for workloads that update data in place. Choosing the right type balances speed, cost, and functionality.
Result
You can optimize your storage choice for your app's needs and budget.
Knowing the cost and performance differences helps avoid overspending or slow apps.
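As a rough rule of thumb, the choice can be written as a small decision function. This is only a sketch of the guidance in this step: `BlobType` and `suggest_blob_type` are hypothetical names, and the function does not model pricing, which depends on the account and region.

```python
from enum import Enum


class BlobType(Enum):
    BLOCK = "block"
    APPEND = "append"
    PAGE = "page"


def suggest_blob_type(*, needs_random_writes: bool, append_only: bool) -> BlobType:
    """Pick a blob type from the access pattern (rule of thumb, not a pricing model)."""
    if needs_random_writes:
        return BlobType.PAGE    # e.g. virtual disks updated in place
    if append_only:
        return BlobType.APPEND  # e.g. logs and audit trails
    return BlobType.BLOCK       # default for ordinary files


print(suggest_blob_type(needs_random_writes=False, append_only=True))
```

Block blobs are the default on purpose: the other two types are worth choosing only when the access pattern clearly calls for them.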
7. Expert: Internal Storage and Consistency Guarantees
Before reading on: do you think Azure guarantees immediate consistency for all blob types? Commit to your answer.
Concept: Azure Blob Storage uses different internal mechanisms to ensure data consistency and durability for each blob type.
Block blobs use a commit model: blocks are uploaded first and become part of the blob only when the block list is committed, so readers see either the old content or the new content, never a partial upload. Append blobs guarantee that each append is atomic and lands after all earlier data. Page blob writes are applied in place to the targeted page ranges and take effect immediately. Azure replicates data across servers for durability.
Result
You understand how Azure keeps your data safe and consistent under the hood.
Knowing these internal guarantees helps design apps that rely on data correctness and availability.
Under the Hood
Azure Blob Storage stores data on distributed servers. Block blobs upload data in blocks, which are held as uncommitted blocks until a commit makes them part of the blob. Append blobs add blocks only at the end, maintaining order and atomicity. Page blobs store data in 512-byte pages with random read/write access; writes target aligned page ranges and are applied in place. Data is replicated across multiple servers for durability and availability.
Why designed this way?
These designs balance flexibility, performance, and reliability. Block blobs allow large file uploads with resume capability. Append blobs optimize for write-heavy, append-only workloads like logs. Page blobs support virtual disks needing fast random access. Alternatives like single large file uploads or no append-only option would reduce efficiency and increase failure risk.
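The resume capability mentioned above follows from the block design: after a failure, a client only needs to send the blocks that were not uploaded yet and then commit the full list. The sketch below models that idea in memory. `resume_upload` and the short block labels are hypothetical; against the real service, a client would first ask which blocks are already uploaded but uncommitted (the REST API offers a Get Block List operation for this).

```python
def resume_upload(
    planned: list[tuple[str, bytes]],
    staged: dict[str, bytes],
) -> tuple[list[str], bytes]:
    """Stage only the missing blocks, then commit everything in planned order.

    Returns the IDs that had to be sent and the committed content.
    """
    sent = []
    for block_id, data in planned:
        if block_id not in staged:  # skip blocks uploaded before the failure
            staged[block_id] = data
            sent.append(block_id)
    committed = b"".join(staged[block_id] for block_id, _ in planned)
    return sent, committed


# Illustrative labels; real block IDs are Base64 strings of equal length.
planned = [("b0", b"AAAA"), ("b1", b"BBBB"), ("b2", b"CC")]
staged_before_failure = {"b0": b"AAAA"}

sent, content = resume_upload(planned, staged_before_failure)
print(sent)     # ['b1', 'b2']
print(content)  # b'AAAABBBBCC'
```

Only two of the three blocks cross the network on the retry, yet the committed blob contains all three in order.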
Azure Blob Storage Architecture
┌─────────────────────────────┐
│        Client App           │
└─────────────┬───────────────┘
              │
  ┌───────────▼───────────┐
  │ Blob Storage Service  │
  └───────────┬───────────┘
              │
  ┌───────────▼───────────┐
  │  Block Blob Storage   │
  │  (Upload blocks,      │
  │   commit blocks)      │
  ├───────────────────────┤
  │ Append Blob Storage   │
  │ (Append-only writes)  │
  ├───────────────────────┤
  │  Page Blob Storage    │
  │ (Random read/write)   │
  └───────────┬───────────┘
              │
  ┌───────────▼───────────┐
  │  Data Replication &   │
  │  Durability Layer     │
  └───────────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Do you think append blobs allow editing data anywhere or only adding at the end? Commit to your answer.
Common Belief: Append blobs let you edit or delete any part of the blob like block blobs.
Reality: Append blobs only allow adding new data at the end; you cannot modify or delete existing data.
Why it matters: Attempts to edit an append blob in place fail, and designs that assume old log entries can be rewritten break down.
Quick: Do you think page blobs are cheaper than block blobs for all workloads? Commit to your answer.
Common Belief: Page blobs are always cheaper because they allow random access.
Reality: Random access is a capability, not a discount. Page blobs typically cost more for ordinary file storage because of their write-heavy usage pattern and page-based storage overhead.
Why it matters: Using page blobs unnecessarily can increase costs and reduce performance.
Quick: Do you think block blobs upload the entire file in one go? Commit to your answer.
Common Belief: Block blobs upload the whole file at once, so interrupted uploads must restart.
Reality: Block blobs upload files in blocks, allowing interrupted uploads to resume by uploading only the missing blocks.
Why it matters: Misunderstanding this leads to inefficient uploads and wasted bandwidth.
Quick: Do you think Azure Blob Storage guarantees immediate consistency for all blob operations? Commit to your answer.
Common Belief: All blob operations are immediately consistent everywhere.
Reality: Azure Blob Storage provides strong consistency within the primary region: once a write succeeds, later reads see it. In geo-redundant setups, replication to the secondary region is asynchronous, so reads from the secondary can lag behind.
Why it matters: Assuming immediate consistency everywhere can cause bugs in distributed applications.
Expert Zone
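1. Block blobs support up to 50,000 blocks per blob. With 100 MiB blocks that allows blobs of about 4.75 TiB; current service versions are documented to accept blocks of up to 4000 MiB, which raises the limit to roughly 190.7 TiB.
2. Append blobs have a maximum size of about 195 GiB (50,000 blocks of up to 4 MiB each), limiting their use to certain logging or streaming scenarios.
3. Page blobs require 512-byte aligned writes, which affects how you design applications that write to them.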
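The size limits in this list are simple products of block count and block size, and the alignment rule is a rounding step. The sketch below works through the arithmetic; the helper names are hypothetical. The 100 MiB block size belongs to older service versions and yields the familiar figure of about 4.75 TiB, while current versions are documented to allow 4000 MiB blocks.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4
MAX_BLOCKS = 50_000
PAGE_SIZE = 512


def max_blob_size(block_size_bytes: int, max_blocks: int = MAX_BLOCKS) -> int:
    """Largest blob that fits in max_blocks blocks of the given size."""
    return block_size_bytes * max_blocks


def align_up(length: int, page_size: int = PAGE_SIZE) -> int:
    """Round a length up to the next page boundary, as page blob writes require."""
    return (length + page_size - 1) // page_size * page_size


print(round(max_blob_size(100 * MIB) / TIB, 2))   # block blob, 100 MiB blocks
print(round(max_blob_size(4000 * MIB) / TIB, 1))  # block blob, 4000 MiB blocks
print(round(max_blob_size(4 * MIB) / GIB, 1))     # append blob, 4 MiB blocks
print(align_up(1000))                             # 1000 bytes need 2 pages
```

An application writing 1,000 bytes to a page blob therefore has to pad the write to 1,024 bytes.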
When NOT to use
Avoid append blobs if you need to modify or delete data in the middle; use block blobs instead. Do not use page blobs for simple file storage due to higher cost; block blobs are better. For very large files needing streaming, consider Azure Data Lake Storage or other specialized services.
Production Patterns
Block blobs are widely used for storing images, videos, backups, and documents. Append blobs are common for storing logs, audit trails, and telemetry data. Page blobs are used as virtual hard disks for Azure Virtual Machines and for databases requiring fast random access.
Connections
File Systems
Blob storage types map to file system concepts: block blobs like sequential files, append blobs like write-only logs, and page blobs like random-access files.
Understanding file systems helps grasp why blob types exist and how they optimize different access patterns.
Database Write-Ahead Logging
Append blobs resemble write-ahead logs where data is only appended to ensure durability and order.
Knowing database logging helps understand why append blobs are append-only and useful for audit trails.
Memory Paging in Operating Systems
Page blobs' fixed-size pages and random access mirror how OS memory paging works.
Recognizing this connection clarifies why page blobs require aligned writes and support fast random updates.
Common Pitfalls
#1 Trying to modify data in the middle of an append blob.
Wrong approach: Upload a block to an append blob at a middle offset to overwrite existing data.
Correct approach: Use block blobs if you need to modify or overwrite data anywhere in the blob.
Root cause: Misunderstanding append blobs as editable like block blobs leads to errors.
#2 Uploading very large files as a single block blob without splitting into blocks.
Wrong approach: Upload a 1 GB file in one request as a block blob.
Correct approach: Split large files into blocks (e.g., 4 MB each) and upload blocks separately before committing.
Root cause: Not knowing block blobs support block uploads causes inefficient or failed uploads.
#3 Using page blobs for simple file storage, expecting lower cost.
Wrong approach: Store images or documents as page blobs to simplify access.
Correct approach: Use block blobs for simple file storage to reduce cost and complexity.
Root cause: Confusing page blobs' random access feature with general storage needs leads to unnecessary expenses.
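For pitfall #2, the splitting step can be sketched as a generator that reads a file-like object one block at a time, so the whole file never has to sit in memory. `iter_blocks` and `blocks_needed` are hypothetical helper names, and the 4 MiB default simply mirrors the example block size used in that pitfall.

```python
import io
import math
from typing import BinaryIO, Iterator

DEFAULT_BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB, the example size from the pitfall


def iter_blocks(stream: BinaryIO, block_size: int = DEFAULT_BLOCK_SIZE) -> Iterator[bytes]:
    """Yield consecutive blocks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(block_size)
        if not chunk:
            break
        yield chunk


def blocks_needed(file_size: int, block_size: int = DEFAULT_BLOCK_SIZE) -> int:
    """Number of blocks required to cover a file of the given size."""
    return math.ceil(file_size / block_size)


# A tiny stream and block size keep the demonstration readable.
sizes = [len(block) for block in iter_blocks(io.BytesIO(b"x" * 10), block_size=4)]
print(sizes)                      # [4, 4, 2]
print(blocks_needed(1024 ** 3))   # 256 blocks for a 1 GiB file
```

Each yielded chunk would then be uploaded as its own block, and the list of block IDs committed once the stream is exhausted.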
Key Takeaways
Azure Blob Storage offers three blob types—block, append, and page—each optimized for different data access patterns.
Block blobs upload files in blocks that combine later, enabling efficient large file uploads and resumable transfers.
Append blobs allow only adding data at the end, making them ideal for logs and audit trails where data grows sequentially.
Page blobs store data in fixed-size pages with fast random read/write access, suitable for virtual disks and databases.
Choosing the right blob type balances performance, cost, and functionality, and understanding their internal mechanisms helps build reliable cloud applications.