Sharding and partitioning in DBMS Theory - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is partitioning in databases?
Partitioning is the process of dividing a large database table into smaller, more manageable pieces called partitions. Each partition holds a subset of the data, often based on a specific column value like date or region.
beginner
Define sharding in simple terms.
Sharding is a type of database partitioning where data is split across multiple servers or machines. Each shard holds a portion of the data, helping to spread the load and improve performance.
intermediate
How does sharding differ from partitioning?
Partitioning divides data into parts within a single database or server, while sharding splits data across multiple servers or machines. Sharding is a form of horizontal scaling; partitioning usually stays within one system.
beginner
Give a real-life example of when sharding is useful.
Imagine a social media app with millions of users. Sharding lets the app store user data on different servers based on user location or ID, so no single server gets overloaded and the app runs faster.
intermediate
What are common ways to partition data?
Data can be partitioned by range (e.g., dates), list (e.g., countries), or hash (using a function to evenly distribute data). These methods help organize data for faster access and easier management.
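The three methods can be sketched as small routing functions. This is only an illustration: the partition names, the country-to-partition mapping, and the partition count are assumptions made up for the example, not part of any particular database.

```python
import zlib
from datetime import date

NUM_HASH_PARTITIONS = 4  # assumed partition count for the hash example


def range_partition(order_date: date) -> str:
    """Range partitioning: route a row by the year its date falls in."""
    return f"orders_{order_date.year}"


# List partitioning: each partition owns an explicit list of values.
# The mapping below is hypothetical.
REGION_PARTITIONS = {
    "IN": "customers_asia",
    "JP": "customers_asia",
    "DE": "customers_europe",
    "FR": "customers_europe",
}


def list_partition(country_code: str) -> str:
    """List partitioning: look the value up, with a catch-all partition."""
    return REGION_PARTITIONS.get(country_code, "customers_other")


def hash_partition(key: str) -> int:
    """Hash partitioning: a stable hash modulo the partition count."""
    # zlib.crc32 gives the same result on every run, unlike Python's
    # built-in hash() for strings, which is randomized per process.
    return zlib.crc32(key.encode("utf-8")) % NUM_HASH_PARTITIONS


print(range_partition(date(2023, 5, 1)))  # orders_2023
print(list_partition("JP"))               # customers_asia
print(hash_partition("user-42"))          # some value from 0 to 3
```

Range and list partitioning keep related rows together, which helps queries that filter on the partition column. Hash partitioning gives up that locality in exchange for a more even spread.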
What does sharding primarily help with?
A. Compressing data to save space
B. Distributing data across multiple servers
C. Encrypting database data
D. Backing up data regularly
Which of these is a common partitioning method?
A. Backup partitioning
B. Encryption partitioning
C. Range partitioning
D. Compression partitioning
Partitioning usually happens within:
A. A single database or server
B. Multiple servers worldwide
C. Only cloud databases
D. Backup storage systems
Which scenario best fits sharding?
A. Backing up data to an external drive
B. Splitting a table into smaller parts on one server
C. Encrypting sensitive data in a database
D. A large website spreading user data across servers
Hash partitioning means:
A. Using a function to evenly distribute data
B. Dividing data by date ranges
C. Grouping data by country names
D. Encrypting data with a hash
Explain the difference between sharding and partitioning in databases.
Think about where the data is divided and why.
Describe a real-life example where sharding would improve a system's performance.
Consider popular apps or websites with many users.

      Practice

      1. What is the main difference between sharding and partitioning in databases?
      easy
      A. Sharding divides data within one database; partitioning spreads data across multiple servers.
      B. Partitioning divides data within one database; sharding spreads data across multiple servers.
      C. Both sharding and partitioning mean the same and are used interchangeably.
      D. Partitioning is used only for backups, while sharding is for data security.

      Solution

      1. Step 1: Understand partitioning

        Partitioning splits data inside a single database into smaller parts for easier management and faster queries.
      2. Step 2: Understand sharding

        Sharding spreads data across multiple servers or machines to handle very large datasets and improve performance.
      3. Final Answer:

        Partitioning divides data within one database; sharding spreads data across multiple servers. -> Option B
      4. Quick Check:

        Partitioning = single database, Sharding = multiple servers [OK]
      Hint: Partitioning = one DB; Sharding = many servers
      Common Mistakes:
      • Confusing sharding with partitioning
      • Thinking both are the same
      • Assuming partitioning involves multiple servers
      2. Which of the following is a correct way to describe horizontal partitioning in a database?
      easy
      A. Splitting a table into multiple tables with the same columns but different rows.
      B. Splitting a table into multiple tables with different columns but same rows.
      C. Combining multiple tables into one large table.
      D. Backing up the entire database to a separate server.

      Solution

      1. Step 1: Define horizontal partitioning

        Horizontal partitioning means dividing a table by rows, so each partition has the same columns but different sets of rows.
      2. Step 2: Check options

        Option A, "Splitting a table into multiple tables with the same columns but different rows," matches this definition exactly. Option B describes vertical partitioning, and the remaining options describe unrelated actions.
      3. Final Answer:

        Splitting a table into multiple tables with the same columns but different rows. -> Option A
      4. Quick Check:

        Horizontal partitioning = split rows [OK]
      Hint: Horizontal partitioning splits rows, not columns
      Common Mistakes:
      • Mixing horizontal with vertical partitioning
      • Thinking partitioning means backup
      • Confusing rows with columns
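To make the rows-versus-columns distinction concrete, here is a minimal sketch that treats a table as a list of dictionaries. The sample rows and the helper names `horizontal_partition` and `vertical_partition` are invented for illustration.

```python
# A tiny in-memory "table"; the data is made up for this example.
rows = [
    {"id": 1, "name": "Asha", "region": "east"},
    {"id": 2, "name": "Ben", "region": "west"},
    {"id": 3, "name": "Chen", "region": "east"},
]


def horizontal_partition(rows, column):
    """Group whole rows by one column: same columns, different rows."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[column], []).append(row)
    return partitions


def vertical_partition(rows, key, columns):
    """Keep every row but only some columns: different columns, same rows."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]


by_region = horizontal_partition(rows, "region")
names_only = vertical_partition(rows, "id", ["name"])

print(by_region["east"])  # the two rows with region "east", all columns intact
print(names_only)         # all three rows, but only the id and name columns
```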
      3. Consider a database sharded by user ID across three servers: Server 1 stores users with IDs ending in 0-3, Server 2 stores 4-6, and Server 3 stores 7-9. If a query requests data for user ID 27, which server will handle the request?
      medium
      A. Server 3
      B. Server 2
      C. Server 1
      D. All servers

      Solution

      1. Step 1: Identify the shard key and ranges

        The sharding is based on the last digit of user ID: 0-3 on Server 1, 4-6 on Server 2, 7-9 on Server 3.
      2. Step 2: Find the last digit of user ID 27

        The last digit of 27 is 7, which falls in the 7-9 range assigned to Server 3.
      3. Final Answer:

        Server 3 -> Option A
      4. Quick Check:

        User ID 27 ends with 7, so Server 3 [OK]
      Hint: Check last digit of ID to find server
      Common Mistakes:
      • Ignoring the last digit and guessing server
      • Choosing all servers instead of one
      • Mixing up the shard ranges
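The routing rule from question 3 can be written as a short function. This is a sketch of the scheme as the question describes it; the function name `server_for_user` is an assumption, and it assumes user IDs are non-negative integers.

```python
def server_for_user(user_id: int) -> str:
    """Route a user ID to a server by its last digit, as in question 3.

    Assumes user_id is a non-negative integer.
    """
    last_digit = user_id % 10
    if last_digit <= 3:   # digits 0-3
        return "Server 1"
    if last_digit <= 6:   # digits 4-6
        return "Server 2"
    return "Server 3"     # digits 7-9


print(server_for_user(27))  # Server 3
```

Because Server 1 owns four digits (0-3) and the other two own three each, IDs with uniformly distributed last digits would put roughly 40% of users on Server 1 and 30% on each of the others.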
      4. A database administrator tries to shard a database but notices that some shards have much more data than others, causing slow queries. What is the most likely problem?
      medium
      A. The backup process is running during queries.
      B. The database is not partitioned vertically.
      C. The database server hardware is outdated.
      D. The shard key is not chosen properly, causing uneven data distribution.

      Solution

      1. Step 1: Understand shard key role

        The shard key determines how data is split across shards. A poor choice can cause uneven data distribution.
      2. Step 2: Analyze the problem

        Uneven shard sizes causing slow queries usually mean the shard key is not distributing data evenly.
      3. Final Answer:

        The shard key is not chosen properly, causing uneven data distribution. -> Option D
      4. Quick Check:

        Uneven shards = bad shard key choice [OK]
      Hint: Uneven shards? Check shard key choice
      Common Mistakes:
      • Blaming hardware without checking shard key
      • Confusing sharding with partitioning issues
      • Ignoring data distribution patterns
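One way to see the effect of the shard key is to count how many rows each candidate key sends to each shard. The sketch below uses made-up data in which 80% of users share one country; the shard count, the user records, and both routing functions are assumptions for illustration.

```python
import zlib
from collections import Counter

NUM_SHARDS = 3  # assumed shard count

# Hypothetical users: 800 of 1000 are in one country.
users = [{"id": i, "country": "US" if i % 10 < 8 else "DE"} for i in range(1000)]


def shard_by_country(user) -> int:
    # Low-cardinality key: every user from the same country lands on one shard.
    return zlib.crc32(user["country"].encode("utf-8")) % NUM_SHARDS


def shard_by_id(user) -> int:
    # High-cardinality key: sequential IDs spread evenly under a simple modulo.
    return user["id"] % NUM_SHARDS


by_country = Counter(shard_by_country(u) for u in users)
by_id = Counter(shard_by_id(u) for u in users)

print("by country:", dict(by_country))  # at most two shards used, one holds 800+ rows
print("by id:", dict(by_id))            # three shards with 333 or 334 rows each
```

With only two distinct country values, at most two of the three shards ever receive data, and the one that owns the popular country becomes a hot spot. Keying on the user ID avoids that here, though real workloads may also need to consider which queries must span shards.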
      5. You have a large customer database that is partitioned by region within a single server. To improve performance and handle growth, you want to shard the data across multiple servers. Which approach best combines partitioning and sharding?
      hard
      A. Use only partitioning by region on one server; sharding is unnecessary.
      B. Partition the database by customer type across servers, and shard data by region within each server.
      C. Shard the database by region across servers, and within each server, partition data by customer type.
      D. Backup the database regularly instead of sharding or partitioning.

      Solution

      1. Step 1: Understand combining sharding and partitioning

        Sharding splits data across servers; partitioning splits data inside each server for better management.
      2. Step 2: Analyze the best approach

        Sharding by region spreads data across servers by geography, and partitioning by customer type inside each shard keeps each server's data organized and can speed up queries that filter on customer type.
      3. Final Answer:

        Shard the database by region across servers, and within each server, partition data by customer type. -> Option C
      4. Quick Check:

        Shard by region, partition by type inside servers [OK]
      Hint: Shard first, then partition inside shards
      Common Mistakes:
      • Mixing up shard and partition levels
      • Ignoring partitioning after sharding
      • Thinking backup replaces sharding
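The two-level layout in question 5 can be sketched as a lookup that first picks the server and then the partition on that server. The region names, server names, and partition naming pattern below are hypothetical.

```python
from typing import Tuple

# Hypothetical mapping from region to the server (shard) that owns it.
REGION_TO_SHARD = {
    "north": "server-a",
    "south": "server-b",
    "east": "server-c",
}


def locate(customer: dict) -> Tuple[str, str]:
    """Return (shard, partition) for a customer record."""
    # Level 1: the shard key (region) decides which server holds the row.
    shard = REGION_TO_SHARD[customer["region"]]
    # Level 2: on that server, the partition key (customer type) decides the partition.
    partition = f"customers_{customer['type']}"
    return shard, partition


print(locate({"id": 7, "region": "south", "type": "business"}))
# ('server-b', 'customers_business')
```

A query that names both the region and the customer type touches one partition on one server, while a query that names only the customer type still has to visit every shard.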