
Why distributed databases handle scale in DBMS Theory - Performance Analysis

Time Complexity: Why distributed databases handle scale
Understanding Time Complexity

When a database grows large, how quickly it can process data matters a great deal.

The goal here is to see how distributed databases handle more data and more users efficiently by splitting the work across multiple servers.

Scenario Under Consideration

Analyze the time complexity of this simplified distributed query process.


-- Assume the users table is split (sharded) across 3 servers
SELECT * FROM users WHERE age > 30;
-- The scan runs on each server in parallel
-- Partial results are merged and returned to the client

This shows a query running on multiple servers at once, then merging results.
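To make the idea concrete, here is a minimal Python sketch of the same process. The sharding scheme (round-robin), the table of dictionaries, and the function names are all illustrative assumptions, not a real database engine: each "server" filters only its own partition, and the coordinator merges the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_partition(partition, min_age):
    # Each "server" scans only its own slice of the data.
    return [row for row in partition if row["age"] > min_age]

def distributed_query(rows, k, min_age):
    # Hypothetical round-robin sharding across k servers.
    partitions = [rows[i::k] for i in range(k)]
    # Run the scan on every partition in parallel, then merge the results.
    with ThreadPoolExecutor(max_workers=k) as pool:
        results = pool.map(scan_partition, partitions, [min_age] * k)
    return [row for part in results for row in part]

users = [{"id": i, "age": a}
         for i, a in enumerate([25, 41, 33, 19, 52, 30, 44, 28, 61])]
matches = distributed_query(users, k=3, min_age=30)
```

Each partition holds roughly 1/3 of the rows, so each thread scans about n/3 items, mirroring the SQL example above.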

Identify Repeating Operations

Look for repeated work done by the system.

  • Primary operation: Each server scans its part of the data.
  • How many times: Once per server, all running at the same time.
How Execution Grows With Input

As total data grows, it is split across servers, so each server handles approximately n/k data items.

Input Size (n)    Approx. Operations per Server (k = 3)
10,000            ~3,333
100,000           ~33,333
1,000,000         ~333,333

Pattern observation: Total work still grows linearly with data size, but each server's share grows more slowly because the data is divided among the k servers.
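The table values above can be reproduced with a one-line model (a simplifying assumption that the split is perfectly even; integer division matches the table's rounding):

```python
def ops_per_server(n, k=3):
    # With an even split, each server scans roughly n / k rows.
    return n // k

for n in (10_000, 100_000, 1_000_000):
    print(f"{n:>9,} rows -> ~{ops_per_server(n):,} ops per server")
```

Tripling the number of servers would cut each server's share to roughly a third again, which is exactly the n/k pattern.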

Final Time Complexity

Time Complexity: O(n / k)

This means each server's work grows with the data size n divided by the number of servers k, so adding servers lets the system handle more data efficiently.

Common Mistake

[X] Wrong: "Adding more servers always makes queries instantly faster."

[OK] Correct: Some work, such as combining results and network round-trips, still takes time, so queries get faster with more servers but not without limit.
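A toy cost model makes the limit visible. The formula below is a hedged assumption, not a real benchmark: the parallel scan costs n/k steps, but the coordinator pays a fixed merge cost per server, so past some point adding servers makes things worse.

```python
def query_time(n, k, merge_cost_per_server=50):
    # Hypothetical model: parallel scan takes n / k steps, and the
    # coordinator merges one partial result per server.
    return n / k + merge_cost_per_server * k

n = 1_000_000
for k in (1, 10, 100, 1000):
    print(f"k={k:>4}: {query_time(n, k):>12,.0f} time units")
```

With these made-up constants, going from 10 to 100 servers helps, but going from 100 to 1,000 hurts: the merge overhead starts to dominate.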

Interview Connect

Understanding how distributed databases split work helps you explain how real systems handle big data smoothly.

Self-Check

What if the data was not evenly split across servers? How would that affect the time complexity?
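One way to explore this question: since the servers run in parallel, the query finishes only when the slowest server does, so the effective cost is the size of the largest partition, not n/k. The partition sizes below are invented for illustration.

```python
def parallel_scan_steps(partition_sizes):
    # Servers scan in parallel, so the query waits for the largest partition.
    return max(partition_sizes)

even   = [333_334, 333_333, 333_333]   # ~n/k on every server
skewed = [900_000,  50_000,  50_000]   # one "hot" server holds most rows

print("even split :", parallel_scan_steps(even))
print("skewed split:", parallel_scan_steps(skewed))
```

In the skewed case one server does almost all the work, so the complexity degrades from O(n/k) toward O(n), which is why real systems care so much about choosing a good shard key.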