DISTINCT for unique values in MySQL - Time & Space Complexity
We want to understand how the time to find unique values changes as the data grows.
How does using DISTINCT affect the work the database does?
Analyze the time complexity of the following code snippet.
SELECT DISTINCT city
FROM customers;
This query finds all unique city names from the customers table.
Identify the work that repeats: loops, recursion, or traversals over the data.
- Primary operation: Scanning each row in the customers table to check the city value.
- How many times: Once for each row in the table.
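The repeated operation above can be modeled outside the database as a single pass with a set of already-seen values. This is a minimal sketch of the idea, not MySQL's actual implementation; the function name `distinct_cities` and the sample rows are made up for illustration.

```python
def distinct_cities(rows):
    """Return unique city values in first-seen order, visiting each row once."""
    seen = set()      # cities already encountered
    unique = []       # result, in order of first appearance
    for row in rows:  # one pass: n iterations for n rows
        city = row["city"]
        if city not in seen:  # average constant-time membership check
            seen.add(city)
            unique.append(city)
    return unique


# Hypothetical sample rows standing in for the customers table.
customers = [
    {"id": 1, "city": "Paris"},
    {"id": 2, "city": "Lima"},
    {"id": 3, "city": "Paris"},
    {"id": 4, "city": "Oslo"},
]
print(distinct_cities(customers))  # ['Paris', 'Lima', 'Oslo']
```

Every row is visited exactly once, whether its city is new or a duplicate, which is where the "once for each row" count comes from.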
As the number of rows grows, the database must look at more city values to find unique ones.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 checks to find unique cities |
| 100 | About 100 checks to find unique cities |
| 1000 | About 1000 checks to find unique cities |
Pattern observation: The work grows roughly in direct proportion to the number of rows.
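To see the pattern in the table concretely, the sketch below counts one check per row for synthetic inputs of 10, 100, and 1000 rows. The helper `count_checks` and the generated city names are assumptions made for this illustration only.

```python
def count_checks(cities):
    """Count the 'seen before?' checks a one-pass dedup performs."""
    seen = set()
    checks = 0
    for city in cities:
        checks += 1     # one check per row, duplicate or not
        seen.add(city)  # adding an existing value changes nothing
    return checks, len(seen)


for n in (10, 100, 1000):
    # Synthetic data: n rows cycling through 5 made-up city names.
    cities = [f"city_{i % 5}" for i in range(n)]
    checks, unique = count_checks(cities)
    print(n, checks, unique)
# Prints:
# 10 10 5
# 100 100 5
# 1000 1000 5
```

The number of checks tracks the row count, while the number of unique values stays at 5. The work depends on how many rows exist, not on how many distinct cities they contain.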
Time Complexity: O(n)
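This means the time to find unique cities grows linearly as the number of rows increases, assuming each "have I seen this city?" check takes roughly constant time (for example, a hash lookup). If the database removes duplicates by sorting the values instead, the cost is closer to O(n log n).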
[X] Wrong: "DISTINCT instantly returns unique values without checking all rows."
[OK] Correct: Without an index on city, the database must look at each row to know whether its city is new or already seen, so it checks all rows.
Understanding how DISTINCT works helps you explain query performance clearly and shows you know how databases handle data.
"What if we added an index on the city column? How would the time complexity change?"
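One way to reason about this question: an index keeps the city values in sorted order, so all copies of a value sit next to each other and can be skipped in one jump. The sketch below models that with a sorted Python list and a binary search. It is an illustration under that assumption, not MySQL's implementation. MySQL's documentation describes a similar strategy called a loose index scan, but whether the optimizer picks it depends on the query and the data. The name `distinct_from_sorted` and the sample data are hypothetical.

```python
from bisect import bisect_right


def distinct_from_sorted(sorted_cities):
    """Collect unique values from sorted data by jumping past each run of duplicates."""
    unique = []
    jumps = 0
    pos = 0
    while pos < len(sorted_cities):
        value = sorted_cities[pos]
        unique.append(value)
        # Binary search for the first position after the last copy of value.
        pos = bisect_right(sorted_cities, value, pos)
        jumps += 1
    return unique, jumps


# Hypothetical index contents: 1000 entries, only 3 distinct cities.
index = sorted(["Lima"] * 400 + ["Oslo"] * 250 + ["Paris"] * 350)
print(distinct_from_sorted(index))  # (['Lima', 'Oslo', 'Paris'], 3)
```

With k distinct values among n entries, this model performs k jumps of about log n steps each, so the work is roughly O(k log n) rather than O(n). When nearly every city is different, k approaches n and the advantage disappears.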