String collation and sort order in PostgreSQL - Time & Space Complexity
When sorting text data, the order depends on collation rules. Understanding how sorting time grows helps us know how fast queries run as data grows.
We want to see how sorting strings by collation affects the work done as more rows are sorted.
Analyze the time complexity of the following SQL query.
SELECT name
FROM users
ORDER BY name COLLATE "en_US";
This query sorts the 'name' column of the 'users' table using the English (US) collation rules.
Sorting involves comparing pairs of strings many times.
- Primary operation: Comparing two strings according to collation rules.
- How many times: Multiple comparisons happen, roughly proportional to the number of rows times the log of the number of rows.
As the number of rows grows, the number of comparisons grows faster than the number of rows but slower than the square of rows.
| Input Size (n) | Approx. Comparisons |
|---|---|
| 10 | About 30 to 40 comparisons |
| 100 | About 600 to 700 comparisons |
| 1000 | About 10,000 to 12,000 comparisons |
Pattern observation: The comparisons grow a bit faster than the number of rows, roughly like rows times log rows.
Time Complexity: O(n log n)
This means sorting strings takes more time as the list grows, but not as much as checking every pair; it grows in a balanced way.
[X] Wrong: "Sorting strings is just as fast as scanning them once."
[OK] Correct: Sorting needs many comparisons between strings, which takes more time than just reading them once.
Knowing how sorting time grows helps you explain query performance clearly and shows you understand how databases handle text data efficiently.
"What if we sorted by a numeric column instead of a string column? How would the time complexity change?"