Suffix arrays in Data Structures Theory - Time & Space Complexity
Suffix arrays help us quickly find patterns in text by sorting all endings of a string.
We want to know how the time needed to build and use suffix arrays grows as the text gets longer.
Analyze the time complexity of the following suffix array construction code.
function buildSuffixArray(text) {
let suffixes = [];
for (let i = 0; i < text.length; i++) {
suffixes.push({index: i, suffix: text.slice(i)});
}
suffixes.sort((a, b) => a.suffix.localeCompare(b.suffix));
return suffixes.map(item => item.index);
}
This code creates all suffixes of the text, sorts them alphabetically, and returns their starting positions.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Creating and sorting all suffixes of the string.
- How many times: There are as many suffixes as the length of the text (n), and sorting compares these suffixes multiple times.
As the text length grows, the number of suffixes grows linearly, but sorting them involves comparing many characters.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 100 comparisons of suffixes, each up to length 10 |
| 100 | About 10,000 comparisons, each up to length 100 |
| 1000 | About 1,000,000 comparisons, each up to length 1000 |
Pattern observation: The work grows much faster than the input size because sorting compares many pairs of suffixes, each possibly long.
Time Complexity: O(n² log n)
This means the time needed grows roughly like the square of the text length times a logarithm, making it slower for very long texts.
[X] Wrong: "Sorting suffixes is just like sorting numbers, so it should be O(n log n)."
[OK] Correct: Each suffix is a string that can be very long, so comparing them takes extra time, making sorting much slower than simple number sorting.
Understanding suffix arrays and their time complexity shows you can analyze complex data structures and their costs, a valuable skill in many coding challenges.
"What if we used a more advanced algorithm that builds the suffix array in O(n) time? How would that change the time complexity?"