Highlighting with ts_headline in PostgreSQL - Time & Space Complexity
We want to understand how the time taken by ts_headline changes as the text size grows.
How does the work needed to highlight keywords grow when the input text gets bigger?
Analyze the time complexity of the following PostgreSQL snippet using ts_headline.
SELECT ts_headline(
'english',
document_text,
to_tsquery('english', 'database & query')
) AS highlighted_text
FROM documents
WHERE to_tsvector('english', document_text) @@ to_tsquery('english', 'database & query');
This code highlights search terms in the document text for matching rows.
Look for repeated work inside the function and query.
- Primary operation: Scanning the entire document text to find and mark keywords.
- How many times: Once per matching document, the function processes the full text.
As the document text gets longer, the work to find and highlight keywords grows roughly in proportion to the text length.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 words | About 10 checks to find keywords |
| 100 words | About 100 checks |
| 1000 words | About 1000 checks |
Pattern observation: The work grows linearly as the text length increases.
Time Complexity: O(n)
This means the time to highlight grows directly with the size of the text being processed.
[X] Wrong: "Highlighting keywords is instant no matter how big the text is."
[OK] Correct: The function must scan the whole text to find keywords, so bigger text means more work and more time.
Understanding how text processing time grows helps you explain performance in real applications that search and highlight text.
What if we changed the query to highlight multiple keywords instead of just two? How would the time complexity change?