How to Use tsvector in PostgreSQL for Full-Text Search
In PostgreSQL, tsvector is a data type that stores preprocessed, searchable text for full-text search. You populate a tsvector column by converting text with to_tsvector(), which breaks the text into normalized tokens. Query it with the @@ operator and to_tsquery() to find matching text efficiently.
Syntax
The tsvector type stores a sorted list of distinct lexemes (words normalized for searching), optionally with their positions in the text. You create it using the to_tsvector() function, which takes a text input and converts it into a searchable vector.
Basic syntax:
- to_tsvector('config', 'text') - converts text to a tsvector using a text search configuration (like 'english').
- tsvector_column @@ to_tsquery('query') - checks whether the tsvector matches the search query.
```sql
SELECT to_tsvector('english', 'The quick brown fox jumps over the lazy dog');
```
Output
```
'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2
```
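To make the role of the configuration and of the tsquery operators more concrete, the following sketch reuses the same sentence. It assumes a stock PostgreSQL installation with the built-in 'simple' and 'english' configurations; the column aliases are purely illustrative, and the `<->` operator requires PostgreSQL 9.6 or later.

```sql
-- The built-in 'simple' configuration only lowercases words:
-- no stemming and no stop-word removal.
SELECT to_tsvector('simple', 'The quick brown fox jumps over the lazy dog');
-- 'brown':3 'dog':9 'fox':4 'jumps':5 'lazy':8 'over':6 'quick':2 'the':1,7

-- Evaluate several queries against one 'english' vector.
WITH doc AS (
    SELECT to_tsvector('english',
                       'The quick brown fox jumps over the lazy dog') AS v
)
SELECT v @@ to_tsquery('english', 'jumping & fox')   AS and_match,     -- true: 'jumping' stems to 'jump'
       v @@ to_tsquery('english', 'cat | dog')       AS or_match,      -- true: either term is enough
       v @@ to_tsquery('english', 'fox & !dog')      AS not_match,     -- false: 'dog' is present
       v @@ to_tsquery('english', 'quick <-> brown') AS phrase_match   -- true: adjacent lexemes
FROM doc;

-- plainto_tsquery() accepts plain text and joins the words with &.
SELECT plainto_tsquery('english', 'lazy dogs');
-- 'lazi' & 'dog'
```

Because the query goes through the same configuration as the document, stemmed forms line up on both sides of @@. For raw user input, plainto_tsquery() is usually the safer starting point, since it does not require operator syntax.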
Example
This example shows how to create a table with a tsvector column, insert text data, and query it using full-text search.
```sql
CREATE TABLE articles (
    id SERIAL PRIMARY KEY,
    title TEXT,
    body TEXT,
    document_with_weights tsvector
);

-- Insert a row, building the tsvector from the title and the body
INSERT INTO articles (title, body, document_with_weights)
VALUES (
    'PostgreSQL Tutorial',
    'Learn how to use PostgreSQL full-text search.',
    to_tsvector('english', 'PostgreSQL Tutorial') ||
    to_tsvector('english', 'Learn how to use PostgreSQL full-text search.')
);

-- Query using full-text search, with the same configuration as the stored vector
SELECT id, title
FROM articles
WHERE document_with_weights @@ to_tsquery('english', 'full & text & search');
```
Output
```
 id |        title
----+---------------------
  1 | PostgreSQL Tutorial
```
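Searching a tsvector column scans every row unless the column is indexed. As a minimal sketch, assuming the articles table from this example, a GIN index can be added as follows; the index name articles_document_idx is a made-up placeholder.

```sql
-- GIN is the index type generally recommended for tsvector columns.
CREATE INDEX articles_document_idx
    ON articles
    USING GIN (document_with_weights);

-- The query itself does not change; EXPLAIN shows the plan the planner chose.
EXPLAIN
SELECT id, title
FROM articles
WHERE document_with_weights @@ to_tsquery('english', 'full & text & search');
```

On a table with only a few rows the planner will probably still choose a sequential scan, so the index only shows up in the plan once the table has grown.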
Common Pitfalls
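Common mistakes when using tsvector include:
- Not updating the tsvector column when the source text changes, which leaves stale search data.
- Using to_tsquery() without understanding its syntax, leading to syntax errors or no matches. Terms must be joined with operators such as & or |, so plain multi-word input like 'full text' is rejected.
- Ignoring the text search configuration, which controls stemming and stop words. The document and the query should use the same configuration.
Always keep tsvector columns in sync with the source text and build queries with valid tsquery syntax.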
```sql
/* Wrong: the tsvector is not updated after the text changes */
UPDATE articles
SET body = 'Updated text with new content'
WHERE id = 1;
-- document_with_weights still reflects the old body, so searches miss the new words

/* Right: update the tsvector in the same statement */
UPDATE articles
SET body = 'Updated text with new content',
    document_with_weights = to_tsvector('english', title) ||
                            to_tsvector('english', 'Updated text with new content')
WHERE id = 1;
```
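Manual updates are easy to forget. One way to avoid the problem, sketched here under the assumption of PostgreSQL 12 or later, is a stored generated column that PostgreSQL recomputes on every write. The column name document_generated is hypothetical and is added alongside the existing column rather than replacing it.

```sql
-- A stored generated column is recomputed automatically whenever
-- title or body changes. coalesce() guards against NULL inputs,
-- which would otherwise make the whole expression NULL.
ALTER TABLE articles
    ADD COLUMN document_generated tsvector
    GENERATED ALWAYS AS (
        to_tsvector('english', coalesce(title, '') || ' ' || coalesce(body, ''))
    ) STORED;

-- Changing only the text is now enough.
UPDATE articles
SET body = 'Updated text with new content'
WHERE id = 1;

-- The new words are searchable without touching the tsvector by hand.
SELECT id, title
FROM articles
WHERE document_generated @@ to_tsquery('english', 'updated & content');
```

The configuration must be passed explicitly, as above, because a generated column requires an immutable expression. On versions before 12, a trigger would be needed to get the same effect.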
Quick Reference
| Function/Operator | Description |
|---|---|
| to_tsvector('config', text) | Converts text to a tsvector for full-text search |
| to_tsquery('config', query) | Creates a tsquery to search tsvector data |
| tsvector_column @@ to_tsquery() | Tests if tsvector matches the query |
| setweight(tsvector, weight) | Assigns a weight ('A', 'B', 'C', or 'D') to lexemes for ranking |
| ts_rank(tsvector, tsquery) | Ranks search results by relevance |
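The last two entries in the table are typically used together. The sketch below assumes the articles table from the example above: it gives title lexemes weight 'A' and body lexemes weight 'B', then orders matches by ts_rank(). The search terms are arbitrary, and the UPDATE has no WHERE clause, so it rewrites every row.

```sql
-- Rebuild the stored vector so that title matches count more than body matches.
UPDATE articles
SET document_with_weights =
        setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(body, '')), 'B');

-- Rank the matching rows; a higher rank means a more relevant result.
SELECT id,
       title,
       ts_rank(document_with_weights, query) AS rank
FROM articles,
     to_tsquery('english', 'postgresql & search') AS query
WHERE document_with_weights @@ query
ORDER BY rank DESC;
```

Building the tsquery once in the FROM clause lets the same value be used for both filtering and ranking.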
Key Takeaways
Use to_tsvector() to convert text into a searchable tsvector.
Keep tsvector columns updated when source text changes to ensure accurate search.
Use the @@ operator with to_tsquery() to perform full-text search queries.
Choose the right text search configuration for language-specific stemming and stop words.
Use weights and ranking functions to improve search result relevance.