
How to Use to_tsvector in PostgreSQL for Full-Text Search

In PostgreSQL, use to_tsvector to convert plain text into a searchable document vector by breaking it into lexemes. This function prepares text for full-text search by normalizing and tokenizing it, making it easy to match search queries.

Syntax

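The to_tsvector function converts text into a value of type tsvector, which is a sorted list of distinct lexemes (normalized words) with their positions in the text. It accepts an optional text search configuration that selects the language rules for stemming and stop words; when the configuration is omitted, the default_text_search_config setting is used.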

  • to_tsvector([config,] text)
  • config: Optional text search configuration (e.g., 'english').
  • text: The input string to be processed.
sql
to_tsvector('english', 'The quick brown fox jumps over the lazy dog')
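
To illustrate what the optional config changes, here is a minimal sketch comparing two built-in configurations, 'simple' and 'english', on the same input. The sample sentence is only an illustration; the default reported by SHOW depends on how your server was set up.

```sql
-- Show which configuration the one-argument form would use in this session.
SHOW default_text_search_config;

-- 'simple' lowercases tokens but does no stemming and removes no stop words.
SELECT to_tsvector('simple', 'The quick brown fox jumps');
-- 'brown':3 'fox':4 'jumps':5 'quick':2 'the':1

-- 'english' drops the stop word 'the' and stems 'jumps' to 'jump'.
SELECT to_tsvector('english', 'The quick brown fox jumps');
-- 'brown':3 'fox':4 'jump':5 'quick':2
```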

Example

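This example shows how to_tsvector converts a sentence into lexemes for full-text search. It stems words ('jumps' becomes 'jump', 'lazy' becomes 'lazi') and removes stop words such as 'the' and 'over'. The number after each lexeme is the word's position in the original text, which is why positions 1, 6, and 7 are missing from the output.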

sql
SELECT to_tsvector('english', 'The quick brown fox jumps over the lazy dog');
Output
'brown':3 'dog':9 'fox':4 'jump':5 'lazi':8 'quick':2
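
To see how each token is handled on the way to that result, the built-in ts_debug function can help. The sketch below uses a shortened version of the sentence and selects only three of the columns ts_debug returns; treat it as an exploratory aid rather than something to use in application queries.

```sql
-- One row per token (including whitespace tokens).
-- For stop words such as 'The', the lexemes column is an empty array,
-- which is why they do not appear in the tsvector.
SELECT alias, token, lexemes
FROM ts_debug('english', 'The quick brown fox jumps');
```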

Common Pitfalls

Common mistakes include:

  • Not specifying the correct language configuration, which affects stemming and stop words.
  • Passing NULL, which returns NULL rather than an empty tsvector (an empty string does return an empty tsvector). Wrap nullable columns in coalesce(column, '').
  • Expecting to_tsvector to perform search; it only prepares text for searching.
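
The difference between empty and NULL input can be checked directly. This is a small sketch using literal values; in practice the NULL would usually come from a nullable column.

```sql
-- An empty string yields an empty tsvector (no lexemes).
SELECT to_tsvector('english', '') AS empty_input;

-- A NULL argument yields NULL, not an empty tsvector.
SELECT to_tsvector('english', NULL::text) IS NULL AS null_in_null_out;  -- true

-- coalesce turns NULL into '' so the result is an empty tsvector instead.
SELECT to_tsvector('english', coalesce(NULL::text, '')) AS coalesced;
```

This matters most when several columns are concatenated before being passed to to_tsvector, because a single NULL makes the whole concatenation NULL.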

To actually search, match the tsvector against a tsquery built with to_tsquery or plainto_tsquery, using the @@ operator.

sql
/* Risky: with no config, the session's default_text_search_config is used, which might not suit your language */
SELECT to_tsvector('The quick brown fox');

/* Right: Specify language for proper stemming */
SELECT to_tsvector('english', 'The quick brown fox');
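
As a minimal sketch of the search step itself, the @@ operator tests whether a tsvector matches a tsquery. The search terms below are made up for illustration.

```sql
-- to_tsquery normalizes the search terms with the same configuration,
-- so 'jumping' and 'dogs' are reduced to the lexemes 'jump' and 'dog'.
SELECT to_tsvector('english', 'The quick brown fox jumps over the lazy dog')
       @@ to_tsquery('english', 'jumping & dogs') AS matches;   -- true

-- plainto_tsquery accepts unformatted text and ANDs the terms together.
-- 'cat' is not in the document, so the match fails.
SELECT to_tsvector('english', 'The quick brown fox jumps over the lazy dog')
       @@ plainto_tsquery('english', 'lazy cat') AS matches;    -- false
```

to_tsquery expects operators such as & and | between terms, while plainto_tsquery is the more forgiving choice for raw user input.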

Quick Reference

Parameter | Description
config | Optional text search configuration such as 'english' or 'simple'
text | Input string to convert into a tsvector
Return type | tsvector, a searchable document vector
Usage | Prepare text for full-text search queries

Key Takeaways

Use to_tsvector to convert text into searchable lexemes for full-text search.
Specify the correct language configuration to get proper stemming and stop word removal.
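to_tsvector only prepares text; match its result against to_tsquery or plainto_tsquery with the @@ operator to search.
An empty string returns an empty tsvector; NULL input returns NULL.
Use the same configuration in to_tsvector for stored text and in to_tsquery for search terms so that both sides are stemmed the same way.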
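
One possible way to keep stored text and search terms on the same configuration is to compute the tsvector once in a stored column and query it with a tsquery built from the same config. The sketch below is only an illustration: the articles table, its columns, and the index name are hypothetical, and generated columns require PostgreSQL 12 or later.

```sql
-- Hypothetical table: body_tsv is derived from body with the 'english' config.
CREATE TABLE articles (
    id       bigserial PRIMARY KEY,
    body     text,
    body_tsv tsvector GENERATED ALWAYS AS
             (to_tsvector('english', coalesce(body, ''))) STORED
);

-- A GIN index lets @@ searches avoid scanning every row.
CREATE INDEX articles_body_tsv_idx ON articles USING GIN (body_tsv);

INSERT INTO articles (body)
VALUES ('The quick brown fox jumps over the lazy dog');

-- The query side uses the same 'english' config, so 'dogs' matches 'dog'.
SELECT id, body
FROM articles
WHERE body_tsv @@ plainto_tsquery('english', 'lazy dogs');
```

The two-argument form of to_tsvector is used here on purpose: generated columns and expression indexes need the configuration named explicitly.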