← Search · beginner · 10 min · 02 / 06 বাংলা

Postgres Full-Text Search

tsvector, GIN indexes, ranking, highlighting, and when Postgres search is all you need.

PostgreSQLfull-text searchtsvectorGINts_rankts_headline

Real-World Analogy

A reference librarian who also manages the card catalogue: they know the collection inside out, can search it intelligently, and you don’t need a separate specialist. Postgres full-text search is that librarian — already in your stack, no extra service to run, no sync to maintain. Good enough for most applications.

When Postgres FTS Is Enough

Use Postgres full-text search when:

Your searchable data is already in Postgres
You need basic keyword search (not typo tolerance, not ML-based ranking)
You don’t need faceted search with real-time facet counts
Your document corpus is < 10M rows
Operational simplicity matters more than search feature richness

Use a dedicated search engine (Meilisearch, Elasticsearch) when:

You need typo tolerance out of the box
You need facets with counts and filtering
You need sub-100ms search at high QPS on large corpora
You need relevance tuning, synonyms, or personalization

For most SaaS products, Postgres FTS handles the “search my data” use case adequately.

Schema Design for Search

-- Articles table with generated search vector
CREATE TABLE articles (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  title       TEXT NOT NULL,
  body        TEXT NOT NULL,
  author      TEXT,
  tags        TEXT[],
  published_at TIMESTAMPTZ,
  -- Generated tsvector — automatically updated on title/body change
  search_vector TSVECTOR GENERATED ALWAYS AS (
    setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
    setweight(to_tsvector('english', coalesce(author, '')), 'B') ||
    setweight(to_tsvector('english', coalesce(body, '')), 'C')
  ) STORED
);

-- GIN index on the vector
CREATE INDEX articles_search_idx ON articles USING GIN(search_vector);

setweight assigns importance: ‘A’ > ‘B’ > ‘C’ > ‘D’. A match in the title (weight A) scores higher than the same match in the body (weight C). This is how you implement “title matches matter more.”

Basic Search Queries

-- Simple keyword search
SELECT id, title, published_at
FROM articles
WHERE search_vector @@ to_tsquery('english', 'database')
ORDER BY published_at DESC
LIMIT 20;

-- Multiple terms (AND)
SELECT id, title
FROM articles
WHERE search_vector @@ to_tsquery('english', 'database & performance')
ORDER BY published_at DESC;

-- Multiple terms (OR)
SELECT id, title
FROM articles
WHERE search_vector @@ to_tsquery('english', 'database | cache')
ORDER BY published_at DESC;

-- Phrase search (adjacent terms)
SELECT id, title
FROM articles
WHERE search_vector @@ phraseto_tsquery('english', 'query optimization');

-- Prefix search (term begins with)
SELECT id, title
FROM articles
WHERE search_vector @@ to_tsquery('english', 'datab:*');
-- Matches: database, databases, databas

Handling User Input Safely

User input can’t be fed directly into to_tsquery — it’ll throw on special characters:

-- WRONG — crashes on "database & !"
WHERE search_vector @@ to_tsquery('english', user_input)

-- RIGHT — plainto_tsquery normalizes arbitrary input
WHERE search_vector @@ plainto_tsquery('english', user_input)
-- "database performance" → 'databas' & 'perform'

-- websearch_to_tsquery — Google-style syntax (Postgres 11+)
WHERE search_vector @@ websearch_to_tsquery('english', user_input)
-- Supports: "exact phrase", -exclude, OR
-- "database performance" -cache → 'databas' & 'perform' & !'cache'

// Application code
async function searchArticles(query: string, limit = 20): Promise<Article[]> {
	const result = await db.query(
		`SELECT id, title, published_at,
            ts_rank(search_vector, query) AS rank
     FROM articles,
          websearch_to_tsquery('english', $1) query
     WHERE search_vector @@ query
     ORDER BY rank DESC
     LIMIT $2`,
		[query, limit]
	);
	return result.rows;
}

Ranking

-- ts_rank: basic ranking by term frequency in document
SELECT title, ts_rank(search_vector, query) AS rank
FROM articles, to_tsquery('english', 'database') query
WHERE search_vector @@ query
ORDER BY rank DESC;

-- ts_rank_cd: considers term proximity (cover density ranking)
SELECT title, ts_rank_cd(search_vector, query) AS rank
FROM articles, to_tsquery('english', 'database & performance') query
WHERE search_vector @@ query
ORDER BY rank DESC;

-- Boost by recency (combine text rank with time signal)
SELECT title,
  ts_rank(search_vector, query) *
  (1.0 / (1 + EXTRACT(EPOCH FROM (NOW() - published_at)) / 86400)) AS boosted_rank
FROM articles, websearch_to_tsquery('english', 'database') query
WHERE search_vector @@ query
ORDER BY boosted_rank DESC
LIMIT 20;

The recency boost: 1 / (1 + days_old) gives a recent article a multiplier > 1, an old one < 1. Adjust the denominator to tune decay speed.

Highlighting (Snippets)

Show users which part of the document matched:

SELECT
  title,
  ts_headline(
    'english',
    body,
    query,
    'StartSel=<mark>, StopSel=</mark>, MaxWords=30, MinWords=15, ShortWord=3'
  ) AS snippet
FROM articles, websearch_to_tsquery('english', 'database performance') query
WHERE search_vector @@ query
ORDER BY ts_rank(search_vector, query) DESC
LIMIT 10;

snippet:
"...optimizing <mark>database</mark> <mark>performance</mark> requires understanding
 how queries are executed and which indexes..."

ts_headline finds the most relevant excerpt and wraps matched terms in your chosen HTML tags.

Autocomplete

Prefix search for “search as you type”:

-- Index for prefix search
CREATE INDEX articles_title_trgm ON articles USING GIN(title gin_trgm_ops);
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Fast prefix autocomplete
SELECT DISTINCT title
FROM articles
WHERE title ILIKE $1 || '%'
ORDER BY title
LIMIT 10;

-- Or using ts_lexize for prefix matching in tsvector
SELECT DISTINCT title
FROM articles
WHERE search_vector @@ to_tsquery('english', $1 || ':*')
LIMIT 10;

For better autocomplete (typo-tolerant), maintain a separate terms table:

CREATE TABLE search_terms (
  term    TEXT PRIMARY KEY,
  count   INT DEFAULT 1   -- how often searched
);

-- Populate from successful searches
INSERT INTO search_terms (term, count)
VALUES ($1, 1)
ON CONFLICT (term) DO UPDATE SET count = search_terms.count + 1;

-- Autocomplete query
SELECT term
FROM search_terms
WHERE term ILIKE $1 || '%'
ORDER BY count DESC, term
LIMIT 10;

Multi-Language Support

-- English (default)
to_tsvector('english', 'The database is running')
-- 'databas':2 'run':4

-- French
to_tsvector('french', 'La base de données est rapide')
-- 'base':2 'données':4 'rapid':6

-- Auto-detect language (store language per document)
ALTER TABLE articles ADD COLUMN language regconfig DEFAULT 'english';

-- Dynamic language in vector
ALTER TABLE articles DROP COLUMN search_vector;
ALTER TABLE articles ADD COLUMN search_vector TSVECTOR
  GENERATED ALWAYS AS (
    to_tsvector(language, coalesce(title, '') || ' ' || coalesce(body, ''))
  ) STORED;

Combining with Filters

FTS integrates naturally with SQL filters:

-- Search within a category and date range
SELECT id, title, ts_rank(search_vector, query) AS rank
FROM articles, websearch_to_tsquery('english', $1) query
WHERE search_vector @@ query
  AND category = $2
  AND published_at > NOW() - INTERVAL '30 days'
ORDER BY rank DESC
LIMIT 20;

Postgres uses the GIN index for text search and the regular indexes for category/date — the query planner combines them efficiently.

Performance

-- Check index usage
EXPLAIN ANALYZE
SELECT id, title FROM articles
WHERE search_vector @@ to_tsquery('english', 'database');

-- Should show: Bitmap Index Scan on articles_search_idx
-- NOT: Seq Scan on articles

-- GIN vs GiST:
-- GIN: faster queries, slower updates, larger index
-- GiST: faster updates, slower queries, smaller index
-- Use GIN for most search workloads (reads >> writes)

For tables with > 1M rows and high query rates, consider using a materialized view with pre-computed search vectors across joined tables, refreshed periodically.

-- Materialized view for complex multi-table search
CREATE MATERIALIZED VIEW article_search AS
SELECT
  a.id,
  a.title,
  a.published_at,
  u.name AS author_name,
  setweight(to_tsvector('english', a.title), 'A') ||
  setweight(to_tsvector('english', coalesce(u.name, '')), 'B') ||
  setweight(to_tsvector('english', a.body), 'C') AS search_vector
FROM articles a
LEFT JOIN users u ON u.id = a.author_id;

CREATE INDEX article_search_vector_idx ON article_search USING GIN(search_vector);

-- Refresh after bulk imports or on a schedule
REFRESH MATERIALIZED VIEW CONCURRENTLY article_search;