RAG

Build a RAG pipeline with pgvector

pgvector turns Postgres into a vector database. If you already run Postgres, this is the lowest-friction RAG store you can pick.

Prerequisites

+Postgres 14+ (Neon, Supabase, RDS, or self-hosted)
+Ability to install extensions
+OpenAI API key for embeddings

Step-by-Step

1
Enable the extension
Run once per database. Most providers (Neon, Supabase) just require this single statement.
```
CREATE EXTENSION IF NOT EXISTS vector;
```

Create the table and index

Use HNSW for the index - it beats IVFFlat on recall vs query speed for most workloads.

CREATE TABLE docs (
  id bigserial PRIMARY KEY,
  source text,
  content text,
  embedding vector(1536)
);
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);

Embed and insert from Node

Use a real Postgres client and bind the embedding as a string-cast vector literal.

import { Pool } from 'pg';
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const e = await openai.embeddings.create({ model: 'text-embedding-3-small', input: chunk });
await pool.query('INSERT INTO docs (source, content, embedding) VALUES ($1, $2, $3)', [
  source, chunk, JSON.stringify(e.data[0].embedding),
]);

4
Query with cosine distance
The <=> operator computes cosine distance. Order ascending and limit to topK.
```
SELECT id, source, content
FROM docs
ORDER BY embedding <=> $1
LIMIT 5;
```
5
Combine with SQL filters
This is pgvector's superpower: combine vector search with arbitrary SQL WHERE clauses, joins, and full-text search.
```
SELECT * FROM docs
WHERE tenant_id = $2 AND created_at > now() - interval '30 days'
ORDER BY embedding <=> $1 LIMIT 10;
```
6
Tune index parameters
HNSW exposes m and ef_construction at index time, plus ef_search per query. Higher = more recall, slower.

Common Pitfalls

!Forgetting the index. Sequential scans on millions of vectors are unusably slow.
!Using IVFFlat without enough rows. Build the index after you load data, not before.
!Storing embeddings as float8[] instead of vector(N). You lose all the operator overloads.

From the Developers Digest stack

DevDigest Academy

Structured AI engineering courses with hands-on labs. Build production-ready apps faster.

Explore DevDigest Academy Watch on YouTube

What's Next

->Combine pgvector with Postgres full-text search for hybrid retrieval.
->Move to dedicated vector DBs only when you exceed ~10M vectors.

Glossary

Compare Tools

More Build a RAG pipeline

RAG

Step-by-Step

Enable the extension

Run once per database. Most providers (Neon, Supabase) just require this single statement.

CREATE EXTENSION IF NOT EXISTS vector;

Create the table and index

Use HNSW for the index - it beats IVFFlat on recall vs query speed for most workloads.

CREATE TABLE docs (
  id bigserial PRIMARY KEY,
  source text,
  content text,
  embedding vector(1536)
);
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);

Embed and insert from Node

Use a real Postgres client and bind the embedding as a string-cast vector literal.

import { Pool } from 'pg';
const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const e = await openai.embeddings.create({ model: 'text-embedding-3-small', input: chunk });
await pool.query('INSERT INTO docs (source, content, embedding) VALUES ($1, $2, $3)', [
  source, chunk, JSON.stringify(e.data[0].embedding),
]);

Query with cosine distance

The <=> operator computes cosine distance. Order ascending and limit to topK.

SELECT id, source, content
FROM docs
ORDER BY embedding <=> $1
LIMIT 5;

Combine with SQL filters

This is pgvector's superpower: combine vector search with arbitrary SQL WHERE clauses, joins, and full-text search.

SELECT * FROM docs
WHERE tenant_id = $2 AND created_at > now() - interval '30 days'
ORDER BY embedding <=> $1 LIMIT 10;

Tune index parameters

HNSW exposes m and ef_construction at index time, plus ef_search per query. Higher = more recall, slower.

Prerequisites

Step-by-Step

Enable the extension

Create the table and index

Embed and insert from Node

Query with cosine distance

Combine with SQL filters

Tune index parameters

Common Pitfalls

DevDigest Academy

What's Next

Glossary

Compare Tools

More Build a RAG pipeline

LangChain

LlamaIndex

Pinecone

Get Smarter About AI Dev

Prerequisites

Step-by-Step

Enable the extension

Create the table and index

Embed and insert from Node

Query with cosine distance

Combine with SQL filters

Tune index parameters

Common Pitfalls

DevDigest Academy

What's Next

Glossary

Compare Tools

More Build a RAG pipeline

LangChain

LlamaIndex

Pinecone

Get Smarter About AI Dev