Build a RAG pipeline with Pinecone
Pinecone is a managed vector database built for scale. It advertises sub-100 ms queries over billions of vectors, with no infrastructure for you to operate.
Prerequisites
- +Pinecone account and API key
- +Node 20+
- +OpenAI API key for embeddings
Step-by-Step
- 1
Create a serverless index
Serverless indexes scale automatically and bill by usage (read units, write units, and storage) rather than for provisioned capacity. Set the index dimension to match your embedding model (1536 for text-embedding-3-small).
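This step names index creation, so here is a minimal sketch of the options object you might pass to the SDK's `createIndex` call. The helper name `serverlessIndexOptions`, the cosine metric, and the `aws` / `us-east-1` placement are assumptions for illustration, not values taken from this guide.

```typescript
// Output dimensions of two OpenAI embedding models.
const EMBEDDING_DIMENSIONS = {
  'text-embedding-3-small': 1536,
  'text-embedding-3-large': 3072,
} as const;

type EmbeddingModel = keyof typeof EMBEDDING_DIMENSIONS;

// Build the options for a serverless index whose dimension matches the model.
// Assumed defaults: cosine metric, AWS us-east-1. Adjust for your project.
function serverlessIndexOptions(name: string, model: EmbeddingModel) {
  return {
    name,
    dimension: EMBEDDING_DIMENSIONS[model],
    metric: 'cosine' as const,
    spec: { serverless: { cloud: 'aws' as const, region: 'us-east-1' } },
  };
}

// With a Pinecone client `pc`, creation would then look roughly like:
//   await pc.createIndex(serverlessIndexOptions('docs', 'text-embedding-3-small'));
```

Deriving the dimension from the model name keeps the two from drifting apart, which is the first pitfall listed later in this guide.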
    pnpm add @pinecone-database/pinecone openai
- 2
Initialize the client
Use a single client instance for the lifetime of your app.
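In a larger codebase, one way to guarantee a single instance is a lazily initialized getter exported from a shared module. The sketch below is generic; the `once` helper and the `getPinecone` name in the usage comment are hypothetical and not part of the Pinecone SDK.

```typescript
// Wrap a factory so it runs at most once; later calls return the same instance.
function once<T>(factory: () => T): () => T {
  let instance: T | undefined;
  let created = false;
  return () => {
    if (!created) {
      instance = factory();
      created = true;
    }
    return instance as T;
  };
}

// Hypothetical usage in a shared module:
//   export const getPinecone = once(
//     () => new Pinecone({ apiKey: process.env.PINECONE_API_KEY! })
//   );
```

Creating the client lazily also means a missing API key surfaces on first use rather than at import time, which may or may not be what you want.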
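    import { Pinecone } from '@pinecone-database/pinecone';

    // Create one client at startup and reuse it everywhere.
    const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
    // Target the index by name; 'docs' must already exist.
    const index = pc.index('docs');
- 3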
Embed and upsert
Upsert in batches of about 100 vectors rather than sending one request per vector or one oversized request. Always attach the metadata fields you expect to filter on at query time.
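A minimal sketch of the batching described here. `toBatches`, `upsertInBatches`, and the `UpsertTarget` interface are illustrative names; the interface only assumes that the index object has an `upsert` method accepting an array of records, as the snippet in this step uses.

```typescript
// Split an array into consecutive batches of at most `size` items.
function toBatches<T>(items: readonly T[], size = 100): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// Minimal shape needed from the index (an assumption, kept structural so a
// fake can stand in for the real client during tests).
interface UpsertTarget<R> {
  upsert(records: R[]): Promise<unknown>;
}

// Send the batches one after another and report how many records were sent.
async function upsertInBatches<R>(
  index: UpsertTarget<R>,
  records: readonly R[],
  size = 100
): Promise<number> {
  let sent = 0;
  for (const batch of toBatches(records, size)) {
    await index.upsert(batch);
    sent += batch.length;
  }
  return sent;
}
```

Sending batches sequentially keeps the sketch simple; whether limited parallelism is worthwhile depends on your rate limits.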
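    import OpenAI from 'openai';

    const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

    // Embed all chunk texts in one request; the code below assumes
    // data[i] corresponds to chunks[i].
    const { data } = await openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: chunks.map((c) => c.text),
    });

    // Pair each embedding with its chunk's id and source metadata.
    await index.upsert(
      data.map((d, i) => ({
        id: chunks[i].id,
        values: d.embedding,
        metadata: { source: chunks[i].source },
      }))
    );
- 4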
Query with filters
Metadata filters restrict the search to matching records as part of the query itself, not as a post-processing step, which is critical for multi-tenant apps.
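For the multi-tenant case, a small builder can make sure every query carries the tenant constraint. This is a sketch under assumptions: `tenantId` is a hypothetical metadata field you would have to write at upsert time, and `tenantFilter` is an illustrative helper. The `$eq`, `$in`, and `$and` operators are part of Pinecone's filter syntax.

```typescript
type Filter = Record<string, unknown>;

// Build a filter that always pins the tenant and optionally restricts sources.
// `tenantId` is a hypothetical field; `source` is the field stored at upsert.
function tenantFilter(tenantId: string, sources: string[] = []): Filter {
  const clauses: Filter[] = [{ tenantId: { $eq: tenantId } }];
  if (sources.length === 1) {
    clauses.push({ source: { $eq: sources[0] } });
  } else if (sources.length > 1) {
    clauses.push({ source: { $in: sources } });
  }
  // A single clause needs no $and wrapper.
  return clauses.length === 1 ? clauses[0] : { $and: clauses };
}
```

The result can be passed as the `filter` field of a query. Routing every query through one builder makes it harder to forget the tenant clause.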
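    // Embed the query with the same model used for the documents.
    const q = await openai.embeddings.create({
      model: 'text-embedding-3-small',
      input: 'reset password',
    });

    // Return the 5 nearest matches, restricted to records from 'help-docs'.
    const res = await index.query({
      vector: q.data[0].embedding,
      topK: 5,
      filter: { source: { $eq: 'help-docs' } },
      includeMetadata: true,
    });
- 5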
Add hybrid search
Pinecone supports sparse-dense hybrid search, with the sparse vectors produced by a model such as SPLADE or by BM25. Use it when exact keyword matches matter (proper nouns, product codes).
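A common way to balance the two signals is a convex weighting: scale the dense query vector by `alpha` and the sparse one by `1 - alpha` before querying. The sketch below assumes you already have both vectors; `weightHybrid` is an illustrative name, and producing the sparse vector (SPLADE or BM25) is outside its scope. Pinecone's documentation describes sparse-dense indexes as requiring the dotproduct metric, so verify your index metric before relying on this.

```typescript
// Sparse vectors are expressed as parallel arrays of indices and values.
interface SparseVector {
  indices: number[];
  values: number[];
}

// alpha = 1 is pure dense (semantic); alpha = 0 is pure sparse (keyword).
function weightHybrid(
  dense: number[],
  sparse: SparseVector,
  alpha: number
): { dense: number[]; sparse: SparseVector } {
  if (!(alpha >= 0 && alpha <= 1)) {
    throw new RangeError('alpha must be between 0 and 1');
  }
  return {
    dense: dense.map((v) => v * alpha),
    sparse: {
      indices: [...sparse.indices],
      values: sparse.values.map((v) => v * (1 - alpha)),
    },
  };
}

// The weighted vectors would then go into the query, roughly:
//   index.query({ vector: w.dense, sparseVector: w.sparse, topK: 5 })
```

A reasonable starting point is to try a few `alpha` values against queries you already know the right answers to.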
- 6
Monitor and tune
Watch the Pinecone dashboard for read units consumed. If costs balloon, lower topK or add a cache layer in front.
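As one possible shape for the cache layer, here is a small in-memory cache with time-based expiry, keyed on normalized query text. `TtlCache` and `cacheKey` are illustrative names. A production setup would more likely use a shared store, and the TTL depends on how quickly your documents change.

```typescript
// Small in-memory cache with time-based expiry. The clock is injectable so
// expiry can be tested without waiting.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Normalize the query so trivial variations share one entry; topK is part
// of the key because it changes the result set.
function cacheKey(query: string, topK: number): string {
  return `${topK}:${query.trim().toLowerCase().replace(/\s+/g, ' ')}`;
}
```

On a hit you skip both the embedding call and the Pinecone query. If you filter by tenant or source, the filter would need to be part of the key as well.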
Common Pitfalls
- !Wrong dimension at index creation. You cannot change it without rebuilding.
- !Putting large blocks of raw text in metadata can exceed the 40KB per-record metadata limit.
- !Free pods pause after 7 days of inactivity. Use serverless if you need always-on.
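The metadata pitfall can be caught before upsert with a size check. This sketch approximates the stored size as the UTF-8 byte length of the JSON encoding and reads 40KB conservatively as 40,000 bytes. Both are assumptions, since Pinecone's exact accounting may differ. The function names are illustrative.

```typescript
// Conservative reading of the 40KB limit (assumption: 40,000 bytes).
const METADATA_LIMIT_BYTES = 40000;

// Approximate metadata size as the UTF-8 byte length of its JSON encoding.
function metadataBytes(metadata: Record<string, unknown>): number {
  return new TextEncoder().encode(JSON.stringify(metadata)).length;
}

// True when the metadata is within the limit under the approximation above.
function fitsMetadataLimit(
  metadata: Record<string, unknown>,
  limit = METADATA_LIMIT_BYTES
): boolean {
  return metadataBytes(metadata) <= limit;
}
```

Running this check in the ingestion path lets you truncate or move oversized text elsewhere instead of having the upsert rejected.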
What's Next
- ->Add a reranker to squeeze more relevance out of topK results.
- ->Set up a nightly job to re-embed docs whose content has changed since they were last embedded.
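For the nightly job, the selection step might look like the sketch below. It assumes you track two timestamps per document, `updatedAt` and `embeddedAt`; both fields and the `findStaleDocs` helper are hypothetical. Scheduling and the re-embedding itself are left out.

```typescript
interface DocRecord {
  id: string;
  updatedAt: number; // epoch ms of the last content change
  embeddedAt?: number; // epoch ms of the last embedding run, if any
}

// A doc is stale if it was never embedded or changed after its last embedding.
function findStaleDocs(docs: readonly DocRecord[]): string[] {
  return docs
    .filter((d) => d.embeddedAt === undefined || d.updatedAt > d.embeddedAt)
    .map((d) => d.id);
}
```

The returned ids can feed the same embed-and-upsert path used for initial ingestion, since upserting an existing id overwrites its record.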
