pgvector is an open-source PostgreSQL extension that adds a vector data type and vector similarity search to PostgreSQL. It lets you store embeddings as columns in a regular Postgres table and query them by cosine similarity, L2 distance, or inner product using SQL.

How fast is pgvector?

pgvector with an HNSW index handles sub-10ms query latency for datasets up to about 1 million vectors on a reasonably sized instance. At 500K vectors with a good HNSW configuration, p99 latency is typically under 5ms. Performance degrades above 1 to 5 million vectors without additional optimizations like partitioning.

Should I use HNSW or IVFFlat in pgvector?

Use HNSW for most production use cases. It does not require training, handles incremental inserts without losing recall, and gives better query performance at the same recall level. IVFFlat is only worth considering if you have very large datasets and are comfortable with the maintenance overhead of rebuilding the index periodically.

How many vectors can pgvector handle?

pgvector works well up to about 1 million vectors on a standard instance. With careful tuning and larger hardware, teams have scaled it to 5 million vectors. Above that, a dedicated vector database like Qdrant will give you significantly better performance and recall at lower cost.

Yes. pgvector is open source under the PostgreSQL License. It is free to install and use on any PostgreSQL instance. Managed services like Supabase and AWS RDS offer pgvector support, usually included in the base instance cost.

pgvector Complete Guide: Vector Search in PostgreSQL

Most teams that build their first RAG application do not need a separate vector database.

They are already running PostgreSQL. They already have backups, monitoring, and on-call procedures set up for it. Adding a new database means a new thing to operate, a new failure mode to watch, and a new bill to pay.

pgvector solves that. It adds vector similarity search directly to PostgreSQL as an extension. Your vectors live in the same database as your users, products, and orders. You can join them together in a single query. You do not need to manage anything new.

This guide covers everything you need to get pgvector working in production: installation, indexing, querying, Python integration, performance tuning, and knowing when you have outgrown it.

What pgvector Actually Is

pgvector is a PostgreSQL extension that adds three things:

A vector data type for storing high-dimensional floating-point arrays
Three distance operators for measuring similarity between vectors
Two index types (HNSW and IVFFlat) for fast approximate nearest neighbor search

Without an index, pgvector does exact search. Every query scans the entire table and calculates the distance to every stored vector. That is accurate but slow at any meaningful scale. With an HNSW index, queries use approximate nearest neighbor search and return results in milliseconds.

pgvector is open source under the PostgreSQL License. It works on any PostgreSQL 12 or later instance. Supabase, AWS RDS, Google Cloud SQL, and most managed Postgres providers support it out of the box.

Installation

If you are running Postgres locally, install pgvector using your package manager.

On Ubuntu or Debian:

bash

sudo apt install postgresql-16-pgvector

On macOS with Homebrew:

bash

brew install pgvector

From source (any system):

bash

git clone --branch v0.7.0 https://github.com/pgvector/pgvector.git
cd pgvector
make
make install

Once installed, enable the extension in your database:

sql

CREATE EXTENSION IF NOT EXISTS vector;

That is all. pgvector is now active in that database.

Creating a Table with a Vector Column

The vector(n) type takes one argument: the number of dimensions. This must match the output dimensions of your embedding model.

Common embedding dimensions:

OpenAI text-embedding-3-small: 1536
OpenAI text-embedding-3-large: 3072
Google text-embedding-004: 768
Sentence Transformers all-MiniLM-L6-v2: 384

Here is a table for storing document chunks with their embeddings:

sql

CREATE TABLE documents (
    id        BIGSERIAL PRIMARY KEY,
    content   TEXT        NOT NULL,
    source    TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    embedding vector(1536)
);

The embedding column stores a 1536-dimensional vector. Everything else is a normal Postgres column.

Inserting Vectors

You can insert vectors from SQL directly or from Python using the pgvector library.

From SQL:

sql

INSERT INTO documents (content, source, embedding)
VALUES (
    'pgvector adds vector search to PostgreSQL.',
    'blog-post',
    '[0.021, -0.134, 0.082, ...]'  -- 1536 values
);

From Python with psycopg2:

python

import psycopg2
from pgvector.psycopg2 import register_vector
import openai

# Connect and register the vector type
conn = psycopg2.connect("postgresql://user:password@localhost/mydb")
register_vector(conn)

# Get an embedding from OpenAI
client = openai.OpenAI(api_key="your-key")

def embed(text: str) -> list[float]:
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding

# Insert a document
text = "pgvector adds vector search to PostgreSQL."
embedding = embed(text)

with conn.cursor() as cur:
    cur.execute(
        "INSERT INTO documents (content, source, embedding) VALUES (%s, %s, %s)",
        (text, "blog-post", embedding)
    )
conn.commit()

From Python with SQLAlchemy:

python

from sqlalchemy import create_engine, Column, Integer, Text
from sqlalchemy.orm import declarative_base
from pgvector.sqlalchemy import Vector

Base = declarative_base()

class Document(Base):
    __tablename__ = "documents"
    id        = Column(Integer, primary_key=True)
    content   = Column(Text)
    embedding = Column(Vector(1536))

engine = create_engine("postgresql://user:password@localhost/mydb")
Base.metadata.create_all(engine)

Querying: Finding Similar Vectors

pgvector adds three distance operators:

Operator	Distance type	Best for
`<->`	L2 (Euclidean)	Normalized vectors, general use
`<=>`	Cosine distance	Text embeddings from most models
`<#>`	Negative inner product	Vectors trained with dot product objective

For text embeddings from OpenAI, Sentence Transformers, or most general-purpose models, use cosine distance (<=>).

To understand the difference, read our article on cosine similarity vs Euclidean distance.

Basic similarity query:

sql

SELECT id, content, embedding <=> '[0.021, -0.134, 0.082, ...]' AS distance
FROM documents
ORDER BY embedding <=> '[0.021, -0.134, 0.082, ...]'
LIMIT 10;

From Python:

python

def search(query: str, limit: int = 10) -> list[dict]:
    query_embedding = embed(query)

    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT id, content, source,
                   embedding <=> %s AS distance
            FROM documents
            ORDER BY embedding <=> %s
            LIMIT %s
            """,
            (query_embedding, query_embedding, limit)
        )
        rows = cur.fetchall()

    return [
        {"id": r[0], "content": r[1], "source": r[2], "distance": r[3]}
        for r in rows
    ]

results = search("how does pgvector handle approximate search?")
for r in results:
    print(f"{r['distance']:.4f}  {r['content'][:80]}")

Filtering with WHERE Clauses

One of the biggest advantages of pgvector over dedicated vector databases is that filtering is just SQL. You do not need to learn a new filter syntax.

sql

-- Only search documents from a specific source
SELECT id, content, embedding <=> %s AS distance
FROM documents
WHERE source = 'blog-post'
ORDER BY embedding <=> %s
LIMIT 10;

-- Only search documents created in the last 30 days
SELECT id, content, embedding <=> %s AS distance
FROM documents
WHERE created_at > NOW() - INTERVAL '30 days'
ORDER BY embedding <=> %s
LIMIT 10;

-- Combine filters
SELECT id, content, embedding <=> %s AS distance
FROM documents
WHERE source = 'blog-post'
  AND created_at > NOW() - INTERVAL '30 days'
ORDER BY embedding <=> %s
LIMIT 10;

One thing to know: pgvector uses post-filtering by default. It runs the ANN search first and then applies the WHERE clause. For selective filters, this can return fewer than your requested LIMIT. If you are filtering on a highly selective condition and need guaranteed k results, you may need to increase your limit and re-filter in application code.

Building an Index

Without an index, every query is an exact scan of all rows. Fast for small tables, very slow above a few thousand rows.

pgvector supports two index types: HNSW and IVFFlat.

HNSW (Recommended)

HNSW (Hierarchical Navigable Small World) is the better choice for most production use cases. You do not need to train it, it handles inserts well, and it gives strong recall at low latency.

sql

-- Cosine distance index (most common for text embeddings)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- L2 distance index
CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops);

-- Inner product index
CREATE INDEX ON documents USING hnsw (embedding vector_ip_ops);

HNSW has two parameters you can tune:

m: the number of connections per node in the graph (default 16). Higher m means better recall but more memory and slower index build.
ef_construction: how many candidates are considered during index build (default 64). Higher means better recall but slower build.

sql

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

For most RAG applications, the defaults work well. If your recall is lower than expected, increase ef_construction to 128 and rebuild.

IVFFlat

IVFFlat (Inverted File with Flat compression) divides the vector space into lists clusters and searches only the nearest clusters at query time. It requires training on your data and needs to be rebuilt when your dataset grows significantly.

sql

-- Train and build an IVFFlat index
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

The lists value is roughly the square root of your total row count. 100 lists for 10K rows. 316 lists for 100K rows. 1000 lists for 1M rows.

IVFFlat is faster to build than HNSW but needs periodic rebuilding as your data changes. Use HNSW unless you have a specific reason not to.

Tuning Query Recall

HNSW uses an ef_search parameter at query time. Higher values check more candidates and return better recall at the cost of slightly more latency.

sql

-- Set ef_search for this session
SET hnsw.ef_search = 100;

-- Now run your query
SELECT id, content, embedding <=> %s AS distance
FROM documents
ORDER BY embedding <=> %s
LIMIT 10;

The default ef_search is 40. If your top results feel off, try 80 or 100. For IVFFlat, the equivalent parameter is ivfflat.probes.

sql

SET ivfflat.probes = 10;

Full Python RAG Example

Here is a complete working example: ingest documents, index them, and retrieve relevant chunks for an LLM.

python

import psycopg2
from pgvector.psycopg2 import register_vector
import openai

conn = psycopg2.connect("postgresql://user:password@localhost/mydb")
register_vector(conn)
client = openai.OpenAI(api_key="your-key")

def embed(text: str) -> list[float]:
    return client.embeddings.create(
        input=text, model="text-embedding-3-small"
    ).data[0].embedding

def ingest(texts: list[str], source: str) -> None:
    with conn.cursor() as cur:
        for text in texts:
            cur.execute(
                "INSERT INTO documents (content, source, embedding) VALUES (%s, %s, %s)",
                (text, source, embed(text))
            )
    conn.commit()

def retrieve(query: str, limit: int = 5) -> list[str]:
    q_vec = embed(query)
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT content
            FROM documents
            ORDER BY embedding <=> %s
            LIMIT %s
            """,
            (q_vec, limit)
        )
        return [row[0] for row in cur.fetchall()]

def answer(question: str) -> str:
    chunks = retrieve(question)
    context = "\n\n".join(chunks)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Answer using this context:\n\n{context}"},
            {"role": "user", "content": question}
        ]
    )
    return response.choices[0].message.content

# Use it
ingest(["pgvector adds vector search to PostgreSQL.", "..."], source="docs")
print(answer("How do I run a vector similarity search in Postgres?"))

Performance at Scale

pgvector works well at these scale points:

Under 100K vectors: Any approach works. Even exact search is fast enough.

100K to 1M vectors: HNSW index with default settings. Expect sub-5ms query latency on a $20 to $50 per month cloud instance.

1M to 5M vectors: Still manageable with HNSW on a larger instance (8 to 16GB RAM). You may need to increase shared_buffers and work_mem in your Postgres config. The Postgres tuning guide from PGTune can generate a good starting config based on your hardware. Latency will stay reasonable but index build takes longer.

Above 5M vectors: This is where pgvector starts to show its limits. Memory pressure increases significantly because HNSW graphs are kept in memory. Query latency starts climbing. At this point, a dedicated vector database like Qdrant or Pinecone will give you better performance at lower cost.

When to Stop Using pgvector

pgvector is the right tool for a specific range of use cases. Here are the signals that tell you it is time to move to a dedicated vector database.

Your query latency is climbing. If p99 latency is above 50ms and you have already tuned ef_search and increased hardware, you have hit the limit.

You need hybrid search. pgvector has no native sparse vector support. Building hybrid search on top of pgvector requires running a separate keyword search (using Postgres full-text search) and merging results manually in application code. Dedicated databases like Qdrant handle this natively.

Filtering is returning incomplete results. Post-filtering on selective conditions produces fewer results than requested. If your application needs guaranteed k results under selective filters, a database with filtered HNSW traversal handles it correctly.

Your vector dataset is growing past 5M rows. The memory requirements for HNSW at this scale put pressure on the rest of your Postgres workload. You start trading off vector search performance against OLTP performance on the same instance.

If any of these apply to you, the guide to choosing a vector database walks through the full set of options and what to move to.

My Recommendation

Start with pgvector if you are already on PostgreSQL.

It removes the infrastructure overhead of a second database, keeps your data in one place, and handles all but the most demanding vector search workloads. The Postgres ecosystem tools you already use for migrations, backups, and monitoring work with pgvector data automatically.

Move to a dedicated vector database when your vector count exceeds 5M, your filtering patterns need guaranteed recall under selective conditions, or you need native hybrid search without building it yourself.

The transition is not difficult. pgvector and Qdrant use the same concept of embedding model outputs as stored vectors. The main work is moving data and updating query code. Waiting until you have a real reason to migrate is the right call.

Most teams that build their first RAG application do not need a separate vector database.

This guide covers everything you need to get pgvector working in production: installation, indexing, querying, Python integration, performance tuning, and knowing when you have outgrown it.

What pgvector Actually Is

pgvector is a PostgreSQL extension that adds three things:

A vector data type for storing high-dimensional floating-point arrays
Three distance operators for measuring similarity between vectors
Two index types (HNSW and IVFFlat) for fast approximate nearest neighbor search

Installation

If you are running Postgres locally, install pgvector using your package manager.

On Ubuntu or Debian:

bash

sudo apt install postgresql-16-pgvector

On macOS with Homebrew:

bash

brew install pgvector

From source (any system):

bash

git clone --branch v0.7.0 https://github.com/pgvector/pgvector.git
cd pgvector
make
make install

Once installed, enable the extension in your database:

sql

CREATE EXTENSION IF NOT EXISTS vector;

That is all. pgvector is now active in that database.

Creating a Table with a Vector Column

The vector(n) type takes one argument: the number of dimensions. This must match the output dimensions of your embedding model.

Common embedding dimensions:

OpenAI text-embedding-3-small: 1536
OpenAI text-embedding-3-large: 3072
Google text-embedding-004: 768
Sentence Transformers all-MiniLM-L6-v2: 384

Here is a table for storing document chunks with their embeddings:

sql

CREATE TABLE documents (
    id        BIGSERIAL PRIMARY KEY,
    content   TEXT        NOT NULL,
    source    TEXT,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    embedding vector(1536)
);

The embedding column stores a 1536-dimensional vector. Everything else is a normal Postgres column.

Inserting Vectors

You can insert vectors from SQL directly or from Python using the pgvector library.

From SQL:

sql

INSERT INTO documents (content, source, embedding)
VALUES (
    'pgvector adds vector search to PostgreSQL.',
    'blog-post',
    '[0.021, -0.134, 0.082, ...]'  -- 1536 values
);

From Python with psycopg2:

python

import psycopg2
from pgvector.psycopg2 import register_vector
import openai

# Connect and register the vector type
conn = psycopg2.connect("postgresql://user:password@localhost/mydb")
register_vector(conn)

# Get an embedding from OpenAI
client = openai.OpenAI(api_key="your-key")

def embed(text: str) -> list[float]:
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding

# Insert a document
text = "pgvector adds vector search to PostgreSQL."
embedding = embed(text)

with conn.cursor() as cur:
    cur.execute(
        "INSERT INTO documents (content, source, embedding) VALUES (%s, %s, %s)",
        (text, "blog-post", embedding)
    )
conn.commit()

From Python with SQLAlchemy:

python

from sqlalchemy import create_engine, Column, Integer, Text
from sqlalchemy.orm import declarative_base
from pgvector.sqlalchemy import Vector

Base = declarative_base()

class Document(Base):
    __tablename__ = "documents"
    id        = Column(Integer, primary_key=True)
    content   = Column(Text)
    embedding = Column(Vector(1536))

engine = create_engine("postgresql://user:password@localhost/mydb")
Base.metadata.create_all(engine)

Querying: Finding Similar Vectors

pgvector adds three distance operators:

Operator	Distance type	Best for
`<->`	L2 (Euclidean)	Normalized vectors, general use
`<=>`	Cosine distance	Text embeddings from most models
`<#>`	Negative inner product	Vectors trained with dot product objective

For text embeddings from OpenAI, Sentence Transformers, or most general-purpose models, use cosine distance (<=>).

To understand the difference, read our article on cosine similarity vs Euclidean distance.

Basic similarity query:

sql

SELECT id, content, embedding <=> '[0.021, -0.134, 0.082, ...]' AS distance
FROM documents
ORDER BY embedding <=> '[0.021, -0.134, 0.082, ...]'
LIMIT 10;

From Python:

python

def search(query: str, limit: int = 10) -> list[dict]:
    query_embedding = embed(query)

    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT id, content, source,
                   embedding <=> %s AS distance
            FROM documents
            ORDER BY embedding <=> %s
            LIMIT %s
            """,
            (query_embedding, query_embedding, limit)
        )
        rows = cur.fetchall()

    return [
        {"id": r[0], "content": r[1], "source": r[2], "distance": r[3]}
        for r in rows
    ]

results = search("how does pgvector handle approximate search?")
for r in results:
    print(f"{r['distance']:.4f}  {r['content'][:80]}")

Filtering with WHERE Clauses

One of the biggest advantages of pgvector over dedicated vector databases is that filtering is just SQL. You do not need to learn a new filter syntax.

sql

-- Only search documents from a specific source
SELECT id, content, embedding <=> %s AS distance
FROM documents
WHERE source = 'blog-post'
ORDER BY embedding <=> %s
LIMIT 10;

-- Only search documents created in the last 30 days
SELECT id, content, embedding <=> %s AS distance
FROM documents
WHERE created_at > NOW() - INTERVAL '30 days'
ORDER BY embedding <=> %s
LIMIT 10;

-- Combine filters
SELECT id, content, embedding <=> %s AS distance
FROM documents
WHERE source = 'blog-post'
  AND created_at > NOW() - INTERVAL '30 days'
ORDER BY embedding <=> %s
LIMIT 10;

Building an Index

Without an index, every query is an exact scan of all rows. Fast for small tables, very slow above a few thousand rows.

pgvector supports two index types: HNSW and IVFFlat.

HNSW (Recommended)

HNSW (Hierarchical Navigable Small World) is the better choice for most production use cases. You do not need to train it, it handles inserts well, and it gives strong recall at low latency.

sql

-- Cosine distance index (most common for text embeddings)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

-- L2 distance index
CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops);

-- Inner product index
CREATE INDEX ON documents USING hnsw (embedding vector_ip_ops);

HNSW has two parameters you can tune:

m: the number of connections per node in the graph (default 16). Higher m means better recall but more memory and slower index build.
ef_construction: how many candidates are considered during index build (default 64). Higher means better recall but slower build.

sql

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

For most RAG applications, the defaults work well. If your recall is lower than expected, increase ef_construction to 128 and rebuild.

IVFFlat

sql

-- Train and build an IVFFlat index
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

The lists value is roughly the square root of your total row count. 100 lists for 10K rows. 316 lists for 100K rows. 1000 lists for 1M rows.

IVFFlat is faster to build than HNSW but needs periodic rebuilding as your data changes. Use HNSW unless you have a specific reason not to.

Tuning Query Recall

HNSW uses an ef_search parameter at query time. Higher values check more candidates and return better recall at the cost of slightly more latency.

sql

-- Set ef_search for this session
SET hnsw.ef_search = 100;

-- Now run your query
SELECT id, content, embedding <=> %s AS distance
FROM documents
ORDER BY embedding <=> %s
LIMIT 10;

The default ef_search is 40. If your top results feel off, try 80 or 100. For IVFFlat, the equivalent parameter is ivfflat.probes.

sql

SET ivfflat.probes = 10;

Full Python RAG Example

Here is a complete working example: ingest documents, index them, and retrieve relevant chunks for an LLM.

python

import psycopg2
from pgvector.psycopg2 import register_vector
import openai

conn = psycopg2.connect("postgresql://user:password@localhost/mydb")
register_vector(conn)
client = openai.OpenAI(api_key="your-key")

def embed(text: str) -> list[float]:
    return client.embeddings.create(
        input=text, model="text-embedding-3-small"
    ).data[0].embedding

def ingest(texts: list[str], source: str) -> None:
    with conn.cursor() as cur:
        for text in texts:
            cur.execute(
                "INSERT INTO documents (content, source, embedding) VALUES (%s, %s, %s)",
                (text, source, embed(text))
            )
    conn.commit()

def retrieve(query: str, limit: int = 5) -> list[str]:
    q_vec = embed(query)
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT content
            FROM documents
            ORDER BY embedding <=> %s
            LIMIT %s
            """,
            (q_vec, limit)
        )
        return [row[0] for row in cur.fetchall()]

def answer(question: str) -> str:
    chunks = retrieve(question)
    context = "\n\n".join(chunks)
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Answer using this context:\n\n{context}"},
            {"role": "user", "content": question}
        ]
    )
    return response.choices[0].message.content

# Use it
ingest(["pgvector adds vector search to PostgreSQL.", "..."], source="docs")
print(answer("How do I run a vector similarity search in Postgres?"))

Performance at Scale

pgvector works well at these scale points:

Under 100K vectors: Any approach works. Even exact search is fast enough.

100K to 1M vectors: HNSW index with default settings. Expect sub-5ms query latency on a $20 to $50 per month cloud instance.

When to Stop Using pgvector

pgvector is the right tool for a specific range of use cases. Here are the signals that tell you it is time to move to a dedicated vector database.

Your query latency is climbing. If p99 latency is above 50ms and you have already tuned ef_search and increased hardware, you have hit the limit.

If any of these apply to you, the guide to choosing a vector database walks through the full set of options and what to move to.

My Recommendation

Start with pgvector if you are already on PostgreSQL.

pgvector: The Complete Guide to Vector Search in PostgreSQL (2026)

What pgvector Actually Is

Installation

Creating a Table with a Vector Column

Inserting Vectors

Querying: Finding Similar Vectors

Filtering with WHERE Clauses

Building an Index

HNSW (Recommended)

IVFFlat

Tuning Query Recall

Full Python RAG Example

Performance at Scale

When to Stop Using pgvector

My Recommendation

Krunal Kanojiya

Related Posts

pgvector: The Complete Guide to Vector Search in PostgreSQL (2026)

What pgvector Actually Is

Installation

Creating a Table with a Vector Column

Inserting Vectors

Querying: Finding Similar Vectors

Filtering with WHERE Clauses

Building an Index

HNSW (Recommended)

IVFFlat

Tuning Query Recall

Full Python RAG Example

Performance at Scale

When to Stop Using pgvector

My Recommendation

Krunal Kanojiya

Related Posts

What pgvector Actually Is

Installation

Creating a Table with a Vector Column

Inserting Vectors

Querying: Finding Similar Vectors

Filtering with WHERE Clauses

Building an Index

HNSW (Recommended)

IVFFlat

Tuning Query Recall

Full Python RAG Example

Performance at Scale

When to Stop Using pgvector

My Recommendation

Related Reading

Krunal Kanojiya

Related Posts

What pgvector Actually Is

Installation

Creating a Table with a Vector Column

Inserting Vectors

Querying: Finding Similar Vectors

Filtering with WHERE Clauses

Building an Index

HNSW (Recommended)

IVFFlat

Tuning Query Recall

Full Python RAG Example

Performance at Scale

When to Stop Using pgvector

My Recommendation

Related Reading

Krunal Kanojiya

Related Posts