ELI5 · AI & data

Vector databases.

A database built to answer "what is most similar to this?" by finding the nearest points in space, fast.

Once you turn text or images into embeddings — points in space where nearness means similar meaning — you need somewhere to store millions of them and, crucially, to ask "which stored points are closest to this new one?" A normal database is built for exact matches and ranges, not for finding nearest neighbours in hundreds of dimensions.

A vector database is built for exactly that question. It stores embeddings and is optimised to return the most similar ones to a query vector quickly, even across enormous collections. It is the storage-and-search engine behind semantic search, recommendations, and feeding relevant context to AI models.
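To make the core question concrete, here is a minimal sketch of exact similarity search in plain Python. The item names and 3-dimensional vectors are made up for illustration (real embeddings come from a model and have hundreds of dimensions), and cosine similarity is just one common way to measure nearness.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Exact search: score every stored vector, keep the k most similar."""
    scored = [(cosine_similarity(query, vec), item) for item, vec in store.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hypothetical embeddings, invented for this example.
store = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
    "truck": [0.0, 0.8, 0.3],
}

# A query vector that sits near the "cat" and "kitten" points.
results = top_k([0.85, 0.15, 0.05], store, k=2)
for score, item in results:
    print(f"{item}: {score:.3f}")
```

Every query here touches every stored vector, which is exactly the cost a vector database's index is designed to avoid.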

  1. Millions of dots to store.

    You already turned everything into points on a map. Now there are millions of them.

  2. What’s nearest to here?

    The question is always the same: which stored points sit closest to this new one?

  3. Compare all of them? No.

    Checking every dot one by one is hopeless once the map holds millions.

  4. Skip straight to the block.

    So the database builds an index that jumps to the right neighbourhood instead of scanning every dot.

  5. Close enough, way faster.

    It trades a sliver of accuracy for a huge speed-up — "good enough, almost instantly."

  6. Here are your top neighbours.

    Back come the nearest matches, each with its original item and a similarity score. This is the engine behind RAG.

A database built to answer "what is nearest to this point?" across millions, fast.

Why exact search does not scale

Finding the truly closest vector means comparing the query against every stored vector, so the work grows with the size of the collection, and each comparison costs more as the number of dimensions rises: fine for thousands, far too slow for millions. So vector databases use approximate nearest neighbour (ANN) indexes, such as graph-based HNSW or cluster-based IVF, which return the closest matches almost always, in a tiny fraction of the work, by trading a sliver of accuracy for an enormous speed-up. Tuning that accuracy-vs-speed dial is a core part of using them well.
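The accuracy-vs-speed dial can be illustrated with a toy cluster-based index, loosely in the style of an inverted-file (IVF) index. This is a sketch under simplifying assumptions, not how any particular database implements it: the centroids are just randomly sampled stored vectors (real indexes typically train them with k-means), the data is random, and the names `build_index`, `ann_search` and `n_probe` are chosen for this example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance (enough for ranking neighbours)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_index(vectors, n_lists, seed=0):
    """Pick n_lists stored vectors as centroids, then bucket every
    vector under its nearest centroid."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, n_lists)
    buckets = [[] for _ in centroids]
    for idx, vec in enumerate(vectors):
        nearest = min(range(n_lists), key=lambda c: sq_dist(vec, centroids[c]))
        buckets[nearest].append(idx)
    return centroids, buckets

def ann_search(query, vectors, index, k, n_probe):
    """Scan only the n_probe buckets whose centroids are closest to the query.
    Returns the k best candidate ids and how many vectors were compared."""
    centroids, buckets = index
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in order[:n_probe] for i in buckets[c]]
    candidates.sort(key=lambda i: sq_dist(query, vectors[i]))
    return candidates[:k], len(candidates)

def exact_search(query, vectors, k):
    """Baseline: compare the query against every stored vector."""
    return sorted(range(len(vectors)), key=lambda i: sq_dist(query, vectors[i]))[:k]

# Random 8-dimensional data, seeded so runs are repeatable.
rng = random.Random(42)
vectors = [[rng.random() for _ in range(8)] for _ in range(2000)]
query = [rng.random() for _ in range(8)]

index = build_index(vectors, n_lists=20)
truth = exact_search(query, vectors, k=10)

# Turn the dial: more buckets probed means more work and better recall.
for n_probe in (1, 5, 20):
    found, scanned = ann_search(query, vectors, index, k=10, n_probe=n_probe)
    recall = len(set(found) & set(truth)) / len(truth)
    print(f"n_probe={n_probe:2d}  scanned={scanned:4d}  recall={recall:.2f}")
```

Probing all 20 buckets is the same as exact search, so recall is perfect but nothing is saved. Probing fewer buckets compares far fewer vectors and may miss a true neighbour that landed in an unprobed bucket, which is the trade described above.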

Where they fit, and the options

Vector databases are the backbone of retrieval-augmented generation (RAG): embed your documents, store them, and at query time fetch the most relevant chunks to hand a language model as context. Choices range from adding vector search to an existing database (such as the pgvector extension for Postgres, simple if your data already lives there) to purpose-built services like Pinecone and Weaviate that specialise in scale, filtering, and managed operation. The trade-off is convenience and unified storage versus dedicated performance and features.
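The embed, store, fetch flow of RAG can be sketched in a few lines. Everything here is a stand-in: `toy_embed` is a bag-of-words counter that only matches shared words (a real system would call an embedding model), `TinyVectorStore` is a hypothetical in-memory class rather than the API of pgvector, Pinecone or Weaviate, and the documents are invented.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector.
    It captures shared words only, not meaning."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class TinyVectorStore:
    """Hypothetical in-memory store: keeps each chunk beside its vector."""

    def __init__(self):
        self.rows = []  # list of (vector, original chunk)

    def add(self, chunk):
        self.rows.append((toy_embed(chunk), chunk))

    def search(self, question, k=2):
        q = toy_embed(question)
        scored = sorted(((cosine(q, vec), chunk) for vec, chunk in self.rows), reverse=True)
        return scored[:k]

# 1. Embed and store the documents.
store = TinyVectorStore()
for chunk in [
    "Refunds are issued within 14 days of a return request.",
    "Our office is closed on public holidays.",
    "Shipping to Europe takes five to seven business days.",
]:
    store.add(chunk)

# 2. At query time, fetch the most relevant chunk.
question = "How many days do refunds take?"
hits = store.search(question, k=1)

# 3. Hand the retrieved text to a language model as context.
context = "\n".join(chunk for _, chunk in hits)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```

Swapping the toy pieces for a real embedding model and a real vector database changes the quality and scale of retrieval, but the three steps stay the same.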
