Embeddings

Vector representations of text used for semantic search and retrieval.

What it is

An embedding maps text (or images, audio, etc.) to a numeric vector, so that inputs with similar meaning tend to lie close together in the vector space.
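
To make "close together" concrete, here is a minimal sketch that compares vectors by cosine similarity. The three vectors are hand-written toy values chosen for illustration, not output from a real embedding model, and the helper name cosine_similarity is an assumption of this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-written toy vectors (assumption): a real model would produce
# hundreds or thousands of dimensions, not three.
toy_vectors = {
    "cat":    [0.9, 0.8, 0.1],
    "kitten": [0.8, 0.9, 0.2],
    "car":    [0.1, 0.2, 0.9],
}

cat_kitten = cosine_similarity(toy_vectors["cat"], toy_vectors["kitten"])
cat_car = cosine_similarity(toy_vectors["cat"], toy_vectors["car"])

print(f"cat vs kitten: {cat_kitten:.3f}")
print(f"cat vs car:    {cat_car:.3f}")
```

With these toy values, "cat" scores higher against "kitten" than against "car", which is the behavior a good embedding model aims for on real text.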

Where you use them

  • Semantic search
  • Clustering and deduplication
  • Retrieval-augmented generation (RAG)
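
Semantic search and the retrieval step of RAG both reduce to the same operation: rank stored vectors by similarity to a query vector and keep the top few. The sketch below assumes the vectors were already computed by some embedding model. The document ids, the vectors, and the function name top_k are hypothetical, and a production system would typically use a vector index rather than a linear scan.

```python
import heapq
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, corpus, k=2):
    """Return the k highest-scoring (score, doc_id) pairs, best first."""
    scored = (
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in corpus.items()
    )
    return heapq.nlargest(k, scored)

# Hypothetical precomputed embeddings (toy values, not model output).
corpus = {
    "doc_refund_policy": [0.9, 0.1, 0.0],
    "doc_return_steps":  [0.8, 0.3, 0.1],
    "doc_office_hours":  [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a query such as "how do I get my money back".
query_vec = [0.85, 0.2, 0.05]

results = top_k(query_vec, corpus, k=2)
for score, doc_id in results:
    print(f"{doc_id}: {score:.3f}")
```

In a RAG setup, the text of the returned documents would then be placed in the prompt. Deduplication can reuse the same scoring by flagging pairs whose similarity exceeds a threshold you choose.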

Gotchas

  • Embedding quality depends on both the model and your data: a model that works well on general text may do worse on specialized content.
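  • The similarity metric matters: cosine similarity ignores vector length while dot product does not, so the two can rank results differently unless the vectors are normalized to unit length.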
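
As a small illustration of the metric gotcha, the sketch below ranks the same two candidates by dot product and by cosine similarity. The vectors and names are made up for the example. The candidates are chosen so that one points in exactly the query's direction but is short, while the other is off-axis but long.

```python
import math

def dot(a, b):
    """Plain dot product; sensitive to vector magnitude."""
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))

def cosine(a, b):
    """Dot product divided by both lengths; depends on direction only."""
    return dot(a, b) / (norm(a) * norm(b))

def normalize(v):
    """Scale a vector to unit length."""
    n = norm(v)
    return [x / n for x in v]

query = [1.0, 0.0]
candidates = {
    "short_aligned": [1.0, 0.0],  # same direction as the query, small magnitude
    "long_off_axis": [3.0, 3.0],  # 45 degrees off the query, large magnitude
}

best_by_dot = max(candidates, key=lambda name: dot(query, candidates[name]))
best_by_cosine = max(candidates, key=lambda name: cosine(query, candidates[name]))

# After normalizing to unit length, dot product and cosine agree.
best_by_normalized_dot = max(
    candidates,
    key=lambda name: dot(normalize(query), normalize(candidates[name])),
)

print("dot product picks:    ", best_by_dot)
print("cosine picks:         ", best_by_cosine)
print("normalized dot picks: ", best_by_normalized_dot)
```

Dot product favors the longer vector here, while cosine favors the better-aligned one. Which behavior is appropriate depends on how the embedding model was trained, so it is worth checking the model's documentation for its recommended metric.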