Semantic Search

Master the mathematics and engineering of meaning-based retrieval systems

Part 1

The Representation Problem

Why text search is hard and how vectors help solve it

Beyond Keywords

The fundamental mismatch between how humans express meaning and how computers match strings

Meaning as Geometry

The distributional hypothesis and why context defines meaning

High-Dimensional Intuition

Why we need hundreds of dimensions and what that means geometrically

Part 2

Embedding Models

How neural networks learn to represent meaning

Word2Vec: The Mechanics

Skip-gram and CBOW: how prediction tasks create meaningful vectors

Attention and Context

How transformers compute context-dependent representations

Sentence Embeddings

Pooling strategies, contrastive learning, and what makes a good retrieval embedding

Tokenization Internals

BPE, WordPiece, and how subword tokenization affects embeddings

Part 3

Similarity Measures

The mathematics of comparing vectors

Dot Product and Cosine Similarity

Two sides of the same coin: when magnitude matters and when it doesn't

Distance Metrics

Euclidean and Manhattan distance, and why the choice matters for retrieval

The Curse of Dimensionality

Why all points become nearly equidistant in high dimensions and what to do about it

Part 4

Vector Search at Scale

Indexing and retrieval algorithms

The Brute Force Baseline

Exact nearest-neighbor search: when it works and when it doesn't

Locality-Sensitive Hashing

Random projections and hash collisions for approximate search

HNSW: The Algorithm

Hierarchical navigable small world graphs: construction and search

Quantization

Product quantization and how to compress vectors 32x with minimal loss of accuracy

Filtering and Metadata

Pre-filtering, post-filtering, and hybrid approaches

Part 5

The Retrieval Pipeline

From raw text to retrieved passages

Chunking Strategies

Fixed-size, semantic, and hierarchical chunking: trade-offs and implementations

Query Understanding

Query expansion, rewriting, and the asymmetry between queries and documents

Reranking

Cross-encoders, bi-encoders, and why two-stage retrieval works

Hybrid Retrieval

Combining BM25 and dense retrieval: reciprocal rank fusion and beyond

Part 6

RAG Systems

Retrieval-augmented generation in practice

RAG Architecture

The retrieve-then-generate pattern and its variants

Context Optimization

How much to retrieve, what to include, and prompt engineering for RAG

Evaluation Metrics

Precision, recall, MRR, NDCG: measuring retrieval quality with worked examples

Failure Modes

When RAG breaks and how to diagnose and fix retrieval problems