HNSW: The Algorithm
Hierarchical navigable small world graphs: construction and search
HNSW (Hierarchical Navigable Small World) is the dominant algorithm for approximate nearest neighbor search. It achieves logarithmic search complexity by building a multi-layer graph where coarse navigation happens at the top and fine-grained search at the bottom.
The Core Idea
Imagine searching for a location on a map. You would not start with a street map of the entire country—that would be overwhelming. Instead, you first look at a country-level map to find the region, then a regional map to find the city, then a city map to find the street, and finally a street map to find the building. Each level of zoom narrows your search.
HNSW works similarly. Higher layers contain fewer, more spread-out nodes for fast coarse navigation. Lower layers contain all nodes for precise search.
Interactive: HNSW layer structure
Graph Construction
Building an HNSW index processes vectors one at a time:
1. Assign a random layer: Each new vector is assigned a maximum layer drawn from an exponential distribution. Most vectors exist only at layer 0; fewer exist at higher layers.
2. Find entry point: Start at the top layer and navigate greedily to the node closest to the new vector.
3. Search and connect: At each layer, from the assigned layer down to 0, find the M closest neighbors and create bidirectional edges.
Interactive: Watch HNSW being built
Each new node connects to its nearest existing neighbors.
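The three steps above can be sketched as a toy insertion routine. This is a simplified sketch, not a real implementation: brute-force nearest-neighbor selection stands in for the layered candidate search, and the edge-pruning heuristics production libraries use are omitted, so neighbor lists can grow past M.

```python
import math
import random

def build_hnsw(vectors, M=4, seed=0):
    """Toy HNSW construction: random layer assignment plus M-nearest linking."""
    rng = random.Random(seed)
    m_L = 1.0 / math.log(M)
    graph = []   # graph[layer][node_id] -> list of neighbor ids

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    for node_id, vec in enumerate(vectors):
        level = int(-math.log(rng.random()) * m_L)   # step 1: random max layer
        while len(graph) <= level:
            graph.append({})
        for layer in range(level + 1):
            # Steps 2-3 (simplified): brute-force the M nearest existing
            # nodes on this layer and link in both directions.
            peers = sorted(graph[layer], key=lambda n: dist(vectors[n], vec))
            neighbors = peers[:M]
            graph[layer][node_id] = neighbors
            for n in neighbors:
                graph[layer][n].append(node_id)
    return graph

rng = random.Random(1)
vectors = [(rng.random(), rng.random()) for _ in range(50)]
graph = build_hnsw(vectors, M=4)
print(len(graph), "layers;", len(graph[0]), "nodes on layer 0")
```

Every node lands on layer 0, and every edge the routine creates is stored on both endpoints, which is what makes the greedy search below navigable in either direction.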
The layer assignment for a new node is

l = floor(-ln(U) · m_L), with U ~ uniform(0, 1)

where m_L is a normalization factor (typically m_L = 1/ln(M)). This gives exponentially decaying probability for higher layers: P(layer ≥ l) = M^(-l).
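The assignment takes only a few lines. A minimal sketch using the formula above with m_L = 1/ln(M):

```python
import math
import random

def assign_layer(M: int, rng: random.Random) -> int:
    """Sample a node's maximum layer: l = floor(-ln(U) * m_L), m_L = 1/ln(M)."""
    m_L = 1.0 / math.log(M)
    return int(-math.log(rng.random()) * m_L)

rng = random.Random(0)
layers = [assign_layer(16, rng) for _ in range(100_000)]
# About 1 - 1/M of nodes stay on layer 0; each higher layer is ~M times rarer.
print(layers.count(0) / len(layers))  # roughly 0.94 for M=16
```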
Greedy Search
Searching HNSW is a multi-phase greedy traversal:
1. Start at the top layer: Begin at the entry point (typically the node inserted at the highest layer).
2. Greedy descent: At each layer, greedily move to the neighbor closest to the query until no neighbor is closer.
3. Descend to the next layer: Use the closest node found as the entry point for the layer below.
4. Fine search at layer 0: Perform a more thorough search, maintaining a dynamic candidate list of size efSearch.
Interactive: Watch search navigate layers
The key insight: higher layers provide "long-range" connections that quickly narrow the search region. Layer 0 provides dense local connections for accurate final selection.
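The four phases can be sketched in a toy version (a sketch, not a production implementation: the layer-0 phase re-sorts a plain candidate list of size ef, standing in for efSearch, where real libraries use heap-based beams; the tiny hand-built index at the bottom is purely illustrative):

```python
def hnsw_search(query, vectors, graph, entry, ef=8, k=3):
    """Greedy descent through upper layers, then a wider layer-0 search."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Phases 1-3: greedy moves on each upper layer, top to bottom.
    current = entry
    for layer in reversed(graph[1:]):
        improved = True
        while improved:
            improved = False
            for n in layer.get(current, []):
                if dist(vectors[n], query) < dist(vectors[current], query):
                    current, improved = n, True

    # Phase 4: layer 0 with a dynamic candidate list of size ef.
    layer0 = graph[0]
    visited = {current}
    candidates = [current]
    while True:
        candidates.sort(key=lambda n: dist(vectors[n], query))
        candidates = candidates[:ef]
        frontier = {n for c in candidates for n in layer0[c] if n not in visited}
        if not frontier:
            break
        visited |= frontier
        candidates.extend(frontier)
    candidates.sort(key=lambda n: dist(vectors[n], query))
    return candidates[:k]

# Tiny hand-built index: 1-D points 0..7, a chain on layer 0 and a
# sparse "express" layer linking nodes 0, 4, and 7.
vectors = [(float(i),) for i in range(8)]
layer0 = {0: [1], 7: [6], **{i: [i - 1, i + 1] for i in range(1, 7)}}
graph = [layer0, {0: [4], 4: [0, 7], 7: [4]}]
print(hnsw_search((6.2,), vectors, graph, entry=0))  # [6, 7, 5]
```

On this toy index the express layer jumps 0 → 4 → 7 in two hops, and the layer-0 phase then refines to the true nearest neighbors, mirroring the coarse-to-fine behavior described above.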
Key Parameters
Interactive: Tune HNSW parameters
- M: Connections per node. Higher = better recall, more memory.
- efConstruction: Build thoroughness. Higher = better index, slower build.
- efSearch: Query thoroughness. Tune at runtime for recall/latency.
M (Connections per node)
M is the maximum number of bidirectional edges per node per layer. Higher M means better recall because more paths lead to correct neighbors, but it also means more memory for storing those edges and slower construction due to more distance computations. Typical values range from 8 to 64, with 16 being a common default.
efConstruction
This parameter controls how many candidates to consider when building edges. Higher efConstruction produces a better quality index at the cost of slower index building. It has no impact on memory since it only affects the construction process. Typical values range from 100 to 500, with 200 as a common default.
efSearch
This parameter controls how many candidates to track during search. Higher efSearch yields better recall at the cost of slower search. Like efConstruction, it has no impact on memory. The right value depends on your recall and latency targets—start at 50 and increase until recall is sufficient for your application.
Memory Usage
HNSW's memory footprint has three components. The vectors themselves consume n × d × 4 bytes for float32 storage. The edges consume approximately n × M × 2 × 4 bytes, since edges are bidirectional and concentrated at layer 0. Upper layers typically add 10-20% additional overhead.
For 1M vectors of 768 dimensions with M=16:
- Vectors: 3 GB
- Edges: ~128 MB
- Total: ~3.2 GB
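The arithmetic above generalizes into a quick estimator (a sketch using the same approximations as the text, with 4-byte neighbor ids; it ignores per-node metadata that real implementations add):

```python
def hnsw_memory_bytes(n: int, d: int, M: int, upper_overhead: float = 0.15):
    """Estimate HNSW memory: float32 vectors + bidirectional layer-0 edges."""
    vectors = n * d * 4                         # float32 vector storage
    edges = n * M * 2 * 4                       # bidirectional edges, 4-byte ids
    edges = int(edges * (1 + upper_overhead))   # upper layers: ~10-20% extra
    return {"vectors": vectors, "edges": edges, "total": vectors + edges}

est = hnsw_memory_bytes(1_000_000, 768, 16)
print(round(est["total"] / 1e9, 2), "GB")  # ~3.2 GB, matching the example above
```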
Why HNSW Works
The algorithm exploits several properties that combine to make it effective.
The navigable small world property ensures that, with high probability, a short path of roughly log n hops exists between any two nodes. This is a well-studied property of certain random graph structures.
The hierarchical structure provides long-range links at higher layers for fast coarse navigation, while dense local links at layer 0 enable accurate fine search. Together they span scales from global to local.
Greedy efficiency means local greedy moves usually lead toward the global optimum because the graph is well-connected. You do not need expensive global search; local decisions suffice.
Probabilistic guarantees from the random layer assignment ensure good coverage without explicit global coordination. No central planner decides which nodes go where—randomness provides statistical guarantees.
Complexity Analysis
Search time: O(log n) expected hops. In practice, a query at 1M scale evaluates only a tiny fraction of the vectors, since each hop inspects at most M neighbors.
Construction time: O(n log n). Each insertion requires O(log n) search plus O(M × efConstruction) distance computations.
Memory: O(n × M) for edges, plus O(n × d) for vectors.
Comparison with Other Algorithms
| Algorithm | Search | Build | Memory | Dynamic |
|---|---|---|---|---|
| HNSW | O(log n) | O(n log n) | High | Yes |
| IVF | O(n/k) | O(n) | Low | Partial |
| LSH | O(1) avg | O(n) | Medium | Yes |
| ANNOY | O(log n) | O(n log n) | Medium | No |
HNSW excels at dynamic updates where you need to add or remove vectors without rebuilding the entire index, at achieving high recall at low latency, and at production workloads where reliability matters. The trade-off is higher memory consumption compared to alternatives like IVF.
Practical Tips
Start with defaults: M=16 and efConstruction=200 work well for most cases. You can tune later once you understand your specific workload.
Tune efSearch at query time since it is the easiest knob to adjust for recall/latency trade-offs. Unlike M and efConstruction, you can change efSearch without rebuilding the index.
Monitor recall on a test set. This is the only way to know if your parameters are sufficient. Abstract benchmarks tell you what is possible; your own data tells you what you need.
Pre-filter when possible. If you can apply metadata filters before the vector search, you reduce the search space and improve both speed and relevance.
Consider disk-based variants like DiskANN for datasets that exceed RAM. The algorithmic ideas are similar but the implementation handles memory pressure gracefully.
Key Takeaways
- HNSW builds a hierarchical graph where higher layers enable fast coarse navigation
- Vectors are assigned to random layers with exponentially decaying probability
- Search greedily descends through layers, narrowing the search region
- M controls connections (memory/accuracy), efConstruction controls build quality, efSearch controls query recall
- O(log n) search complexity enables millisecond latency at million-vector scale
- HNSW is the default choice for most production semantic search systems