
Adaptive DiskANN — From Laptop to Billion-Scale

Cor ships a 100% pure-Rust Adaptive DiskANN implementation that takes a single Brainy instance from the ~10 M-vector HNSW comfort zone to 1 B+ vectors on commodity hardware. MEASURED at 1 M: p50 0.85 ms / p99 5.23 ms on bxl9000 (mode=auto, in-memory). 1 B target is PROJECTED at ~5–10 ms via Mode 3 (on-disk) — the DiskANN paper's published range. It's an algorithmic alternative to HNSW, not a competitor service — it lives in the same plugin, behind the same Brainy APIs.

The "Adaptive" part: the same on-disk index file silently changes residency mode based on dataset size + available RAM:

| Mode | When auto-selects | What changes |
| --- | --- | --- |
| Mode 1 (in-memory) | Vectors × dim × 4 fits in RAM (typical up through 10 M at 128-dim, 1 M at 1536-dim) | Zero PQ compression. Full Vamana graph + vectors mmap'd into RAM. Fastest: no disk I/O on hot path. |
| Mode 2 (hybrid) | RAM-constrained but PQ centroids fit (100 M tier on commodity boxes) | PQ-compressed centroids in RAM (16–32× smaller); full vectors mmap'd from disk with page-cache promotion. ~2–3× slower than Mode 1. |
| Mode 3 (on-disk) | Billion-scale where even uncompressed vectors don't fit on commodity hardware | Maximally PQ-compressed; search walks disk via OS page cache + DiskANN's neighbour-prefetch pattern. Published 5 ms range from the paper. |

You build the index once with cor 3.0 defaults; as your dataset grows over time, the same file changes its residency mode under the hood. No re-index, no API change, no perf cliff at boundaries.
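
The mode table above can be read as a simple threshold check. The following is a minimal sketch of that reading, not cor's actual selection logic: the names `ResidencyMode`, `DatasetShape`, and `selectResidencyMode` are hypothetical, and the thresholds assume that "fits in RAM" means "is no larger than a caller-supplied RAM budget".

```typescript
// Hypothetical sketch: these names and thresholds are illustrative, not cor's API.
type ResidencyMode = 'in-memory' | 'hybrid' | 'on-disk';

interface DatasetShape {
  nodeCount: number; // number of vectors
  dim: number;       // float32 components per vector
  pqM: number;       // PQ code bytes per vector
}

function selectResidencyMode(shape: DatasetShape, ramBudgetBytes: number): ResidencyMode {
  const fullVectorBytes = shape.nodeCount * shape.dim * 4; // vectors × dim × 4
  const pqCodeBytes = shape.nodeCount * shape.pqM;         // one byte per PQ subspace

  if (fullVectorBytes <= ramBudgetBytes) return 'in-memory'; // Mode 1
  if (pqCodeBytes <= ramBudgetBytes) return 'hybrid';        // Mode 2
  return 'on-disk';                                          // Mode 3
}
```

With a 16 GiB budget and 384-dim vectors, this sketch picks Mode 1 at 1 M vectors and Mode 2 at 100 M; a real selector would also have to account for the graph and the OS page cache.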

The problem DiskANN solves

HNSW is excellent up to roughly 10 M vectors per machine. Beyond that, two costs compound:

  1. Memory pressure. At 1 B vectors of 384-dim float32, the vectors alone are ~1.5 TB and the HNSW graph metadata adds another ~2 TB. Even with the cortex 2.4.0 mmap vector backend, the graph traversal pattern faults pages with no spatial locality.
  2. No locality. HNSW's insertion order has no correlation with traversal order on disk, so every search hop on a cold cache is a fresh ~10 μs page fault.

DiskANN (Subramanya et al., NeurIPS 2019) was designed for exactly this regime:

  • Vamana α-pruned graph picks neighbours so that nodes visited together during search end up adjacent on disk — disk locality emerges from construction.
  • Product Quantization compresses each vector to M ≤ 16 bytes resident in RAM. At 1 B vectors that's ~16 GB of PQ codes instead of 1.5 TB of full vectors.
  • Full vectors on disk are only touched to re-rank the top candidate set, not during the graph walk.
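
The memory figures quoted above follow from straightforward arithmetic. This short sketch reproduces them; the helper names are illustrative only.

```typescript
const BYTES_PER_F32 = 4;

// Size of the uncompressed float32 vectors.
function fullVectorBytes(nodeCount: number, dim: number): number {
  return nodeCount * dim * BYTES_PER_F32;
}

// Size of the PQ codes: one byte per subspace, pqM subspaces per vector.
function pqCodeBytes(nodeCount: number, pqM: number): number {
  return nodeCount * pqM;
}

const nodes = 1e9;
const full = fullVectorBytes(nodes, 384); // 1.536e12 bytes, about 1.5 TB
const codes = pqCodeBytes(nodes, 16);     // 1.6e10 bytes, 16 GB

console.log((full / 1e12).toFixed(3), 'TB of full vectors');
console.log((codes / 1e9).toFixed(0), 'GB of PQ codes');
console.log(full / codes, 'x reduction in resident bytes'); // 96
```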

What cor delivers

  • One contiguous file (header + codebook + PQ codes + Vamana graph + full vectors), mmap-mappable so an SSD-resident billion-vector dataset never has to be copied into RAM.
  • Zero-copy section accessors (vectors_f32() returns &[f32] directly into the mapped region).
  • Parallel build with rayon. The graph and PQ encoding both parallelize.
  • File-backed build adjacency for billion-scale construction (the in-RAM concurrent adjacency would consume ~64 GB of bookkeeping at 1 B nodes; the mmap variant uses atomic-u32 slots with sharded write locks).
  • Connectivity-repair pass that guarantees every node is reachable from the entry point. Sequential Vamana provides this via insertion order; parallel Vamana doesn't, so cor closes the gap explicitly.
  • PQ-walk + full-vector re-rank search. Greedy walk uses ADC distance over RAM-resident PQ codes (M-byte table lookups, ~50 ns per hop); re-rank scores ceil(k × paddingFactor) candidates with the exact full-precision distance.
  • HNSW-shaped TS wrapper (NativeDiskAnnWrapper) so Brainy's higher layers don't know which engine is underneath.
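
To make the PQ-walk bullet concrete, here is a minimal sketch of asymmetric distance computation (ADC) in TypeScript. It is not cor's Rust implementation: the function names, the nested-array codebook layout, and the use of squared L2 distance are all assumptions chosen for readability.

```typescript
// Squared L2 distance between two equal-length vectors.
function l2sq(a: ArrayLike<number>, b: ArrayLike<number>): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Precompute, once per query, the distance from each query sub-vector to every
// centroid of the matching subspace. codebook[m][c] is centroid c of subspace m.
function buildAdcTable(query: Float32Array, codebook: number[][][]): number[][] {
  return codebook.map((centroids, m) => {
    const dsub = centroids[0].length;
    const sub = query.subarray(m * dsub, (m + 1) * dsub);
    return centroids.map((c) => l2sq(sub, c));
  });
}

// Approximate distance to one node: M table lookups, one per code byte.
function adcDistance(table: number[][], code: Uint8Array): number {
  let d = 0;
  for (let m = 0; m < code.length; m++) d += table[m][code[m]];
  return d;
}

// Number of candidates that get an exact full-precision re-rank.
function rerankCount(k: number, paddingFactor: number): number {
  return Math.ceil(k * paddingFactor);
}
```

The table is built once per query, so each hop in the greedy walk costs only M lookups and additions; the full vectors are read only for the `rerankCount(k, paddingFactor)` best candidates.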

Scale envelope

MEASURED at 1 M on bxl9000 (Ryzen 9 7950X3D / 184 GB / NVMe), reproducible via scripts/verify-diskann.mjs. SIFT1M canonical: p50 0.86 ms / p99 1.22 ms, recall 0.9942 on the BIGANN reference dataset. 10 M / 100 M numbers below are partial-measured (SIFT10M canonical at p50 0.78 ms / p99 1.13 ms, hybrid mode). 1 B numbers are PROJECTED from the SIFT1B reference run + Subramanya et al. algorithm model.

| Vectors | RAM with DiskANN | RAM with HNSW | Mode | DiskANN search latency |
| --- | --- | --- | --- | --- |
| 1 M | 0.5–2 GB | 0.5–2 GB | Mode 1 | p50 0.85 ms / p99 5.23 ms (MEASURED, mode=auto) |
| 10 M | 1–5 GB | 8–20 GB | Mode 1 | p50 0.78 ms / p99 1.13 ms (MEASURED on SIFT10M, hybrid) |
| 100 M | 5–20 GB | 80–200 GB (impractical on single machine) | Mode 2 | 2–5 ms (PROJECTED from SIFT trends) |
| 1 B | 20–70 GB | 1.5+ TB (single-machine impossible) | Mode 3 | 5–10 ms (PROJECTED from SIFT1B trends) |

End-to-end query latency at 1 B includes filesystem hydration of the returned entities. Design target is ~100–500 ms total (dominated by the FileSystemStorage random reads + metadata lookup, PROJECTED). The roadmap to bring end-to-end query latency down to match search latency is in docs/scaling.md — all fixes ship inside cor, no external storage.

How it engages

Cor registers a 'diskann' provider; Brainy's createIndex() consults it at init:

  1. Explicit opt-in via config.index.type: 'diskann'. This forces DiskANN and throws at init if the engagement conditions in item 2 aren't satisfied.
  2. Auto-engagement when all of:
    • The cor DiskANN provider is registered (you've loaded the plugin).
    • The storage adapter exposes a local filesystem path (getBinaryBlobPath('_diskann/main')). Cloud-storage adapters return null here and stay on HNSW.
    • The metadata index has a stable idMapper (cortex 2.4.0's stable EntityIdMapper).
  3. Explicit opt-out via config.index.type: 'hnsw' keeps the historical in-memory index for the rare workload where you want it.

import { BrainyData } from '@soulcraft/brainy'
import { register as registerCor } from '@soulcraft/cor'

const brain = new BrainyData({
  storage: { type: 'filesystem', rootDirectory: '/data/idx' }
})
await registerCor(brain)
await brain.init()
// → [brainy] DiskANN engaged (path=/data/idx/_diskann/main.bin, dim=384)

const hits = await brain.search(queryVector, 10)
// MEASURED at 1 M: p50 0.85 ms / p99 5.23 ms (mode=auto on bxl9000)
// PROJECTED at 1 B: ~5–10 ms depending on cache state and SSD

All Brainy APIs — add, search, relate, searchSimilarVerbs, find — work unchanged.
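
The engagement rules in this section can be summarised as a pure decision function. This is a sketch under stated assumptions, not Brainy's createIndex() source: `EngagementInputs` and `resolveIndexType` are hypothetical names, and the sketch reads the explicit opt-in rule as "throw when the conditions are not met".

```typescript
type IndexType = 'hnsw' | 'diskann';

// Hypothetical inputs; the field names are illustrative.
interface EngagementInputs {
  requestedType?: IndexType;  // config.index.type, if set
  providerRegistered: boolean; // cor DiskANN provider loaded
  blobPath: string | null;     // result of getBinaryBlobPath('_diskann/main')
  hasStableIdMapper: boolean;  // metadata index exposes a stable idMapper
}

function resolveIndexType(inputs: EngagementInputs): IndexType {
  const eligible =
    inputs.providerRegistered && inputs.blobPath !== null && inputs.hasStableIdMapper;

  // Explicit opt-out always wins.
  if (inputs.requestedType === 'hnsw') return 'hnsw';

  // Explicit opt-in fails loudly instead of silently falling back (assumed reading).
  if (inputs.requestedType === 'diskann') {
    if (!eligible) throw new Error('DiskANN requested but engagement conditions are not met');
    return 'diskann';
  }

  // No explicit choice: auto-engage only when every condition holds.
  return eligible ? 'diskann' : 'hnsw';
}
```

A cloud-storage adapter, for example, would pass `blobPath: null` and stay on HNSW unless DiskANN is explicitly requested, in which case this sketch throws.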

Migrating an existing index

Existing HNSW-backed Brainy installs do not auto-migrate on upgrade. They keep working as-is. To convert:

const result = await brain.migrateToDiskAnn({
  recallTarget: 0.95,    // require ≥95% recall vs old index before swapping
  paddingFactor: 1.2,    // search-time over-fetch for re-rank
  verifySampleSize: 100  // sample queries for the recall check
})
// 1. Builds new index in parallel (old HNSW keeps serving)
// 2. Samples queries — compares top-k results between old and new
// 3. Aborts if recall < target; old index stays in place
// 4. Atomically swaps if recall passes

// Reversible:
await brain.migrateToHnsw()

Reversibility is a contract — production rollbacks are always available.
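
The recall gate in steps 2 to 4 amounts to comparing top-k result sets for a sample of queries. The sketch below shows one way to compute that figure; `recallAtK` and `shouldSwap` are hypothetical helpers, and treating the old index's results as ground truth is an assumption about how the check is defined.

```typescript
// Fraction of the reference (old index) top-k ids that the candidate (new index)
// also returned, averaged over all sampled queries.
function recallAtK(reference: string[][], candidate: string[][]): number {
  let hits = 0;
  let total = 0;
  reference.forEach((refIds, q) => {
    const got = candidate[q] ?? [];
    for (const id of refIds) {
      total++;
      if (got.indexOf(id) !== -1) hits++;
    }
  });
  return total === 0 ? 0 : hits / total;
}

// Swap only when measured recall meets the configured target.
function shouldSwap(recall: number, recallTarget: number): boolean {
  return recall >= recallTarget;
}
```

With verifySampleSize: 100 and k = 10, the recall estimate rests on 1,000 id comparisons, so a 0.95 target tolerates at most 50 misses across the sample.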

Tuning

Defaults match the published DiskANN paper and work well for sentence/image embedding workloads (dim 128–1024). Tune via config.index.diskann:

config.index.diskann = {
  pqM: 16,                  // PQ subspaces; dim must be divisible by pqM
  pqKsub: 256,              // centroids per subspace (8-bit codes — standard)
  maxDegree: 64,            // Vamana R (out-degree per node)
  searchListSize: 100,      // Vamana L (build-time candidate set)
  alpha: 1.2,               // α-pruning density factor
  useMmapAdjacency: true,   // file-backed build adjacency — REQUIRED at >100M nodes
  mmapAdjacencyPath: '/data/scratch/diskann-build.adj'
}

At 1 B nodes you must set useMmapAdjacency: true. The in-RAM concurrent adjacency would consume ~64 GB of bookkeeping at that scale — the mmap variant uses atomic-u32 slots with sharded write locks and bounded RAM (a few hundred mutexes for the shard table).
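
To show how alpha and maxDegree interact, here is a sketch of the RobustPrune step as described in the DiskANN paper, written in TypeScript for illustration. It is not cor's Rust code; node ids are plain numbers and the distance function is supplied by the caller.

```typescript
// Distance between two nodes, identified by id.
type Dist = (a: number, b: number) => number;

// Pick at most maxDegree out-neighbours for node p from a candidate list.
function robustPrune(
  p: number,
  candidates: number[],
  alpha: number,
  maxDegree: number,
  dist: Dist
): number[] {
  // Deduplicate, drop p itself, and order by distance to p (closest first).
  let pool = candidates
    .filter((v, i) => v !== p && candidates.indexOf(v) === i)
    .sort((a, b) => dist(p, a) - dist(p, b));

  const out: number[] = [];
  while (pool.length > 0 && out.length < maxDegree) {
    const best = pool[0];
    out.push(best);
    // Drop every candidate that `best` already covers:
    // v is removed when alpha * d(best, v) <= d(p, v).
    pool = pool.filter((v) => v !== best && alpha * dist(best, v) > dist(p, v));
  }
  return out;
}
```

A larger alpha removes fewer candidates per round, so nodes keep more long-range edges up to maxDegree; alpha = 1 prunes most aggressively.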

Dynamic writes

DiskANN graphs are build-once by design. The cor wrapper handles dynamic writes via a delta buffer pattern that mirrors FreshDiskANN (Singh et al., 2021):

  • addItem → appends to an in-memory delta map.
  • search → queries the main index AND brute-forces the delta, merges, returns top-k.
  • removeItem → tombstone bitmap, filtered out at search time.
  • rebuild() → folds the delta into a new main index, swaps atomically.

Operationally: insertions are O(1), reads stay sub-ms while the delta fits in cache, and you schedule rebuild() during off-peak windows. The delta brute-force scales linearly with delta size, so keep the delta small by rebuilding regularly.
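
The delta-buffer pattern described above can be sketched in a few lines. This is an illustrative model, not NativeDiskAnnWrapper itself: the class name `DeltaBufferedIndex`, the `MainSearch` callback, and the use of a Set of ids in place of a tombstone bitmap are assumptions, and rebuild() is omitted.

```typescript
interface Hit {
  id: string;
  distance: number;
}

// Stand-in for the immutable main index search.
type MainSearch = (query: number[], k: number) => Hit[];

function l2sq(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
  return sum;
}

class DeltaBufferedIndex {
  private delta = new Map<string, number[]>(); // pending inserts
  private tombstones = new Set<string>();      // ids removed since the last rebuild
  private mainSearch: MainSearch;

  constructor(mainSearch: MainSearch) {
    this.mainSearch = mainSearch;
  }

  addItem(id: string, vector: number[]): void {
    this.delta.set(id, vector); // O(1); a delta entry shadows any main-index copy
  }

  removeItem(id: string): void {
    this.delta.delete(id);
    this.tombstones.add(id);
  }

  search(query: number[], k: number): Hit[] {
    const hits: Hit[] = [];
    // Over-fetch so dropping tombstoned hits still leaves k candidates.
    this.mainSearch(query, k + this.tombstones.size).forEach((h) => {
      if (!this.tombstones.has(h.id) && !this.delta.has(h.id)) hits.push(h);
    });
    // Brute-force the delta: linear in delta size.
    this.delta.forEach((vector, id) => hits.push({ id, distance: l2sq(query, vector) }));
    return hits.sort((a, b) => a.distance - b.distance).slice(0, k);
  }
}
```

Because every search scans the whole delta, the cost of this design grows with the number of un-rebuilt inserts, which is why the rebuild schedule matters.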

On-disk format

Single contiguous main.bin, all little-endian:

+--------------------------------------------------------------+
| Header (4 KB, page-aligned)                                  |
|   magic="DKAN" · version · dim · node_count                  |
|   pq_m · pq_ksub · pq_dsub · max_degree · entry_point        |
+--------------------------------------------------------------+
| PQ codebook    (m × ksub × dsub × f32)                       |
+--------------------------------------------------------------+
| PQ codes       (node_count × m bytes)                        |
+--------------------------------------------------------------+
| Vamana graph   (node_count × max_degree × u32)               |
|   Fixed-degree CSR. Sentinel `u32::MAX` marks unused slots.  |
+--------------------------------------------------------------+
| Full vectors   (node_count × dim × f32)                      |
+--------------------------------------------------------------+

Fixed-degree adjacency means neighbour-offset math is O(1) — at search time graph[node] is a single seek to graph_offset + node * max_degree * 4. The mmap base is page-aligned by the OS and section offsets are 4-byte aligned by construction, so bytemuck::cast_slice reinterprets section bytes as &[f32] / &[u32] without copying.

The same layout is what the build writes and the searcher mmaps — no separate serialization step.
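
The section offsets follow directly from the header fields. The sketch below computes them in TypeScript, assuming the sections are packed back to back in the order shown and that each offset is rounded up to a multiple of 4; the actual padding rules in cor may differ, and the names `Header` and `sectionOffsets` are hypothetical.

```typescript
// Header fields needed for offset math (names mirror the diagram above).
interface Header {
  dim: number;
  nodeCount: number;
  pqM: number;
  pqKsub: number;
  pqDsub: number;
  maxDegree: number;
}

const HEADER_BYTES = 4096; // 4 KB header

// Round up to the next multiple of 4 (assumed alignment rule).
const align4 = (n: number): number => Math.ceil(n / 4) * 4;

function sectionOffsets(h: Header) {
  const codebook = HEADER_BYTES;
  const codes = align4(codebook + h.pqM * h.pqKsub * h.pqDsub * 4); // after f32 codebook
  const graph = align4(codes + h.nodeCount * h.pqM);                // after 1-byte codes
  const vectors = align4(graph + h.nodeCount * h.maxDegree * 4);    // after u32 adjacency
  const fileSize = vectors + h.nodeCount * h.dim * 4;               // after f32 vectors
  return { codebook, codes, graph, vectors, fileSize };
}

// Byte offset of a node's fixed-width neighbour list.
function neighbourOffset(graphOffset: number, node: number, maxDegree: number): number {
  return graphOffset + node * maxDegree * 4;
}
```

For 1,000 nodes at dim 384 with pqM 16, pqKsub 256, pqDsub 24, and maxDegree 64, this sketch places the graph at byte 413,312 and yields a 2,205,312-byte file.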

Why 100% Rust, no C++ FFI

Cor re-implements Vamana from the published paper rather than wrapping Microsoft's C++ reference. Reasons:

  1. Cross-platform builds for Node native modules become operationally expensive with C++ (Linux/macOS/Windows × x64/arm64 binaries, headers, link-time gotchas). napi-rs gives mature cross-platform binary distribution.
  2. License posture stays clean — pure Rust port from a published algorithm + permissive Rust deps (memmap2, bytemuck, rayon, rand, thiserror). No patent grant ambiguity.
  3. Full control over the on-disk format + napi bindings + future cor-specific optimizations.

The pure-Rust DiskANN crate (native/diskann/) compiles and tests independently of napi, so it's separately benchmarkable and fuzz-target-ready.

See also