Brainy vs Brainy + Cortex

Overall: Cortex makes Brainy 5.2x faster (geometric mean across 15 operations). It is also projected to raise the single-machine scale ceiling from ~10M to 1B+ vectors via the DiskANN engine (see Scale Ceiling).

Embedding (real model): 2.8x | Infrastructure: 7.8x

Generated 2026-02-18 | 200 entities | 384-dim vectors | model: all-MiniLM-L6-v2 | in-memory storage

At a Glance

Category	Speedup (geometric mean)	What it covers
Embedding	2.8x	Real ML model inference (all-MiniLM-L6-v2)
Data Operations	7.3x	Adding, reading, and deleting entities
Search	25.2x	Finding entities by meaning, filters, and similarity
Graph	1.3x	Querying relationships between entities
Neural	1.9x	AI-powered neighbors and clustering
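
The category figures above can be cross-checked against the per-operation speedups listed under Detailed Results. The sketch below is a minimal check, not the benchmark's own code. It uses the rounded speedups as printed, and it assumes that "Infrastructure" means every non-embedding operation, which the report does not state explicitly.

```python
from statistics import geometric_mean

# Per-operation speedups copied from the Detailed Results tables (rounded as printed).
speedups = {
    "Embedding": [1.7, 2.4, 2.1, 3.6, 3.9, 3.8],
    "Data Operations": [7.0, 12.6, 1.3, 25.0],
    "Search": [50.8, 108.3, 2.9],
    "Graph": [1.3],
    "Neural": [1.9],  # auto-clustering was skipped, so only one ratio is available
}

# Geometric mean per category.
category_means = {name: geometric_mean(vals) for name, vals in speedups.items()}

# Assumption: "Infrastructure" is every operation outside the Embedding category.
infrastructure = [s for name, vals in speedups.items() if name != "Embedding" for s in vals]
everything = [s for vals in speedups.values() for s in vals]

infrastructure_mean = geometric_mean(infrastructure)
overall_mean = geometric_mean(everything)

for name, mean in category_means.items():
    print(f"{name}: {mean:.1f}x")
print(f"Infrastructure: {infrastructure_mean:.1f}x")
print(f"Overall ({len(everything)} operations): {overall_mean:.1f}x")
```

Because the inputs are already rounded, small differences from a calculation on the raw timings would not be surprising.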

Scale Ceiling

The numbers above measure speedup at the 200-entity tier, which is the "make my queries faster" story. Cortex also moves the scale ceiling itself: with DiskANN engaged, a single Brainy instance is projected to handle workloads that were previously impossible on a single machine.

PROJECTED: design targets, not measured. The table below is extrapolated from algorithm math (Vamana / PQ memory and latency models). Measured numbers at 100M and 1B on real hardware will ship in docs/verification-report.md as part of Piece 9 of the cortex 3.0 release. Per CLAUDE.md, performance claims without a MEASURED citation must carry a PROJECTED label until verified.

Workload	Brainy alone	Brainy + Cortex (HNSW)	Brainy + Cortex (DiskANN)
1 M vectors @ 384-dim	OK, ~1 GB RAM	OK, ~1 GB RAM, 25x faster search	OK, ~0.5 GB RAM, sub-ms search
10 M vectors	OK, ~15 GB RAM, slow build	OK, ~10 GB RAM	OK, ~3 GB RAM, 1–3 ms search
100 M vectors	Out of practical reach	OK if you have 100+ GB RAM	OK on a 32 GB box, 2–5 ms search
1 B vectors	Not possible (1.5 TB RAM)	Not possible (1.5+ TB RAM)	OK on a 64 GB box, 5–10 ms search
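
The RAM figures in the table can be sanity-checked with simple arithmetic. The sketch below assumes vectors are stored as 4-byte float32 values and counts only the raw vectors, with no index overhead; the function names are illustrative and not part of Brainy or Cortex. It also shows how many bytes per vector fit in the RAM sizes quoted in the DiskANN column, which is why compressed representations are needed at that scale.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32
DIM = 384              # vector dimension used throughout the report
GB = 1e9               # decimal gigabytes


def raw_vector_bytes(num_vectors: int, dim: int = DIM) -> int:
    """Bytes needed to hold the raw vectors alone, with no index overhead."""
    return num_vectors * dim * BYTES_PER_FLOAT32


def bytes_per_vector_budget(ram_bytes: float, num_vectors: int) -> float:
    """Average number of bytes available per vector for a given RAM size."""
    return ram_bytes / num_vectors


for n in (1_000_000, 10_000_000, 100_000_000, 1_000_000_000):
    print(f"{n:>13,} vectors: {raw_vector_bytes(n) / GB:,.1f} GB raw")

# RAM sizes taken from the DiskANN column of the table.
print("raw bytes per vector:", raw_vector_bytes(1))
print("budget at 100M on 32 GB:", bytes_per_vector_budget(32 * GB, 100_000_000))
print("budget at 1B on 64 GB:", bytes_per_vector_budget(64 * GB, 1_000_000_000))
```

Under these assumptions the 10 M and 1 B rows come out at about 15 GB and 1.5 TB, in line with the table, while the 1 M row comes out nearer 1.5 GB than 1 GB.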

DiskANN auto-engages when storage is local and a stable idMapper is available — see docs/diskann.md. Cloud-storage adapters stay on HNSW transparently.
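
The engagement rule stated above amounts to a two-condition check. The sketch below is purely illustrative: `StorageInfo` and `choose_index_engine` are hypothetical names invented for this example and are not the Brainy or Cortex API, which is described in docs/diskann.md.

```python
from dataclasses import dataclass


@dataclass
class StorageInfo:
    """Hypothetical description of a storage backend, for illustration only."""
    is_local: bool
    has_stable_id_mapper: bool


def choose_index_engine(storage: StorageInfo) -> str:
    """Return "diskann" only when both stated conditions hold, otherwise "hnsw"."""
    if storage.is_local and storage.has_stable_id_mapper:
        return "diskann"
    return "hnsw"


local_disk = StorageInfo(is_local=True, has_stable_id_mapper=True)
cloud_bucket = StorageInfo(is_local=False, has_stable_id_mapper=True)

print(choose_index_engine(local_disk))
print(choose_index_engine(cloud_bucket))
```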

Embedding consistency: max element-wise diff = 1.43e-7, avg = 2.64e-8 (WASM and native produce equivalent vectors)
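
A consistency check of this kind can be sketched as follows. This is a minimal illustration of "max and average element-wise difference", assuming both engines return plain lists of floats of equal length; the toy vectors are made up and are not benchmark data, and the report does not say whether the average is taken per vector or over all test sentences.

```python
def elementwise_diff_stats(a, b):
    """Return (max, mean) of the absolute element-wise differences of two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return max(diffs), sum(diffs) / len(diffs)


# Toy stand-ins for one embedding produced by each engine (not real model output).
wasm_vec = [0.125, -0.5, 0.25]
native_vec = [0.125, -0.5000001, 0.25]

max_diff, avg_diff = elementwise_diff_stats(wasm_vec, native_vec)
print(f"max diff = {max_diff:.2e}, avg diff = {avg_diff:.2e}")
```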

Same results, just faster. Cortex accelerates Brainy without changing any answer: native output matches the JavaScript baseline byte-for-byte, enforced by a 104-test cross-language parity suite (tokenization, value normalization, string collation, SQ8 distance, top-K ranking, roaring/msgpack). See Cortex Performance → Cross-Language Consistency.

Detailed Results

Embedding

What it does	WASM	Native Rust	Speedup
Engine initialization	192 ms	115 ms	1.7x
Embed one sentence	116 ms	47 ms	2.4x
Embed 1 sentence (batch path)	116 ms	55 ms	2.1x
Embed 10 sentences	1173 ms	330 ms	3.6x
Embed 20 sentences	2371 ms	611 ms	3.9x
Embed 50 sentences	6050 ms	1577 ms	3.8x

Data Operations

What it does	Brainy	+ Cortex	Speedup
Store one entity	21 ms	3.0 ms	7.0x
Store 20 entities at once	1384 ms	110 ms	12.6x
Retrieve one entity	0.001 ms	0.001 ms	1.3x
Remove and re-insert	113 ms	4.5 ms	25.0x

Search

What it does	Brainy	+ Cortex	Speedup
Search by meaning	21 ms	0.4 ms	50.8x
Search by fields	1.9 ms	0.018 ms	108.3x
Find similar items	0.3 ms	0.1 ms	2.9x

Graph

What it does	Brainy	+ Cortex	Speedup
Query relationships	0.011 ms	0.008 ms	1.3x

Neural

What it does	Brainy	+ Cortex	Speedup
Nearest neighbors	0.001 ms	0.001 ms	1.9x
Auto-cluster items	—	—	skipped

Embedding Throughput (batch of 50)

Engine	Texts/sec	ms/text
WASM	8	121.0
Native Rust	32	31.5
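
The throughput figures follow from the batch timings. The sketch below assumes they are derived from the "Embed 50 sentences" row of the Embedding table (6050 ms for WASM, 1577 ms for native Rust), which matches the printed values; the function name is illustrative.

```python
def throughput(batch_ms: float, batch_size: int):
    """Return (texts per second, milliseconds per text) for one timed batch."""
    ms_per_text = batch_ms / batch_size
    texts_per_sec = batch_size / (batch_ms / 1000.0)
    return texts_per_sec, ms_per_text


# Batch timings taken from the "Embed 50 sentences" row.
for engine, batch_ms in (("WASM", 6050), ("Native Rust", 1577)):
    tps, ms = throughput(batch_ms, 50)
    print(f"{engine}: {tps:.0f} texts/sec, {ms:.1f} ms/text")
```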

Methodology: Infrastructure operations were measured 20 times after 3 warmup runs; embedding operations were measured 10 times after 2 warmup runs. The reported value is the median. Infrastructure benchmarks use a lightweight hash-based embedding to isolate acceleration from model differences. Embedding benchmarks use the real all-MiniLM-L6-v2 model (384-dim). Each per-operation "Speedup" is the baseline time divided by the accelerated time; category and overall speedups are the geometric mean of the per-operation ratios.
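
The measurement procedure described above can be sketched as a small timing harness. This is an illustration of "median after warmup" under stated assumptions, not the actual benchmark code: `median_ms`, `speedup`, and the two stand-in workloads are hypothetical and exist only to make the example runnable.

```python
import time
from statistics import median


def median_ms(fn, runs: int, warmup: int) -> float:
    """Call fn `warmup` times untimed, then `runs` times timed; return the median in ms."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)


def speedup(baseline_ms: float, accelerated_ms: float) -> float:
    """Ratio of baseline time to accelerated time."""
    return baseline_ms / accelerated_ms


# Stand-in workloads (hypothetical); a real run would call the operation under test.
def slow_op():
    sum(i * i for i in range(20_000))


def fast_op():
    sum(i * i for i in range(2_000))


# Infrastructure-style settings: 20 measured runs after 3 warmup runs.
baseline = median_ms(slow_op, runs=20, warmup=3)
accelerated = median_ms(fast_op, runs=20, warmup=3)
print(f"baseline {baseline:.3f} ms, accelerated {accelerated:.3f} ms, "
      f"speedup {speedup(baseline, accelerated):.1f}x")
```

Using the median rather than the mean keeps a single slow run, for example one hit by garbage collection, from skewing the reported value.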