Brainy vs Brainy + Cortex
Overall: Cortex makes Brainy 5.2x faster (geometric mean across 15 operations) and raises the single-machine scale ceiling from ~10M to a projected 1B+ vectors via the DiskANN engine.
Embedding (real model): 2.8x | Infrastructure: 7.8x
Generated 2026-02-18 | 200 entities | 384-dim vectors | model: all-MiniLM-L6-v2 | in-memory storage
At a Glance
| Category | Avg Speedup | What it covers |
|---|---|---|
| Embedding | 2.8x | Real ML model inference (all-MiniLM-L6-v2) |
| Data Operations | 7.3x | Adding, reading, and deleting entities |
| Search | 25.2x | Finding entities by meaning, filters, and similarity |
| Graph | 1.3x | Querying relationships between entities |
| Neural | 1.9x | AI-powered neighbors and clustering |
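The category averages above can be approximately reproduced from the per-operation speedups listed under Detailed Results. The Python sketch below is illustrative only: the `geometric_mean` helper and the dictionary layout are assumptions made for this example, not part of the Brainy or Cortex benchmark code, and it works from the rounded speedups printed in the tables rather than from raw timings.

```python
import math


def geometric_mean(ratios):
    """Geometric mean, computed as exp of the mean of the logs."""
    if not ratios:
        raise ValueError("need at least one ratio")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Rounded per-operation speedups copied from the Detailed Results tables.
speedups = {
    "Embedding": [1.7, 2.4, 2.1, 3.6, 3.9, 3.8],
    "Data Operations": [7.0, 12.6, 1.3, 25.0],
    "Search": [50.8, 108.3, 2.9],
    "Graph": [1.3],
    "Neural": [1.9],  # clustering was skipped, so only one ratio counts
}

category_avg = {name: round(geometric_mean(r), 1) for name, r in speedups.items()}

# "Infrastructure" here means every category except Embedding.
infrastructure = [r for name, rs in speedups.items() if name != "Embedding" for r in rs]
overall = [r for rs in speedups.values() for r in rs]

print(category_avg)
print("infrastructure:", round(geometric_mean(infrastructure), 1))
print("overall:", round(geometric_mean(overall), 1))
```

A geometric mean is used rather than an arithmetic one so that a single large ratio (such as the 108.3x field search) does not dominate the summary. With these rounded inputs the sketch lands on the published 2.8x, 7.3x, 25.2x, 7.8x, and 5.2x figures.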
Scale Ceiling
The numbers above measure speedup at the 200-entity tier, which is the "make my queries faster" story. Cortex also moves the scale ceiling itself: with DiskANN engaged, a single Brainy instance can take on workloads that previously did not fit on one machine.
PROJECTED: design targets, not measured. The table below is extrapolated from algorithm math (Vamana / PQ memory + latency models). Measured numbers at 100M and 1B on real hardware ship in docs/verification-report.md as part of Piece 9 of the Cortex 3.0 release. Per CLAUDE.md, perf claims without a MEASURED citation must carry a PROJECTED label until verified.
| Workload | Brainy alone | Brainy + Cortex (HNSW) | Brainy + Cortex (DiskANN) |
|---|---|---|---|
| 1 M vectors @ 384-dim | OK, ~1 GB RAM | OK, ~1 GB RAM, 25× faster search | OK, ~0.5 GB RAM, sub-ms search |
| 10 M vectors | OK, ~15 GB RAM, slow build | OK, ~10 GB RAM | OK, ~3 GB RAM, 1–3 ms search |
| 100 M vectors | Out of practical reach | OK if you have 100+ GB RAM | OK on a 32 GB box, 2–5 ms search |
| 1 B vectors | Not possible (1.5 TB RAM) | Not possible (1.5+ TB RAM) | OK on a 64 GB box, 5–10 ms search |
DiskANN auto-engages when storage is local and a stable idMapper is available — see docs/diskann.md. Cloud-storage adapters stay on HNSW transparently.
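To see where RAM figures of this order come from, a back-of-the-envelope estimate can help. The sketch below is a rough illustration, not Cortex's actual memory model: it assumes vectors are stored as 4-byte floats, ignores graph links and metadata, and uses a hypothetical 32-byte compressed code per vector purely to show the shape of the calculation. Neither function name nor the 32-byte figure comes from the Cortex documentation.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_component=4):
    """Bytes needed to hold uncompressed vectors (float32 assumed by default)."""
    return num_vectors * dim * bytes_per_component


def compressed_code_bytes(num_vectors, code_bytes_per_vector):
    """Bytes needed when each vector is replaced by a fixed-size compressed code."""
    return num_vectors * code_bytes_per_vector


DIM = 384
HYPOTHETICAL_CODE_BYTES = 32  # assumption for illustration, not a Cortex setting

for n in (1_000_000, 10_000_000, 100_000_000, 1_000_000_000):
    raw_gb = raw_vector_bytes(n, DIM) / 1e9
    code_gb = compressed_code_bytes(n, HYPOTHETICAL_CODE_BYTES) / 1e9
    print(f"{n:>13,} vectors: raw ~{raw_gb:,.1f} GB, compressed codes ~{code_gb:,.2f} GB")
```

Under these assumptions, 1 B vectors at 384 dimensions need about 1.5 TB for the raw floats alone, which is in line with the "1.5 TB RAM" entries in the table. The compressed column only shows why keeping short codes in memory and full vectors on disk can bring a billion-vector index within reach of a single machine; the real footprint depends on index parameters this document does not specify.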
Embedding consistency: max element-wise diff = 1.43e-7, avg = 2.64e-8 (WASM and native produce equivalent vectors)
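A consistency check of this kind can be sketched in a few lines. The example below assumes that "diff" means the absolute difference per vector component and that "avg" is the mean of those differences; the function name `elementwise_diff` is hypothetical and the input vectors are toy values, not real embeddings.

```python
def elementwise_diff(a, b):
    """Return (max, mean) of the absolute per-component differences of two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if not a:
        raise ValueError("vectors must not be empty")
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return max(diffs), sum(diffs) / len(diffs)


# Toy vectors standing in for a WASM embedding and a native embedding.
wasm_vec = [1.0, 2.0, 4.0]
native_vec = [1.0, 2.5, 3.0]

max_diff, avg_diff = elementwise_diff(wasm_vec, native_vec)
print(f"max = {max_diff}, avg = {avg_diff}")
```

For real 384-dimensional embeddings, differences on the order of 1e-7 are about the size of float32 rounding error, which is why the two engines can be treated as producing equivalent vectors.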
Same results, just faster. Cortex accelerates Brainy without changing any answer: native output matches the JavaScript baseline byte-for-byte, enforced by a 104-test cross-language parity suite (tokenization, value normalization, string collation, SQ8 distance, top-K ranking, roaring/msgpack). See Cortex Performance → Cross-Language Consistency.
Detailed Results
Embedding
| What it does | WASM | Native Rust | Speedup |
|---|---|---|---|
| Engine initialization | 192 ms | 115 ms | 1.7x |
| Embed one sentence | 116 ms | 47 ms | 2.4x |
| Embed 1 sentence (batch path) | 116 ms | 55 ms | 2.1x |
| Embed 10 sentences | 1173 ms | 330 ms | 3.6x |
| Embed 20 sentences | 2371 ms | 611 ms | 3.9x |
| Embed 50 sentences | 6050 ms | 1577 ms | 3.8x |
Data Operations
| What it does | Brainy | + Cortex | Speedup |
|---|---|---|---|
| Store one entity | 21 ms | 3.0 ms | 7.0x |
| Store 20 entities at once | 1384 ms | 110 ms | 12.6x |
| Retrieve one entity | 0.001 ms | 0.001 ms | 1.3x |
| Remove and re-insert | 113 ms | 4.5 ms | 25.0x |
Search
| What it does | Brainy | + Cortex | Speedup |
|---|---|---|---|
| Search by meaning | 21 ms | 0.4 ms | 50.8x |
| Search by fields | 1.9 ms | 0.018 ms | 108.3x |
| Find similar items | 0.3 ms | 0.1 ms | 2.9x |
Graph
| What it does | Brainy | + Cortex | Speedup |
|---|---|---|---|
| Query relationships | 0.011 ms | 0.008 ms | 1.3x |
Neural
| What it does | Brainy | + Cortex | Speedup |
|---|---|---|---|
| Nearest neighbors | 0.001 ms | 0.001 ms | 1.9x |
| Auto-cluster items | — | — | skipped |
Embedding Throughput (batch of 50)
| Engine | Texts/sec | ms/text |
|---|---|---|
| WASM | 8 | 121.0 |
| Native Rust | 32 | 31.5 |
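The throughput rows follow directly from the "Embed 50 sentences" timings in the Embedding table. The short sketch below shows the arithmetic; the `throughput` helper is a hypothetical name used for illustration only.

```python
def throughput(total_ms, num_texts):
    """Convert a batch time into (texts per second, milliseconds per text)."""
    ms_per_text = total_ms / num_texts
    texts_per_sec = 1000.0 / ms_per_text
    return texts_per_sec, ms_per_text


# Median batch-of-50 timings taken from the Embedding table.
for engine, total_ms in (("WASM", 6050), ("Native Rust", 1577)):
    per_sec, per_text = throughput(total_ms, 50)
    print(f"{engine}: {round(per_sec)} texts/sec, {round(per_text, 1)} ms/text")
```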
Methodology: Infrastructure operations were measured 20 times after 3 warmup runs; embedding operations were measured 10 times after 2 warmup runs. The reported value is the median. Infrastructure benchmarks use a lightweight hash-based embedding to isolate acceleration from model differences. Embedding benchmarks use the real all-MiniLM-L6-v2 model (384-dim). Each category's "Avg Speedup" is the geometric mean of the per-operation speedups within that category.
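The warmup-then-median procedure can be sketched as follows. This is a minimal illustration of the measurement pattern, not the actual benchmark harness: the `bench` function, its parameters, and the stand-in workload are all assumptions made for this example.

```python
import statistics
import time


def bench(fn, runs=20, warmup=3):
    """Run fn `warmup` times untimed, then `runs` times timed; return the median in ms."""
    for _ in range(warmup):
        fn()  # warmup runs let caches and lazy initialization settle
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples_ms)


def sample_workload():
    """Stand-in workload; a real benchmark would call the operation under test."""
    return sum(i * i for i in range(10_000))


# Infrastructure-style settings: 20 timed runs after 3 warmups.
infra_ms = bench(sample_workload, runs=20, warmup=3)
# Embedding-style settings: 10 timed runs after 2 warmups.
embed_ms = bench(sample_workload, runs=10, warmup=2)
print(f"median: {infra_ms:.3f} ms (20 runs), {embed_ms:.3f} ms (10 runs)")
```

Reporting the median rather than the mean keeps a single slow run (for example, one interrupted by garbage collection) from skewing the result.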