Consistency Model
Brainy 8.0's consistency story rests on one mechanism: generational MVCC
— multi-version concurrency control over immutable, generation-stamped
records. It is exposed through a single value type, the Db: an
immutable, point-in-time view of the whole store that you query like the
live brain.
const db = brain.now() // pin the current state — O(1), no I/O
await brain.transact([
{ op: 'update', id: invoiceId, metadata: { status: 'paid' } }
])
await db.get(invoiceId) // still 'pending' — pinned, forever
await brain.get(invoiceId) // 'paid' — live
await db.release() // unpin when done
This page states the guarantees precisely: what is promised, what it costs,
and where the honest limits are. The design record is
ADR-001; every guarantee below is proven
by a dedicated test in tests/integration/db-mvcc.test.ts.
The generation clock
A monotonic generation counter is the store's logical clock:
- It advances once per committed transact() batch and once per single-operation write (add/update/remove/relate/…).
- brain.generation() reads it. It is persisted in the data directory and never reissued: not across restarts, and not across restore() (the counter is floored at its pre-restore value).
Every Db is pinned at one generation. db.generation and db.timestamp
identify the view; newerDb.since(olderDb) returns exactly the entity and
relationship ids that committed transactions touched between two views.
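The clock's contract can be illustrated with a minimal in-memory sketch. GenerationClock and its method names are hypothetical, not Brainy's API; the sketch only models monotonic advance and the floor applied on restore.

```typescript
// Hypothetical sketch: a monotonic logical clock that never reissues a value.
class GenerationClock {
  private current: number

  constructor(start: number = 0) {
    this.current = start
  }

  // Read the current generation without advancing it.
  read(): number {
    return this.current
  }

  // Advance once per committed batch or single-operation write.
  advance(): number {
    this.current += 1
    return this.current
  }

  // Adopt a restored snapshot's generation, floored at the pre-restore value
  // so numbers that were already observed are never handed out again.
  restoreFrom(snapshotGeneration: number): number {
    this.current = Math.max(this.current, snapshotGeneration)
    return this.current
  }
}

const clock = new GenerationClock()
clock.advance() // 1
clock.advance() // 2
clock.advance() // 3
clock.restoreFrom(1) // snapshot was cut at generation 1; the counter stays at 3
const nextAfterRestore = clock.advance() // 4, not 2
```

Flooring with Math.max is one simple way to get the "never reissued" property; how Brainy persists the counter is not modeled here.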
Snapshot isolation for reads
Guarantee: a Db reads exactly the state at its pinned generation, no
matter what commits afterwards — including deletes. There are no torn reads,
no partially applied batches, and no drift over time.
- brain.now() pins the current generation in O(1).
- brain.transact() returns a Db pinned at the freshly committed generation.
- brain.asOf(generation | Date | snapshotPath) pins past state.
While nothing has committed past the pin, reads delegate to the live fast paths — pinning is free until history actually moves. Once later transactions commit, the view keeps serving the full query surface at its generation (see "Reading the past" below).
Readers never block writers and writers never block readers: a pinned view stays valid because nothing overwrites the immutable records it resolves from (the LMDB reader-pin model).
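To make the pinned-read rule concrete, here is a small sketch of generation-stamped, append-only versions. VersionedStore and PinnedView are illustrative names and the layout is an assumption, not Brainy's storage format.

```typescript
// Hypothetical sketch of generation-stamped immutable records.
// `null` marks a delete (tombstone) at that generation.
type Version<T> = { generation: number; value: T | null }

class VersionedStore<T> {
  private versions = new Map<string, Version<T>[]>()
  private generation = 0

  // Commit a batch as exactly one new generation.
  commit(writes: Array<{ id: string; value: T | null }>): number {
    this.generation += 1
    for (const w of writes) {
      const history = this.versions.get(w.id) ?? []
      // Append only: earlier versions are never overwritten.
      history.push({ generation: this.generation, value: w.value })
      this.versions.set(w.id, history)
    }
    return this.generation
  }

  // Pinning just captures a number: O(1), no copying.
  now(): PinnedView<T> {
    return new PinnedView(this, this.generation)
  }

  // Newest version at or before the requested generation.
  readAt(id: string, generation: number): T | null {
    const history = this.versions.get(id) ?? []
    for (let i = history.length - 1; i >= 0; i--) {
      if (history[i].generation <= generation) return history[i].value
    }
    return null
  }

  get(id: string): T | null {
    return this.readAt(id, this.generation)
  }
}

class PinnedView<T> {
  readonly generation: number
  private readonly store: VersionedStore<T>

  constructor(store: VersionedStore<T>, generation: number) {
    this.store = store
    this.generation = generation
  }

  get(id: string): T | null {
    return this.store.readAt(id, this.generation)
  }
}

const store = new VersionedStore<string>()
store.commit([{ id: 'invoice', value: 'pending' }])
const pinned = store.now()
store.commit([{ id: 'invoice', value: 'paid' }])
store.commit([{ id: 'invoice', value: null }]) // a later delete
const pinnedRead = pinned.get('invoice') // 'pending'
const liveRead = store.get('invoice') // null
```

Because commits only append, the writer never waits for the pinned view and the view never sees the later update or delete.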
Transaction atomicity
brain.transact(ops) executes a declarative batch — add, update,
remove, relate, unrelate — atomically as exactly one generation:
const db = await brain.transact([
{ op: 'add', id: orderId, type: NounType.Document, subtype: 'order', data: 'Order #1042' },
{ op: 'add', id: itemId, type: NounType.Thing, subtype: 'line-item', data: 'Widget x3' },
{ op: 'relate', from: orderId, to: itemId, type: VerbType.Contains, subtype: 'order-line' }
], { meta: { author: 'order-service', requestId: 'req-9f2' } })
db.receipt.ids // resolved id per operation, in input order
Either every operation applies, or none do and the store is byte-identical to its pre-transaction state. Operation semantics mirror the corresponding single-operation methods (validation, subtype enforcement, relationship deduplication, delete cascades), and later operations may reference ids created earlier in the same batch.
The commit point is one atomic rename. The durability protocol:
- Before-images of every touched id are staged into an immutable generation directory and fsynced.
- The batch executes through the transaction manager (which has its own operation-level rollback for non-crash failures).
- The store manifest is replaced via atomic temp-file rename and fsynced. The rename is the commit — a generation is committed if and only if the manifest says so.
Crash recovery: on the next open, any staged generation above the manifest watermark is an uncommitted transaction; its before-images are restored (idempotently — recovery can itself crash and rerun) and derived indexes never observe the rolled-back state. A crash anywhere before the rename rolls back to the exact pre-transaction bytes; a crash after it keeps the transaction.
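The recovery rule above can be modeled in memory. This is a sketch under stated assumptions: CrashModelStore is a hypothetical name, the "manifest" is a single field rather than a renamed file, and fsync is not modeled at all.

```typescript
// Hypothetical in-memory model of the commit protocol. A real store would
// use files, fsync, and an atomic rename; here the manifest is one field.
interface StagedGeneration {
  generation: number
  // Before-image per touched id; undefined means the id did not exist.
  beforeImages: Map<string, string | undefined>
}

class CrashModelStore {
  records = new Map<string, string>()
  manifestGeneration = 0 // the committed watermark
  staged: StagedGeneration[] = []

  // Step 1: stage before-images. Step 2: apply the writes.
  // Step 3 (commit) is separate so a crash can be simulated in between.
  stageAndApply(writes: Map<string, string>): StagedGeneration {
    const beforeImages = new Map<string, string | undefined>()
    for (const id of writes.keys()) beforeImages.set(id, this.records.get(id))
    const entry: StagedGeneration = { generation: this.manifestGeneration + 1, beforeImages }
    this.staged.push(entry)
    for (const [id, value] of writes) this.records.set(id, value)
    return entry
  }

  // The single commit point: advancing the manifest watermark.
  commit(entry: StagedGeneration): void {
    this.manifestGeneration = entry.generation
  }

  // Run on open: anything staged above the watermark never committed.
  recover(): void {
    for (const entry of this.staged) {
      if (entry.generation <= this.manifestGeneration) continue
      for (const [id, before] of entry.beforeImages) {
        if (before === undefined) this.records.delete(id)
        else this.records.set(id, before)
      }
    }
    this.staged = this.staged.filter(e => e.generation <= this.manifestGeneration)
  }
}

const crashStore = new CrashModelStore()
crashStore.commit(crashStore.stageAndApply(new Map([['a', 'v1']]))) // committed at generation 1
crashStore.stageAndApply(new Map([['a', 'v2'], ['b', 'new']])) // "crash" before commit
crashStore.recover()
crashStore.recover() // rerunning recovery is harmless
```

Restoring before-images is idempotent because each restore writes a fixed value, so a crash during recovery followed by a rerun converges to the same state.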
Transaction metadata (meta) is reified Datomic-style: recorded in an
append-only transaction log readable via brain.transactionLog() — audit
fields live in the database, not in commit messages.
Two levels of compare-and-swap
Concurrent transact() calls commit serially (snapshot-isolated batches).
For lost-update protection across a read–modify–write cycle, Brainy offers
CAS at two granularities:
| Granularity | Mechanism | Conflict error | Use when |
|---|---|---|---|
| Per entity | _rev + { op: 'update', ifRev } (also on brain.update()) | RevisionConflictError | "This entity must not have changed since I read it." |
| Whole store | transact(ops, { ifAtGeneration }) | GenerationConflictError | "Nothing may have committed since I read." |
const view = brain.now()
const order = await view.get(orderId)
try {
await brain.transact(
[{ op: 'update', id: orderId, metadata: { total: recompute(order) }, ifRev: order._rev }],
{ ifAtGeneration: view.generation }
)
} catch (err) {
if (err instanceof GenerationConflictError) {
// Something committed since the pin: re-read and retry.
} else {
throw err // do not swallow unrelated failures
}
} finally {
await view.release()
}
An ifRev conflict on any operation rejects the whole batch; an
ifAtGeneration conflict is detected before anything is staged. Both leave
the store untouched and the generation counter unchanged. See
Optimistic concurrency with _rev
for the per-entity pattern in depth.
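A retry loop around the two checks might look like the sketch below. CasStore, its result type, and incrementWithRetry are hypothetical; conflicts are returned as values here to keep the example self-contained, whereas Brainy reports them as the error types named in the table.

```typescript
// Hypothetical in-memory model of the two CAS levels; not Brainy's API.
interface StoredEntity { value: number; rev: number }

type CommitResult =
  | { ok: true; generation: number }
  | { ok: false; conflict: 'revision' | 'generation' }

class CasStore {
  private entities = new Map<string, StoredEntity>()
  generation = 0

  read(id: string): StoredEntity | undefined {
    const e = this.entities.get(id)
    return e ? { ...e } : undefined
  }

  // Unconditional write, used to seed data and to simulate a rival writer.
  put(id: string, value: number): void {
    const rev = (this.entities.get(id)?.rev ?? 0) + 1
    this.entities.set(id, { value, rev })
    this.generation += 1
  }

  // Conditional write: both checks run before anything is modified,
  // so a conflict leaves the store and the counter untouched.
  update(id: string, value: number, opts: { ifRev?: number; ifAtGeneration?: number }): CommitResult {
    if (opts.ifAtGeneration !== undefined && opts.ifAtGeneration !== this.generation) {
      return { ok: false, conflict: 'generation' }
    }
    const current = this.entities.get(id)
    if (opts.ifRev !== undefined && current?.rev !== opts.ifRev) {
      return { ok: false, conflict: 'revision' }
    }
    this.entities.set(id, { value, rev: (current?.rev ?? 0) + 1 })
    this.generation += 1
    return { ok: true, generation: this.generation }
  }
}

// Read, compute, attempt, and retry on conflict up to a bounded count.
function incrementWithRetry(store: CasStore, id: string, maxAttempts: number = 3): CommitResult {
  let result: CommitResult = { ok: false, conflict: 'revision' }
  for (let attempt = 0; attempt < maxAttempts && !result.ok; attempt++) {
    const seen = store.read(id)
    if (!seen) break
    result = store.update(id, seen.value + 1, { ifRev: seen.rev })
  }
  return result
}

const cas = new CasStore()
cas.put('order', 10)
const stale = cas.read('order') // rev 1, read at generation 1
cas.put('order', 20) // a rival writer commits first
const rejected = cas.update('order', 11, { ifRev: stale ? stale.rev : -1 }) // revision conflict
const generationBefore = cas.generation // still 2: the conflict changed nothing
const retried = incrementWithRetry(cas, 'order') // re-reads, then succeeds
```

Bounding the retries is a design choice for the sketch: under sustained contention it is usually better to surface the conflict than to loop indefinitely.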
Reading the past
brain.asOf() accepts a generation number, a Date (resolved through the
transaction log to the newest generation committed at or before it), or a
snapshot directory path. Historical views serve the full query surface
— get(), find() in every mode, semantic search, graph traversal,
cursors, aggregation — through two complementary paths:
- Record path (free): get(), metadata-level find(), and filter-based related() resolve directly through the immutable record layer. Ids untouched since the pin still ride the live fast paths.
- Index path (paid once): index-accelerated queries (semantic/vector search, graph traversal, cursors, aggregation) are served by an at-generation index materialization built lazily on first use: Brainy reconstructs in-memory indexes over the exact record set at that generation. This costs O(n at the pinned generation) time and memory, once per Db, cached until release(). That is the open-core price of historical index queries, stated plainly.
A native index provider implementing the optional
VersionedIndexProvider plugin capability serves the same historical reads
from its retained index segments without any rebuild — the materializer
is the correctness baseline, the provider is the accelerator. Semantics are
identical on both paths.
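The "paid once" behaviour of the index path can be sketched as a lazily built, cached index. HistoricalView and its tag index are hypothetical stand-ins; a real materializer would rebuild vector and graph indexes, which this sketch does not attempt.

```typescript
// Hypothetical sketch: an index built lazily, once per view, dropped on release.
type Row = { id: string; tags: string[] }

class HistoricalView {
  // `rows` stands in for the exact record set at the pinned generation.
  private readonly rows: readonly Row[]
  private tagIndex: Map<string, string[]> | null = null
  buildCount = 0

  constructor(rows: readonly Row[]) {
    this.rows = rows
  }

  // Record path: resolves directly and never touches the index.
  get(id: string): Row | undefined {
    return this.rows.find(r => r.id === id)
  }

  // Index path: pays O(n) on first use, then serves from the cache.
  findByTag(tag: string): string[] {
    return this.index().get(tag) ?? []
  }

  release(): void {
    this.tagIndex = null
  }

  private index(): Map<string, string[]> {
    let index = this.tagIndex
    if (index === null) {
      this.buildCount += 1
      index = new Map<string, string[]>()
      for (const row of this.rows) {
        for (const tag of row.tags) {
          const ids = index.get(tag) ?? []
          ids.push(row.id)
          index.set(tag, ids)
        }
      }
      this.tagIndex = index
    }
    return index
  }
}

const view = new HistoricalView([
  { id: 'a', tags: ['x', 'y'] },
  { id: 'b', tags: ['y'] }
])
view.get('a') // record path
const buildsAfterGet = view.buildCount // 0: no index was needed
const taggedY = view.findByTag('y') // first index query builds the index
view.findByTag('x') // served from the cached index
```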
History granularity — the honest limit
Generation records are written per transact() batch only.
Single-operation writes (add/update/remove/relate/… outside
transact()) advance the generation counter — so watermarks and CAS stay
sound — but do not stage before-images: they remain visible through
earlier pins and are not reported by db.since(). Code that needs pinned
isolation across its own writes uses transact(). This is the documented
8.0 contract, not an accident.
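The difference between the two write styles can be seen in a toy model where only batch commits stage before-images. GranularStore is a hypothetical name and the undo-based read is an assumption made for illustration, not a description of Brainy's read path.

```typescript
// Hypothetical model: only batch commits stage before-images.
class GranularStore {
  private live = new Map<string, string>()
  private beforeImages: Array<{ generation: number; images: Map<string, string | undefined> }> = []
  generation = 0

  // Batch write: stage before-images, then apply.
  transact(writes: Record<string, string>): number {
    const images = new Map<string, string | undefined>()
    for (const id of Object.keys(writes)) images.set(id, this.live.get(id))
    this.generation += 1
    this.beforeImages.push({ generation: this.generation, images })
    for (const id of Object.keys(writes)) this.live.set(id, writes[id])
    return this.generation
  }

  // Single-operation write: advances the counter but stages nothing.
  update(id: string, value: string): number {
    this.live.set(id, value)
    this.generation += 1
    return this.generation
  }

  // Read as of a pinned generation: start from live state and undo every
  // staged generation newer than the pin, newest first.
  readAt(id: string, pinnedGeneration: number): string | undefined {
    let value = this.live.get(id)
    for (let i = this.beforeImages.length - 1; i >= 0; i--) {
      const entry = this.beforeImages[i]
      if (entry.generation <= pinnedGeneration) break
      if (entry.images.has(id)) value = entry.images.get(id)
    }
    return value
  }
}

const mixed = new GranularStore()
const pinnedMixed = mixed.transact({ doc: 'v1' }) // generation 1
mixed.update('doc', 'v2') // single-operation write, generation 2
const leaked = mixed.readAt('doc', pinnedMixed) // 'v2': visible through the pin

const batched = new GranularStore()
const pinnedBatched = batched.transact({ doc: 'v1' }) // generation 1
batched.transact({ doc: 'v2' }) // generation 2, before-image staged
const isolated = batched.readAt('doc', pinnedBatched) // 'v1': isolated
```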
Speculative writes: `db.with()`
db.with(ops) returns a new Db whose reads see the operations applied
in memory, on top of the view — Datomic's with. Nothing touches disk,
the generation counter, or index providers:
const current = brain.now()
const whatIf = await current.with([
{ op: 'update', id: employeeId, metadata: { team: 'platform' } }
])
await whatIf.find({ where: { team: 'platform' } }) // sees the change
await brain.get(employeeId) // unchanged, nothing committed
The one boundary: overlay entities carry no embeddings (with() never
invokes the embedder), so index-accelerated queries and persist() on a
speculative view throw SpeculativeOverlayError rather than returning
silently incomplete results. get(), metadata-filter find(), and
filter-based related() work fully on overlays. To get the full surface,
commit the same operations with brain.transact().
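An overlay of this kind can be sketched as a read-through layer over a base view. All names below (ReadView, BaseView, OverlayView, Meta) are hypothetical, and shallow-merging metadata on update is an assumption of the sketch.

```typescript
// Hypothetical sketch of a speculative overlay; names are illustrative.
type Meta = Record<string, unknown>
type Op =
  | { op: 'update'; id: string; metadata: Meta }
  | { op: 'remove'; id: string }

interface ReadView {
  get(id: string): Meta | undefined
}

class BaseView implements ReadView {
  private readonly data: ReadonlyMap<string, Meta>

  constructor(data: ReadonlyMap<string, Meta>) {
    this.data = data
  }

  get(id: string): Meta | undefined {
    return this.data.get(id)
  }
}

class OverlayView implements ReadView {
  private readonly base: ReadView
  // id -> merged metadata, or null when the overlay removed the entity.
  private readonly changes = new Map<string, Meta | null>()

  constructor(base: ReadView, ops: Op[]) {
    this.base = base
    for (const op of ops) {
      if (op.op === 'remove') {
        this.changes.set(op.id, null)
      } else {
        // Assumption: updates shallow-merge into the current metadata.
        const current = this.get(op.id) ?? {}
        this.changes.set(op.id, { ...current, ...op.metadata })
      }
    }
  }

  // Overlay first, then fall through to the untouched base view.
  get(id: string): Meta | undefined {
    if (this.changes.has(id)) return this.changes.get(id) ?? undefined
    return this.base.get(id)
  }

  // Index-backed queries are refused instead of answered incompletely.
  search(_query: string): never {
    throw new Error('SpeculativeOverlayError: overlay entities carry no embeddings')
  }
}

const baseView = new BaseView(new Map([['emp-1', { team: 'data' } as Meta]]))
const whatIfView = new OverlayView(baseView, [
  { op: 'update', id: 'emp-1', metadata: { team: 'platform' } }
])
const speculativeTeam = whatIfView.get('emp-1')?.['team'] // 'platform'
const committedTeam = baseView.get('emp-1')?.['team'] // 'data'
```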
Retention and compaction
Historical records cost disk space, so retention is explicit:
- Every live Db holds a refcounted pin; a record-set is never reclaimed while any pin could need it. Pinned reads stay correct across compaction, always.
- brain.compactHistory({ retainGenerations?, retainMs? }) reclaims everything no retention rule and no pin protects, and records the horizon. asOf() below it throws GenerationCompactedError: explicitly, never partial data.
- To keep a state readable forever, persist() it first: snapshots are self-contained and unaffected by compaction of the source store.
Release Db values you do not keep (including the ones transact()
returns). A FinalizationRegistry backstop releases leaked pins at garbage
collection, but explicit release() is what makes compaction
deterministic.
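How pins and a retention rule interact can be sketched with a refcount table. RetentionLog is a hypothetical name, and the model simplifies "history a pin could need" to "every generation at or above the oldest pin".

```typescript
// Hypothetical sketch of refcounted pins guarding compaction.
class RetentionLog {
  private pins = new Map<number, number>() // generation -> refcount
  private retained: number[] = [] // generations whose history is still kept
  horizon = 0 // asOf() below this fails explicitly

  record(generation: number): void {
    this.retained.push(generation)
  }

  // Returns a release function that takes effect exactly once.
  pin(generation: number): () => void {
    this.pins.set(generation, (this.pins.get(generation) ?? 0) + 1)
    let released = false
    return () => {
      if (released) return
      released = true
      const count = (this.pins.get(generation) ?? 1) - 1
      if (count === 0) this.pins.delete(generation)
      else this.pins.set(generation, count)
    }
  }

  // Reclaim everything older than the newest `retainGenerations` entries,
  // but never at or above the oldest generation a live pin still needs.
  compact(retainGenerations: number): number[] {
    const newest = this.retained.length > 0 ? Math.max(...this.retained) : 0
    let cutoff = newest - retainGenerations + 1
    for (const pinnedGeneration of this.pins.keys()) cutoff = Math.min(cutoff, pinnedGeneration)
    const reclaimed = this.retained.filter(g => g < cutoff)
    this.retained = this.retained.filter(g => g >= cutoff)
    this.horizon = Math.max(this.horizon, cutoff)
    return reclaimed
  }

  asOf(generation: number): number {
    if (generation < this.horizon) {
      throw new Error(`GenerationCompactedError: ${generation} is below horizon ${this.horizon}`)
    }
    return generation
  }
}

const retention = new RetentionLog()
for (let g = 1; g <= 5; g++) retention.record(g)
const releasePin = retention.pin(2)
const firstPass = retention.compact(2) // [1]: the pin at 2 protects 2..5
releasePin()
const secondPass = retention.compact(2) // [2, 3]: only the newest two remain
```

The second pass reclaims more only because the pin was released explicitly, which is the determinism the paragraph above describes.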
Durability: snapshots and restore
db.persist(path) cuts a self-contained snapshot under the store's
commit mutex, so no commit or compaction can interleave. On filesystem
storage it is built from hard links: because every data file is
immutable-by-rename, linking is safe — the snapshot is created without
copying entity data, shares disk space with the source, and later writes to
the source can never alter it (rewrites swap inodes; the snapshot keeps the
old bytes). Cross-device targets fall back to byte copies; in-memory stores
serialize to the same directory layout, producing a real, durable store.
Two rules keep snapshots honest:
- persist() requires the view to still be the store's latest generation (a snapshot captures current bytes); a view that history has moved past throws GenerationConflictError instead of persisting the wrong state.
- brain.restore(path, { confirm: true }) replaces the store's entire state from a snapshot via byte copy (never links, so the snapshot stays independent), rebuilds all indexes, and floors the generation counter so observed generation numbers are never reissued. Live pins do not survive a restore; release them first (a warning is logged when any exist).
Brainy.load(path) (or brain.asOf(path)) opens a snapshot as a
self-contained read-only store with the full query surface, including
vector search.
Reserved fields
Some field names belong to Brainy, not to your metadata. They live at top
level on every entity and relationship, have dedicated write paths, and may
never appear inside a metadata bag:
| Entities (nouns) | Relationships (verbs) | Canonical write path |
|---|---|---|
| noun | verb | the type param of add() / relate() |
| subtype | subtype | the subtype param |
| visibility | visibility | the visibility param ('public' \| 'internal') |
| confidence | confidence | the confidence param |
| weight | weight | the weight param |
| service | service | the service param (fixed at create time) |
| data | data | the data param |
| createdBy | createdBy | the createdBy param of add() (system-managed on verbs) |
| createdAt, updatedAt, _rev | createdAt, updatedAt, _rev | system-managed (ifRev for CAS) |
The canonical machine-readable lists are exported as
RESERVED_ENTITY_FIELDS and RESERVED_RELATION_FIELDS (defined in
src/types/reservedFields.ts, the single source of truth). Three layers
enforce the contract:
- Compile time: every metadata param (add, update, relate, updateRelation, and the matching transact() operations) rejects a literal reserved key as a TypeScript error.
- Write time: untyped (JavaScript) callers that pass one anyway are normalized. User-settable fields (confidence, weight, subtype, and service/createdBy at create time) are remapped to their dedicated param (top-level wins when both are supplied), and system-managed fields are dropped with a one-shot warning naming the correct write path. update({ metadata: { confidence: 0.9 } }) therefore behaves exactly like update({ confidence: 0.9 }).
- Read time: every read path (get, find, search, related, batch reads, and historical asOf() materialization) surfaces reserved fields only at top level: entity.metadata and relation.metadata contain only your custom fields, always.
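The write-time rule can be sketched as a small normalization function. normalizeWrite and its types are hypothetical, and the two field sets are abbreviated stand-ins for the exported RESERVED_ENTITY_FIELDS list, not copies of it.

```typescript
// Hypothetical sketch of write-time normalization for untyped callers.
// Both sets are abbreviated for illustration.
const USER_SETTABLE = new Set<string>(['confidence', 'weight', 'subtype'])
const SYSTEM_MANAGED = new Set<string>(['createdAt', 'updatedAt', '_rev'])

interface WriteParams {
  confidence?: number
  weight?: number
  subtype?: string
  metadata?: Record<string, unknown>
}

interface NormalizedWrite {
  params: Record<string, unknown> // dedicated top-level params
  metadata: Record<string, unknown> // custom fields only
  dropped: string[] // system-managed keys the caller tried to set
}

function normalizeWrite(input: WriteParams): NormalizedWrite {
  const { metadata = {}, ...topLevel } = input
  const params: Record<string, unknown> = { ...topLevel }
  const custom: Record<string, unknown> = {}
  const dropped: string[] = []

  for (const key of Object.keys(metadata)) {
    if (USER_SETTABLE.has(key)) {
      // Remap to the dedicated param; an explicit top-level value wins.
      if (params[key] === undefined) params[key] = metadata[key]
    } else if (SYSTEM_MANAGED.has(key)) {
      // System-managed fields are never accepted from callers.
      dropped.push(key)
    } else {
      custom[key] = metadata[key]
    }
  }
  return { params, metadata: custom, dropped }
}

const normalized = normalizeWrite({
  confidence: 0.5,
  metadata: { confidence: 0.9, weight: 2, _rev: 7, customer: 'acme' }
})
// params: { confidence: 0.5, weight: 2 }
// metadata: { customer: 'acme' }
// dropped: ['_rev']
```

A real implementation would also emit the one-shot warning for dropped keys; the sketch only collects them.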
const id = await brain.add({
type: 'document', subtype: 'invoice',
data: 'Invoice #42', confidence: 0.95, // reserved → top-level params
metadata: { customer: 'acme', total: 129.5 } // custom fields only
})
const entity = await brain.get(id)
entity.confidence // 0.95 — top level
entity.metadata // { customer: 'acme', total: 129.5 }
Visibility: `public` / `internal` / `system`
visibility is a reserved tier that controls whether an entity or relationship
surfaces on Brainy's default user-facing reads. The absence of the field is
exactly equivalent to 'public'.
| Tier | Counted in getNounCount() / stats()? | Returned by default find() / related()? | Opt-in |
|---|---|---|---|
| 'public' (default, or field absent) | yes | yes | none needed |
| 'internal' | no | no | find({ includeInternal: true }) / related({ includeInternal: true }) |
| 'system' | no | no | find({ includeSystem: true }) / related({ includeSystem: true }) |
- 'public': normal data. Counted and returned everywhere. Stored lean: the field is omitted on disk for public records, so existing data needs no migration.
- 'internal': your app's own bookkeeping (audit trails, derived caches, scratch entities) that should not pollute default queries, counts, or stats(), yet must stay retrievable on demand. Set it via the visibility param; read it back with the includeInternal opt-in.
- 'system': Brainy's own plumbing (for example the Virtual File System root entity). Hidden everywhere by default, even when includeInternal is set, and surfaced only with the explicit includeSystem opt-in. The 'system' tier is not part of the public add() / relate() param type ('public' | 'internal'); only internal Brainy code assigns it.
The opt-ins are applied as a hard candidate filter — hidden entities are
removed before limit / offset are applied, so a default find({ limit: 10 })
always returns ten visible results when that many exist, never a short page.
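Filtering before paging can be sketched as follows. findVisible, Candidate, and Tier are hypothetical names; the option names mirror the opt-ins described above, but the function is not Brainy's find().

```typescript
// Hypothetical sketch: visibility applied as a candidate filter before paging.
type Tier = 'public' | 'internal' | 'system'

interface Candidate {
  id: string
  visibility?: Tier // absent means 'public'
}

interface FindOptions {
  limit?: number
  offset?: number
  includeInternal?: boolean
  includeSystem?: boolean
}

function isVisible(c: Candidate, opts: FindOptions): boolean {
  const tier = c.visibility ?? 'public'
  if (tier === 'public') return true
  if (tier === 'internal') return opts.includeInternal === true
  return opts.includeSystem === true // 'system' ignores includeInternal
}

function findVisible(candidates: Candidate[], opts: FindOptions = {}): Candidate[] {
  const offset = opts.offset ?? 0
  const visible = candidates.filter(c => isVisible(c, opts))
  // Paging happens after filtering, so hidden rows never shorten a page.
  const end = opts.limit === undefined ? undefined : offset + opts.limit
  return visible.slice(offset, end)
}

const candidates: Candidate[] = [
  { id: 'root', visibility: 'system' },
  { id: 'audit', visibility: 'internal' },
  { id: 'a' },
  { id: 'b', visibility: 'public' },
  { id: 'c' }
]
const defaultPage = findVisible(candidates, { limit: 2 }).map(c => c.id) // ['a', 'b']
const withInternal = findVisible(candidates, { limit: 2, includeInternal: true }).map(c => c.id) // ['audit', 'a']
```

Had the limit been applied first, the default page would have contained only hidden rows and come back empty.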
// App-internal scratch entity: present, retrievable, but out of the way.
await brain.add({ type: 'task', data: 'reindex job', visibility: 'internal' })
await brain.getNounCount() // unchanged — internal not counted
await brain.find({ type: 'task' }) // [] — hidden by default
await brain.find({ type: 'task', includeInternal: true }) // includes it
// A brand-new brain reports zero user entities even though the VFS root exists:
const fresh = new Brainy()
await fresh.init()
await fresh.getNounCount() // 0: the root is visibility:'system'
Note (8.0): the structural Contains edges the VFS creates between directories and files are left at the default (public) visibility for now; only the VFS root entity is 'system'. Marking those edges system requires companion changes to VFS traversal and is out of scope for this change.
What is not guaranteed
Stated plainly, so nothing surprises you in production:
- Single-writer. Brainy is a single-writer, many-reader database (multi-process model). Transactions are atomic within one writer process; there is no distributed or cross-process transaction coordination.
- History granularity. Only transact() batches produce historical records; single-operation writes between commits stay visible through earlier pins (see above).
- Compacted history is gone. asOf() below the compaction horizon fails explicitly; persist what you must keep.
- Counter persistence is coalesced for single-operation writes. Durable artifacts (records, manifests, snapshots) always persist the counter at their own commit points, so a crash inside the coalescing window can lose only counter values nothing durable ever referenced.
- Speculative overlays are metadata-only readers. Index-accelerated queries on with() views throw rather than guess.
Where to go next
- Snapshots & Time Travel: recipes for backup, restore, time-travel debugging, what-if analysis, and audit trails.
- Optimistic concurrency with _rev: the per-entity CAS pattern.
- ADR-001 (Generational MVCC): the full design record, including the persisted layout and the proof table.