Database
The incremental_graph/database module wraps a LevelDB instance
and exposes it as a typed, namespace-scoped key–value store for the incremental graph engine.
Conceptual overview
Namespaces (x / y)
Every key is stored inside a namespace sublevel – currently x (live data) or y (staging
namespace used during schema migrations). At the LevelDB level this means all keys are prefixed
with !x! or !y!. Callers never deal with these prefixes directly; the RootDatabase class
encapsulates them.
Sub-sublevels
Within each namespace there are further typed sublevels:
| Sublevel | Purpose |
|---|---|
| values | The computed output value for each graph node |
| freshness | Whether a node is up-to-date or potentially-outdated |
| inputs | Input dependency list for each node |
| revdeps | Reverse-dependency index (input → list of dependents) |
| counters | Monotonic integer tracking how many times a value changed |
| timestamps | Creation and last-modification ISO timestamps |
| meta | Namespace metadata (currently just the schema version) |
There is also a top-level _meta sublevel (outside the x/y namespace) that stores the database
format marker.
Key format
Node keys are JSON-serialised objects of the form {"head":"<name>","args":[...]}, for example:
{"head":"all_events","args":[]}
{"head":"event","args":["abc123"]}
{"head":"transcription","args":["/path/to/audio.mp3"]}
At the raw LevelDB level these are concatenated with the sublevel prefixes, e.g.
!x!!values!{"head":"all_events","args":[]}
!x!!freshness!{"head":"all_events","args":[]}
!_meta!format
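The prefixing scheme shown above can be sketched in a few lines. This is only an illustration: `rawKey` and `splitRawKey` are hypothetical helper names, not functions exported by the module, and real callers go through RootDatabase instead of building prefixes by hand.

```javascript
// Hypothetical helper: wrap each sublevel name in `!` and prepend it to the key,
// matching the raw key layout shown in the examples above.
function rawKey(sublevels, key) {
  return sublevels.map((name) => `!${name}!`).join('') + key;
}

// Hypothetical inverse: peel leading `!name!` prefixes off a raw key.
// Assumes the stored key itself does not start with `!` (true for JSON node
// keys, which start with `{`, and for meta keys such as `format`).
function splitRawKey(raw) {
  const sublevels = [];
  let rest = raw;
  let match;
  while ((match = /^!([^!]+)!/.exec(rest)) !== null) {
    sublevels.push(match[1]);
    rest = rest.slice(match[0].length);
  }
  return { sublevels, key: rest };
}

const nodeKey = JSON.stringify({ head: 'event', args: ['abc123'] });
console.log(rawKey(['x', 'values'], nodeKey));
console.log(rawKey(['_meta'], 'format'));
```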
Filesystem rendering
The database exposes two complementary operations for dumping and restoring its complete state to/from a plain directory tree:
const { renderToFilesystem, scanFromFilesystem } = require('./database');
// Dump every key/value pair to disk
await renderToFilesystem(capabilities, rootDatabase, '/path/to/snapshot');
// Restore the database from a snapshot (clears all existing entries first)
await scanFromFilesystem(capabilities, rootDatabase, '/path/to/snapshot');
Key → file-path mapping
Each raw LevelDB key is translated to a relative file path inside the snapshot directory. The algorithm depends on the key type:
Data sublevels (values, freshness, inputs, revdeps, counters, timestamps)
The stored key is a JSON-serialised NodeKey object {"head":"...","args":[...]}.
It is decomposed into human-readable path segments, similar to how /api/graph/nodes encodes
graph nodes in URLs:
!x!!values!{"head":"all_events","args":[]}
→ x/values/all_events
!x!!values!{"head":"event","args":["abc123"]}
→ x/values/event/abc123
!x!!values!{"head":"transcription","args":["/audio/x.mp3"]}
→ x/values/transcription/%2Faudio%2Fx.mp3
String arguments are percent-encoded: / → %2F, % → %25, ! → %21, and ~ → %7E.
In addition, literal dot-segment path components . and .. are encoded as %2E and %2E%2E
to prevent path traversal while keeping the key↔path mapping bijective. Non-string arguments
(numbers, booleans, arrays, objects) are JSON-encoded and prefixed with ~ so they remain
unambiguous even when string arguments begin with ~.
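The encoding rules above can be sketched as follows. The helper names (`escapeReserved`, `encodeArgument`, `dataKeyToPath`) are hypothetical and chosen for illustration. Two points are assumptions rather than facts from this document: that the JSON text of a non-string argument is escaped with the same four-character rule after the ~ marker, and that the node head is a plain identifier that needs no encoding.

```javascript
// Percent-encode the four reserved characters: % / ! ~
// A single pass over the input means `%` is never double-encoded.
function escapeReserved(text) {
  return text.replace(/[%\/!~]/g, (ch) =>
    '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0'));
}

// Hypothetical encoder for one NodeKey argument.
function encodeArgument(arg) {
  if (typeof arg !== 'string') {
    // Assumption: JSON text is escaped like a string, then prefixed with `~`.
    return '~' + escapeReserved(JSON.stringify(arg));
  }
  // Dot-segments are encoded so they cannot act as path traversal.
  if (arg === '.') return '%2E';
  if (arg === '..') return '%2E%2E';
  return escapeReserved(arg);
}

// Hypothetical helper: relative path for a key stored in a data sublevel.
// Assumption: the head is used as-is (plain identifier).
function dataKeyToPath(sublevels, storedKey) {
  const { head, args } = JSON.parse(storedKey);
  return [...sublevels, head, ...args.map(encodeArgument)].join('/');
}

console.log(dataKeyToPath(['x', 'values'],
  '{"head":"transcription","args":["/audio/x.mp3"]}'));
```

Because a string argument that starts with ~ is emitted as %7E…, a segment that begins with a literal ~ can only come from a non-string argument.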
Meta sublevels (_meta, meta)
The stored key is a plain string (e.g. format, version).
It is used as a single percent-encoded path segment:
!_meta!format → _meta/format
!x!!meta!version → x/meta/version
File-path → key mapping (inverse)
relativePathToKey is the exact inverse of keyToRelativePath:
- Determine sublevel depth: if the first segment is _meta → depth 1; otherwise depth 2.
- Extract sublevels: first depth segments.
- Determine key type: if the last sublevel is _meta or meta → plain string; otherwise NodeKey.
- Reconstruct key:
  - Plain string: decode the single remaining segment and reassemble the LevelDB key.
  - NodeKey: first remaining segment is the node head; subsequent segments are decoded arguments;
    reassemble as JSON.stringify({head, args}) and build the LevelDB key.
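The four steps above can be sketched as below. This is an illustrative reimplementation under stated assumptions, not the module's code: `relativePathToKeySketch`, `unescapeSegment`, and `decodeArgument` are hypothetical names, and the sketch assumes that a leading ~ marks a JSON-encoded non-string argument whose text was percent-encoded like a string.

```javascript
const META_SUBLEVELS = new Set(['_meta', 'meta']);

// Reverse the percent-encoding in a single pass.
function unescapeSegment(segment) {
  return segment.replace(/%([0-9A-F]{2})/g, (_, hex) =>
    String.fromCharCode(parseInt(hex, 16)));
}

// Assumption: `~` prefix means "JSON-encoded non-string argument".
function decodeArgument(segment) {
  return segment.startsWith('~')
    ? JSON.parse(unescapeSegment(segment.slice(1)))
    : unescapeSegment(segment);
}

function relativePathToKeySketch(relativePath) {
  const segments = relativePath.split('/');
  // Step 1: `_meta` sits at the top level, everything else under a namespace.
  const depth = segments[0] === '_meta' ? 1 : 2;
  // Step 2: split the sublevel names from the key segments.
  const sublevels = segments.slice(0, depth);
  const rest = segments.slice(depth);
  const prefix = sublevels.map((name) => `!${name}!`).join('');
  // Step 3 and 4a: meta sublevels store a plain string key.
  if (META_SUBLEVELS.has(sublevels[sublevels.length - 1])) {
    return prefix + unescapeSegment(rest[0]);
  }
  // Step 4b: head first, then the decoded arguments.
  const [head, ...encodedArgs] = rest;
  return prefix + JSON.stringify({ head, args: encodedArgs.map(decodeArgument) });
}

console.log(relativePathToKeySketch('x/values/event/abc123'));
console.log(relativePathToKeySketch('_meta/format'));
```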
Bijection guarantee
For all keys generated by this database, the key ↔ path mapping is a bijection, so the round trip key → path → key returns the original key:
relativePathToKey(keyToRelativePath(key)) === key // for all valid keys
The ! character in argument values is encoded as %21 before splitting, so it can never be
mistaken for the LevelDB sublevel separator. This is the P1 fix from the initial implementation.
Stale-key deletion (P2)
scanFromFilesystem clears all existing entries from the database before importing.
This ensures that keys present in the database but absent from the snapshot directory
(i.e., deleted entries) do not survive the restore, so the restored database contains exactly
the entries in the snapshot.
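A minimal sketch of why the clear step matters, using a plain Map as a stand-in for the live database (an assumption made purely for illustration; `restoreFromSnapshot` is a hypothetical name):

```javascript
// Stand-in for the live database (assumption: a Map instead of LevelDB).
const db = new Map([
  ['!x!!values!{"head":"event","args":["old"]}', 1],
  ['!x!!values!{"head":"all_events","args":[]}', 2],
]);

// Entries read back from a snapshot; the "old" event is absent from it.
const snapshot = [
  ['!x!!values!{"head":"all_events","args":[]}', 3],
];

// Hypothetical restore: clear first, then import every snapshot entry.
function restoreFromSnapshot(database, entries) {
  database.clear();
  for (const [key, value] of entries) {
    database.set(key, value);
  }
}

restoreFromSnapshot(db, snapshot);
console.log([...db.keys()]);
```

Without the `clear()` call, the "old" event would still be present after the restore even though the snapshot no longer contains it.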
Value serialisation
Values are stored as JSON. renderToFilesystem writes JSON.stringify(value) to each file;
scanFromFilesystem reads each file and calls JSON.parse(content) before writing back to the
database.
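The per-file round trip can be sketched with Node's built-in fs module. `writeValueFile` and `readValueFile` are hypothetical helpers that mirror the two halves described above; they are not the module's API, and the temporary directory exists only for the demonstration.

```javascript
const fs = require('fs/promises');
const os = require('os');
const path = require('path');

// Hypothetical write half: one file per key, containing JSON.stringify(value).
async function writeValueFile(filePath, value) {
  await fs.mkdir(path.dirname(filePath), { recursive: true });
  await fs.writeFile(filePath, JSON.stringify(value), 'utf8');
}

// Hypothetical read half: parse the file content back into a value.
async function readValueFile(filePath) {
  return JSON.parse(await fs.readFile(filePath, 'utf8'));
}

// Write a value into a throwaway snapshot directory and read it back.
async function demo() {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'snapshot-'));
  const file = path.join(dir, 'x', 'values', 'event', 'abc123');
  await writeValueFile(file, { title: 'hello', tags: ['a', 'b'] });
  const restored = await readValueFile(file);
  await fs.rm(dir, { recursive: true, force: true });
  return restored;
}

demo().then((restored) => console.log(restored));
```

Only values that JSON can represent survive this round trip unchanged; for example, `undefined` has no JSON form.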
No locking
Neither renderToFilesystem nor scanFromFilesystem acquires any lock. Callers that require
atomicity must arrange their own locking around these calls.
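One way a caller might serialise these calls within a single process is a promise-chain mutex. The sketch below is an assumption about how callers could arrange locking, not something the module provides; `createMutex` and `runExclusive` are hypothetical names, and `fakeRender` / `fakeScan` stand in for the real operations so the example runs on its own.

```javascript
// Minimal in-process mutex: each task starts only after the previous one settles.
function createMutex() {
  let tail = Promise.resolve();
  return function runExclusive(task) {
    const result = tail.then(() => task());
    // Keep the chain alive even if a task rejects.
    tail = result.catch(() => {});
    return result;
  };
}

const runExclusive = createMutex();
const events = [];
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Stand-ins for renderToFilesystem / scanFromFilesystem.
async function fakeRender() {
  events.push('render:start');
  await sleep(20);
  events.push('render:end');
}
async function fakeScan() {
  events.push('scan:start');
  await sleep(5);
  events.push('scan:end');
}

// Both calls are issued at once, but the scan waits for the render to finish.
const done = Promise.all([runExclusive(fakeRender), runExclusive(fakeScan)]);
done.then(() => console.log(events));
```

In real code the task would be something like `() => renderToFilesystem(capabilities, rootDatabase, dir)`. A mutex of this kind only covers callers in the same process that share the same `runExclusive` function; any other writer to the database would need to take part in the same scheme.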
Checkpointing and synchronisation
The live LevelDB now lives outside the git repository
(<workingDirectory>/generators-leveldb/). The git repository stores a rendered
filesystem snapshot under <workingDirectory>/generators-database/rendered/.
Three higher-level operations are available:
- checkpointDatabase(capabilities, message, rootDatabase) – renders the live database into the tracked snapshot directory and commits it (no-op if nothing has changed). Used for single rendered snapshots such as sync.
- runMigrationInTransaction(capabilities, rootDatabase, preMessage, postMessage, callback) – wraps the whole migration in one gitstore transaction, commits the rendered snapshot before the migration body runs, executes the migration, then commits the rendered post-migration snapshot in the same transaction.
- synchronizeNoLock(capabilities, options) – renders the current database, synchronises the rendered repository with the remote generators repository, and then scans the updated rendered snapshot back into the live database.
See docs/gitstore.md for the gitstore primitives that back these operations.