Data storage
Nauro is local-first: a project's context lives as plain UTF-8 files on your machine, not in a database. There are two on-disk locations. The project store holds the actual context under your home directory at ~/.nauro/projects/<project-id>/. A tiny repo-local pointer at <repo>/.nauro/config.json records only which project a repo belongs to. This page documents the layout, file formats, write discipline, and snapshots.
Where the store lives (two locations)
The context (decisions, current state, stack, open questions, snapshots) lives in the project store under your home directory. The store is keyed by a ULID in the v2 layout, or by project name in the v1 legacy layout.
A separate repo-local pointer sits inside each associated git repo. It records only the project id, name, and mode. It does not contain decisions or state.
- Nauro home defaults to ~/.nauro/, overridable by the NAURO_HOME environment variable. The constant is DEFAULT_NAURO_HOME = ".nauro" and the env var name is NAURO_HOME_ENV = "NAURO_HOME". Resolution: Path(os.environ.get(NAURO_HOME_ENV, Path.home() / DEFAULT_NAURO_HOME)).
- Project store path is derived from the project id (v2) or name (v1), under a projects/ subdir (PROJECTS_DIR = "projects"):
  - v2 (canonical): get_store_path_v2(project_id) resolves to ~/.nauro/projects/<ULID>/.
  - v1 (legacy): get_store_path(name) resolves to ~/.nauro/projects/<name>/.
- Repo-local config: each associated repo carries <repo>/.nauro/config.json. The constants are REPO_CONFIG_DIR = ".nauro" and REPO_CONFIG_FILENAME = "config.json". This file is a pointer only.
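The resolution rules in the list above can be sketched in a few lines. The constants and the resolution expression are taken from this page; the function names nauro_home and store_path_for are hypothetical and only illustrate how the pieces combine.

```python
import os
from pathlib import Path

DEFAULT_NAURO_HOME = ".nauro"
NAURO_HOME_ENV = "NAURO_HOME"
PROJECTS_DIR = "projects"


def nauro_home() -> Path:
    """Return $NAURO_HOME if it is set, otherwise ~/.nauro."""
    return Path(os.environ.get(NAURO_HOME_ENV, Path.home() / DEFAULT_NAURO_HOME))


def store_path_for(key: str) -> Path:
    """Hypothetical helper: join home, projects/, and a ULID (v2) or a name (v1)."""
    return nauro_home() / PROJECTS_DIR / key
```

Because the environment variable is read on every call, exporting NAURO_HOME before running a command is enough to point it at a different home in this sketch.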
The store is located from anywhere inside a repo worktree by walking up the directory tree for .nauro/config.json, exactly like git locating .git.
def find_repo_config(start: Path | None = None) -> Path | None:
    current = (start if start is not None else Path.cwd()).resolve()
    while True:
        candidate = current / REPO_CONFIG_DIR / REPO_CONFIG_FILENAME
        if candidate.is_file():
            return candidate
        if current.parent == current:  # filesystem root
            return None
        current = current.parent
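As a usage illustration, the sketch below performs the same upward walk and then reads the project id out of the pointer it finds. The helper name read_project_id is hypothetical, and returning None for a pointer without an id is an assumption made for brevity; the real loader validates the schema.

```python
import json
from pathlib import Path
from typing import Optional


def read_project_id(start: Path) -> Optional[str]:
    """Walk up from start to the nearest .nauro/config.json and return its "id"."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".nauro" / "config.json"
        if candidate.is_file():
            pointer = json.loads(candidate.read_text(encoding="utf-8"))
            return pointer.get("id")
    return None
```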
Source: packages/nauro/src/nauro/constants.py, store/registry.py, store/repo_config.py, store/config.py.
Nauro home directory layout
The home root holds two control-plane JSON files plus the projects/ tree. Each directory under projects/ is one project store, named by a ULID (v2) or a name (v1).
The annotated tree below shows a single project store with its Markdown files, the decisions/ subdir of numbered files, and the snapshots/ subdir of versioned JSON captures.
~/.nauro/                             # NAURO_HOME (override via $NAURO_HOME)
├── registry.json                     # REGISTRY_FILENAME: project list (v1 name-keyed or v2 id-keyed)
├── config.json                       # CONFIG_FILENAME: user settings (telemetry, search.embeddings); 0o600
└── projects/                         # PROJECTS_DIR
    └── 01J9Z3K7Q8XB.../              # one dir per project: ULID (v2) or name (v1); this is the store
        ├── project.md                # PROJECT_MD
        ├── state_current.md          # STATE_CURRENT_FILENAME
        ├── stack.md                  # STACK_MD
        ├── open-questions.md         # OPEN_QUESTIONS_MD
        ├── .decision-hashes.json     # DECISION_HASHES_FILE: exact-dup index for propose_decision
        ├── decisions/                # DECISIONS_DIR
        │   ├── 001-initial-setup.md
        │   ├── 002-use-postgres.md
        │   └── 042-adopt-event-sourcing.md
        └── snapshots/                # SNAPSHOTS_DIR
            ├── v001.json
            ├── v002.json
            └── v042.json
- Legacy stores may instead contain state.md (STATE_LEGACY_FILENAME / STATE_MD = "state.md") carrying both ## Current and ## History sections. The first update_state migrates these to state_current.md. STATE_HISTORY_FILENAME = "state_history.md" is the new-format history file.
- registry.json and config.json are NOT inside a project store. They live at the home root and are shared across all projects.
Source: nauro/constants.py, nauro_core/constants.py, store/registry.py, templates/scaffolds.py.
What nauro init scaffolds
scaffold_project_store(project_name, store_path) creates the store dir, the two subdirs, four Markdown files, and one teaching decision.
store_path.mkdir(parents=True, exist_ok=True)
(store_path / C.DECISIONS_DIR).mkdir(exist_ok=True) # decisions/
(store_path / C.SNAPSHOTS_DIR).mkdir(exist_ok=True) # snapshots/
(store_path / C.PROJECT_MD).write_text(...) # project.md
(store_path / C.STATE_CURRENT_FILENAME).write_text(...) # state_current.md
(store_path / C.STACK_MD).write_text(...) # stack.md
(store_path / C.OPEN_QUESTIONS_MD).write_text(...) # open-questions.md
(store_path / C.DECISIONS_DIR / "001-initial-setup.md").write_text(...) # first decision
- Templates are plain f-strings and str.format (no Jinja2); bracketed [prompts] guide you on what to fill in. state_current.md starts as "# Current State\n\n_(No state recorded yet.)_".
- .decision-hashes.json and the snapshot files are NOT created at init. They appear on the first propose_decision and the first snapshot capture, respectively.
- The first decision (001-initial-setup.md) is emitted via nauro_core.decision_model.format_decision (not a string template), so the on-disk decision format has a single source of truth in nauro-core.
Source: templates/scaffolds.py, cli/commands/init.py.
Decision files: naming, numbering, and Markdown format
Decision files are named decisions/{num:03d}-{slug}.md, for example 042-adopt-event-sourcing.md. The number is zero-padded to 3 digits; the slug is derived from the title.
- Next number is max(existing num) + 1 over all decision file stems (_next_decision_num); the very first is 1. Numbering is NOT serialized across concurrent writers (see the file-locking section for how collisions are tolerated and repaired).
- Slug rules (_slugify): lowercased, runs of non-alphanumerics collapse to a single -, trimmed of leading and trailing -, capped at SLUG_MAX_LENGTH = 60 (_SLUG_MAX_LENGTH = 60), truncating on a word (-) boundary.
- File enumeration: FilesystemStore.list_decisions() returns sorted *.md stems from decisions/ (for example "042-adopt-event-sourcing"), and read_decision(stem) reads decisions/<stem>.md.
- extract_decision_number() parses a number from any of: a file stem like "042-some-title" (or .md), a synthetic id "decision-042", a prefixed "D042" / "D42", or a bare "42". It returns None on no match.
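The naming rules above can be approximated as follows. This is a minimal sketch rather than the project's own _slugify or extract_decision_number: the function names, the ASCII-only notion of "alphanumeric", and the exact regular expressions are assumptions.

```python
import re
from typing import Iterable, Optional

SLUG_MAX_LENGTH = 60


def slugify(title: str, max_length: int = SLUG_MAX_LENGTH) -> str:
    """Lowercase, collapse non-alphanumeric runs to '-', trim, cap on a '-' boundary."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    if len(slug) > max_length:
        cut = slug[:max_length]
        # If the cut lands mid-word, drop the partial word.
        if slug[max_length] != "-" and "-" in cut:
            cut = cut.rsplit("-", 1)[0]
        slug = cut.strip("-")
    return slug


def extract_number(ref: str) -> Optional[int]:
    """Parse "042-title", "042-title.md", "decision-042", "D042", "D42", or "42"."""
    match = re.fullmatch(r"(?:decision-|D)?(\d+)(?:-.*)?(?:\.md)?", ref.strip())
    return int(match.group(1)) if match else None


def next_decision_num(stems: Iterable[str]) -> int:
    """max(existing) + 1 over all stems; 1 when there are none."""
    nums = [n for n in (extract_number(s) for s in stems) if n is not None]
    return max(nums, default=0) + 1


def decision_filename(num: int, title: str) -> str:
    """Zero-pad the number to 3 digits and append the slug."""
    return f"{num:03d}-{slugify(title)}.md"
```

Nothing here takes a lock, which mirrors the documented behavior: two callers that list the same stems compute the same next number.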
The on-disk format is YAML frontmatter (between --- fences) plus a Markdown body. The canonical v2 shape is produced by format_decision.
---
date: 2026-04-16
version: 1
status: active
confidence: high
decision_type: data_model
reversibility: hard
source: mcp
files_affected:
- src/store/schema.py
supersedes: '12'
---
# 042 — Adopt event sourcing
## Decision
We will model the order aggregate as an append-only event log...
## Rejected Alternatives
### CRUD with audit table
Audit drifts from the source of truth; reconstruction is lossy.
- The H1 uses an em-dash separator: # NNN — Title (the regex requires —). The body must contain a ## Decision section (v2 does NOT accept ## Rationale). Rejected alternatives render as a ## Rejected Alternatives section with ### name subsections; the reason is the subsection body. rejected is NOT in frontmatter.
- Frontmatter key order is fixed: date, version, status, confidence, decision_type, reversibility, source, files_affected, supersedes, superseded_by. format(parse(x)) is byte-identical (idempotent round-trip).
- Supersession refs are stored as plain integer strings: "70", never "070", "070-slug", or "D70". status=superseded requires superseded_by; an active decision with a reasonless rejected alternative raises.
- Parsing is strict: missing or unterminated frontmatter, a missing H1, a missing ## Decision, malformed YAML, or non-mapping frontmatter all raise ValueError / ValidationError. The kernel scan (parse_all_decisions) logs-and-skips files that fail v2 parsing, so mid-migration files can sit on disk without blocking writes.
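The structural part of these strictness rules can be illustrated with the standard library alone. The sketch below checks only the fences, the H1, and the ## Decision heading; it does not parse or validate the YAML, and the function name split_decision and the H1 pattern are assumptions rather than the parser in nauro_core.

```python
import re
from typing import Tuple

EM_DASH = "\u2014"  # the H1 separator must be an em-dash, not a hyphen
H1_RE = re.compile(rf"^# (\d+) {EM_DASH} (.+)$", re.MULTILINE)
DECISION_RE = re.compile(r"^## Decision\s*$", re.MULTILINE)


def split_decision(text: str) -> Tuple[str, int, str]:
    """Return (frontmatter_text, num, title), raising ValueError on structural errors."""
    lines = text.splitlines()
    if not lines or lines[0] != "---":
        raise ValueError("missing frontmatter")
    try:
        close = lines.index("---", 1)
    except ValueError:
        raise ValueError("unterminated frontmatter") from None
    frontmatter = "\n".join(lines[1:close])
    body = "\n".join(lines[close + 1:])
    h1 = H1_RE.search(body)
    if h1 is None:
        raise ValueError("missing H1 of the form '# NNN <em-dash> Title'")
    if DECISION_RE.search(body) is None:
        raise ValueError("missing '## Decision' section")
    return frontmatter, int(h1.group(1)), h1.group(2).strip()
```

A scan that wants the log-and-skip behavior described above would wrap each call in try/except ValueError and continue with the next file.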
The field schema lives in Decision (Pydantic, extra="forbid"). The required, defaulted, and optional fields are summarized below. See the Core concepts page for the full field schema and validation rules.
| Field | Requirement | Values / notes |
|---|---|---|
| date | required | ISO date |
| confidence | required | high, medium, low |
| version | defaulted | 1, constraint >=1 |
| status | defaulted | active (also superseded) |
| decision_type | optional | architecture, api_design, infrastructure, pattern, refactor, data_model |
| reversibility | optional | easy, moderate, hard |
| source | optional | mcp, commit, compaction, manual, import |
| files_affected | optional | list of paths |
| supersedes / superseded_by | optional | integer-string refs |
Fields derived from the filename and body, and excluded from frontmatter: num, title, rationale, body, content.
Source: nauro_core/decision_model.py, nauro_core/parsing.py, operations/propose_decision.py, store/reader.py, nauro/constants.py.
Decision dedup index
.decision-hashes.json is an in-store JSON index used by propose_decision Tier 1 screening to reject exact-duplicate decisions. The filename constant is DECISION_HASHES_FILE = ".decision-hashes.json" and it lives at the store root, not in decisions/.
- Shape: maps a content hash to {"decision_id": "<stem>", "timestamp": "<ISO>"}. The key is compute_hash(title, rationale) = hashlib.sha256(f"{title.strip().lower()}|{rationale.strip().lower()}".encode()).hexdigest(), a SHA-256 of the normalized, pipe-joined title and rationale.
- Written with json.dumps(index, indent=2) + "\n". It is updated after every successful new-decision write, so subsequent Tier 1 checks catch exact dups. A missing or corrupt file is treated as an empty index (no crash).
{
  "9f1c...e2": {
    "decision_id": "042-adopt-event-sourcing",
    "timestamp": "2026-04-16T10:22:03.114512+00:00"
  }
}
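A minimal sketch of how such an index could be consulted and updated follows. compute_hash reproduces the expression given above; load_index, find_exact_duplicate, and record_decision are hypothetical helper names, and the real Tier 1 screening in propose_decision may be organized differently.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

DECISION_HASHES_FILE = ".decision-hashes.json"


def compute_hash(title: str, rationale: str) -> str:
    """SHA-256 of the normalized, pipe-joined title and rationale."""
    normalized = f"{title.strip().lower()}|{rationale.strip().lower()}"
    return hashlib.sha256(normalized.encode()).hexdigest()


def load_index(store: Path) -> dict:
    """A missing or corrupt index is treated as empty."""
    try:
        data = json.loads((store / DECISION_HASHES_FILE).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def find_exact_duplicate(store: Path, title: str, rationale: str) -> Optional[str]:
    """Return the decision_id of an exact duplicate, or None."""
    entry = load_index(store).get(compute_hash(title, rationale))
    return entry.get("decision_id") if isinstance(entry, dict) else None


def record_decision(store: Path, decision_id: str, title: str, rationale: str) -> None:
    """Add the new decision's hash after a successful write."""
    index = load_index(store)
    index[compute_hash(title, rationale)] = {
        "decision_id": decision_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    (store / DECISION_HASHES_FILE).write_text(
        json.dumps(index, indent=2) + "\n", encoding="utf-8"
    )
```

Because normalization only strips and lowercases, two proposals that differ in any other character hash differently; catching near-duplicates is outside what this index does.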
Source: nauro_core/operations/propose_decision.py, nauro_core/constants.py.
Other store Markdown files
- project.md: project overview with a one-liner, Goals, Non-goals, Users, and Constraints.
- state_current.md: current working state (sprint, focus, and blockers narrative). Diff fields tracked: STATE_DIFF_FIELDS = ("Sprint", "Focus", "Blockers").
- stack.md: tech choices with rationale and rejected alternatives. Empty marker: STACK_EMPTY_MARKER = "# Stack\n<!-- Tech choices with rationale and rejected alternatives -->". Parsers extract a one-line tech list (extract_stack_oneliner) or a summary (extract_stack_summary).
- open-questions.md: open questions as - [ ] / - bullets or ### headings. A ## Resolved section holds resolved ones (skipped by parse_questions). propose_decision(resolves_questions=[...]) moves named questions into ## Resolved, stamping the resolving decision number and date.
- These files plus the legacy state.md feed the L0 token estimate (TOKEN_ESTIMATE_FILES) and validation (VALIDATED_STORE_FILES).
Source: templates/scaffolds.py, nauro_core/parsing.py, nauro_core/constants.py, nauro/constants.py.
Atomic writes and file locking
There are two distinct write disciplines. Control-plane JSON files use an atomic tmp-write plus rename; store files are written in place under a per-file lock.
1. Control-plane JSON (registry.json, config.json, repo .nauro/config.json) uses atomic_write_text, a tmp-write-then-os.replace.
def atomic_write_text(path: Path, text: str, *, mode: int | None = None) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(text)
    if mode is not None:
        os.chmod(tmp, mode)  # chmod tmp BEFORE rename, so target is never briefly world-readable
    os.replace(tmp, path)  # atomic on a single filesystem
- The rename is atomic on a single filesystem, so a reader never sees a partial target. There is deliberately no fsync; crash-durability is an explicit non-goal here, and durability scope is atomic-replace only.
- config.json is written owner-only via mode=0o600 (save_config). registry.json and repo configs use the default umask.
2. Store files (under a project store): FilesystemStore.write_file takes a per-target FileLock (from the filelock library) named <file>.lock, then writes.
def write_file(self, path: str, content: str) -> None:
    target = self._store_path / path
    target.parent.mkdir(parents=True, exist_ok=True)
    lock = target.with_name(target.name + ".lock")
    with FileLock(str(lock)):
        target.write_text(content)
- Reads are unlocked. read_file resolves the path and rejects anything that escapes store_path (path-traversal guard: target.relative_to(store_path.resolve())); it returns None for missing or non-file targets.
- There is no cross-file lock serializing decision numbering across concurrent writers. Two writers can race between list_decisions() and write_file() and mint the same num. This collision is intentionally accepted and, per the code comment, "caught and repaired on the next sync-pull".
- Control-plane mutations also serialize read-modify-write with a FileLock on a sidecar .lock (_registry_lock, _config_lock). These locks are NOT re-entrant: a transaction body must not open another transaction or it deadlocks. config_transaction() reloads fresh under the lock, yields the dict, and saves on clean exit; a raising body skips the save entirely.
Source: store/_atomic.py, store/filesystem_store.py, store/config.py, store/registry.py.
Snapshots: capture format and logarithmic pruning
Snapshots are point-in-time JSON captures of the whole store, written to snapshots/v{NNN}.json (3-digit zero-padded version), then pruned after every capture.
Capture (capture_snapshot) bundles all root-level *.md files plus every decisions/*.md (keyed decisions/<name>.md) into a files dict, with an auto-incremented version (max existing + 1, or 1). Serialization is delegated to the pure nauro_core.snapshot.serialize_snapshot so local and cloud capture paths cannot drift.
out_path = snapshots_dir / f"v{next_version:03d}.json"
out_path.write_text(json.dumps(snapshot, indent=2) + "\n")
_prune_snapshots(snapshots_dir) # prune after every capture
The canonical snapshot dict has keys in fixed order:
{
  "schema_version": 1,
  "version": 42,
  "timestamp": "2026-04-16T10:22:03.114512+00:00",
  "trigger": "propose_decision",
  "trigger_detail": "042-adopt-event-sourcing",
  "token_count": 1287,
  "files": {
    "project.md": "# ...",
    "state_current.md": "# Current State ...",
    "decisions/042-adopt-event-sourcing.md": "---\n..."
  }
}
SNAPSHOT_SCHEMA_VERSION = 1; snapshots written before the field existed read back as LEGACY_SCHEMA_VERSION = 0 (via normalize_snapshot). token_count = sum(len(content)) // CHARS_PER_TOKEN with CHARS_PER_TOKEN = 4. The serializer is side-effect-free (no datetime.now, no I/O, no regex); the caller supplies the ISO timestamp.
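A side-effect-free serializer of this kind might look like the sketch below. The key order, constants, and token formula come from this page; the function names build_snapshot, snapshot_filename, and schema_version_of, and the parameter order, are assumptions and not the signature of serialize_snapshot or normalize_snapshot.

```python
from typing import Dict, Optional

SNAPSHOT_SCHEMA_VERSION = 1
LEGACY_SCHEMA_VERSION = 0
CHARS_PER_TOKEN = 4


def build_snapshot(
    version: int,
    timestamp: str,
    trigger: str,
    files: Dict[str, str],
    trigger_detail: Optional[str] = None,
) -> dict:
    """Pure function: the caller supplies the timestamp; no clock, no I/O."""
    token_count = sum(len(content) for content in files.values()) // CHARS_PER_TOKEN
    # Dict insertion order fixes the key order of the serialized JSON.
    return {
        "schema_version": SNAPSHOT_SCHEMA_VERSION,
        "version": version,
        "timestamp": timestamp,
        "trigger": trigger,
        "trigger_detail": trigger_detail,
        "token_count": token_count,
        "files": dict(files),
    }


def snapshot_filename(version: int) -> str:
    """3-digit zero-padded version, for example v042.json."""
    return f"v{version:03d}.json"


def schema_version_of(snapshot: dict) -> int:
    """Snapshots written before the field existed read back as legacy (0)."""
    return snapshot.get("schema_version", LEGACY_SCHEMA_VERSION)
```

Keeping the clock outside the function is what lets two capture paths produce identical output for identical inputs.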
Logarithmic pruning (_prune_snapshots) buckets snapshots by age:
| Age bucket | Constant | Keep policy |
|---|---|---|
| Last 7 days | PRUNE_KEEP_ALL_DAYS = 7 | keep every snapshot |
| Last 30 days | PRUNE_DAILY_DAYS = 30 | one per day (newest of each day, %Y-%m-%d) |
| Last 6 months | PRUNE_WEEKLY_DAYS = 180 | one per week (%Y-W%W) |
| Older than 6 months | (no constant) | one per month (%Y-%m) |
keep = {latest}        # newest snapshot always kept
keep.update(pinned)    # auto-pinned snapshots always kept
# ... bucket the rest by age and keep newest per bucket ...
for snap in snapshots:
    if snap["path"] not in keep:
        snap["path"].unlink()
- Auto-pin: a snapshot whose decisions/ file count is greater than the previous snapshot's is pinned and never pruned. This preserves the full decision chain (_count_decisions counts keys starting with decisions/).
- The latest snapshot is always kept. Pruning skips entirely when there is <= 1 snapshot.
- A snapshot with an unparseable timestamp (a user-editable field) is left on disk untouched rather than allowed to break pruning of valid snapshots.
- Read helpers: list_snapshots (metadata, newest-first), load_snapshot(version) (full dict), find_snapshot_near_date(target) (most recent at-or-before target, else oldest), and resolve_diff_snapshots(days) (baseline/latest pair for diff_since_last_session).
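The keep-set computation can be sketched as a pure function over snapshot metadata. The thresholds and strftime formats come from the table above; the input shape (dicts with version, timestamp, and decisions keys), the function names, and the choice not to pin the very first snapshot are assumptions. Snapshots with unparseable timestamps are assumed to have been filtered out beforehand.

```python
from datetime import datetime
from typing import Dict, List, Optional, Set

PRUNE_KEEP_ALL_DAYS = 7
PRUNE_DAILY_DAYS = 30
PRUNE_WEEKLY_DAYS = 180


def bucket_key(ts: datetime, now: datetime) -> Optional[str]:
    """Return None for the keep-all window, else a day, week, or month label."""
    age_days = (now - ts).days
    if age_days <= PRUNE_KEEP_ALL_DAYS:
        return None
    if age_days <= PRUNE_DAILY_DAYS:
        return ts.strftime("d:%Y-%m-%d")
    if age_days <= PRUNE_WEEKLY_DAYS:
        return ts.strftime("w:%Y-W%W")
    return ts.strftime("m:%Y-%m")


def versions_to_keep(snapshots: List[dict], now: datetime) -> Set[int]:
    """Each snapshot dict has 'version', 'timestamp' (datetime), 'decisions' (int)."""
    ordered = sorted(snapshots, key=lambda s: s["version"])
    if len(ordered) <= 1:
        return {s["version"] for s in ordered}  # nothing to prune
    keep = {ordered[-1]["version"]}  # the latest snapshot is always kept
    # Auto-pin: the decision count grew relative to the previous snapshot.
    for prev, cur in zip(ordered, ordered[1:]):
        if cur["decisions"] > prev["decisions"]:
            keep.add(cur["version"])
    newest_per_bucket: Dict[str, dict] = {}
    for snap in sorted(ordered, key=lambda s: s["timestamp"]):
        key = bucket_key(snap["timestamp"], now)
        if key is None:
            keep.add(snap["version"])
        else:
            newest_per_bucket[key] = snap  # later timestamps overwrite earlier ones
    keep.update(s["version"] for s in newest_per_bucket.values())
    return keep
```

Everything not in the returned set would be unlinked, as in the excerpt above.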
Source: store/snapshot.py, nauro_core/snapshot.py, nauro_core/constants.py, nauro/constants.py.
Control-plane files
~/.nauro/registry.json is the project list. There are two schemas, mutually exclusive on disk.
- v1 (REGISTRY_SCHEMA_VERSION_V1 = 1): keyed by project name, store path ~/.nauro/projects/<name>/. Entry: {"repo_paths": [...]}.
- v2 (REGISTRY_SCHEMA_VERSION_V2 = 2, canonical): keyed by project id (ULID), store path ~/.nauro/projects/<id>/. The entry carries name, mode (local / cloud), repo_paths, and (cloud only) server_url.
{
  "schema_version": 2,
  "projects": {
    "01J9Z3K7Q8XBV2N5...": {
      "name": "nauro",
      "mode": "local",
      "repo_paths": ["/Users/me/code/nauro"]
    }
  }
}
- load_registry_v2 strictly refuses a v1 file (raising RegistrySchemaError telling you to run a one-time manual migration); auto-migration is intentionally out of scope. Read-path lookups fall back to treating a v1 file as "no v2 entries" (_load_registry_v2_or_empty).
- A repo path resolves to at most one project; resolution walks up the repo dir tree against registered repo_paths.
- remove_project_v2 deletes the registry entry but deliberately leaves the on-disk store intact, so a mistaken removal does not destroy decision history.
<repo>/.nauro/config.json is the repo-local pointer (REPO_CONFIG_SCHEMA_VERSION = 1).
{ "mode": "local", "id": "01J9Z3K7Q8XBV2N5...", "name": "nauro", "schema_version": 1 }
- Cloud mode additionally requires server_url. The id is either a CLI-minted local ULID or a server-minted cloud ULID, never both and never neither. An unknown schema_version is rejected with a clear upgrade message; corrupt JSON is remapped to the same RepoConfigSchemaError family.
- ULID (generate_ulid): a standard 26-char Crockford-base32 ULID = 48-bit ms timestamp + 80 bits randomness. Local ids are minted CLI-side; cloud ids arrive from the server. Alphabet: 0123456789ABCDEFGHJKMNPQRSTVWXYZ.
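For readers unfamiliar with the format, a ULID of this shape can be produced as sketched below. This follows the public ULID layout (48-bit timestamp in the high bits, 80 random bits in the low bits) and is not the project's generate_ulid; the name sketch_ulid and the optional timestamp_ms parameter are assumptions added to make the example testable.

```python
import os
import time
from typing import Optional

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def sketch_ulid(timestamp_ms: Optional[int] = None) -> str:
    """Encode a 48-bit ms timestamp plus 80 random bits as 26 base32 characters."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    timestamp = timestamp_ms & ((1 << 48) - 1)
    randomness = int.from_bytes(os.urandom(10), "big")  # 10 bytes = 80 bits
    value = (timestamp << 80) | randomness
    chars = []
    for _ in range(26):  # 26 chars x 5 bits covers the 128-bit value
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))
```

Because the timestamp occupies the leading ten characters, ids minted in later milliseconds sort after earlier ones as plain strings.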
~/.nauro/config.json holds user settings (not per project): a telemetry section (anonymous_id, enabled, consent_version, consented_at) and a search.embeddings flag. It is written 0o600. The NAURO_TELEMETRY=0 and NAURO_EMBEDDINGS env vars override the persisted values at read time.
Source: store/registry.py, store/repo_config.py, store/config.py, cli/commands/init.py, nauro/constants.py.
Local-first and privacy
- The store is plain files on your machine. There is no database; reads and writes are filesystem operations. You can inspect, edit, diff, and version-control the Markdown directly.
- The Store protocol the operations kernel depends on is minimal, with six primitives: read_file, write_file, delete_file, list_decisions, read_decision, and the bulk read_decisions. The local CLI and stdio MCP supply FilesystemStore; cloud supplies an S3 plus DynamoDB implementation behind the same protocol. Implementations own path traversal, locking, and backend error mapping at their boundary.
- Cloud sync (when enabled) stores project context (decisions, state, open questions, NOT source code) encrypted in AWS S3 (us-east-1, SSE-S3), isolated per project behind a fail-closed membership check so a request reaches a project only if the account is a member of it. The remote MCP reads context from S3 and delivers it to the connected AI tool.
- Telemetry is default opt-in and anonymous (a per-machine UUID only). It never sends decision content, titles, or rationale; file paths; repo or project names; or MCP arguments and return values. Opt out via NAURO_TELEMETRY=0 or nauro telemetry disable (persists to ~/.nauro/config.json).
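To make the six-primitive boundary concrete, the sketch below expresses it as a typing.Protocol with a toy in-memory backend. Only the method names, write_file(path, content), read_file returning None for a missing file, and list_decisions returning sorted stems are taken from this page; the remaining signatures and the InMemoryStore class are assumptions for illustration.

```python
from typing import Dict, List, Optional, Protocol


class Store(Protocol):
    """Assumed shape of the six primitives; signatures are illustrative."""

    def read_file(self, path: str) -> Optional[str]: ...
    def write_file(self, path: str, content: str) -> None: ...
    def delete_file(self, path: str) -> None: ...
    def list_decisions(self) -> List[str]: ...
    def read_decision(self, stem: str) -> Optional[str]: ...
    def read_decisions(self, stems: List[str]) -> Dict[str, str]: ...


class InMemoryStore:
    """Hypothetical backend that satisfies the protocol structurally."""

    def __init__(self) -> None:
        self._files: Dict[str, str] = {}

    def read_file(self, path: str) -> Optional[str]:
        return self._files.get(path)

    def write_file(self, path: str, content: str) -> None:
        self._files[path] = content

    def delete_file(self, path: str) -> None:
        self._files.pop(path, None)

    def list_decisions(self) -> List[str]:
        prefix, suffix = "decisions/", ".md"
        return sorted(
            p[len(prefix):-len(suffix)]
            for p in self._files
            if p.startswith(prefix) and p.endswith(suffix)
        )

    def read_decision(self, stem: str) -> Optional[str]:
        return self.read_file(f"decisions/{stem}.md")

    def read_decisions(self, stems: List[str]) -> Dict[str, str]:
        found: Dict[str, str] = {}
        for stem in stems:
            content = self.read_decision(stem)
            if content is not None:
                found[stem] = content
        return found


def count_decisions(store: Store) -> int:
    """Kernel-style code depends only on the protocol, not on a backend."""
    return len(store.list_decisions())
```

Code written against the protocol, such as count_decisions, runs unchanged against any backend that provides the same six methods.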
Source: nauro_core/operations/store.py, store/filesystem_store.py, nauro/PRIVACY.md.
Key facts
- The project store lives at ~/.nauro/projects/<project-id>/ (v2, ULID-keyed) or ~/.nauro/projects/<name>/ (v1 legacy); home is overridable via NAURO_HOME. Default home constant DEFAULT_NAURO_HOME = ".nauro".
- A separate repo-local <repo>/.nauro/config.json is a pointer only (mode, id, name, and optionally server_url); the store is found by walking up the tree for it, git-style.
- Store contents: project.md, state_current.md, stack.md, open-questions.md, .decision-hashes.json, decisions/ (numbered Markdown), and snapshots/ (versioned JSON). Legacy stores may have state.md.
- Decision files are named decisions/{num:03d}-{slug}.md (slug capped at 60 chars); number = max(existing)+1. Format is YAML frontmatter plus a Markdown body with # NNN — Title (em-dash) and a required ## Decision section; rejected alternatives render as ## Rejected Alternatives / ### name.
- .decision-hashes.json is an exact-dup index (content hash to {decision_id, timestamp}) used by propose_decision Tier 1.
- Control-plane JSON (registry.json, config.json, repo config.json) is written atomically via tmp-write plus os.replace (no fsync; crash-durability is a non-goal). config.json is 0o600.
- Store-file writes take a per-file FileLock (<file>.lock); reads are unlocked. There is no cross-file lock for decision numbering; racing collisions are tolerated and repaired on the next sync-pull.
- Snapshots: snapshots/v{NNN:03d}.json (SNAPSHOT_SCHEMA_VERSION = 1), captured on demand, pruned after every capture. Logarithmic buckets: keep-all <=7d, daily <=30d, weekly <=180d, monthly older. The latest is always kept; snapshots that increased the decisions/ count are auto-pinned and never pruned.