Conversation
…tasks

ADR-016 specifies the complete undo/recovery system for TigerFS:
- Per-table operation log (hypertable with SkipScan-optimized indexes)
- Separate savepoint table with data-first pipeline access
- Preview-then-apply undo interface (.undo/id/, .undo/to-id/, .undo/to-savepoint/)
- Diff symlinks (before/after/current) on log entries using /dev/null for missing states
- UUIDv7 timestamp+base36 display format (lossless, case-insensitive safe)
- User identity via .info/user (mount-level, in-memory)
- Auto-savepoints on write inactivity gaps
- Pipeline filter composition with .apply (all except .sample/)
- Skills updates establishing savepoints as core agent workflow

Phase 12 implementation plan: 12 tasks from infrastructure (UUIDv7, symlinks) through features (log, savepoints, undo) to documentation.
Replace the old second-precision lossy version ID format with a lossless timestamp+base36 display name for UUIDv7 values used as filenames:

Old: 2026-04-07T143000Z
New: 2026-04-07T143000.123Z-zzz0063hd8e5r42

The new format encodes all 122 meaningful bits of the UUIDv7 (48-bit ms timestamp + 74 entropy bits as base36). Fully reversible with no DB lookup. Case-insensitive safe (0-9a-z only) for macOS APFS.

Changes:
- format/uuidv7.go: Core UUIDv7ToDisplayName/DisplayNameToUUIDv7, IsUUIDv7, IsDisplayName, ExtractUUIDv7Time, pack/unpack entropy
- db/query.go: scanAndEncodePK detects UUIDv7 PKs and uses the display format for directory listings
- db/pk_match.go: pkDecodeSingle recognizes display names and converts them back to hex UUIDs for DB queries
- fs/synth/format.go: Delegates to the format package, keeps backward-compatible wrappers (UUIDv7ToVersionID, VersionIDToTimestamp)

Tests (32 new):
- format/uuidv7_test.go: Round-trip (100 random), zero/max/small entropy boundaries, format validation, IsUUIDv7 edge cases (v0/v1/v4/v7 with variant combos), IsDisplayName edge cases, invalid inputs, hex UUID rejection, known-value, millisecond precision ordering
- db/pk_match_test.go: Display name decode to hex UUID, hex UUID passthrough, non-UUID passthrough, full encode-decode round-trip for v7 and v4, scanAndEncodePK for v7/v4/non-UUID values
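One plausible reading of the encoding above, as a minimal Go sketch: the first 48 bits become the millisecond timestamp, the remaining 74 entropy bits (12 from rand_a, 62 from rand_b) are packed into a 15-digit base36 string. The function name matches the text, but the exact bit layout and padding here are assumptions, not TigerFS's actual implementation.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
	"time"
)

// displayName sketches the ADR-016 format: 48-bit ms timestamp rendered
// as 2006-01-02T150405.000Z, plus the 74 non-version/non-variant bits
// as 15 base36 digits. Bit-packing order is an illustrative assumption.
func displayName(u [16]byte) string {
	ms := int64(0)
	for _, b := range u[:6] {
		ms = ms<<8 | int64(b)
	}
	ts := time.UnixMilli(ms).UTC().Format("2006-01-02T150405.000Z")

	entropy := new(big.Int)
	// 12 low bits of bytes 6-7 (version nibble masked off)...
	entropy.SetInt64(int64(u[6]&0x0f)<<8 | int64(u[7]))
	// ...then 62 bits of bytes 8-15 (variant bits masked off).
	lo := new(big.Int).SetBytes(u[8:])
	lo.And(lo, new(big.Int).Sub(new(big.Int).Lsh(big.NewInt(1), 62), big.NewInt(1)))
	entropy.Lsh(entropy, 62).Or(entropy, lo)

	b36 := entropy.Text(36) // 36^15 > 2^74, so 15 digits always suffice
	return ts + "-" + strings.Repeat("0", 15-len(b36)) + b36
}

func main() {
	var u [16]byte // zero timestamp and entropy
	u[6] = 0x70    // version 7
	u[8] = 0x80    // RFC 4122 variant
	fmt.Println(displayName(u)) // 1970-01-01T000000.000Z-000000000000000
}
```

Decoding reverses the same steps (parse the timestamp, big.Int base36 parse, re-spread the bits), which is why no DB lookup is needed.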
Add symlink plumbing so that later tasks (.log/ diff symlinks, .undo/ preview /dev/null symlinks) can use it immediately. All changes go through the shared fs.Operations layer used by both NFS and FUSE adapters -- no legacy FUSE code is touched.

Shared core layer (fs/):
- Entry.Target field for symlink target path
- Entry.IsSymlink() helper (checks os.ModeSymlink in Mode)
- Operations.Readlink() -- Stat the path, verify it's a symlink, return its Target

NFS adapter (nfs/ops_filesystem.go):
- Readlink() delegates to ops.Readlink()
- opsFileInfo.Mode() preserves the os.ModeSymlink bit (go-nfs checks Mode() & os.ModeSymlink for symlink detection)

FUSE adapter (fuse/ops_node.go, fuse/adapter.go):
- OpsNode.Readlink() delegates to adapter.ops.Readlink() (OpsNode is the shared ops-based FUSE node, not legacy)
- EntryToAttr handles S_IFLNK for symlink entries
- EntriesToDirEntries handles S_IFLNK in directory listings
- Lookup sets S_IFLNK in StableAttr for symlink inodes

Note: user-created symlinks (ln -s) remain unsupported -- Symlink() still returns an error. These are virtual read-only symlinks that TigerFS computes for specific internal paths.

Tests (13 new):
- fs/: IsSymlink for all types, IsDir+ModeSymlink precedence, empty target, size convention, Readlink error paths (non-symlink, nonexistent)
- fuse/: EntryToAttr and EntriesToDirEntries with symlinks
- nfs/: opsFileInfo.Mode preserves ModeSymlink correctly
Extend GenerateHistorySQL in synth/build.go to create the operation log hypertable and savepoint table alongside the existing history table. These are only created for history-enabled synth apps.

Log table (tigerfs.<app>_log):
- UUIDv7 PK (log_id), user_id, type with CHECK constraint, file_id, filename (denormalized), history_id (pointer to before-state), description
- Uses modern TimescaleDB CREATE TABLE WITH syntax: tsdb.hypertable, tsdb.partition_column, tsdb.chunk_interval (7 days), tsdb.segmentby (file_id), tsdb.orderby (log_id ASC)
- Columnstore policy auto-created by TimescaleDB using the chunk interval
- Composite index (file_id, log_id ASC) for SkipScan on undo queries

Savepoint table (tigerfs.<app>_savepoint):
- UUIDv7 PK (savepoint_id), user_id, name (UNIQUE), description
- Regular table (not a hypertable -- savepoints are small)

GenerateHistorySQL now returns 11 statements (was 8): 8 history + 2 log (table + index) + 1 savepoint.

Tests updated:
- Statement count expectations updated across 5 existing tests
- New assertions verify the log table: all columns, CHECK constraint, tsdb.hypertable, chunk_interval, segmentby/orderby, composite index
- New assertions verify the savepoint table: PK, UNIQUE name
- Non-history app test verifies NO log/savepoint tables are created
Every write to a history-enabled synth app now creates a log entry in the _log hypertable. This is the foundation for undo operations.

DB layer (db/):
- LogWriter interface: InsertLogEntry, QueryLatestHistoryID
- InsertLogEntry inserts into the log hypertable with all ADR-016 columns (user_id NULL for now -- wired in Task 12.5)
- QueryLatestHistoryID fetches the most recent _history_id for a file_id, used to capture the before-state pointer after UPDATE/DELETE
- MockLogWriter + MockLogEntry for testing

Core layer (fs/synth_ops.go):
- logSynthOp helper: checks HasHistory, captures history_id for update/delete ops via QueryLatestHistoryID, calls InsertLogEntry. Best-effort (log failures warn but don't fail the write)
- writeSynthFile: logs "insert" (with the returned PK) and "update"
- deleteSynthFile: logs "delete" for both directory and file deletes
- renameSynthFile: logs "update" with the new filename for single-file renames

Known gap: directory prefix renames (RenameByPrefix) are NOT logged. This is a correctness issue for undo-to-savepoint -- to be addressed before Task 12.5.

Tests:
- 5 unit tests: insert, update (with history_id), delete (with history_id), rename (new filename + history_id), no-history skip
- 1 integration test against real PostgreSQL+TimescaleDB: creates a synth app with history, performs insert/update/insert/delete/rename, queries the _log table directly to verify all 5 entries with correct type/filename/file_id/history_id, verifies hypertable and savepoint table existence

Demo: the existing seed.sh creates 3 history-enabled apps (blog, docs, snippets) with creates+updates -- log tables are auto-populated.
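The best-effort contract of logSynthOp can be sketched as follows. The `LogWriter` interface name is from the text; the writer implementations and the helper's exact signature are illustrative stand-ins for the real db/ mocks and fs/synth_ops.go wiring.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// LogWriter matches the interface named above (simplified signature).
type LogWriter interface {
	InsertLogEntry(opType, fileID, filename string) error
}

// recordingWriter stands in for MockLogWriter.
type recordingWriter struct{ entries []string }

func (w *recordingWriter) InsertLogEntry(opType, _, filename string) error {
	w.entries = append(w.entries, opType+" "+filename)
	return nil
}

type failingWriter struct{}

func (failingWriter) InsertLogEntry(_, _, _ string) error {
	return errors.New("log hypertable unavailable")
}

// logSynthOp is best-effort: a logging failure warns but never fails
// the write that triggered it.
func logSynthOp(w LogWriter, opType, fileID, filename string) {
	if err := w.InsertLogEntry(opType, fileID, filename); err != nil {
		log.Printf("warn: could not log %s of %s: %v", opType, filename, err)
	}
}

func main() {
	rec := &recordingWriter{}
	logSynthOp(rec, "insert", "f1", "notes/todo.md")
	logSynthOp(failingWriter{}, "update", "f1", "notes/todo.md") // warns only
	fmt.Println(rec.entries)
}
```

The design choice is that durability of user data outranks completeness of the audit log: a degraded log beats a failed write.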
ADR-017 replaces path-encoded filenames (ADR-011) with a parent-pointer model: filename stores the leaf name only, and parent_id references the parent directory row. This makes directory renames single-row operations, solving the undo log batch problem identified during Phase 12.

Key decisions:
- Source table: parent_id UUID (self-ref FK, DEFERRABLE INITIALLY DEFERRED), UNIQUE NULLS NOT DISTINCT (parent_id, filename, filetype)
- History table: file_id, parent_id, version_id (was _history_id), operation (was _operation), segmentby=file_id
- Log table: version_id (was history_id), filesystem-centric types (create/edit/rename/delete/undo)
- Path resolution: PL/pgSQL resolve_path function (1 round-trip) + Go-level path cache (2s TTL)
- Undo: all operations are single-row, standard DISTINCT ON + UPSERT. DEFERRABLE FK/UNIQUE allow any ordering within undo transactions.
- Filtered undo may fail with FK errors (deleted parent) -- the transaction rolls back safely. Unfiltered undo-to-savepoint always succeeds.

Phase 13 (12 tasks): schema, path resolution, ReadDir, writes, rename/move, delete, history, log entries, migration script, tests (45 verification scenarios), ADR-016 updates, documentation.

ADR-011 marked as superseded by ADR-017. Must complete Phase 13 before resuming Phase 12 (task 12.5+).
Source tables: add parent_id self-referencing FK, replace UNIQUE(filename, filetype) with UNIQUE NULLS NOT DISTINCT (parent_id, filename, filetype). Both constraints are DEFERRABLE INITIALLY IMMEDIATE -- normal ops check immediately, undo transactions explicitly SET CONSTRAINTS ALL DEFERRED.

History tables: rename id->file_id, _history_id->version_id, _operation->operation. Add parent_id column and CHECK constraints on filetype/encoding/operation. Use modern CREATE TABLE WITH syntax for hypertable+columnstore (replaces separate create_hypertable/compress calls).

Log tables: rename history_id->version_id. Filesystem-centric operation types: create/edit/rename/delete/undo (replaces insert/update/delete/undo).

New DDL: resolve_path PL/pgSQL function (EXECUTE+format for the dynamic REGCLASS table), parent_id index for ReadDir and path resolution.

Add ParentID to ColumnRoles with detection and frontmatter exclusion. Runtime column renames across query.go, interfaces.go, mocks.go, synth_ops.go, history.go to match the new schema. InsertIfNotExists catches unique-violation errors (SQLSTATE 23505) instead of using ON CONFLICT DO NOTHING, since PostgreSQL disallows ON CONFLICT with deferrable constraints. Update isVersionID to recognize UUIDv7 display names (ADR-016 format).
…3.2)

Add a ResolvePath DB wrapper that calls the tigerfs.resolve_path PL/pgSQL function, returning PathSegment results (depth, ID, name) for each resolved segment. Add a PathResolver interface and mock.

Add pathCache with a 2-second TTL matching the stat cache. Maps (parentID, filename) -> rowID per table, with lookup/put/invalidate.

Add resolveSynthPath on Operations combining cache + DB: walks segments checking the cache at each level, calls resolve_path only for unresolved segments, and populates the cache from the results. Supports partial cache hits (sibling access pattern) and graceful DB error handling.

19 unit tests covering all 6 ADR-017 verification scenarios (#12-17): path cache (10 tests: put/lookup/miss/TTL/invalidate/namespaces), resolution (9 tests: cold cache, partial hit, sibling reuse, 5-level deep nesting, nonexistent path, root level, invalidation, DB error).
…13.3-13.7)

Tasks 13.3-13.7 are inherently coupled -- all share the invariant that "filename" means leaf name, not full path. Implemented together:

ReadDir (13.3): Add GetRowsByParent DB method (WHERE parent_id = X). readDirSynthView and readDirSynthHierarchical use it for parent-pointer tables, resolving directory paths via resolveSynthPath. buildEntriesFromRows converts query results to entries. primeSynthStatCache takes pathPrefix for correct stat cache keys in subdirectories.

Write (13.4): ensureSynthParentDirs chains parent_id values, resolving each directory after creation. writeSynthFile stores leaf filename + parent_id on INSERT and uses resolveSynthPath for the UPDATE existence check. mkdirSynth resolves the parent path and inserts with parent_id.

Rename (13.5): renameSynthFile does a single-row UPDATE SET filename=newLeaf WHERE id=X. Cross-directory moves also update parent_id. Replaces N-row RenameByPrefix for parent-pointer tables.

Delete (13.6): deleteSynthFile resolves the path to a UUID, checks for children via GetRowsByParent(dirID, 1), and deletes by PK. Replaces HasChildrenWithPrefix.

History (13.7): Add QueryHistoryDistinctFilenamesByParent for per-directory .history/ listing. The historyDBFilename helper returns the leaf name for the parent-pointer model and the full path for the old model. All history file lookups (stat, read, .id, version) use the helper.

Old code paths are preserved for pre-migration databases, guarded by info.Roles.ParentID != "".
Unit tests (12 tests in synth_parent_test.go):
- joinPathPrefix: empty prefix, single level, multi-level
- historyDBFilename: parent-pointer returns leaf, old model returns full path
- buildEntriesFromRows: mixed file+dir, empty, plain text format
- resolveSynthRow: found, not found, DB error
- Parent-pointer operations via mocks: create root file, edit existing, mkdir at root, delete by UUID, rename within the same directory

Integration tests (9 new tests in synthesized_test.go):
- MoveFileBetweenDirs: mv inbox/task.md archive/task.md (ADR-017 #5)
- MoveDirBetweenParents: mv parent1/child parent2/child (ADR-017 #9)
- EmptyDirReadDir: ls on an empty dir returns nothing (ADR-017 #20)
- SameLeafNameDifferentDirs: docs/readme.md + guides/readme.md coexist with independent content; editing one doesn't affect the other
- RenameDirChildrenUnaffected: rename a dir, children remain accessible under the new name
- NestedReadDirAtEachLevel: 3-level hierarchy, correct entries at each level
- DeleteNestedFileParentPersists: delete a child, the parent dir remains
- HistoryAfterRename: .history/ shows the old filename from pre-rename versions
- RootFilesAndDirsCoexist: file + directory at root level, correct types
… (Task 13.8)

Log type renames and the version_id column were done in Task 13.1. logSynthOp already receives the full denormalized path from the filesystem layer.

Two new integration tests:
- TestSynth_LogEntries_NestedFiles: verifies all 5 operation types on nested files store full paths (create projects/web/todo.md, edit, rename to done.md, move to archive/done.md, delete)
- TestSynth_LogEntries_DirRenameOneEntry (ADR-017 #44): renames a directory with 3 child files and verifies exactly 1 log entry is created (single-row operation), not N entries as the old prefix-based model would produce
…3.9)

New migration "relational-directories" in the tigerfs migrate framework converts synth apps from path-encoded filenames (ADR-011) to the parent-pointer directory model (ADR-017). Auto-discovers apps via view comments.

Per-app migration steps:
- Add parent_id column, populate it by walking the old path hierarchy
- Strip filenames to leaf names, replace the UNIQUE constraint
- Add FK (DEFERRABLE INITIALLY IMMEDIATE) and parent index
- Recreate the view to include the new column (PostgreSQL SELECT * snapshots columns at view creation time)
- Rename history columns: id->file_id, _history_id->version_id, _operation->operation; populate parent_id from the source table
- Rename log column history_id->version_id; update type values (insert->create, update->edit); reorder: drop CHECK, rename, add CHECK

Integration test verifies full before/after: creates old-schema tables with hierarchical data + history + log entries, runs describe/dry-run/execute, verifies the parent_id chain, leaf filenames, column renames, type renames, TigerFS ReadDir/ReadFile/WriteFile on migrated data, and idempotency.

Updated ADR-017 and implementation-tasks to reference the tigerfs migrate framework instead of a standalone SQL script.
…tion

Log a one-time warning during synth cache loading when a hierarchical app has a filetype column but no parent_id (legacy directory model). Suggests running 'tigerfs migrate' for improved directory performance. The warning fires once per schema load, same pattern as warnLegacyBackingTables.
…e (Task 13.10)

Audit of all 45 ADR-017 verification scenarios:
- #1-21: All covered (basic ops, path resolution, ReadDir)
- #22-36: Deferred to Phase 12 (undo scenarios)
- #37-40: All covered (history navigation)
- #41-42: Deferred (multi-agent concurrent access)
- #43-45: #44 covered; #43/#45 deferred (demo/undo)

New test: TestSynth_HistoryAfterMoveAccessibleByUUID (#40)
- Creates a file in inbox, edits it to produce history, moves it to archive
- Verifies .by/<uuid>/ lists the file UUID after the move
- Reads the oldest version via .by/<uuid>/<version> and confirms v1 content
- Tests that the move creates a second history entry (BEFORE trigger on parent_id change) and that both versions are accessible

Fix: historyDBFilename now extracts the leaf name from greedy-parsed multi-segment paths (e.g., "inbox/task.md" -> "task.md") for parent-pointer model history lookups.
ADR-016 updates:
- Log schema: history_id -> version_id, type values insert/update -> create/edit/rename, modern CREATE TABLE WITH syntax (replaces separate create_hypertable + ALTER TABLE SET + add_compression_policy)
- History column refs: _history_id -> version_id, _operation -> operation
- Undo SQL examples: updated column names, added parent_id to the UPSERT, type values create/edit/rename/delete in the undo action table
- Implementation mapping: added rename and move as separate rows

Phase 12 tasks (implementation-tasks.md):
- Task 12.3: version_id column name
- Task 12.4: version_id capture, updated test expectations
…ask 13.12)

Backing table schema: add parent_id FK, encoding column, UNIQUE NULLS NOT DISTINCT with DEFERRABLE INITIALLY IMMEDIATE. Add an explanation of the parent-pointer directory model.

History table schema: id->file_id, _history_id->version_id, _operation->operation, add parent_id, CHECK constraints, modern CREATE TABLE WITH syntax for hypertable+columnstore. 7-day chunks, segmentby file_id.

Version display: updated from the old timestamp format to UUIDv7 timestamp+base36 display names (ADR-016 Section 11).

ADR-017 already written. No skills files to update. Phase 13 complete: 12/12 tasks done.
ReadFile optimization (ADR-017 "ReadFile / Stat" section): Add fetchSynthRowByPath, which resolves parent segments via the cache, then fetches the leaf row with a single combined query (SELECT * WHERE parent_id = X AND filename = leaf). Saves one round-trip vs the previous resolve_path -> GetRow approach. Add a GetRowByParentAndName DB method with NULL-safe parent_id handling.

ADR-017 fixes:
- Status: Draft -> Accepted
- resolve_path SQL: corrected from pseudocode to actual EXECUTE format()
- Key operations table: Rename/Move file uses UPDATE WHERE id (not CAS)
- UNIQUE constraint section: clarified that undo must explicitly call SET CONSTRAINTS ALL DEFERRED (INITIALLY IMMEDIATE checks immediately)
- "Removed code" -> "Deprecated code": reflects that old paths are kept behind info.Roles.ParentID guards for pre-migration databases
- Migration section: already updated in a prior commit
…it/rename/delete

Source table id DEFAULT changed from gen_random_uuid() (v4) to uuidv7(). The history trigger now uses filesystem-centric operation types: it compares OLD vs NEW to distinguish edit/rename/delete (no GUC or extra column). Migration updated to convert existing operation values and the id DEFAULT.
…ommand

The migration warning now shows the actual filesystem path (e.g., "/tmp/mig-test/docs") by plumbing the mount point through to Operations. Default schema paths use the mount root; non-default ones use the DirSchemas constant.

Cleaned up jargon in user-facing messages:
- Warning: "legacy directory format detected" with the full mount path
- Migration summaries: removed "synth", "ADR-017", "parent-pointer"
- Migration output: "Migrated N views" instead of "N items"

Added scripts/test-migration.sh for manual migration testing.
Add UserID to Config, a --user-id mount flag, and a TIGERFS_USER_ID env var (precedence: flag > env > empty). Stored in-memory on the Operations struct.

Root-level /.info/ directory with a read/write "user" file:
- cat /.info/user -> returns the current identity
- echo "agent-9" > /.info/user -> changes identity for subsequent ops
- New PathRootInfo type in the path parser, distinct from table-level PathInfo

Wire userID into logSynthOp: all log entries now include the mount-level identity. Empty string = anonymous (NULL in the database).

18 unit tests: path parsing (3), get/set/config (3), ReadDir/Stat/ReadFile/WriteFile for .info/user (9), log entry userID wiring (3, including an identity change mid-session).
Track userIDModTime so Stat returns a stable mtime (prevents editors from warning "file has changed since visited"). Silently ignore writes to unknown .info/ files (e.g., Emacs #user# temp files).
.log/ and .savepoint/ redirect FSContext to tigerfs._log/_savepoint tables and delegate to existing pipeline parsing. .undo/ parses multi-level routing (id/to-id/to-savepoint), target, pipeline filters, and .apply/.info/summary leaves. 34 unit tests covering all path variants from ADR-016 task description.
- Log type names: INSERT/UPDATE/DELETE/UNDO -> create/edit/rename/delete/undo
- Symlink resolution rules: use ADR-017 type names in the state matrix, diff examples, and /dev/null explanations
- Simplify the state matrix from 6 rows (with UNDO variants) to 5 rows, with a note that resolution depends on version_id, not type
- Update the symlink implementation note: Task 12.2 already added symlink support (was "TigerFS has no symlink support today")
- Fix _version_id -> version_id (stale underscore in Section 11)
- UPSERT description: "UPDATE entries" -> "edit/rename entries"
…k 12.7)

Show .log/, .savepoint/, .undo/ in synth app ReadDir (when history is enabled). Wire PathLog/PathSavepoint into the ReadDir/Stat dispatchers, delegating to the existing data-first table handlers.

Diff symlinks (before/after/current) on log entry row directories:
- before: version_id -> .history/<filename>/<display_name>, or /dev/null
- after: next log entry lookup, falling back to the current file or /dev/null
- current: file existence check, live path or /dev/null

Resolution uses the QueryNextLogEntry and QueryFileExists DB methods.

Integration tests: ReadDir listing (.log/ appears, entries are dirs, column access works), diff symlinks (create = /dev/null before, edit points to .history/, delete = /dev/null after and current).
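The before/current rules above reduce to a small decision table. This sketch models them with illustrative types (a "" versionID stands in for NULL); the `after` rule is omitted because it additionally needs the next-entry DB lookup.

```go
package main

import "fmt"

// logEntry is an illustrative stand-in for a _log row; versionID == ""
// models a NULL version_id (no before-state, i.e. a create).
type logEntry struct {
	filename  string
	versionID string
}

const devNull = "/dev/null"

// beforeLink: version_id -> .history/<filename>/<display_name>,
// or /dev/null when there was no prior state.
func beforeLink(e logEntry) string {
	if e.versionID == "" {
		return devNull
	}
	return ".history/" + e.filename + "/" + e.versionID
}

// currentLink: live path if the file still exists, else /dev/null.
func currentLink(e logEntry, fileExists bool) string {
	if !fileExists {
		return devNull
	}
	return e.filename
}

func main() {
	create := logEntry{filename: "post.md"}
	edit := logEntry{filename: "post.md", versionID: "2026-04-07T143000.123Z-zzz0063hd8e5r42"}
	fmt.Println(beforeLink(create))       // /dev/null
	fmt.Println(beforeLink(edit))         // .history/post.md/2026-...
	fmt.Println(currentLink(edit, true))  // post.md
	fmt.Println(currentLink(edit, false)) // /dev/null
}
```

Using /dev/null for missing states keeps `diff $(readlink before) $(readlink after)` valid in every cell of the matrix.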
Unit tests (16):
- synthUndoDirs: HasHistory true/false
- uuidToDisplayName: valid UUIDv7 hex, already a display name, non-UUID
- parseUUIDBytes: valid, invalid length, no dashes
- resolveLogDiffSymlink full state matrix (8 cases): before NULL/non-NULL, after with next entry/NULL/no-next-exists/no-next-deleted, current exists/deleted

Integration tests (4 new, 6 total):
- AfterChain: create->edit->edit, first edit's after points to .history/
- Pipeline: .log/.last/2 returns exactly 2 entries
- NestedFilename: a nested file's log entry has the denormalized full path
- UserIdentity: the .info/user value appears in the log entry user_id column
Custom readDirSavepoint lists by name, ordered chronologically. .first/N returns the oldest, .last/N the newest. .by/ filters compose. Name-based row lookup for Stat/ReadFile/ReadColumn.

Write support: create (touch/echo), update column, delete. user_id is auto-populated.

Fix: pipeline processors now preserve the PathLog/PathSavepoint/PathUndo type through capability chains (was unconditionally resetting to PathTable).

5 integration tests: CRUD, chronological .last/N and .first/N ordering, .by/user_id/.last/N filter composition, user_id population, stat 404.

Demo: add before-content and before-edits savepoints to the blog app.
10 unit tests for savepoint handlers: path type preservation (4), write with/without description/user_id (3), delete by name (1), row-by-name lookup found/not found (2). 5 integration tests: CRUD, pipeline .last/N, filter by user, user_id population, stat 404.

NFS fix: clear the file cache for .savepoint/ and .info/ paths after a write to prevent go-nfs handle type conflicts (file handle from the write vs directory handle from ReadDir).

Known issue: readDirSavepoint (name-based display) causes empty NFS listings due to a go-nfs READDIRPLUS handle issue. Using the readDirTable fallback (shows savepoint_id UUIDs) for now; readDirSavepoint is retained for a future fix, and stat cache priming was added for when it's re-enabled.

Demo: add a --debug flag, add savepoints to the blog seed, fix the mac mount path (was missing --insecure-no-ssl).
…are-path creates

On macOS NFS, creating a new row via a bare path (echo data > table/pk) causes the entry to silently disappear from ls output. The NFS client caches the FILE inode type from the CREATE RPC, but READDIRPLUS returns the same name as a DIRECTORY (rows have column children). The client sees conflicting types and drops the entry. Format suffixes (.tsv, .json) avoid this because the write path differs from the listing name.

Additionally, writeRowFile was not merging the filename-derived PK into the INSERT columns. Writing to /categories/test-cat.tsv with TSV headers "name\tdescription" would INSERT without the slug column, failing on NOT NULL constraints or creating rows with auto-generated PKs instead of the specified value.

Changes:
- Merge PK columns from the filename into the INSERT when not present in the data body
- Reject bare-path inserts with an error suggesting a .tsv/.json/.csv suffix
- Allow bare-path updates (existing rows have no inode type conflict)
- Update the bare-path write example in implementation-tasks.md
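The PK-merge fix is essentially a map default. A minimal sketch, assuming a column map derived from the TSV body and a hypothetical helper name (the real writeRowFile plumbing differs):

```go
package main

import "fmt"

// mergeFilenamePK adds the PK parsed from the filename into the INSERT
// column set, but only when the body didn't already supply it -- the
// body value wins on conflict. Helper and column names are illustrative.
func mergeFilenamePK(cols map[string]string, pkCol, pkFromName string) map[string]string {
	if _, ok := cols[pkCol]; !ok {
		cols[pkCol] = pkFromName
	}
	return cols
}

func main() {
	// TSV body carried only name and description...
	cols := map[string]string{"name": "Test", "description": "demo"}
	// ...so /categories/test-cat.tsv contributes slug=test-cat.
	cols = mergeFilenamePK(cols, "slug", "test-cat")
	fmt.Println(cols["slug"]) // test-cat
}
```

Without the merge, the INSERT above would omit slug entirely, hitting exactly the NOT NULL / auto-generated-PK failure modes described.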
The macOS NFS client caches inode type from CREATE RPCs. When a bare file write (echo > .savepoint/name) creates a FILE inode, the client conflicts with the DIRECTORY type returned by READDIRPLUS for the same name, silently dropping the entry from ls. This was confirmed with both savepoints and data rows: bare-path writes are invisible, but format-suffixed writes (.tsv, .json) work because the write path differs from the listing name.

This change simplifies the savepoint model so format-suffixed writes naturally work through the standard table operations:
- Make name the PRIMARY KEY (was savepoint_id). Standard readDirTable now lists savepoints by human-readable name instead of UUIDs. All CRUD operations work through the generic row/column dispatch.
- Remove all savepoint-specific CRUD functions: readDirSavepoint, getSavepointRowByName, writeSavepointColumn, deleteSavepoint, and QuerySavepointNames. These existed only to translate between the name column and the savepoint_id PK.
- Keep a thin writeSavepoint wrapper that injects user_id from the mount identity into the TSV/JSON/CSV body before delegating to writeRowFile.
- Add format extension stripping to processSegmentsFrom so .savepoint and .log paths support .tsv/.json/.csv suffixes on row names.
- Retain savepoint_id as a UNIQUE column with a UUIDv7 default for undo time-ordering (log_id > savepoint_id comparisons).
- TestSynth_StatDirectory_UsesModifiedAt: unit test verifying that statSynthFile reads modified_at from the database row for directories (not CachedMountTime or time.Now())
- TestMount_DirMtime_ReflectsCreate: mount-level test verifying that os.Stat on a directory returns an updated mtime after file creation (full path: trigger -> DB -> stat cache -> NFS GETATTR -> os.Stat)
- TestMount_DirMtime_StableOnEdit: mount-level test verifying that os.Stat on a directory returns an unchanged mtime after a content edit
- Backing Table Schema: document both triggers (modified_at BEFORE UPDATE; parent mtime AFTER INSERT/DELETE/UPDATE OF parent_id,filename)
- Timestamps: add a directory mtime section explaining the POSIX-correct behavior and cross-mount visibility
- NFS Write Limitations: document the noac mount option and its rationale (stale attributes after undo; the cost is NFS round-trips, not SQL queries)
- CLI Subcommands: add the migrate command with its modes and the migration list
Print operation counts/percentages (grouped into file, directory, and other) and a log-scale histogram of created-file sizes after final validation, so seed replays surface the shape of the workload.
Bias source-dir selection toward non-empty dirs (70/30) so the recursive case is regularly exercised; log file/subdir counts in each op's description.

Tighten canExecute for move_dir: checking pool counts alone is insufficient, because a lone root-child dir has no valid destination (root is its current parent). OpMoveDir now enumerates all valid (src, dest) pairs deterministically instead of using a random-retry loop, so it never fails when a feasible pair exists.

Repro for the canExecute bug: seed 1777060269495462000, iter 10.
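The enumerate-all-pairs idea can be sketched as follows: a pair is valid when the destination is neither the source itself, the source's current parent, nor a descendant of the source. The `dir` struct and path conventions here are illustrative, not the stress test's actual types.

```go
package main

import "fmt"

type dir struct{ path, parent string }

// validMovePairs deterministically enumerates every legal (src, dest)
// pair, replacing the random-retry loop: feasibility is decided by
// whether this slice is non-empty, never by luck.
func validMovePairs(dirs []dir) [][2]string {
	var pairs [][2]string
	for _, src := range dirs {
		for _, dest := range dirs {
			if dest.path == src.path || dest.path == src.parent {
				continue // no-op move, or dest is src's current parent
			}
			if len(dest.path) > len(src.path) && dest.path[:len(src.path)+1] == src.path+"/" {
				continue // dest lives inside src: would create a cycle
			}
			pairs = append(pairs, [2]string{src.path, dest.path})
		}
	}
	return pairs
}

func main() {
	// A lone root-child dir: root is its current parent, so no pairs --
	// the canExecute case the pool-count check got wrong.
	fmt.Println(len(validMovePairs([]dir{{"a", "/"}})))
	// Two root-child dirs can move into each other.
	fmt.Println(len(validMovePairs([]dir{{"a", "/"}, {"b", "/"}})))
}
```

Once the pairs exist, the op can pick one at a seeded-random index and still stay reproducible.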
ExecuteUndoTransaction restores rows via UPSERT in file_id ASC order. When a child file's file_id is older than its parent dir's (e.g., after a move into a newer directory), the child is upserted first and trips parent_id_fkey (SQLSTATE 23503), aborting the undo with EIO.

ADR-017 already designed the schema for this: parent_id_fkey and the (parent_id, filename, filetype) uniqueness constraint are DEFERRABLE INITIALLY IMMEDIATE specifically so undo transactions can SET CONSTRAINTS ALL DEFERRED at the start. The runtime side was just never wired up. Adding it lets restorations commit in any order; PostgreSQL checks the deferred constraints at COMMIT, after every row is in place.

Repro seed: 1777065886566418000 -- now runs past the previously-failing step 586 without an FK violation.

Tests: a unit test passes RestoreFileIDs in deterministic child-first order to force the FK ordering case; the integration test creates the file at root before the dir, then moves it in, so the file's file_id is older and the restore order matches. Both fail without the deferral and pass with it.
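The transaction shape this describes can be sketched as a statement sequence: defer the DEFERRABLE constraints immediately after BEGIN so the UPSERTs may run in any order, with PostgreSQL re-checking FK and UNIQUE at COMMIT. `SET CONSTRAINTS ALL DEFERRED` is real PostgreSQL syntax; the builder function and placeholder UPSERT strings are illustrative, not TigerFS's generated SQL.

```go
package main

import "fmt"

// undoTxStatements sketches the order of operations inside an undo
// transaction: deferral first, restore UPSERTs in whatever order the
// DISTINCT ON query produced them, then COMMIT triggers the checks.
func undoTxStatements(upserts []string) []string {
	stmts := []string{"BEGIN", "SET CONSTRAINTS ALL DEFERRED"}
	stmts = append(stmts, upserts...)
	return append(stmts, "COMMIT")
}

func main() {
	for _, s := range undoTxStatements([]string{
		"-- child row first: its parent_id FK is only checked at COMMIT",
		"-- parent dir row second",
	}) {
		fmt.Println(s)
	}
}
```

The key property is that correctness no longer depends on restore order, which is exactly what the file_id-ASC ordering could not guarantee after moves.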
The bump_parent_mtime AFTER trigger normally keeps directory mtimes
fresh on child changes, but it can miss bumps during undo:
* Restored rows inserted via UPSERT's INSERT branch carry their
historical modified_at (pre-delete value), not now().
* With deferred FK ordering, a child can be inserted before its
parent dir row exists. The trigger's UPDATE on the (missing)
parent matches 0 rows, and the parent restored later doesn't
re-trigger the bump on its already-inserted children.
When the dir's mtime doesn't change, NFS clients with `noac` don't
invalidate their readdir cache. The result is a ghost listing where
`ls` shows a file but `stat`/`open` returns ENOENT.
Force the bump explicitly after the UPSERTs: each restored row's own
mtime, plus the mtime of any parent dir that contains a restored row.
Repro seeds 1777097481681738000 and 1777065886566418000 now run all
1000 iterations without state-mismatch failures.
Extends the existing unit test to assert modified_at lands within 60s
of now(), against history rows seeded with modified_at = now() - 1h.
Adds five tests pinning down corners of the recent undo fixes that
weren't isolated by the existing coverage:
- TestExecuteUndoTransaction_DefersUniqueConstraint: rename-as-replace
undo. Forces a transient (parent_id, filename, filetype) UNIQUE
violation that only resolves at COMMIT (deferred-UNIQUE half of
SET CONSTRAINTS ALL DEFERRED).
- TestExecuteUndoTransaction_BumpsParentMtimeWhenOnlyChildRestored:
parent dir already exists; only the child is in RestoreFileIDs.
Asserts the parent's modified_at is bumped via the
SELECT DISTINCT parent_id subquery.
- TestExecuteUndoTransaction_DefersFKMultiLevel: 3-level chain
restored in reverse order so the FK chain only resolves at COMMIT.
Catches a regression if deferral is narrowed.
- TestExecuteUndoTransaction_DeleteOnlyUndo: undo with no
RestoreFileIDs. Guardrail for the empty-array branch of the
mtime-bump SQL.
- TestUndo_ToID_RestoresMultipleFilesInDeletedDir: end-to-end multi-
file restore via .apply, verifying ReadDir surfaces every sibling
and content reads back correctly.
Also extends TestUndo_ToID_RestoresDeletedDirWithFile with a ModTime
assertion. The dir's only restored child has an older file_id, so the
bump_parent_mtime trigger can't help the parent dir on its own.
A 500ms sleep between delete and undo absorbs container/host wall-clock
skew while still distinguishing pre-delete vs post-undo mtimes.
Each new assertion was confirmed to fail cleanly when the corresponding
fix is reverted.
ExecuteUndoSingle has resolved display-name log_ids since inception (undo.go:186). ExecuteUndoToLogID was missing the same call, so when writeUndoApply passed parsed.UndoTarget through (which is the display-name path segment per the path parser), the SQL got a non-UUID string.

pgx encodes the parameter as TEXT, postgres applies an implicit cast on log_id (UUID -> text) for the comparison, and the resulting lexicographic WHERE log_id::text > '2026-04-25T...' is always false (UUID texts begin with '0'-'9' digits, display names begin with the year '2', and digits '0'-'1' sort below '2'). Result: zero rows match, FilesRestored=0, no error -- a silent no-op for any user who copies a log id from ls .log/.by/type/... into .undo/to-id/<id>/.apply.

The stress test masked this by reading log_ids from .log/.last/.../json, which emits raw UUIDs; only the user-facing display-name path was broken. A one-line fix mirrors ExecuteUndoSingle.

Test: a new integration test reads a display-name log_id from .log/.by/type/create and applies undo via .undo/to-id/<display-name>/.apply, asserting the file content rolls back. Without the fix the file stays at the post-edit content (silent no-op).
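The always-false comparison is plain string ordering, reproducible in two lines. The UUID value below is illustrative; the display name follows the ADR-016 shape from the text.

```go
package main

import "fmt"

// sortsBelow reports whether a raw UUID rendered as text sorts below a
// display name -- the root cause of the silent no-op described above.
func sortsBelow(uuidText, displayName string) bool {
	return uuidText < displayName
}

func main() {
	uuidText := "0196b1c2-0000-7000-8000-000000000000" // illustrative UUIDv7 text
	displayName := "2026-04-25T120000.000Z-000000000000000"
	// WHERE log_id::text > '<display-name>' can never hold:
	// the UUID's leading '0' sorts below the display name's '2'.
	fmt.Println(uuidText > displayName) // false
	fmt.Println(sortsBelow(uuidText, displayName))
}
```

This is also why no error surfaced: the query is syntactically valid and simply matches nothing.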
statSynthFile populates statCache under parsed.Context.Schema (the
user's schema where the view lives). writeUndoApply and ExecuteUndo
were both invalidating with synth.TigerFSSchema instead, so cached
entries -- including negative entries -- survived an undo.
Symptom: ops Delete + Stat (caches negative) + .apply undo + Stat
returns "file not found: <name> (cached)" even though the row is
back in the DB.
Three changes:
- writeUndoApply: capture parsed.Context.Schema as cacheSchema and
use it for ExecuteUndoSingle/ToLogID/ToSavepoint *and* for the
final statCache/pathCache invalidate calls.
- ExecuteUndo: invalidate statCache under the schema parameter
(not synth.TigerFSSchema). The doc comment had said "schema is
unused"; it now correctly identifies the user-facing schema.
- ExecuteUndoSingle already used `schema` for invalidation, but
callers were feeding it synth.TigerFSSchema; fixing
writeUndoApply's call sites makes that path correct too.
Synth backing tables themselves still live in synth.TigerFSSchema --
we only changed cache keys, not table locations.
Test: TestUndo_Apply_InvalidatesNegativeStatCache reproduces the
exact "(cached)" symptom by populating a negative entry between
delete and undo. Also adds a sanity Stat back into
TestUndo_ToID_RestoresDeletedDirWithFile (which had been removed as
a workaround for this bug).
Two stress-test correctness fixes:
- OpRenameDir: randomDirName has 12 prefixes * 1000 suffixes = 12k
options and can collide with an existing dir's basename. When that
happened, os.Rename(A, A) failed with EEXIST and ended the run
(observed: seed 1777150178369383000, step 500). Re-roll up to 5
times until a different name comes back.
- WorkspaceState.RenameDir: ranged over ws.Files / ws.Dirs while
inserting new keys, which is undefined per the Go spec
(newly-added keys may or may not be visited). Worked in practice
because new keys never shared the old prefix, but it's still UB.
Refactored to collect moves first, then apply after each loop.
Verified:
- Existing TestRenameDir still covers the basic case.
- Seed 1777150178369383000 now completes 500 iterations cleanly.
OpDeleteDir used os.RemoveAll and pushed a single stack entry covering
the whole operation, with lastLogID = the most recent log entry from
the recursive delete (the dir's own removal). When undo_single later
targeted that lastLogID, TigerFS correctly undid one log entry --
restoring just the dir row -- but the stress test's stack pop returned
"state before delete_dir" (everything restored). Validation then
flagged the still-deleted children as missing files. Observed on seed
500.

Refactor: delete each child individually via os.Remove in post-order.
For each deletion, push a fresh stack entry capturing the in-progress
state and pin the just-emitted log_id. The first deletion uses the
runner-supplied entry; subsequent ones get newly pushed entries. This
gives undo_single a one-to-one mapping between log entries and stack
states for any child of a delete_dir.

Also: walk the actual filesystem (os.ReadDir) rather than relying on
WorkspaceState. mkdirSynth in TigerFS doesn't log, so undo can't roll
back Mkdir-created dirs; the test's tracked state drifts from
TigerFS's actual state across an undo, leaving "phantom" dirs the FS
still has but state doesn't. A state-based deletion order would miss
them, and os.Remove on the parent would fail with ENOTEMPTY (surfaced
as EIO via the NFS adapter). Walking the FS finds phantoms naturally.

Tests: TestCollectDeletionOrder verifies post-order + hidden-entry
skip + scoping; TestCollectDeletionOrder_PhantomDir specifically pins
down the FS-walk fallback. 8 random seeds (200..1000) plus the two
originally-failing seeds (1777097481681738000, 1777065886566418000)
now all complete 500 iterations cleanly. (Seed 100 still hits an
unrelated flaky retry in OpMoveFile -- the random "pick a different
dir" loop can fail probabilistically with just two dirs available; not
addressed here.)
OpMoveFile picked the destination directory by sampling pools.Dirs at
random for up to 10 attempts, accepting the first one that wasn't the
file's current parent. canExecute requires len(pools.Dirs) >= 2, so a
valid destination always exists, but with only 2 dirs (root + one)
each attempt has a 50% chance of landing on oldDir; 10 attempts fail
~0.1% of the time and the op errors out with "no different directory
available". Observed on seed 100, step 312.

Same fix pattern as OpMoveDir: enumerate dirs other than oldDir and
pick uniformly from that slice. The candidates list is non-empty by
the canExecute precondition.

Verified: seed 100 now completes 500 iterations cleanly (previously
hit this flake). 4 additional random seeds (1100..1400) also pass.
The fallback was added defensively in 4a9ea71 ("invalidate caches
under user schema after undo") in case parsed.Context or
parsed.Context.Schema were nil/empty. Tracing the call paths:

- writeUndoApply is only reachable via WriteFile -> o.parsePath ->
  writeFileWithParsed.
- o.parsePath runs both ParsePath and resolveSchema.
- ParsePath dispatches .undo/... segments to processUndo, which errors
  out with "requires a table context" when Context is nil
  (path.go:1200).
- resolveSchema (operations.go:298) fills empty Context.Schema with
  current_schema(), returning ErrIO if that DB query fails.

So both parsed.Context and parsed.Context.Schema are guaranteed
populated by the time writeUndoApply runs -- the fallback is dead
code, and worse, it silently reverted to the broken old behavior
(invalidate under "tigerfs", caches stay stale) if the invariant ever
did break.

Replace with a direct read; a future regression that nils out Context
will surface as a clear nil-pointer crash rather than a quiet
correctness bug. Documented the invariant in a comment so future
readers know why we trust the access.
mkdirSynth used to insert a directory row without writing a 'create'
log entry, so undo's QueryUndoAffectedFiles couldn't see it -- any
Mkdir-created dir survived an undo even when its sibling file creates
got rolled back. This left state silently inconsistent and forced
workarounds in the stress test (e.g., walking the actual filesystem
in OpDeleteDir to find phantom dirs).
Two changes:
1. Add logSynthOp("create", ...) to both branches of mkdirSynth
(parent-pointer and old path-encoded models). One log entry per
Mkdir, symmetric with what WriteFile does for files. Production
traffic through FUSE/NFS already issues a separate Mkdir per
directory segment (the kernel rejects open(O_CREAT) on a path
with a missing parent), so this just records what was already
happening.
2. Add WriteFileEnsureDirs(ctx, path, data) -- the mkdir-p variant
of WriteFile. Walks ancestors shallowest-to-deepest, calls Mkdir
for each (treating ErrAlreadyExists as success), then WriteFile.
Each new ancestor produces its own create log entry, matching the
per-segment kernel mkdir(2) sequence in production. For tests and
direct-API callers that want to materialize a deep path in one
call without writing the loop themselves.
Tightened doc comments on WriteFile, Mkdir, and the new helper so the
preconditions and postconditions (especially "ancestors must exist
already" for WriteFile and Mkdir) are explicit.
Tests:
- TestMkdir_LogsCreateEntry: verifies a Mkdir produces exactly one
'create' log entry.
- TestWriteFileEnsureDirs_LogsEachIntermediateDir: deep write
produces 4 create entries (3 dirs + 1 file).
- TestWriteFileEnsureDirs_PreservesExistingDirs: pre-existing dirs
are skipped (Mkdir's ErrAlreadyExists is treated as success).
Each test was confirmed to fail when the new log call is removed.
This is the additive half of #6 -- ensureSynthParentDirs still has
its create-if-missing fallback so deep WriteFile calls without prior
Mkdirs continue to silently auto-create dirs (without logging). That
fallback is removed in the follow-up commit, which also migrates
direct-API tests that relied on it to use WriteFileEnsureDirs.
Production WriteFile and Mkdir used to silently auto-create missing
ancestor directories via ensureSynthParentDirs's InsertIfNotExists
fallback. Those silent inserts didn't go through mkdirSynth and so
were never logged, making the dirs invisible to undo and a source
of state divergence between TigerFS and any client that tracked the
expected log.
In production through FUSE/NFS, the auto-create branch never actually
fired (the kernel rejects open(O_CREAT) on a path whose parent is
missing, so a deep WriteFile can't reach TigerFS without prior
per-segment mkdir RPCs). The fallback existed solely to make
direct-API callers (tests, batch-import code) work without spelling
out the mkdir-p loop. Tightening the contract closes the silent-
insert hole and aligns WriteFile/Mkdir with POSIX
open(O_CREAT)/mkdir(2).
Changes:
- synth_ops.go: rename ensureSynthParentDirs to resolveSynthParentID
and rewrite to be strict. Walks ancestors via resolveSynthPath
(parent-pointer model) or synthRowExists (legacy path-encoded
model). Returns ErrNotExist with a hint pointing at Mkdir or
WriteFileEnsureDirs when any ancestor is missing. The
InsertIfNotExists fallback is gone.
- write.go: add MkdirAll, the mkdir-p variant of Mkdir. Loops
Mkdir per segment shallowest-to-deepest, idempotent on
ErrAlreadyExists. Mirrors WriteFileEnsureDirs. Doc-commented
with preconditions and postconditions.
- integration tests: 19 tests in synthesized_test.go, log_test.go,
and history_test.go relied on the auto-create. Each one is
migrated to either WriteFileEnsureDirs (deep file write in
setup), MkdirAll (deep dir creation), or explicit per-segment
Mkdir chains where the test specifically probes log granularity.
The log-shape tests (TestSynth_LogEntries_NestedFiles etc.) had
their expected log entry counts updated to include the new
mkdir 'create' entries.
After this commit, ensureSynthParentDirs is gone from the codebase
(only docs/adr/017 references remain, correctly preserved as
historical record).
Verified:
- go test -short ./... clean.
- Stress test passes for seeds 42, 100, 500, and the originally
failing 1777097481681738000 (500 iterations each).
Comment block claimed the FS walk was a workaround for mkdirSynth not
logging (which caused state drift after undo). That fix landed in
7174ad2 + 36ffaf4, so the original justification no longer holds.

Keeping the FS walk anyway as defensive coding: any future state drift
(a new unlogged op, a stress-test tracking bug in move_dir or
rename_dir, etc.) would otherwise surface as ENOTEMPTY/EIO from
os.Remove hitting children the test doesn't know about. Walking the FS
is robust to all such cases at modest cost.

Updated the comment to reflect the new rationale; no behavior change.
ValidateWorkspace previously checked only files: missing files
(expected but not on disk), unexpected files (on disk but not
expected), and hash mismatches. The Dirs map was never consulted, so
directory-state drift between the stress test's tracked state and
TigerFS went silently undetected.

Two new checks: every dir in expected.Dirs must exist on disk; no
extra dirs may exist on disk. Implementation tracks actualDirs during
the same WalkDir, then compares after the walk -- ~16 lines total.
Dotfile/dotdir filtering is unchanged, so virtual paths (.log,
.savepoint, .undo, .history) stay outside the comparison.

This is the verification phase for #6/#9. Pre-#6, mkdirSynth not
logging meant Mkdir-created dirs survived undo while the test's stack
rolled them back, producing dir drift on every undo. After #6, every
dir create logs and undo restores/removes dirs correctly, so the
test's state.Dirs should track TigerFS exactly.

The stress test confirms the chain holds:
- Seeds 42, 100, 200, 300, 400, 500, 600, 700: 500 iters each, clean.
- Seed 1777097481681738000: 1000 iters, clean.
- Seed 1777065886566418000: 1000 iters, clean.

Updates two existing unit tests (TestValidateWorkspace_Passing,
TestValidateWorkspace_NestedDirs) to declare the dirs setupTestDir's
MkdirAll creates implicitly. Adds three new unit tests:
TestValidateWorkspace_MissingDir, TestValidateWorkspace_UnexpectedDir,
TestValidateWorkspace_EmptyExpectedDir.

The change is purely additive in a test-only function. It cannot
introduce production correctness violations or weaken the existing
file-validation checks; it can only add error reports to the same
errs slice. Worst case is a false positive (test bug), which is
visible and fixable.
Audit of the cache for #7 (undoCache.invalidate concurrency).
Findings:

- All access is properly RWMutex-protected: reads use RLock; writes
  (store* and invalidate) use Lock. Map allocation is guarded inside
  the writers.
- Returned slices/pointers (e.g., entry.files, *db.Row) are safe for
  callers to hold; storeX always creates a fresh entry rather than
  mutating an existing one, so concurrent writers can't race on the
  cached values themselves.
- There is one narrow stale-write race: a reader whose
  QueryUndoAffectedFiles is in flight when the writer COMMITs and
  invalidates can re-populate the cache with pre-undo rows (PG's READ
  COMMITTED snapshot was taken before the commit). The staleness
  self-heals after the 2-second TTL.

Severity is low: the staleness affects only undo preview surfaces
(.info/summary, .undo/<target>/<preview>), not the apply path, which
re-queries fresh inside ExecuteUndoTransaction. The other cached
tables (log/savepoint/history) are append-only or immutable, so their
cached entries don't become semantically wrong. Eliminating the race
would require either holding the cache lock for the entire undo TX
(kills concurrency) or generation-counter keys (complex for the
benefit). Not worth it; the TTL already bounds the window.

Two doc updates:
- docs/spec.md: new "Undo Preview Cache" subsection in the Concurrency
  and Multi-User chapter, documenting the staleness window for users.
- undo_cache.go: detailed comment on invalidate() explaining the race
  for future implementers, with a pointer to the spec section.
Audit for #8 found that readDirHistoryByFilename's resolveSynthPath
call passed synth.TigerFSSchema as the cache schema, mirroring the
schema used for the underlying history-table DB query. But pathCache
keys are global to the cache, and every other writer (live-path
resolutions in synth_ops.go, post-undo invalidates) keys under the
user's schema (parsed.Context.Schema, e.g. "public").

Effect: history-path cache writes landed in a disjoint
(tigerfs, table) namespace, separate from where live-path code
read/wrote/invalidated. The (tigerfs, table) entries got neither
shared with normal reads nor cleared by live-path invalidates, so a
stale entry could survive a delete + recreate cycle within the
2-second pathCache TTL. User-visible symptom: delete dir A (with
children that had history) and immediately mkdir A again with a fresh
UUID; reading A's history within 2 seconds returned the OLD A's
children's history because resolveSynthPath served the stale
(tigerfs, table) "A" -> OLD_A_ID cache entry while the live-path cache
had already been invalidated.

Fix: pass parsed.Context.Schema (cacheSchema) to resolveSynthPath for
the pathCache key, while keeping synth.TigerFSSchema for the DB query
against the _history table (which legitimately lives in the tigerfs
schema). One extra parameter threaded through
readDirHistoryByFilename.

Test: TestSynth_HistoryDirCacheUsesUserSchema reproduces the exact
delete-and-recreate scenario. Verified to fail with the bug present
(history shows old child "x.md" under recreated A) and pass after the
fix (recreated A has empty history). Other call sites of
resolveSynthPath were audited and all already use the user's schema --
this was the only mismatch.
Two additive tests to round out FK-deferral coverage:

#11 -- TestUndo_ToSavepoint_DefersFKConstraint (integration). Same
move-into-newer-dir scenario as the existing to-id test, but driven
through .undo/to-savepoint/<name>/.apply instead. Confirms the
deferred-FK path applies on the savepoint route too, not just to-id.
Verified to fail when SET CONSTRAINTS ALL DEFERRED is removed.

#12 -- TestExecuteUndoTransaction_FailsAtCommitOnUnrestorableFK
(unit). Constructs a genuinely orphan-creating restore: a history row
whose parent_id points at a UUID that exists nowhere (no source row,
no history row, not in the restore set). The deferred FK lets the
UPSERT through at insert time, then must fire at COMMIT and roll back
the whole transaction. Asserts SQLSTATE 23503 and that the source
table is empty afterward (no orphan committed).

The unrestorable case isn't reachable from normal user operations --
the undo classifier always pulls in ancestor rows when their children
are affected -- but the test pins down the contract that DEFERRABLE
INITIALLY IMMEDIATE doesn't silently allow orphans. If a future
refactor weakens or removes the FK, this test catches it.
NFS multi-chunk writes fan a single user-level create_file or
edit_file into 1 + ceil(size/wsize) log entries -- go-nfs fabricates
Open/Write/Close per WRITE RPC, and each Close commits and logs
separately. The stress test tagged each stack entry with only the
latest log_id and assumed "1 op = 1 log entry," so undo_single
targeted the newest entry while the stack popped the whole op's
expected state. The file kept residual content (one chunk's worth) but
validation expected it gone. Reproduced 100% with seed
1777245630240695000.

- Track LogCount per stack entry; gate undo_single on
  MostRecentLogIsAtomic() so only single-log-entry ops are valid
  targets. Intermediate chunked-write states aren't representable in
  the md5-keyed WorkspaceState, so partial undos can't be validated.
- readLogIDsSince captures every log_id produced by an op using
  logReadSeq+logScanDepth as the read count: unique per call (busts
  macOS NFS attr/data cache for repeated reads of .log/.last/N) and
  always >=50 (enough headroom for any single op's fan-out).
- TestUndo_ToID_LargeEditOnly verifies that undo_to_id targeting a
  10 MB create's log_id rolls back only the subsequent edit, leaving
  the file at its post-create state.
- CLAUDE.md: one-line reminder to audit statCache/pathCache.invalidate
  schemas on every new write path -- schema-key mismatches silently
  leave stale entries cached for up to 2s and surface as cross-mount
  read inconsistencies (the bug pattern fixed in 4a9ea71 and a83c523).
The previous "Replay with: ..." line only echoed --seed and --iterations, omitting flags that change the workload (--large-files, --many-files, --validate-every, --workspace). Replays of large-file failures silently ran the small-file workload and either masked the bug or produced a divergent trace.
On validation failure (or via --dump-at) write a complete diagnostic
snapshot to /tmp/tigerfs-stress-{failure,snapshot}-<seed>-<iter>-<ts>/
and leave infrastructure running for inspection. The failure message
prominently surfaces the dump path so the user can correlate the dump
against the live database and mount.
Each dump contains:
- analysis.txt -- pre-computed cross-references and anomaly findings
- summary.{txt,json} -- failing op, replay command, dump path,
mountpoint, postgres URL
- diff.{txt,json} -- structured ValidationIssue list grouped by kind
- expected_state.json / actual_state.json -- workspace state at the
moment of the dump
- stack.json -- every StackEntry with LogID + LogCount
- operations.{log,json} -- full per-iteration op trace incl. log_id
fan-out per op
- db_state.json -- snapshot of testws / testws_log / testws_savepoint /
testws_history (last 200) via a fresh pgx connection that doesn't
share tigerfs's pool
The analyzer flags four classes of anomaly automatically -- catching
the iter-107 lastLogID regression at write time instead of via ad-hoc
queries:
- log_count > ceil(write_size/128KB)+2 for create/edit: stale
lastLogID from a prior undo
- create_savepoint with log entries: behavior regression
- Stack LogID lexicographically out of order: stack bookkeeping bug
- MissingFile + UnexpectedFile with matching content hash: rename
divergence between TigerFS and stress-test state
--dump-at LIST captures snapshots mid-run (one per listed iteration)
without stopping the run, useful for forensics on non-reproducible
failures (compare snapshots across runs of the same seed). The
runner's replay command now includes every workload-affecting flag
(--seed, --iterations, --validate-every, --large-files, --many-files,
--workspace, --dump-at) so replays are bit-for-bit comparable to the
original run.
Refactored ValidateWorkspace into snapshotWorkspace + diffWorkspace +
formatIssue so the same machinery feeds both the human error string
and the structured dump diff.
Stress runs surfaced an empirically reproducible staleness path: after
a heavy undo (especially undo_to_savepoint), an immediate read of
/.log/.last/N/.export/json sometimes returns a snapshot from before
the undo's new log rows were committed. Symptom is the iter-107
log_count=61 anomaly already documented by the analyzer -- a 1.1KB
create cannot produce 61 log entries; the extra 60 are previously-
observed entries that the runner re-attributes to the wrong op
because lastLogID regressed during readLatestLogID after the prior
undo.
Three regressions observed in 500 iters of seed 1777258557667578000
--large-files --validate-every 10:
- iter 106 (undo_to_savepoint): recovered after ~150ms (3 retries)
- iter 138 (undo_single): did NOT recover within 500ms; "stuck" at
a log_id 1.6 seconds older than the prior known
- iter 258, 268: recovered within 150-300ms
Defensive workaround in the stress test (no TigerFS code change):
- readLatestLogIDMonotonic retries up to 10 times with 50ms sleeps
when the read returns a smaller log_id than the prior known. If
recovery happens, log a warning with retry count + elapsed; if
not, log a warning and KEEP the prior known (don't regress).
- runner's post-undo lastLogID update goes through the new helper.
- TestUndo_LogReadFreshAfterSavepoint pins the boundary: heavy undo
followed by an immediate ops.ReadFile of /.log/.last/1/.export/json
must return a log_id newer than any pre-undo entry. Passes 10/10
on my machine (~0ms recovery), confirming the staleness lives in
the NFS adapter / go-nfs / kernel layer, NOT in TigerFS's
query/cache path. Bypasses NFS entirely.
Underlying NFS-layer staleness is unfixed and intentionally left for
follow-up; this commit makes the stress test resilient to it and
provides the regression test that will prove the fix when found.
…calls
This is a STRESS-TEST MITIGATION, not a bug closure. The underlying
NFS-layer staleness still happens; this commit makes the stress test
robust to it across all the call sites we've now observed it affecting.
Background. Three accumulated dumps of the same seed
(1777258557667578000) and two boundary tests have triangulated the
following:
- ops.ReadFile after a heavy undo: always fresh (10/10 in test)
- OpsFilesystem.OpenFile after a heavy undo: always fresh (10/10)
- Maximally aggressive macOS NFS mount options
(actimeo=0,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,nonegname-
cache): identical regression rate to plain `noac`
- Same iters reliably regress across runs of a seed (deterministic
workload pattern), but the lag varies from 150ms to 1547ms (not
a fixed kernel TTL)
The bug therefore lives in go-nfs (the vendored RPC library) or the
macOS NFS kernel client in a way mount options cannot suppress. The
exact mechanism is not identified.
What this commit changes:
1. Bump readLatestLogIDMonotonic retry budget from 500ms to 2.5s.
Empirical max observed lag is 1547ms; 2.5s gives ~60% headroom and
eliminates the "did not recover within N retries" warnings we kept
seeing at iter 138.
2. Wrap the runner's post-OpDeleteDir lastLogID update with the
monotonic helper. A snapshot dump caught this exact call returning
stale data (1.5s old) after a delete_dir, cascading into the next
op's readLogIDsSince which then reported log_count=65 (64 stale
entries + 1 fresh). Same fallback as the post-undo case: keep the
prior known-good lastLogID rather than regress.
3. Wrap the per-row readLatestLogID calls inside OpDeleteDir's
deletion loop. The same dump showed three of five per-row reads
stale by ~1.5s, leaving non-monotonic LogIDs on adjacent stack
entries. On regression-or-no-progress, leave the entry unlogged
(LogID="") so undo_single skips it -- safer to lose targetability
for one deletion than to misroute undo_single onto a completely
unrelated log row.
Verification:
- Same seed, 500 iters: 4 warnings, all recovered in 100-300ms,
validation passes. Snapshot at iter 475 reports "(none detected)"
in analysis.txt -- the log_count=N>>1 cascades that motivated this
investigation are no longer produced.
- Different seed (42), 100 iters: caught one delete_dir per-row
regression we couldn't even see before this wrap (iter 91,
recovered in 200ms), validation passes.
What this commit does NOT do:
- Does not fix the underlying NFS-layer staleness. Reads of
/.log/.last/N/.export/json after a heavy commit still return
snapshots that lag real-time by up to ~1.5s.
- Does not change any TigerFS code. The bug has been ruled out at
every TigerFS layer.
- Does not affect production correctness. TigerFS undo and validation
are unaffected; only the stress harness consumes log_id streams in
tight post-write loops.
…ilures
Closes a gap where transient NFS errors (EIO and similar) during
executeOperation would tear down the live infra and leave the user with
nothing but a stderr line and a replay command. Now any unrecoverable
runner error -- validation mismatch OR op-level error -- writes the
same diagnostic dump and keeps the infrastructure alive for inspection.
The dump format and machinery don't change; only the trigger widens.
- ValidationFailure renamed to RunFailure with a Kind field
("validation" or "operation"). RunAndExit treats either as
KeepInfra=true.
- executeOperation errors append a marker OpRecord ("X [FAILED: <err>]")
to the op log so the trace shows what failed and why, then call
WriteDump(DumpKindFailure, "operation", ...) before returning.
- summary.json grows a `failure_kind` field (omitempty for snapshots).
Downstream tooling can switch on it without parsing free-form text.
- Renamed dumpSummary.ValidationMessage -> ErrorMessage; renamed
shortValidationMessage -> shortErrorMessage. Same field carries the
abort reason for both kinds; summary.txt's heading adapts.
- New TestWriteDump_OperationFailureKind covers the new path.
Verified that a successful 2000-iter run of the user-reported EIO seed
(1777265015594858000) still passes cleanly. The EIO itself is timing-
dependent and didn't reproduce in the replay, but the dump
infrastructure is now ready for the next occurrence.
Stress-test mitigation only -- the underlying NFS-layer staleness
(see commit 001a894) and any transient EIOs are not addressed here;
this commit just guarantees we capture diagnostic data when they fire.
Two complementary changes that turn the previously stream-only
monotonic warnings into something quantifiable:
1. End-of-run summary
Stats now records every readLatestLogIDMonotonic regression as it
fires (iteration, op desc, retry count, recovered/stuck, elapsed).
Stats.Print emits a new "Monotonicity Warnings" section at the end of
a successful run with:
- total regressions and rate (% of all ops)
- recovered vs. stuck count
- retry-time distribution (fast / medium / slow / stuck buckets)
- op kinds preceding regressions (undo_to_savepoint, delete_dir, etc.)
The streamed [warn iter ...] lines stay as before -- the summary is
additive, intended as the one-glance "is NFS-layer behavior stable
across runs?" view.
Stats now threads through OpDeleteDir as well so per-row regressions
inside a delete_dir loop count alongside post-undo regressions; both
flow into the same summary.
2. CLAUDE.md note under "Testing Requirements"
Documents the expected nature of these warnings: they're rooted in
NFS-layer staleness below TigerFS, ruled out at every TigerFS layer
(ops, OpsFilesystem, mount options), absorbed by the runner's monotonic
helper, and not bugs to fix in TigerFS code. Points future sessions at
the iter-107 investigation commits for context.
Sample output from seed 1777258557667578000 (500 iters, --large-files):
=== Monotonicity Warnings ===
Total: 1 regressions across 500 ops (0.20%)
Recovered: 1 / 1 (0 kept prior lastLogID after retry exhaustion)
fast (≤3 retries, ~300ms) 1
Op kinds preceding regressions:
undo_to_savepoint 1
When a run is clean, the section prints "(none -- no readLatestLogID
regressions observed)" instead.
Summary
Implements undo and recovery for TigerFS file-first workspaces, end-to-end: a relational directory schema, a logged-operation history, savepoints, and filesystem interfaces for browsing the log and applying undo operations. Also ships a comprehensive stress-test framework and the diagnostic infrastructure built around investigating its findings.
This is the implementation of three ADRs:
ADR-016 — Undo and Recovery
The problem. Users and agents make changes that need to be reverted. Today, recovery depends on the agent manually undoing changes, or on having a clean git commit to fall back on.
The design. Build on the existing per-row history hypertable to add three new components, all exposed through the filesystem:
- `<workspace>_log` (table) — every create/edit/rename/delete writes a row capturing `{log_id, file_id, type, filename, version_id, user_id, description}`. Each undo op also writes a log entry, so undo-of-undo chains are addressable.
- `<workspace>_savepoint` (table) — named markers pinning a specific log_id, used as targets for `undo_to_savepoint`.
- `<workspace>/.log/`, `<workspace>/.savepoint/`, and `<workspace>/.undo/{id,to-id,to-savepoint}/<target>/.apply` paths. The `.log/` and `.savepoint/` directories use the existing data-first pipeline machinery (`.last/N`, `.by/type/<kind>`, `.export/json`).

Scope. Synth app tables with history enabled. Native/data-first tables are not affected.
📄 docs/adr/016-undo-and-recovery.md

ADR-017 — Relational directory structure (parent-pointer model)
The problem. Synth apps used to encode directory structure inside the `filename` column (`projects/web/todo.md`). This breaks the per-file undo model: a directory rename touches N rows but the log records one row per file_id, leading to partial-undo bugs, ordering hazards, and filter blind spots no batching scheme cleanly resolves.

The design. Replace path-encoded filenames with a parent-pointer model:
filenamestores only the leaf name (no slashes)parent_id UUID REFERENCES <table>(id) DEFERRABLE INITIALLY IMMEDIATE— self-referencing FKUNIQUE NULLS NOT DISTINCT (parent_id, filename, filetype) DEFERRABLE INITIALLY IMMEDIATE— uniqueness scoped per-directory, deferrable so undo transactions can re-insert children before parents are restored(parent_id, filename)for ReadDirA directory rename becomes a single-row update on the directory's
filename. The log records one entry, and undo trivially reverses it.Migration.
tigerfs migrateships with a relational-directories migration that converts existing path-encoded apps in place. Existing workspaces get a warning at mount time until migrated.📄
📄 `docs/adr/017-relational-directory-structure.md`

### ADR-018 — Stress test
**The problem.** Integration tests cover individual operations and specific edge cases, but real workloads are long, unpredictable sequences of mixed ops with interleaved undos. Targeted tests miss ordering bugs, state-tracking errors, and corruption that emerge only under sustained mixed activity.
**The design.** A standalone Go binary at `test/stress/` that:

- maintains a `WorkspaceState` (path → md5 hash + dir set) with a push/pop stack for undo rollback
- exercises the mount through ordinary file operations (`os.WriteFile`, `os.ReadFile`, etc.)
- dumps a diagnostic bundle (`/tmp/tigerfs-stress-<failure|snapshot>-<seed>-<iter>-<ts>/`) with pre-computed anomaly analysis, structured diff, full op trace, stack history, and a fresh pgx-driven snapshot of the four undo-related Postgres tables. Dumps trigger on validation failure, operational failure (EIO, etc.), or via `--dump-at N[,M,...]`.
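The hash-tracking state with its undo stack can be sketched as follows. This is a simplification under stated assumptions: the real `WorkspaceState` also tracks a dir set, and the method names beyond `WorkspaceState` itself are illustrative.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// WorkspaceState maps path -> md5(content); the stack lets the harness
// snapshot expectations before a mutating op and roll them back when
// that op is later undone.
type WorkspaceState struct {
	files map[string]string
	stack []map[string]string
}

func md5hex(content []byte) string {
	sum := md5.Sum(content)
	return hex.EncodeToString(sum[:])
}

// Write records the expected hash for a path after a write op.
func (w *WorkspaceState) Write(path string, content []byte) {
	w.files[path] = md5hex(content)
}

// Push snapshots the current expectations before a mutating op.
func (w *WorkspaceState) Push() {
	snap := make(map[string]string, len(w.files))
	for k, v := range w.files {
		snap[k] = v
	}
	w.stack = append(w.stack, snap)
}

// Pop restores the pre-op expectations after that op is undone.
func (w *WorkspaceState) Pop() {
	w.files = w.stack[len(w.stack)-1]
	w.stack = w.stack[:len(w.stack)-1]
}

func main() {
	w := &WorkspaceState{files: map[string]string{}}
	w.Write("a.txt", []byte("v1"))
	w.Push()                       // snapshot before the next op
	w.Write("a.txt", []byte("v2")) // op happens on the mount
	w.Pop()                        // op was undone: expectations roll back
	fmt.Println(w.files["a.txt"] == md5hex([]byte("v1")))
}
```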
📄 `docs/adr/018-stress-test.md`

98 commits, ~25K lines, ~105 files. Merging as-is to preserve the fine-grained progression.
## What's user-visible
### New filesystem interfaces
- `<workspace>/.log/` (`.last/N`, `.by/type/<kind>`, `.by/user_id/<id>`, `.export/json`, `.export/csv`)
- `<workspace>/.savepoint/`
- `<workspace>/.undo/id/<log_id>/.apply`
- `<workspace>/.undo/to-id/<log_id>/.apply`
- `<workspace>/.undo/to-savepoint/<name>/.apply`
- `<workspace>/.history/<file>`

### New CLI
- `tigerfs migrate` — run pending schema migrations on existing workspaces. Two migrations ship: relational directories (parent-pointer model) and a parent-dir mtime trigger.
- `bin/tigerfs-stress start [--seed N] [--iterations N] [--large-files] [--many-files] [--validate-every N] [--dump-at LIST]` — stress-test framework with diagnostic dump capability.
- `bin/tigerfs-stress stop` — tear down a `--keep`'d run from another terminal.

### Mount changes
The macOS NFS client now mounts with `noac` to disable attribute caching. TigerFS has its own 2s stat cache; cross-mount writes need to be visible immediately, and the kernel attribute cache otherwise serves stale data for up to 60 seconds.

## Architecture changes that warrant reviewer attention
**Schema migration: relational directory model (ADR-017).** Each row carries `parent_id` (a foreign key to `id` in the same table) instead of encoding paths as filenames. Necessary for log/history to track moves correctly. The migration is gated and reversible via the migrate command; existing workspaces without the migration get a warning at mount time.

**`DEFERRABLE INITIALLY IMMEDIATE` constraints.** `parent_id_fkey` and the `(parent_id, filename, filetype)` UNIQUE constraint are deferrable. Undo transactions issue `SET CONSTRAINTS ALL DEFERRED` so child files can be restored before their parent dirs within a single transaction. Tests cover the deferral-required path.

**UUIDv7 throughout.** All IDs (file_id, log_id, savepoint_id, version_id) are UUIDv7. Display names are derived from the timestamp+base36 form; both raw UUIDs and display names are accepted at the `.undo/...` apply paths.

**Cache invalidation by user schema.** Several bugs in this branch were schema-key mismatches between the synth backing schema (`tigerfs.<workspace>`) and the user-facing view schema (`public.<workspace>`). CLAUDE.md grew a one-line audit reminder for new write paths.

**Stat cache + path cache + undo cache.** All three have 2-second TTLs and are invalidated on writes. The consistency model ("never cache content; only metadata") is documented in CLAUDE.md and observed throughout.
## Test coverage
- Integration tests in `test/integration/` exercise mount-based workflows, undo paths, savepoint behavior, schema migration, and history.
- `bin/tigerfs-stress` is the long-form correctness check: randomized sequences of mixed ops with hash-based validation after each iteration. Verified clean across 10000-iteration runs with `--large-files --many-files`.
- Dedicated tests (`TestUndo_LogReadFreshAfterSavepoint` and `_ViaNFSAdapter`) pin down a known NFS-layer staleness window — see "Known issue" below.

## Known issue (mitigated, not fixed)
The macOS NFS client and/or the `go-nfs` library can return stale snapshots of `.log/.last/N/.export/json` for up to ~1.5s after a heavy commit (a large undo, a many-row delete_dir). Triangulated to live below `OpsFilesystem` — both the ops and OpsFilesystem layers serve fresh data; the kernel-side NFS client does not, regardless of mount options.

The stress test's `readLatestLogIDMonotonic` helper detects regressions, retries for up to 2.5s, and falls back to the prior known-good `lastLogID`. End-of-run output now includes a "Monotonicity Warnings" section with the warning rate, recovery distribution, and the op kinds that triggered it. Validation correctness is unaffected, and production users would not hit this in practice (no real workload reads the log in tight post-undo loops).

CLAUDE.md documents the expected behavior under "Stress-test monotonicity warnings."