Auto-recover from corrupt dolt manifest on startup #3290

@azanar

Problem

On unclean shutdown, dolt databases can end up with a corrupt manifest: the manifest references a root hash that was never flushed to disk. On the next startup, dolt exits immediately with:

root hash doesn't exist: <hash>

The city init loop removes stale LOCK files and retries, but dolt keeps failing the same way. It will never self-recover, because the underlying manifest is corrupt, not just locked.

Reproduction

This occurs when the dolt server is killed mid-write, leaving a manifest that references a root hash that doesn't exist in the journal or in oldgen. Affected databases have:

  • An empty journal.idx (0 bytes)
  • A journal file containing only the 40-byte header (no chunk data)
  • An empty oldgen/ directory
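The three observations above can be checked mechanically. The following is a minimal Go sketch, not existing bead store code: the function names are hypothetical, it assumes journal.idx and oldgen/ sit in the same directory as the manifest, and it takes the journal file name as a parameter because that name is not given in this issue.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// journalHeaderSize is the header-only journal size reported in this issue.
const journalHeaderSize = 40

// matchesEmptySignature reports whether all three observations hold:
// a 0-byte journal.idx, a header-only journal, and an empty oldgen/.
func matchesEmptySignature(idxSize, journalSize int64, oldgenEntries int) bool {
	return idxSize == 0 && journalSize == journalHeaderSize && oldgenEntries == 0
}

// inspectNomsDir gathers the three values from disk.
// Assumption: journal.idx and oldgen/ live directly under nomsDir.
// journalName is supplied by the caller (hypothetical parameter).
func inspectNomsDir(nomsDir, journalName string) (bool, error) {
	idx, err := os.Stat(filepath.Join(nomsDir, "journal.idx"))
	if err != nil {
		return false, err
	}
	journal, err := os.Stat(filepath.Join(nomsDir, journalName))
	if err != nil {
		return false, err
	}
	entries, err := os.ReadDir(filepath.Join(nomsDir, "oldgen"))
	if err != nil {
		return false, err
	}
	return matchesEmptySignature(idx.Size(), journal.Size(), len(entries)), nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: check <noms-dir> <journal-file-name>")
		os.Exit(2)
	}
	ok, err := inspectNomsDir(os.Args[1], os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, "inspect failed:", err)
		os.Exit(1)
	}
	fmt.Println("matches empty-database signature:", ok)
}
```

Any stat or read error is treated as "does not match", so a missing file never counts as evidence that the database is empty.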

Proposed Fix

During bead store startup, after the stale LOCK removal step, detect if dolt exits immediately with "root hash doesn't exist". If the journal is also empty (no recoverable data), automatically reset the manifest and reinitialize the database rather than failing indefinitely.

The detection is unambiguous: empty journal + "root hash doesn't exist" = no data to lose, safe to reinitialize.
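The decision rule can be sketched as a small pure function. This is an illustration in Go with hypothetical names, not the bead store's actual startup code. The third outcome, stopping for manual inspection when the error appears but the journal still holds data, is an assumption added here; the issue only specifies the empty-journal case.

```go
package main

import (
	"fmt"
	"strings"
)

// rootHashMissing is the error text quoted in this issue.
const rootHashMissing = "root hash doesn't exist"

type action int

const (
	actionRetry        action = iota // some other failure: keep the existing LOCK-removal retry path
	actionReinitialize               // corrupt manifest and nothing recoverable: reset and reinitialize
	actionManual                     // assumption: corrupt manifest but journal has data, so do not touch it
)

func (a action) String() string {
	switch a {
	case actionReinitialize:
		return "reinitialize"
	case actionManual:
		return "manual"
	default:
		return "retry"
	}
}

// decide maps dolt's startup output and the journal state to an action.
// journalEmpty is expected to come from a separate on-disk check.
func decide(doltOutput string, journalEmpty bool) action {
	if !strings.Contains(doltOutput, rootHashMissing) {
		return actionRetry
	}
	if journalEmpty {
		return actionReinitialize
	}
	return actionManual
}

func main() {
	// The hash below is a placeholder, not a real value.
	out := "root hash doesn't exist: <hash>"
	fmt.Println(decide(out, true))
	fmt.Println(decide(out, false))
	fmt.Println(decide("database is locked", true))
}
```

Keeping the decision separate from the filesystem and process handling makes the destructive branch easy to unit test before it is wired into startup.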

Workaround

Manually delete the corrupt manifest file for the affected database(s) (e.g. .beads/dolt/be/.dolt/noms/manifest) and restart. Dolt will reinitialize the database from scratch.
