fix(credstore): use gofrs/flock for advisory file locking#20
Conversation
- Replace the custom O_CREATE|O_EXCL retry loop and 30s stale-mtime heuristic with gofrs/flock, which uses kernel fcntl on POSIX and LockFileEx on Windows and releases the lock automatically when the holder exits - Remove the stale-lock reaper: it had a Stat-then-Remove TOCTOU and could wrongly evict a slow but still-alive holder whose mtime was older than the timeout - Stop deleting the lock file on release; advisory locking protects access, so an orphan file is harmless and does not block new acquirers - Replace the stale-lock test with one that verifies an orphaned lock file does not block a new acquirer - Drop the concurrent-writes assertion that required the lock file to be removed after saves, since the file now intentionally persists
There was a problem hiding this comment.
Pull request overview
This PR updates credstore’s cross-process file locking to use kernel advisory locking via github.com/gofrs/flock, replacing the previous lock-file creation + stale-mtime eviction approach to avoid wrongful eviction and TOCTOU lock removal.
Changes:
- Replace
credstore’s lock acquisition/release logic withgofrs/flockadvisory locking and a fixed 30s acquisition timeout. - Update locking tests to reflect that
.lockfiles are intentionally left on disk and add coverage for orphaned lock-file re-acquisition. - Add the new module dependency to
go.mod/go.sum.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| go.mod | Adds github.com/gofrs/flock dependency. |
| go.sum | Adds checksums for github.com/gofrs/flock. |
| credstore/filelock.go | Reimplements lock acquisition/release using advisory locks with a timeout. |
| credstore/filelock_test.go | Updates expectations around lock-file persistence; adds orphaned lock-file test. |
| credstore/file_store_test.go | Removes assertion that lock files are deleted after concurrent writes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func (fl *fileLock) release() error { | ||
| var closeErr error | ||
| if fl.lockFile != nil { | ||
| closeErr = fl.lockFile.Close() | ||
| } | ||
| removeErr := os.Remove(fl.lockPath) | ||
| return errors.Join(closeErr, removeErr) | ||
| return fl.fl.Unlock() |
There was a problem hiding this comment.
The receiver name fl combined with the field name fl leads to fl.fl, which is unnecessarily hard to read. Rename either the receiver (e.g., l/lock) or the struct field (e.g., flock) to make the unlock call clearer.
| // lock file from a crashed process does not block new acquirers because the | ||
| // kernel does not hand out the advisory lock to anyone. |
There was a problem hiding this comment.
The doc comment is misleading: an orphaned .lock file is harmless because no process holds the advisory lock, not because the kernel "does not hand out" the lock. Reword this to clarify that the lock becomes immediately available to new acquirers once no process holds it (including after a crash).
| // lock file from a crashed process does not block new acquirers because the | |
| // kernel does not hand out the advisory lock to anyone. | |
| // lock file from a crashed process does not block new acquirers because once | |
| // no process holds the advisory lock, including after a crash, it becomes | |
| // immediately available to be acquired again. |
- Rename release() receiver from fl to l so the unlock call reads as l.fl.Unlock() instead of the doubled fl.fl.Unlock() - Clarify the orphaned-lock doc comment: explain that an orphan lock file is harmless because no process holds the advisory lock (including after a crash), not because the kernel withholds it
Summary
filelock.gouses anO_CREATE|O_EXCLretry loop with a 30-second stale-mtime heuristic. Two problems:os.Statandos.Remove, the original holder may release and a new process may re-create; theRemovethen deletes a lock that is actively held.github.com/gofrs/flock, which uses kernel advisory locking (fcntlon POSIX,LockFileExon Windows). The kernel releases the lock automatically if the holder crashes, so no stale-timeout heuristic is needed.Behaviour changes
release()no longer deletes the lock file. Tests that observed that side-effect have been updated.Test plan
make test— existing concurrent-locks test still passesTestAcquireOrphanedLockFileverifies a leftover lock file does not block new acquirersTestAcquireAndReleasenow asserts re-acquisition works after release (instead of file deletion)make lintclean🤖 Generated with Claude Code