🐛 Tolerated absolute and pre-prefixed paths in S3Storage#27671
🐛 Tolerated absolute and pre-prefixed paths in S3Storage#27671
Conversation
WalkthroughThe S3Storage adapter's 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
ref https://linear.app/tryghost/issue/ONC-1673 Several callers pass either absolute filesystem paths (built via path.join with getContentPath) or paths that already include the storagePath prefix (built via getTargetDir) when calling exists(), save() or delete(). path.posix.join concatenated those onto storagePath verbatim, producing malformed bucket keys that embedded the local filesystem prefix or duplicated the storagePath segment. The exists() probe then targeted a different key than the eventual write, defeating uniqueness checks; on stricter bucket policies the malformed HEAD threw non-NotFound errors, surfacing as the bookmark favicon fallback users have been reporting. Routed all keys through toCanonicalRelativePath, a chain of named handlers (fromAbsoluteFilesystemPath, fromStoragePathPrefixed, fromLeadingSlashPath) that each handle one input shape and return null when their shape doesn't apply, mirroring how LocalStorageBase absorbs the same shapes via _resolveAndValidateStoragePath. Existing path-traversal protection still fires for `..` segments and the change is forward-only — no existing S3 objects move or rename.
62a04a6 to
d0730d9
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ghost/core/core/server/adapters/storage/S3Storage.ts`:
- Around line 420-447: The canonicalization helpers miss Windows-style absolute
paths so callers on Windows can produce malformed S3 keys; update
fromAbsoluteFilesystemPath and fromLeadingSlashPath to first normalize
backslashes to POSIX slashes and detect Windows absolute/UNC paths (e.g.,
drive-letter like "C:\..." and UNC "\\server\share\...")—strip the drive letter
or UNC server/share prefix and any leading slashes before applying the existing
storagePath-relative logic (keep fromStoragePathPrefixed unchanged); in other
words, normalize input by replacing "\" with "/", handle ^[A-Za-z]:/ and
^//server/share/ cases to produce the same relative/path outputs as POSIX inputs
so S3 keys are canonical on Windows.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3045a037-9e24-4d2f-9cdc-dcd678d5dd32
📒 Files selected for processing (2)
ghost/core/core/server/adapters/storage/S3Storage.tsghost/core/test/unit/server/adapters/storage/s3-storage.test.ts
|
@allouis tagging you since you've been active in this code recently, I would value your input on this! With the rollout to S3 storage for images we've seen a latent bug in how absolute paths are handled, which caused bookmark favicons to break. That's a symptom of a wider issue: there's drift in how S3Storage and LocalStorage adapters deal with absolute and relative file paths. I opted to fix this in S3Storage rather than the favicon flow specifically because I suspect there's other path-resolution wrongness elsewhere that isn't user-visible yet (image imports look like they're storing the absolute on-disk path inside the bucket key, after the prefix). I opted to stay permissive and add support for absolute paths alongside the other shapes rather than harden what the adapter accepts, because that would require 3+ changes in other places to prevent regressions. You might prefer that solution to this one. Let me know what you think! |
There was a problem hiding this comment.
I would prefer to fix the call site, and potentially even do something like
if (relativePath.startsWith('/') || relativePath.includes(storagePath)) {
throw IncorrectUsageError('Pass a relative path');
}
Because I think supporting all these cases adds some complication to the storage layer, and it continues this pattern of consumers knowing way too much information about storage paths etc….
That said - I do get your reasoning. How would you feel about doing both? We fix the callers but add this support, or is that overkill?
| return `${this.tenantPrefix}/${pathWithStorage}`; | ||
| } | ||
|
|
||
| private toCanonicalRelativePath(input: string): string { |
There was a problem hiding this comment.
I think we could implement the path transformations inline here, because they all boil down to "get everything after the storage path" or "remove leading slash" - something like:
toCanonicalRelativePath(input) {
const marker = this.storagePath + '/';
const parts = input.split(marker);
if (parts.length > 1) {
return parts[parts.length - 1];
}
return input.replace(/^\/+/, '');
}Or I guess the lastIndexOf approach works here too - I think it's effectively the same
There was a problem hiding this comment.
I originally had this implementation, but I found myself having to document the different cases and outcomes in comments so I wanted to see if I could encode the accepted inputs and their outputs into the code itself, hence the different function names etc.
Its more verbose in the code, but also more explicit and literal, so we don't have to rely on comments to explain what we accept or transform or for convention to remember what we accept or pass.
The other method of achieving the same goal is to not allow "invalid" input which would require this kind of manipulation, but that would require changing all the call sites, which I wanted to avoid (in this specific design).
Regarding the suggestion vs the current implementation, we could go with either, I think the outcome is the same. I'll hold off actioning anything right now, I'll reply to the outer suggestion about updating call-sites first.
|
Thanks for taking a look @allouis! I spent a while on the stricter approach of restricting inputs and updating call-sites but it really spiralled out of control. We have a few distinct paths which all pass something different and need special consideration. For example, in the "import" path we want images stored at their absolute locations at the root, not relative, so we need some special handling for that case or to update the importer. Its possible, and I have a PR which demonstrates the scope of change here but its quite extensive. In this case I think this PR is a pragmatic choice to fix the issue, and aligning paths is something that requires more time / consideration outside of on-call work. What do you think? |
|
Thank you for exploring the alternative! I think you're right, I didn't envisage the scope being quite so big - I'm happy with the approach in this PR! |
Problem
Bookmark cards stopped rendering favicons on Pro (ONC-1673). Investigation showed this is the user-visible slice of a wider pattern: three callers — the bookmark fetcher, the importer image handler, and the external-media-inliner — build storage
targetDirarguments by joininggetContentPath(...)orgetTargetDir(storage.storagePath)with extra segments, producing either absolute filesystem paths or paths that already include the storagePath prefix.LocalStorageBasetolerates both shapes;S3Storagedoesn't —path.posix.joinconcatenates them onto storagePath verbatim, producing malformed bucket keys with the filesystem prefix embedded or the storagePath duplicated. Theexists()probe then targets a different key than the eventual write, defeating uniqueness checks; on stricter bucket policies the malformed HEAD throws non-NotFound errors, surfacing as the favicon fallback users have been reporting.Solution
Tolerate the three input shapes callers actually pass — absolute filesystem path, path already prefixed with the storage root, and decorative leading slash — by canonicalising each before composing the bucket key. Mirrors how
LocalStorageBaseabsorbs the same variants. Existing path-traversal protection still fires for..segments.Migration
Forward-only. No existing S3 objects move or rename; URLs already stored in posts continue to resolve to whatever object they pointed at before. Existing callers passing properly-relative paths see no behaviour change. Importer-side, where post HTML previously referenced a key that the file was never actually written to, the fix heals new writes by landing them at the matching key.
ref https://linear.app/ghost/issue/ONC-1673