-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix!: return simple stats or extended stats #760
Conversation
Check out the changes to the returned types - https://github.com/ipfs/helia/pull/760/files#diff-026d52cd110beb421b24bb2d95961f27f3f3d083df76eac8302eb4eab783c680 - most of this PR is noise dealing with that. |
Removing the |
9f2e7e0
to
5cd760c
Compare
Splits the unixfs/mfs stat command return types into two types - "regular" stats (these only return stats available on the root node of the DAG) and "extended" - these collect stats from the DAG, potentially going to the network to fetch missing blocks. There are separate simple and extended types for files and dirs, but raw/leaf nodes only get one type since there are never any nodes linked to from them. As a bonus we can now calculate the size of directories correctly, (as the combination of all child files/directories), assuming all blocks are present in the blockstore or fetchable from the network. Fixes #580 BREAKING CHANGE: The return type from fs.stat varies by DAG type - inspect the `.type` property to ensure type safety
5cd760c
to
4ebc680
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only thing confusing me here is that in the tests, we aren't validating that "network requests" will fill out dagSize
more when extended=true
.
It would be good to have a test for that.. maybe have a wrapped blockstore that calls to a "network" blockstore that we load the bytes into?
packages/mfs/test/stat.spec.ts
Outdated
await blockstore.delete(node.Links[0].Hash) | ||
|
||
// block count and local file/dag sizes should be smaller | ||
await expect(fs.stat(filePath)).to.eventually.include({ | ||
fileSize: 5242880n, | ||
blocks: 5, | ||
localFileSize: 4194304n, | ||
localDagSize: 4194563n | ||
const updatedStats = await fs.stat(filePath, { | ||
extended: true | ||
}) | ||
|
||
expect(updatedStats.unixfs?.fileSize()).to.equal(5242880n) | ||
expect(updatedStats.blocks).to.equal(5n) | ||
expect(updatedStats.dagSize).to.equal(4194563n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting dagSize to be the same here because extended=true
but I guess there's no networking done in the test, so it doesn't retrieve that block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct, yes. The blockstore is a MemoryBlockstore that isn't wrapped in a NetworkedBlockstore.
packages/unixfs/src/index.ts
Outdated
*/ | ||
type: 'file' | 'directory' | 'raw' | ||
localSize: bigint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since ExtendedStats
can trigger network fetching, I suppose if this is successfully returned, it's accurate, because the blocks had to be fetched for this to return. Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, unless the offline
option was true
, in which case it won't pull down and missing blocks so localSize
will only be what's in the local blockstore.
I was confused by the same thing. If you have a NetworkedBlockstore, will the helia/packages/unixfs/src/commands/stat.ts Lines 108 to 118 in 6a9f297
|
unixfs: entry.unixfs, | ||
mode: entry.unixfs.mode ?? (entry.unixfs.isDirectory() ? DEFAULT_DIR_MODE : DEFAULT_FILE_MODE), | ||
mtime: entry.unixfs.mtime, | ||
size: entry.unixfs.fileSize() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do we expect fileSize()
to return for directories?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will return 0n
.
No, actually this was a bug. I've updated the code and added an interop test that tests pulling the missing block from a network peer. |
Co-authored-by: Daniel Norman <[email protected]>
To better understand and doucment the nuances around stats and extended stats, I created this table: stat options and result
Notes
I'll drop it here for now, and can incorporate it into the READMEs in a follow up to this. Would be great if you can verify this. |
Looks fine. I've made a change so that when statting a directory with This is as close to the "true" or expected directory size as we can get, I think. For the same stat result, |
Splits the unixfs/mfs stat command return types into two types - "regular" stats (these only return stats available on the root node of the DAG) and "extended" - these collect stats from the DAG, potentially going to the network to fetch missing blocks.
There are separate simple and extended types for files and dirs, but raw/leaf nodes only get one type since there are never any nodes linked to from them.
As a bonus we can now calculate the size of directories correctly, (as the combination of all child files/directories), assuming all blocks are present in the blockstore or fetchable from the network.
Fixes #580
BREAKING CHANGE: Fields that would involve DAG traversal have been removed from the output of
fs.stat
- pass theextended
option to have them returnedChange checklist