feat: support dag-scope #767

Open: wants to merge 19 commits into base: main

SgtPooki
Member

@SgtPooki SgtPooki commented Mar 14, 2025

Title

feat: support dag-scope

Description

This PR adds support for the dag-scope query parameter to the export and stream methods. This parameter allows the user to specify the scope of the DAG they want to retrieve. The possible values are all, entity, and block. The default value is all.
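As a rough illustration of the three values and the default, here is a minimal sketch that assumes the value arrives as a URL query string, as it would at a gateway. `parseDagScope`, `isDagScope`, and `DEFAULT_DAG_SCOPE` are hypothetical names for this example, not part of the PR.

```typescript
// The three scope values named in the description above.
type DagScope = 'all' | 'entity' | 'block'

// The description states that the default is `all`.
const DEFAULT_DAG_SCOPE: DagScope = 'all'

// Hypothetical helper: narrows an arbitrary string to DagScope.
function isDagScope (value: string): value is DagScope {
  return value === 'all' || value === 'entity' || value === 'block'
}

// Hypothetical helper: reads `dag-scope` from a query string and falls back
// to the default when the parameter is absent.
function parseDagScope (query: string): DagScope {
  const raw = new URLSearchParams(query).get('dag-scope')
  if (raw == null) {
    return DEFAULT_DAG_SCOPE
  }
  if (!isDagScope(raw)) {
    throw new Error(`unsupported dag-scope value: ${raw}`)
  }
  return raw
}
```

Rejecting unknown values instead of silently falling back is a choice made for this sketch; the PR may handle them differently.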

This PR also refactors the code considerably to support different strategies for walking the DAG, and adds two new options: dagRoot and knownDagPath.

This code also updates the blockFilter check to occur before the blockstore.get call to prevent potentially unnecessary network calls.
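The reordering can be sketched as follows. This is a simplified, synchronous illustration: `SketchBlockstore`, `SkipFilter`, and `collectBlocks` are hypothetical stand-ins, and the real blockstore and filter in this PR are asynchronous and have different shapes.

```typescript
// Simplified, synchronous stand-in for a blockstore (assumption: the real one is async
// and may fetch from the network when a block is missing locally).
interface SketchBlockstore {
  get (key: string): Uint8Array
}

// Hypothetical filter shape: returns true when the block should be skipped.
type SkipFilter = (key: string) => boolean

function collectBlocks (keys: string[], store: SketchBlockstore, shouldSkip: SkipFilter): Uint8Array[] {
  const out: Uint8Array[] = []
  for (const key of keys) {
    // Consult the filter first, so a filtered key never triggers a get
    // (and therefore never triggers a possible network request).
    if (shouldSkip(key)) {
      continue
    }
    out.push(store.get(key))
  }
  return out
}
```

With the check placed after the `get`, every filtered block would still cost a lookup whose result is then thrown away.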

Related ipfs/helia-verified-fetch#198
Related ipfs/service-worker-gateway#356

Notes & open questions

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation if necessary (this includes comments as well)
  • I have added tests that prove my fix is effective or that my feature works

@SgtPooki SgtPooki requested a review from a team as a code owner March 14, 2025 20:24
Member Author

@SgtPooki SgtPooki left a comment

self review

@@ -0,0 +1,27 @@
/**
Member Author

I don't like that this is in its own file, but I was trying to avoid a circular dependency, and it's not quite a "type", so...

Member

You could make it a type and declare it in the index file?

type DagScope = 'block' | 'entity' | 'all'

Using types instead of enums means we won't need to use --experimental-transform-types in future.
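One possible shape for this, as a sketch only: a string-literal union derived from a constant tuple keeps a runtime list of values without an enum, and an exhaustive switch catches values that are added later. `DAG_SCOPES` and `walksBeyondRoot` are hypothetical names, and the true/false results reflect my reading of the scope names rather than code from this PR.

```typescript
// A constant tuple gives a runtime list of values without needing an enum.
const DAG_SCOPES = ['block', 'entity', 'all'] as const

// Equivalent to: type DagScope = 'block' | 'entity' | 'all'
type DagScope = typeof DAG_SCOPES[number]

// Hypothetical helper: whether a traversal needs to go past the root block.
// Assumption: 'block' means only the root block itself is wanted.
function walksBeyondRoot (scope: DagScope): boolean {
  switch (scope) {
    case 'block':
      return false
    case 'entity':
    case 'all':
      return true
    default: {
      // If a new scope is added to the union, this assignment stops compiling.
      const unreachable: never = scope
      throw new Error('unhandled dag-scope: ' + String(unreachable))
    }
  }
}
```

Everything here is erased or left as plain JavaScript at compile time, which is the property the comment above is after.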

Comment on lines +209 to +212
const isTargetRoot = roots.some(r => r.equals(lastCid))
if (!isTargetRoot) {
  throw new Error('knownDagPath must end with one of the target roots')
}
Member Author

We need some solution here to handle multiple roots, where there is a separate path to each root.

I know this sounds confusing, but in current kubo/boxo terminology a "root" is the "target" root block of a DAG, while the "dagRoot" is the root of the DAG that is walked to reach that sub-DAG's starting node.
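To make the terminology concrete, here is a minimal sketch of the single-path case only. `CidLike`, `FakeCid`, and `validateKnownDagPath` are hypothetical stand-ins (the real code works with multiformats CIDs), and the multiple-roots case raised above is deliberately left unsolved.

```typescript
// Minimal stand-in for the parts of a CID this sketch needs.
interface CidLike {
  equals (other: CidLike): boolean
  toString (): string
}

// Hypothetical CID replacement that compares by string identity.
class FakeCid implements CidLike {
  private readonly id: string

  constructor (id: string) {
    this.id = id
  }

  equals (other: CidLike): boolean {
    return other.toString() === this.id
  }

  toString (): string {
    return this.id
  }
}

// Checks that the path starts at dagRoot and ends at one of the target roots.
function validateKnownDagPath (dagRoot: CidLike, roots: CidLike[], knownDagPath: CidLike[]): void {
  if (knownDagPath.length === 0) {
    throw new Error('knownDagPath must not be empty')
  }
  const firstCid = knownDagPath[0]
  const lastCid = knownDagPath[knownDagPath.length - 1]
  if (!firstCid.equals(dagRoot)) {
    throw new Error('knownDagPath must start with dagRoot')
  }
  if (!roots.some(r => r.equals(lastCid))) {
    throw new Error('knownDagPath must end with one of the target roots')
  }
}
```

Supporting several roots would probably need one path per root, for example a list of paths, but that design is still an open question in this thread.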

@SgtPooki
Member Author

cspell is yelling about some words in dag-scope.spec.ts

@SgtPooki
Member Author

ping @achingbrain @2color I think some other eyes on this would be good, and this is blocking ipfs/service-worker-gateway#356

Member

@achingbrain achingbrain left a comment

Initial pass, need to look in more detail tomorrow


constructor (components: CarComponents, init: any) {
  this.components = components
  this.log = (init.logger ?? defaultLogger()).forComponent('helia:car')
Member

Please use the component logger from Helia, don't create a new one.

}
} catch (err) {
// Handle errors, but don't propagate them to avoid breaking the queue
this.log.error('Error processing block', err)
Member

Suggested change
- this.log.error('Error processing block', err)
+ this.log.error('error processing block - %e', err)

for (let i = 0; i < knownPath.length; i++) {
  const cid = knownPath[i]
  if (!(await blockstore.has(cid))) {
    throw new Error(`CID in knownDagPath at index ${i} not found in blockstore`)
Member

Is this correct? What if we want to pull the block from the network?

If the user doesn't want network activity they should be able to pass an offline flag which the networked blockstore will use to throw an error when a block is missing.
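A rough sketch of the behaviour being described, using a simplified, synchronous stand-in. `SketchNetworkedBlockstore` and its two record-backed stores are hypothetical, and the `offline` option name is taken from the comment above rather than checked against the Helia API.

```typescript
// Assumed option shape: the caller opts out of network activity per call.
interface SketchGetOptions {
  offline?: boolean
}

// Hypothetical, synchronous stand-in for a networked blockstore.
// `local` plays the role of the on-disk store, `remote` the network.
class SketchNetworkedBlockstore {
  private readonly local: Record<string, Uint8Array>
  private readonly remote: Record<string, Uint8Array>
  networkFetches = 0

  constructor (local: Record<string, Uint8Array>, remote: Record<string, Uint8Array>) {
    this.local = local
    this.remote = remote
  }

  get (key: string, options: SketchGetOptions = {}): Uint8Array {
    const cached = this.local[key]
    if (cached != null) {
      return cached
    }
    // The block is missing locally: only now does the offline flag matter.
    if (options.offline === true) {
      throw new Error(`block ${key} not found locally and offline was requested`)
    }
    this.networkFetches++
    const fetched = this.remote[key]
    if (fetched == null) {
      throw new Error(`block ${key} not found`)
    }
    this.local[key] = fetched
    return fetched
  }
}
```

Under this arrangement the up-front `blockstore.has` loop is unnecessary: callers who want no network activity pass the flag, and everyone else gets missing path blocks fetched on demand.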

* when the path from dagRoot to the target root is already known, reducing network requests
* and computation time.
*
* Example usage:
Member

There's a tag for this:

Suggested change
- * Example usage:
+ * @example
+ *

* An ordered array of CIDs representing the known path from dagRoot to the target root.
* The array should start with dagRoot (at index 0) and end with the target root CID.
*
* If you provide `knownDagPath`, you should already have verified that the CIDs in the path are present in the blockstore.
Member

Does it hurt to let the user control this with an offline flag?

Comment on lines +305 to +306
const blockCid = (result as { cid: CID }).cid ?? result
const newStrategy = (result as { strategy: TraversalStrategy }).strategy ?? strategy
Member

Can we avoid the casts here? We don't get type safety like this.
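One way to drop the casts, sketched under assumptions: if the traversal yields either a bare CID or a `{ cid, strategy }` pair, an `in` check lets the compiler narrow the union. `CidLike`, `StrategyLike`, `QueuedBlock`, and `normalizeResult` are hypothetical names standing in for the PR's real types.

```typescript
// Minimal stand-ins for the real CID and TraversalStrategy types.
interface CidLike {
  toString (): string
}

interface StrategyLike {
  name: string
}

// The object form that carries a replacement strategy along with the CID.
interface QueuedBlock {
  cid: CidLike
  strategy: StrategyLike
}

// Assumed result shape: either a bare CID or a CID plus a new strategy.
type TraversalResult = CidLike | QueuedBlock

// Normalizes both shapes without `as`: the `in` check narrows the union.
function normalizeResult (result: TraversalResult, currentStrategy: StrategyLike): QueuedBlock {
  if ('strategy' in result) {
    // Narrowed to QueuedBlock here.
    return result
  }
  // Narrowed to CidLike here, so keep the current strategy.
  return { cid: result, strategy: currentStrategy }
}
```

An alternative would be to have strategies always yield the object form, possibly with an optional `strategy` field, which removes the union entirely.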
