Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: PDP #227

Draft
wants to merge 71 commits into
base: main
Choose a base branch
from
Draft

feat: PDP #227

wants to merge 71 commits into from

Conversation

magik6k
Copy link
Collaborator

@magik6k magik6k commented Sep 30, 2024

Reading:

Roadmap:

  • Database schema
  • PieceCID version correctness
  • HTTP API
    • Basic structure
    • Auth
    • Data ingest
      • POST /piece
      • PUT /piece/upload/..
      • Call notify
        • make optional
      • Store in long-term storage
    • Data retrieval through /piece
      • Auth
    • PDPTool
      • Creating secret/tokens
      • Ping tool
      • Upload tool
      • Retrieve tool
      • Piece set manipulation
    • Piece sets
      • Create
      • Delete
      • Add root (w subroots)
      • Del Root
      • Get root subroots
    • Proof-set prover
  • UI

@magik6k magik6k force-pushed the feat/pdp branch 2 times, most recently from 07a5b22 to 5c3f546 Compare October 4, 2024 18:35
Copy link

@hannahhoward hannahhoward left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've mostly put in comments to check for understanding before I start writing code. But generally LGTM -- looks like it's compliant with the API we agreed on :)

pdp/auth.go Show resolved Hide resolved
cuhttp/server.go Outdated Show resolved Hide resolved
pdp/handlers.go Show resolved Hide resolved

// Parse request body
var req struct {
PieceCID string `json:"pieceCid"`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you're not requiring piececid v2 here, not in the put handler. Long term, I think we should require this. From the standpoint of the SP, I imagine them to want to filter at the piece level, and I imagine one of the main filter critieria to be size. They should know exactly what size piece they're about to accept before they accept it.

Unless I'm wrong, a PieceCID v1 doesn't specify that, and absent a size parameter, they could be accepting anything from 32 bytes to 256 MB.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, PieceCIDv2 is still todo, can quickly update if that's a blocker

pdp/handlers_upload.go Show resolved Hide resolved
lib/pieceprovider/piecepark_reader.go Show resolved Hide resolved
@hannahhoward
Copy link

am I correct that the retrieval handler already has been modified to support PDP pieces through the parked piece reader?

@hannahhoward
Copy link

am I correct in believing that the next step on the API is that add root will send a location header to a creation status API for the root, then that will be used to get the root ID which is then possible to pass to the root detail and delete root APIs? Not crazy urgent just want to know.

Copy link
Collaborator Author

@magik6k magik6k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github thinks I have an unsubmitted review here and can't discard it..

@magik6k
Copy link
Collaborator Author

magik6k commented Oct 21, 2024

am I correct that the retrieval handler already has been modified to support PDP pieces through the parked piece reader?

Yes, correct, retrievals work already. No ACL mechanism yet, but it's not going to be hard to implement once we have a spec (you can see that all retrieval requests quite quickly become aware what piece they are reading from + ACLs should be quite cachable)

@magik6k
Copy link
Collaborator Author

magik6k commented Oct 21, 2024

am I correct in believing that the next step on the API is that add root will send a location header to a creation status API for the root, then that will be used to get the root ID which is then possible to pass to the root detail and delete root APIs? Not crazy urgent just want to know.

Other than delete all the things should be there. Operations on PDP proofsets (other than create) you should probably watch through eth events. I can also make the message-sending endpoints output the message hash somehow, but you still want to watch for all the events to not miss any contract changes made by the SP

@magik6k magik6k force-pushed the feat/pdp branch 4 times, most recently from fa8d937 to a3a1c3a Compare October 24, 2024 13:28
@hannahhoward
Copy link

hannahhoward commented Oct 27, 2024

Upon further review, it's possible we have a design issue here that we need a solution for.

Please let me know if I am understanding the sequence of events correctly when a user uploads a pdp piece:

  1. The piece is put in the “Stash”
  2. A parked piece record is created
  3. Asychronously, the piece moves from the stash to permaneant storage (Task: https://github.com/filecoin-project/curio/blob/feat/pdp/tasks/piece/task_park_piece.go)
  4. After that, again asynchronously, the notify task is called, which hits the notify URL. (Task: https://github.com/filecoin-project/curio/blob/feat/pdp/tasks/pdp/notify_task.go)

Why this matters: a basic gaurantee of all hot storage, which storacha tries to offer, is read-on-write. The basic expectation is from the users point of view is that when an upload finishes, it should be retrievable.

One thing that is not clear I'd like to clarify: are steps 3 & 4 likely to run in a couple seconds or much longer? If it's only a few seconds, we can work around this. Fortunately our upload process includes a confirmation step AFTER the upload, and what we can do is essentially block the confirmation until step 4 completes. But that only works if it's a short period of time before the piece is fully retrievable.

If not, we really need to revisit the architecture and find a way to shorten the process.

@hannahhoward
Copy link

Follow-up: a pull based approach for checking for piece cid -- #304

magik6k and others added 4 commits November 1, 2024 12:13
Allows accepting JWTs that are signed with Ed25519 as well as ECDSA
x509 does not parse to pointer type on ed25519
* Update PDP to new proof scheduling WIP
* add listener addr to pdp tables
* generate bindings for new proving_schedule iface

* Complete draft of new challenge window scheduling

* Try to get tests building

* No fancy ALTER tables

* Review response

* Fix

* Fix sql syntax

---------

Co-authored-by: zenground0 <[email protected]>
Base automatically changed from feat/market to main January 7, 2025 16:57
ZenGround0 and others added 4 commits January 28, 2025 14:07
* Update contract wrappers

* Update to pdp code freeze -- next pp now called after first add

* Better initializing of init_ready flag

* Name too long

* Bad db select arg from copy paste

* update pdp contract addr

* Fix invalid insert statement

* Remove unnecessary call that fails

* Fix old sql error

* SQL schema tweak

* Fix write to bogus column

* YASQLSE

* DB scan param tweaking

* Debuging bad call to listener

* Lets try this

* Fix attempt 1

* Safer init proving period challenge epoch

* Less aggressive buffer

* Print out init epoch

* more logging

* fix sql update bug

* pass challenge epoch to pdptool

* drand integration

* flexible challenge window extension

* Update contract abi

* Update to refactored contract fee calc endpoint

* Multiply by basefee

* debug

* Correct logging

* Provide value during gas estimation

* Tweak

* Better understanding of proof fee

* correct printing

* more logging

* bigger buffer

* logging for debug of next pp

* fix bug

* Cleanup

* Fix incorrect challenge range when adding new roots

---------

Co-authored-by: zenground0 <[email protected]>
* Core api and prove task logic changes

* Update pdptool to process deletes

* SQL ordering fix

* NoSQL just track removals on chain

* Log removals

* correct pdptool output

* correct delete api interface

* remove

* remove

* Update tasks/pdp/task_prove.go

Co-authored-by: Łukasz Magiera <[email protected]>

* review response

* Fix up

---------

Co-authored-by: zenground0 <[email protected]>
Co-authored-by: Łukasz Magiera <[email protected]>
cmd/pdptool/main.go Outdated Show resolved Hide resolved

// Assign pending transactions with null owner to ourselves
{
n, err := mw.db.Exec(ctx, `UPDATE message_waits_eth SET waiter_machine_id = $1 WHERE waiter_machine_id IS NULL AND tx_status = 'pending'`, machineID)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use an index on this table like https://github.com/filecoin-project/curio/pull/392/files


proofStr += "] ] ]"

log.Infof("PDP Prove Task: proofSetID: %d, taskID: %d, proofs: %s", proofSetID, taskID, proofStr)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should drop the verbosity here

return address, nil
}

func (p *ProveTask) cleanupDeletedRoots(ctx context.Context, proofSetID int64, pdpVerifier *contract.PDPVerifier) error {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is also the pdp_piecerefs table, and I'm not entirely sure what's the best way to approach this because:

  • On upload the piece is only referred in pdp_piecerefs (+the piecepark tables)
  • Only after upload is complete and data is verified the PDP service can add the piece to a proofset
  • Technically one piece can be in multiple proofsets / roots (as a subroot)
  • So probably:
    • Add a flag to pdp_piecerefs which is optionally settable through the delete api which tells curio to drop the piece reference when all pdp_proofset_roots are gone
    • Possibly add an endpoint which lists all pdp_proofset_roots for a given piece (so the service can know why a given piece may be "locked")

@magik6k magik6k changed the title [WIP] feat: PDP feat: PDP Jan 29, 2025
@magik6k
Copy link
Collaborator Author

magik6k commented Jan 29, 2025

Did a deeper review pass with focus on whether this will impact other curio functionality - looks like this should be safe to merge, generally most invasive changes are in piece_park, but default behaviors don't change there, and in retrieval paths, but there PDP is plugged in as a fallback, so it shouldn't impact normal operation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants