
Conversation

@Kubuxu Kubuxu commented Oct 9, 2025

No description provided.

@Kubuxu Kubuxu requested a review from a team as a code owner October 9, 2025 15:32
@Kubuxu Kubuxu force-pushed the pdp/create-and-upload branch from 64afc4a to 726cf4d on October 9, 2025 15:53
return fmt.Errorf("expeted to find dataSetId in receipt but failed to extract: %w", err)
}
// XXX: I considered here chekcing if dataset exists already in DB, but not sure if it is needed
}
Collaborator Author

The majority of my questions are around this file. I have yet to test it, but I'm using dataset=0 as a sentinel value.

It might be better if it were NULL instead. Also, I don't know about any possible table relations that this might affect.
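
For reference, a NULL data set id would typically surface as sql.NullInt64 on the Go side rather than a 0 sentinel. A minimal sketch, using an illustrative struct that is not the one from this PR:

package main

import (
	"database/sql"
	"fmt"
)

// pieceAdd is a hypothetical row model; the real struct in this PR may differ.
type pieceAdd struct {
	AddMessageHash string
	DataSet        sql.NullInt64 // NULL until the data set id is known
}

func describe(p pieceAdd) string {
	if !p.DataSet.Valid {
		// NULL expresses "not yet known" without reserving 0 as a magic value.
		return fmt.Sprintf("piece add %s: data set pending", p.AddMessageHash)
	}
	return fmt.Sprintf("piece add %s: data set %d", p.AddMessageHash, p.DataSet.Int64)
}

func main() {
	fmt.Println(describe(pieceAdd{AddMessageHash: "0xabc"}))
	fmt.Println(describe(pieceAdd{AddMessageHash: "0xdef", DataSet: sql.NullInt64{Int64: 7, Valid: true}}))
}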
@LexLuthr

Member

I think you should look at handleGetPieceAdditionStatus; that's the only place I can see where it might matter. The client can ask for piece addition status, and the handler selects by data set id. What should the client ask for in the combined flow, and what do we expect them to get? I think maybe none of it works currently. We call that with getPieceAdditionStatus in the SDK.

The only other place where functionality outside of this PR touches that table is the trigger we add for transaction resolving, and it doesn't care about the data set id:

CREATE OR REPLACE FUNCTION update_pdp_data_set_piece_adds()
RETURNS TRIGGER AS $$
BEGIN
    IF OLD.tx_status = 'pending' AND (NEW.tx_status = 'confirmed' OR NEW.tx_status = 'failed') THEN
        -- Update the add_message_ok field in pdp_data_set_piece_adds if a matching entry exists
        UPDATE pdp_data_set_piece_adds
        SET add_message_ok = CASE
            WHEN NEW.tx_status = 'failed' OR NEW.tx_success = FALSE THEN FALSE
            WHEN NEW.tx_status = 'confirmed' AND NEW.tx_success = TRUE THEN TRUE
            ELSE add_message_ok
        END
        WHERE add_message_hash = NEW.signed_tx_hash;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
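
Restated in Go terms, the CASE above resolves add_message_ok roughly as follows; this is a sketch of the semantics only, not code from the repository:

package main

import "fmt"

// resolveAddMessageOK mirrors the CASE expression above: a failed or
// explicitly unsuccessful tx marks the piece add as not ok, a confirmed
// successful tx marks it ok, and anything else keeps the previous value.
// (The trigger additionally only fires on the pending -> confirmed/failed
// transition; that guard is omitted here.)
func resolveAddMessageOK(prev *bool, txStatus string, txSuccess *bool) *bool {
	f, t := false, true
	switch {
	case txStatus == "failed" || (txSuccess != nil && !*txSuccess):
		return &f
	case txStatus == "confirmed" && txSuccess != nil && *txSuccess:
		return &t
	default:
		return prev
	}
}

func main() {
	ok, notOk := true, false
	fmt.Println(*resolveAddMessageOK(nil, "confirmed", &ok))    // true
	fmt.Println(*resolveAddMessageOK(nil, "confirmed", &notOk)) // false
	fmt.Println(resolveAddMessageOK(nil, "pending", nil))       // <nil> (unchanged)
}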

Collaborator Author

I could special-case dataset=0 in handleGetPieceAdditionStatus.

Collaborator Author

I think the client should ask for the dataset creation status instead. After that, and once the piece gets processed, the piece addition status endpoint will work, because pdp_data_set_piece_adds.data_set will get updated to the proper dataset.
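
Sketched from the client's perspective, the suggested flow would look roughly like this; the statusClient interface and its method names are hypothetical (the source only names getPieceAdditionStatus), so treat it as an illustration rather than the SDK's API:

package main

import (
	"fmt"
	"time"
)

// statusClient is a hypothetical stand-in for the SDK client; the method
// names and signatures are illustrative only.
type statusClient interface {
	DataSetCreationStatus(createTxHash string) (dataSetID int64, created bool, err error)
	PieceAdditionStatus(dataSetID int64, pieceCID string) (done bool, err error)
}

// waitForPiece first waits for the data set creation to resolve (combined
// flow), then polls piece addition status against the resolved data set id.
func waitForPiece(c statusClient, createTxHash, pieceCID string) error {
	var dataSetID int64
	for {
		id, created, err := c.DataSetCreationStatus(createTxHash)
		if err != nil {
			return err
		}
		if created {
			dataSetID = id
			break
		}
		time.Sleep(5 * time.Second)
	}
	for {
		done, err := c.PieceAdditionStatus(dataSetID, pieceCID)
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		time.Sleep(5 * time.Second)
	}
}

func main() {
	// Wire up a real client implementation before calling waitForPiece.
	fmt.Println("waitForPiece sketch; see lead-in for assumptions")
}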

@Kubuxu Kubuxu changed the title from "feat(pdp): support create and upload" to "[WIP] feat(pdp): support create and upload" Oct 9, 2025
@Kubuxu Kubuxu commented Oct 10, 2025

@rvagg would appreciate you taking a look as well.


type RequestBody struct {
	RecordKeeper string            `json:"recordKeeper"`
	Pieces       []AddPieceRequest `json:"pieces"`
Member

oh yeah, nicely typed

@rvagg rvagg commented Oct 10, 2025

Looks good; my only concern is the impact on handleGetPieceAdditionStatus.

@rjan90 rjan90 linked an issue Oct 13, 2025 that may be closed by this pull request
@Kubuxu Kubuxu force-pushed the pdp/create-and-upload branch 3 times, most recently from 4ac5aa6 to 3224e22 on October 14, 2025 13:58
@Kubuxu Kubuxu commented Oct 14, 2025

Looking at the schema of pdp_data_set_piece_adds: the insert will fail due to the foreign key pdp_proofset_root_adds_proofset_fkey, as the referenced pdp_data_sets.id won't exist at the time.

[email protected]:yugabyte> \d pdp_data_set_piece_adds
+-------------------+---------+-------------------------+
| Column            | Type    | Modifiers               |
|-------------------+---------+-------------------------|
| data_set          | bigint  |  not null               |
| piece             | text    |  not null               |
| add_message_hash  | text    |  not null               |
| add_message_ok    | boolean |                         |
| add_message_index | bigint  |  not null               |
| sub_piece         | text    |  not null               |
| sub_piece_offset  | bigint  |  not null               |
| sub_piece_size    | bigint  |  not null               |
| pdp_pieceref      | bigint  |  not null               |
| pieces_added      | boolean |  not null default false |
+-------------------+---------+-------------------------+
Indexes:
    "pdp_data_set_piece_adds_pk" PRIMARY KEY, lsm (data_set HASH, add_message_hash ASC, add_message_index ASC)
    "idx_pdp_data_set_piece_adds_pieces_added" lsm (pieces_added HASH)
Foreign-key constraints:
    "pdp_proofset_root_adds_add_message_hash_fkey" FOREIGN KEY (add_message_hash) REFERENCES message_waits_eth(signed_tx_hash) ON DELETE CASCADE
    "pdp_proofset_root_adds_pdp_pieceref_fkey" FOREIGN KEY (pdp_pieceref) REFERENCES pdp_piecerefs(id) ON DELETE SET NULL
    "pdp_proofset_root_adds_proofset_fkey" FOREIGN KEY (data_set) REFERENCES pdp_data_sets(id) ON DELETE CASCADE

I welcome suggestions on how to solve it cleanly, but it seems like making the data set id nullable is the cleanest option.

@Kubuxu Kubuxu commented Oct 14, 2025

I pushed a commit that makes the data set id in pdp_data_set_piece_adds nullable; otherwise, the foreign key constraint would get in the way.

@Kubuxu Kubuxu commented Oct 14, 2025

Ohh, data_set cannot be nullable because it is part of the primary key.
Should we just remove it from the PK? I don't think we support flows where a single message adds to multiple different datasets, so that uniqueness constraint isn't needed.

@Kubuxu Kubuxu requested a review from rvagg October 15, 2025 10:45
@LexLuthr LexLuthr (Contributor) left a comment

Looking at the table, I don't see any way to get around the problem while keeping the primary key.

Comment on lines 114 to 123
resolvedDataSetId := pieceAdd.DataSet
if !resolvedDataSetId.Valid {
	var err error
	resolvedDataSetId.Int64, err = extractDataSetIdFromReceipt(receipt)
	if err != nil {
		return fmt.Errorf("expected to find dataSetId in receipt but failed to extract: %w", err)
	}
	resolvedDataSetId.Valid = true
	var exists bool
	// we check whether the dataset already exists to avoid a foreign key violation
	err = db.QueryRow(ctx, `
		SELECT EXISTS (
			SELECT 1
			FROM pdp_data_sets
			WHERE id = $1
		)`, resolvedDataSetId.Int64).Scan(&exists)
	if err != nil {
		return fmt.Errorf("failed to check if data set exists: %w", err)
	}
	if !exists {
		// XXX: maybe return nil instead to avoid warning?
		return fmt.Errorf("data set %d not found in pdp_data_sets", resolvedDataSetId.Int64)
	}
}
Contributor

I would prefer not to try create_watch here as well. Just scan from the DB and, if it is NULL, skip processing addPiece for this tipset. But that might cause an unnecessary delay of 1 tipset. Any other ideas for doing this without 2 watches?

Collaborator Author

Just scan from the DB and, if it is NULL, skip processing addPiece for this tipset

That is what I'm doing here. If an add_piece gets triggered and the dataset doesn't exist yet, we leave it alone and wait for another trigger once the dataset is created.
A cleaner way might be to have create_watch processed with a higher priority than add_pieces. This could be achieved by combining them into one watcher and executing them sequentially, with create first.
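
A minimal sketch of that ordering idea; the handler names are hypothetical placeholders, not the watcher code from this PR:

package main

import (
	"context"
	"fmt"
)

// processCreates and processAdds are hypothetical stand-ins for the two
// existing watchers; combining them in one pass guarantees that data set
// creations are applied before the piece adds that depend on them.
func processCreates(ctx context.Context) error { fmt.Println("resolve data set creations"); return nil }
func processAdds(ctx context.Context) error    { fmt.Println("resolve piece adds"); return nil }

// combinedWatcherPass runs the create handler first so that, within a single
// chain notification, a newly created data set is already in pdp_data_sets by
// the time the corresponding piece adds are processed.
func combinedWatcherPass(ctx context.Context) error {
	if err := processCreates(ctx); err != nil {
		return err
	}
	return processAdds(ctx)
}

func main() {
	if err := combinedWatcherPass(context.Background()); err != nil {
		fmt.Println("watcher pass failed:", err)
	}
}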

Collaborator Author

After processing them sequentially through the combined watcher, I think we could obtain the dataset ID from pdp_data_sets based on the create_message_hash.
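
For illustration, that lookup could look roughly like this, assuming pdp_data_sets carries a create_message_hash column; the query and helper are a sketch, not code from the PR:

package main

import (
	"context"
	"database/sql"
	"errors"
	"fmt"
)

// dataSetIDByCreateMessage resolves the data set id recorded for the
// creation transaction, once the create watcher has filled it in.
func dataSetIDByCreateMessage(ctx context.Context, db *sql.DB, createMsgHash string) (int64, error) {
	var id int64
	err := db.QueryRowContext(ctx, `
		SELECT id
		FROM pdp_data_sets
		WHERE create_message_hash = $1`, createMsgHash).Scan(&id)
	if errors.Is(err, sql.ErrNoRows) {
		return 0, fmt.Errorf("data set for create message %s not known yet", createMsgHash)
	}
	if err != nil {
		return 0, err
	}
	return id, nil
}

func main() {
	fmt.Println("call dataSetIDByCreateMessage with a live *sql.DB")
}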

@Kubuxu Kubuxu force-pushed the pdp/create-and-upload branch 3 times, most recently from c18dbf2 to f8f7507 on October 22, 2025 14:02
@Kubuxu Kubuxu changed the base branch from pdpv0 to pdpNext October 22, 2025 14:03
@Kubuxu Kubuxu changed the title from "[WIP] feat(pdp): support create and upload" to "feat(pdp): support create and add endpoint" Oct 22, 2025
Member

these files probably should be renamed too 😬

Collaborator Author

Outside of this PR it doesn't cause any 'damage' right now, and given that it is in the pdpNext branch, I don't want extra conflicts.

@rjan90 rjan90 added the team/fs-wg Items being worked on or tracked by the "FS Working Group". See FilOzone/github-mgmt #10 label Oct 23, 2025
@FilOzzy FilOzzy added this to FS Oct 23, 2025
@github-project-automation github-project-automation bot moved this to 📌 Triage in FS Oct 23, 2025
	SubPieceSize  int64  `db:"sub_piece_size"`
	PDPPieceRefID int64  `db:"pdp_pieceref"`
	AddMessageOK  *bool  `db:"add_message_ok"`
	PDPDataSetId  uint64 `db:"data_set"`
Member

hah, what was going on here ..

@rvagg rvagg commented Oct 23, 2025

looks fine overall, just a bunch of relatively minor items in this review pass

@Kubuxu Kubuxu force-pushed the pdp/create-and-upload branch from 24f9639 to d087dce on October 23, 2025 11:30
@github-project-automation github-project-automation bot moved this from 📌 Triage to ✔️ Approved by reviewer in FS Oct 23, 2025
@Kubuxu Kubuxu merged commit a53ee5f into pdpNext Oct 23, 2025
15 checks passed
@Kubuxu Kubuxu deleted the pdp/create-and-upload branch October 23, 2025 13:12
@github-project-automation github-project-automation bot moved this from ✔️ Approved by reviewer to 🎉 Done in FS Oct 23, 2025
rvagg pushed a commit that referenced this pull request Oct 29, 2025
* feat(pdp): support create and upload

Signed-off-by: Jakub Sztandera <[email protected]>

* use nullable data_set id in pdp_data_set_piece_adds to bypass FK

Signed-off-by: Jakub Sztandera <[email protected]>

* combine pdp watchers for optimistic sequence create and add

Signed-off-by: Jakub Sztandera <[email protected]>

* Rename the migration file to trigger migration

Signed-off-by: Jakub Sztandera <[email protected]>

* Check error for insertPieceAdds in combined flow

Signed-off-by: Jakub Sztandera <[email protected]>

* pieceAddHandler: Remove WHERE dataSet = $id because it doesn't work with NULL

Signed-off-by: Jakub Sztandera <[email protected]>

* lint clean up and overwrite possible bad entry.DataSet

Signed-off-by: Jakub Sztandera <[email protected]>

* use decodeExtraData

Signed-off-by: Jakub Sztandera <[email protected]>

* log when transformAddPiecesRequest fails

Signed-off-by: Jakub Sztandera <[email protected]>

* Rename proofset -> data set in new files

Signed-off-by: Jakub Sztandera <[email protected]>

---------

Signed-off-by: Jakub Sztandera <[email protected]>
Development

Successfully merging this pull request may close these issues.

PDPv0: uploadAndCreate
