Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some questions that occured to me while reading this spec #2

Open
Gozala opened this issue Dec 16, 2023 · 3 comments
Open

Some questions that occured to me while reading this spec #2

Gozala opened this issue Dec 16, 2023 · 3 comments

Comments

@Gozala
Copy link

Gozala commented Dec 16, 2023

  1. IPLD schema says that entity id is bytes, but then int is used to in the examples. This is really confusing and I'm not sure which one out of two is correct

pomo-ipld/README.md

Lines 51 to 64 in 6548480

``` js
// DAG-JSON
[123, "name/last", "Monroe", [{"/": "bafyreiaajfbxfnbbdbhvxmowe6t63ytsimv4daiitv5gkqetwrpww5zmsy"}]]
```
``` rust
// Rust
Fact {
attribute: "name/last",
caused_by: [Link("bafyreiaajfbxfnbbdbhvxmowe6t63ytsimv4daiitv5gkqetwrpww5zmsy")],
entity_id: 123,
value: "Monroe",
}
```

  1. Could you please provide some rational for the 128bit limit rational ? I'm under impression that perhaps use ofy u128 was intention which might explain my first question

Restricting an entity ID to 128-bits is RECOMMENDED.

  1. Should CausedBy canonicalization be prescribed, or better yet be defined as a Map<CIDBytes, Unit> to ensure that same fact will hash the same ?

pomo-ipld/README.md

Lines 105 to 109 in 6548480

Links to other facts MUST be placed in an array in the `causedBy` field.
``` ipldsch
type CausedBy = [&Fact]
```

  1. I think something is missing in this sentence, is trivial to handle ?

the same fact entered into the store twice is trivial for operations that only depend on the graph structure of the store

  1. Perhaps it's better idea to recommend using multihashes instead ? IPFS stack through experience settled on using multihashes (stripping codec and cid tags) for block stores as that information usually does not affect most operations. In case of DB it makes little sense IMO to have even hashing algorithm retained, individual DB could probably choose whatever algorithm it cares and use untagged digests for keys. When communicating with others DB could choose to add prefixes etc...

pomo-ipld/README.md

Lines 145 to 153 in 6548480

Certain aggregate functions (e.g. counts, sums, averages) and stateful queries (e.g. graph colorings) depend on a node being present no more than once per graph. Deduplication is thus imperative for many use cases. It is RECOMMENDED that all facts added to a store have a canonical CID. This MAY be of any CID configuration. To reduce the amount of recomputation, using the following parameters are RECOMMENDED:
| Parameter | Recommended Setting |
|--------------|---------------------|
| CID Version | [CIDv1] |
| [Multicodec] | [DAG-CBOR] |
| [Multihash] | [SHA2-256] |
Note that the [multibase] of a CID is defined by the codec and CID version.

@Gozala
Copy link
Author

Gozala commented Dec 16, 2023

I also have been talking to @expede on discord about what were limits for causedBy and how to handle cases where you may need to have more than that number. I think spec should probably cover that

@expede
Copy link
Member

expede commented Dec 17, 2023

Yep agreed! The spec is very WIP and Fission has it on hold. PRs welcome :)

@Gozala
Copy link
Author

Gozala commented Dec 17, 2023

I'm interested in contributing, having above questions answered would enable me to make align with the vision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants