Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Term and LeafAnswer serde serialization #2707

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from
Draft

Conversation

bakaq
Copy link
Contributor

@bakaq bakaq commented Dec 14, 2024

As a lot of things moved around, I thought making a new PR would be easier than rebasing #2493. This also gives us a clean slate to discuss. The old PR was getting a bit too big, and Github is notoriously bad for big threads of discussions in issues/PRs.

This basically gets to the same point that #2493 was, but with a change to rationals following #2505 (comment) (basically now they are serialized as {"rational": {"numerator": ..., "denominator": ...}}), and also without migrating the integration tests. I believe this will still change a bunch so I defer migrating the integration tests (which depend on serialization) to after this gets merged. This also doesn't have the deserialization from #2505.

@jjtolton, if you really want to continue the JSON API route in #2465 you should rebase onto this. I feel maybe just doing C APIs for dealing with Term (and with that wrapping libraries can do JSON themselves in whatever way they want) will be a faster path to merging than waiting to reach consensus on the serialization format here.

Most relevant points in the discussion until now, to recap:

Also, tip when writing JSON on Github: use json5 in codeblocks. It allows things like comments without freaking out. Compare:

// json
{ "a": 1 }
// json5
{ "a": 1 }

@guregu
Copy link

guregu commented Feb 14, 2025

My 2c having hacked out a similar thing for trealla-js12: I ended up with pretty much the same structure as #2493 (comment), which I'm pretty happy with. For integers, I serialize directly to number within the JS safe integer range (±9007199254740991), otherwise as object-wrapped strings which become bigint on the JS side. I did a similar thing with trealla-go and int64 vs *big.Int. In general it's kind of annoying to work with bigints in those languages, so making the more common small integers a regular number type is desirable.

Footnotes

  1. encoding: https://github.com/guregu/trealla/blob/main/library/wasm.pl

  2. decoding: https://github.com/guregu/trealla-js/blob/9c9cc972c795895fde93bf02783cb73935e3bcdc/src/term.ts#L220

@jjtolton
Copy link

trealla fam! ❤️ I didn't even know trealla-js existed, so cool! Love the TS bindings. Great work and thanks for the information! 👀

@guregu
Copy link

guregu commented Feb 14, 2025

Thanks @jjtolton! One of my goals is to eventually provide a Typescript library where the backend implementation can be swapped between Trealla/Scryer, and I'd be happy to change my JSON internals to match what Scryer decides to make things easier to do that in the future. Just wanted to chime in that the path Scryer is taking looks good to me.

@jjtolton
Copy link

That's great! I'm working on a similar thing with libscryer-clj. @bakaq is doing some phenomenal work with WASM right now, he even "beat me to the punch"! After that I'd like to add ClojureScript support for libscryer-clj.

But ultimately I'd like to support Trealla as well!

I'm working on embedding Prolog into video games, and if I ever make a console port, I think I will need to use Trealla!

I hope we get to chat more about it.

@bakaq
Copy link
Contributor Author

bakaq commented Feb 14, 2025

[...] otherwise as object-wrapped strings which become bigint on the JS side.

In #2825 I just made everything into BigInt, but that is just a temporary shortcut. The problem with this on this issue here is that there is no big int type in JSON (and other serialization encodings). In languages where there are big ints it seems to be trivial to convert between a string and them, so I still think big integers (and therefore also rationals) should be serialized as tagged strings which can then be converted into a big int by the consumer.

{ "type": "integer", "value": 100 }
{ "type": "integer", "value": "100000000000000000000000000000" }

@guregu
Copy link

guregu commented Feb 14, 2025

[...] otherwise as object-wrapped strings which become bigint on the JS side.

In #2825 I just made everything into BigInt, but that is just a temporary shortcut. The problem with this on this issue here is that there is no big int type in JSON (and other serialization encodings). In languages where there are big ints it seems to be trivial to convert between a string and them, so I still think big integers (and therefore also rationals) should be serialized as tagged strings which can then be converted into a big int by the consumer.

{ "type": "integer", "value": 100 }
{ "type": "integer", "value": "100000000000000000000000000000" }

I agree. I’ve seen such an encoding scheme used in JSON-based document databases (DynamoDB) for the same reason.
I didn’t wrap small numbers or floats in an object, which makes parsing in JS slightly easier, but makes life harder for languages with stricter type systems. When in doubt wrapping seems like the best solution; it’s easy for client libraries to convert types especially when they are explicitly tagged. Looks good to me!

@bakaq
Copy link
Contributor Author

bakaq commented Feb 14, 2025

Well, serializing small ints as just 100 instead of { "type": "integer", "value": 100 } seems worth the trouble actually.

@guregu
Copy link

guregu commented Feb 14, 2025

Well, serializing small ints as just 100 instead of { "type": "integer", "value": 100 } seems worth the trouble actually.

I think it’s nice, the only potential issue is some ambiguity with floats if they also get serialized as non-wrapped numbers. JS treats them the same of course so it’s no problem there. Tagging them can provide a hint to typed languages to avoid having to do heuristics (“does it have a dot?”). This is basically ambiguity built into JSON so I don’t think there’s a right or wrong answer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants