Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: Create number from string in System.Text.Json.Nodes #111548

Open
Joy-less opened this issue Jan 17, 2025 · 7 comments
Open

[API Proposal]: Create number from string in System.Text.Json.Nodes #111548

Joy-less opened this issue Jan 17, 2025 · 7 comments
Assignees
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Text.Json untriaged New issue has not been triaged by the area owner

Comments

@Joy-less
Copy link

Joy-less commented Jan 17, 2025

Background and motivation

System.Text.Json.Nodes.JsonNode is a very powerful API to build up a JsonElement dynamically. However, it has some limitations that prevent custom parsers for JSON-compatible formats outside System.Text.Json (e.g. JSON5, HJSON, HOCON) from achieving true interoperability with System.Text.Json.

A primitive value node can be created like this:

JsonValue.Create(123); // number node
JsonValue.Create("abc"); // string node

However, there is no way to create a number value from a string without first parsing it into a fixed-size value. This is a problem for values outside of the range limits of those values (e.g. numbers larger than 2^63-1).

This means you're left with three options:

  1. Parse the number as JSON. This can be done using JsonSerializer.Deserialize<JsonNode>("123"). However, this is not performant (the number has to be re-parsed) and it doesn't support numbers in non-standard formats, meaning parsers for formats like JSON5 and HJSON must normalise the number (e.g. hexadecimal must be converted to decimal), and you lose data.
  2. Parse the number as a fixed-size number. This can be done using JsonValue.Create(int.Parse("123")). Again, this is not performant, but the primary issue is that it doesn't support very large or very precise numbers.
  3. Create a string value and use JsonNumberHandling.AllowReadingFromString. This can be done using JsonValue.Create("123"). This is likely the best option, but it provides no way to distinguish between strings and numbers (e.g. there will be no difference between 123 and "123").

API Proposal

namespace System.Text.Json.Nodes;

public abstract class JsonValue : JsonNode {
    /// <summary>
    /// Similar to JsonValue.Create(string, JsonNodeOptions), but creates a node for a specific type (e.g. number) instead of a string node.<br/>
    /// The value is not validated by design, so the caller should check that the value is valid.
    /// </summary>
    public static JsonValue Create(JsonTokenType tokenType, string value, JsonNodeOptions? options = null);
}

API Usage

JsonValue.Create(123); // number node
JsonValue.Create("abc"); // string node
JsonValue.Create(JsonTokenType.Number, "123"); // number node (NEW API)

Alternative Designs

JsonValue.Create("123", asNumber: true);
JsonValue.CreateNumber("123");
JsonValue.CreateNumberFromString("123");
JsonValue.CreateStringNumber("123");
new CustomJsonValueNumber("123"); // special type inheriting from JsonValue

Risks

If the API is misused, valid JsonNodes could contain poorly-formatted or invalid values. Different APIs using JsonNode may have different ideas on what values are valid. For example, some implementations may create number nodes with trailing decimal points (5.), while others throw an exception trying to deserialize this.

Use Cases

See my HJSON/JSON5 parser: https://github.com/Joy-less/HjsonSharp

@Joy-less Joy-less added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Jan 17, 2025
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jan 17, 2025
@gregsdennis
Copy link
Contributor

JsonValue.Create("123"); // number node

This doesn't create a JSON number. It creates a string containing only digits.

You want

JsonValue.Create(123); // number node

https://learn.microsoft.com/en-us/dotnet/api/system.text.json.nodes.jsonvalue.create?view=net-9.0#system-text-json-nodes-jsonvalue-create(system-int32-system-nullable((system-text-json-nodes-jsonnodeoptions)))

Moreover, there are implicit casts that do this for you.

JsonNode value = 123;

@Joy-less
Copy link
Author

@gregsdennis Apologies, that was a typo that has been fixed. Please see the rest of the proposal for info on why that API isn't enough.

@gregsdennis
Copy link
Contributor

Also pertinent to this discussion is that the JsonElement model does withhold applying data type limits until the value is requested.

For example

// the element holds the value as a span of the source string
// and separately recognizes it as a JSON number
var value = JsonDocument.Parse("123").RootElement;

// the value is only converted to the requested type here
var asInt32 = value.GetInt32();
//or
var asInt16 = value.GetInt16();

@jeffhandley
Copy link
Member

Assigned to @PranavSenthilnathan for triage and consideration.

@PranavSenthilnathan
Copy link
Member

The risks you shared for allowing arbitrary formatting (e.g. hex) in JsonNodes are concerning. Is perf the only downside to first processing the input string with custom formatting aware logic and then calling JsonSerializer.Deserialize/JsonNode.Parse on the result?

For perf, what are the bottlenecks that you run into? Currently we materialize a JsonElement for JsonSerializer.Deserialize/JsonNode.Parse which requires transcoding the input string to UTF-8 and creating the JsonDocument internally, so maybe that can be simplified and sped up in the case of parsing a number/primitive value from a string. Alternatively, we could consider a JsonValue.ParseNumber API or similar.

@Joy-less
Copy link
Author

@PranavSenthilnathan Hi. Since posting this I've realized that it's better to standardize the number before putting it into a JsonElement. That's how strings work after all; it doesn't matter whether triple-quoted or single-quoted strings are used.

However, having to use JsonSerializer.Deserialize is a shame performance- and clarity-wise. I think a JsonValue.ParseNumber API could be a good idea. Alternatively, an API like JsonValue.Parse(IEnumerable<JsonTokenType> allowedTokenTypes) could be better for scalability.

@Joy-less
Copy link
Author

Could the generic math feature added in net7.0 work here?

public static JsonValue? Create<T>(T? value, JsonNodeOptions? options = null) where T : INumberBase<T>;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Text.Json untriaged New issue has not been triaged by the area owner
Projects
None yet
Development

No branches or pull requests

4 participants