Safely dump unknown strings. #873

@samuel-williams-shopify

Description

Sometimes we produce JSON from unknown sources, e.g. system-supplied strings that could contain invalid byte sequences. Typical cases are producing JSON for a logging system or capturing details of an operation (the result of an HTTP request, RPC, etc.).

It's tricky to do this correctly. I've found that things often look good in testing but fail in non-test environments because of invalid strings.

It would be nice if JSON.dump could be a bit more robust by default. Rather than failing when it encounters malformed UTF-8, it could re-encode every string and replace the invalid bytes, e.g. string.encode(Encoding::UTF_8, invalid: :replace, undef: :replace).
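To make the failure concrete, here is a minimal sketch of the current behaviour and of the replacement approach described above. The sample string and variable names are made up for illustration; only JSON.dump and String#encode come from the issue.

```ruby
require "json"

# A UTF-8 tagged string that ends with an invalid byte (0xFF).
broken = "caf\xFF"

# Dumping it as-is is rejected by the generator.
failure = begin
  JSON.dump({"message" => broken})
  nil
rescue JSON::GeneratorError => error
  error
end

# The replacement approach from the issue: invalid bytes become U+FFFD.
repaired = broken.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
json = JSON.dump({"message" => repaired})
```

After this runs, failure holds the raised JSON::GeneratorError, while json is a valid document whose message ends in the Unicode replacement character.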

Such a feature could be opt-in, e.g.

JSON.dump(thing, encode: true) # or safe: true or something else
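Until something like this exists, the opt-in behaviour can be approximated outside the gem. The sketch below is an assumption about how it might work, not the json gem's API: sanitize and safe_dump are hypothetical helpers, and String#scrub is swapped in for strings already tagged as UTF-8.

```ruby
require "json"

# Hypothetical helper (not part of the json gem): returns a copy of the
# value in which every string is valid UTF-8.
def sanitize(value)
  case value
  when String
    if value.encoding == Encoding::UTF_8
      # Already tagged UTF-8: replace only the invalid bytes.
      value.valid_encoding? ? value : value.scrub
    else
      # Other encodings (including binary): transcode, replacing what
      # cannot be converted.
      value.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
    end
  when Hash
    value.to_h { |key, item| [sanitize(key), sanitize(item)] }
  when Array
    value.map { |item| sanitize(item) }
  else
    value
  end
end

# Hypothetical stand-in for JSON.dump(thing, encode: true).
def safe_dump(object)
  JSON.dump(sanitize(object))
end
```

For example, safe_dump({"body" => "ok\xFF"}) returns JSON instead of raising, with the invalid byte replaced by U+FFFD. Walking the structure up front costs an extra pass over the data, which is one reason doing this inside the generator would be preferable.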

Here is an example implementation that I use for logging: https://github.com/socketry/console/blob/main/lib/console/format/safe.rb. It's a bit more elaborate, as it handles even more safety issues.
