feat: use TextEncoder and TextDecoder for utf8 strings #4513
master, af1a9c9 | seia-soto:textencoder, 1de005e
- ~65535 ASCII only characters
> 147.50420889870574 / 156.1296767089117 // benchEngineDeserialization
0.944754463135876
> 147.50420889870574 / 148.7394726802865 // benchEngineSerialization
0.9916951179177841
seia-soto:textencoder, 65764e9 | master, de7bfb5
This PR is awaiting final review. Further changes are expected to be categorized as performance improvements.
What about other uses of encode in this file?
@@ -20,6 +20,9 @@ export const EMPTY_UINT32_ARRAY = new Uint32Array(0);
// Check if current architecture is little endian
const LITTLE_ENDIAN: boolean = new Int8Array(new Int16Array([1]).buffer)[0] === 1;

// TextEncoder doesn't need to be recreated every time unlike TextDecoder
const TEXT_ENCODER = new TextEncoder();
alternatively you could do:
const encoder = new TextEncoder();
const encode = encoder.encode.bind(encoder);
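A minimal sketch of the bound-method suggestion above, under stated assumptions: the name `encodeUtf8` is hypothetical and not taken from the PR. Binding matters because a bare `encoder.encode` reference loses its `this` when called on its own.

```typescript
// Hypothetical names for illustration; the PR itself may use different ones.
const TEXT_ENCODER = new TextEncoder();

// bind() fixes `this` to the shared encoder, so the function can be
// passed around or called standalone without creating a new TextEncoder.
const encodeUtf8 = TEXT_ENCODER.encode.bind(TEXT_ENCODER);

// 'é' takes two bytes in UTF-8, so these 5 characters become 6 bytes.
const bytes = encodeUtf8('héllo');
```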
encode is already exported by punycode. Would you like to have another variable name?
Co-authored-by: Krzysztof Modras <[email protected]>
Let's add a test for the exact filter that was a problem in #4424.
Also, it would be great to have a fuzz test setup to test serialize/deserialize on random data.
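One way such a fuzz test could look, as a rough sketch only. The project's real serialize/deserialize functions are not shown in this conversation, so a plain `TextEncoder`/`TextDecoder` round trip stands in for them here; `seededRandom`, `randomString`, `roundTrip`, and `fuzzRoundTrip` are all hypothetical names.

```typescript
// Small deterministic PRNG (mulberry32-style) so a failing run can be replayed.
function seededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Builds a string of random code points. Lone surrogates are replaced with
// U+FEFF because TextEncoder would turn them into U+FFFD and break equality.
function randomString(rand: () => number, maxLength: number): string {
  const length = Math.floor(rand() * (maxLength + 1));
  let out = '';
  for (let i = 0; i < length; i++) {
    let codePoint = Math.floor(rand() * 0x110000);
    if (codePoint >= 0xd800 && codePoint <= 0xdfff) codePoint = 0xfeff;
    out += String.fromCodePoint(codePoint);
  }
  return out;
}

const encoder = new TextEncoder();
// ignoreBOM: true keeps a leading U+FEFF in the decoded output.
const decoder = new TextDecoder('utf-8', { ignoreBOM: true });

// Stand-in for the real serialize + deserialize pair.
function roundTrip(input: string): string {
  return decoder.decode(encoder.encode(input));
}

// Returns every input that did not survive the round trip unchanged.
function fuzzRoundTrip(iterations: number, seed: number): string[] {
  const rand = seededRandom(seed);
  const failures: string[] = [];
  for (let i = 0; i < iterations; i++) {
    const input = randomString(rand, 64);
    if (roundTrip(input) !== input) failures.push(input);
  }
  return failures;
}

const failures = fuzzRoundTrip(1000, 42);
```

In a real test, `roundTrip` would call the engine's own serialization code, and the seed would be logged so that a failing case can be reproduced.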
f7d75c0 to 3bfc064
chore: add description to smaz structure
fix: out of range error
fix: ASCII and Smaz is not compatible
chore: remove unused padding
fixes #4424
This PR replaces punycode encoder and decoder with TextEncoder and TextDecoder for utf8 strings.
- \ufeff should be skipped when decoding to ensure the original form
- Uint8Array.subarray doesn't copy the array but provides a direct interface to subarray
- TextEncoder.encodeInto doesn't produce EOL character (NULL, U+0000) but we don't care because TextDecoder can stop nicely when provided buffer ends
To be safe:
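The notes in the description about \ufeff, Uint8Array.subarray, and TextEncoder.encodeInto can be illustrated with a short sketch. This is an illustration under assumptions, not code from the PR: the buffer size, the offset, and all variable names are made up, and reading the \ufeff note as "use `ignoreBOM: true`" is an interpretation.

```typescript
const encoder = new TextEncoder();
// ignoreBOM: true tells the decoder not to strip a leading U+FEFF,
// so the decoded string keeps its original form.
const decoder = new TextDecoder('utf-8', { ignoreBOM: true });

const buffer = new Uint8Array(32); // hypothetical serialization buffer
const offset = 4;                  // hypothetical write position
const text = '\ufeffhi';

// subarray() returns a view over the same memory, so encodeInto writes
// straight into `buffer` without an intermediate copy.
const { read, written } = encoder.encodeInto(text, buffer.subarray(offset));

// No NUL terminator is appended: only the encoded bytes are written
// (3 for U+FEFF plus 2 for "hi"). Decoding a view of exactly `written`
// bytes ends cleanly at the end of that view.
const decoded = decoder.decode(buffer.subarray(offset, offset + written));

// For comparison, a default decoder strips the leading U+FEFF.
const stripped = new TextDecoder().decode(buffer.subarray(offset, offset + written));
```

Storing `written` next to the bytes is what lets the reader side build the exact view to decode; without it there is no terminator to find.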