Use JSON snapshots for all code size tests. NFC #24672

RReverser · 2025-07-09T19:41:01Z

Right now, code size tests in the codebase use two different formats - either everything in a single JSON file, or multiple single-line files.

I found the discrepancy a bit annoying and wanted to unify them.

Between the two, the one with individual files tends to be harder to review: GitHub (as well as local git tools) adds quite a lot of boilerplate UI around each file, you need to manually scan filenames to understand which size diffs are related to each other, and overall it is more difficult to scan quickly as a human.

In this PR I'm unifying both into a single JSON file format. Other supplementary metadata (imports, sent, etc.) goes into the same JSON as well, so you can see all relevant context together during reviews.
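For illustration, a combined snapshot could be produced along the lines of the sketch below. The helper name `build_snapshot`, the metadata values, and the numbers are assumptions made up for this example; the file-name keys and the `total` entry only mirror names that appear in the test log later in this thread, and the real schema in the PR may differ.

```python
import json


def build_snapshot(sizes, metadata=None):
    """Merge per-file sizes and optional metadata lists into one dict.

    Hypothetical helper, not the code from this PR.
    """
    snapshot = dict(sizes)
    # A derived total makes the overall change visible at a glance.
    snapshot["total"] = sum(sizes.values())
    if metadata:
        snapshot.update(metadata)
    return snapshot


# Made-up values, for illustration only.
snapshot = build_snapshot(
    {"a.out.js": 1200, "a.out.nodebug.wasm": 3400},
    {"imports": ["env.fd_write"], "sent": ["fd_write"]},
)
print(json.dumps(snapshot, indent=2))
```

With everything in one object, a reviewer sees a size change and any change to the symbol lists in the same diff hunk.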

sbc100

Thanks for doing this! I've been thinking about doing this myself for a while now.

I really like the idea, especially for the code sizes themselves. I'm less sure about the symbol lists since the extra syntax of quotes and commas makes them harder to read than flat files.

Would you consider using JSON only for the file sizes and continue to use flat files for symbol lists? Or maybe it's not worth it?

test/test_other.py

RReverser · 2025-07-09T21:29:12Z

> Would you consider using JSON only for the file sizes and continue to use flat files for symbol lists? Or maybe it's not worth it?

I could, but I find it kind of nice to have everything in the same context, for the same reason it's nice to have different sizes in the same context. E.g. there were times when code size increased because some symbol was accidentally exported that shouldn't have been, and then the symbol list is helpful for debugging the increase.

I guess we could output YAML / TOML / custom text output if the quotes and commas are the concern?

sbc100 · 2025-07-09T22:02:01Z

> > Would you consider using JSON only for the file sizes and continue to use flat files for symbol lists? Or maybe it's not worth it?
>
> I could, but I find it kind of nice to have everything in the same context, for the same reason it's nice to have different sizes in the same context. E.g. there were times when code size increased because some symbol was accidentally exported that shouldn't have been, and then the symbol list is helpful for debugging the increase.
>
> I guess we could output YAML / TOML / custom text output if the quotes and commas are the concern?

No, this seems fine to me if it's useful to have it all together.

sbc100 · 2025-07-09T22:38:09Z

test/common.py

@@ -1694,14 +1697,76 @@ def assertBinaryEqual(self, file1, file2):
    self.assertEqual(read_binary(file1),
                     read_binary(file2))

-  def check_expected_size_in_file(self, desc, filename, size):
+  def check_output_sizes(self, outputs: List[str], metadata=None):
+    test_name = self.id().split('.')[-1]


I guess it makes a lot of sense to set the filename automatically like this, but it would break if we ever had two different test suites with the same test name... I guess that is very unlikely and we would notice right away.

FWIW that's how existing logic already worked in run_codesize_test, I just copied it here.
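The collision concern can be shown with a small standalone sketch. The `self.id().split('.')[-1]` expression is taken from the diff above; the `snapshot_filename` helper, the `.json` naming, and the two suite classes are hypothetical and exist only for this example.

```python
import unittest


def snapshot_filename(test_case):
    # TestCase.id() returns "module.Class.method"; keep only the method name.
    test_name = test_case.id().split('.')[-1]
    return f"{test_name}.json"  # hypothetical naming scheme


class SuiteA(unittest.TestCase):
    def test_codesize_hello(self):
        pass


class SuiteB(unittest.TestCase):
    def test_codesize_hello(self):
        pass


name_a = snapshot_filename(SuiteA("test_codesize_hello"))
name_b = snapshot_filename(SuiteB("test_codesize_hello"))
# Both suites map to the same snapshot file because the class name is dropped.
print(name_a, name_b)
```

Keeping the class name in the filename would avoid the collision, at the cost of longer names; the thread treats the collision as unlikely enough not to matter.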

RReverser · 2025-07-10T23:31:19Z

(I'll do final rebaseline to fix CI before merging)

RReverser · 2025-07-11T19:57:42Z

@sbc100 Anything left on this one? It would be nice to land it before e.g. #24679, as it would make that diff much more compact :)

RReverser · 2025-07-11T20:07:06Z

~~Oh damn I just accidentally added unnecessary files, fixing up...~~ Done.

RReverser · 2025-07-12T20:49:13Z

@sbc100 I think the CI check expectations script doesn't correctly handle removed/added tests, as it tries to find the same test both before and after this PR and fails.

What is the solution in such cases? Manual merge?

sbc100 · 2025-07-14T19:26:08Z

> @sbc100 I think the CI check expectations script doesn't correctly handle removed/added tests, as it tries to find the same test both before and after this PR and fails.
>
> What is the solution in such cases? Manual merge?

Fair enough, we can bypass that check for this PR.

RReverser · 2025-07-15T00:25:41Z

> Fair enough, we can bypass that check for this PR.

Can you please do it? I don't have the bypass rights :(

RReverser · 2025-07-15T00:28:07Z

I rebaselined & fixed conflicts again, but it might need yet another rebaseline right before you merge it.

RReverser · 2025-07-21T17:33:57Z

Ping.

sbc100 · 2025-07-21T18:04:39Z

lgtm!

Do we need another rebase or can we just land this now?

sbc100 · 2025-07-21T18:04:58Z

Oh, we do need another rebase (I see conflicts).

RReverser · 2025-07-21T19:41:31Z

> Oh, we do need another rebase (I see conflicts).

Yeah see above:

> Can you please do it? I don't have the bypass rights :(

> I rebaselined & fixed conflicts again, but it might need yet another rebaseline right before you merge it.

I basically have to keep rebaselining, but I don't have rights to merge it, so unless you merge right after my rebaseline, it just goes in a loop 😅

I can do yet another rebaseline and hope you're free to merge soon, or can I leave it to you to run it locally right before merging - which is easier?

sbc100 · 2025-07-21T20:03:36Z

Can you do one more rebase now? If it doesn't auto-merge, I'll take a look again before EOD.

RReverser · 2025-07-21T21:30:32Z

I think it still won't auto-merge because of

> I think the CI check expectations script doesn't correctly handle removed/added tests, as it tries to find the same test both before and after this PR and fails.

but yeah, I'll do one more.

RReverser · 2025-07-21T22:36:10Z

Hm I accidentally re-added deleted files during rebase. Trying again.

RReverser · 2025-07-21T22:46:23Z

Done.

RReverser · 2025-07-21T23:37:11Z

I rebased but I see test-mac-arm64 CI reporting a lot of

node:internal/child_process:420
    throw new ErrnoException(err, 'spawn');
    ^

Error: spawn Unknown system error -86
    at ChildProcess.spawn (node:internal/child_process:420:11)
    at spawn (node:child_process:753:9)
    at Compiler.run (/Users/distiller/project/node_modules/google-closure-compiler/lib/node/closure-compiler.js:76:26)
    at Object.<anonymous> (/Users/distiller/project/node_modules/google-closure-compiler/cli.js:83:10)
    at Module._compile (node:internal/modules/cjs/loader:1730:14)
    at Object..js (node:internal/modules/cjs/loader:1895:10)
    at Module.load (node:internal/modules/cjs/loader:1465:32)
    at Function._load (node:internal/modules/cjs/loader:1282:12)
    at TracingChannel.traceSync (node:diagnostics_channel:322:14)
    at wrapModuleLoad (node:internal/modules/cjs/loader:235:24) {
  errno: -86,
  code: 'Unknown system error -86',
  syscall: 'spawn'
}

Node.js v22.16.0
building:WARNING: falling back to java version of closure compiler
node:internal/child_process:420
    throw new ErrnoException(err, 'spawn');
    ^

That's not related to my changes, is it?

sbc100 · 2025-07-22T00:06:35Z

> I rebased but I see test-mac-arm64 CI reporting a lot of
>
> node:internal/child_process:420
>     throw new ErrnoException(err, 'spawn');
>     ^
>
> Error: spawn Unknown system error -86
>     at ChildProcess.spawn (node:internal/child_process:420:11)
>     at spawn (node:child_process:753:9)
>     at Compiler.run (/Users/distiller/project/node_modules/google-closure-compiler/lib/node/closure-compiler.js:76:26)
>     at Object.<anonymous> (/Users/distiller/project/node_modules/google-closure-compiler/cli.js:83:10)
>     at Module._compile (node:internal/modules/cjs/loader:1730:14)
>     at Object..js (node:internal/modules/cjs/loader:1895:10)
>     at Module.load (node:internal/modules/cjs/loader:1465:32)
>     at Function._load (node:internal/modules/cjs/loader:1282:12)
>     at TracingChannel.traceSync (node:diagnostics_channel:322:14)
>     at wrapModuleLoad (node:internal/modules/cjs/loader:235:24) {
>   errno: -86,
>   code: 'Unknown system error -86',
>   syscall: 'spawn'
> }
>
> Node.js v22.16.0
> building:WARNING: falling back to java version of closure compiler
> node:internal/child_process:420
>     throw new ErrnoException(err, 'spawn');
>     ^
>
> That's not related to my changes, is it?

Are those actually causing errors? I think that is due to the arm bot not being able to execute the x86 binary for closure-compiler (i.e. no Rosetta installed?). It's supposed to fall back to Java, I think.

sbc100 · 2025-07-22T00:07:43Z

The real error looks like it's coming from test_codesize_hello_dylink_all. I recently added that, but was thinking maybe I should revert it since it will be very sensitive.

RReverser · 2025-07-22T00:09:39Z

> The real error looks like it's coming from test_codesize_hello_dylink_all.

Yeah I wasn't sure if that's related to different Closure binary or not.

I did emsdk install tot + emcc --clear-cache before rebaselining locally, not sure why else CI would have different sizes for these couple of tests.

sbc100 · 2025-07-22T00:12:47Z

test_codesize_hello_dylink_all is highly sensitive since it includes basically all symbols.

RReverser · 2025-07-22T00:22:37Z

So should I do anything else here?

sbc100 · 2025-07-22T01:05:55Z

The reason the test failure is hard to read is that it is exceeding the limits of self.assertEqual.

Can you do this:

obtained_results = json.dumps(obtained_results, indent=2)
expected_results = json.dumps(expected_results, indent=2)
self.assertTextDataIdentical(obtained_results, expected_results)

Our assertTextDataIdentical is better at diffing large things and showing just what changed.
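The benefit of comparing pretty-printed text can be seen in a standalone sketch. It uses the standard library's `difflib` as a stand-in, since `assertTextDataIdentical` is a helper from the project's own test harness and its internals are not shown in this thread; the function name `json_text_diff` is an assumption for the example, and the sizes are taken from the CI log quoted further down.

```python
import difflib
import json


def json_text_diff(expected, obtained):
    """Return unified-diff lines between two JSON-serializable objects."""
    # indent=2 puts every key on its own line, so one changed value
    # produces a one-line diff instead of a dump of both whole objects.
    expected_text = json.dumps(expected, indent=2).splitlines()
    obtained_text = json.dumps(obtained, indent=2).splitlines()
    return list(difflib.unified_diff(
        expected_text, obtained_text,
        fromfile="expected", tofile="obtained", lineterm=""))


expected = {"a.out.js": 246743, "a.out.nodebug.wasm.gz": 330181}
obtained = {"a.out.js": 246743, "a.out.nodebug.wasm.gz": 330359}
diff = json_text_diff(expected, obtained)
print("\n".join(diff))
```

Only the `a.out.nodebug.wasm.gz` line shows up as removed and added; the unchanged `a.out.js` entry appears as context.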

RReverser · 2025-07-22T01:16:18Z

> The reason the test failure is hard to read is that it is exceeding the limits of self.assertEqual.

FWIW it also has proper logs above, where the test is actually running, but neither the test runner nor CircleCI captures stdout, so they are not shown in the same place as the test failure:

a.out.js: size=246743, expected=246743
a.out.js.gz: size=81085, expected=81085
a.out.nodebug.wasm: size=597771, expected=597771
a.out.nodebug.wasm.gz: size=330359, expected=330181, delta=178 (+0.05%)
total: size=844514, expected=844514
total_gz: size=411444, expected=411266, delta=178 (+0.04%)
test_codesize_hello_dylink_all (test_other.other.test_codesize_hello_dylink_all) ... FAIL

https://app.circleci.com/pipelines/github/emscripten-core/emscripten/43867/workflows/e8a294b4-9e23-43ae-a557-723109f35139/jobs/974710/parallel-runs/0/steps/0-111?invite=true#step-111-97665_38
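The delta and percentage in those log lines follow from simple arithmetic on the two sizes. The sketch below reproduces the format shown above as an illustration; `format_size_line` is a hypothetical name, not the function used in test/common.py.

```python
def format_size_line(name, size, expected):
    """Format one size comparison in the style of the log above."""
    line = f"{name}: size={size}, expected={expected}"
    delta = size - expected
    if delta:
        # Percentage is relative to the expected (baseline) size.
        percent = delta * 100.0 / expected
        line += f", delta={delta} ({percent:+.2f}%)"
    return line


print(format_size_line("a.out.js.gz", 81085, 81085))
print(format_size_line("a.out.nodebug.wasm.gz", 330359, 330181))
print(format_size_line("total_gz", 411444, 411266))
```

A 178-byte change in a roughly 330 KB gzipped file is about 0.05%, which illustrates how small a difference is enough to fail an exact-match snapshot.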

RReverser · 2025-07-22T01:17:06Z

I'll go to sleep for now, will take a look tomorrow evening if this is not merged by then.

RReverser · 2025-07-22T18:58:10Z

I can't keep rebaselining... Can you please help with landing this?

sbc100 · 2025-07-22T19:03:41Z

Sure, I'm trying. We will get there eventually :)

One downside I just noticed to having the symbol lists all in a single JSON file is grep-ability:

$ git grep __wasm_init_tls
src/lib/libdylink.js:      '__wasm_init_tls',
system/lib/pthread/emscripten_tls_init.c:extern void __wasm_init_tls(void *memory);
system/lib/pthread/emscripten_tls_init.c:  __wasm_init_tls(tls_block);
system/lib/wasm_worker/library_wasm_worker.c:void __wasm_init_tls(void *memory);
system/lib/wasm_worker/library_wasm_worker.c:  __wasm_init_tls((void*)*sbrk_ptr);
system/lib/wasm_worker/wasm_worker_initialize.S:  // __wasm_init_tls(stackLowestAddress);
system/lib/wasm_worker/wasm_worker_initialize.S:  .functype __wasm_init_tls (PTR) -> ()
system/lib/wasm_worker/wasm_worker_initialize.S:  call __wasm_init_tls
system/lib/wasm_worker/wasm_worker_initialize.S:  // N.b. The function __wasm_init_tls above does not need
system/lib/wasm_worker/wasm_worker_initialize.S:  // So we must initialize __stack_pointer only *after* completing __wasm_init_tls:
test/other/codesize/test_codesize_minimal_pthreads.funcs:$__wasm_init_tls
test/other/codesize/test_codesize_minimal_pthreads_memgrowth.funcs:$__wasm_init_tls

Here I can see that __wasm_init_tls is a declared function in those two code size tests.

After this change the symbol will appear in the JSON output, but it won't be clear which list it is part of. Not a huge deal I suppose.

It also means I can't do a simple wc -l to count the symbols. Again, not a big deal. I still think we should land this.

RReverser · 2025-07-22T19:13:41Z

If it does inhibit your workflow, I don't mind reverting these lists, just let me know. I suppose you could use jq instead with the current workflow, but that might be overkill.
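As an alternative to jq, the two lookups mentioned above (which list a symbol belongs to, and how many symbols each list has) take a few lines of Python. This is only a sketch: the key names `funcs` and `exports` and the sample contents are assumptions, loosely based on the old `.funcs` file extension, and the real snapshot layout may differ.

```python
import json


def lists_containing(snapshot, symbol):
    """Names of all list-valued entries that contain the given symbol."""
    return [key for key, value in snapshot.items()
            if isinstance(value, list) and symbol in value]


def list_lengths(snapshot):
    """Number of items in each list-valued entry (a stand-in for wc -l)."""
    return {key: len(value) for key, value in snapshot.items()
            if isinstance(value, list)}


# Hypothetical snapshot contents, for illustration only.
snapshot = json.loads("""
{
  "a.out.js": 1234,
  "funcs": ["$__wasm_init_tls", "$main"],
  "exports": ["main"]
}
""")
print(lists_containing(snapshot, "$__wasm_init_tls"))
print(list_lengths(snapshot))
```

In practice the snapshot would be read with `json.load` from the test's JSON file rather than from an inline string.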

RReverser · 2025-07-22T20:25:25Z

> I did emsdk install tot + emcc --clear-cache before rebaselining locally, not sure why else CI would have different sizes for these couple of tests.

Ah, so it was another case of cross-platform shenanigans...

sbc100 · 2025-07-22T21:09:36Z

> > I did emsdk install tot + emcc --clear-cache before rebaselining locally, not sure why else CI would have different sizes for these couple of tests.
>
> Ah, so it was another case of cross-platform shenanigans...

Yes, this seems to be the first occurrence of it showing up on our CI machines, rather than only in reports from user machines with this gzip discrepancy.

Note that the test in question was not previously measuring the gzip size of the wasm file which was why this didn't show up until this change.
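For context, a gzip size can be measured in Python roughly as sketched below. This is not the measurement code from the test suite; the function name, compression level, and payload are assumptions. One plausible source of a discrepancy like this is that the compressed bytes come from whichever zlib build the interpreter links against, so the size may differ slightly between machines even for identical input.

```python
import gzip


def gzip_size(data: bytes) -> int:
    """Size in bytes of the gzip-compressed form of data."""
    # mtime=0 keeps the gzip header deterministic; the compressed body
    # may still vary between zlib builds on different platforms.
    return len(gzip.compress(data, compresslevel=9, mtime=0))


# Stand-in bytes, not a real wasm module.
payload = b"\0asm" + bytes(range(256)) * 64
print(len(payload), gzip_size(payload))
```

Skipping the `.gz` entries for the most sensitive test, as one of the later commits on this PR does, sidesteps the problem for that test without changing how the other sizes are compared.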

RReverser · 2025-07-22T21:12:16Z

Interesting. I guess it makes sense that it would show up in the most sensitive test like you said.

RReverser · 2025-07-22T22:20:55Z

@sbc100 Thank you!

RReverser requested a review from sbc100 July 9, 2025 19:41

RReverser changed the title ~~Use JSON snapshots for all code size tests~~ Use JSON snapshots for all code size tests. NFC Jul 9, 2025

sbc100 reviewed Jul 9, 2025

sbc100 reviewed Jul 9, 2025

sbc100 reviewed Jul 9, 2025

RReverser requested a review from sbc100 July 11, 2025 10:38

RReverser force-pushed the code-size-json branch 2 times, most recently from 09926be to 0bb40ba Compare July 11, 2025 20:02

RReverser force-pushed the code-size-json branch 2 times, most recently from 43cda7c to 424ba38 Compare July 11, 2025 20:08

sbc100 approved these changes Jul 12, 2025

RReverser enabled auto-merge (squash) July 12, 2025 01:00

RReverser force-pushed the code-size-json branch from 14759a3 to c8c225d Compare July 15, 2025 00:27

RReverser force-pushed the code-size-json branch from c8c225d to 5df3f36 Compare July 21, 2025 21:42

RReverser force-pushed the code-size-json branch from 787af2a to 220f237 Compare July 21, 2025 22:52

RReverser added 2 commits July 22, 2025 17:45

Use JSON snapshots for all code size tests. NFC

1168c8f

Compare JSON in codesize

95e8de9

RReverser force-pushed the code-size-json branch from 8cb4137 to 95e8de9 Compare July 22, 2025 16:59

sbc100 added 4 commits July 22, 2025 12:15

rebaseline

8fecf1d

Merge remote-tracking branch 'upstream/main' into code-size-json

b952b01

Use main branch version of rebaseline_tests script

d9cd14e

skip gz files for dylink_all test

8c49320

RReverser merged commit 2191239 into emscripten-core:main Jul 22, 2025
30 checks passed

RReverser deleted the code-size-json branch July 22, 2025 21:30
