Use JSON snapshots for all code size tests. NFC #24672
Thanks for doing this! I've been thinking about doing this myself for a while now.
I really like the idea, especially for the code sizes themselves. I'm less sure about the symbol lists since the extra syntax of quotes and commas makes them harder to read than flat files.
Would you consider using JSON only for the file sizes and continuing to use flat files for symbol lists? Or maybe it's not worth it?
I could, but I find it nice to have them in the same context, for the same reason it's nice to have different sizes in the same context. E.g. there were times when code size increased because some symbol was accidentally exported that shouldn't have been, and then having the lists alongside helps debug the increase. I guess we could output YAML / TOML / custom text output if the quotes and commas are the concern?
No, this seems fine to me if it's useful to have it all together.
test/common.py
@@ -1694,14 +1697,76 @@ def assertBinaryEqual(self, file1, file2):
self.assertEqual(read_binary(file1),
read_binary(file2))

def check_expected_size_in_file(self, desc, filename, size):
def check_output_sizes(self, outputs: List[str], metadata=None):
test_name = self.id().split('.')[-1]
I guess it makes a lot of sense to set the filename automatically like this, but it would break if we ever had two different test suites with the same test name... I guess that is very unlikely and we would notice right away.
FWIW that's how existing logic already worked in run_codesize_test, I just copied it here.
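To make the naming concern concrete, here is a minimal sketch of how the derivation in the diff above behaves when two suites share a test name. The class names `core` and `other` and the helper `snapshot_name` are hypothetical, chosen only for illustration; the `self.id().split('.')[-1]` expression is the one from the diff.

```python
import unittest


class core(unittest.TestCase):  # hypothetical suite name
    def test_codesize_hello(self):
        pass


class other(unittest.TestCase):  # hypothetical second suite
    def test_codesize_hello(self):
        pass


def snapshot_name(test):
    # Same derivation as in the diff: keep only the last component of the
    # test id, which drops the module and class (suite) names.
    return test.id().split('.')[-1]


a = core('test_codesize_hello')
b = other('test_codesize_hello')

# The full ids differ, but the derived names are identical, so both tests
# would read and write the same snapshot file.
print(a.id(), '->', snapshot_name(a))
print(b.id(), '->', snapshot_name(b))
```

If this ever became a real problem, including the class name in the derived filename would be one way to disambiguate; that is a suggestion, not something this PR does.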
(I'll do final rebaseline to fix CI before merging)
09926be to 0bb40ba
43cda7c to 424ba38
@sbc100 I think the CI check expectations script doesn't correctly handle removed/added tests, as it tries to find the same test both before and after this PR and fails. What is the solution in such cases? Manual merge?
Fair enough, we can bypass that check for this PR.
Can you please do it? I don't have the bypass rights :(
14759a3 to c8c225d
I rebaselined & fixed conflicts again, but it might need yet another rebaseline right before you merge it.
Ping.
lgtm! Do we need another rebase or can we just land this now?
Oh, we do need another rebase (I see conflicts).
Yeah, see above:
I basically have to keep rebaselining, but I don't have rights to merge it, so unless you merge right after my rebaseline, it just goes in a loop 😅 I can do yet another rebaseline and hope you're free to merge soon, or I can leave it to you to run it locally right before merging. Which is easier?
Can you do one more rebase now? If it doesn't auto-merge, I'll take a look again before EOD.
I think it still won't auto-merge because of
but yeah, I'll do one more.
c8c225d to 5df3f36
Hm, I accidentally re-added deleted files during the rebase. Trying again.
Done.
787af2a to 220f237
I rebased but I see test-mac-arm64 CI reporting a lot of
That's not related to my changes, is it?
Are those actually causing errors? I think that is due to the arm bot not being able to execute the x86 binary for closure-compiler (i.e. no Rosetta installed?). It's supposed to fall back to Java, I think.
The real error looks like it's coming from test_codesize_hello_dylink_all. I recently added that but was thinking maybe I should revert it, since it will be very sensitive.
Yeah, I wasn't sure if that's related to a different Closure binary or not. I did
test_codesize_hello_dylink_all is highly sensitive since it includes basically all symbols.
So should I do anything else here?
The reason the test failure is hard to read is because it is exceeding the limits of Can you do this:
Our
FWIW it also has proper logs above where the test is actually running, but neither the test runner nor CircleCI captures stdout, so it's not shown in the same place as the test failure:
I'll go to sleep for now, and will take a look tomorrow evening if this is not merged by then.
8cb4137 to 95e8de9
I can't keep rebaselining... Can you please help with landing this?
Sure, I'm trying. We will get there eventually :) One downside I just noticed to having the symbol lists all in a single JSON file is grep-ability:
Here I can see that After this change the symbol will appear in the JSON output but it won't be clear which list it is part of. Not a huge deal, I suppose. It also means I can't do a simple
If it does inhibit your workflow, I don't mind reverting these lists, just let me know. I suppose you could use
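For illustration only: with the lists in one JSON file, a few lines of Python can report which list a symbol belongs to, which is the information a plain grep no longer gives. The snapshot layout, the file name key, and the symbol names below are all made up for this sketch; only the `imports` and `sent` key names come from the PR description, and the helper `lists_containing` is hypothetical.

```python
import json


def lists_containing(snapshot, symbol):
    """Return the top-level keys whose value is a list containing `symbol`."""
    return [key for key, value in snapshot.items()
            if isinstance(value, list) and symbol in value]


# Hypothetical snapshot contents; the real files may be laid out differently.
snapshot = json.loads('''
{
  "a.out.js": 1234,
  "imports": ["sym_a", "sym_b"],
  "sent": ["sym_b", "sym_c"]
}
''')

print(lists_containing(snapshot, "sym_b"))
print(lists_containing(snapshot, "sym_c"))
```

Non-list entries such as the size value are skipped, so only the symbol lists are searched.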
Ah, so it was another case of cross-platform shenanigans...
Yes, this seems to be the first occurrence of it showing up on our CI machines, not just in reports from user machines with this gzip discrepancy. Note that the test in question was not previously measuring the gzip size of the wasm file, which is why this didn't show up until this change.
Interesting. I guess it makes sense that it would show up in the most sensitive test, like you said.
@sbc100 Thank you!
Right now, code size tests in the codebase use two different formats - either everything in a single JSON file, or multiple single-line files.
I found the discrepancy a bit annoying and wanted to unify them.
Between the two, the one with individual files tends to be harder to review: GitHub (as well as local git tools) adds quite a lot of boilerplate UI around each file, you need to manually scan filenames to understand which size diffs are related to each other, and overall it feels more difficult to scan quickly as a human.
In this PR I'm unifying both towards a single JSON file format, and other supplementary metadata (imports, sent, etc.) goes into the same JSON as well, so you can see all relevant context together during reviews.
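As a rough sketch of the idea, and not the actual `check_output_sizes` implementation from this PR, a single-file JSON snapshot check could look like the following. The function name `check_sizes`, the `rebaseline` flag, and the file names are assumptions made for this example.

```python
import json
import os
import tempfile


def check_sizes(outputs, snapshot_path, rebaseline=False):
    """Compare the byte size of each output file against one JSON snapshot."""
    actual = {os.path.basename(p): os.path.getsize(p) for p in outputs}
    if rebaseline:
        # Write all sizes into a single file so a review shows them together.
        with open(snapshot_path, 'w') as f:
            json.dump(actual, f, indent=2, sort_keys=True)
            f.write('\n')
        return actual
    with open(snapshot_path) as f:
        expected = json.load(f)
    if actual != expected:
        raise AssertionError(f'size mismatch: expected {expected}, got {actual}')
    return actual


# Usage: record a baseline for a hypothetical output file, then verify it.
with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, 'a.out.js')
    with open(out, 'wb') as f:
        f.write(b'x' * 10)
    snap = os.path.join(tmp, 'test_example.json')
    check_sizes([out], snap, rebaseline=True)
    print(check_sizes([out], snap))
```

Keeping every size under one sorted JSON object means a size change shows up as a single small diff in one file, rather than as several one-line files.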