Speed up SSE4.1 test by splitting individual unrolled blocks into their own functions. #24401

juj · 2025-05-22T22:48:52Z

Before:

test_sse4_1 (test_core.core_2gb) ... ok (5.89s)
test_sse4_1 (test_core.lsan) ... ok (5.94s)
test_sse4_1 (test_core.minimal0) ... ok (5.98s)
test_sse4_1 (test_core.strict_js) ... ok (5.98s)
test_sse4_1 (test_core.strict) ... ok (6.13s)
test_sse4_1 (test_core.bigint) ... ok (6.13s)
test_sse4_1 (test_core.core0) ... ok (6.17s)
test_sse4_1 (test_core.core1) ... ok (6.36s)
test_sse4_1 (test_core.instance) ... ok (6.41s)
test_sse4_1 (test_core.asan) ... ok (9.62s)
test_sse4_1 (test_core.wasmfs) ... ok (10.01s)
test_sse4_1 (test_core.core2) ... ok (10.12s)
test_sse4_1 (test_core.corez) ... ok (11.15s)
test_sse4_1 (test_core.cores) ... ok (37.72s)
test_sse4_1 (test_core.core3) ... ok (140.27s)

Total core time: 273.854s. Wallclock time: 140.702s. Parallelization: 1.95x.

After:

test_sse4_1 (test_core.strict) ... ok (7.11s)
test_sse4_1 (test_core.strict_js) ... ok (7.16s)
test_sse4_1 (test_core.bigint) ... ok (7.18s)
test_sse4_1 (test_core.minimal0) ... ok (7.44s)
test_sse4_1 (test_core.core_2gb) ... ok (7.46s)
test_sse4_1 (test_core.core0) ... ok (7.49s)
test_sse4_1 (test_core.instance) ... ok (7.53s)
test_sse4_1 (test_core.lsan) ... ok (7.64s)
test_sse4_1 (test_core.core1) ... ok (8.15s)
test_sse4_1 (test_core.asan) ... ok (9.63s)
test_sse4_1 (test_core.wasmfs) ... ok (10.54s)
test_sse4_1 (test_core.core2) ... ok (10.44s)
test_sse4_1 (test_core.corez) ... ok (11.38s)
test_sse4_1 (test_core.cores) ... ok (11.69s)
test_sse4_1 (test_core.core3) ... ok (50.80s)

Total core time: 171.622s. Wallclock time: 51.223s. Parallelization: 3.35x.

sbc100 · 2025-05-23T14:21:43Z

test/sse/test_sse.h

@@ -547,9 +550,10 @@ __m128 ExtractIntInRandomOrder(unsigned int *arr, int i, int n, int prime) {
        char str[256]; tostr(&m1, str); \
        char str2[256]; tostr(&ret, str2); \
        printf("%s(%s, 0x%08X, %d) = %s\n", #func, str, interesting_ints[j], Tint, str2); \
-      }
+      } \
+  }();
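
For readers unfamiliar with the pattern in this hunk: the added `} \` and `}();` lines close an immediately-invoked lambda that wraps the unrolled block. Below is a minimal sketch of the idea, not the actual macro from `test/sse/test_sse.h`. The names `RUN_BLOCK`, `add_imm`, `g_lines_printed` and `run_all_blocks` are hypothetical. The thread does not explain why this helps; presumably the compiler handles many small functions faster than one very large one.

```cpp
#include <cstdio>

// Hypothetical operation under test; stands in for an SSE4.1 intrinsic wrapper.
static int add_imm(int x, int imm) { return x + imm; }

// Counts printed result lines so the effect of each block is observable.
static int g_lines_printed = 0;

// Hypothetical simplified macro: the whole unrolled block is wrapped in an
// immediately-invoked lambda, so each expansion becomes its own function
// instead of being pasted inline into one huge caller.
#define RUN_BLOCK(func, Tint) \
  [] { \
    for (int j = 0; j < 4; ++j) { \
      std::printf("%s(%d, %d) = %d\n", #func, j, Tint, func(j, Tint)); \
      ++g_lines_printed; \
    } \
  }();

// Expands three blocks; each one is compiled as a separate lambda body.
int run_all_blocks() {
  g_lines_printed = 0;
  RUN_BLOCK(add_imm, 0)
  RUN_BLOCK(add_imm, 1)
  RUN_BLOCK(add_imm, 2)
  return g_lines_printed;
}
```

Calling `run_all_blocks()` from `main` prints four lines per block. The capture list is empty because the lambda only touches file-scope names, which need no capture.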


Can we use simple C functions instead of C++ lambdas perhaps? That would also help with debuggability, as it would yield meaningful backtraces. If it's not easy then this change is still better of course. lgtm either way.

juj · 2025-05-23

I don't know how to do that easily. The function bodies would then need to be emitted somewhere else by one set of macro expansion magic, and the calls to those functions emitted separately in another place.
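
To illustrate the two-step expansion described in this reply, here is a rough sketch of how named functions could be generated with an X-macro. It is an assumption about one possible approach, not code from the PR or from `test_sse.h`. The names `TEST_CASES`, `DEFINE_BLOCK`, `CALL_BLOCK`, `add_imm`, `g_calls` and `run_all_named_blocks` are hypothetical.

```cpp
#include <cstdio>

// Hypothetical operation under test.
static int add_imm(int x, int imm) { return x + imm; }

// Counts how many named block functions were called.
static int g_calls = 0;

// Hypothetical case list, written once and expanded twice below.
#define TEST_CASES(X) \
  X(add_imm, 0) \
  X(add_imm, 1) \
  X(add_imm, 2)

// Expansion 1: emit one named function body per case at file scope.
#define DEFINE_BLOCK(func, Tint) \
  static void test_##func##_##Tint(void) { \
    for (int j = 0; j < 4; ++j) { \
      std::printf("%s(%d, %d) = %d\n", #func, j, Tint, func(j, Tint)); \
    } \
    ++g_calls; \
  }
TEST_CASES(DEFINE_BLOCK)

// Expansion 2: emit the calls to those functions, in a different place.
#define CALL_BLOCK(func, Tint) test_##func##_##Tint();

int run_all_named_blocks() {
  g_calls = 0;
  TEST_CASES(CALL_BLOCK)
  return g_calls;
}
```

In this sketch a backtrace would show names such as `test_add_imm_0` rather than an anonymous lambda. The cost is that the cases must live in a separate list macro instead of inline at the call site, which is the restructuring the reply calls hard to do easily.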

juj · 2025-05-29T15:59:42Z

Ping - is this ok to land?

sbc100 reviewed May 23, 2025

juj force-pushed the further_optimize_sse4_1_test_suite branch from 2be37fa to d375a38 Compare May 23, 2025 15:04

juj enabled auto-merge (squash) May 23, 2025 15:04

Speed up SSE4.1 test by splitting individual unrolled blocks into their own functions. (0524c96)

juj force-pushed the further_optimize_sse4_1_test_suite branch from d375a38 to 0524c96 Compare May 23, 2025 16:41

sbc100 approved these changes May 29, 2025

juj merged commit a7cdef6 into emscripten-core:main May 29, 2025
30 checks passed
