NEON-optimize 9-7 IDWT #1629
|
I ran openjpeg's test suite with this, and the same 50 files fail with and without this PR. I also ran pdfium's test suite with this and everything still passes with it. |
|
this breaks the build: see https://my.cdash.org/builds/3496357/build |
|
Thanks for triggering CI! Is there a way to see the full build command, with all flags? It builds fine on my system, using cmake 4.2.1 and Xcode 26.2's clang. Here's the compile command that runs on my machine for dwt.c: If I manually add |
see |
|
Thanks! Any reason this isn't set in the cmake files themselves, so that people get that behavior locally automatically? :) I have fixed the warnings locally. Do you squash or merge? I.e. should I amend my existing commit and push -f, or do you prefer a separate commit for the follow-up? |
I don't know. I wasn't involved in the project yet when this was set up. Openjpeg is a very slowly moving project nowadays.
You can amend your commit and force-push. |
|
(Also, if I set Want me to make a PR fixing those, or should I leave it alone? Also, https://my.cdash.org/builds/3496357/build claims that -Werror is on, but the travis line you linked to doesn't enable -Werror except for one specific warning.) |
Takes `bin/bench_dwt -I` from 0.865 s to 0.672 s on my system.
Done, thanks! |
(separate from this one) PR welcome |
|
What about ARMv7? I'm vaguely aware it handles a subset of ARMv8 NEON. We don't have CI testing for it, and I wouldn't want us to discover build breakage only when packagers start to package this. (In another project I have cross-build testing for armhf: https://github.com/OSGeo/gdal/tree/master/.github/workflows/armhf) |
That's a good question. My upstream CL is here https://pdfium-review.googlesource.com/c/pdfium/+/144450 It does have green android_32 bots, and those use armv7a. So I think it should work. Are there particular sysroots you'd like me to test against? |
ok fine, thanks |
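As a rough illustration of the ARMv7 concern above: some NEON intrinsics exist only on AArch64, so code meant to build for both targets usually guards them. The sketch below is not taken from this PR; `sketch_hsum4` is a hypothetical helper, and it assumes the compiler defines the ACLE macros `__ARM_NEON` and `__aarch64__`. It uses `vaddvq_f32` (AArch64-only) where available and falls back to pairwise adds on 32-bit ARM, with a plain scalar path elsewhere.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: sum of the four floats starting at p. */
static float sketch_hsum4(const float *p)
{
    float32x4_t v = vld1q_f32(p);
#if defined(__aarch64__)
    /* Across-vector add: only available on AArch64. */
    return vaddvq_f32(v);
#else
    /* ARMv7 path: add low and high halves, then add the pair. */
    float32x2_t s = vadd_f32(vget_low_f32(v), vget_high_f32(v));
    s = vpadd_f32(s, s);
    return vget_lane_f32(s, 0);
#endif
}

#else

/* Scalar fallback for targets without NEON. */
static float sketch_hsum4(const float *p)
{
    return (p[0] + p[2]) + (p[1] + p[3]);
}

#endif
```

Building a file like this for an armv7a target is the kind of check the android_32 bots mentioned above would exercise.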
|
For completeness: I also tried with a linux sysroot, and the file builds fine there too: |
-> #1632 Apparently we also have a bunch of downstream patches in pdfium (see list here: https://source.chromium.org/chromium/chromium/src/+/main:third_party/pdfium/third_party/libopenjpeg/README.pdfium). Would you be interested in me trying to upstream some of those? |
I don't know what to answer to that. This code base / JPEG2000 topic is very hard to comprehend. Anything non-trivial requires time and energy that I don't have... There are effectively no other people left in the team, and I never signed up to be the sole maintainer. |
|
Ah, the old open-source trap :) I can relate. As can many others, but we never talk to each other. There should be an open source maintainer support group somewhere. I saw https://felixge.de/2013/03/11/the-pull-request-hack/ a while ago. I never tried it, but it's something you could consider. (To be clear, I'm not looking for commit access myself.) If you want, I could only send out smaller patches that add early-outs for things found by fuzzers. All that code has been shipping in Chrome and Android for years and is hopefully not completely busted, at least. But if you prefer we keep things downstream, that's fine too of course. |
sounds good. hopefully most of them might be straightforward to assess |
Takes `bin/bench_dwt -I` from 0.865 s to 0.672 s on my system.
"My system" being an Apple M4 Max MacBook Pro.
This code is hot when rendering https://archive.org/details/disquisitionesa00gaus in pdfium. This speeds up `pdfium_test --pages=1-50 ~/Downloads/disquisitionesa00gaus.pdf` by ca. 10%.
This is similar to the existing SSE fast path.
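For readers unfamiliar with this kind of fast path, here is a minimal sketch of the general shape of a NEON-vectorized lifting step with a scalar fallback. It is not the code from this PR and does not reflect openjpeg's actual data layout: `sketch_lift_step` and `SKETCH_HAVE_NEON` are hypothetical names, and the update `dst[i] += c * (a[i] + b[i])` is only the generic lifting form used by wavelet transforms such as the 9-7.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#else
#define SKETCH_HAVE_NEON 0
#endif

/* Hypothetical helper: dst[i] += c * (a[i] + b[i]) for i in [0, n). */
static void sketch_lift_step(float *dst, const float *a, const float *b,
                             size_t n, float c)
{
    size_t i = 0;
#if SKETCH_HAVE_NEON
    /* Process four floats per iteration with NEON. */
    const float32x4_t vc = vdupq_n_f32(c);
    for (; i + 4 <= n; i += 4) {
        float32x4_t sum = vaddq_f32(vld1q_f32(a + i), vld1q_f32(b + i));
        float32x4_t d = vld1q_f32(dst + i);
        d = vmlaq_f32(d, sum, vc); /* d + sum * vc */
        vst1q_f32(dst + i, d);
    }
#endif
    /* Scalar path: handles the tail, or everything without NEON. */
    for (; i < n; ++i) {
        dst[i] += c * (a[i] + b[i]);
    }
}
```

Keeping the scalar loop as the tail handler means the same function builds and gives the same results on targets without NEON, which is also how an SSE fast path is typically structured.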