[Bug]: Cannot render some* PDFs with .jp2 embedded images #19517

Open
gsamokovarov opened this issue Feb 20, 2025 · 4 comments

@gsamokovarov

gsamokovarov commented Feb 20, 2025

Attach (recommended) or Link to PDF file

pdf-with-jp2.pdf

Web browser and its version

Chrome, Arc, Firefox, Microsoft Edge

Operating system and its version

macOS 14.7.4

PDF.js version

4.10.38

Is the bug present in the latest PDF.js version?

Yes

Is a browser extension

No

Steps to reproduce the problem

  1. Upload the attached PDF to a pdf.js-based viewer.
  2. See a blank image instead of a red image spanning the whole document.

What is the expected behavior?

Display the .jp2 image embedded in the PDF document.

What went wrong?

No image was displayed in the document

Link to a viewer

No response

Additional context

I'm running into issues rendering PDFs with some embedded .jp2 images. The document loads, but the image is not rendered. I see that pdf.js has a WASM-compiled, openjpeg-based JPX decoder. However, it cannot decode the picture embedded in the document and issues the following warning in the browser dev tools console:

Warning: Unable to decode image "img_p0_1": "JpxError: Size of tile data exceeds system limits

Failed to decode.

Failed to decode tile 1/1

Failed to decode the codestream in the JP2 file

Failed to decode the image".

My limited debugging leads me to believe that we are hitting one of the branches that issue this error (here is an example) because the WASM build is compiled with a 32-bit memory model. I may be wrong here, and I need help pinpointing the exact C code branch that emits the warning and figuring out what we can do about it. I also see that browsers are starting to support 64-bit WASM, but I'm not sure whether compiling a 64-bit WASM openjpeg build is the proper solution, since that support is still new.
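
A rough sketch of the arithmetic behind this hypothesis, in Python for brevity. It assumes the decoder holds a whole tile in memory with 4 bytes per sample; the dimensions are made up for illustration and were not measured from the attached file, and the actual check in openjpeg may be different.

```python
# Assumption: each decoded sample occupies 4 bytes (a 32-bit integer).
BYTES_PER_SAMPLE = 4


def tile_buffer_bytes(width, height, components=1):
    """Bytes needed to hold one fully decoded tile in memory."""
    return width * height * components * BYTES_PER_SAMPLE


def fits_in_address_space(n_bytes, pointer_bits):
    """True if a buffer of n_bytes is addressable with pointer_bits-wide sizes."""
    return n_bytes <= 2 ** pointer_bits - 1


# Hypothetical single-tile RGB image of 40000 x 30000 pixels.
needed = tile_buffer_bytes(40_000, 30_000, components=3)
fits_32 = fits_in_address_space(needed, 32)  # 32-bit WASM build
fits_64 = fits_in_address_space(needed, 64)  # 64-bit native build
```

With these made-up numbers the buffer is too large for a 32-bit size but fits a 64-bit one, which would match a native build decoding the same image without complaint.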

I am attaching a document for testing. The native Chrome PDF viewer (pdfium) and macOS Preview can display the document. Interestingly, pdfium also uses openjpeg to decode the image, but I guess the 64-bit native build handles such images.

@calixteman
Contributor

I think we can try to reduce the size of the tile when decoding by using opj_set_decoded_resolution_factor.
I'll have a look at that ASAP.
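
As a sketch of how a reduction factor could be chosen: in JPEG 2000, each reduction level halves the width and height (rounding up), so the smallest factor that fits a memory budget can be found by trying factors in order. The helper names, the 4-bytes-per-sample figure, and the `max_factor` default are assumptions for illustration; the usable maximum depends on how many resolution levels the codestream actually contains.

```python
def reduced_size(width, height, factor):
    """Image dimensions after discarding `factor` resolution levels."""
    scale = 1 << factor
    # Ceiling division: a partial pixel still takes a full sample.
    return (-(-width // scale), -(-height // scale))


def smallest_factor_within(width, height, components, budget_bytes, max_factor=5):
    """Smallest reduction factor whose decoded buffer fits in budget_bytes.

    Returns None if even max_factor does not fit. Assumes 4 bytes per sample.
    """
    for factor in range(max_factor + 1):
        w, h = reduced_size(width, height, factor)
        if w * h * components * 4 <= budget_bytes:
            return factor
    return None
```

The returned value is what would be passed as the factor to opj_set_decoded_resolution_factor before decoding. The trade-off is that the image comes out at lower resolution.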

@calixteman
Contributor

Considering the overall size of the image, it'd be better to reduce it directly while decoding (as mentioned in the above comment): that saves us from having to downscale it ourselves afterwards, which would be slower.
Something else we can consider is calling opj_set_decode_area in a loop to decode a small enough chunk at a time and copy each chunk into the final image (which may be worthwhile in general, to avoid using too much memory).
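
The chunked approach could be planned roughly as follows. This sketch only computes the rectangles to decode; it does not call openjpeg. The function name, the full-width horizontal strips, and the 4-bytes-per-sample figure are assumptions, and whether the decoder's working memory really shrinks in proportion for a single-tile image would need to be verified.

```python
def strips(width, height, max_strip_bytes, components=1, bytes_per_sample=4):
    """Yield (x0, y0, x1, y1) full-width strips that each fit in max_strip_bytes.

    Each rectangle is a candidate decode area; x1 and y1 are exclusive.
    """
    row_bytes = width * components * bytes_per_sample
    # Always decode at least one row, even if a single row exceeds the budget.
    rows = max(1, max_strip_bytes // row_bytes)
    for y0 in range(0, height, rows):
        yield (0, y0, width, min(y0 + rows, height))


# Hypothetical 100 x 250 single-component image, 40000-byte budget per strip.
plan = list(strips(100, 250, 40_000))
```

Each rectangle would be passed to opj_set_decode_area before decoding, and the decoded rows copied into the final image at offset `y0`.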

@gsamokovarov
Author

I have this issue on smaller PDF documents too: 1.5MB or 1.7MB in size. The one I have attached is something I could generate without attaching user data. I'm attaching another one that is 3.2MB.

pdf-with-smaller-jp2.pdf

@mitio

mitio commented Mar 6, 2025

@calixteman thank you for looking into this! Can you point me to where these functions are defined/used so we can perhaps take a stab at this as well?
