add workaround for CPython's incorrect handling of native utf-8 with 8-bit CTE by bastidest · Pull Request #5 · plenaerts/eml2pdf

bastidest · 2026-02-15T15:14:12Z

Some of my emails render correctly in email clients (Thunderbird, Outlook) but show charset encoding errors when converted to PDF with this tool. This pull request fixes that.

The affected emails contain the following headers, followed by a UTF-8 encoded payload:

Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

I added a test case for this.

This seems to be caused by incorrect handling of this case in the CPython implementation of message.get_payload(decode=True).
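One way to observe the behaviour described above is to parse the same message once from bytes and once from str and compare what get_payload(decode=True) returns. This is only a sketch: the sample body and the helper name payload_bytes are made up for illustration, and whether the str case differs depends on the Python version in use. On affected versions, a str payload with non-ASCII characters appears to be encoded with a fallback codec rather than the declared charset.

```python
from email import message_from_bytes, message_from_string

# Hypothetical sample message using the headers from this report.
BODY = "Grüße, 10 €\n"
RAW = (
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
) + BODY


def payload_bytes(msg):
    """Return whatever CPython hands back for get_payload(decode=True)."""
    return msg.get_payload(decode=True)


expected = BODY.encode("utf-8")

# Parsed from bytes: the original UTF-8 bytes survive the round trip.
from_bytes = payload_bytes(message_from_bytes(RAW.encode("utf-8")))

# Parsed from str: on affected Python versions these bytes are not
# valid UTF-8 for the original text; newer versions may behave differently.
from_str = payload_bytes(message_from_string(RAW))

print("parsed from bytes matches UTF-8:", from_bytes == expected)
print("parsed from str matches UTF-8:  ", from_str == expected, from_str)
```

If the second comparison prints False, decoding those bytes with the declared charset cannot reproduce the original text, which matches the symptom in the generated PDF.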

My expectation of this software is that it produces output similar to an email client's "export as PDF" function.

References

…8-bit CTE
* https://bugs.python.org/issue18271
* python/cpython#105285
* python/cpython#105306

plenaerts · 2026-02-15T20:25:42Z

I was wondering when charset / conversion errors would pop up.

Thanks for including the test case and not making me ask for it ;-)

One thing on my mind: you suggest an argument for one specific CTE string, is_8bit_cte. Wouldn't it be more general, and more in line with the other argument content_charset, to have a cte or content_transfer_encoding argument that we test on line 263? I don't know if we will get other similar issues, but this seems the better approach to me.

What do you think?
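A rough sketch of what such a signature could look like, for discussion only. The function name payload_to_text is hypothetical and is not the code at line 263. Its 8bit branch reads the payload with decode=False, which for a str payload returns text rather than bytes, instead of calling get_payload(decode=True). The "utf-8" fallback for a missing or unknown charset is an assumption.

```python
from email import message_from_bytes, message_from_string
from email.message import Message
from typing import Optional


def payload_to_text(
    part: Message,
    content_charset: Optional[str] = None,
    cte: Optional[str] = None,
) -> str:
    """Hypothetical helper: decode a text part, special-casing 8bit CTE."""
    # Assumption: fall back to utf-8 when no charset is declared.
    charset = content_charset or part.get_content_charset() or "utf-8"
    cte = (cte or str(part.get("Content-Transfer-Encoding", ""))).strip().lower()

    if cte == "8bit":
        # With decode=False a str payload comes back as text: either the
        # native characters as parsed, or surrogate-escaped bytes decoded
        # with the declared charset.
        payload = part.get_payload(decode=False)
        if isinstance(payload, str):
            return payload

    data = part.get_payload(decode=True)
    if data is None:  # multipart container, nothing to decode here
        return ""
    try:
        return data.decode(charset, "replace")
    except LookupError:  # unknown charset name
        return data.decode("utf-8", "replace")


HEADERS = 'Content-Type: text/plain; charset="utf-8"\n'
eight_bit = HEADERS + "Content-Transfer-Encoding: 8bit\n\nGrüße €\n"
base64_msg = HEADERS + "Content-Transfer-Encoding: base64\n\nR3LDvMOfZQ==\n"

text_from_str = payload_to_text(message_from_string(eight_bit))
text_from_bytes = payload_to_text(message_from_bytes(eight_bit.encode("utf-8")))
text_from_base64 = payload_to_text(message_from_string(base64_msg))
```

Passing the CTE as a string keeps the call site symmetric with content_charset and leaves room for other transfer encodings to get their own branch later.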

bastidest · 2026-02-15T21:07:28Z

I think that depends on how much effort you are willing to put into handling the long tail of encoding errors. As the commit says, this is just a workaround for the incorrect CPython implementation, hence the quite specific is_8bit_cte flag.

Since I suspect that this will not be the last encoding error, in my opinion we should implement the get_payload method ourselves (based on the existing CPython implementation) and handle the encoding quirks directly there.

add workaround for CPython's incorrect handling of native utf-8 with …

5e47b5a

bastidest wants to merge 1 commit into plenaerts:main from bastidest:fix/native-utf8
