Skip to content

gh-87112: Ensure that only ASCII digits are accepted as section number in MIME header parameter #136877

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

matthieucan
Copy link
Contributor

@matthieucan matthieucan commented Jul 20, 2025

With those changes, the MIME parameter parser discards parameters with an invalid section number that uses a non 0-9 digit such as super-script "²" or "𐩃" (Kharosthi number).

Before:

>>> import email.headerregistry
>>> reg = email.headerregistry.HeaderRegistry()
>>> reg('Content-Disposition', 'inline; foo*0=bar')
'inline; foo="bar"'
>>> reg('Content-Disposition', 'inline; foo*X=bar')
'inline;'
>>> reg('Content-Disposition', 'inline; foo*²=bar')
Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    reg('Content-Disposition', 'inline; foo*²=bar')
    ~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/matthieu/git/cpython/Lib/email/headerregistry.py", line 604, in __call__
    return self[name](name, value)
           ~~~~~~~~~~^^^^^^^^^^^^^
  File "/Users/matthieu/git/cpython/Lib/email/headerregistry.py", line 192, in __new__
    cls.parse(value, kwds)
    ~~~~~~~~~^^^^^^^^^^^^^
  File "/Users/matthieu/git/cpython/Lib/email/headerregistry.py", line 448, in parse
    kwds['parse_tree'] = parse_tree = cls.value_parser(value)
                                      ~~~~~~~~~~~~~~~~^^^^^^^
  File "/Users/matthieu/git/cpython/Lib/email/_header_value_parser.py", line 2736, in parse_content_disposition_header
    disp_header.append(parse_mime_parameters(value[1:]))
                       ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
  File "/Users/matthieu/git/cpython/Lib/email/_header_value_parser.py", line 2601, in parse_mime_parameters
    token, value = get_parameter(value)
                   ~~~~~~~~~~~~~^^^^^^^
  File "/Users/matthieu/git/cpython/Lib/email/_header_value_parser.py", line 2464, in get_parameter
    token, value = get_section(value)
                   ~~~~~~~~~~~^^^^^^^
  File "/Users/matthieu/git/cpython/Lib/email/_header_value_parser.py", line 2417, in get_section
    section.number = int(digits)
                     ~~~^^^^^^^^
ValueError: invalid literal for int() with base 10: '²'

After:

>>> import email.headerregistry
>>> reg = email.headerregistry.HeaderRegistry()
>>> reg('Content-Disposition', 'inline; foo*0=bar')
'inline; foo="bar"'
>>> reg('Content-Disposition', 'inline; foo*X=bar')
'inline;'
>>> reg('Content-Disposition', 'inline; foo*²=bar')
'inline;'

@picnixz picnixz changed the title gh-87112: Do not fail when non 0-9 digit is used as section number in MIME header parameter gh-87112: Ensure that only ASCII digits are accepted as section number in MIME header parameter Jul 20, 2025
raise errors.HeaderParseError("Expected section number but "
"found {}".format(value))
digits = ''
while value and value[0].isdigit():
while value and '0' <= value[0] <= '9':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
while value and '0' <= value[0] <= '9':
while value and ('0' <= value[0] <= '9'):

It will a bit clearer. Or you can still use a separate function to make it even cleareer. The bottleneck won't be the function call IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did that, but not the separate function. It was my understanding that @StanFromIreland was leaning towards not having an inner function

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine, I was against the function to check if it is in a dictionary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Just moved it to a separate function for extra-clarity

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants