Fix handling of docstrings with tokenization errors (Fixes #18388) #18575

Open
wants to merge 2 commits into
base: master

Conversation

gareth-cross
Contributor

This change fixes #18388.

When parsing a docstring, stubgen gives up at the first tokenization error. This behavior is brittle, because docstrings need not be entirely valid Python and can contain characters that trigger an early failure. Consider the following example:

def thing():
  """
  thing(*args, **kwargs)
  Overloaded function.

  1. thing(x: int) -> None

  .. math::
    \mathbf{x} = 3 \cdot \mathbf{y}

  2. thing(x: int, y: int) -> str

  This signature will never get parsed due to TokenError.
  """

The LaTeX code triggers a TokenError, so the second overload is never parsed.

With this change, mypy resumes parsing after an error occurs, so that later overloads can still be discovered. The new behavior is somewhat more robust to failures of this kind. I also added two tests with example docstrings that previously failed.
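
To make the difference between stopping at the first error and resuming concrete, here is a minimal sketch. It is not the code in this PR: it is a simplified, line-by-line illustration, and the names `tokenizes_cleanly` and `find_signature_lines` are hypothetical. It only assumes the standard `tokenize` module; stubgen's real docstring parser is more involved.

```python
import io
import re
import tokenize


def tokenizes_cleanly(line: str) -> bool:
    """Return True if a single line can be tokenized without an error."""
    try:
        for tok in tokenize.generate_tokens(io.StringIO(line + "\n").readline):
            # Some Python versions report bad input as an ERRORTOKEN
            # instead of raising, so treat both the same way.
            if tok.type == tokenize.ERRORTOKEN:
                return False
    except (tokenize.TokenError, SyntaxError):
        return False
    return True


def find_signature_lines(docstring: str, name: str, resume: bool = True) -> list:
    """Collect lines that look like `name(...)` signatures (hypothetical helper).

    resume=False stops scanning at the first line that fails to tokenize,
    mimicking the brittle behavior. resume=True skips the bad line and
    keeps going, so later overloads are still seen.
    """
    # Accept an optional "1. " style prefix, as in overload listings.
    pattern = re.compile(r"^(?:\d+\.\s*)?" + re.escape(name) + r"\(")
    found = []
    for raw in docstring.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not tokenizes_cleanly(line):
            if resume:
                continue  # skip the offending line, keep scanning
            break  # give up on the rest of the docstring
        if pattern.match(line):
            found.append(line)
    return found


if __name__ == "__main__":
    doc = r"""
    thing(*args, **kwargs)
    Overloaded function.

    1. thing(x: int) -> None

    .. math::
      \mathbf{x} = 3 \cdot \mathbf{y}

    2. thing(x: int, y: int) -> str
    """
    print(find_signature_lines(doc, "thing", resume=False))
    print(find_signature_lines(doc, "thing", resume=True))
```

With `resume=False` the scan ends at the LaTeX line, so the second overload is lost; with `resume=True` that line is skipped and the second overload is still collected.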

Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

Successfully merging this pull request may close these issues.

[stubgen] Overloaded signatures are dropped if TokenError is encountered while parsing docstrings.