Skip to content

Conversation

visz11
Copy link
Collaborator

@visz11 visz11 commented Oct 6, 2025

(Explain how this PR changes mypy.)

Summary by CodeRabbit

  • New Features

    • Improved and standardized diagnostics for string and bytes formatting errors.
    • Clearer messages for invalid specifiers, unmatched braces, excessive nesting, unrecognized format specs, invalid conversion types, and bytes-related formatting guidance.
    • Consistent error codes applied to formatting diagnostics.
  • Refactor

    • Centralized formatting error messages for consistency across checks; no behavior changes to analysis or results.

Copy link

Summary of Changes

Hello @visz11, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors how error messages related to string formatting are managed within MyPy. The primary goal is to centralize these messages by moving them from direct string literals in checkstrformat.py into a dedicated message_registry.py module. This change enhances the maintainability and consistency of error reporting across the codebase, making it easier to update or localize messages in the future.

Highlights

  • Centralized Error Messages: Replaced numerous hardcoded string formatting error messages in mypy/checkstrformat.py with references to new ErrorMessage objects defined in mypy/message_registry.py.
  • New Error Message Definitions: Added 20 new ErrorMessage constants to mypy/message_registry.py to encapsulate various string formatting related error messages and their associated error codes, improving consistency and maintainability.
  • fail Method Overload: Modified the fail method in mypy/messages.py to support accepting message_registry.ErrorMessage objects directly, simplifying error reporting and reducing redundancy.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@visz11
Copy link
Collaborator Author

visz11 commented Oct 6, 2025

/refacto-bot

Copy link

coderabbitai bot commented Oct 6, 2025

Walkthrough

Centralizes format-string error messages into message_registry constants, updates checkstrformat to use them, refines MessageBuilder.fail to accept either strings or ErrorMessage instances via overloads, and tightens typing of two public error codes to Final[ErrorCode]. No control-flow changes in format checking; minor branching added in fail.

Changes

Cohort / File(s) Summary
Message registry constants
mypy/message_registry.py
Added numerous exported ErrorMessage constants for string and bytes formatting diagnostics, each associated with appropriate error codes. No logic changes.
Format checker refactor
mypy/checkstrformat.py
Replaced hard-coded error strings with references to new message_registry ErrorMessage constants; control flow unchanged.
Message builder overloads
mypy/messages.py
Introduced @overload for MessageBuilder.fail to accept str or ErrorMessage; at runtime, extracts value/code when given ErrorMessage and forwards string to report().
Error code typing
mypy/errorcodes.py
Updated public constants to typed finals: STRING_FORMATTING and STR_BYTES_PY3 now use Final[ErrorCode].

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Checker as checkstrformat
  participant MB as MessageBuilder
  participant ER as Errors.report

  Checker->>MB: fail(ErrorMessage, context)
  Note right of MB: New: accepts ErrorMessage<br/>Extracts msg.value and msg.code
  MB->>ER: report(message: str, code: ErrorCode, ...)
  ER-->>MB: recorded
  MB-->>Checker: None
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I stack my strings with tidy cheer,
New constants gathered, crystal clear.
I thump my foot—no curly fright,
The builder knows just what to write.
With typed finals snug and tight,
I hop through tests by moonlit night. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Title Check ⚠️ Warning The title contains a typo and fails to clearly summarize the core change, which is replacing hard-coded format-string error messages in checkstrformat with message_registry constants; as written, it is unclear and uninformative. Rename the pull request to something concise and descriptive, for example “Replace checkstrformat error strings with message_registry constants” to accurately reflect the main change.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 81.25% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch clone-msg-regostry-checkstrformat

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 380cb8d and 45242d8.

📒 Files selected for processing (4)
  • mypy/checkstrformat.py (17 hunks)
  • mypy/errorcodes.py (1 hunks)
  • mypy/message_registry.py (1 hunks)
  • mypy/messages.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
mypy/checkstrformat.py (3)
mypy/messages.py (4)
  • fail (273-283)
  • fail (286-295)
  • fail (297-322)
  • format (2436-2437)
mypy/checker.py (1)
  • fail (6474-6481)
mypy/message_registry.py (1)
  • format (21-22)
mypy/messages.py (3)
mypy/plugin.py (2)
  • fail (171-173)
  • fail (242-246)
mypy/checker.py (1)
  • fail (6474-6481)
mypy/message_registry.py (1)
  • ErrorMessage (17-25)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

refacto-test bot commented Oct 6, 2025

Clone msg regostry checkstrformat

TL;DR: Migrates string formatting error messages from inline strings to centralized message registry constants.


Refacto PR Summary

Refactors string formatting error messages in mypy to use the message registry pattern instead of inline strings.
This PR moves all string formatting error messages from the checkstrformat.py file into the centralized message_registry.py module as named constants. The change improves maintainability by centralizing error messages and simplifies the error reporting code by replacing multi-line msg.fail() calls with single-line references to message registry constants. The PR also adds proper type annotations to the error codes and enhances the MessageBuilder.fail() method with overloads to support the message registry pattern.

Change Highlights

Click to expand
  • message_registry.py: Added 20+ new error message constants for string formatting errors
  • messages.py: Enhanced MessageBuilder.fail() with overloads to accept ErrorMessage objects
  • checkstrformat.py: Replaced all inline error messages with message registry constants
  • errorcodes.py: Added type annotations to error code constants
  • messages.py: Modified fail() implementation to handle both string and ErrorMessage inputs

Sequence Diagram

sequenceDiagram
    participant CF as checkstrformat.py
    participant MB as MessageBuilder
    participant MR as message_registry.py
    participant EC as errorcodes.py
    
    CF->>MR: Reference error message constant
    MR->>EC: Use error code constant
    CF->>MB: Call fail(message_registry.ERROR_CONSTANT, ctx)
    MB->>MB: Extract message string and code
    MB->>MB: Report error with extracted values
Loading

Testing Guide

Click to expand
  1. Run mypy's test suite to verify all string formatting error messages still work correctly
  2. Trigger a string formatting error (e.g., f"{x}" where x is bytes) and verify the error message is displayed correctly
  3. Check that error codes are properly associated with the messages by running with --show-error-codes
  4. Verify that the error message text matches what was previously shown before the refactoring

Copy link

refacto-test bot commented Oct 6, 2025

Refacto is reviewing this PR. Please wait for the review comments to be posted.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors error messages for string formatting checks by moving them from mypy/checkstrformat.py to mypy/message_registry.py. The MessageBuilder.fail method is now overloaded to accept ErrorMessage objects, which encapsulate both the message and its error code. This is a good refactoring that improves maintainability and centralizes error messages. The implementation is correct and the changes are consistent across the codebase. I have no suggestions for code improvements. As minor points, the PR title has a typo ("regostry" -> "registry") and the PR description could be filled out to summarize the change. Overall, this is a great change.

Copy link

refacto-test bot commented Oct 6, 2025

Code Review: Error Message Registry Refactoring

👍 Well Done
Error Message Centralization

Moving error messages to message_registry improves maintainability and consistency.

Type Annotations Added

Adding Final type annotations to error codes improves type safety.

📌 Files Processed
  • mypy/checkstrformat.py
  • mypy/errorcodes.py
  • mypy/message_registry.py
  • mypy/messages.py
📝 Additional Comments
mypy/messages.py (2)
Overload Order Optimization

The overload ordering places the more specific ErrorMessage type second, while Python's type resolution would benefit from having the more specific type first. This follows the principle of most specific to least specific ordering for function overloads.

Standards:

  • Algorithm-Correctness-Type-Resolution
  • Logic-Verification-Pattern-Matching
Redundant Code Parameter

The 'code' parameter is redundant when an ErrorMessage is provided since the message already contains its code. This could lead to confusion if different codes are provided in both places.

Standards:

  • ISO-IEC-25010-Functional-Correctness-Appropriateness
  • DbC-Interface-Consistency
mypy/message_registry.py (1)
Message Grouping Opportunity

Related error messages are defined sequentially but not grouped into logical categories. Consider organizing messages into classes or enums by feature area to improve discoverability and maintenance. This would make future additions more structured.

Standards:

  • Clean-Code-Class-Organization
  • Maintainability-Quality-Cohesion
mypy/errorcodes.py (1)
Type Annotation Consistency

The PR changes type annotations from 'Final = ErrorCode' to 'Final[ErrorCode] = ErrorCode' for some constants but not all. Inconsistent type annotation styles reduce code predictability and maintainability. Consider applying the same annotation style consistently throughout the codebase.

Standards:

  • Clean-Code-Consistency
  • Maintainability-Quality-Predictability
mypy/checkstrformat.py (2)
Message Registry Lookup

Direct message registry lookups in hot paths could introduce minor performance overhead compared to inlined strings. While centralizing messages improves maintainability, consider measuring impact if this code is in a performance-critical section that's called frequently.

Standards:

  • ISO-IEC-25010-Performance-Efficiency-Time-Behavior
  • Optimization-Pattern-Hot-Path
  • Performance-Tradeoff-Maintainability
Error Code Consistency

While most error messages now include error codes through the message registry, this pattern maintains the error code parameter separately. Consider updating all error messages to consistently handle error codes through the message registry to ensure uniform error reporting.

Standards:

  • CWE-755
  • OWASP-A04

Comment on lines +297 to 306
def fail(
self,
msg: str | message_registry.ErrorMessage,
context: Context | None,
*,
code: ErrorCode | None = None,
file: str | None = None,
allow_dups: bool = False,
secondary_context: Context | None = None,
) -> None:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing Type Overload

The fail method implementation accepts both str and ErrorMessage types, but the second overload signature is missing the 'code' parameter. This inconsistency could lead to runtime errors when code is provided with an ErrorMessage parameter.

Standards
  • ISO-IEC-25010-Functional-Correctness-Appropriateness
  • ISO-IEC-25010-Reliability-Maturity

Comment on lines +356 to 358
message_registry.UNRECOGNIZED_FORMAT_SPEC.format(spec.format_spec[1:]),
call,
code=codes.STRING_FORMATTING,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing Error Code

The error code parameter has been removed when migrating to message registry, but the message registry constant should include this code. All other migrations maintain the error code through the message registry constant.

Standards
  • Logic-Verification-Data-Flow
  • Business-Rule-Consistency
  • Algorithm-Correctness-Pattern-Matching

Comment on lines 320 to 322
FORMAT_STR_INVALID_SPECIFIER: Final = ErrorMessage(
"Invalid conversion specifier in format string", codes.STRING_FORMATTING
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Message Naming Convention

Error message constants use inconsistent naming patterns. Some messages use verb-noun format (e.g., FORMAT_STR_INVALID_SPECIFIER) while others use noun-adjective format. Standardizing naming conventions would improve maintainability by making message identifiers more predictable.

Standards
  • Clean-Code-Naming-Conventions
  • Maintainability-Quality-Consistency

Comment on lines 272 to 295
@@ -290,10 +279,40 @@ def fail(
file: str | None = None,
allow_dups: bool = False,
secondary_context: Context | None = None,
) -> None:
...

@overload
def fail(
self,
msg: message_registry.ErrorMessage,
context: Context | None,
*,
file: str | None = None,
allow_dups: bool = False,
secondary_context: Context | None = None,
) -> None:
...
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overload Method Complexity

The overloaded fail() method introduces complexity with duplicate parameter lists. Consider using a single method with a union type (str | ErrorMessage) and type narrowing instead of overloads to reduce duplication and improve maintainability.

Standards
  • Clean-Code-Functions
  • SOLID-DRY
  • Maintainability-Quality-Complexity

Comment on lines +356 to 358
message_registry.UNRECOGNIZED_FORMAT_SPEC.format(spec.format_spec[1:]),
call,
code=codes.STRING_FORMATTING,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing Error Code

The code parameter has been removed from this error message call but not replaced with a message registry entry that includes the code. This could lead to inconsistent error reporting and missing error codes in output.

Standards
  • ISO-IEC-25010-Reliability-Maturity
  • ISO-IEC-25010-Functional-Completeness

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants