Skip to content

Fix: Validate File Paths and Parsing Logic #3

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: dev-11171
Choose a base branch
from

Conversation

Prabhat-Thapa45
Copy link

@Prabhat-Thapa45 Prabhat-Thapa45 marked this pull request as ready for review March 25, 2025 08:27
@x612skm
Copy link
Owner

x612skm commented Mar 27, 2025

Hey @Prabhat-Thapa45, Thanks for the collaboration! I appreciate it! Did you find time to check on the bug mentioned?

It's failing on the line you added — section["files"].strip(). AttributeError: 'list' object has no attribute 'strip' suggests that section["files"] contains a list and is not a string.

@Prabhat-Thapa45
Copy link
Author

Hi, @x612skm. Yes, I did fix that. It had to do with how files names were given as a list rather than str in pyproject.toml. So had to make sure the trailing comma removal logic was only applied if it was a str just a normal fix. Also I have reformatted test cases as well taking notes from feedback on your changes I just need to push it now.

@webknjaz
Copy link

@x612skm looks like you don't have the CI enabled here. Go to https://github.com/x612skm/mypy-dev/actions and correct that. After doing that, you can close the PR and then re-open it right away, which should trigger the CI run for the first time.

@@ -287,6 +284,22 @@ def _find_config_file(

return None

def parse_and_validate_filenames(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a function has an and in the name, this usually indicates that it's doing too many things. Although, validation is usually implied, so I'd drop its mention from the name. Additionally, it's possible to name it more accurately:

Suggested change
def parse_and_validate_filenames(
def convert_raw_files_string_to_list(

Copy link

@webknjaz webknjaz Mar 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be reasonable to make it private, too:

Suggested change
def parse_and_validate_filenames(
def _convert_raw_files_string_to_list(

def parse_and_validate_filenames(
raw_files: str
) -> list[str]:
# Split and strip filenames

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to use code comments to retell what's already written on the following line. You're just repeating the same thing twice, and that has no value. It's distracting at best. Instead, only use code comments to document motivation/justification for things that code is doing. As a very rare exception, code comments can be used to explain very obscure unobvious logic. But then again, it's best to avoid that and find a way to communicate that through structuring code better, naming the variables, defining abstraction layers clearly.

@@ -287,6 +284,22 @@ def _find_config_file(

return None

def parse_and_validate_filenames(
raw_files: str
) -> list[str]:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plz add a PEP 257-compliant docstring to this function.

@@ -109,9 +109,6 @@ def split_and_match_files_list(paths: Sequence[str]) -> list[str]:
expanded_paths = []

for path in paths:
if not path:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

)

options.files = files_split
if "files" in section and isinstance(raw_files := section["files"], str):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this only a string in the case of INI configs? Does TOML not hit this code path? Would the string check be unnecessary when it's known it's going through the INI code path?

It seems to me that a better place for injecting this conversion would be inside _parse_individual_file() where it has the differentiation between is_toml() and not. And you'd only apply it in the else-branch there.

This would likely let you drop the str check for good.

("[mypy]\nfiles = file1.py,\nfile2.py,\nfile3.py", ["file1.py", "file2.py", "file3.py"], None),
]
)
def test_parse_config_file(tmp_path, config_content, expected_files, expected_exception):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@x612skm do you remember if function-based tests are being picked up? I remember we had some issues with these, but I don't recall the detail. Was it just the filename that was problematic, or having the class was necessary as well?

]
)
def test_parse_config_file(tmp_path, config_content, expected_files, expected_exception):
"""Parameterized test for parse_config_file handling various configurations."""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function docstrings should start with a verb so that they reference the action being performed. Only constants / objects need to be described as terms because they don't do anything.

Additionally, it should probably be more accurate. Generic statements aren't very useful because they don't include any specifics and so it's normally pointless to have them like that.


if __name__ == "__main__":
main()
if expected_exception:
Copy link

@webknjaz webknjaz Mar 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bad idea to have logic in tests. Introducing branching makes tests more fragile and unreliable. The more complex structures there are, them more uncertainty there is. You'd have to write tests for tests to make sure they function at all.

Negative and positive scenarios are two distinct tests. They shouldn't co-exist in the same test function.

if __name__ == "__main__":
main()
if expected_exception:
with pytest.raises(expected_exception) as exc_info:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pytest.raises() must always have a match= regexp set.

"""
)
@pytest.mark.parametrize(
"config_content, expected_files, expected_exception",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's usually nicer to have an iterable of strings for the params list. Sparse structures read better.

Suggested change
"config_content, expected_files, expected_exception",
("config_content", "expected_files", "expected_exception"),

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wouldn't hurt to have a module docstring here.

@webknjaz
Copy link

@Prabhat-Thapa45 when writing commit messages, make sure to include descriptions, not just titles. This helps others understand why you did what you did. Here's some more materials on the topic: https://gist.github.com/webknjaz/cb7d7bf62c3dda4b1342d639d0e78d79.

@webknjaz
Copy link

Yes, I did fix that.

Is there a test that can prove this?

@Prabhat-Thapa45
Copy link
Author

python#18621 This issue seems to be resolved. I think we should close our prs @x612skm and also issue python#11171 can be closed. @webknjaz .

@webknjaz
Copy link

Looks like it might be addressing the ini cfg trailing comma bit, but not empty strings post parsing. Has anyone checked the main branch against the repros?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants