-
Notifications
You must be signed in to change notification settings - Fork 1
Fix: Validate File Paths and Parsing Logic #3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev-11171
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||||||
---|---|---|---|---|---|---|---|---|---|---|
|
@@ -109,9 +109,6 @@ def split_and_match_files_list(paths: Sequence[str]) -> list[str]: | |||||||||
expanded_paths = [] | ||||||||||
|
||||||||||
for path in paths: | ||||||||||
if not path: | ||||||||||
continue | ||||||||||
|
||||||||||
path = expand_path(path.strip()) | ||||||||||
globbed_files = fileglob.glob(path, recursive=True) | ||||||||||
if globbed_files: | ||||||||||
|
@@ -287,6 +284,22 @@ def _find_config_file( | |||||||||
|
||||||||||
return None | ||||||||||
|
||||||||||
def parse_and_validate_filenames( | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When a function has an
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It may be reasonable to make it private, too:
Suggested change
|
||||||||||
raw_files: str | ||||||||||
) -> list[str]: | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Plz add a PEP 257-compliant docstring to this function. |
||||||||||
# Split and strip filenames | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No need to use code comments to retell what's already written on the following line. You're just repeating the same thing twice, and that has no value. It's distracting at best. Instead, only use code comments to document motivation/justification for things that code is doing. As a very rare exception, code comments can be used to explain very obscure unobvious logic. But then again, it's best to avoid that and find a way to communicate that through structuring code better, naming the variables, defining abstraction layers clearly. |
||||||||||
files_split = [file.strip() for file in raw_files.split(",")] | ||||||||||
|
||||||||||
# Remove trailing empty entry if present | ||||||||||
if files_split and files_split[-1] == "": | ||||||||||
files_split.pop() | ||||||||||
|
||||||||||
if "" in files_split: | ||||||||||
raise ValueError( | ||||||||||
"Invalid config: Empty filenames are not allowed except for trailing commas." | ||||||||||
) | ||||||||||
|
||||||||||
return files_split | ||||||||||
|
||||||||||
def parse_config_file( | ||||||||||
options: Options, | ||||||||||
|
@@ -322,21 +335,8 @@ def parse_config_file( | |||||||||
else: | ||||||||||
section = parser["mypy"] | ||||||||||
|
||||||||||
if "files" in section: | ||||||||||
raw_files = section["files"].strip() | ||||||||||
files_split = [file.strip() for file in raw_files.split(",")] | ||||||||||
|
||||||||||
# Remove trailing empty entry if present | ||||||||||
if files_split and files_split[-1] == "": | ||||||||||
files_split.pop() | ||||||||||
|
||||||||||
# Raise an error if there are any remaining empty strings | ||||||||||
if "" in files_split: | ||||||||||
raise ValueError( | ||||||||||
"Invalid config: Empty filenames are not allowed except for trailing commas." | ||||||||||
) | ||||||||||
|
||||||||||
options.files = files_split | ||||||||||
if "files" in section and isinstance(raw_files := section["files"], str): | ||||||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this only a string in the case of INI configs? Does TOML not hit this code path? Would the string check be unnecessary when it's known it's going through the INI code path? It seems to me that a better place for injecting this conversion would be inside This would likely let you drop the str check for good. |
||||||||||
options.files = parse_and_validate_filenames(raw_files) | ||||||||||
|
||||||||||
prefix = f"{file_read}: [mypy]: " | ||||||||||
updates, report_dirs = parse_section( | ||||||||||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It wouldn't hurt to have a module docstring here. |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -1,168 +1,50 @@ | ||||||
import os | ||||||
import tempfile | ||||||
from unittest import TestCase, main | ||||||
|
||||||
import pytest | ||||||
from mypy.config_parser import parse_config_file | ||||||
from mypy.options import Options | ||||||
|
||||||
|
||||||
class TestConfigParser(TestCase): | ||||||
def test_parse_config_file_with_single_file(self) -> None: | ||||||
"""A single file should be correctly parsed.""" | ||||||
with tempfile.TemporaryDirectory() as tmpdirname: | ||||||
config_path = os.path.join(tmpdirname, "test_config.ini") | ||||||
|
||||||
with open(config_path, "w") as f: | ||||||
f.write( | ||||||
""" | ||||||
[mypy] | ||||||
files = file1.py | ||||||
""" | ||||||
) | ||||||
|
||||||
options = Options() | ||||||
|
||||||
parse_config_file(options, lambda: None, config_path, stdout=None, stderr=None) | ||||||
|
||||||
self.assertEqual(options.files, ["file1.py"]) | ||||||
|
||||||
def test_parse_config_file_with_no_spaces(self) -> None: | ||||||
"""Files listed without spaces should be correctly parsed.""" | ||||||
with tempfile.TemporaryDirectory() as tmpdirname: | ||||||
config_path = os.path.join(tmpdirname, "test_config.ini") | ||||||
|
||||||
with open(config_path, "w") as f: | ||||||
f.write( | ||||||
""" | ||||||
[mypy] | ||||||
files =file1.py,file2.py,file3.py | ||||||
""" | ||||||
) | ||||||
|
||||||
options = Options() | ||||||
|
||||||
parse_config_file(options, lambda: None, config_path, stdout=None, stderr=None) | ||||||
|
||||||
self.assertEqual(options.files, ["file1.py", "file2.py", "file3.py"]) | ||||||
|
||||||
def test_parse_config_file_with_extra_spaces(self) -> None: | ||||||
"""Files with extra spaces should be correctly parsed.""" | ||||||
with tempfile.TemporaryDirectory() as tmpdirname: | ||||||
config_path = os.path.join(tmpdirname, "test_config.ini") | ||||||
|
||||||
with open(config_path, "w") as f: | ||||||
f.write( | ||||||
""" | ||||||
[mypy] | ||||||
files = file1.py , file2.py , file3.py | ||||||
""" | ||||||
) | ||||||
|
||||||
options = Options() | ||||||
|
||||||
parse_config_file(options, lambda: None, config_path, stdout=None, stderr=None) | ||||||
|
||||||
self.assertEqual(options.files, ["file1.py", "file2.py", "file3.py"]) | ||||||
|
||||||
def test_parse_config_file_with_empty_files_key(self) -> None: | ||||||
"""An empty files key should result in an empty list.""" | ||||||
with tempfile.TemporaryDirectory() as tmpdirname: | ||||||
config_path = os.path.join(tmpdirname, "test_config.ini") | ||||||
|
||||||
with open(config_path, "w") as f: | ||||||
f.write( | ||||||
""" | ||||||
[mypy] | ||||||
files = | ||||||
""" | ||||||
) | ||||||
|
||||||
options = Options() | ||||||
|
||||||
parse_config_file(options, lambda: None, config_path, stdout=None, stderr=None) | ||||||
|
||||||
self.assertEqual(options.files, []) | ||||||
|
||||||
def test_parse_config_file_with_only_comma(self) -> None: | ||||||
"""A files key with only a comma should raise an error.""" | ||||||
with tempfile.TemporaryDirectory() as tmpdirname: | ||||||
config_path = os.path.join(tmpdirname, "test_config.ini") | ||||||
|
||||||
with open(config_path, "w") as f: | ||||||
f.write( | ||||||
""" | ||||||
[mypy] | ||||||
files = , | ||||||
""" | ||||||
) | ||||||
|
||||||
options = Options() | ||||||
|
||||||
with self.assertRaises(ValueError) as cm: | ||||||
parse_config_file(options, lambda: None, config_path, stdout=None, stderr=None) | ||||||
|
||||||
self.assertIn("Invalid config", str(cm.exception)) | ||||||
|
||||||
def test_parse_config_file_with_only_whitespace(self) -> None: | ||||||
"""A files key with only whitespace should result in an empty list.""" | ||||||
with tempfile.TemporaryDirectory() as tmpdirname: | ||||||
config_path = os.path.join(tmpdirname, "test_config.ini") | ||||||
|
||||||
with open(config_path, "w") as f: | ||||||
f.write( | ||||||
""" | ||||||
[mypy] | ||||||
files = | ||||||
""" | ||||||
) | ||||||
|
||||||
options = Options() | ||||||
|
||||||
parse_config_file(options, lambda: None, config_path, stdout=None, stderr=None) | ||||||
|
||||||
self.assertEqual(options.files, []) | ||||||
|
||||||
def test_parse_config_file_with_mixed_valid_and_invalid_entries(self) -> None: | ||||||
"""Mix of valid and invalid filenames should raise an error.""" | ||||||
with tempfile.TemporaryDirectory() as tmpdirname: | ||||||
config_path = os.path.join(tmpdirname, "test_config.ini") | ||||||
|
||||||
with open(config_path, "w") as f: | ||||||
f.write( | ||||||
""" | ||||||
[mypy] | ||||||
files = file1.py, , , file2.py | ||||||
""" | ||||||
) | ||||||
@pytest.mark.parametrize( | ||||||
"config_content, expected_files, expected_exception", | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's usually nicer to have an iterable of strings for the params list. Sparse structures read better.
Suggested change
|
||||||
[ | ||||||
# Files listed without spaces | ||||||
("[mypy]\nfiles =file1.py,file2.py,file3.py", ["file1.py", "file2.py", "file3.py"], None), | ||||||
|
||||||
options = Options() | ||||||
# Files listed with adequate space | ||||||
("[mypy]\nfiles =file1.py, file2.py, file3.py", ["file1.py", "file2.py", "file3.py"], None), | ||||||
|
||||||
with self.assertRaises(ValueError) as cm: | ||||||
parse_config_file(options, lambda: None, config_path, stdout=None, stderr=None) | ||||||
# Files with extra spaces | ||||||
("[mypy]\nfiles = file1.py , file2.py , file3.py", ["file1.py", "file2.py", "file3.py"], None), | ||||||
|
||||||
self.assertIn("Invalid config", str(cm.exception)) | ||||||
# Files listed with a trailing comma | ||||||
("[mypy]\nfiles = file1.py, file2.py, file3.py,", ["file1.py", "file2.py", "file3.py"], None), | ||||||
|
||||||
def test_parse_config_file_with_newlines_between_files(self) -> None: | ||||||
"""Newlines between file entries should be correctly handled.""" | ||||||
with tempfile.TemporaryDirectory() as tmpdirname: | ||||||
config_path = os.path.join(tmpdirname, "test_config.ini") | ||||||
# Empty files key | ||||||
("[mypy]\nfiles =", [], None), | ||||||
|
||||||
with open(config_path, "w") as f: | ||||||
f.write( | ||||||
""" | ||||||
[mypy] | ||||||
files = file1.py, | ||||||
file2.py, | ||||||
file3.py | ||||||
""" | ||||||
) | ||||||
# Files key with only a comma | ||||||
("[mypy]\nfiles = ,", None, ValueError), | ||||||
|
||||||
options = Options() | ||||||
# Mixed valid and invalid filenames | ||||||
("[mypy]\nfiles = file1.py, , , file2.py", None, ValueError), | ||||||
|
||||||
parse_config_file(options, lambda: None, config_path, stdout=None, stderr=None) | ||||||
# Files listed with multiple trailing comma | ||||||
("[mypy]\nfiles = file1.py, file2.py, file3.py,", None, ValueError), | ||||||
|
||||||
self.assertEqual(options.files, ["file1.py", "file2.py", "file3.py"]) | ||||||
# Newlines between file entries | ||||||
("[mypy]\nfiles = file1.py,\nfile2.py,\nfile3.py", ["file1.py", "file2.py", "file3.py"], None), | ||||||
] | ||||||
) | ||||||
def test_parse_config_file(tmp_path, config_content, expected_files, expected_exception): | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @x612skm do you remember if function-based tests are being picked up? I remember we had some issues with these, but I don't recall the detail. Was it just the filename that was problematic, or having the class was necessary as well? |
||||||
"""Parameterized test for parse_config_file handling various configurations.""" | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Function docstrings should start with a verb so that they reference the action being performed. Only constants / objects need to be described as terms because they don't do anything. Additionally, it should probably be more accurate. Generic statements aren't very useful because they don't include any specifics and so it's normally pointless to have them like that. |
||||||
config_path = tmp_path / "test_config.ini" | ||||||
config_path.write_text(config_content) | ||||||
|
||||||
options = Options() | ||||||
|
||||||
if __name__ == "__main__": | ||||||
main() | ||||||
if expected_exception: | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's a bad idea to have logic in tests. Introducing branching makes tests more fragile and unreliable. The more complex structures there are, them more uncertainty there is. You'd have to write tests for tests to make sure they function at all. Negative and positive scenarios are two distinct tests. They shouldn't co-exist in the same test function. |
||||||
with pytest.raises(expected_exception) as exc_info: | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
|
||||||
parse_config_file(options, lambda: None, str(config_path), stdout=None, stderr=None) | ||||||
assert "Invalid config" in str(exc_info.value) | ||||||
else: | ||||||
parse_config_file(options, lambda: None, str(config_path), stdout=None, stderr=None) | ||||||
assert options.files == expected_files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed?