Generalize parser tests and allow parsing of the old log files#1
Generalize parser tests and allow parsing of the old log files#1roman-dvorak merged 4 commits intomasterfrom
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR refactors parser logic into a dedicated module and adds comprehensive test coverage. The main goal is to decouple parsing functionality from the GUI layer (PyQt) to enable independent testing and improve code organization.
- Parser classes moved from
dosview/__init__.pyto newdosview/parsers.pymodule - Added parametrized tests for parser detection and validation in
tests/test_parser.py - Included test fixture data file and GitHub Actions CI workflow
Reviewed Changes
Copilot reviewed 6 out of 8 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| dosview/parsers.py | New module containing extracted and refactored parser classes with type hints |
| dosview/init.py | Removed inline parser code and replaced with import from parsers module |
| tests/test_parser.py | Added comprehensive parser tests with fixture-based validation |
| data/DATALOG_AIRDOS_GEO.TXT | Test fixture data for legacy AIRDOS log format |
| .github/workflows/tests.yml | CI workflow configuration for automated testing |
| dosview/version.py | Version bump to 0.1.22 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for idx, val in enumerate(parts[4:]): | ||
| try: | ||
| current_hist[idx] += int(val) | ||
| except ValueError: |
There was a problem hiding this comment.
[nitpick] Catching a bare ValueError masks potential parsing issues. The old code used a bare except Exception which was also problematic. While this is an improvement, consider logging the error or at least the line number/context when ValueError occurs to aid debugging malformed log files.
| case "$HIST": | ||
| df_lines.append(parts[1:]) | ||
| case "$HITS": | ||
| for i in range(2, len(parts) - 1, 2): |
There was a problem hiding this comment.
Off-by-one error: range(2, len(parts) - 1, 2) will skip the last value if len(parts) is even. For example, if parts has indices 0,1,2,3,4,5 (length 6), this range produces [2,4], missing the pair at indices 4,5. Should be range(2, len(parts), 2) or ensure pairs exist with range(2, len(parts) - 1, 2) only if len(parts) is odd.
| for i in range(2, len(parts) - 1, 2): | |
| for i in range(2, len(parts), 2): |
| case _: | ||
| continue | ||
| if not df_lines: | ||
| raise ValueError("Soubor neobsahuje žádné záznamy $HIST pro starší log.") |
There was a problem hiding this comment.
[nitpick] Error message is in Czech. Consider using English for consistency with the rest of the codebase (comments, docstrings, and other error messages are in English).
| raise ValueError("Soubor neobsahuje žádné záznamy $HIST pro starší log.") | |
| raise ValueError("The file does not contain any $HIST records for the older log format.") |
| for parser_cls in LOG_PARSERS: | ||
| if parser_cls.detect(file_path): | ||
| return parser_cls(file_path) | ||
| raise ValueError("Neznámý typ logu nebo žádný vhodný parser.") |
There was a problem hiding this comment.
[nitpick] Error message is in Czech. Consider using English for consistency with the rest of the codebase (comments, docstrings, and other error messages are in English).
| raise ValueError("Neznámý typ logu nebo žádný vhodný parser.") | |
| raise ValueError("Unknown log type or no suitable parser found.") |
| import importlib.util | ||
|
|
||
| import numpy as np | ||
| import pytest | ||
|
|
||
| ROOT = Path(__file__).resolve().parent.parent | ||
| PARSERS_PATH = ROOT / "dosview" / "parsers.py" | ||
|
|
||
| spec = importlib.util.spec_from_file_location("dosview_parsers", PARSERS_PATH) | ||
| parsers = importlib.util.module_from_spec(spec) | ||
| assert spec.loader is not None | ||
| spec.loader.exec_module(parsers) | ||
|
|
||
| LOG_PARSERS = parsers.LOG_PARSERS | ||
| get_parser_for_file = parsers.get_parser_for_file | ||
| parse_file = parsers.parse_file | ||
|
|
There was a problem hiding this comment.
[nitpick] The test uses dynamic module loading via importlib.util instead of standard imports. Since dosview.parsers is now a proper module in the package, consider using from dosview import parsers or from dosview.parsers import ... directly. This approach is more maintainable and allows IDEs to provide proper code completion and type checking.
| import importlib.util | |
| import numpy as np | |
| import pytest | |
| ROOT = Path(__file__).resolve().parent.parent | |
| PARSERS_PATH = ROOT / "dosview" / "parsers.py" | |
| spec = importlib.util.spec_from_file_location("dosview_parsers", PARSERS_PATH) | |
| parsers = importlib.util.module_from_spec(spec) | |
| assert spec.loader is not None | |
| spec.loader.exec_module(parsers) | |
| LOG_PARSERS = parsers.LOG_PARSERS | |
| get_parser_for_file = parsers.get_parser_for_file | |
| parse_file = parsers.parse_file | |
| import numpy as np | |
| import pytest | |
| from dosview.parsers import LOG_PARSERS, get_parser_for_file, parse_file |
| raise ValueError("Neznámý typ logu nebo žádný vhodný parser.") | ||
|
|
||
|
|
||
| def parse_file(file_path: str | Path): |
There was a problem hiding this comment.
Missing return type annotation. Based on the implementation, this function returns a list containing [np.ndarray, np.ndarray, np.ndarray, dict]. Consider adding the return type annotation for better API documentation and type safety.
| def parse_file(file_path: str | Path): | |
| def parse_file(file_path: str | Path) -> list[np.ndarray | dict]: |
…ary form module, separate AIRDOS04_hm module as standalon module - still part of this repo
Summary
test_parser.pydata/and skip cleanly when none exist