@blanca-pablos
Collaborator

No description provided.

@github-actions
Contributor

github-actions bot commented Nov 20, 2025

Claude finished @blanca-pablos's task —— View job


Code Review: PR #270 - Fix legacy mpp factor + multiple dcm files

Summary

This PR addresses two important issues in WSI (Whole Slide Image) processing:

  1. Legacy MPP (microns per pixel) correction factor for TIFF files created with buggy libvips versions <8.8.3
  2. DICOM series filtering to avoid processing multiple files from the same multi-file DICOM series

Overall, this is a solid medical imaging fix addressing real-world compatibility issues. However, 5 linting issues must be fixed, and the new functionality has no test coverage.


🔴 BLOCKING ISSUES (Must Fix)

1. Linting Failures (5 errors)

The following linting issues were found:

File: src/aignostics/application/_service.py

Line 351: Loop variables `r`, `c` not used within loop body

Fix:

# Current (line 351):
for file_path, r, c in files_with_dims:

# Should be:
for file_path, _r, _c in files_with_dims:

File: src/aignostics/wsi/_openslide_handler.py

Line 67: Method `_get_mpp_correction_factor` could be a static method
Line 77: Variable `_LEGACY_MPP_FACTOR` in function should be lowercase (legacy_mpp_factor)
Line 77: Remove leading underscore from `_LEGACY_MPP_FACTOR` (RUF052)

Fix for line 67:

# Change from instance method to static method:
@staticmethod
def _get_mpp_correction_factor(props: dict[str, Any]) -> float:

Fix for line 77:

# Change:
_LEGACY_MPP_FACTOR = 1 / 10

# To (define at module level as constant):
LEGACY_MPP_FACTOR = 1 / 10

# Or inline it:
legacy_mpp_factor = 1 / 10

Verify fixes:

uv run ruff check --fix src/aignostics/application/_service.py src/aignostics/wsi/_openslide_handler.py
make lint

2. Missing Test Coverage (CRITICAL for Medical Device Software)

The new functionality has zero test coverage:

_filter_dicom_series_files() method (application/_service.py:299-345):

  • Handles multi-file DICOM series selection
  • Critical logic: selects highest resolution file
  • No tests exist for this functionality
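The selection rule described above can be sketched on header values that have already been read, without touching pydicom. This is a minimal illustration, not the PR's implementation: `DicomInfo`, `files_to_exclude`, and the tie-breaking rule (keep the first file seen when dimensions are equal) are assumptions made for the example.

```python
from collections import defaultdict
from pathlib import Path
from typing import NamedTuple


class DicomInfo(NamedTuple):
    """Header values assumed to have been read from one DICOM file (hypothetical type)."""

    path: Path
    series_uid: str
    rows: int
    columns: int


def files_to_exclude(infos: list[DicomInfo]) -> set[Path]:
    """Return every file except the highest-resolution one in each series."""
    by_series: dict[str, list[DicomInfo]] = defaultdict(list)
    for info in infos:
        by_series[info.series_uid].append(info)

    excluded: set[Path] = set()
    for series in by_series.values():
        # max() returns the first maximal element, so ties keep the earliest file.
        keep = max(series, key=lambda i: i.rows * i.columns)
        excluded.update(i.path for i in series if i.path != keep.path)
    return excluded
```

A file that could not be read would simply never appear in `infos`, so it would never be excluded, which matches the "treat as standalone" behaviour the review describes.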

_get_mpp_correction_factor() method (wsi/_openslide_handler.py:67-93):

  • Handles libvips version detection and MPP scaling bug
  • Critical for accurate microns-per-pixel calculations
  • No tests exist for this functionality
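For orientation, the version check could look roughly like the sketch below. It is not the code under review: the function name, the constants, and the metadata shape (`<vips libVips-version="..." />`, taken from the test example in this review) are assumptions, and a plain tuple comparison stands in for the `packaging` library that the PR reportedly uses.

```python
import re
import xml.etree.ElementTree as ET

LEGACY_FACTOR = 0.1  # applied to files assumed to come from libvips < 8.8.3
FIXED_IN = (8, 8, 3)  # first release without the scaling bug, per this review


def mpp_correction_factor(props: dict[str, str]) -> float:
    """Return 0.1 for legacy or unidentifiable libvips TIFFs, else 1.0."""
    try:
        root = ET.fromstring(props["tiff.ImageDescription"])
        version = root.attrib["libVips-version"]
    except (KeyError, ET.ParseError):
        # Missing or malformed metadata: fall back to the legacy factor.
        return LEGACY_FACTOR
    # Compare numeric components as a tuple, so that 8.10.0 > 8.8.3.
    parts = tuple(int(p) for p in re.findall(r"\d+", version)[:3])
    return LEGACY_FACTOR if parts < FIXED_IN else 1.0
```

Note that an empty property dictionary also falls back to 0.1 here, so a caller would need to skip this function for non-TIFF inputs.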

Required test cases:

  1. DICOM filtering tests (add to tests/aignostics/application/service_test.py):

    • Single DICOM file (should not be filtered)
    • Multiple DICOM files with same SeriesInstanceUID (keep highest resolution)
    • Multiple series with different SeriesInstanceUIDs (filter each independently)
    • DICOM files that fail to read (should not be excluded)
    • Edge case: All files have same dimensions
  2. MPP correction tests (add to tests/aignostics/wsi/service_test.py):

    • libvips version <8.8.3 (should apply 0.1 correction)
    • libvips version ≥8.8.3 (should apply 1.0 correction)
    • Missing libvips version metadata (should default to 0.1)
    • Non-TIFF files or files without tiff.XResolution (should use 1.0)
    • Malformed XML in ImageDescription (should default to 0.1)

Test example:

@pytest.mark.unit
def test_filter_dicom_series_files_single_file(tmp_path):
    """Test that single DICOM files are not filtered."""
    # Create test DICOM file
    ds = pydicom.Dataset()
    ds.SeriesInstanceUID = "1.2.3.4.5"
    ds.Rows = 1024
    ds.Columns = 1024
    dcm_file = tmp_path / "test.dcm"
    ds.save_as(dcm_file)
    
    excluded = Service._filter_dicom_series_files(tmp_path)
    assert len(excluded) == 0

@pytest.mark.unit
def test_mpp_correction_factor_legacy_libvips():
    """Test MPP correction for legacy libvips <8.8.3."""
    props = {
        "tiff.ImageDescription": """<vips libVips-version="8.7.0" />"""
    }
    # Call on the class; this assumes the method is static, as suggested in section 1.
    factor = OpenSlideHandler._get_mpp_correction_factor(props)
    assert factor == pytest.approx(0.1)  # Legacy correction

Why this is critical:

  • This is medical device software (FDA/MDR regulated)
  • Incorrect MPP can lead to misdiagnosis
  • DICOM series filtering affects which files get processed
  • 85% coverage requirement must be maintained

🟡 SUGGESTIONS (Non-Blocking)

3. Commit Message Convention

Current commits:

c504295 Update wsi info command in README
a3597c5 Add mpp correction factor and highest res .dcm selection.

Issue: Neither commit follows the conventional commits format (both lack a type/scope prefix).

Should be:

docs(wsi): update wsi info command in README
fix(wsi): add mpp correction factor and highest res .dcm selection

Why it matters: Conventional commits enable automatic changelog generation.
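A quick way to check a subject line against this format is a regular expression. The sketch below is illustrative only: the list of allowed types is an assumption (projects configure their own), and `is_conventional` is a hypothetical helper, not part of this repository.

```python
import re

# Header shape: type, optional "(scope)", optional "!", then ": description".
# The set of types below is an assumption for this example.
HEADER = re.compile(
    r"^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)"
    r"(\([\w.-]+\))?!?: \S.*$"
)


def is_conventional(subject: str) -> bool:
    """Return True if the commit subject matches the conventional header shape."""
    return HEADER.match(subject) is not None
```

With this pattern, `Update wsi info command in README` is rejected, while `docs(wsi): update wsi info command in README` is accepted.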

4. PR Description Missing

Issue: No PR description provided. For a medical device SDK, this is concerning.

Should include:

  • Problem statement: What bugs are being fixed?
  • Root cause: Why did these issues occur?
  • Solution approach: How are they being fixed?
  • Testing strategy: How was it validated?
  • References: Link to related issues, libvips bug report, DICOM specification

Example:

## Problem
1. TIFF files exported with libvips <8.8.3 have incorrect MPP values (10x off)
2. Multi-file DICOM series cause duplicate processing of the same image data

## Solution
1. Detect libvips version from TIFF metadata and apply 0.1 correction factor for buggy versions
2. Filter DICOM files by SeriesInstanceUID, keeping only highest resolution file per series

## Testing
- Manual testing with legacy TIFF files from libvips 8.7.0
- Manual testing with multi-file DICOM series

## References
- libvips bug: https://github.com/libvips/libvips/issues/1421

5. Documentation Updates Needed

File: src/aignostics/wsi/CLAUDE.md

Should document:

  • The libvips MPP scaling bug and correction factor logic
  • DICOM series filtering behavior
  • OpenSlide's automatic detection of multi-file DICOM series

File: src/aignostics/application/CLAUDE.md

Should document:

  • DICOM series filtering in metadata generation
  • Why only highest resolution file is included

6. Error Handling in _filter_dicom_series_files()

Current code (line 324):

except Exception as e:
    logger.debug(f"Could not read DICOM {dcm_file}: {e}")
    # Treat as standalone - don't exclude

Issue: Catching Exception is too broad; it can hide unrelated programming errors.

Suggestion:

except (pydicom.errors.InvalidDicomError, OSError, KeyError) as e:
    logger.debug(f"Could not read DICOM {dcm_file}: {e}")
    # Treat as standalone - don't exclude

7. Redundant MPP Correction Factor Calculation

File: _openslide_handler.py:163

mpp_correction_factor = self._get_mpp_correction_factor(props) if "tiff.XResolution" in props else 1.0

Then again at line 211:

mpp_correction_factor = self._get_mpp_correction_factor(props)

Issue: The conditional check (if "tiff.XResolution" in props) is applied at line 163 but not at line 211, so the two call sites can return different factors for the same properties.

Suggestion: Either:

  1. Make _get_mpp_correction_factor() handle this check internally, OR
  2. Use the same pattern in both places for consistency
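Option 1 could be sketched as a single guarded entry point that both call sites use. The helper name and the idea of passing the detector in as a callable are assumptions made to keep the example self-contained; in the real handler the guard would more likely sit at the top of `_get_mpp_correction_factor()` itself.

```python
from typing import Callable, Mapping


def guarded_mpp_correction(
    props: Mapping[str, str],
    detect: Callable[[Mapping[str, str]], float],
) -> float:
    """Run the legacy detector only when TIFF resolution metadata is present."""
    if "tiff.XResolution" not in props:
        return 1.0  # no TIFF-derived resolution, so nothing to correct
    return detect(props)
```

Keeping the check in one place means a future call site cannot forget it.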

✅ POSITIVE ASPECTS

  1. Critical bug fix: Addresses real-world compatibility issues with legacy TIFF files
  2. Performance optimization: Avoids redundant processing of multi-file DICOM series
  3. Clean separation: New methods are well-encapsulated
  4. Good logging: Debug messages help with troubleshooting
  5. Type safety: MyPy passes with no issues
  6. Documentation inline: Docstrings explain the libvips bug clearly

🔍 ARCHITECTURAL COMPLIANCE

Modulith principles: Changes are within appropriate module boundaries
Service pattern: No architectural violations
Dependencies: Appropriate use of pydicom and packaging libraries
No breaking changes: Backward compatible


📋 ACTION ITEMS

Before merge:

  1. Fix 5 linting errors (see section 1 above)
  2. Add test coverage for both new methods (see section 2)
  3. Update PR description with problem/solution/testing
  4. Verify tests pass: make test_unit
  5. Verify linting passes: make lint

Optional but recommended:

  1. Update commit messages to follow conventional commits
  2. Update CLAUDE.md documentation for both modules
  3. Consider more specific exception handling in DICOM filtering

🎯 VERIFICATION COMMANDS

# Fix linting issues
uv run ruff check --fix src/aignostics/application/_service.py src/aignostics/wsi/_openslide_handler.py

# Run full lint check
make lint

# Run tests
make test_unit

# Check coverage
uv run pytest --cov=aignostics.application --cov=aignostics.wsi tests/aignostics/application tests/aignostics/wsi

🏥 MEDICAL DEVICE CONTEXT REMINDER

This SDK processes medical images used in cancer diagnosis. The fixes in this PR directly impact:

  • MPP accuracy: Critical for measuring tumor sizes and features
  • DICOM processing: Must comply with DICOM standards
  • Reproducibility: Same image must produce same results

Therefore: Test coverage is not optional—it's a regulatory requirement.


Overall Assessment: Good fix for real-world issues, but needs linting fixes and test coverage before merge. Medical device software demands comprehensive testing.

@codecov

codecov bot commented Nov 20, 2025

❌ 1 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 526 | 1 | 525 | 3 |
View the full list of 1 ❄️ flaky test(s)
tests.aignostics.wsi.cli_test::test_inspect_openslide_dicom

Flake rate in main: 20.00% (Passed 8 times, Failed 2 times)

Stack Traces | 0.082s run time
runner = <typer.testing.CliRunner object at 0x7f64eda734d0>
record_property = <function record_property.<locals>.append_property at 0x7f64edaa4c20>

    @pytest.mark.integration
    @pytest.mark.timeout(timeout=60 * 5)
    def test_inspect_openslide_dicom(runner: CliRunner, record_property) -> None:
        """Check expected column returned."""
        record_property("tested-item-id", "SPEC-WSI-SERVICE")
        file_path = Path(__file__).parent.parent.parent / "resources" / "run" / "small-pyramidal.dcm"
        result = runner.invoke(cli, ["wsi", "inspect", str(file_path)])
        assert result.exit_code == 0
>       assert all(
            index in result.output
            for index in [
                "Format: dicom",
                "MPP (x): 8.065226874391001",
                "MPP (y): 8.065226874391001",
                "Dimensions: 2054 x 1529 pixels",
                "Tile size: 256 x 256 pixels",
            ]
        )
E       assert False
E        +  where False = all(<generator object test_inspect_openslide_dicom.<locals>.<genexpr> at 0x7f64edbda330>)

.../aignostics/wsi/cli_test.py:26: AssertionError


* Add mpp correction factor for tiffs written using older versions of
vips containing a bug that wrote resolution as px / mm instead of cm.
* Select only highest resolution file from DICOM series in multiple files.
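The 0.1 factor follows from the unit mix-up described in the commit message. The arithmetic below is a worked sketch under stated assumptions: the stored resolution value is made up, and the formula (micrometres per unit divided by pixels per unit) is the usual way to derive MPP from a resolution tag, not necessarily the exact code path in the handler.

```python
import math

UM_PER_CM = 10_000
UM_PER_MM = 1_000


def mpp_from_resolution(pixels_per_unit: float, um_per_unit: float) -> float:
    """Microns per pixel for a resolution given in pixels per length unit."""
    return um_per_unit / pixels_per_unit


stored = 4000.0  # hypothetical: a px/mm value written where px/cm was expected
naive_mpp = mpp_from_resolution(stored, UM_PER_CM)  # reader trusts the cm unit
true_mpp = mpp_from_resolution(stored, UM_PER_MM)  # the value was really px/mm
correction = true_mpp / naive_mpp

assert math.isclose(correction, 0.1)
```

Whatever the stored value, the ratio is `UM_PER_MM / UM_PER_CM`, which is why a constant factor is sufficient.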
@blanca-pablos blanca-pablos force-pushed the fix/legacy-mpp-multiple-dcm-files branch from c504295 to 55c4343 Compare November 20, 2025 10:34
@github-actions
Contributor

github-actions bot commented Nov 20, 2025

Claude finished @blanca-pablos's task —— View job


PR Review: Fix legacy mpp factor + multiple dcm files

Comprehensive review of PR #270 for medical device SDK completed.

Todo List:

  • Read changed files to understand implementation
  • Check linting status
  • Investigate failing test
  • Check test coverage for new code
  • Review architecture compliance
  • Check commit messages
  • Verify medical device considerations
  • Provide final review summary

Summary

This PR addresses two important medical imaging compatibility issues:

  1. Legacy MPP correction for TIFF files from buggy libvips <8.8.3
  2. DICOM series filtering to avoid processing duplicate files from multi-file series

Overall Assessment: Good fix for real-world issues, but has 1 CRITICAL bug causing test failure, missing test coverage (regulatory requirement), and several smaller issues.


🔴 BLOCKING ISSUES (Must Fix Before Merge)

1. CRITICAL BUG: MPP Correction Applied to DICOM Files ❌

Location: src/aignostics/wsi/_openslide_handler.py:212

Issue: The MPP correction factor is being incorrectly applied to DICOM files, breaking the test and producing wrong MPP values (10x error in medical measurements!).

Root Cause: Inconsistent logic between line 164 and line 212.

  • Line 164 (✅ correct): mpp_correction_factor = self._get_mpp_correction_factor(props) if "tiff.XResolution" in props else 1.0
  • Line 212 (❌ wrong): mpp_correction_factor = self._get_mpp_correction_factor(props)

Evidence:

  • Test failure: expects MPP (x): 8.065226874391001, got MPP (x): 0.8065226874391002 (10x smaller)
  • DICOM files have no tiff.ImageDescription, so function hits exception and returns LEGACY_MPP_FACTOR = 0.1
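To make the size of the error concrete, the two values quoted above can be compared directly. This is only arithmetic on the numbers in this review; the variable names are illustrative.

```python
import math

expected_mpp = 8.065226874391001  # value asserted by the failing test
reported_mpp = 0.8065226874391002  # value quoted above as the observed output

ratio = reported_mpp / expected_mpp

# A length of N pixels is N * mpp micrometres, so lengths scale by `ratio`
# and areas scale by `ratio ** 2`.
length_scale = ratio
area_scale = ratio**2

assert math.isclose(length_scale, 0.1)
assert math.isclose(area_scale, 0.01)
```

So a spurious 0.1 factor shrinks reported lengths tenfold and reported areas a hundredfold.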

Fix:

# Line 212 should be:
mpp_correction_factor = self._get_mpp_correction_factor(props) if "tiff.XResolution" in props else 1.0

Why Critical: In computational pathology, incorrect MPP can lead to:

  • Wrong tumor size measurements
  • Incorrect cell counting
  • Failed regulatory audits
  • Potential misdiagnosis

Verify Fix:

uv run pytest tests/aignostics/wsi/cli_test.py::test_inspect_openslide_dicom -v

2. Missing Test Coverage (Regulatory Requirement) ❌

No tests exist for the new functionality:

A. _filter_dicom_series_files() (application/_service.py:314-359)

  • Critical logic for multi-file DICOM series selection
  • Zero test coverage

B. _get_mpp_correction_factor() (wsi/_openslide_handler.py:67-93)

  • Handles libvips version detection and MPP scaling
  • Zero test coverage

Required Test Cases:

For DICOM filtering (tests/aignostics/application/service_test.py):

@pytest.mark.unit
def test_filter_dicom_series_single_file(tmp_path):
    """Single DICOM file should not be filtered."""
    
@pytest.mark.unit
def test_filter_dicom_series_multiple_files_same_series(tmp_path):
    """Keep highest resolution file from multi-file series."""
    
@pytest.mark.unit
def test_filter_dicom_series_multiple_series(tmp_path):
    """Filter each series independently."""
    
@pytest.mark.unit
def test_filter_dicom_series_invalid_files(tmp_path):
    """Files that fail to read should not be excluded."""

For MPP correction (tests/aignostics/wsi/openslide_handler_test.py or service_test.py):

@pytest.mark.unit
def test_mpp_correction_legacy_libvips():
    """libvips <8.8.3 should apply 0.1 correction."""
    props = {"tiff.ImageDescription": '<vips libVips-version="8.7.0" />'}
    # Assert correction = 0.1
    
@pytest.mark.unit
def test_mpp_correction_modern_libvips():
    """libvips ≥8.8.3 should apply 1.0 correction."""
    props = {"tiff.ImageDescription": '<vips libVips-version="8.8.3" />'}
    # Assert correction = 1.0
    
@pytest.mark.unit
def test_mpp_correction_no_version():
    """Missing version should default to 0.1 (legacy)."""
    props = {"tiff.ImageDescription": '<vips />'}
    # Assert correction = 0.1
    
@pytest.mark.unit
def test_mpp_correction_no_tiff_metadata():
    """Non-TIFF files should use 1.0 correction."""
    props = {}  # No tiff.ImageDescription
    # Assert correction = 1.0

Why Critical:

  • This is FDA/MDR regulated medical device software
  • 85% coverage requirement must be maintained (currently: 20% on new code per SonarQube)
  • Test coverage is a regulatory requirement, not optional

Action Required: Add comprehensive unit tests before merge.


3. Quality Gate Failures ❌

SonarQube Report:

  • ❌ 1 New issue
  • ❌ 1 Security Hotspot (needs review)
  • 20.0% Coverage on New Code (required ≥ 80%)

Action: Address SonarQube findings and add tests to meet 80% coverage threshold.


🟡 IMPORTANT ISSUES (Should Fix)

4. Commit Message Convention Violation

Current commits:

55c4343 Update wsi info command in README
9e89446 Add mpp correction factor and highest res .dcm selection.

Issues:

  • Both commits are missing the conventional commit prefix
  • Second commit has period at end (should be removed per convention)

Should be:

docs(wsi): update wsi info command in README
fix(wsi): add mpp correction factor and highest res dcm selection

Why it matters: Conventional commits enable automatic changelog generation per cliff.toml.

Reference: See root CLAUDE.md - "Commit Convention" section.


5. Missing PR Description

Issue: No PR description provided. For medical device software, this is concerning.

Should include:

  • Problem statement: What bugs are being fixed?
  • Root cause: Why did these issues occur?
  • Solution approach: How are they being fixed?
  • Testing strategy: How was it validated? (Manual testing is not sufficient)
  • References: Link to libvips bug report, DICOM specification

Example:

## Problem
1. TIFF files exported with libvips <8.8.3 have incorrect MPP values (10x off)
2. Multi-file DICOM series cause duplicate processing of same image data

## Solution
1. Detect libvips version from TIFF metadata, apply 0.1 correction for buggy versions
2. Filter DICOM files by SeriesInstanceUID, keeping highest resolution per series

## Testing
- ✅ Linting passes
- ❌ Need unit tests for new functionality (see review)
- ✅ Manual testing with legacy TIFF files
- ✅ Manual testing with multi-file DICOM series

## References
- libvips bug: https://github.com/libvips/libvips/issues/1421
- DICOM SeriesInstanceUID: DICOM PS3.3 Section C.7.3.1

6. Documentation Updates Needed

Files to update:

A. src/aignostics/wsi/CLAUDE.md

  • Document the libvips MPP scaling bug and correction logic
  • Explain DICOM series filtering behavior
  • Note OpenSlide's automatic multi-file DICOM series detection

B. src/aignostics/application/CLAUDE.md

  • Document DICOM series filtering in metadata generation
  • Explain why only highest resolution file is included per series

Why: The root CLAUDE.md states: "Each module has detailed CLAUDE.md documentation" - these should be kept up to date.


🟢 POSITIVE ASPECTS

  1. Critical bug fix: Addresses real-world compatibility with legacy TIFF files
  2. Performance optimization: Avoids redundant processing of multi-file DICOM series
  3. Clean code: Methods are well-encapsulated and documented
  4. Good logging: Debug messages help with troubleshooting
  5. Linting passes: All 4 checks (ruff format, ruff check, pyright, mypy) pass
  6. Type safety: No type checking issues
  7. Good inline docs: Docstrings explain the libvips bug clearly
  8. Architecture compliance: No modulith boundary violations

🏥 MEDICAL DEVICE CONSIDERATIONS

Severity of Issues

MPP accuracy is critical in computational pathology:

  • Used for tumor size measurements (T-stage in TNM classification)
  • Cell density calculations
  • Feature extraction for AI/ML models
  • Regulatory compliance (FDA/MDR)

The bug at line 212 would cause:

  • All DICOM files to have 10x smaller MPP values
  • Linear measurements to be 10x off (and area measurements 100x off)
  • Potential misdiagnosis if used clinically

Regulatory Context

Per repository guidelines (CLAUDE.md - "Medical Domain Context"):

  • This is HIPAA-compliant medical data processing
  • DICOM standard compliance required
  • Test coverage is regulatory requirement (not "nice to have")
  • FDA/MDR submission may require test documentation

📋 ACTION ITEMS

Must Fix Before Merge:

  1. Fix the MPP correction bug (line 212 in _openslide_handler.py)

    mpp_correction_factor = self._get_mpp_correction_factor(props) if "tiff.XResolution" in props else 1.0
  2. Add comprehensive unit tests for:

    • _filter_dicom_series_files() (5+ test cases)
    • _get_mpp_correction_factor() (4+ test cases)
  3. Verify all tests pass:

    make test_unit
    uv run pytest tests/aignostics/wsi/cli_test.py::test_inspect_openslide_dicom -v
  4. Verify linting still passes:

    make lint
  5. Check coverage (must be ≥85% overall, ≥80% on new code):

    uv run pytest --cov=aignostics.application --cov=aignostics.wsi \
      tests/aignostics/application tests/aignostics/wsi

Should Fix (Recommended):

  1. Add PR description with problem/solution/testing/references
  2. Update commit messages to follow conventional commits format
  3. Update CLAUDE.md documentation for both modules
  4. Address SonarQube security hotspot

🔍 ARCHITECTURAL REVIEW

Modulith principles: Changes within appropriate module boundaries
Service pattern: _filter_dicom_series_files() is static method, appropriate
Dependencies: Proper use of pydicom, packaging.version
No circular imports: Clean dependency graph
No breaking changes: Backward compatible
Error handling: Appropriate exception handling in DICOM filtering


🎯 VERIFICATION COMMANDS

After fixing the bug and adding tests:

# Fix linting if needed
make lint

# Run specific failing test
uv run pytest tests/aignostics/wsi/cli_test.py::test_inspect_openslide_dicom -v

# Run all WSI tests
uv run pytest tests/aignostics/wsi/ -v

# Run all application tests  
uv run pytest tests/aignostics/application/ -v

# Check coverage
uv run pytest --cov=aignostics.application --cov=aignostics.wsi \
  --cov-report=term-missing tests/aignostics/application tests/aignostics/wsi

# Full test suite
make test

📊 REVIEW SUMMARY

| Category | Status | Details |
|---|---|---|
| Linting | ✅ Pass | All checks pass |
| Type Checking | ✅ Pass | MyPy & PyRight pass |
| Tests | 1 Failure | test_inspect_openslide_dicom |
| Test Coverage | Critical | 20% on new code (need 80%) |
| Architecture | ✅ Pass | No violations |
| Security | 🟡 Review | 1 SonarQube hotspot |
| Documentation | 🟡 Missing | No PR description, CLAUDE.md updates needed |
| Commits | 🟡 Non-standard | Not following convention |

🏁 FINAL VERDICT

Cannot merge until:

  1. Line 212 bug is fixed (MPP correction for DICOM)
  2. Comprehensive unit tests are added
  3. All tests are passing
  4. Coverage is ≥80% on new code

This is excellent work addressing real-world medical imaging issues, but medical device software requires comprehensive testing. The line 212 bug is critical and would cause 10x measurement errors in DICOM files.

Recommendation: Fix the bug first, then add tests, then we can discuss documentation improvements in a follow-up.


@github-actions
Contributor

github-actions bot commented Nov 22, 2025

Claude encountered an error —— View job


I'll analyze this and get back to you.

@sonarqubecloud

Quality Gate failed

Failed conditions
1 New issue
1 Security Hotspot
14.3% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud



3 participants