🆕 Define NucleusDetector
#967
base: dev-define-engines-abc
Conversation
This PR will take over #538
for more information, see https://pre-commit.ci
Pull Request Overview
This PR introduces a new NucleusDetector engine for nucleus detection models (MapDe, SCCNN) that processes whole slide images (WSI) using Dask for parallel processing. The detection pipeline performs WSI-level segmentation, applies post-processing via map_overlap, and converts results to detection records stored in SQLiteStore.
Key Changes:
- New `NucleusDetector` class extending `SemanticSegmentor`, with specialized post-processing for nucleus detection
- Modified `postproc` methods in the MapDe and SCCNN models to support chunk-based processing with Dask
- Updated model configurations to include a `postproc_tile_shape` parameter and corrected ioconfig class paths
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| tiatoolbox/models/engine/nucleus_detector.py | New NucleusDetector engine with methods for WSI-level detection, NMS, chunk processing, and AnnotationStore conversion |
| tiatoolbox/models/engine/engine_abc.py | Updated ProgressBar import path to use specific module |
| tiatoolbox/models/engine/__init__.py | Added nucleus_detector to module exports |
| tiatoolbox/models/architecture/sccnn.py | Refactored postproc method for Dask compatibility, updated infer_batch return type, added postproc_tile_shape parameter |
| tiatoolbox/models/architecture/mapde.py | Refactored postproc method for Dask compatibility, updated infer_batch return type, added postproc_tile_shape parameter |
| tiatoolbox/data/pretrained_model.yaml | Added postproc_tile_shape to model configs, corrected ioconfig class paths, removed tile_shape from ioconfig sections |
| tests/engines/test_nucleus_detection_engine.py | Placeholder test file with commented-out test cases (needs implementation) |
| test.py | Temporary test script with hardcoded paths (should be removed or gitignored) |
| docs/pretrained.rst | Corrected resolution values from 0.25 to 0.5 mpp in ioconfig examples |
Comments suppressed due to low confidence (4)
- `tiatoolbox/models/engine/nucleus_detector.py:286`: Variable `r2` is not used (`r2 = float(radius) * float(radius)`)
- `tiatoolbox/models/engine/nucleus_detector.py:221`: This assignment to `scale_factor` is unnecessary, as it is redefined before the value is used (`scale_factor = kwargs.get("scale_factor", (1.0, 1.0))`)
- `tiatoolbox/models/engine/nucleus_detector.py:223`: This assignment to `class_dict` is unnecessary, as it is redefined before the value is used (`class_dict = kwargs.get("class_dict")`)
- `tiatoolbox/models/engine/nucleus_detector.py:285`: This assignment to `coords` is unnecessary, as it is redefined before the value is used (`coords = sub[["x", "y"]].to_numpy(dtype=np.float64)`)
```python
import pathlib

from tiatoolbox.models.engine.nucleus_detector import (
    NucleusDetector,
)
from tiatoolbox.utils import env_detection as toolbox_env

ON_GPU = not toolbox_env.running_on_ci() and toolbox_env.has_gpu()


if __name__ == "__main__":
    # model_name = "sccnn-crchisto"
    model_name = "mapde-conic"

    detector = NucleusDetector(model=model_name, batch_size=16, num_workers=8)
    detector.run(
        images=[pathlib.Path("/media/u1910100/data/slides/CMU-1-Small-Region.svs")],
        patch_mode=False,
        device="cuda",
        save_dir=pathlib.Path("/media/u1910100/data/overlays/test"),
        overwrite=True,
        output_type="annotationstore",
        class_dict={0: "nucleus"},
        auto_get_mask=True,
        memory_threshold=80,
    )
```
Copilot AI (Nov 12, 2025):
This test file contains hardcoded absolute paths (/media/u1910100/data/...) that are specific to a development environment. This file should either be removed before merging or added to .gitignore as it appears to be a temporary testing script rather than part of the codebase.
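As a hedged illustration of the reviewer's point (the helper name and directory layout below are hypothetical, not part of this PR), output paths can be derived from a temporary directory instead of a hardcoded mount, so the script runs on any machine:

```python
import tempfile
from pathlib import Path


def make_save_dir(base) -> Path:
    """Create and return a per-run output directory under `base`.

    Hypothetical helper: stands in for the hardcoded
    /media/u1910100/data/overlays/test path in test.py.
    """
    save_dir = Path(base) / "overlays" / "test"
    save_dir.mkdir(parents=True, exist_ok=True)
    return save_dir


with tempfile.TemporaryDirectory() as tmp:
    save_dir = make_save_dir(tmp)
    # detector.run(..., save_dir=save_dir)  # call site would go here
    print(save_dir.exists())  # True
```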
```python
sub = df.sort_values("prob", ascending=False).reset_index(drop=True)

# Coordinates as float64 for distance math
coords = sub[["x", "y"]].to_numpy(dtype=np.float64)
```
Duplicate code: Variable coords is assigned twice on lines 285 and 288. Line 288 reassigns coords with different values, making line 285 redundant. Consider removing the first assignment or renaming one of the variables.
```python
coords = sub[["x", "y"]].to_numpy(dtype=np.float64)
```
```python
coords = sub[["x", "y"]].to_numpy(dtype=np.float64)

# Coordinates as float64 for distance math
coords = sub[["x", "y"]].to_numpy(dtype=np.float64)
r2 = float(radius) * float(radius)
```
Variable r2 is assigned but never used (line 286). This appears to be dead code that should be removed.
```python
r2 = float(radius) * float(radius)
```
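For context, here is how a squared radius is typically spent in the kind of greedy, probability-ordered suppression this code appears to implement. This is a minimal NumPy sketch (the function name and exact semantics are assumptions, not the PR's code): comparing squared distances against `r2` avoids taking a square root per pair.

```python
import numpy as np


def radius_nms(coords: np.ndarray, probs: np.ndarray, radius: float) -> np.ndarray:
    """Greedy NMS: keep a detection only if its centre is farther than
    `radius` from every higher-probability detection already kept.

    Returns indices into the original arrays, highest probability first.
    """
    order = np.argsort(-probs)          # sort by probability, descending
    pts = coords[order].astype(np.float64)
    r2 = float(radius) * float(radius)  # squared radius: no sqrt needed below
    keep = []
    for i, pt in enumerate(pts):
        if all(np.sum((pt - pts[j]) ** 2) > r2 for j in keep):
            keep.append(i)
    return order[np.array(keep, dtype=int)]
```

For example, with two centres 1 px apart and a radius of 2, only the higher-probability one of the pair survives.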
```python
def _chunk_to_records(
    block: np.ndarray, block_info: dict
) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
```
Missing type hint for return type. The function signature shows -> Tuple[...] but Tuple is not imported. Add from typing import Tuple to the imports at the top of the file, or use tuple (lowercase) which is the built-in Python 3.9+ syntax.
```python
def _write_records_to_store(
    recs: Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray],
```
Missing type hint for parameter. The recs parameter has type Tuple[...] but Tuple is not imported. Add from typing import Tuple to the imports at the top of the file, or use tuple (lowercase) which is the built-in Python 3.9+ syntax.
```python
def _write_records_to_store(
    recs: Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray],
    store: SQLiteStore,
    scale_factor: Tuple[float, float],
```
Missing type hint for parameter. The scale_factor parameter has type Tuple[float, float] but Tuple is not imported. Add from typing import Tuple to the imports at the top of the file, or use tuple (lowercase) which is the built-in Python 3.9+ syntax.
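On the `Tuple` import issue flagged above: since Python 3.9 (PEP 585), the built-in `tuple` can be used directly in annotations, so no `from typing import Tuple` is needed. A tiny self-contained illustration (the function here is hypothetical, just demonstrating the syntax):

```python
import numpy as np


def chunk_extent(block: np.ndarray) -> tuple[float, float]:
    """Return (min, max) of a chunk, annotated with built-in generics.

    With lowercase `tuple[...]`, this runs on Python 3.9+ without
    `from typing import Tuple`.
    """
    return float(block.min()), float(block.max())
```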
```python
ON_GPU = not toolbox_env.running_on_ci() and toolbox_env.has_gpu()


# def _rm_dir(path):
# """Helper func to remove directory."""
# if pathlib.Path(path).exists():
# shutil.rmtree(path, ignore_errors=True)


# def check_output(path):
# """Check NucleusDetector output."""
# coordinates = pd.read_csv(path)
# assert coordinates.x[0] == pytest.approx(53, abs=2)
# assert coordinates.x[1] == pytest.approx(55, abs=2)
# assert coordinates.y[0] == pytest.approx(107, abs=2)
# assert coordinates.y[1] == pytest.approx(127, abs=2)


# def test_nucleus_detector_engine(remote_sample, tmp_path):
# """Test for nucleus detection engine."""
# mini_wsi_svs = pathlib.Path(remote_sample("wsi4_512_512_svs"))

# nucleus_detector = NucleusDetector(pretrained_model="mapde-conic")
# _ = nucleus_detector.predict(
# [mini_wsi_svs],
# mode="wsi",
# save_dir=tmp_path / "output",
# on_gpu=ON_GPU,
# )

# check_output(tmp_path / "output" / "0.locations.0.csv")

# _rm_dir(tmp_path / "output")

# ioconfig = IONucleusDetectorConfig(
# input_resolutions=[{"units": "mpp", "resolution": 0.5}],
# output_resolutions=[{"units": "mpp", "resolution": 0.5}],
# save_resolution=None,
# patch_input_shape=[252, 252],
# patch_output_shape=[252, 252],
# stride_shape=[150, 150],
# )

# nucleus_detector = NucleusDetector(pretrained_model="mapde-conic")
# _ = nucleus_detector.predict(
# [mini_wsi_svs],
# mode="wsi",
# save_dir=tmp_path / "output",
# on_gpu=ON_GPU,
# ioconfig=ioconfig,
# )

# check_output(tmp_path / "output" / "0.locations.0.csv")
```
This comment appears to contain commented-out code.
Suggested change:

```python
import pathlib
import shutil

import pandas as pd
import pytest

from tiatoolbox.tools.engine import NucleusDetector, IONucleusDetectorConfig

ON_GPU = not toolbox_env.running_on_ci() and toolbox_env.has_gpu()


def _rm_dir(path):
    """Helper func to remove directory."""
    if pathlib.Path(path).exists():
        shutil.rmtree(path, ignore_errors=True)


def check_output(path):
    """Check NucleusDetector output."""
    coordinates = pd.read_csv(path)
    assert coordinates.x[0] == pytest.approx(53, abs=2)
    assert coordinates.x[1] == pytest.approx(55, abs=2)
    assert coordinates.y[0] == pytest.approx(107, abs=2)
    assert coordinates.y[1] == pytest.approx(127, abs=2)


def test_nucleus_detector_engine(remote_sample, tmp_path):
    """Test for nucleus detection engine."""
    mini_wsi_svs = pathlib.Path(remote_sample("wsi4_512_512_svs"))

    nucleus_detector = NucleusDetector(pretrained_model="mapde-conic")
    _ = nucleus_detector.predict(
        [mini_wsi_svs],
        mode="wsi",
        save_dir=tmp_path / "output",
        on_gpu=ON_GPU,
    )

    check_output(tmp_path / "output" / "0.locations.0.csv")

    _rm_dir(tmp_path / "output")

    ioconfig = IONucleusDetectorConfig(
        input_resolutions=[{"units": "mpp", "resolution": 0.5}],
        output_resolutions=[{"units": "mpp", "resolution": 0.5}],
        save_resolution=None,
        patch_input_shape=[252, 252],
        patch_output_shape=[252, 252],
        stride_shape=[150, 150],
    )

    nucleus_detector = NucleusDetector(pretrained_model="mapde-conic")
    _ = nucleus_detector.predict(
        [mini_wsi_svs],
        mode="wsi",
        save_dir=tmp_path / "output",
        on_gpu=ON_GPU,
        ioconfig=ioconfig,
    )

    check_output(tmp_path / "output" / "0.locations.0.csv")
```
This PR defines a new `NucleusDetector` engine, to be used by nucleus detection models such as `MapDe` and `KongNet`.

After obtaining the WSI-level segmentation map (a Dask array from `SemanticSegmentor.infer_wsi()`), we use Dask's `map_overlap` to apply the model's `postproc` function across the entire WSI segmentation map. `map_overlap` lets the `postproc` function run on overlapping chunks in parallel and then automatically merges the results back into a Dask array, the "detection_map".

The "detection_map" has the same dimensions as the segmentation map, but the nuclei centroids hold the detection probability values; if the model does not produce a detection probability, the value is set to 1.
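To make the `map_overlap` step concrete, here is a pure-NumPy sketch of the idea (a 1-D toy, not the PR's implementation): each chunk is processed together with a halo of `depth` neighbouring pixels, and the halo is trimmed before stitching, so detections near chunk borders come out the same as if the whole array had been processed at once.

```python
import numpy as np


def local_peaks(x: np.ndarray) -> np.ndarray:
    """Mark strict local maxima in a 1-D signal (toy 'postproc')."""
    out = np.zeros_like(x)
    out[1:-1] = (x[1:-1] > x[:-2]) & (x[1:-1] > x[2:])
    return out


def map_overlap_1d(func, x: np.ndarray, chunk: int, depth: int) -> np.ndarray:
    """Conceptual stand-in for dask.array.map_overlap: run `func` on
    each chunk padded with `depth` neighbours, trim the halo, stitch."""
    n = len(x)
    pieces = []
    for start in range(0, n, chunk):
        lo = max(0, start - depth)
        hi = min(n, start + chunk + depth)
        result = func(x[lo:hi])
        # keep only the part of the result that belongs to this chunk
        pieces.append(result[start - lo : start - lo + min(chunk, n - start)])
    return np.concatenate(pieces)
```

Dask's real `map_overlap` performs this halo-apply-trim sequence per chunk in parallel, which is why the per-chunk `postproc` results merge back into a seam-free detection map (provided `depth` covers the postproc function's neighbourhood).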
The "detection_map" is converted into "detection_records" using Dask's `map_blocks`, which processes chunks in parallel. Each chunk is processed into a tuple of NumPy arrays, `([x_coords], [y_coords], [class_ids], [probs])`, representing the detected nuclei; the records are then saved into `SQLiteStore` chunk by chunk.

Currently I have only tested it with `mapde` models; `sccnn` detects 0 nuclei, but I think there is something wrong with the model itself (#969).

It also currently only saves the results as `SQLiteStore`. We could additionally save the "detection_map" as Zarr, although I am not sure why anyone would want that for nucleus detection; I can add it anyway.
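The chunk-to-records conversion described above can be pictured with a minimal NumPy sketch (the names and the block-offset convention here are illustrative, not the PR's `_chunk_to_records`): non-zero pixels of a detection-map chunk are read out as `(x, y, class_id, prob)` records, with the chunk's offset added so coordinates are global.

```python
import numpy as np


def chunk_to_records(block: np.ndarray, offset=(0, 0)):
    """Turn one detection-map chunk into flat detection records.

    Non-zero pixels are treated as nucleus centroids whose value is
    the detection probability; `offset` is the (row, col) position of
    the chunk's top-left corner in the full WSI detection map.
    """
    ys, xs = np.nonzero(block)
    probs = block[ys, xs].astype(np.float64)
    # shift local chunk coordinates into global WSI coordinates
    xs = xs.astype(np.float64) + offset[1]
    ys = ys.astype(np.float64) + offset[0]
    class_ids = np.zeros_like(probs)  # single "nucleus" class in this sketch
    return xs, ys, class_ids, probs
```

In the real pipeline, `map_blocks` would call a function like this on every chunk in parallel, using each chunk's `block_info` to derive the offset.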
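The chunk-by-chunk write into `SQLiteStore` can be pictured with a stdlib `sqlite3` stand-in (this is not tiatoolbox's `SQLiteStore` API, just an illustration of appending one record tuple per processed chunk):

```python
import sqlite3


def write_records(conn: sqlite3.Connection, recs) -> None:
    """Append one chunk's (xs, ys, class_ids, probs) record tuple."""
    xs, ys, class_ids, probs = recs
    conn.executemany(
        "INSERT INTO detections (x, y, class_id, prob) VALUES (?, ?, ?, ?)",
        zip(xs, ys, class_ids, probs),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE detections (x REAL, y REAL, class_id INTEGER, prob REAL)"
)
# two chunks' worth of hypothetical records
write_records(conn, ([10.0, 11.0], [5.0, 6.0], [0, 0], [0.9, 0.8]))
write_records(conn, ([42.0], [7.0], [0], [1.0]))
count = conn.execute("SELECT COUNT(*) FROM detections").fetchone()[0]
print(count)  # 3
```

Writing after each chunk keeps peak memory bounded by the chunk size rather than the total number of detections in the slide.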