
Conversation

@Jiaqi-Lv
Collaborator

@Jiaqi-Lv Jiaqi-Lv commented Nov 6, 2025

This PR defines a new NucleusDetector engine, to be used by nucleus detection models such as MapDe and KongNet.

After obtaining the WSI-level segmentation map (a dask array from SemanticSegmentor.infer_wsi()), we use dask map_overlap to apply the model's postproc function across the entire segmentation map. map_overlap runs postproc on each chunk in parallel and then automatically merges the results back into a single dask array, the "detection_map".
The "detection_map" has the same dimensions as the segmentation map, but pixels at nucleus centroids hold the detection probability (or 1 if the model does not produce detection probabilities).
The "detection_map" is then converted into "detection_records" using dask map_blocks, which processes chunks in parallel. Each chunk is reduced to a tuple of numpy arrays, [[x_coords], [y_coords], [class_ids], [probs]], representing the detected nuclei, and the records are saved into an SQLiteStore chunk by chunk.

Currently I have only tested it with MapDe models; SCCNN detects 0 nuclei, but I think there is something wrong with the model itself (#969).
Also, it currently only saves the results as an SQLiteStore. We could additionally save the "detection_map" as zarr; I am not sure why anyone would want this for nucleus detection, but I can add it anyway.
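
For reference, persisting the intermediate map would be a one-liner with dask's zarr support (the output path is illustrative):

detection_map.to_zarr("detection_map.zarr")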

  • Copy the code over from the old PR so we can close it: ✨ NEW: Add Nucleus Detection Engine #538.
  • Add NucleusDetector engine
  • Update existing detection models (MapDe is done, SCCNN has a bug)
  • Test "patch mode" inference
  • Add unit tests

@Jiaqi-Lv Jiaqi-Lv added the enhancement New feature or request label Nov 6, 2025
@shaneahmed
Member

This PR will take over #538

@Jiaqi-Lv Jiaqi-Lv marked this pull request as draft November 7, 2025 17:29
@Jiaqi-Lv Jiaqi-Lv self-assigned this Nov 7, 2025
Contributor

Copilot AI left a comment


Pull Request Overview

This PR introduces a new NucleusDetector engine for nucleus detection models (MapDe, SCCNN) that processes whole slide images (WSI) using Dask for parallel processing. The detection pipeline performs WSI-level segmentation, applies post-processing via map_overlap, and converts results to detection records stored in SQLiteStore.

Key Changes:

  • New NucleusDetector class extending SemanticSegmentor with specialized post-processing for nucleus detection
  • Modified postproc methods in MapDe and SCCNN models to support chunk-based processing with Dask
  • Updated model configurations to include postproc_tile_shape parameter and corrected ioconfig class paths

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 14 comments.

Summary per file:

tiatoolbox/models/engine/nucleus_detector.py: New NucleusDetector engine with methods for WSI-level detection, NMS, chunk processing, and AnnotationStore conversion
tiatoolbox/models/engine/engine_abc.py: Updated ProgressBar import path to use the specific module
tiatoolbox/models/engine/__init__.py: Added nucleus_detector to module exports
tiatoolbox/models/architecture/sccnn.py: Refactored postproc method for Dask compatibility, updated infer_batch return type, added postproc_tile_shape parameter
tiatoolbox/models/architecture/mapde.py: Refactored postproc method for Dask compatibility, updated infer_batch return type, added postproc_tile_shape parameter
tiatoolbox/data/pretrained_model.yaml: Added postproc_tile_shape to model configs, corrected ioconfig class paths, removed tile_shape from ioconfig sections
tests/engines/test_nucleus_detection_engine.py: Placeholder test file with commented-out test cases (needs implementation)
test.py: Temporary test script with hardcoded paths (should be removed or gitignored)
docs/pretrained.rst: Corrected resolution values from 0.25 to 0.5 mpp in ioconfig examples
Comments suppressed due to low confidence (4)

tiatoolbox/models/engine/nucleus_detector.py:286

  • Variable r2 is not used.
        r2 = float(radius) * float(radius)

tiatoolbox/models/engine/nucleus_detector.py:221

  • This assignment to 'scale_factor' is unnecessary as it is redefined before this value is used.
        scale_factor = kwargs.get("scale_factor", (1.0, 1.0))

tiatoolbox/models/engine/nucleus_detector.py:223

  • This assignment to 'class_dict' is unnecessary as it is redefined before this value is used.
        class_dict = kwargs.get("class_dict")

tiatoolbox/models/engine/nucleus_detector.py:285

  • This assignment to 'coords' is unnecessary as it is redefined before this value is used.
        coords = sub[["x", "y"]].to_numpy(dtype=np.float64)


Comment on lines +1 to +26
import pathlib

from tiatoolbox.models.engine.nucleus_detector import (
    NucleusDetector,
)
from tiatoolbox.utils import env_detection as toolbox_env

ON_GPU = not toolbox_env.running_on_ci() and toolbox_env.has_gpu()


if __name__ == "__main__":
    # model_name = "sccnn-crchisto"
    model_name = "mapde-conic"

    detector = NucleusDetector(model=model_name, batch_size=16, num_workers=8)
    detector.run(
        images=[pathlib.Path("/media/u1910100/data/slides/CMU-1-Small-Region.svs")],
        patch_mode=False,
        device="cuda",
        save_dir=pathlib.Path("/media/u1910100/data/overlays/test"),
        overwrite=True,
        output_type="annotationstore",
        class_dict={0: "nucleus"},
        auto_get_mask=True,
        memory_threshold=80,
    )
Copilot AI Nov 12, 2025

This test file contains hardcoded absolute paths (/media/u1910100/data/...) that are specific to a development environment. This file should either be removed before merging or added to .gitignore as it appears to be a temporary testing script rather than part of the codebase.

sub = df.sort_values("prob", ascending=False).reset_index(drop=True)

# Coordinates as float64 for distance math
coords = sub[["x", "y"]].to_numpy(dtype=np.float64)
Copilot AI Nov 12, 2025

Duplicate code: Variable coords is assigned twice on lines 285 and 288. Line 288 reassigns coords with different values, making line 285 redundant. Consider removing the first assignment or renaming one of the variables.

Suggested change
coords = sub[["x", "y"]].to_numpy(dtype=np.float64)


# Coordinates as float64 for distance math
coords = sub[["x", "y"]].to_numpy(dtype=np.float64)
r2 = float(radius) * float(radius)
Copilot AI Nov 12, 2025

Variable r2 is assigned but never used (line 286). This appears to be dead code that should be removed.

Suggested change
r2 = float(radius) * float(radius)

Comment on lines +334 to +336
def _chunk_to_records(
    block: np.ndarray, block_info: dict
) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
Copilot AI Nov 12, 2025

Missing type hint for return type. The function signature shows -> Tuple[...] but Tuple is not imported. Add from typing import Tuple to the imports at the top of the file, or use tuple (lowercase) which is the built-in Python 3.9+ syntax.
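
For example, with the built-in generic (Python 3.9+) no typing import is needed; the body is elided here and the signature is shown only to illustrate the annotation:

def _chunk_to_records(
    block: np.ndarray, block_info: dict
) -> tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
    ...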

Comment on lines +375 to +376
def _write_records_to_store(
    recs: Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray],
Copilot AI Nov 12, 2025

Missing type hint for parameter. The recs parameter has type Tuple[...] but Tuple is not imported. Add from typing import Tuple to the imports at the top of the file, or use tuple (lowercase) which is the built-in Python 3.9+ syntax.

def _write_records_to_store(
    recs: Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray],
    store: SQLiteStore,
    scale_factor: Tuple[float, float],
Copilot AI Nov 12, 2025

Missing type hint for parameter. The scale_factor parameter has type Tuple[float, float] but Tuple is not imported. Add from typing import Tuple to the imports at the top of the file, or use tuple (lowercase) which is the built-in Python 3.9+ syntax.

Comment on lines +4 to +57

ON_GPU = not toolbox_env.running_on_ci() and toolbox_env.has_gpu()


# def _rm_dir(path):
# """Helper func to remove directory."""
# if pathlib.Path(path).exists():
# shutil.rmtree(path, ignore_errors=True)


# def check_output(path):
# """Check NucleusDetector output."""
# coordinates = pd.read_csv(path)
# assert coordinates.x[0] == pytest.approx(53, abs=2)
# assert coordinates.x[1] == pytest.approx(55, abs=2)
# assert coordinates.y[0] == pytest.approx(107, abs=2)
# assert coordinates.y[1] == pytest.approx(127, abs=2)


# def test_nucleus_detector_engine(remote_sample, tmp_path):
# """Test for nucleus detection engine."""
# mini_wsi_svs = pathlib.Path(remote_sample("wsi4_512_512_svs"))

# nucleus_detector = NucleusDetector(pretrained_model="mapde-conic")
# _ = nucleus_detector.predict(
# [mini_wsi_svs],
# mode="wsi",
# save_dir=tmp_path / "output",
# on_gpu=ON_GPU,
# )

# check_output(tmp_path / "output" / "0.locations.0.csv")

# _rm_dir(tmp_path / "output")

# ioconfig = IONucleusDetectorConfig(
# input_resolutions=[{"units": "mpp", "resolution": 0.5}],
# output_resolutions=[{"units": "mpp", "resolution": 0.5}],
# save_resolution=None,
# patch_input_shape=[252, 252],
# patch_output_shape=[252, 252],
# stride_shape=[150, 150],
# )

# nucleus_detector = NucleusDetector(pretrained_model="mapde-conic")
# _ = nucleus_detector.predict(
# [mini_wsi_svs],
# mode="wsi",
# save_dir=tmp_path / "output",
# on_gpu=ON_GPU,
# ioconfig=ioconfig,
# )

# check_output(tmp_path / "output" / "0.locations.0.csv")
Copilot AI Nov 12, 2025

This comment appears to contain commented-out code.

Suggested change
ON_GPU = not toolbox_env.running_on_ci() and toolbox_env.has_gpu()
# def _rm_dir(path):
# """Helper func to remove directory."""
# if pathlib.Path(path).exists():
# shutil.rmtree(path, ignore_errors=True)
# def check_output(path):
# """Check NucleusDetector output."""
# coordinates = pd.read_csv(path)
# assert coordinates.x[0] == pytest.approx(53, abs=2)
# assert coordinates.x[1] == pytest.approx(55, abs=2)
# assert coordinates.y[0] == pytest.approx(107, abs=2)
# assert coordinates.y[1] == pytest.approx(127, abs=2)
# def test_nucleus_detector_engine(remote_sample, tmp_path):
# """Test for nucleus detection engine."""
# mini_wsi_svs = pathlib.Path(remote_sample("wsi4_512_512_svs"))
# nucleus_detector = NucleusDetector(pretrained_model="mapde-conic")
# _ = nucleus_detector.predict(
# [mini_wsi_svs],
# mode="wsi",
# save_dir=tmp_path / "output",
# on_gpu=ON_GPU,
# )
# check_output(tmp_path / "output" / "0.locations.0.csv")
# _rm_dir(tmp_path / "output")
# ioconfig = IONucleusDetectorConfig(
# input_resolutions=[{"units": "mpp", "resolution": 0.5}],
# output_resolutions=[{"units": "mpp", "resolution": 0.5}],
# save_resolution=None,
# patch_input_shape=[252, 252],
# patch_output_shape=[252, 252],
# stride_shape=[150, 150],
# )
# nucleus_detector = NucleusDetector(pretrained_model="mapde-conic")
# _ = nucleus_detector.predict(
# [mini_wsi_svs],
# mode="wsi",
# save_dir=tmp_path / "output",
# on_gpu=ON_GPU,
# ioconfig=ioconfig,
# )
# check_output(tmp_path / "output" / "0.locations.0.csv")
import pathlib
import shutil

import pandas as pd
import pytest

from tiatoolbox.tools.engine import NucleusDetector, IONucleusDetectorConfig

ON_GPU = not toolbox_env.running_on_ci() and toolbox_env.has_gpu()


def _rm_dir(path):
    """Helper func to remove directory."""
    if pathlib.Path(path).exists():
        shutil.rmtree(path, ignore_errors=True)


def check_output(path):
    """Check NucleusDetector output."""
    coordinates = pd.read_csv(path)
    assert coordinates.x[0] == pytest.approx(53, abs=2)
    assert coordinates.x[1] == pytest.approx(55, abs=2)
    assert coordinates.y[0] == pytest.approx(107, abs=2)
    assert coordinates.y[1] == pytest.approx(127, abs=2)


def test_nucleus_detector_engine(remote_sample, tmp_path):
    """Test for nucleus detection engine."""
    mini_wsi_svs = pathlib.Path(remote_sample("wsi4_512_512_svs"))

    nucleus_detector = NucleusDetector(pretrained_model="mapde-conic")
    _ = nucleus_detector.predict(
        [mini_wsi_svs],
        mode="wsi",
        save_dir=tmp_path / "output",
        on_gpu=ON_GPU,
    )
    check_output(tmp_path / "output" / "0.locations.0.csv")

    _rm_dir(tmp_path / "output")

    ioconfig = IONucleusDetectorConfig(
        input_resolutions=[{"units": "mpp", "resolution": 0.5}],
        output_resolutions=[{"units": "mpp", "resolution": 0.5}],
        save_resolution=None,
        patch_input_shape=[252, 252],
        patch_output_shape=[252, 252],
        stride_shape=[150, 150],
    )

    nucleus_detector = NucleusDetector(pretrained_model="mapde-conic")
    _ = nucleus_detector.predict(
        [mini_wsi_svs],
        mode="wsi",
        save_dir=tmp_path / "output",
        on_gpu=ON_GPU,
        ioconfig=ioconfig,
    )
    check_output(tmp_path / "output" / "0.locations.0.csv")
