Commit d2ba7b0

author
Fabian Hörst
committed
Extending preprocessing
1 parent a3a7f78 commit d2ba7b0

9 files changed

+333
-46
lines changed

README.md

+2-3
Original file line number | Diff line number | Diff line change
@@ -70,11 +70,10 @@ This repository contains the code implementation of CellViT, a deep learning-bas
7070

7171
1. Clone the repository:
7272
`git clone https://github.com/TIO-IKIM/CellViT.git`
73-
2. Create a conda environment with Python 3.9.7 version and install conda requirements: `conda env create -f environment.yml`. You can change the environment name by editing the `name` tag in the environment.yaml file.
73+
2. Create a conda environment with Python 3.10.12 and install the conda requirements: `conda env create -f environment.yml`. You can change the environment name by editing the `name` tag in the environment.yml file.
7474
This step is necessary, as we need to install `Openslide` with binary files. This is easier with conda. Otherwise, installation from [source](https://openslide.org/api/python/) needs to be performed and packages installed with pi
7575
3. Activate environment: `conda activate cellvit_env`
76-
4. Install torch for for system, as described [here](https://pytorch.org/get-started/locally/). Preferred version is 1.13, see [optional_dependencies](./optional_dependencies.txt) for help. You can find all version here: https://pytorch.org/get-started/previous-versions/
77-
Example for CUDA 11.7: `pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117`
76+
4. Install torch (>=2.0) for your system, as described [here](https://pytorch.org/get-started/locally/). The preferred version is 2.0; see [optional_dependencies](./optional_dependencies.txt) for help. You can find all versions here: https://pytorch.org/get-started/previous-versions/
7877

7978
5. Install optional dependencies `pip install -r optional_dependencies.txt` to get a speedup using [NVIDIA-Clara](https://www.nvidia.com/de-de/clara/) and [CuCIM](https://github.com/rapidsai/cucim) for preprocessing during inference. Please select your CUDA versions. Help for installing cucim can be found [online](https://github.com/rapidsai/cucim).
8079
**Note Error: cannot import name CuImage from cucim**

configs/examples/preprocessing/patch_extraction/patch_extraction.yaml

+9-1
@@ -14,9 +14,14 @@ patch_overlap: # The percentage amount pixels that should overlap
1414
downsample: # Each WSI level is downsampled by a factor of 2, downsample
1515
# expresses which kind of downsampling should be used with
1616
# respect to the highest possible resolution. [int][Optional, defaults to 0]
17+
target_mpp: # If this parameter is provided, the output level of the WSI
18+
# corresponds to the level that is at the target microns per pixel of the WSI.
19+
# Alternative to target_mag, downsample and level. Highest priority, overwrites all other setups for magnification, downsample, or level.
20+
# [float][Optional, defaults to None]
1721
target_mag: # If this parameter is provided, the output level of the WSI
1822
# corresponds to the level that is at the target magnification of the WSI.
19-
# Alternative to downsaple and level. [int][Optional, defaults to None]
23+
# Alternative to target_mpp, downsample and level. High priority: only target_mpp has a higher priority. Overwrites downsample and level if provided.
24+
# [float][Optional, defaults to None]
2025
level: # The tile level for sampling, alternative to downsample. [int][Optional, defaults to None]
2126
context_scales: # Define context scales for context patches. Context patches are centered around a central patch.
2227
# The context-patch size is equal to the patch-size, but downsampling is different.
@@ -56,8 +61,11 @@ tissue_annotation: # Can be used to name a polygon annotation to dete
5661
masked_otsu: # Use annotation to mask the thumbnail before otsu-thresholding is used. [bool][Optional, defaults to False]
5762
otsu_annotation: # Can be used to name a polygon annotation to determine the area
5863
# for masked otsu thresholding. [List][Optional, defaults to None]
64+
filter_patches: # Post-extraction patch filtering to sort out artefacts, marker and other non-tissue patches with a DL model. Time consuming.
65+
# [bool] [Optional, defaults to False]
5966

6067
# logging
6168
log_path: # Path where log files should be stored. Otherwise, log files are stored in the output folder. [str][Optional, defaults to None]
6269
log_level: # Set the logging level. [str][Optional, defaults to info]
6370
hardware_selection: # Select hardware device (just if available, otherwise always cucim). [str] [Optional, defaults to cucim]
71+
wsi_properties: # Dictionary with manual WSI metadata. Required keys are: ... TODO: add keys [dict] [Optional, default selection from files]
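
The comments above describe a priority order between the four resolution settings: `target_mpp` first, then `target_mag`, then `downsample`, then `level`. The following is a minimal sketch of that order, not the repository's implementation. The function name `resolve_downsample`, the parameters `base_mpp` and `base_mag` (resolution and magnification of the highest-resolution level), and the fallback of 1.0 when nothing is set are assumptions made for illustration.

```python
from typing import Optional


def resolve_downsample(
    base_mpp: float,
    base_mag: float,
    target_mpp: Optional[float] = None,
    target_mag: Optional[float] = None,
    downsample: Optional[int] = None,
    level: Optional[int] = None,
) -> float:
    """Return a downsampling factor relative to the highest resolution.

    Hypothetical helper: the settings are checked in the documented
    priority order, and the first one that is provided wins.
    """
    if target_mpp is not None:
        # Highest priority: a coarser target mpp means a larger factor.
        return target_mpp / base_mpp
    if target_mag is not None:
        # Second priority: a lower target magnification means a larger factor.
        return base_mag / target_mag
    if downsample is not None:
        # Medium priority: the factor is given directly.
        return float(downsample)
    if level is not None:
        # Lowest priority: each level halves the resolution (factor 2 per level).
        return float(2 ** level)
    # Assumed fallback: no downsampling.
    return 1.0
```

For a slide scanned at 0.25 microns per pixel and 40x, `resolve_downsample(0.25, 40, target_mpp=0.5, target_mag=10)` uses `target_mpp` and ignores `target_mag`.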

configs/python/config.py

+11-3
@@ -7,10 +7,18 @@
77

88
from typing import List
99

10-
WSI_EXT: List[str] = ["svs"]
11-
ANNOTATION_EXT: List[str] = ["json", "xml"]
10+
WSI_EXT: List[str] = [
11+
"svs",
12+
"tiff",
13+
"tif",
14+
"bif",
15+
"scn",
16+
"ndpi",
17+
"vms",
18+
"vmu",
19+
] # mirax not tested yet
20+
ANNOTATION_EXT: List[str] = ["json"]
1221
LOGGING_EXT: List[str] = ["critical", "error", "warning", "info", "debug"]
13-
1422
BACKBONES: List[str] = ["ResNet50", "ResNet50Bottleneck", "ResNet18", "ResNet34"]
1523

1624
# Currently: 30 Colors
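
As an illustration of how an extension list such as `WSI_EXT` can be used, the sketch below filters candidate file paths by suffix. The helper `filter_wsi_files` and the case-insensitive comparison are assumptions for this example; the commit does not show how the repository consumes the list.

```python
from pathlib import Path
from typing import Iterable, List

# Copied from the diff above (mirax is noted there as not tested yet).
WSI_EXT: List[str] = ["svs", "tiff", "tif", "bif", "scn", "ndpi", "vms", "vmu"]


def filter_wsi_files(
    paths: Iterable[str], extensions: List[str] = WSI_EXT
) -> List[Path]:
    """Keep only paths whose suffix (without the dot) is a supported extension.

    Hypothetical helper; matching is case-insensitive by assumption.
    """
    allowed = {ext.lower() for ext in extensions}
    return [
        Path(p) for p in paths
        if Path(p).suffix.lower().lstrip(".") in allowed
    ]
```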

docs/readmes/preprocessing.md

+21-4
@@ -1,18 +1,21 @@
11
# Preprocessing
22

3+
In our preprocessing pipeline, we extract square patches from detected tissue areas, load annotation files (`.json`) and apply color normalizations. We make use of the popular [OpenSlide](https://openslide.org/) library, but extended it with the [RAPIDS cuCIM](https://github.com/rapidsai/cucim) framework for a speedup in patch extraction.
4+
35
The CLI of the main script for patch extraction ([main_extraction](preprocessing/main_extraction.py)) is as follows:
6+
47
```bash
58
python3 main_extraction.py [-h]
6-
usage: main_extraction.py [-h]
79
[--wsi_paths WSI_PATHS]
810
[--wsi_filelist WSI_FILELIST]
911
[--output_path OUTPUT_PATH]
1012
[--wsi_extension {svs}]
1113
[--config CONFIG]
1214
[--patch_size PATCH_SIZE]
1315
[--patch_overlap PATCH_OVERLAP]
14-
[--downsample DOWNSAMPLE]
16+
[--target_mpp TARGET_MPP]
1517
[--target_mag TARGET_MAG]
18+
[--downsample DOWNSAMPLE]
1619
[--level LEVEL]
1720
[--context_scales [CONTEXT_SCALES ...]]
1821
[--check_resolution CHECK_RESOLUTION]
@@ -32,9 +35,11 @@ usage: main_extraction.py [-h]
3235
[--tissue_annotation TISSUE_ANNOTATION]
3336
[--masked_otsu]
3437
[--otsu_annotation OTSU_ANNOTATION]
38+
[--filter_patches]
3539
[--log_path LOG_PATH]
3640
[--log_level {critical,error,warning,info,debug}]
3741
[--hardware_selection {cucim,openslide}]
42+
[--wsi_properties WSI_PROPERTIES]
3843

3944
optional arguments:
4045
-h, --help show this help message and exit
@@ -62,10 +67,16 @@ optional arguments:
6267
downsampling should be used with respect to the highest possible resolution. Medium
6368
priority, gets overwritten by target_mag if provided, but overwrites level. (default:
6469
None)
70+
--target_mpp TARGET_MPP
71+
If this parameter is provided, the output level of the WSI corresponds to the level that
72+
is at the target microns per pixel of the WSI. Alternative to target_mag, downsample and level.
73+
Highest priority,
74+
overwrites target_mag, downsample and level if provided. (default: None)
6575
--target_mag TARGET_MAG
6676
If this parameter is provided, the output level of the WSI corresponds to the level that
67-
is at the target magnification of the WSI. Alternative to downsaple and level. Highest
68-
priority, overwrites downsample and level if provided. (default: None)
77+
is at the target magnification of the WSI. Alternative to target_mpp, downsample and level.
78+
High priority, just target_mpp has a higher priority,
79+
overwrites downsample and level if provided. (default: None)
6980
--level LEVEL The tile level for sampling, alternative to downsample. Lowest priority, gets overwritten
7081
by target_mag and downsample if they are provided. (default: None)
7182
--context_scales [CONTEXT_SCALES ...]
@@ -112,13 +123,19 @@ optional arguments:
112123
--otsu_annotation OTSU_ANNOTATION
113124
Can be used to name a polygon annotation to determine the area for masked otsu
114125
thresholding. Separate multiple labels with ' ' (whitespace) (default: None)
126+
--filter_patches
127+
Post-extraction patch filtering to sort out artefacts, marker and other non-tissue patches with a DL model. Time consuming.
128+
(default: False)
115129
--log_path LOG_PATH Path where log files should be stored. Otherwise, log files are stored in the output
116130
folder (default: None)
117131
--log_level {critical,error,warning,info,debug}
118132
Set the logging level. Options are ['critical', 'error', 'warning', 'info', 'debug']
119133
(default: None)
120134
--hardware_selection {cucim,openslide}
121135
Select hardware device (just if available, otherwise always cucim). Defaults to cucim.)
136+
--wsi_properties WSI_PROPERTIES
137+
Can be used to pass the wsi properties manually
138+
(default: None)
122139
```
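
The help text for `--target_mpp` says the output level is the one "at the target microns per pixel". One way to read this is to pick the pyramid level whose resolution is closest to the requested value. The sketch below shows that selection under stated assumptions: the function name `closest_level_for_mpp`, its parameters, and the nearest-match rule are hypothetical and are not taken from the repository.

```python
from typing import Sequence


def closest_level_for_mpp(
    base_mpp: float, level_downsamples: Sequence[float], target_mpp: float
) -> int:
    """Return the index of the level whose mpp is closest to target_mpp.

    Hypothetical helper. base_mpp is the resolution of level 0 and
    level_downsamples holds one downsampling factor per level.
    """
    if not level_downsamples:
        raise ValueError("level_downsamples must not be empty")
    # Microns per pixel grow linearly with the downsampling factor.
    level_mpps = [base_mpp * factor for factor in level_downsamples]
    return min(
        range(len(level_mpps)),
        key=lambda idx: abs(level_mpps[idx] - target_mpp),
    )
```

If no level matches the target exactly, a real pipeline would usually also need to rescale the extracted patches; that step is left out here.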
123140
124141
**Label-Map**:

preprocessing/patch_extraction/src/cli.py

+41-11
@@ -33,8 +33,9 @@ class PreProcessingYamlConfig(BaseModel):
3333
# basic setups
3434
patch_size: Optional[int]
3535
patch_overlap: Optional[float]
36-
downsample: Optional[int]
36+
target_mpp: Optional[float]
3737
target_mag: Optional[float]
38+
downsample: Optional[int]
3839
level: Optional[int]
3940
context_scales: Optional[List[int]]
4041
check_resolution: Optional[float]
@@ -62,11 +63,13 @@ class PreProcessingYamlConfig(BaseModel):
6263
tissue_annotation: Optional[str]
6364
masked_otsu: Optional[bool]
6465
otsu_annotation: Optional[str]
66+
filter_patches: Optional[bool]
6567

6668
# other
6769
log_path: Optional[str]
6870
log_level: Optional[str]
6971
hardware_selection: Optional[str]
72+
wsi_properties: Optional[dict]
7073

7174

7275
class PreProcessingConfig(BaseModel):
@@ -84,12 +87,15 @@ class PreProcessingConfig(BaseModel):
8487
patch_overlap (float, optional): The percentage amount pixels that should overlap between two different patches.
8588
Please Provide as integer between 0 and 100, indicating overlap in percentage.
8689
Defaults to 0.
90+
target_mpp (float, optional): If this parameter is provided, the output level of the WSI
91+
corresponds to the level that is at the target microns per pixel of the WSI.
92+
Alternative to target_mag, downsample and level. Highest priority, overwrites all other setups for magnification, downsample, or level.
93+
target_mag (float, optional): If this parameter is provided, the output level of the WSI
94+
corresponds to the level that is at the target magnification of the WSI.
95+
Alternative to target_mpp, downsample and level. High priority: only target_mpp has a higher priority. Overwrites downsample and level if provided. Defaults to None.
8796
downsample (int, optional): Each WSI level is downsampled by a factor of 2, downsample
8897
expresses which kind of downsampling should be used with
8998
respect to the highest possible resolution. Defaults to 0.
90-
target_mag (float, optional): If this parameter is provided, the output level of the WSI
91-
corresponds to the level that is at the target magnification of the WSI.
92-
Alternative to downsaple and level. Defaults to None.
9399
level (int, optional): The tile level for sampling, alternative to downsample. Defaults to None.
94100
context_scales ([List[int], optional): Define context scales for context patches. Context patches are centered around a central patch.
95101
The context-patch size is equal to the patch-size, but downsampling is different.
@@ -125,9 +131,12 @@ class PreProcessingConfig(BaseModel):
125131
masked_otsu (bool, optional): Use annotation to mask the thumbnail before otsu-thresholding is used. Defaults to False.
126132
otsu_annotation (str, optional): Can be used to name a polygon annotation to determine the area
127133
for masked otsu thresholding. Separate multiple labels with ' ' (whitespace). Defaults to None.
134+
filter_patches (bool, optional): Post-extraction patch filtering to sort out artefacts, marker and other non-tissue patches with a DL model. Time consuming.
135+
Defaults to False.
128136
log_path (str, optional): Path where log files should be stored. Otherwise, log files are stored in the output folder. Defaults to None.
129137
log_level (str, optional): Set the logging level. Defaults to "info".
130138
hardware_selection (str, optional): Select hardware device (just if available, otherwise always cucim). Defaults to "cucim".
139+
wsi_properties (dict, optional): Dictionary with manual WSI metadata. Required keys are: ... TODO: add keys
131140
132141
Raises:
133142
ValueError: Patch-size must be positive
@@ -150,6 +159,7 @@ class PreProcessingConfig(BaseModel):
150159
patch_size: Optional[int] = 256
151160
patch_overlap: Optional[float] = 0
152161
downsample: Optional[int] = 1
162+
target_mpp: Optional[float]
153163
target_mag: Optional[float]
154164
level: Optional[int]
155165
context_scales: Optional[List[int]]
@@ -178,11 +188,13 @@ class PreProcessingConfig(BaseModel):
178188
tissue_annotation: Optional[str]
179189
masked_otsu: Optional[bool] = False
180190
otsu_annotation: Optional[str]
191+
filter_patches: Optional[bool] = False
181192

182193
# other
183194
log_path: Optional[str]
184195
log_level: Optional[str] = "info"
185196
hardware_selection: Optional[str] = "cucim"
197+
wsi_properties: Optional[dict]
186198

187199
def __init__(__pydantic_self__, **data: Any) -> None:
188200
super().__init__(**data)
@@ -340,19 +352,26 @@ def __init__(self) -> None:
340352
"Please Provide as integer between 0 and 100, indicating overlap in percentage.",
341353
)
342354
parser.add_argument(
343-
"--downsample",
344-
type=int,
345-
help="Each WSI level is downsampled by a factor of 2, downsample "
346-
"expresses which kind of downsampling should be used with "
347-
"respect to the highest possible resolution. Medium priority, gets overwritten by target_mag if provided, "
348-
"but overwrites level.",
355+
"--target_mpp",
356+
type=float,
357+
help="If this parameter is provided, the output level of the WSI "
358+
"corresponds to the level that is at the target microns per pixel of the WSI. "
359+
"Alternative to target_mag, downsample and level. Highest priority, overwrites all other setups for magnification, downsample, or level.",
349360
)
350361
parser.add_argument(
351362
"--target_mag",
352363
type=float,
353364
help="If this parameter is provided, the output level of the WSI "
354365
"corresponds to the level that is at the target magnification of the WSI. "
355-
"Alternative to downsaple and level. Highest priority, overwrites downsample and level if provided.",
366+
"Alternative to target_mpp, downsample and level. High priority: only target_mpp has a higher priority. Overwrites downsample and level if provided.",
367+
)
368+
parser.add_argument(
369+
"--downsample",
370+
type=int,
371+
help="Each WSI level is downsampled by a factor of 2, downsample "
372+
"expresses which kind of downsampling should be used with "
373+
"respect to the highest possible resolution. Medium priority, gets overwritten by target_mag and target_mpp if provided, "
374+
"but overwrites level.",
356375
)
357376
parser.add_argument(
358377
"--level",
@@ -485,6 +504,12 @@ def __init__(self) -> None:
485504
help="Can be used to name a polygon annotation to determine the area "
486505
"for masked otsu thresholding. Separate multiple labels with ' ' (whitespace)",
487506
)
507+
parser.add_argument(
508+
"--filter_patches",
509+
action="store_true",
510+
default=None,
511+
help="Post-extraction patch filtering to sort out artefacts, marker and other non-tissue patches with a DL model. Time consuming. Defaults to False.",
512+
)
488513

489514
# other
490515
parser.add_argument(
@@ -504,6 +529,11 @@ def __init__(self) -> None:
504529
choices=["cucim", "openslide"],
505530
help="Select hardware device (just if available, otherwise always cucim). Defaults to cucim.",
506531
)
532+
parser.add_argument(
533+
"--wsi_properties",
534+
type=dict,
535+
help="Can be used to pass the wsi properties manually",
536+
)
507537

508538
self.parser = parser
509539
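
Two details of the new arguments are worth a small, standalone sketch. First, `action="store_true"` combined with `default=None` lets a caller tell "flag not given" (`None`) apart from "flag given" (`True`). Second, argparse passes the raw command-line string to the `type` callable, and calling `dict` on a JSON string raises a `ValueError`, so a dictionary argument needs a real parser. The example below uses `json.loads` for that. The helper names `json_dict` and `parse_cli` and the key `slide_mpp` are hypothetical; the required keys of `wsi_properties` are still marked as TODO in the source.

```python
import argparse
import json
from typing import Any, Dict, List


def json_dict(text: str) -> dict:
    """Parse a JSON object from a command-line string (hypothetical helper)."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError as err:
        raise argparse.ArgumentTypeError(f"not valid JSON: {err}") from err
    if not isinstance(value, dict):
        raise argparse.ArgumentTypeError("expected a JSON object")
    return value


parser = argparse.ArgumentParser()
# None when the flag is absent, True when it is present.
parser.add_argument("--filter_patches", action="store_true", default=None)
# Assumption: the dictionary is passed as a JSON string.
parser.add_argument("--wsi_properties", type=json_dict)


def parse_cli(argv: List[str]) -> Dict[str, Any]:
    """Return only the options that were explicitly set on the command line."""
    namespace = parser.parse_args(argv)
    return {key: val for key, val in vars(namespace).items() if val is not None}
```

Dropping the unset options makes it straightforward to merge the command-line values with values from another source, such as a YAML file, without overwriting them with `None`.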

Binary file not shown.
