
This script performs Whole Slide Image (WSI) preprocessing, including masking, tiling, normalization, quality checks, encoding, and optional whole-slide reconstruction after tiling. It can be used as a customizable pipeline for batch WSI processing, or its functions can be called directly to perform specific tasks. As a pipeline, it is designed so that a run stopped for any reason can resume from the last completed step. It also produces an error report (listing each error and where it occurred) and a summary report with statistics such as the percentage of tissue and the time taken to process each file.
To use this package, you must have Python (>3.9) installed on your system. Conda is also recommended for installation.
# Installation
First, clone the repository and cd into it:
git clone https://github.com/lolmomarchal/SlideLab.git
cd SlideLab
To install the required dependencies, use conda:
conda env create -f environment.yml
conda activate slidelab

| Argument | Description | Default |
|---|---|---|
| `-i`, `--input_path` | Path to the input WSI file | None |
| `-o`, `--output_path` | Path to save the output tiles | None |


| Argument | Description | Default |
|---|---|---|
| `-s`, `--desired_size` | Desired size of the tiles (in pixels) | 256 |
| `-m`, `--desired_magnification` | Desired magnification level (e.g., 20x) | 20 |
| `-ov`, `--overlap` | Factor of overlap between tiles | 1 (no overlap) |

Overlap example: With a size of 256 and an overlap of 2, tiles would overlap by 128 pixels.
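
The overlap example above can be sketched in a few lines. This is a minimal illustration, not the package's own code: the function name `tile_origins` is hypothetical, and it assumes the stride between tiles is `size // overlap`, which is consistent with the documented example (size 256, overlap 2, tiles overlapping by 128 pixels) and with the default of 1 meaning no overlap.

```python
def tile_origins(width, height, size=256, overlap=1):
    """Return top-left (x, y) coordinates of tiles covering a region.

    Assumption: stride = size // overlap, so overlap=1 gives adjacent tiles
    and overlap=2 gives tiles that share half their width and height.
    """
    if overlap < 1:
        raise ValueError("overlap must be >= 1")
    if width < size or height < size:
        return []  # region too small to hold a single full tile
    stride = max(1, int(size // overlap))
    xs = range(0, width - size + 1, stride)
    ys = range(0, height - size + 1, stride)
    return [(x, y) for y in ys for x in xs]
```

For a region 512 pixels wide and 256 pixels high, `tile_origins(512, 256, size=256, overlap=2)` places tiles at x = 0, 128 and 256, so neighbouring tiles share 128 pixels.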

| Argument | Description | Default |
|---|---|---|
| `-rb`, `--remove_blurry_tiles` | Remove blurry tiles using a Laplacian filter | False |
| `-n`, `--normalize_staining` | Normalize staining of the tiles | False |
| `-e`, `--encode` | Encode tiles into an .h5 file | False |
| `--extract_high_quality` | Extract features for high quality heatmaps | False |
| `--augmentations` | Get various augmentations for encoded tiles for model training | 0 |

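
To give a feel for what Laplacian-based blur removal involves, here is a rough sketch of the general technique. It is not the package's implementation: the function names are hypothetical, the 4-neighbour kernel and the [0, 1] intensity scaling are assumptions, and it is written in pure Python for self-containment (a real pipeline would more likely use OpenCV or NumPy). A tile whose Laplacian response has low variance contains few sharp edges and is treated as blurry.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior of a 2D image.

    `gray` is a list of rows of floats, assumed to be scaled to [0, 1].
    """
    h = len(gray)
    w = len(gray[0]) if h else 0
    if h < 3 or w < 3:
        raise ValueError("image must be at least 3x3")
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4.0 * gray[y][x])
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_blurry(gray, threshold=0.015):
    """Flag a tile as blurry when its Laplacian variance is below the threshold."""
    return laplacian_variance(gray) < threshold
```

A perfectly flat tile has zero Laplacian variance and is flagged as blurry, while a high-contrast pattern such as a checkerboard is not.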

| Argument | Description | Default |
|---|---|---|
| `-th`, `--tissue_threshold` | Minimum tissue content to consider a tile valid | 0.7 |
| `-bh`, `--blur_threshold` | Threshold for Laplacian filter variance (blur detection) | 0.015 |
| `--red_pen_check` | Sanity check for % of red pen detected. If above threshold, the red_pen mask will be ignored | 0.4 |
| `--blue_pen_check` | Sanity check for % of blue pen detected. If above threshold, the blue_pen mask will be ignored | 0.4 |

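
The tissue threshold and the pen sanity checks both reduce to comparing a mask's coverage against a fraction. The sketch below illustrates that idea only; the function names are hypothetical, masks are assumed to be 2D lists of 0/1 values, and the handling of values exactly at the threshold is a guess rather than documented behaviour.

```python
def coverage(mask):
    """Fraction of truthy pixels in a 2D mask given as a list of rows."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    return sum(1 for row in mask for px in row if px) / total


def tile_is_valid(tissue_mask, tissue_threshold=0.7):
    """Assumed rule: keep a tile when at least `tissue_threshold` of it is tissue."""
    return coverage(tissue_mask) >= tissue_threshold


def keep_pen_mask(pen_mask, pen_check=0.4):
    """Assumed rule: ignore a pen mask that covers more than `pen_check` of the
    image, since that much "pen" likely means the detector misfired on tissue."""
    return coverage(pen_mask) <= pen_check
```

With the defaults, a tile that is 75% tissue is kept, and a pen mask covering half of the image is discarded as implausible.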

| Argument | Description | Default |
|---|---|---|
| `--device` | Specify device (e.g., GPU/CPU) | None (will use the GPU if available) |
| `--cpu_processes` | Number of CPU processes to use | `os.cpu_count()` |
| `--batch_size` | Number of tiles per batch | 16. If using augmentations, batch_size will be recalculated using # of augmentations/batch size |


| Argument | Description | Default |
|---|---|---|
| `--min_tiles` | Minimum number of valid tiles for a sample to be counted as "valid". Will create an additional filtered sample metadata file. | 0 |

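
As a rough illustration of the `--min_tiles` filter, the sketch below keeps only samples with enough valid tiles. The function name and the dictionary of per-sample tile counts are hypothetical stand-ins for the sample metadata, whose actual format is not described here.

```python
def filter_samples(tile_counts, min_tiles=0):
    """Keep samples whose number of valid tiles is at least `min_tiles`.

    `tile_counts` maps a sample name to its count of valid tiles
    (an assumed structure, used only for illustration).
    """
    return {name: n for name, n in tile_counts.items() if n >= min_tiles}
```

With the default of 0, every sample passes; raising the value drops slides that yielded too few usable tiles.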
python SlidePreprocessing.py -i /path/to/input/ -o /path/to/output/ \
-s 512 -m 40 --remove_blurry_tiles --normalize_staining --encode \
-th 0.8 -bh 0.02 --device cuda --batch_size 256
You can also create the related objects and call their methods directly. Please see Example.ipynb and Masking Examples.ipynb.