4 changes: 4 additions & 0 deletions .gitignore
@@ -207,3 +207,7 @@ sample_cls.bmp
sample_cls.jpg
sample_he.jpg
testo.sh
debug.py
direct_cpu_pp.sh
run_inference_container_jc.sh
sample_analysis.ipynb
278 changes: 137 additions & 141 deletions README.md
@@ -1,141 +1,137 @@
# HoVer-NeXt Inference
HoVer-NeXt is a fast and efficient nuclei segmentation and classification pipeline.

A variety of data formats are supported, including all OpenSlide-supported formats, `.npy` numpy array dumps, and common image formats such as JPEG and PNG.
If you run into trouble using this repository, please create an issue and we will be happy to help!

For training code, please check the [hover-next training repository](https://github.com/digitalpathologybern/hover_next_train).

Find the Publication here: [https://openreview.net/pdf?id=3vmB43oqIO](https://openreview.net/pdf?id=3vmB43oqIO)

## Setup

The training and inference environments are identical, so if you have already set up the environment for training, you can use it for inference as well.

Otherwise:

```bash
conda env create -f environment.yml
conda activate hovernext
pip install torch==2.1.1 torchvision==0.16.1 --index-url https://download.pytorch.org/whl/cu118
```

Alternatively, use the prebuilt [docker/singularity container](#docker-and-apptainersingularity-container).

## Model Weights

Weights are hosted on [Zenodo](https://zenodo.org/records/10635618).
By specifying one of the IDs listed below, weights are **automatically** downloaded and loaded.

| Dataset | ID | Weights |
|--------------|--------|-----|
| Lizard-Mitosis | "lizard_convnextv2_large" | [Large](https://zenodo.org/records/10635618/files/lizard_convnextv2_large.zip?download=1) |
| | "lizard_convnextv2_base" |[Base](https://zenodo.org/records/10635618/files/lizard_convnextv2_base.zip?download=1) |
| | "lizard_convnextv2_tiny" |[Tiny](https://zenodo.org/records/10635618/files/lizard_convnextv2_tiny.zip?download=1) |
| PanNuke | "pannuke_convnextv2_tiny_1" | [Tiny Fold 1](https://zenodo.org/records/10635618/files/pannuke_convnextv2_tiny_1.zip?download=1) |
| | "pannuke_convnextv2_tiny_2" | [Tiny Fold 2](https://zenodo.org/records/10635618/files/pannuke_convnextv2_tiny_2.zip?download=1) |
| | "pannuke_convnextv2_tiny_3" | [Tiny Fold 3](https://zenodo.org/records/10635618/files/pannuke_convnextv2_tiny_3.zip?download=1) |

If you are downloading weights manually, unzip them so that the folder (e.g. ```lizard_convnextv2_large```) sits in the same directory as ```main.py```.

## WSI Inference

This pipeline uses OpenSlide to read images, and therefore supports all formats which are supported by OpenSlide.
If you want to run this pipeline on custom ome.tif files, ensure that the necessary metadata, such as resolution, downsampling, and dimensions, is available.
Before running a slide, choose [appropriate parameters for your machine](#optimizing-inference-for-your-machine).

To run a single slide:

```bash
python3 main.py \
--input "/path-to-wsi/wsi.svs" \
--output_root "results/" \
--cp "lizard_convnextv2_large" \
--tta 4 \
--inf_workers 16 \
--pp_tiling 10 \
--pp_workers 16
```

To run multiple slides, specify a glob pattern such as `"/path-to-folder/*.mrxs"` or provide a list of paths as a `.txt` file.

### Slurm

If you are running on a Slurm cluster, you might consider separating inference and post-processing to improve GPU utilization.
Use the `--only_inference` parameter for the first job, then submit a second job with the same parameters but without `--only_inference`.

## NPY / Image inference

NPY and image inference work the same way as WSI inference; however, the output files consist only of a Zarr array.

```bash
python3 main.py \
--input "/path-to-file/file.npy" \
--output_root "/results/" \
--cp "lizard_convnextv2_large" \
--tta 4 \
--inf_workers 16 \
--pp_tiling 10 \
--pp_workers 16
```

Support for other data types is easy to implement. Check the NPYDataloader for reference.

## Optimizing inference for your machine:

1. Keep the WSI on the local machine or on a fast-access network location.
2. If you have multiple machines, e.g. CPU-only machines, you can move post-processing to one of them.
3. `--tta 4` yields robust results at very high speed.
4. `--inf_workers` should be set to the number of available cores.
5. `--pp_workers` should be set to the number of available cores minus one, with `--pp_tiling` set to the lowest value at which the machine does not run out of memory. E.g. on a 16-core machine, `--pp_workers 16 --pp_tiling 8` is a good choice. If you are running out of memory, increase `--pp_tiling`.

## Using the output files for downstream analysis:

By default, the pipeline produces an instance map, a class lookup with centroids, and a number of `.tsv` files that can be loaded into QuPath.
`sample_analysis.ipynb` shows examples of how to use these files.

### Polygon/geojson output

If you only need to visualize a few examples on the WSI, a GeoJSON output might be useful.
For this, add `--save_polygon` as an argument. However, polygon creation and saving take additional time and are not recommended for large-scale analyses.
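
As a quick, hedged sketch (the file name `poly.geojson` and the output layout are assumptions; adjust them to whatever the pipeline writes for your slide, assuming a standard GeoJSON FeatureCollection), the polygons can be inspected with plain Python:

```python
# Minimal sketch for inspecting the GeoJSON written with --save_polygon.
# The file name is an assumption; use the path the pipeline writes for your slide.
import json

with open("results/wsi/poly.geojson") as f:
    collection = json.load(f)

features = collection["features"]  # assumes a standard FeatureCollection
print(f"{len(features)} polygons")

# Each feature should hold the outline under "geometry" and, typically,
# the predicted class under "properties".
first = features[0]
print(first["geometry"]["type"], first.get("properties"))
```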

## Docker and Apptainer/Singularity Container:

Download the Singularity image from [Zenodo](https://zenodo.org/records/10649470/files/hover_next.sif).

```bash
# don't forget to mount your local directory
export APPTAINER_BINDPATH="/storage"
apptainer exec --nv /path-to-container/hover_next.sif \
python3 /path-to-repo/main.py \
--input "/path-to-wsi/*.svs" \
--output_root "results/" \
--cp "lizard_convnextv2_large" \
--tta 4
```
# License

This repository is licensed under the GNU General Public License v3.0 (see the license information).
If you intend to use this repository for commercial use cases, please check the licenses of all Python packages referenced in the Setup section and listed in `requirements.txt` and `environment.yml`.

# Citation

If you are using this code, please cite:
```
@inproceedings{baumann2024hover,
title={HoVer-NeXt: A Fast Nuclei Segmentation and Classification Pipeline for Next Generation Histopathology},
author={Baumann, Elias and Dislich, Bastian and Rumberger, Josef Lorenz and Nagtegaal, Iris D and Martinez, Maria Rodriguez and Zlobec, Inti},
booktitle={Medical Imaging with Deep Learning},
year={2024}
}
```
and
```
@INPROCEEDINGS{rumberger2022panoptic,
author={Rumberger, Josef Lorenz and Baumann, Elias and Hirsch, Peter and Janowczyk, Andrew and Zlobec, Inti and Kainmueller, Dagmar},
booktitle={2022 IEEE International Symposium on Biomedical Imaging Challenges (ISBIC)},
title={Panoptic segmentation with highly imbalanced semantic labels},
year={2022},
pages={1-4},
doi={10.1109/ISBIC56247.2022.9854551}}
```
# HoVer-NeXt Inference
HoVer-NeXt is a fast and efficient nuclei segmentation and classification pipeline.

A variety of data formats are supported, including all OpenSlide-supported formats, `.npy` numpy array dumps, and common image formats such as JPEG and PNG.
If you run into trouble using this repository, please create an issue and we will be happy to help!

For training code, please check the [hover-next training repository](https://github.com/digitalpathologybern/hover_next_train).

Find the Publication here: [https://openreview.net/pdf?id=3vmB43oqIO](https://openreview.net/pdf?id=3vmB43oqIO)

## Setup

The training and inference environments are identical, so if you have already set up the environment for training, you can use it for inference as well.

Otherwise:

```bash
conda env create -f environment.yml
conda activate hovernext
pip install torch==2.1.1 torchvision==0.16.1 --index-url https://download.pytorch.org/whl/cu118
```

Alternatively, use the prebuilt [docker/singularity container](#docker-and-apptainersingularity-container).

## Model Weights

Weights are hosted on [Zenodo](https://zenodo.org/records/10635618).
By specifying one of the IDs listed below, weights are **automatically** downloaded and loaded.

| Dataset | ID | Weights |
|--------------|--------|-----|
| Lizard-Mitosis | "lizard_convnextv2_large" | [Large](https://zenodo.org/records/10635618/files/lizard_convnextv2_large.zip?download=1) |
| | "lizard_convnextv2_base" |[Base](https://zenodo.org/records/10635618/files/lizard_convnextv2_base.zip?download=1) |
| | "lizard_convnextv2_tiny" |[Tiny](https://zenodo.org/records/10635618/files/lizard_convnextv2_tiny.zip?download=1) |
| PanNuke | "pannuke_convnextv2_tiny_1" | [Tiny Fold 1](https://zenodo.org/records/10635618/files/pannuke_convnextv2_tiny_1.zip?download=1) |
| | "pannuke_convnextv2_tiny_2" | [Tiny Fold 2](https://zenodo.org/records/10635618/files/pannuke_convnextv2_tiny_2.zip?download=1) |
| | "pannuke_convnextv2_tiny_3" | [Tiny Fold 3](https://zenodo.org/records/10635618/files/pannuke_convnextv2_tiny_3.zip?download=1) |

If you are downloading weights manually, unzip them so that the folder (e.g. ```lizard_convnextv2_large```) sits in the same directory as ```main.py```.

## WSI Inference

This pipeline uses OpenSlide to read images, and therefore supports all formats which are supported by OpenSlide.
If you want to run this pipeline on custom ome.tif files, ensure that the necessary metadata, such as resolution, downsampling, and dimensions, is available.
Additionally, CZI files are supported via pylibCZIrw.
Before running a slide, choose [appropriate parameters for your machine](#optimizing-inference-for-your-machine).
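
If you are unsure whether a custom file carries the required metadata, a small openslide-python check (a sketch; the path below is a placeholder) can save a failed run:

```python
# Quick check that the metadata the pipeline relies on is present.
# The path below is a placeholder.
import openslide

slide = openslide.OpenSlide("/path-to-wsi/wsi.ome.tif")
print("dimensions:", slide.dimensions)
print("level downsamples:", slide.level_downsamples)
print("mpp-x:", slide.properties.get(openslide.PROPERTY_NAME_MPP_X))
print("mpp-y:", slide.properties.get(openslide.PROPERTY_NAME_MPP_Y))
slide.close()
```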

To run a single slide:

```bash
python3 main.py \
--input "/path-to-wsi/wsi.svs" \
--output_root "results/" \
--cp "lizard_convnextv2_large" \
--tta 4 \
--inf_workers 16 \
--pp_tiling 10 \
--pp_workers 16
```

To run multiple slides, specify a glob pattern such as `"/path-to-folder/*.mrxs"` or provide a list of paths as a `.txt` file.
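
Such a list file can be produced with a few lines of Python; the folder, extension, and output name below are placeholders, and the resulting file is passed via `--input`:

```python
# Collect slide paths into a .txt file that can be passed via --input.
# Folder, extension, and output name are placeholders.
from pathlib import Path

slides = sorted(Path("/path-to-folder").glob("*.mrxs"))
Path("slides.txt").write_text("\n".join(str(p) for p in slides) + "\n")
print(f"wrote {len(slides)} paths to slides.txt")
```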

### Slurm

If you are running on a Slurm cluster, you might consider separating inference and post-processing to improve GPU utilization.
Use the `--only_inference` parameter for the first job, then submit a second job with the same parameters but without `--only_inference`.

## NPY / Image inference

NPY and image inference work the same way as WSI inference; however, the output files consist only of a Zarr array.

```bash
python3 main.py \
--input "/path-to-file/file.npy" \
--output_root "/results/" \
--cp "lizard_convnextv2_large" \
--tta 4 \
--inf_workers 16 \
--pp_tiling 10 \
--pp_workers 16
```
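
As a rough sketch of the round trip, assuming illustrative file names (the actual output location and array layout depend on `--output_root` and the pipeline version), an image can be dumped to `.npy` and the resulting Zarr array opened afterwards:

```python
# Prepare an .npy input from a standard image and, after inference,
# open the resulting Zarr array. All file names are illustrative.
import numpy as np
import zarr
from PIL import Image

# 1) dump an RGB image to .npy for use with --input
img = np.asarray(Image.open("tile.png").convert("RGB"))
np.save("tile.npy", img)

# 2) after running main.py, inspect the Zarr output under --output_root
z = zarr.open("results/tile/pred.zarr", mode="r")
print(z.shape, z.dtype)
```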

Support for other data types is easy to implement. Check the NPYDataloader for reference.

## Optimizing inference for your machine:

1. Keep the WSI on the local machine or on a fast-access network location.
2. If you have multiple machines, e.g. CPU-only machines, you can move post-processing to one of them.
3. `--tta 4` yields robust results at very high speed.
4. `--inf_workers` should be set to the number of available cores.
5. `--pp_workers` should be set to the number of available cores minus one, with `--pp_tiling` set to the lowest value at which the machine does not run out of memory. E.g. on a 16-core machine, `--pp_workers 16 --pp_tiling 8` is a good choice. If you are running out of memory, increase `--pp_tiling`.

## Using the output files for downstream analysis:

By default, the pipeline produces an instance map, a class lookup with centroids, and a number of `.tsv` files that can be loaded into QuPath.
`sample_analysis.ipynb` shows examples of how to use these files.
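
As a small, hedged example (file and column names are assumptions; `sample_analysis.ipynb` remains the authoritative reference), the `.tsv` files can be explored with pandas:

```python
# Explore one of the QuPath .tsv files with pandas.
# File and column names are illustrative; see sample_analysis.ipynb for the real ones.
import pandas as pd

df = pd.read_csv("results/wsi/detections.tsv", sep="\t")
print(df.head())
print("detections:", len(df))

# If centroid and class columns are present, a per-class count is one groupby away.
if {"x", "y", "class"}.issubset(df.columns):
    print(df.groupby("class").size())
```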

## Docker and Apptainer/Singularity Container:

Download the Singularity image from [Zenodo](https://zenodo.org/records/10649470/files/hover_next.sif).

```bash
# don't forget to mount your local directory
export APPTAINER_BINDPATH="/storage"
apptainer exec --nv /path-to-container/hover_next.sif \
python3 /path-to-repo/main.py \
--input "/path-to-wsi/*.svs" \
--output_root "results/" \
--cp "lizard_convnextv2_large" \
--tta 4
```
# License

This repository is licensed under the GNU General Public License v3.0 (see the license information).
If you intend to use this repository for commercial use cases, please check the licenses of all Python packages referenced in the Setup section and listed in `requirements.txt` and `environment.yml`.

# Citation

If you are using this code, please cite:
```
@inproceedings{baumann2024hover,
title={HoVer-NeXt: A Fast Nuclei Segmentation and Classification Pipeline for Next Generation Histopathology},
author={Baumann, Elias and Dislich, Bastian and Rumberger, Josef Lorenz and Nagtegaal, Iris D and Martinez, Maria Rodriguez and Zlobec, Inti},
booktitle={Medical Imaging with Deep Learning},
year={2024}
}
```
and
```
@INPROCEEDINGS{rumberger2022panoptic,
author={Rumberger, Josef Lorenz and Baumann, Elias and Hirsch, Peter and Janowczyk, Andrew and Zlobec, Inti and Kainmueller, Dagmar},
booktitle={2022 IEEE International Symposium on Biomedical Imaging Challenges (ISBIC)},
title={Panoptic segmentation with highly imbalanced semantic labels},
year={2022},
pages={1-4},
doi={10.1109/ISBIC56247.2022.9854551}}
```
3 changes: 2 additions & 1 deletion requirements.txt
@@ -25,4 +25,5 @@ toml
numcodecs
imagecodecs
timm==0.9.6
geojson
geojson
pylibCZIrw