DeepDicom provides tools for processing DICOM files in deep learning research: it extracts images and metadata from DICOM files, standardizes image dimensions, and standardizes structure names. The repository also includes a training pipeline for a PyTorch UNet model that performs dose prediction.
This repository should support DICOMs from any radiotherapy treatment planning system, including clinical data and publicly available DICOM data from The Cancer Imaging Archive (e.g., Pancreatic-CT-CBCT-SEG).
DeepDicom consists of four modules:
- Interface: Contains the `Case` class to organize DICOM data (see the sketch after this list).
- Dicom Extraction: Extracts images and metadata from DICOM files and stores them in an efficient format.
- Prediction: Standardizes data from `Case` objects, creates training samples, and defines data splits.
- Model: Contains the `Trainer` class that trains a UNet model for dose prediction using PyTorch.
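The exact `Case` API lives in the Interface module; purely as an illustration of what a case bundles together, it can be pictured as a per-patient container along the following lines. Every field name in this sketch (`patient_id`, `ct`, `dose`, `structures`, `voxel_spacing`) is an assumption for illustration, not the actual attribute set.

```python
# Hypothetical sketch only: the real Case class is defined in the Interface
# module, and its attribute names and types may differ.
from dataclasses import dataclass, field

import numpy as np


@dataclass
class Case:
    patient_id: str
    ct: np.ndarray                  # image volume, e.g. shaped (z, y, x)
    dose: np.ndarray                # dose grid aligned with the image volume
    structures: dict[str, np.ndarray] = field(default_factory=dict)  # name -> binary mask
    voxel_spacing: tuple[float, float, float] = (1.0, 1.0, 1.0)      # mm per voxel
```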
- Linux
- Python 3.10.12
- NVIDIA GPU with CUDA and CuDNN (recommended)
- Create a virtual environment and activate it:

  ```bash
  virtualenv -p python3 deep-dicom
  source deep-dicom/bin/activate
  ```
- Clone the repository and install dependencies:

  ```bash
  git clone https://github.com/ababier/deep-dicom
  cd deep-dicom
  pip3 install -r requirements.txt
  ```
If everything is set up correctly, run the main script to start processing DICOM files, generate training samples, and train the model.
Run the following command in your virtual environment:
```bash
python3 main.py
```
This will:
- Extract DICOM data and organize it into `Case` objects.
- Standardize images, voxel spacing, and structure names (see the resampling sketch after this list).
- Create train/validation/test data splits.
- Train a UNet model for dose prediction, logging progress with TensorBoard.
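The standardization step resamples every volume onto a common voxel grid. As a minimal sketch of the idea only (not the repository's actual implementation, and assuming SciPy is available), uniform spacing can be obtained by interpolating the volume with `scipy.ndimage.zoom`; the 3 mm target spacing is an arbitrary placeholder:

```python
# Illustrative sketch of voxel-spacing standardization; the repository's own
# resampling code may differ (interpolation order, target spacing, etc.).
import numpy as np
from scipy.ndimage import zoom


def resample_to_spacing(
    volume: np.ndarray,
    spacing: tuple[float, float, float],
    target: tuple[float, float, float] = (3.0, 3.0, 3.0),
) -> np.ndarray:
    """Resample a (z, y, x) volume from `spacing` (mm/voxel) to `target` (mm/voxel)."""
    factors = [s / t for s, t in zip(spacing, target)]
    # order=1 gives trilinear interpolation; use order=0 for binary structure masks.
    return zoom(volume, factors, order=1)
```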
- Data Splits: The data is split into training, validation, and test sets; the IDs for each split are saved and re-used for consistency.
- Model Training: The `Trainer` class uses gradient checkpointing and AMP (Automatic Mixed Precision) for efficient training, along with an exponential learning rate scheduler (see the training sketch after this list).
- TensorBoard Logging: Training metrics and image visualizations are logged. Launch TensorBoard by running `tensorboard --logdir=runs/ --port=6006 --bind_all`.
- Customization: You can adjust hyperparameters (e.g., batch size, learning rate) in the `Trainer` class, and modify the UNet architecture or data processing as needed.
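For reference, the efficiency features named above combine roughly as in the sketch below. This is a generic PyTorch pattern rather than the repository's `Trainer` code; the tiny stand-in network, random batches, and hyperparameter values are placeholders.

```python
# Generic PyTorch sketch of AMP + gradient checkpointing + an exponential LR
# schedule; the actual Trainer class in the Model module may differ in detail.
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Sequential(  # tiny stand-in for the UNet
    nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(), nn.Conv3d(8, 1, 3, padding=1)
).to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.99)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

for _ in range(2):  # stand-in for iterating over a DataLoader
    image = torch.randn(1, 1, 16, 64, 64, device=device)  # (N, C, D, H, W)
    target = torch.randn_like(image)

    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type=device, enabled=(device == "cuda")):
        # Gradient checkpointing discards intermediate activations and
        # recomputes them in the backward pass, trading compute for memory.
        pred = checkpoint(model, image, use_reentrant=False)
        loss = nn.functional.mse_loss(pred, target)

    scaler.scale(loss).backward()  # AMP: scale the loss to avoid fp16 underflow
    scaler.step(optimizer)
    scaler.update()

scheduler.step()  # decay the learning rate once per epoch
```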
Happy coding and deep learning with DICOM data!