This project implements convolutional neural networks (CNNs) and gradient-boosted tree models to classify breast cancer subtypes from medical imaging data.
The project aims to:
- Classify breast cancer subtypes using medical imaging data
- Compare performance between CNNs and gradient-boosted tree models
- Provide interpretable results through explainable AI techniques
.
├── data/ # Data directory (not tracked in git)
│ ├── raw/ # Raw input data
│ └── processed/ # Processed data
├── notebooks/ # Jupyter notebooks
│ └── data_requirements.ipynb # Documentation of required data
├── src/ # Source code
│ ├── __init__.py
│ ├── data/ # Data processing scripts
│ ├── models/ # Model implementations
│ └── utils/ # Utility functions
├── requirements.txt # Python dependencies
└── README.md # This file
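Because `data/` is not tracked in git, a fresh clone will not contain it. The following is a minimal sketch of a helper that creates any missing directories from the layout above. The name `ensure_layout` and the `EXPECTED_DIRS` list are illustrative assumptions, not part of the repository.

```python
from pathlib import Path

# Directories taken from the layout above. This list is an assumption made
# for illustration; adjust it if the repository layout changes.
EXPECTED_DIRS = [
    "data/raw",
    "data/processed",
    "notebooks",
    "src/data",
    "src/models",
    "src/utils",
]


def ensure_layout(root):
    """Create missing directories under root and return the ones created."""
    root = Path(root)
    created = []
    for rel in EXPECTED_DIRS:
        path = root / rel
        if not path.is_dir():
            # parents=True also creates data/ and src/ when they are absent.
            path.mkdir(parents=True, exist_ok=True)
            created.append(rel)
    return created
```

Calling `ensure_layout(".")` from the repository root creates only what is missing, so it should be safe to run repeatedly.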
- Create a virtual environment:
    python -m venv venv
    source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt

See notebooks/data_requirements.ipynb for detailed information about:
- Required data formats
- Data preprocessing steps
- Data sources and acquisition
The project implements two main approaches:
- Convolutional Neural Networks (CNNs)
  - Architecture optimized for medical imaging
  - Transfer learning from pre-trained models
- Gradient Boosted Trees
  - XGBoost/LightGBM implementation
  - Feature engineering pipeline
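
To compare the two approaches fairly, both models can be scored on the same held-out labels. This README does not say which metric the project uses. The sketch below assumes macro-averaged F1, which weights every subtype equally regardless of how many samples it has. The function name `macro_f1` and the placeholder labels `"A"`, `"B"`, `"C"` are hypothetical and stand in for real subtype labels and model outputs.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    if not y_true:
        raise ValueError("y_true must not be empty")
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for cls in sorted(set(y_true)):
        denom = 2 * tp[cls] + fp[cls] + fn[cls]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        scores.append(2 * tp[cls] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Placeholder labels and predictions for illustration only (not real results).
y_true = ["A", "A", "B", "B", "C", "C"]
cnn_pred = ["A", "A", "B", "C", "C", "C"]
gbt_pred = ["A", "B", "B", "B", "C", "A"]

print(f"CNN macro-F1: {macro_f1(y_true, cnn_pred):.3f}")
print(f"GBT macro-F1: {macro_f1(y_true, gbt_pred):.3f}")
```

In practice, `cnn_pred` and `gbt_pred` would come from the two trained models evaluated on an identical test split, so the scores are directly comparable.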
Please read the data requirements notebook before contributing to ensure all necessary data formats and preprocessing steps are followed.
MIT License