
pTBLightNet

Official repository for pTBLightNet: Multi-View Deep Learning Framework for the Detection of Chest X-Rays Compatible with Pediatric Pulmonary Tuberculosis

Paper


Introduction

Tuberculosis (TB) remains a major global health burden, particularly in low-resource, high-prevalence regions. Pediatric TB presents unique challenges due to its non-specific symptoms and less distinct radiological manifestations compared with adult TB. Many children who die from TB are never diagnosed or treated, highlighting the need for early detection and treatment. The World Health Organization recommends chest X-ray (CXR) for TB screening and triage as part of diagnostic protocols, owing to its ability to rapidly assess pulmonary TB-related abnormalities and its widespread availability. In this study, we present pTBLightNet, a novel multi-view deep learning framework designed to detect pediatric pulmonary TB by identifying TB-compatible CXRs with consistent radiological findings. Our approach leverages both frontal and lateral CXR views to enhance prediction accuracy. We used diverse adult CXR datasets (N = 114,173) to pre-train our framework and CXR datasets (N = 918) from three pediatric TB cohorts for fine-tuning or training from scratch, and for evaluation. Our approach achieved an area under the curve of 0.903 on internal testing. External evaluation confirmed its effectiveness and generalizability using CXR TB compatibility, expert reading, microbiological confirmation, and case definition as reference standards. Age-specific models (<5 and 5-18 years old) performed competitively with those trained on larger, undifferentiated populations, and incorporating lateral CXRs improved diagnosis in younger children compared with using only frontal CXRs. Comparisons across different age groups demonstrated the robustness of the model, indicating its promise for improving TB diagnosis across ages, particularly in resource-limited settings.

pTBLightNet - Graphical Abstract

Installation

We strongly recommend starting by creating a virtual environment using Miniconda or Anaconda. Access to GPU resources is also highly recommended for optimal performance. This code was developed with CUDA 11.8 and Python 3.9, though newer versions may work with minimal adjustments. After setting up the environment, please install the required libraries using pip (installation time should be under 10 minutes in a typical setup):

pip install -r requirements.txt

Weights

You can download the model weights from this link: https://huggingface.co/dani-capellan/pTBLightNet.

To use these weights, simply copy the weights directory into the main project folder of the corresponding code repository. The model will automatically locate the weights following the internal path structure.
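If you prefer to fetch the weights programmatically, the snippet below is a minimal sketch using the huggingface_hub library (install it with pip if needed; downloading manually from the link above works just as well). You may need to adjust local_dir so that the resulting layout matches the weights directory described above:

    from huggingface_hub import snapshot_download

    # Download the pTBLightNet weights repository into ./weights so the
    # model can locate the files via its internal path structure.
    snapshot_download(repo_id="dani-capellan/pTBLightNet", local_dir="weights")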

Usage

In this code, we provide a Minimal Working Example (MWE). The example data provided (3 internal and 1 external independent testing cases) is located in data/. There, you will find a CSV file with the dataset metadata and a PKL file with the images.

IMPORTANT: Ensure that all images are cropped to the lung region, as this is required for optimal model performance.
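For reference, here is a minimal sketch of cropping an image to the lung region, assuming you already have a binary lung mask (e.g., from a lung segmentation model; this repository does not perform the cropping for you):

    import numpy as np

    def crop_to_lungs(img: np.ndarray, lung_mask: np.ndarray, margin: int = 10) -> np.ndarray:
        """Crop img to the bounding box of a binary lung mask, plus a small margin."""
        ys, xs = np.nonzero(lung_mask)
        y0 = max(int(ys.min()) - margin, 0)
        y1 = min(int(ys.max()) + 1 + margin, img.shape[0])
        x0 = max(int(xs.min()) - margin, 0)
        x1 = min(int(xs.max()) + 1 + margin, img.shape[1])
        return img[y0:y1, x0:x1]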

The images are first converted into a .pkl file; the code then reads this file and processes the images. Although some example images are already converted in data/mwe_data.pkl, we also provide a script to convert custom data into this format. The command to run this process is as follows:

python ./data/img_to_pkl.py
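For reference, the snippet below sketches roughly what such a conversion involves; the actual logic lives in data/img_to_pkl.py, and the file paths and resize step here are assumptions:

    import pickle
    import cv2

    # Build the {filename: {"AP": array, "LAT": array}} structure described
    # in the "Use Custom Dataset" section below, resizing each view to 256x256.
    data = {}
    for filename, ap_path, lat_path in [("COH_001.jpg", "imgs/COH_001_AP.jpg", "imgs/COH_001_LAT.jpg")]:
        ap = cv2.resize(cv2.imread(ap_path, cv2.IMREAD_GRAYSCALE), (256, 256))
        lat = cv2.resize(cv2.imread(lat_path, cv2.IMREAD_GRAYSCALE), (256, 256))
        data[filename] = {"AP": ap, "LAT": lat}

    with open("data/custom_data.pkl", "wb") as f:
        pickle.dump(data, f)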

Inference (MWE)

To continue with the MWE, run inference on the provided example data with the following command:

bash run_test_mwe.sh

This script will do the following:

  1. Run inference on the example cases with the AP model. Results will be stored in results_test_mwe/AP. In this step, Grad-CAMs will be generated automatically.

  2. Run inference on the example cases with the LAT model. Results will be stored in results_test_mwe/LAT. In this step, Grad-CAMs will be generated automatically.

  3. Run inference on the example cases, ensembling features from both the AP and LAT models. Results will be stored in results_test_mwe/AP-LAT_ensemble. Steps 1 and 2 are not required to run this third step; however, Grad-CAMs are only generated in steps 1 and 2, so run them separately if you need them.

Note: MWE inference time is under 1 minute on an NVIDIA 4090 24 GB GPU.

Use Custom Dataset

You can use your own dataset by following these steps:

  1. Create a CSV file describing your custom dataset (see data/dataset_mwe.csv and follow the same format). The data split should already be defined at this point (the split column and, if using cross-validation, the fold_cv column contain this information).

  2. Create a pickle file (.pkl) with a Python dict containing the images as NumPy arrays, structured the following way:

    dict: {
        <filename1>: {"AP": <NumPy Array1, 256x256>, "LAT": <NumPy Array1, 256x256>},
        <filename2>: {"AP": <NumPy Array2, 256x256>, "LAT": <NumPy Array2, 256x256>},
        <filename3>: {"AP": <NumPy Array3, 256x256>, "LAT": <NumPy Array3, 256x256>},
        ...
        <filenameN>: {"AP": <NumPy ArrayN, 256x256>, "LAT": <NumPy ArrayN, 256x256>},
    }
    

    Here, <filename> corresponds to the value of the filename field in the corresponding row of the dataset CSV. This structure facilitates data access and allows faster training than reading and preprocessing the images at each step. If the LAT view is not available, that field can be left empty.

  3. Once you have a CSV file and a PKL file for your custom dataset, you can generate a custom config.yaml configuration file. Please refer to the config/ directory and follow the same structure. A sanity-check sketch for the CSV/PKL pair follows this list.
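Before running inference, it can help to sanity-check that the CSV and PKL files are consistent. The following minimal sketch assumes the format described above (the file names are hypothetical; adapt them to your dataset):

    import pickle
    import pandas as pd

    df = pd.read_csv("data/dataset_custom.csv")    # hypothetical custom CSV
    with open("data/custom_data.pkl", "rb") as f:  # hypothetical custom PKL
        data = pickle.load(f)

    for filename in df["filename"]:
        assert filename in data, f"{filename} missing from the PKL file"
        ap = data[filename]["AP"]  # the LAT view may be empty if unavailable
        assert ap.shape == (256, 256), f"{filename}: AP is {ap.shape}, expected 256x256"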

Inference on Custom Data

In order to test one or more trained models on your own data, follow these steps:

  1. Once you have a CSV file and a PKL file for your custom dataset, you must generate a custom config.yaml configuration file. This file contains all the settings for the inference process. A default, functional version is already provided in the config/ directory. We encourage users to experiment with different configurations.

  2. Use the following command to run inference with the single AP or LAT model:

    python test.py -cfg <path_to_config_file>

  3. Use the following command to run inference with the AP & LAT model ensemble:

    python test_ensemble.py -cfg <path_to_config_file>

  4. You can check the output results in the out_dir directory defined in the configuration file. A sketch for inspecting the configuration follows this list.
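For orientation, the snippet below loads a configuration file with PyYAML and reads the two fields this README mentions (out_dir and clahe -> enabled); the rest of the schema, and the exact file name, should be taken from the files in the config/ directory rather than from this sketch:

    import yaml

    # "config/config.yaml" is an assumed path; use the default file shipped in config/.
    with open("config/config.yaml") as f:
        cfg = yaml.safe_load(f)

    print(cfg.get("out_dir"))                    # where inference results are written
    print(cfg.get("clahe", {}).get("enabled"))   # whether CLAHE preprocessing is applied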

Useful Tips and Considerations

  • By default, CLAHE preprocessing is applied to the input images. If CLAHE is not desired in the preprocessing step, change the clahe -> enabled parameter in the corresponding config file. A standalone CLAHE illustration follows this list.

  • To select a specific GPU on which to run training or testing, prefix the command with CUDA_VISIBLE_DEVICES=0,1,.... Example:

     CUDA_VISIBLE_DEVICES=0 python test.py

  • The program does not take the patient_id, age_yo, and sex columns of the input CSV file into account; you can omit that data or set the values to 0. Moreover, if cross-validation (CV) is enabled, please make sure that the fold_cv column is properly included in the CSV.

  • Suggested file naming for custom datasets: <COHORT IDENTIFIER>_<CASE_IDENTIFIER>.<FORMAT>. Example: COH_001.jpg
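To see what CLAHE does on its own, the snippet below is a minimal sketch using OpenCV on a single grayscale image; the clip limit and tile grid size are illustrative defaults, not necessarily the values used by the pipeline (those are set in the config file):

    import cv2

    # Contrast Limited Adaptive Histogram Equalization (CLAHE) on an 8-bit
    # grayscale CXR; the input filename is hypothetical.
    img = cv2.imread("COH_001.jpg", cv2.IMREAD_GRAYSCALE)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = clahe.apply(img)
    cv2.imwrite("COH_001_clahe.jpg", enhanced)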

How to cite

If you use pTBLightNet in your research, please cite:

  • Capellán-Martín, D., Gómez-Valverde, J. J., Sánchez-Jacob, R., et al. (2025, October). Multi-view deep learning framework for the detection of chest X-rays compatible with pediatric pulmonary tuberculosis. Nature Communications. https://doi.org/10.1038/s41467-025-64391-1

  • Capellán-Martín, D., Gómez-Valverde, J. J., Bermejo-Peláez, D., & Ledesma-Carbayo, M. J. (2023, April). A lightweight, rapid and efficient deep convolutional network for chest x-ray tuberculosis detection. In 2023 IEEE 20th International Symposium on Biomedical Imaging (ISBI) (pp. 1-5). IEEE. https://doi.org/10.1109/ISBI53787.2023.10230500

  • Capellán-Martín, D., Gómez-Valverde, J. J., Sanchez-Jacob, R., Bermejo-Peláez, D., García-Delgado, L., López-Varela, E., & Ledesma-Carbayo, M. J. (2023, April). Deep learning-based lung segmentation and automatic regional template in chest X-ray images for pediatric tuberculosis. In Medical Imaging 2023: Computer-Aided Diagnosis (Vol. 12465, pp. 451-459). SPIE. https://doi.org/10.1117/12.2652626

How to contribute

If you have questions or suggestions, feel free to open an issue or PR at https://github.com/dani-capellan/pTBLightNet.

Contact

Corresponding authors: [email protected] | [email protected] | [email protected]

License

CC BY-NC-ND 4.0

This project is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International (CC BY-NC-ND) License. You are free to share (copy and redistribute) the material in any medium or format for non-commercial purposes only, without modifications, and with appropriate attribution. You may not use the material for commercial purposes, and you may not distribute modified versions of this work.

For more details, see the full license text in the LICENSE file or visit the Creative Commons website.

If you are interested in using this work commercially or wish to request permission for a derivative work, please contact the authors to discuss potential licensing arrangements.

© 2025 Biomedical Image Technologies - Universidad Politécnica de Madrid

