Disclaimer: The scripts and models presented here have only been tested on a very limited dataset. This repository is not production-ready and is not intended to be used for clinical diagnosis. Please use it responsibly and at your own risk.
This repository is being made available for free, in the hope that it will prove useful to someone during this ongoing global pandemic. Attribution is not required but will be appreciated if you find this repository useful.
If you have access to reliable PA/AP chest X-ray images that are not included in the training data listed in the Data section below, and that you would like to share to help improve this model, please respond here.
The inference pipeline uses two models:
- Segmentation model (U-Net, `resnet34` backbone)
- Classifier (`resnet34`)
Prediction is performed as follows:
- Lungs are identified in the input image by the segmentation model
- The bounding box is computed for the region containing the lungs
- The input image is cropped and some additional preprocessing is performed on the cropped image (CLAHE, thresholding)
- A prediction (COVID-19 / Normal / Pneumonia) is obtained from the classifier model, along with an optional heatmap
Here are a few examples that give a visual representation of the steps above:
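The crop step above can be sketched as follows. This is a minimal illustration, not code from the repository: `lung_bbox` and `pad` are hypothetical names, the toy mask stands in for the U-Net output, and the subsequent CLAHE/thresholding preprocessing is omitted.

```python
import numpy as np

def lung_bbox(mask, pad=0):
    """Bounding box (y0, y1, x0, x1) of the nonzero pixels in a binary
    lung mask, optionally padded and clamped to the image borders."""
    ys, xs = np.nonzero(mask)
    h, w = mask.shape
    y0 = max(int(ys.min()) - pad, 0)
    y1 = min(int(ys.max()) + 1 + pad, h)
    x0 = max(int(xs.min()) - pad, 0)
    x1 = min(int(xs.max()) + 1 + pad, w)
    return y0, y1, x0, x1

# Toy mask standing in for the segmentation model's output:
# "lungs" occupy rows 20..69 and columns 10..59 of a 100x100 image.
mask = np.zeros((100, 100), dtype=np.uint8)
mask[20:70, 10:60] = 1

y0, y1, x0, x1 = lung_bbox(mask)
crop = mask[y0:y1, x0:x1]            # the classifier would see this region
print((y0, y1, x0, x1), crop.shape)  # (20, 70, 10, 60) (50, 50)
```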
Confusion matrices for the results on two test sets are given below (rows: actual class, columns: predicted class).
Covid-Net test set
| | COVID-19 | Normal | Pneumonia | Sensitivity |
|---|---|---|---|---|
| COVID-19 | 94 | 4 | 2 | 0.9400 |
| Normal | 4 | 863 | 18 | 0.9751 |
| Pneumonia | 5 | 46 | 543 | 0.9141 |
| P.P.V. | 0.9126 | 0.9452 | 0.9645 | |
Non-public test set + 20% of RICORD data
| | COVID-19 | Normal | Pneumonia | Sensitivity |
|---|---|---|---|---|
| COVID-19 | 117 | 1 | 0 | 0.9915 |
| Normal | 4 | 25 | 0 | 0.8621 |
| Pneumonia | 0 | 0 | 0 | - |
| P.P.V. | 0.9669 | 0.9615 | - | |
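As a sanity check, the per-class sensitivity and P.P.V. figures can be recomputed from the raw counts of the Covid-Net test-set matrix above (rows are actual classes, columns are predictions; class order: COVID-19, Normal, Pneumonia):

```python
# Confusion matrix from the Covid-Net test set table above.
cm = [
    [94,   4,   2],
    [4,  863,  18],
    [5,   46, 543],
]

# Sensitivity (recall) = diagonal / row sum; P.P.V. (precision) = diagonal / column sum.
sensitivity = [row[i] / sum(row) for i, row in enumerate(cm)]
ppv = [cm[j][j] / sum(row[j] for row in cm) for j in range(3)]

print([round(s, 4) for s in sensitivity])  # [0.94, 0.9751, 0.9141]
print([round(p, 4) for p in ppv])          # [0.9126, 0.9452, 0.9645]
```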
The `env` folder contains scripts to help set up an environment for using the code in this repository on an Ubuntu 18.04 host. These scripts may also work under other Debian-based distros, but have not been tested. In any case, it should be trivial to adapt them to most environments.
`setup.sh`
- This script is meant to be run as root.

`setup-user.sh`
- Run this script as the user that will use the repository. By default, the CPU version of `pytorch` is installed. To install the CUDA (v10.1) version, before running the script, comment out the line:

```
pip3 install torch==1.6.0+cpu torchvision==0.7.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
```

and uncomment the line that reads:

```
pip3 install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
```
`download-models.sh`
- Downloads the latest version of the trained models. This is invoked automatically when you run `setup-user.sh`, but is provided as a separate script to simplify acquisition of new models when they become available.
Once the environment is set up correctly, it should be possible to run `inference.py` from the `inference` folder to produce predictions on individual images or folders containing images.
```
python3 inference.py --help
usage: inference.py [-h] --config CONFIG --xraypath XRAYPATH
                    [--heatmappath HEATMAPPATH]

COVID-19_CXR_AI Inference

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG       Config file path
  --xraypath XRAYPATH   Full path to image (or dir containing images) to be
                        inferenced
  --heatmappath HEATMAPPATH
                        Directory in which generated heatmaps are to be stored
```
Provided that the models are placed in the default location, i.e. the `models/current` folder, it should be possible to use the included `model-config.json` file as-is.
Ideally, use full-sized X-Ray images in PNG format.
Should you wish to train the models further or retrain from scratch, the public data that was used for training is listed in the Data section below. The following notebooks can serve as guidelines for training:
- `segmentation/segmentation-train.ipynb`
- `classification/classifier-train.ipynb`
The notebooks used to create usable datasets for training are in the `datasets` folder. Please note that these notebooks create hard links to the original images to avoid duplication. Therefore, it is advisable to put the final datasets on the same logical disk partition as the original images.
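The hard-linking approach can be sketched with the standard library. The paths below are hypothetical placeholders; the point is that `os.link` creates a second directory entry for the same inode without copying data, and it raises `OSError` when source and destination are on different partitions, which is why the datasets should live on the same partition as the originals.

```python
import os
import tempfile

# Hypothetical layout standing in for the original images and the
# generated dataset; both live under one temp dir (same partition).
base = tempfile.mkdtemp()
src = os.path.join(base, "originals", "img001.png")
dst = os.path.join(base, "dataset", "train", "img001.png")

os.makedirs(os.path.dirname(src))
os.makedirs(os.path.dirname(dst))
with open(src, "wb") as f:
    f.write(b"fake image bytes")

# Hard link: two names for the same file, no duplicated bytes on disk.
os.link(src, dst)
print(os.path.samefile(src, dst))  # True
```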
The data used for training was acquired from the sources listed below.
- NLM Tuberculosis Chest X-ray Image Data Sets
- Shenzhen subset segmentation masks
- Additional non-public, manually segmented (using Fiji) images
Compiled by the Covid-Net team:
Additional
To create the training datasets:
- Download images from the above links
- Convert DICOM images to PNG using tools of your choice (e.g. `mogrify` or `convert` from `imagemagick`). Please note that `create_COVIDx_v2_RICORD.ipynb` expects the converted RICORD images to retain the original folder structure.
- Specify appropriate paths in `segmentation-prepare.ipynb` and run the notebook to create training data for segmentation
- Specify appropriate paths in `create_COVIDx_v2_RICORD.ipynb` and run the notebook to create a Covid-Net style dataset
- Specify appropriate paths in `segmentation-apply.ipynb` and run the notebook to:
  - apply segmentation to the classification training images
  - save lung bounds for all the training images
  - transform the dataset into the expected form for classifier training
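The "retain the original folder structure" requirement for the converted RICORD images can be sketched as a tree-mirroring walk. This is an illustrative sketch, not repository code: `convert_tree` and `convert_fn` are hypothetical names, and the actual DICOM-to-PNG step is stubbed out (it could be pydicom + Pillow, or ImageMagick, none of which is assumed here).

```python
import os
import tempfile

def convert_tree(src_root, dst_root, convert_fn):
    """Convert every .dcm under src_root, writing each .png under dst_root
    at the same relative path, so the original folder structure is kept."""
    for dirpath, _dirs, files in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        for name in files:
            if not name.lower().endswith(".dcm"):
                continue
            out_dir = os.path.join(dst_root, rel)
            os.makedirs(out_dir, exist_ok=True)
            convert_fn(os.path.join(dirpath, name),
                       os.path.join(out_dir, name[:-4] + ".png"))

# Toy demonstration with a stub converter that just touches the output file.
src_root = tempfile.mkdtemp()
dst_root = tempfile.mkdtemp()
os.makedirs(os.path.join(src_root, "patient01", "study1"))
open(os.path.join(src_root, "patient01", "study1", "slice.dcm"), "w").close()

convert_tree(src_root, dst_root, lambda s, d: open(d, "w").close())
print(os.path.exists(
    os.path.join(dst_root, "patient01", "study1", "slice.png")))  # True
```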
Acknowledgements:
- The Covid-Net project for their pioneering work in this field and for creating a comprehensive collection of training data