|
2 | 2 |
|
3 | 3 | Welcome to the RCDML Project!
|
4 | 4 |
|
5 |
| -The RNA-seq Count Drug-response Machine Learning (RCDML) Workflow is a ML workflow that can be used for drug response classification of rare disease patients. Given drug response data and RNA-seq count data, the model follows the ML workflow below to classify patients as "high responder" or "low responder" for a given inhibitor. |
| 5 | +The ***R**NA-seq **C**ount **D**rug-response **M**achine **L**earning **(RCDML)*** Workflow is an ML workflow for drug response classification of rare disease patients. Given drug response data and RNA-seq count data, it follows the steps below to classify patients as "high responder" or "low responder" for a given inhibitor. |
6 | 6 |
|
7 |
| -The ML pipeline was evaluated using RNA-seq count and drug response data available for Acute Myeloid Leukemia (AML) patients and over 100 different drugs as part of the BeatAML project. |
| 7 | +The *RCDML* pipeline was evaluated using RNA-seq count and drug response data available for Acute Myeloid Leukemia (AML) patients and over 100 different drugs as part of the BeatAML project. |
8 | 8 |
|
9 |
| -For documentation on how to get started with the ML workflow visit the [wiki](https://github.com/UD-CRPL/RCDML/wiki). |
| 9 | +**The *RCDML* source code consists of:** |
| 10 | + |
| 11 | +`parser.py` – Contains the code used to parse the configuration file. |
| 12 | + |
| 13 | +`data_preprocessing.py` – Contains the code that loads and configures the dataset, and the code for assigning responder/non-responder labels. |
| 14 | + |
| 15 | +`feature_selection.py` – Contains the implementations of the feature selection techniques and the feature counter generator. |
| 16 | + |
| 17 | +`classification.py` – Contains the implementation of the classifiers and the hyperparameter optimization technique. The hyperparameter lists are also defined here. |
| 18 | + |
| 19 | +`validation.py` – Contains the code used to create confusion matrices and ROC curves, run inference, and make predictions. |
| 20 | + |
| 21 | +`main.py` – Contains the framework structure. This is the script to call to run the ML pipeline. |
| 22 | + |
| 23 | +`parameters.cfg` – Parameter configuration for the ML pipeline run. The user can select the feature selection techniques, classifiers, and other options that will be used in the run. |
| 24 | + |
| 25 | +`/tools/` – Utility tools for gathering results, creating feature counters, getting the family drug list, etc. For more information on each tool, follow this wiki page link. |
| 26 | + |
| 27 | +`/setup/` – Contains the conda environment yml file and parameter configuration presets. For more information on each setup preset, follow this wiki page. |
| 28 | + |
| 29 | + |
| 30 | +For documentation on how to get started with the *RCDML* workflow, visit the [wiki](https://github.com/UD-CRPL/RCDML/wiki). |
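
The file list above describes a configuration file (`parameters.cfg`) in which the user selects feature selection techniques, classifiers, and other options, and a parser that reads it. The following is only a minimal sketch of that idea, assuming an INI-style file read with Python's standard `configparser`. The section name `run`, the keys, and every value shown are hypothetical and not taken from the RCDML repository; the real `parameters.cfg` and `parser.py` may look quite different, so consult the wiki for the actual options.

```python
import configparser

# Hypothetical configuration text. The real parameters.cfg may use
# different sections, keys, and even a different syntax.
EXAMPLE_CFG = """
[run]
feature_selection = method_a, method_b
classifiers = classifier_x, classifier_y
drug = ExampleInhibitor
"""


def parse_list(value):
    """Split a comma-separated option into a list of stripped, non-empty names."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_run_options(text):
    """Parse the hypothetical [run] section into a plain dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    run = parser["run"]
    return {
        "feature_selection": parse_list(run.get("feature_selection", "")),
        "classifiers": parse_list(run.get("classifiers", "")),
        "drug": run.get("drug", ""),
    }


options = load_run_options(EXAMPLE_CFG)
print(options)
```

Keeping the parsed options in a plain dictionary lets the rest of a pipeline loop over the selected techniques without depending on the configuration syntax.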