Commit b95b614

description on source files
1 parent 28e6b22 commit b95b614

1 file changed: +24 -3 lines changed

README.md

@@ -2,8 +2,29 @@
 
 Welcome to the RCDML Project!
 
-The RNA-seq Count Drug-response Machine Learning (RCDML) Workflow is a ML workflow that can be used for drug response classification of rare disease patients. Given drug response data and RNA-seq count data, the model follows the ML workflow below to classify patients as "high responder" or "low responder" for a given inhibitor.
+The ***R**NA-seq **C**ount **D**rug-response **M**achine **L**earning **(RCDML)*** Workflow is a ML workflow that can be used for drug response classification of rare disease patients. Given drug response data and RNA-seq count data, the model follows the ML workflow below to classify patients as "high responder" or "low responder" for a given inhibitor.
 
-The ML pipeline was evaluated using RNA-seq count and drug response data available for Acute Myeloid Leukemia (AML) patients and over 100 different drugs as part of the BeatAML project.
+The *RCDML* pipeline was evaluated using RNA-seq count and drug response data available for Acute Myeloid Leukemia (AML) patients and over 100 different drugs as part of the BeatAML project.
 
-For documentation on how to get started with the ML workflow visit the [wiki](https://github.com/UD-CRPL/RCDML/wiki).
+**The *RCDML* source code consists of:**
+
+`parser.py` – Contains the code used to parse the configuration file.
+
+`data_preprocessing.py` – Contains the code that loads and configures the dataset, and the code for assigning responder/non-responder labels.
+
+`feature_selection.py` – Contains the implementations of the feature selection techniques and the feature counter generator.
+
+`classification.py` – Contains the implementation of the classifiers and the hyperparameter optimization technique. The hyperparameter lists used are found here.
+
+`validation.py` – Contains the code used to create the confusion matrices and ROC curves, run inference, and make predictions.
+
+`main.py` – Contains the framework structure. This is the script that needs to get called to run the ML pipeline.
+
+`parameters.cfg` – Parameter configuration for the ML pipeline run. The user can select the feature selection techniques, classifiers, and other options that will be used in the run.
+
+`/tools/` - Utility tools for gathering results, creating feature counters, getting the family drug list, etc. For more information on each tool follow this wiki page link.
+
+`/setup/` - Contains conda environment yml file and parameter configuration presets. For more information on each setup preset follow this wiki page.
+
+
+For documentation on how to get started with the *RCDML* workflow visit the [wiki](https://github.com/UD-CRPL/RCDML/wiki).
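
The README text in this diff says that `parser.py` parses the configuration file and that `parameters.cfg` lets the user pick feature selection techniques and classifiers. The commit does not show the file's actual layout, so the following is only a minimal sketch of how such a file could be read. It assumes an INI-style format handled by Python's standard `configparser`. The section name `run`, every key, every value, and the helper `load_run_options` are hypothetical and are not taken from the RCDML repository.

```python
import configparser

# Hypothetical configuration text. The real parameters.cfg may use
# different sections, keys, and values; these are placeholders only.
EXAMPLE_CFG = """
[run]
feature_selection = method_a, method_b
classifiers = classifier_x, classifier_y
drug = ExampleInhibitor
"""


def load_run_options(cfg_text):
    """Parse INI-style text and return the run options as a dict.

    Hypothetical helper for illustration; not the RCDML parser.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    run = parser["run"]

    def split_list(value):
        # Turn "a, b" into ["a", "b"], dropping empty entries.
        return [item.strip() for item in value.split(",") if item.strip()]

    return {
        "feature_selection": split_list(run.get("feature_selection", fallback="")),
        "classifiers": split_list(run.get("classifiers", fallback="")),
        "drug": run.get("drug", fallback=""),
    }


options = load_run_options(EXAMPLE_CFG)
print(options)
```

For the real option names and accepted values, the project wiki linked above is the place to check.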
