teamedward

Installation Instructions

  1. Start an Ubuntu 20.04 instance of type t3.large or larger, with EBSVolumeSize set to 100 GB.
  2. SSH into the instance.
  3. Run the following commands:
sudo apt update
sudo apt -y install python3-pip
git clone https://github.com/ben-dom393/teamedward.git
cd teamedward
pip install -r requirements.txt

Running Prediction Script on Sample Dataset (~10 seconds)

# Make sure you are in teamedward/ directory
python3 predict_script.py sample_dataset.json predictions.csv

Output: predictions.csv, written to the current directory, with the columns transcript_id, transcript_position and score (the predicted probability of m6A modification).
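
To inspect the output, the predictions can be filtered by score with the standard library alone. The following is a minimal sketch, assuming predictions.csv starts with a header row that uses exactly the three column names listed above. The helper name high_confidence_sites, the 0.5 threshold and the sample rows are illustrative placeholders, not values from the repository.

```python
import csv
import io


def high_confidence_sites(csv_file, threshold=0.5):
    """Return (transcript_id, transcript_position, score) for rows with score >= threshold.

    Assumes a header row with the columns transcript_id, transcript_position and score.
    """
    hits = []
    for row in csv.DictReader(csv_file):
        score = float(row["score"])
        if score >= threshold:
            # transcript_position is kept as the raw string read from the file
            hits.append((row["transcript_id"], row["transcript_position"], score))
    return hits


# Made-up rows standing in for a real predictions.csv (hypothetical IDs and scores)
example = io.StringIO(
    "transcript_id,transcript_position,score\n"
    "tx_A,100,0.91\n"
    "tx_A,250,0.08\n"
    "tx_B,42,0.50\n"
)
print(high_confidence_sites(example, threshold=0.5))
```

For a real run, pass a file opened with open("predictions.csv", newline="") instead of the in-memory example.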

Prediction Script Manual

Description: Generate predictions for RNA-seq data

Usage: predict_script.py [-h] [-m MODEL] [-s SCALER] [-e ENCODER] json_data_dir output_dir

positional arguments:
  json_data_dir         File path for RNA-seq data (.json)
  output_dir            File path for predictions output (.csv)

optional arguments:
  -h, --help            show this help message and exit
  -m MODEL, --model MODEL
                        File path for fitted model object (.h5). Default: models/fitted_model.h5
  -s SCALER, --scaler SCALER
                        File path for fitted scaler object (.pkl). Default: models/fitted_scaler.pkl
  -e ENCODER, --encoder ENCODER
                        File path for fitted one-hot encoder object (.pkl). Default: models/fitted_encoder.pkl
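
The interaction between the optional flags and their defaults can be illustrated with argparse. This is a sketch that mirrors the usage text above; it is not the actual source of predict_script.py, and the function name build_parser is hypothetical. It shows that overriding one flag (here -m) leaves the other two at their documented defaults.

```python
import argparse


def build_parser():
    """Sketch of a parser matching the documented usage (not the script's real code)."""
    parser = argparse.ArgumentParser(
        prog="predict_script.py",
        description="Generate predictions for RNA-seq data",
    )
    parser.add_argument("json_data_dir", help="File path for RNA-seq data (.json)")
    parser.add_argument("output_dir", help="File path for predictions output (.csv)")
    parser.add_argument("-m", "--model", default="models/fitted_model.h5",
                        help="File path for fitted model object (.h5)")
    parser.add_argument("-s", "--scaler", default="models/fitted_scaler.pkl",
                        help="File path for fitted scaler object (.pkl)")
    parser.add_argument("-e", "--encoder", default="models/fitted_encoder.pkl",
                        help="File path for fitted one-hot encoder object (.pkl)")
    return parser


# Override only the model path; scaler and encoder fall back to their defaults
args = build_parser().parse_args(
    ["-m", "model1_model.h5", "sample_dataset.json", "predictions.csv"]
)
print(args.model, args.scaler, args.encoder)
```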

Running Training Script on Sample Dataset (~15 seconds)

# Make sure you are in teamedward/ directory
python3 train_script.py sample_dataset.json data.info model1

Output: the fitted Keras model model1_model.h5, the scaler model1_scaler.pkl and the one-hot encoder model1_encoder.pkl, all written to the current directory.
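
After training, it can be useful to confirm that all three files were written. The sketch below assumes the naming pattern suggested by the model1 example, that is <model_name>_model.h5, <model_name>_scaler.pkl and <model_name>_encoder.pkl. The helper names expected_artifacts and missing_artifacts are hypothetical and not part of the repository.

```python
from pathlib import Path


def expected_artifacts(model_name, directory="."):
    """Paths the training run is expected to produce (pattern inferred from the model1 example)."""
    base = Path(directory)
    return {
        "model": base / f"{model_name}_model.h5",
        "scaler": base / f"{model_name}_scaler.pkl",
        "encoder": base / f"{model_name}_encoder.pkl",
    }


def missing_artifacts(model_name, directory="."):
    """Return the expected files that do not exist yet, as strings."""
    return [
        str(path)
        for path in expected_artifacts(model_name, directory).values()
        if not path.exists()
    ]


# An empty list means all three files are present in the current directory
print(missing_artifacts("model1"))
```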

Training Script Manual

Description: Train an ML model to predict m6A modification

Usage: train_script.py [-h] json_data_dir data_info_dir model_name

positional arguments:
  json_data_dir  File path for RNA-seq data (.json)
  data_info_dir  File path for m6A labels (.info)
  model_name     Name of the model. The scaler and encoder files are named after it as well.

optional arguments:
  -h, --help     show this help message and exit
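
The two scripts can be chained so that predictions use a freshly trained model rather than the defaults under models/. The following is a sketch under two assumptions: the training outputs follow the <model_name>_model.h5 / _scaler.pkl / _encoder.pkl pattern shown in the model1 example, and both scripts are run from the teamedward/ directory. The function names are hypothetical.

```python
import shlex
import subprocess


def train_then_predict_commands(train_json, info_file, model_name, predict_json, output_csv):
    """Build the two command lines: train first, then predict with the new artifacts."""
    train_cmd = ["python3", "train_script.py", train_json, info_file, model_name]
    predict_cmd = [
        "python3", "predict_script.py",
        "-m", f"{model_name}_model.h5",      # assumed output name of the fitted model
        "-s", f"{model_name}_scaler.pkl",    # assumed output name of the fitted scaler
        "-e", f"{model_name}_encoder.pkl",   # assumed output name of the fitted encoder
        predict_json, output_csv,
    ]
    return train_cmd, predict_cmd


def run_pipeline(*args):
    """Run both steps in order, stopping if either command exits with an error."""
    for cmd in train_then_predict_commands(*args):
        subprocess.run(cmd, check=True)


# Print the commands without executing them
for cmd in train_then_predict_commands(
    "sample_dataset.json", "data.info", "model1", "sample_dataset.json", "predictions.csv"
):
    print(shlex.join(cmd))
```

Calling run_pipeline with the same five arguments executes the commands instead of printing them.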
