Phyllochron (Maximum Likelihood Assignment for Longitudinal Reconstruction)

Phyllochron uses an integer linear program (ILP) to solve the Maximum Likelihood Longitudinal Assignment Problem and infer a Longitudinally Observed Perfect Phylogeny Time-Labelled Matrix.

Phyllochron takes as input:

Variant and total read counts associated with each cell
Discrete timepoints associated with each cell
A perfect phylogeny clone tree
A fractional threshold representing the minimum proportion of cells at a sample assigned to a clone for that clone to be present in that sample
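
The threshold in the last item can be illustrated with a short sketch. It is not how Phyllochron enforces the threshold internally; it only restates the definition on a toy assignment. The function name `present_clones` and the `(timepoint, clone)` pair format are assumptions made for this illustration.

```python
from collections import Counter


def present_clones(assignments, z):
    """Map each timepoint to the clones holding at least a fraction z of its cells.

    assignments: list of (timepoint, clone) pairs, one pair per cell
    (a hypothetical format chosen for this sketch).
    """
    cells_at_timepoint = Counter(t for t, _ in assignments)
    cells_in_clone = Counter(assignments)  # counts per (timepoint, clone) pair
    present = {t: set() for t in cells_at_timepoint}
    for (t, clone), n in cells_in_clone.items():
        if n / cells_at_timepoint[t] >= z:
            present[t].add(clone)
    return present


# Timepoint 0: 19 cells in clone "A", 1 cell in clone "B" (5% of 20 cells).
# Timepoint 1: 5 cells in each clone (50% each).
toy = [(0, "A")] * 19 + [(0, "B")] + [(1, "A")] * 5 + [(1, "B")] * 5
print(present_clones(toy, z=0.10))
```

With a threshold of 0.10, clone "B" does not count as present at timepoint 0 (1 of 20 cells) but does at timepoint 1 (5 of 10 cells).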

Pre-requisites (see .yaml file for versions)

python3 (>= 3.6.10)
numpy
pandas
gurobipy
networkx
scipy
snakemake (>= 5.2.0), only needed for generating simulation and real data instances

Usage instructions

I/O formats

The input for Phyllochron is

A comma-delimited file, which has on each line the variant read counts associated with a single cell for each mutation locus.
- Example: data/AML/input_data/AML-63_variant_readcounts.csv
A comma-delimited file, which has on each line the total read counts associated with a single cell for each mutation locus.
- Example: data/AML/input_data/AML-63_total_readcounts.csv

Alternatively, instead of read count information, Phyllochron can take a character matrix as input.
A comma-delimited file, which has on each line the character state associated with a single cell for each mutation locus.
- Example: data/AML/input_data/AML-63_character_matrix.csv
A comma-delimited file, which has on each line the timepoint associated with each cell.
- Example: data/AML/input_data/AML-63_timepoints.csv
A comma-delimited file, which has on each line a binary clone profile, so that the lines together give the mutation profiles of all present clones.
- Example: data/AML/input_data/AML-63_mutation_tree.csv
A fractional threshold z representing the minimum proportion of cells at a sample assigned to a clone for that clone to be present in that sample. For example, z = 0.10 means that at least 10% of cells in a sample must be assigned to a clone for it to be present in that sample.
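
Because the inputs arrive as separate files, it can help to confirm that they agree with each other before starting the solver. The sketch below is not part of Phyllochron; `check_inputs` is a hypothetical helper that works on arrays already loaded into memory, under the assumption that rows are cells and columns are mutation loci, as the descriptions above suggest. Whether the csv files carry header rows or index columns is not stated here, so loading is left out.

```python
import numpy as np


def check_inputs(variant, total, timepoints, clone_profiles):
    """Return a list of human-readable problems; an empty list means the inputs agree."""
    variant, total = np.asarray(variant), np.asarray(total)
    clone_profiles = np.asarray(clone_profiles)
    problems = []
    if variant.ndim != 2 or variant.shape != total.shape:
        return ["variant and total read counts must be 2-D with identical shape"]
    if (variant > total).any():
        problems.append("some variant read count exceeds its total read count")
    if len(timepoints) != variant.shape[0]:
        problems.append("need exactly one timepoint per cell")
    if clone_profiles.ndim != 2 or clone_profiles.shape[1] != variant.shape[1]:
        problems.append("clone profiles need one column per mutation locus")
    elif not np.isin(clone_profiles, [0, 1]).all():
        problems.append("clone profiles must be binary")
    return problems


# Toy data: 3 cells, 2 mutation loci, 3 clone profiles (all values made up).
variant = [[0, 3], [2, 0], [1, 4]]
total = [[10, 8], [9, 7], [5, 6]]
timepoints = [0, 0, 1]
clone_profiles = [[0, 0], [1, 0], [1, 1]]
print(check_inputs(variant, total, timepoints, clone_profiles))
```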

Phyllochron

usage: phyllochron.py [-i CHARACTER_MATRIX] [-t TIMEPOINTS] [--mutation-tree MUTATION_TREE] [-o OUTPUT_PREFIX] [-z Z] [-a FP] [-b FN] [--ado ADO] [--time-limit TIME_LIMIT]

       phyllochron.py [-r TOTAL_READS] [-v VARIANT_READS] [-t TIMEPOINTS] [--mutation-tree MUTATION_TREE] [-o OUTPUT_PREFIX] [-z Z] [-a FP] [-b FN] [--ado ADO] [--time-limit TIME_LIMIT]

required arguments:
  -i CHARACTER_MATRIX            filepath for the character matrix csv file
  or
  -r TOTAL_READS                 filepath for the total read counts csv file
  -v VARIANT_READS               filepath for the variant read counts csv file
  and
  --mutation-tree MUTATION_TREE  filepath for the mutation tree csv file
  -t TIMEPOINTS                  filepath for the timepoints file
  -o OUTPUT_PREFIX               filepath indicating the output prefix for all output files
optional arguments:
  -z Z                           fractional clonal presence threshold. Default is z = 0.05
  -a FP                          false positive error rate. Default is a = 0.001
  -b FN                          false negative error rate. Default is b = 0.001
  --ado ADO                      precision parameter for ADO. Default is ado = 15
  --time-limit TIME_LIMIT        time limit for solver in seconds. Default is 1800 seconds

An example of usage is as follows. This command can be run from the directory that contains this README file.

python src/phyllochron.py -r data/AML/input_data/AML-63_total_readcounts.csv -v data/AML/input_data/AML-63_variant_readcounts.csv -t data/AML/input_data/AML-63_timepoints.csv --mutation-tree data/AML/input_data/AML-63_mutation_tree.csv -o data/AML/output_data/AML-63 -z 0.05 -a 0.01 -b 0.038 --ado 15 --time-limit 1000
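
When running several samples, the same command can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_command` is a hypothetical helper, and it assumes that every sample follows the file naming pattern of the AML-63 example files. The flags and default values are the ones listed above.

```python
def build_command(sample, input_dir="data/AML/input_data",
                  output_dir="data/AML/output_data",
                  z=0.05, fp=0.001, fn=0.001, ado=15, time_limit=1800):
    """Build the read-count form of the Phyllochron command as an argument list.

    Assumes input files are named <sample>_total_readcounts.csv etc.,
    mirroring the AML-63 example files.
    """
    prefix = f"{input_dir}/{sample}"
    return [
        "python", "src/phyllochron.py",
        "-r", f"{prefix}_total_readcounts.csv",
        "-v", f"{prefix}_variant_readcounts.csv",
        "-t", f"{prefix}_timepoints.csv",
        "--mutation-tree", f"{prefix}_mutation_tree.csv",
        "-o", f"{output_dir}/{sample}",
        "-z", str(z), "-a", str(fp), "-b", str(fn),
        "--ado", str(ado), "--time-limit", str(time_limit),
    ]


# Reproduce the example command above for AML-63.
cmd = build_command("AML-63", fp=0.01, fn=0.038, time_limit=1000)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains this README file.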

Data

Currently, the csv files encoding the AML-63 and AML-97 read count data are stored in data/AML/input_data, and the cell assignments inferred by Phyllochron are stored in data/AML/output_data.
