A single-object API that makes working with biological sequences in Python more ergonomic. It handles anything that looks like a sequence.
Built around the Biopython SeqRecord class, SeqLikes abstract over the semantics of molecular biology (DNA -> RNA -> AA) and over data structures (strings, Seqs, SeqRecords, numerical encodings), so that a biological sequence can be manipulated at whichever level is most computationally convenient.
def f(seq: SeqLikeType, *args):
    seq = SeqLike(seq, seq_type="nt").to_seqrecord()
    # ...
prediction = model(aaSeqLike('MSKGEELFTG').to_onehot())
new_seq = ntSeqLike(generative_model.sample(), alphabet="-ACGTUN")
Back-translation is built in!
s_nt = ntSeqLike("ATGTCTAAAGGTGAA")
s_nt[0:3] # ATG
s_nt.aa()[0:3] # MSK, nt->aa is well defined
s_nt.aa()[0:3].nt() # ATGTCTAAA, works because SeqLike now has both reps
s_nt[:-1].aa() # TypeError, len(s_nt) not a multiple of 3
s_aa = aaSeqLike("MSKGE")
s_aa.nt() # AttributeError, aa->nt is undefined w/o codon map
s_aa = aaSeqLike(s_aa, codon_map=random_codon_map)
s_aa.nt() # now works, backtranslated to e.g. ATGTCTAAAGGTGAA
s_aa[:1].nt() # ATG, codon_map is maintained
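To make the role of the codon map concrete, here is a minimal pure-Python sketch of the asymmetry between translation and back-translation. It is not SeqLike's implementation: the names `CODON_TABLE`, `translate`, `FIRST_CODON`, and `back_translate` are hypothetical, and the "alphabetically first codon" rule is just one arbitrary choice of codon map.

```python
from itertools import product

# Standard genetic code; codons are enumerated with bases in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt: str) -> str:
    """nt -> aa is well defined: each codon maps to exactly one residue."""
    if len(nt) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[nt[i : i + 3]] for i in range(0, len(nt), 3))


# Hypothetical codon map: for each residue, pick the alphabetically first codon.
# Any other rule (random choice, organism-specific usage) would work as well.
FIRST_CODON = {}
for codon in sorted(CODON_TABLE):
    FIRST_CODON.setdefault(CODON_TABLE[codon], codon)


def back_translate(aa: str, codon_map=FIRST_CODON) -> str:
    """aa -> nt needs a codon map because most residues have several codons."""
    return "".join(codon_map[residue] for residue in aa)


protein = translate("ATGTCTAAAGGTGAA")  # "MSKGE"
nt_again = back_translate(protein)      # "ATGAGCAAAGGAGAA"
```

The round trip recovers the protein but not the original nucleotides, which is why back-translation is undefined until a codon map is supplied.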
import pandas as pd
from Bio import SeqIO

# Read every record from the FASTA file
seqs = list(SeqIO.parse("file.fasta", "fasta"))
df = pd.DataFrame(
    {
        "names": [s.name for s in seqs],
        "seqs": [aaSeqLike(s) for s in seqs],
    }
)
df["aligned"] = df["seqs"].seq.align()
df["aligned"].seq.plot()
# Assume you have a dataframe with a column of 10 SeqLikes of length 90
df["seqs"].seq.to_onehot().shape # (10, 90, 23), padded if needed
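The shape above can be read as (number of sequences, padded length, alphabet size). As a rough illustration only, the sketch below builds such a batch in plain Python. The 23-symbol alphabet (gap, stop, 20 standard residues, and `X`), its ordering, padding with the gap character, and the helper names `one_hot` and `batch_one_hot` are all assumptions made for this example, not SeqLike's documented behavior.

```python
# Assumed alphabet: gap, stop, 20 standard amino acids, X for unknown (23 symbols).
ALPHABET = "-*ACDEFGHIKLMNPQRSTVWYX"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot(seq: str, length: int, pad: str = "-") -> list:
    """Encode one sequence as a length x len(ALPHABET) matrix of 0s and 1s."""
    padded = seq.ljust(length, pad)  # right-pad short sequences (assumption)
    return [
        [1 if INDEX[ch] == j else 0 for j in range(len(ALPHABET))]
        for ch in padded
    ]


def batch_one_hot(seqs: list) -> list:
    """Pad every sequence to the longest one and stack the encodings."""
    length = max(len(s) for s in seqs)
    return [one_hot(s, length) for s in seqs]


batch = batch_one_hot(["MSKGE", "MSK"])
shape = (len(batch), len(batch[0]), len(batch[0][0]))  # (2, 5, 23)
```

Ten sequences of length 90 encoded this way would give the (10, 90, 23) shape quoted above.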
To see more in action, please check out the docs!
Install the library with pip or conda.
With pip
pip install seqlike
With conda
conda install -c conda-forge seqlike
- Questions about usage should be posed on Stack Overflow with the #seqlike tag.
- Bug reports and feature requests are managed using the GitHub issue tracker.
Thanks go to these wonderful people (emoji key):
- Nasos Dousis 💻
- andrew giessel 💻
- Max Wall 💻 📖
- Eric Ma 💻 📖
- Mihir Metkar 🤔 💻
- Marcus Caron 📖
- pagpires 📖
- Sugato Ray 🚇 🚧
- Damien Farrell 💻
- Farbod Mahmoudinobar 💻
- Jacob Hayes 🚇
This project follows the all-contributors specification. Contributions of any kind welcome!