I am an engineer with a deep passion for both biology and data. I love building things and applying my technical skills to complex problems, and I am especially drawn to proteins, genetics, and everything related to them. I have hands-on wet lab experience and extensive knowledge of proteins, RNA (including mRNA), and genes. My goal is to keep deepening my understanding of these fields and to contribute to advances in biopharma, biotech, and healthcare.
I am a data scientist with extensive knowledge of, and experience with, biological systems. Alongside my strong background in data science, I have spent nearly two years working at a biotech company in Canada.
HER2 Mutation Analysis and Peptide Generation
Analyzes HER2 mutations in the extracellular domain and generates high-affinity peptide candidates using Python. Planned: ML-based optimization and structural analysis. GitHub
Computational Protein Analysis
The Peptide class I developed handles a range of peptide-related calculations: generating random peptide sequences, computing physicochemical properties, and deriving sequence-specific descriptors. It encapsulates the underlying bioinformatics algorithms, including hydrophobicity, aliphatic index, and theoretical charge. I plan to extend the project with additional biological properties, such as structural class prediction, and with further QSAR descriptors (e.g., BLOSUM indices, Cruciani properties). GitHub
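To give a flavor of the calculations involved, here is a minimal sketch of three of them. It is not the code from the repository: the function names are hypothetical, the aliphatic index uses Ikai's formula, and hydrophobicity is assumed to mean the average Kyte-Doolittle hydropathy (GRAVY), which may differ from the scale the Peptide class actually uses.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale (assumed here; other scales exist).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def random_peptide(length, seed=None):
    """Return a random peptide of the given length over the 20 standard residues."""
    rng = random.Random(seed)
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def aliphatic_index(seq):
    """Ikai's aliphatic index: A + 2.9*V + 3.9*(I + L), each as mole percent."""
    n = len(seq)

    def mole_percent(residue):
        return 100.0 * seq.count(residue) / n

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


def mean_hydropathy(seq):
    """Average Kyte-Doolittle hydropathy over all residues (GRAVY)."""
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


if __name__ == "__main__":
    peptide = random_peptide(12, seed=42)
    print(peptide, aliphatic_index(peptide), mean_hydropathy(peptide))
```

Passing a `seed` makes the random sequence reproducible, which is convenient when comparing descriptor values across runs.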
Rosalind Bioinformatics
My solutions to bioinformatics problems from the Rosalind website, written primarily in Python. GitHub
DNA and Genomic Toolkits
DNA toolkit: counting nucleotides, transcription, reverse complement, GC-content calculation, translating DNA into amino acids, and finding proteins in a DNA sequence. Genomic toolkit: searching for k-mers. GitHub
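The operations listed above can be sketched in a few lines of plain Python. This is an illustrative version, not the repository code: the function names are hypothetical, the input is assumed to be an uppercase string over A, C, G, and T, and translation is assumed to use the standard genetic code and to stop at the first stop codon.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, with codons enumerated in T, C, A, G order.
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product("TCAG", repeat=3), _AMINO_ACIDS)
}


def count_nucleotides(dna):
    """Count occurrences of each nucleotide."""
    return Counter(dna)


def transcribe(dna):
    """Transcribe a coding-strand DNA sequence into RNA (T becomes U)."""
    return dna.replace("T", "U")


def reverse_complement(dna):
    """Complement every base, then reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


def gc_content(dna):
    """Percentage of bases that are G or C."""
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def translate(dna):
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def find_kmer(sequence, kmer):
    """Return 0-based start positions of every (possibly overlapping) match."""
    k = len(kmer)
    return [i for i in range(len(sequence) - k + 1) if sequence[i:i + k] == kmer]


if __name__ == "__main__":
    dna = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
    print(count_nucleotides(dna))
    print(reverse_complement(dna))
    print(translate(dna))
```

Finding proteins in a sequence would build on `translate` by scanning all six reading frames for start and stop codons, which this sketch leaves out.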
Genotype PCA Analysis
This project reads genotype data from a VCF file, processes it, performs PCA, and visualizes the results to analyze genetic variations. Github
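As a rough illustration of the PCA step, the sketch below works on a small genotype matrix with NumPy. It is not the project's code: it assumes the VCF has already been parsed into a samples-by-variants matrix of alternate-allele counts (0, 1, or 2), the function name `genotype_pca` is hypothetical, and the PCA is done through a plain SVD rather than any particular library routine.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """Project samples onto the top principal components of a genotype matrix.

    genotypes: samples x variants array of alternate-allele counts (0/1/2).
    Returns (scores, explained_variance_ratio) for the requested components.
    """
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)  # center each variant across samples

    # SVD of the centered matrix: rows of Vt are the principal axes.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)

    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]


if __name__ == "__main__":
    # Toy data: two groups of samples with opposite genotype patterns.
    toy = [
        [0, 0, 2, 2],
        [0, 1, 2, 2],
        [2, 2, 0, 0],
        [2, 1, 0, 0],
    ]
    scores, explained = genotype_pca(toy)
    print(scores)
    print(explained)
```

Plotting the first two columns of `scores` against each other gives the usual PCA scatter plot, where samples with similar genotypes cluster together.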
NGS Data Processing Pipeline
Developed a Next-Generation Sequencing (NGS) data processing pipeline, utilizing tools such as Cutadapt, BWA, FreeBayes, and Python for trimming, alignment, variant calling, and data visualization. Github
Gene Expression and Regulatory Topic Modeling
Data preprocessing, gene expression analysis, and topic modeling using non-negative matrix factorization (NMF) to identify regulatory mechanisms and potential therapeutic targets. Github
- Data visualization and manipulation, processing large datasets, genomic data analysis, predictive analytics, hypothesis testing (t-test, ANOVA, A/B testing)
- Python (pandas, NumPy, scikit-learn, TensorFlow), R, SQL, JMP, MATLAB, Cloud computing (Azure, Google Colab)
- Leadership, Time management, Critical thinking, Teamwork, Storytelling
- Protein Engineering, Molecular and cellular biology, Genomics
- 🔭 I’m currently working on a Peptide toolkit
- 🌱 I’m currently learning SQL
- 📫 How to reach me: [LinkedIn](https://www.linkedin.com/in/farnoosh-ostad/)
- 😄 Pronouns: She/Her