Just some examples to get a better understanding of the GAP (Genome After Party) project, under GSoC'25. 😊
Note: This is just for a better understanding of the possible approaches and in no way reflects the actual approaches that will be implemented.
- Nextflow - https://www.nextflow.io/
- nf-core - https://nf-co.re/docs/
- Sanger-tol - https://github.com/sanger-tol
- GAP database - https://gap.cog.sanger.ac.uk/
- Project Board - https://github.com/orgs/sanger-tol/projects/3/views/23
- All pipelines - https://pipelines.tol.sanger.ac.uk/genome_after_party
- Genome Assembly - https://github.com/sanger-tol/genomeassembly
- fasta_windows - https://github.com/tolkit/fasta_windows
- All file formats info - https://genome.ucsc.edu/FAQ/FAQformat.html
- Sanger Guidelines - https://pipelines.tol.sanger.ac.uk/docs/contributing/review_checklist
- Bedtools - https://bedtools.readthedocs.io/en/latest/
- bedToBigBed - https://www.encodeproject.org/software/bedToBigBed/
- Some genome assembly datasets I found online (NCBI) - https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000001405.40/
- View Done Datasets
Sequence Composition - https://github.com/sanger-tol/sequencecomposition
- GDA pipeline (for reference) - https://github.com/sanger-tol/gda
- Telomeric annotations:
  - treeval - https://github.com/sanger-tol/treeval
  - telomere finder - https://pipelines.tol.sanger.ac.uk/treeval/1.2.2/output#telo-finder
- TRASH - https://github.com/vlothec/TRASH
- EarlGrey - https://github.com/TobyBaril/EarlGrey
- Pantera - https://github.com/piosierra/pantera
- ModDotPlot - https://github.com/marbl/ModDotPlot
- GenMap - https://nf-co.re/modules/genmap_map/
Variant Calling - https://github.com/sanger-tol/variantcalling/tree/main
- DeepVariant - https://github.com/google/deepvariant
- PSMC - https://github.com/lh3/psmc
- RAiSD - https://github.com/alachins/raisd
- SnpEff - https://github.com/pcingola/SnpEff
- Info on SnpEff and SnpSift: https://pcingola.github.io/SnpEff/snpeff/introduction/
- BCFtools RoH calling - https://samtools.github.io/bcftools/howtos/roh-calling.html