therapist3003/GeneTrie: A DNA Database Explorer developed in C++. It is a console application that analyzes and stores DNA data from large databases to support fast retrieval.

KEY IDEA: A substring is a prefix of a suffix.

Reference: https://youtube.com/watch?v=llTjA5SeS7k&list=PL2mpR0RYFQsDFNyRsTNcWkFTHTkxWREeb&index=2 (15:00)
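The key idea can be illustrated with a small sketch. The function name `isPrefixOfSomeSuffix` is hypothetical and is not taken from GeneTrie.cpp; it simply tries every suffix of the text and checks whether the pattern is a prefix of it. This naive check is quadratic and is meant only to show why indexing all suffixes is enough to answer substring queries.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, for illustration only.
// Returns true if `pattern` occurs in `text`, by testing whether it is a
// prefix of at least one suffix of `text`.
bool isPrefixOfSomeSuffix(const std::string& text, const std::string& pattern) {
    for (std::size_t start = 0; start <= text.size(); ++start) {
        // Compare the first pattern.size() characters of the suffix that
        // begins at `start` against the pattern.
        if (text.compare(start, pattern.size(), pattern) == 0) {
            return true;
        }
    }
    return false;
}
```

For example, "TTAC" occurs in "GATTACA" because it is a prefix of the suffix "TTACA".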

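Why SUFFIX TRIE and why not TRIE (Prefix Tree)? A plain trie built over whole sequences can only answer prefix queries: it finds a query only if a stored sequence starts with it. DNA matching needs to find a query anywhere inside a sequence. A suffix trie stores every suffix of the sequence, so any substring can be matched by walking down from the root.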

This project builds a Suffix Trie from a large DNA dataset. Given any sufficiently long DNA sequence, it reports whether a match is found in O(n) time, where n is the length of the query sequence.
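A minimal sketch of this approach is shown below. It is not the code in GeneTrie.cpp: the class name `SuffixTrie`, its methods, and the restriction to the four bases A, C, G, T are assumptions made for illustration. The build is the naive one (insert every suffix, quadratic in the text length), so it suits short inputs only; the query walks one node per character, which gives the O(n) lookup.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical sketch of a suffix trie over the alphabet {A, C, G, T}.
class SuffixTrie {
public:
    // Inserts every suffix of `text`. A suffix is cut off at the first
    // character outside A/C/G/T, so substrings spanning such a character
    // are not indexed.
    void build(const std::string& text) {
        for (std::size_t start = 0; start < text.size(); ++start) {
            Node* node = &root_;
            for (std::size_t i = start; i < text.size(); ++i) {
                const int slot = indexOf(text[i]);
                if (slot < 0) break;
                std::unique_ptr<Node>& child =
                    node->children[static_cast<std::size_t>(slot)];
                if (!child) child.reset(new Node());
                node = child.get();
            }
        }
    }

    // Follows one edge per query character: O(n) for a query of length n.
    bool contains(const std::string& query) const {
        const Node* node = &root_;
        for (char base : query) {
            const int slot = indexOf(base);
            if (slot < 0) return false;
            const std::unique_ptr<Node>& child =
                node->children[static_cast<std::size_t>(slot)];
            if (!child) return false;
            node = child.get();
        }
        return true;
    }

private:
    struct Node {
        std::array<std::unique_ptr<Node>, 4> children;
    };

    // Maps a base to a child slot, or -1 for any other character.
    static int indexOf(char base) {
        switch (base) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default:  return -1;
        }
    }

    Node root_;
};
```

After `build("GATTACA")`, a call such as `contains("TTAC")` succeeds because "TTAC" is a prefix of the suffix "TTACA", while `contains("TAG")` fails at the third character.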

Additionally, a few smaller sequences are stored in separate text files, each together with the characteristics of the person who has that sequence (data created manually for testing purposes). Using these files, if a query sequence matches a person's sequence, the program outputs the characteristics that person may possess.

Can be extended (Ideas to improve the project):

Persons' characteristics can be stored at the leaf nodes, so that when a query is matched, the corresponding traits can be output.
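One way this extension might look is sketched below. Everything here is hypothetical (the class `TraitTrie`, its methods, and the trait strings in the usage note), and it deviates slightly from the idea as stated: the traits are attached to every node along each suffix path rather than only to the leaves, so that a query ending at an internal node can report traits without descending further. That choice trades extra memory for a simpler lookup.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <set>
#include <string>

// Hypothetical sketch: a suffix trie whose nodes remember the traits of
// every person whose sequence passes through them.
class TraitTrie {
public:
    // Inserts all suffixes of `sequence`, tagging each visited node with
    // the person's traits.
    void addPerson(const std::string& sequence, const std::string& traits) {
        for (std::size_t start = 0; start < sequence.size(); ++start) {
            Node* node = &root_;
            for (std::size_t i = start; i < sequence.size(); ++i) {
                std::unique_ptr<Node>& child = node->children[sequence[i]];
                if (!child) child.reset(new Node());
                node = child.get();
                node->traits.insert(traits);
            }
        }
    }

    // Returns the traits of every person whose sequence contains `query`.
    // An unmatched query (or an empty one) yields an empty set.
    std::set<std::string> lookup(const std::string& query) const {
        const Node* node = &root_;
        for (char base : query) {
            auto it = node->children.find(base);
            if (it == node->children.end()) return {};
            node = it->second.get();
        }
        return node->traits;
    }

private:
    struct Node {
        std::map<char, std::unique_ptr<Node>> children;
        std::set<std::string> traits;
    };

    Node root_;
};
```

With made-up data such as `addPerson("GATTACA", "brown eyes")` and `addPerson("CCTTAG", "curly hair")`, the call `lookup("TTA")` returns both trait strings, since "TTA" occurs in both sequences, while `lookup("GATT")` returns only "brown eyes".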

Resources :

