CGD-2145 - Provide additional CCDS transcripts #212

Open
wants to merge 3 commits into
base: master
from

@jPleyte jPleyte commented Jan 28, 2025

Somebody noticed a variant imported into CGD that doesn't have any CCDS transcripts even though it should.

It looks like there is something we can do to get more CCDS transcripts sent to CGD, but it requires that we lower our standards for data quality.

We originally decided that when multiple versions of the same refseq accession are found for a variant, we choose just one of them to be sent to CGD.

Example:
For 22-46929555-C-T,
Annovar comes up with NM_014246.3
And hgvs/uta comes up with NM_014246.4, NM_014246.3, and NM_014246.1.
So NM_014246.3 is the one with the most information and is what gets sent to CGD.
But for whatever reason NM_014246.3 doesn't map to a CCDS, while the other two (.1 and .4) do.
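To make the selection rule concrete, here is a minimal sketch of choosing one version per accession. It is not the actual tfx code: the `found` mapping, the `pick_best_versions` name, and the idea of scoring by number of annotation sources (with the higher version as tie-breaker) are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical input: versioned accession -> annotation sources that reported it.
found = {
    "NM_014246.3": {"annovar", "hgvs/uta"},
    "NM_014246.4": {"hgvs/uta"},
    "NM_014246.1": {"hgvs/uta"},
}


def pick_best_versions(found):
    """Keep one versioned accession per base accession.

    Assumption: "most information" is approximated by the number of
    sources that know the transcript; ties go to the higher version.
    """
    by_base = defaultdict(list)
    for versioned, sources in found.items():
        base, _, version = versioned.partition(".")
        by_base[base].append((len(sources), int(version), versioned))
    return {base: max(candidates)[2] for base, candidates in by_base.items()}


best = pick_best_versions(found)
print(best)
```

With the example above, NM_014246.3 is kept because it is the only version reported by both Annovar and hgvs/uta.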

The way things have been working is that the best transcript (NM_014246.3) was selected and then we checked to see if there was a corresponding CCDS for it. Since there isn't one, only NM_014246.3 makes it into the tfx vcf.

Now, NM_014246.3 is still the only refseq transcript sent to CGD. But tx_eff_hgvs has been changed so that it realises NM_014246.3 doesn't have a CCDS. It then looks at the other two versions of the accession, makes a copy of NM_014246.4, and changes the copy's accession to CCDS14076.1. So we end up with NM_014246.3 and CCDS14076.1 being sent to CGD.
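The new behaviour can be sketched roughly as follows. This is an illustration, not the code in tx_eff_hgvs: the `Transcript` class, its fields, the `REFSEQ_TO_CCDS` lookup, and `transcripts_to_send` are hypothetical names. It also assumes that NM_014246.1 maps to the same CCDS id as NM_014246.4, and that the highest mapped version is the one that gets copied.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Transcript:
    accession: str
    # Hypothetical flag: True when Annovar populated its fields for this transcript.
    annovar_fields_present: bool = False


# Hypothetical RefSeq -> CCDS lookup mirroring the example (.3 has no mapping).
# The id for NM_014246.1 is assumed; only the .4 mapping is shown in the example.
REFSEQ_TO_CCDS = {
    "NM_014246.1": "CCDS14076.1",
    "NM_014246.4": "CCDS14076.1",
}


def version_of(transcript):
    """Return the integer version suffix of a versioned accession."""
    return int(transcript.accession.rsplit(".", 1)[1])


def transcripts_to_send(best, other_versions, refseq_to_ccds):
    """Return the refseq transcript plus a CCDS copy, if one can be derived."""
    out = [best]
    ccds = refseq_to_ccds.get(best.accession)
    if ccds is not None:
        # Old behaviour (assumed): the best transcript itself maps to a CCDS.
        out.append(replace(best, accession=ccds))
        return out
    # New behaviour: fall back to another version of the same accession.
    mapped = [t for t in other_versions if t.accession in refseq_to_ccds]
    if mapped:
        donor = max(mapped, key=version_of)  # assumption: prefer the newest version
        out.append(replace(donor, accession=refseq_to_ccds[donor.accession]))
    return out


best = Transcript("NM_014246.3", annovar_fields_present=True)
others = [Transcript("NM_014246.4"), Transcript("NM_014246.1")]
result = transcripts_to_send(best, others, REFSEQ_TO_CCDS)
print([t.accession for t in result])
```

Note that the CCDS copy inherits whatever the donor transcript has, so in this sketch it carries no Annovar fields.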

The downside to this solution is that there is a reason we discarded NM_014246.4 in the first place. It isn't known to Annovar, so it is missing the fields that Annovar populates; it's incomplete. But if we lower our standards and accept it, then we can have more CCDS transcripts in CGD.

@jPleyte jPleyte self-assigned this Jan 28, 2025