Add Genes statistics file to snpeff output #6688

Open
paulzierep opened this issue Jan 22, 2025 · 0 comments

According to the SnpEff docs, SnpEff also writes a "Genes statistics" file:

The "Genes statistics" file path is by default the directory where SnpEff is executed and called snpEff_genes.txt. If you change the summary file name / path by either using -stats or csvStats command line, the "Genes statistics" file path will be the same directory as the summary file and the file name is the same "base name" plus a ".genes.txt".

This file gives stats per gene, with one row per transcript:

$ head snpEff_genes.txt
# The following table is formatted as tab separated values.
#GeneName   GeneId  TranscriptId    BioType variants_impact_HIGH    variants_impact_LOW variants_impact_MODERATE    variants_impact_MODIFIER    variants_effect_3_prime_UTR_variant variants_effect_5_prime_UTR_premature_start_codon_gain_variant  variants_effect_5_prime_UTR_variant variants_effect_downstream_gene_variant variants_effect_intron_variant  variants_effect_missense_variant    variants_effect_non_coding_exon_variant variants_effect_splice_acceptor_variant variants_effect_splice_donor_variant    variants_effect_splice_region_variant   variants_effect_start_lost  variants_effect_stop_gained variants_effect_stop_lost   variants_effect_synonymous_variant  variants_effect_upstream_gene_variant   bases_affected_DOWNSTREAM   total_score_DOWNSTREAM  length_DOWNSTREAM   bases_affected_EXON total_score_EXON    length_EXON bases_affected_INTRON   total_score_INTRON  length_INTRON   bases_affected_SPLICE_SITE_ACCEPTOR total_score_SPLICE_SITE_ACCEPTOR    length_SPLICE_SITE_ACCEPTOR bases_affected_SPLICE_SITE_DONOR    total_score_SPLICE_SITE_DONOR   length_SPLICE_SITE_DONOR    bases_affected_SPLICE_SITE_REGION   total_score_SPLICE_SITE_REGION  length_SPLICE_SITE_REGION   bases_affected_TRANSCRIPT   total_score_TRANSCRIPT  length_TRANSCRIPT   bases_affected_UPSTREAM total_score_UPSTREAM    length_UPSTREAM bases_affected_UTR_3_PRIME  total_score_UTR_3_PRIME length_UTR_3_PRIME  bases_affected_UTR_5_PRIME  total_score_UTR_5_PRIME length_UTR_5_PRIME
AC000029.1  ENSG00000221069 ENST00000408142 miRNA   0   0   0   2   0   0   0   2   0   0   0   0   0   0   0   0   5000    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
AC000068.5  ENSG00000185065 ENST00000431090 antisense   0   0   0   1   0   0   0   0   0   0   0   0   0   0   0   5000    0   0   0   0   0   0
AC000081.2  ENSG00000230194 ENST00000433141 processed_pseudogene    0   0   0   8   0   0   0   3   0   0   0   0   0   0   5000    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   5   0   5000    0   0
AC000089.3  ENSG00000235776 ENST00000424559 processed_pseudogene    0   0   0   1   0   0   0   0   0   0   0   0   0   0   5000    0   0   0   0   0   0
AC002472.1  ENSG00000269103 ENST00000547793 protein_coding  0   0   0   6   0   0   0   5   0   0   0   0   0   0   0   5000    0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   0   5000    0   0
AC002472.11 ENSG00000226872 ENST00000450652 antisense   0   0   0   13  0   0   0   5   2   0   0   0   0   0   0   5000    0   0   0   2   0   11199   0   0   0   0   0   0   0   0   0   0   0   0   6   0   5000    0   0
AC002472.13 ENSG00000187905 ENST00000342608 protein_coding  0   1   6   1   0   0   0   0   1   6   0   0   0   1   0   116 1   0   934 0   0   0   0   0   0   1   0   3   0   0   0   0   0   0   0   0   0   0   0
AC002472.13 ENSG00000187905 ENST00000442047 protein_coding  0   1   6   1   0   0   0   0   1   6   0   0   0   1   0   116 1   0   934 0   0   0   0   0   0   1   0   3   0   0   0   0   0   0   0   0   0   0   0
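
For downstream use, a minimal Python sketch of reading this file might look like the following. It assumes the layout shown above: a free-text `#` comment line, a `#GeneName ...` header line, then tab-separated rows whose first four columns are text and whose remaining columns are integer counts. The function name `read_genes_stats` is hypothetical, and the sample is truncated to the first eight columns of the first row above.

```python
import io


def read_genes_stats(handle):
    """Return a list of dicts, one per data row of a SnpEff genes statistics file."""
    header = None
    rows = []
    for raw in handle:
        line = raw.rstrip("\r\n")
        if not line:
            continue
        if line.startswith("#"):
            fields = line[1:].split("\t")
            # Only the '#GeneName ...' line is the column header;
            # any other '#' line is treated as a comment.
            if fields[0] == "GeneName":
                header = fields
            continue
        if header is None:
            raise ValueError("no '#GeneName' header line found before the data rows")
        row = {}
        for index, (name, value) in enumerate(zip(header, line.split("\t"))):
            # Assumption: columns after the first four hold integer counts.
            # Anything that is not a plain integer is kept as text.
            row[name] = int(value) if index >= 4 and value.isdigit() else value
        rows.append(row)
    return rows


# Sample truncated to the first eight columns of the output shown above.
sample = (
    "# The following table is formatted as tab separated values.\n"
    "#GeneName\tGeneId\tTranscriptId\tBioType\tvariants_impact_HIGH\t"
    "variants_impact_LOW\tvariants_impact_MODERATE\tvariants_impact_MODIFIER\n"
    "AC000029.1\tENSG00000221069\tENST00000408142\tmiRNA\t0\t0\t0\t2\n"
)
rows = read_genes_stats(io.StringIO(sample))
print(rows[0]["GeneName"], rows[0]["variants_impact_MODIFIER"])
```

Rows are keyed by column name, so rows shorter than the header (as in some lines above) simply lack the trailing keys.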

This file would be useful for a correlation analysis we are planning in a project.
Currently, the tool only collects the HTML summary stats as an output. It would be good to add an option to collect this file as well.
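
To locate the file, the naming rule quoted from the docs could be sketched roughly as below. This is an assumption-laden illustration, not SnpEff's own code: `genes_stats_path` is a hypothetical helper, and it assumes that "base name" means the summary file name without its final extension. That should be checked against a real run before relying on it.

```python
from pathlib import Path


def genes_stats_path(summary_path=None):
    """Guess where SnpEff writes the genes statistics file.

    summary_path is the value passed to -stats or -csvStats, or None
    when the default summary location is used.
    """
    if summary_path is None:
        # Default per the docs: snpEff_genes.txt in the directory where SnpEff runs.
        return Path("snpEff_genes.txt")
    summary = Path(summary_path)
    # Assumption: "base name" is the file name minus its last extension.
    return summary.with_name(summary.stem + ".genes.txt")


print(genes_stats_path())
print(genes_stats_path("results/ex1.html"))
```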
