
Problem with protein/transcript identifiers #2

@TDDB-limagrain

Hi Hesham,

this is not a real problem but solving it would make life easier :-D
I successfully ran VCF2PROT v0.1.4, but I had to correct the transcript identifiers in the VCF file as well as in the reference FASTA file.

My bcftools-annotated VCF file has identifiers of the form Solyc02g062560.3|Solyc02g062560.3.1 in the BCSQ fields, and the matching header in my protein file is Solyc02g062560.3.1.
It seems that the . in the sequence name causes a problem: with these identifiers, the output .fasta file was empty. After removing the end of the sequence name (changing it to Solyc02g062560.3|Solyc02g062560 in the VCF and to Solyc02g062560 in the reference FASTA), vcf2prot finally succeeded in writing the proper corrected sequences.
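
For anyone hitting the same thing, the renaming described above can be scripted. The following is only a minimal sketch, not part of vcf2prot: the helper names are hypothetical, and it assumes that each BCSQ entry follows the `consequence|gene|transcript|...` layout written by bcftools csq, so that the transcript ID is the third `|`-separated field. It keeps only the part of the transcript ID before the first `.` and leaves the gene field untouched.

```python
def strip_transcript_suffix(transcript_id: str) -> str:
    """Keep only the part before the first '.' (hypothetical helper)."""
    return transcript_id.split(".", 1)[0]


def fix_fasta_header(line: str) -> str:
    """Rewrite the ID (first token) of a FASTA header; other lines pass through."""
    if not line.startswith(">"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    parts = line[1:].rstrip("\n").split(None, 1)
    if not parts:
        return line
    parts[0] = strip_transcript_suffix(parts[0])
    return ">" + " ".join(parts) + newline


def fix_bcsq(info: str) -> str:
    """Rewrite the transcript field of every BCSQ entry in a VCF INFO string.

    Assumption: the transcript ID is the third '|'-separated field.
    Entries with fewer than three fields are left unchanged.
    """
    fixed = []
    for item in info.split(";"):
        if item.startswith("BCSQ="):
            entries = []
            for entry in item[len("BCSQ="):].split(","):
                fields = entry.split("|")
                if len(fields) > 2:
                    fields[2] = strip_transcript_suffix(fields[2])
                entries.append("|".join(fields))
            item = "BCSQ=" + ",".join(entries)
        fixed.append(item)
    return ";".join(fixed)
```

Note that truncating at the first `.` maps every isoform of a gene (for example `.3.1` and `.3.2`) to the same name, so this only works safely when there is one transcript per gene in the reference FASTA.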

Hope this helps for the future!

Best regards,

Thomas
