Incorrect numbering of human TRAV30 #110

@nh3

Hi there,

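Thank you for developing this excellent tool. However, I noticed that the IMGT numbering of human TRAV30 appears to be incorrect: numbering starts one residue before the IMGT-annotated start of the V-REGION, both with and without the leader peptide in the input: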

$ conda list | grep anarci

anarci                    2026.2.13.2              pypi_0    pypi

$ cat TRAV_LV_aa.fa

>AE000660|TRAV30*01|Homo sapiens|F|L-PART1+V-EXON|172357..172408+172632..172916|337 nt|1| | | |112 AA|112+0=112| | |
METLLKVLSGTLLWQLTWVRSQQPVQSPQAVILREGEDAVINCSSSKALYSVHWYRQKHG
EAPVFLMILLKGGEQKGHEKISASFNEKKQQSSLYLTASQLSYSGTYFCGTE

$ ANARCI -i TRAV_LV_aa.fa --scheme imgt --assign_germline | head -n 20

# AE000660|TRAV30*01|Homo sapiens|F|L-PART1+V-EXON|172357..172408+172632..172916|337 nt|1| | | |112 AA|112+0=112| | |
# ANARCI numbered
# Domain 1 of 1
# Most significant HMM hit
#|species|chain_type|e-value|score|seqstart_index|seqend_index|
#|human|A|2.4e-28|89.6|20|111|
# Most sequence-identical germlines
#|species|v_gene|v_identity|j_gene|j_identity|
#|human|TRAV30*01|0.97|TRAJ7*01|0.00|
# Scheme = imgt
A 1       S               <--- numbering started 1 aa before the IMGT annotated start
A 2       Q
A 3       Q
A 4       P
A 5       V
A 6       Q
A 7       S
A 8       P
A 9       Q
A 10      A

$ cat TRAV_V_gapped_aa.fa

>AE000660|TRAV30*01|Homo sapiens|F|V-REGION|172643..172916|274 nt|1| | | |91 AA|91+16=107| | |
QQPV.QSPQAVILREGEDAVINCSSSKAL.......YSVHWYRQKHGEAPVFLMILLKG.
..GEQKGH.....EKISASFNEKKQQSSLYLTASQLSYSGTYFCGTE

$ ANARCI -i $(seqtk seq -l 0 TRAV_V_gapped_aa.fa | tail -1 | sed 's/\.//g') --scheme imgt --assign_germline | head -n 20

# Input sequence
# ANARCI numbered
# Domain 1 of 1
# Most significant HMM hit
#|species|chain_type|e-value|score|seqstart_index|seqend_index|
#|human|A|1.6e-28|90.2|1|90|
# Most sequence-identical germlines
#|species|v_gene|v_identity|j_gene|j_identity|
#|human|TRAV30*01|0.96|TRAJ7*01|0.00|
# Scheme = imgt
A 1       -               <--- numbering still started 1 aa before the IMGT annotated start
A 2       -
A 3       Q
A 4       P
A 5       V
A 6       Q
A 7       S
A 8       P
A 9       Q
A 10      A

The sequence and numbering of TRAV30 are correct in germlines.py in the installed package:

"TRAV30*01": "QQPV-QSPQAVILREGEDAVINCSSSKAL-------YSVHWYRQKHGEAPVFLMILLKG---GEQKGH-----EKISASFNEKKQQSSLYLTASQLSYSGTYFCGTE---------------------",
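To make the discrepancy explicit, here is a minimal Python sketch (not part of ANARCI) that derives the expected IMGT positions from the gapped germline string above and compares them with the first ten positions reported in the first run. It assumes that column i of the gapped string corresponds to IMGT position i; `imgt_positions` and `observed` are illustrative names, and `observed` is transcribed by hand from the output above.

```python
# Gapped TRAV30*01 string as quoted above from germlines.py.
# Assumption: column i (1-based) of this string is IMGT position i.
GERMLINE = (
    "QQPV-QSPQAVILREGEDAVINCSSSKAL-------YSVHWYRQKHGEAPVFLMILLKG---"
    "GEQKGH-----EKISASFNEKKQQSSLYLTASQLSYSGTYFCGTE---------------------"
)


def imgt_positions(gapped, gap_chars="-."):
    """Return (position, residue) pairs for every non-gap column."""
    return [(i, aa) for i, aa in enumerate(gapped, start=1) if aa not in gap_chars]


# Expected residue at each IMGT position according to the germline string.
expected = dict(imgt_positions(GERMLINE))

# Positions 1-10 as reported by ANARCI for the leader-containing input.
observed = {1: "S", 2: "Q", 3: "Q", 4: "P", 5: "V",
            6: "Q", 7: "S", 8: "P", 9: "Q", 10: "A"}

# Collect (position, expected residue, observed residue) where they differ;
# "-" marks a position that is a gap in the germline string.
mismatches = [
    (pos, expected.get(pos, "-"), aa)
    for pos, aa in sorted(observed.items())
    if expected.get(pos, "-") != aa
]

for pos, exp_aa, obs_aa in mismatches:
    print(f"IMGT {pos}: expected {exp_aa}, observed {obs_aa}")
```

Under that assumption, the disagreement is confined to positions 1 to 5: the germline string leaves position 5 as a gap, whereas ANARCI fills it and shifts the preceding residues by one. From position 6 onward the two agree.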

Thank you in advance for looking into this.

Best,
Ni
