pdbfixer is mislabeling built residues with negative res IDs #175

Open
nitroamos opened this issue Oct 3, 2018 · 5 comments

@nitroamos

In this line

        newResidue = chain.topology.addResidue(residueName, chain, "%d" % ((firstIndex+i)%10000))

PDBFixer applies a modulo of 10000 to every residue number it builds, so negative numbers wrap around: a residue that should be numbered -4 ends up as 9996.

One fix would look like this:

        newResId = firstIndex+i
        # Only wrap when the ID no longer fits in four characters.
        if len(str(newResId)) >= 5:
            newResId = newResId%10000
        newResidue = chain.topology.addResidue(residueName, chain, "%d" % newResId)

which is closer to what happens in OpenMM.

Even simpler would be to drop the modulo from PDBFixer altogether, since OpenMM already does it.
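To make the difference concrete, here is a minimal standalone sketch comparing the unconditional modulo with the conditional one proposed above. The function names `wrap_always` and `wrap_if_too_wide` are hypothetical and are not part of PDBFixer or OpenMM; the four-character limit is assumed from the width of the residue number field in PDB files.

```python
def wrap_always(res_id: int) -> str:
    """Unconditional modulo, as in the line quoted at the top of this issue."""
    # Python's % returns a non-negative result for a positive modulus,
    # which is why negative IDs wrap around to large positive ones.
    return "%d" % (res_id % 10000)


def wrap_if_too_wide(res_id: int, width: int = 4) -> str:
    """Hypothetical helper: wrap only when the ID does not fit in `width` characters."""
    text = "%d" % res_id
    if len(text) > width:
        text = "%d" % (res_id % 10 ** width)
    return text


if __name__ == "__main__":
    for res_id in (-19, -4, 42, 9999, 10001):
        print(res_id, wrap_always(res_id), wrap_if_too_wide(res_id))
```

With the conditional version, IDs from -999 to 9999 pass through unchanged, and only values that are too wide for the field are wrapped.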

@peastman
Member

peastman commented Oct 3, 2018

Where did you find a PDB file with negative residue numbers? Residue numbers are supposed to be the index within the SEQRES section, which by definition can never be negative.

@nitroamos
Author

In my test case, it's coming from a REMARK 465 section, which is integrated with PDBFixer as outlined here. For example, here's a random one Google found for me; take a look here:

REMARK 465                                                                      
REMARK 465 MISSING RESIDUES                                                     
REMARK 465 THE FOLLOWING RESIDUES WERE NOT LOCATED IN THE                       
REMARK 465 EXPERIMENT. (RES=RESIDUE NAME; C=CHAIN IDENTIFIER;                   
REMARK 465 SSSEQ=SEQUENCE NUMBER; I=INSERTION CODE.)                            
REMARK 465     RES C SSSEQI                                                     
REMARK 465     MET A   -19                                                      
REMARK 465     GLY A   -18                                                      
REMARK 465     SER A   -17                                                      
REMARK 465     SER A   -16
...

I think the scientific origin of this is when people want to number their residues based on a pre-existing sequence alignment.
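For anyone who wants to check whether a file has this kind of numbering, the following is a rough sketch that pulls the residue name, chain, and sequence number out of REMARK 465 records like the ones quoted above. The function name `parse_remark_465` is hypothetical, and this is not how PDBFixer itself reads the section. It assumes whitespace-separated data rows with exactly three fields after `REMARK 465`, so it skips rows that carry an insertion code or a model number.

```python
def parse_remark_465(lines):
    """Return (residue name, chain ID, sequence number) for each data row."""
    missing = []
    for line in lines:
        if not line.startswith("REMARK 465"):
            continue
        fields = line.split()
        # Data rows look like: REMARK 465 MET A -19
        if len(fields) != 5:
            continue
        res_name, chain_id, seq = fields[2:]
        try:
            number = int(seq)
        except ValueError:
            # Skips the "RES C SSSEQI" column header and any row whose
            # sequence number has an insertion code attached.
            continue
        missing.append((res_name, chain_id, number))
    return missing


if __name__ == "__main__":
    sample = [
        "REMARK 465 MISSING RESIDUES",
        "REMARK 465     RES C SSSEQI",
        "REMARK 465     MET A   -19",
        "REMARK 465     GLY A   -18",
    ]
    negative = [row for row in parse_remark_465(sample) if row[2] < 0]
    print(negative)
```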

@peastman
Member

peastman commented Oct 4, 2018

Ok, that makes sense. Your solution looks fine. Note that when PDBFixer calls PDBFile.writeFile(), it specifies keepIds=True. That's why it needs to do the modulo itself instead of relying on PDBFile to do it.

@nitroamos nitroamos reopened this Oct 5, 2018
@nitroamos
Author

oops, didn't mean to close it. 😄

@peastman
Member

peastman commented Oct 5, 2018

Good point. So PDBFixer really doesn't need to do this.
