Skip to content

Conversation

@peastman
Copy link
Member

I think I have everything working correctly. @zhang-ivy @tristanic if you have time to look this over, that would be great. I have a test case that computes the energy of a simple glycopeptide, but I don't have anything to compare it to. With enough effort I can probably figure out how to compute a reference value with Amber, but if someone who is more of an Amber user than me can do it, that would also be very much appreciated!

I had to make a small change to the API for template matchers. It was already marked as experimental and subject to change, so I think that's acceptable, but be aware that this breaks the existing GLYCAM version in openmmforcefields. That version is already broken in other ways, though, such as the incorrect <ExternalBond> tags for terminal residues.

Fixes #3175.

@tristanic
Copy link
Contributor

I'm not an AMBER user myself, so I can't really help with the reference values. All looks reasonable to me as far as it goes - but I do think it would be really valuable to also support using GLYCAM for models with sugar residues named according to wwPDB standards. That of course gets a bit more complicated, because you'd need to incorporate its naming rules in your code. That's how I'm doing it in ISOLDE: it requires the model to strictly adhere to wwPDB naming conventions for sugars, and assigns the templates for them and their anchoring amino acid here:
https://github.com/tristanic/isolde/blob/d894e9c102795ed5b149596f81311752347ab853/isolde/src/openmm/openmm_interface.py#L3106-L3127
... using the lookup code here:
https://github.com/tristanic/isolde/blob/master/isolde/src/openmm/amberff/glycam.py

That's been working for N- and O-linked glycans for a few years now. However it doesn't yet work for free sugars or oligosaccharides - that gets a bit more complicated because models from the wwPDB incorporate the reducing-terminal OH into the sugar residue where GLYCAM appends it as a separate 2-atom residue. Same idea goes for sulfated and phosphorylated sugars - I've left those in the too-hard basket for the time being.

@peastman
Copy link
Member Author

I haven't tested it, but here is what I think will happen if you give it a file using PDB names. The custom template matcher won't find a match, since the names will be different. It will then use the standard method of comparing topologies. If there's a single unique match, it will work, but if there are two templates for identical topologies but different chiralities, they'll both match. It will then compare the templates to see if they assign identical parameters. If they do, it will work. If they assign different parameters, it will throw an exception.

@zhang-ivy
Copy link
Contributor

zhang-ivy commented Oct 28, 2021

I've opened a PR to fix the broken GLYCAM_06j-1.xml here: openmm/openmmforcefields#180
Just need some help from @peastman in finishing it up.

@zhang-ivy
Copy link
Contributor

zhang-ivy commented Oct 28, 2021

if someone who is more of an Amber user than me can do it, that would also be very much appreciated!

Here are the reference values I get from parametrizing with tleap:

[('HarmonicBondForce', 12.799392157853614), ('HarmonicAngleForce', 19.64880309443367), ('PeriodicTorsionForce', 116.90037674510984), ('NonbondedForce', 620.4976939739155), ('CMMotionRemover', 0.0)]
Total: 769.8462659713126 kT

Code:

import parmed
from copy import deepcopy
import openmm
from openmm import unit as u

# set beta
temperature = 300.0 * u.kelvin
kB = u.BOLTZMANN_CONSTANT_kB * u.AVOGADRO_CONSTANT_NA
kT = kB * temperature
beta = 1.0/kT

prmtop = "/home/zhangi/choderalab/openmmforcefields/amber/files/glycam/glycoprotein.prmtop"
inpcrd = "/home/zhangi/choderalab/openmmforcefields/amber/files/glycam/glycoprotein.inpcrd"

# Get AMBER system
parm_amber = parmed.load_file(prmtop, inpcrd)
system_amber = parm_amber.createSystem()

def compute_potential_components(system, positions, beta=beta):
    # Note: this is copied from perses
    # TODO: consider moving this outside of assert_energies_glyco_protein()
    system = deepcopy(system)
    for index in range(system.getNumForces()):
        force = system.getForce(index)
        force.setForceGroup(index) 
    integrator = openmm.VerletIntegrator(1.0*u.femtosecond)
    platform = openmm.Platform.getPlatformByName('Reference')
    context = openmm.Context(system, integrator, platform)
    context.setPositions(positions)
    energy_components = list()
    for index in range(system.getNumForces()):
        force = system.getForce(index)
        forcename = force.__class__.__name__
        groups = 1<<index
        potential = beta * context.getState(getEnergy=True, groups=groups).getPotentialEnergy()
        energy_components.append((forcename, potential))
    del context, integrator
    return energy_components

amber_energies = compute_potential_components(system_amber, parm_amber.positions)
print(amber_energies)

leap_files_for_peter.zip

@peastman
Copy link
Member Author

Thanks!

@zhang-ivy
Copy link
Contributor

How come the energies you hard coded in the test are different than the ones in my last message?

@peastman
Copy link
Member Author

The positions in your inpcrd file are slightly different from the ones in the PDB file.

@peastman
Copy link
Member Author

All tests are passing. Is this ok to merge?

@zhang-ivy
Copy link
Contributor

Should we wait until openmm/openmmforcefields#180 is merged so we can include that version of GLYCAM_06j-1.xml?

@peastman
Copy link
Member Author

It looks to me like the only difference between them is that the <ExternalBond> tags have been reordered.

@zhang-ivy
Copy link
Contributor

zhang-ivy commented Oct 28, 2021

If you search for OME, OLP, HYP, and any terminal residue (e.g. `NNLN'), you can see that the external bonds have been corrected as well.

@peastman
Copy link
Member Author

Right. I meant, compared to the one in this PR. Can you make it produce the <ExternalBond> tags in the same order as before, so we can run a diff and see if there's any other difference between them?

@zhang-ivy
Copy link
Contributor

zhang-ivy commented Oct 29, 2021

I've added a one line hack to my the parmed code to maintain atom ordering to make the diff easier to read. Please take a look at the ffxml in openmm/openmmforcefields#180 again -- there should be a number of differences (for OME, ROH, TBT, OLP, etc).

@peastman
Copy link
Member Author

Thanks! That gives a much simpler diff. Here are the changes compared to the one currently in this PR.

OME: eliminate external bonds to H1 and CH3
ROH: eliminate external bond to HO1
SO3: eliminate external bonds to O1, O2, and O3
TBT: eliminate external bond to C2
HYP and variants: eliminate external bonds to CG, OD1, and HD1
OLP and variants: external bond is to OD1 instead of CG

I'm pretty sure those changes are correct, so I'll update this PR with the new version. Just to confirm though, HYP really isn't supposed to have any external bonds other than the standard N and C ones? It isn't an amino acid that can have a glycan attached to it?

@tristanic
Copy link
Contributor

tristanic commented Oct 29, 2021 via email

@peastman
Copy link
Member Author

Thanks!

@zhang-ivy
Copy link
Contributor

Yes, agreed with Tristan -- I'm not sure why they included HYP and related variants in the GLYCAM force field.

I've just reviewed everything else in the PR and it looks good to me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

addHydrogens() fails to add hydrogens to topology with glycam residues

3 participants