Skip to content

Conversation

@aizvorski
Copy link
Contributor

@aizvorski aizvorski commented Jun 22, 2022

Adding OPC and OPC3 water models, references:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4226301/

For OPC, any parameters not directly copied from the paper were calculated as follows:

H-O-H angle: deg2rad(103.6)
O charge: -2*0.6791
Virtual site weight 2 and 3: 0.1594/(2*(cos(deg2rad(103.6)/2) * 0.8724))
Virtual site weight 1: 1 - 2*weight2

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4991989/

For OPC3:

H-O-H angle: deg2rad(109.47)
O charge: -2*0.447585

In terms of xml file format, opc.xml was initially cloned from tip4pfb.xml, and opc3.xml from tip3p.xml.

The models are intended to be rigid, so the papers don't provide bond and angle force constants. The constants used here (only for convenience so flexible water simulations don't blow up) are also copied from TIP4FB / TIP3P.

@aizvorski aizvorski marked this pull request as ready for review June 22, 2022 05:20
@peastman
Copy link
Member

Thanks! Is it possible to add a test case that verifies these models? Something like this:

  • Create a system composed of two water molecules.
  • Set the positions to values that ought to match the expected geometry.
  • Call context.applyConstraints() and make sure the positions don't move significantly.
  • Evaluate the potential energy and verify that it matches the result from some other program.

We also should add them to the list of water models in the manual. You'll find it in this file.

@aizvorski
Copy link
Contributor Author

aizvorski commented Jun 24, 2022

@peastman I was thinking the density of an equilibrated box would be a pretty good test. I don't have a reference for the equilibrium geometry for a pair of molecules.

Is there a specific way to compare energy amber <-> openmm?

Also, how were the equilibrated water boxes eg openmm/app/data/tip3p.pdb made? Would OPC and OPC3 need their own?

@peastman
Copy link
Member

I suggest keeping it to just an energy comparison for a minimal system. Simulating a water box long enough to converge the density will take a while, and we want unit tests to be as fast as possible (preferably much less than a second for each one). And the result will be stochastic, so it might occasionally fail. Computing the energy of two molecules is fast and deterministic.

Is there a specific way to compare energy amber <-> openmm?

You can just compute the energy with sander and hardcode the result in the test case (with a comment saying where the value came from).

@aizvorski
Copy link
Contributor Author

aizvorski commented Jun 25, 2022

@peastman I added an OPC water box and set up a unit test, the energies do match pretty closely.

First, using ambertools-22.0-py38h6177452_1 installed from conda-forge, in tleap

source leaprc.water.opc
loadoff opcbox.off
saveamberparm OPCBOX opcbox.prmtop opcbox.inpcrd
savepdb OPCBOX opcbox.pdb

Then, get energy in sander (not super familiar with running sander so please let me know if there is a better way to just get energy without minimizing or running dynamics):

mdin

Test
 &cntrl
  imin=1,
  maxcyc=0,
  cut=7.0,
 /

sander command and results

sander -i mdin -p opcbox.prmtop -c opcbox.inpcrd
...
   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -2.6476E+03     2.0113E+01     9.0801E+01     O         609

 BOND    =        0.0587  ANGLE   =        0.0000  DIHED      =        0.0000
 VDWAALS =      362.2542  EEL     =    -3009.9362  HBOND      =        0.0000
 1-4 VDW =        0.0000  1-4 EEL =        0.0000  RESTRAINT  =        0.0000

Energy calculated in openmm with same PME cutoff using opcbox.pdb and opc.xml = -2647.2222697324237 kcal/mol, after applyConstraints(1e-12) -2647.441600693312 kcal/mol, vs sander value -2647.6233 kcal/mol.

Let me know if that looks okay and I'll do the same for OPC3.

P.S. I ran the test in a notebook and then copied it to a unittest.TestCase. Could you please let me know how openmm unit tests are run usually?

@aizvorski
Copy link
Contributor Author

aizvorski commented Jun 25, 2022

@peastman I serialized a system created from the opcbox.prmtop and opcbox.inpcrd, and kept modifying the opc.xml file until a system serialized from opcbox.pdb and opc.xml matched as closely as possible. However - that requires values which are slightly different from the ones described in ambertools's frcmod and lib files

For example:

ambertools-22.0-py38h6177452_1/dat/leap/lib/opcbox.off

 "O" "OW" 0 1 131072 1 8 0.0
 "H1" "HW" 0 1 131072 2 1 0.679142
 "H2" "HW" 0 1 131072 3 1 0.679142
 "EPW" "EP" 0 1 131072 4 0 -1.358284

opcbox.xml (created from opcbox.prmtop)

                                <Particle eps=".8903586017365637" q="0" sig=".316655208182147"/>
                                <Particle eps="0" q=".679142001832919" sig=".08908987181403394"/>
                                <Particle eps="0" q=".679142001832919" sig=".08908987181403394"/>
                                <Particle eps="0" q="-1.3582839981780566" sig=".08908987181403394"/>

Is a charge difference of 1.8e-09 something to worry about? Where does the difference come from? Which value should we use?

There may be similar very slight differences in geometry - bond lengths and constraints are exact, but angles formed by the OW-HW and HW-HW triangle are not exactly the same as the specified HW-OW-HW angle of 103.6 degrees, rather the angle is 103.59892973636502 degrees. For this I matched the exact HW-HW length not the angle.

Conversion code:

import openmm as mm
from openmm import app
from openmm import unit

prmtop = app.AmberPrmtopFile('opcbox.prmtop')

system = prmtop.createSystem(
    nonbondedMethod=app.PME,
    nonbondedCutoff=0.7*unit.nanometer,
    constraints=app.HBonds,
    rigidWater=True
)

with open('opcbox.xml', 'w') as fh:
    fh.write(mm.XmlSerializer.serialize(system))

@jchodera
Copy link
Member

OpenMM uses a different set of physical constants than Amber's sander: OpenMM uses CODATA 2018, while Amber sander uses some historical choices [the square root of 1/(4*pi*epsilon0) in kcal/mol/A used by sander is given here.

As a result, I'd expect to see some differences on the order of 3e-5 in energies and forces.

@aizvorski
Copy link
Contributor Author

aizvorski commented Jun 26, 2022

OpenMM uses a different set of physical constants than Amber's sander:

@jchodera Thanks! Which do you think is better:

(a) have opc.xml use the same numerical values as leap/lib/opcbox.off and leap/parm/frcmod.opc, but have a system created from tleap -> prmtop -> openmm be different from a system created from pdb -> openmm, or
(b) have opc.xml use different numbers, but have a system created tleap -> prmtop -> openmm be identical to pdb -> openmm?

I'm still not sure where the difference in numerical values comes from now, even though the unit definitions are different. Do we essentially convert units when reading prmtop files now? Ie, does reading a prmtop cause 1 prmtop-e charge to be converted to 1.000001 opemm-e charges or something similar?

@jchodera
Copy link
Member

Which do you think is better:

@peastman may have a more succinct statement of our philosophy on converting parameters developed with other packages, but I believe out approach to date has been to convert the parameters directly, even if the difference in physical constants causes a difference in potential energies between packages. More modern approaches for converting parameterized systems like openff-interchange from @mattwthompson may have a different philosophy here.

does reading a prmtop cause 1 prmtop-e charge to be converted to 1.000001 opemm-e charges or something similar?

It's important to remember that Amber prmtop files store the square root of the atomic charge (in units of |e|^{1/2}). If you go from off -> prmtop -> xml -> OpenMM, it will compute the square root, print it to a prmtop file truncating the decimal places (fortran%FORMAT(5E16.8)), read it again, and then square it. This will certainly introduce differences in the exact numerical values.

@mattwthompson
Copy link
Contributor

What currently happens in in OpenMM's infrastructure when Amber infrastructure reports a value using its unit definitions? I would be floored if this issue hasn't been run into years ago; as far as I can tell this isn't the first set of Amber parameters that are being ported into OpenMM code.

Let me add that it's a difficult problem and I wouldn't at all blame people for sweeping under the rug in the past.

@peastman
Copy link
Member

Different values for fundamental constants is a problem through the whole community. I believe CHARMM uses yet another set of values which are different from either OpenMM or Amber. Still, when people convert force fields they generally use the exact parameter values. This means you shouldn't expect energies from different codes to match to more than about five significant digits. In practice that doesn't generally matter, since other uncertainties are much larger.

@aizvorski
Copy link
Contributor Author

aizvorski commented Jun 26, 2022

@peastman Ready for review!

  • Both OPC and OPC3 are done
  • Exact parameter values for bond lengths and charges (from frcmod.opc, opcbox.off and frcmod.opc3, opc3box.off)
  • Derivation for sigma, epsilon, angle and virtual site weights from exact values is shown in comments in the xml files
  • Added opcbox.pdb and opc3box.pdb
  • Added unit tests for energies, the values for the water boxes match those calculated with sander to 0.02% or better
  • Updated docs

@peastman
Copy link
Member

Aside from the minor comments above about the test case, this looks great.

@peastman peastman merged commit 583471a into openmm:master Jun 28, 2022
@peastman
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants