Add support for espaloma small molecule parameters #182

jchodera · 2021-10-29T02:27:04Z

This PR adds support for espaloma small molecule parameters.

Addresses #181

Refactor SMIRNOFFResidueTemplateGenerator to separate out OpenMM System conversion into a mixin
Add EspalomaResidueTemplateGenerator
Add tests for EspalomaResidueTemplateGenerator
Add espaloma tests for SystemGenerator
Update revision history
Update docs in README.md
Ensure tests are passing
Update conda recipe to include validators (espaloma will be optional)

…re it?

jchodera · 2021-10-29T16:24:52Z

The current test failure is because I don't have a way to tell espaloma to use user-provided charges yet:

___________ TestEspalomaTemplateGenerator.test_charge_from_molecules ___________

self = <test_template_generators.TestEspalomaTemplateGenerator testMethod=test_charge_from_molecules>

    def test_charge_from_molecules(self):
        """Test that user-specified partial charges are used if requested"""
        # Create a generator that does not know about any molecules
        generator = self.TEMPLATE_GENERATOR()
        # Create a ForceField
        from openmm.app import ForceField
        forcefield = ForceField()
        # Register the template generator
        forcefield.registerTemplateGenerator(generator.generator)
    
        # Check that parameterizing a molecule using user-provided charges produces expected charges
        import numpy as np
        from openmm import unit
        molecule = self.molecules[0]
        charges = np.random.random([molecule.n_particles])
        total_charge = molecule.total_charge
        if type(total_charge) is unit.Quantity:
            # Handle openforcefield >= 0.7.0
            total_charge /= unit.elementary_charge
        charges += (total_charge - charges.sum()) / molecule.n_particles
        molecule.partial_charges = unit.Quantity(charges, unit.elementary_charge)
        assert (molecule.partial_charges is not None) and not np.all(molecule.partial_charges / unit.elementary_charge == 0)
        # Add the molecule
        generator.add_molecules(molecule)
        # Create the System
        from openmm.app import NoCutoff
        openmm_topology = molecule.to_topology().to_openmm()
        system = forcefield.createSystem(openmm_topology, nonbondedMethod=NoCutoff)
>       assert self.charges_are_equal(system, molecule)
E       AssertionError: assert False
E        +  where False = <bound method TestGAFFTemplateGenerator.charges_are_equal of <test_template_generators.TestEspalomaTemplateGenerator testMethod=test_charge_from_molecules>>(<openmm.openmm.System; proxy of <Swig Object of type 'OpenMM::System *' at 0x7efcfff1d450> >, Molecule with name 'DrugBank_2791' and SMILES '[H]c1c(c(c(c(c1C([H])([H])[H])[H])[H])S(=O)(=O)O[H])[H]')
E        +    where <bound method TestGAFFTemplateGenerator.charges_are_equal of <test_template_generators.TestEspalomaTemplateGenerator testMethod=test_charge_from_molecules>> = <test_template_generators.TestEspalomaTemplateGenerator testMethod=test_charge_from_molecules>.charges_are_equal

openmmforcefields/tests/test_template_generators.py:248: AssertionError

@yuanqing-wang: Is there a way to optionally tell espaloma to use charges from the Molecule object?
And how do I select whether it should use the GCN charges or OpenFF AM1-BCC charges?

jchodera · 2021-10-29T22:47:16Z

This is waiting on a new release following the merger of choderalab/espaloma#95, where we will be able to add support for retaining the charges provided in Molecule and selecting between AM1-BCC or espaloma charges.

Hopefully, we can also get espaloma onto conda-forge.

jchodera · 2021-11-14T15:30:39Z

The espaloma charge API in openmm_system_from_graph() has been updated in choderalab/espaloma#95:

    charge_method : str, optional, default='nn'
        Method to use for assigning partial charges:
        'nn' : Assign partial charges from the espaloma graph net model
        'am1-bcc' : Allow the OpenFF toolkit to assign AM1-BCC charges using default backend
        'gasteiger' : Assign Gasteiger partial charges (not recommended)
        'from-molecule' : Use partial charges provided in the original `Molecule` object

I'll update this PR to test!

…om nn or molecule

jchodera · 2021-11-15T21:36:32Z

@yuanqing-wang: It looks like espaloma is failing (perhaps due to the charge model?) here. Can you take a look at the exception and stack trace?

yuanqing-wang · 2021-11-16T01:29:26Z

@yuanqing-wang: It looks like espaloma is failing (perhaps due to the charge model?) here. Can you take a look at the exception and stack trace?

It looks like it's because your testing molecule doesn't have an improper. Let me quickly make a PR indicating that if you don't have improper we're gonna let it go.

yuanqing-wang · 2021-11-18T21:30:24Z

@jchodera could you rerun the tests? it should work now

jchodera · 2021-11-19T00:07:07Z

Tests have been re-run! It seems I'm running into a large discrepancy between the espaloma -> ffxml -> System energy and the System energy:

=================================== FAILURES ===================================
_________________ TestEspalomaTemplateGenerator.test_energies __________________

self = <test_template_generators.TestEspalomaTemplateGenerator testMethod=test_energies>

    def test_energies(self):
        """Test potential energies match between openff-toolkit and OpenMM ForceField"""
        # DEBUG
        from openff.toolkit.topology import Molecule
        molecule = Molecule.from_smiles('C=O')
        molecule.generate_conformers(n_conformers=1)
        from openmm import unit
        molecule.conformers[0][0,0] += 0.1*unit.angstroms
        self.molecules.insert(0, molecule)
        # Test all supported SMIRNOFF force fields
        for small_molecule_forcefield in EspalomaTemplateGenerator.INSTALLED_FORCEFIELDS:
            print(f'Testing energies for {small_molecule_forcefield}...')
            # Create a generator that knows about a few molecules
            # TODO: Should the generator also load the appropriate force field files into the ForceField object?
            generator = EspalomaTemplateGenerator(molecules=self.molecules, forcefield=small_molecule_forcefield)
            # Create a ForceField
            import openmm
            openmm_forcefield = openmm.app.ForceField()
            # Register the template generator
            openmm_forcefield.registerTemplateGenerator(generator.generator)
            # Parameterize some molecules
            for molecule in self.molecules:
                # Create OpenMM System using OpenMM app
                from openmm.app import NoCutoff
                openmm_system = openmm_forcefield.createSystem(molecule.to_topology().to_openmm(), removeCMMotion=False, onbondedMethod=NoCutoff)
    
                # Retrieve System generated by Espaloma
                espaloma_system = generator.get_openmm_system(molecule)
    
                # Compare energies and forces
>               self.compare_energies(molecule, openmm_system, espaloma_system)

openmmforcefields/tests/test_template_generators.py:988: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'test_template_generators.TestEspalomaTemplateGenerator'>
molecule = Molecule with name '' and SMILES '[H]C(=O)[H]'
openmm_system = <openmm.openmm.System; proxy of <Swig Object of type 'OpenMM::System *' at 0x7fda23cd3d50> >
espaloma_system = <openmm.openmm.System; proxy of <Swig Object of type 'OpenMM::System *' at 0x7fda23c1a9f0> >

    @classmethod
    def compare_energies(cls, molecule, openmm_system, espaloma_system):
        """Compare energies between Open Force Field Initiative and OpenMM ForceField objects
    
        The OpenMM System object created internally by the EspalomaTemplateGenerator is used to
        avoid any issues with stochasticity of partial charges due to conformer generation.
    
        Parameters
        ----------
        molecule : openff.toolkit.topology.Molecule
            The Molecule object to compare energy components (including positions)
        openmm_system : openmm.System
            System generated by OpenMM ForceField
        espaloma_system : openmm.System
            System generated by Espaloma engine
    
        """
    
        # Compute energies
        openmm_energy, openmm_forces = cls.compute_energy(openmm_system, molecule.conformers[0])
        espaloma_energy, espaloma_forces = cls.compute_energy(espaloma_system, molecule.conformers[0])
    
        from openmm import unit
    
        def write_xml(filename, system):
            import openmm
            with open(filename, 'w') as outfile:
                print(f'Writing {filename}...')
                outfile.write(openmm.XmlSerializer.serialize(system))
    
        # Compare energies
        ENERGY_DEVIATION_TOLERANCE = 1.0e-2 * unit.kilocalories_per_mole
        delta = (openmm_energy['total'] - espaloma_energy['total'])
        if abs(delta) > ENERGY_DEVIATION_TOLERANCE:
            # Show breakdown by components
            print('Energy components:')
            print(f"{'component':24} {'OpenMM (kcal/mol)':>20} {'Espaloma (kcal/mol)':>20}")
            for key in openmm_energy['components'].keys():
                openmm_component_energy = openmm_energy['components'][key]
                espaloma_component_energy = espaloma_energy['components'][key]
                print(f'{key:24} {(openmm_component_energy/unit.kilocalories_per_mole):20.3f} {(espaloma_component_energy/unit.kilocalories_per_mole):20.3f} kcal/mol')
            print(f'{"TOTAL":24} {(openmm_energy["total"]/unit.kilocalories_per_mole):20.3f} {(espaloma_energy["total"]/unit.kilocalories_per_mole):20.3f} kcal/mol')
            write_xml('system-espaloma.xml', espaloma_system)
            write_xml('system-openmm.xml', openmm_system)
>           raise Exception(f'Energy deviation for {molecule.to_smiles()} ({delta/unit.kilocalories_per_mole} kcal/mol) exceeds threshold ({ENERGY_DEVIATION_TOLERANCE})')
E           Exception: Energy deviation for [H]C(=O)[H] (-109.24874862885405 kcal/mol) exceeds threshold (0.01 kcal/mol)

openmmforcefields/tests/test_template_generators.py:878: Exception
------------------------------ Captured log call -------------------------------
INFO     openmmforcefields.generators.template_generators:template_generators.py:1547 Using espaloma model cached at /home/runner/.espaloma/espaloma-0.2.0.pt
INFO     openmmforcefields.generators.template_generators:template_generators.py:270 Requested to generate parameters for residue <Residue 0 () of chain 0>
INFO     openmmforcefields.generators.template_generators:template_generators.py:1597 Generating a residue template for [H]C(=O)[H] using espaloma-0.2.0
INFO     openff.toolkit.typing.engines.smirnoff.parameters:parameters.py:3121 0 torsions added
INFO     openff.toolkit.typing.engines.smirnoff.parameters:parameters.py:3382 1 impropers added, each applied in a six-fold trefoil
INFO     openff.toolkit.typing.engines.smirnoff.parameters:parameters.py:2858 3 bonds added (0 skipped due to constraints)
INFO     openff.toolkit.typing.engines.smirnoff.parameters:parameters.py:2960 3 angles added (0 skipped due to constraints)
=============================== warnings summary ===============================

I'll have to run this locally and manually inspect the differences between the serialized XML files written during the failure.

jchodera · 2021-12-21T03:50:58Z

I've figured out why the energy test is failing: The improper ffxml entries we create are not being applied to the molecular system because the impropers defined by espaloma do not match the expected smirnoff 3-fold atom ordering scheme used in the improper matching code in openmm.app.ForceField. In fact, espaloma appears to depart from what SMIRNOFF does in two ways:

It defines a different 6-fold matching scheme
It uses all periodicities 1 through 6 rather than just periodicity 1 terms like OpenFF force fields

The espaloma 6-fold scheme compared with the SMIRNOFF 3-fold scheme is illustrated here:

I would have thought the inclusion of both directions (e.g. 1023 and 1032) would cancel out, but it appears that there may be some reason (???) these do not cancel.

To remedy this, we will either have to produce a new generation of espaloma models that use the SMIRNOFF 3-fold consistent impropers or add a new improper matching mode to openmm.app.ForceField in the next release.

davidlmobley · 2022-02-21T22:10:17Z

Yuanqing tagged us on this. We discussed this in a call with John and the details from that must not have gotten propagated back here. I'm dealing with something on deadline, but to summarize very briefly: If you include all six permutations of the relevant atoms, impropers can only be used to force atomic centers to be planar (essentially, one set of three gives force vectors which follow the right hand rule, basically, whereas the other set follows the left-hand rule, so the only configuration of atoms where you don't have a restoring force towards "flat" is when all of the atoms are co-planar). In OpenFF we also allow impropers which are NOT attempting to ensure the atoms involved are co-planar (e.g. such as for trivalent nitrogens), so it's important to ensure you only use the three with the same handedness.

davidlmobley · 2022-02-21T22:13:40Z

In terms of periodicity: I don't know why you would want these to have periodicities up to 6. If the minima is away from zero, the best way to deal with this would be to move the minima of the potential away from zero (e.g. applying symmetric potentials but with minima shifted away from zero). But chemically, these should not have high periodicities since there is always only one or two minima (either at "flat" or at "somewhat puckered").

yuanqing-wang · 2022-02-22T00:21:40Z

If you include all six permutations of the relevant atoms, impropers can only be used to force atomic centers to be planar

Sorry I didn't quite follow this argument. If one set of parameters force atomic centers to be planar, then if you take the negative of all those parameters, they would force atomic centers to not be planar?

yuanqing-wang · 2022-02-22T00:24:13Z

In terms of periodicity: I don't know why you would want these to have periodicities up to 6. If the minima is away from zero, the best way to deal with this would be to move the minima of the potential away from zero

if we have infinitive periodicities, we should be able to approximate any arbitrary functions of the dihedral angles. we just picked a number that's close enough to infinity.

it would be nice if we could also optimize periodicities but that is a discrete variable and is very difficult to optimize

davidlmobley · 2022-02-22T01:15:07Z

Sorry I didn't quite follow this argument. If one set of parameters force atomic centers to be planar, then if you take the negative of all those parameters, they would force atomic centers to not be planar?

No. The issue arises if you TRY and force atomic centers to not be planar (as we are doing in some of our fitting experiments, as some centers SHOULD be nonplanar). The direction of the force in such a situation can be computed from the cross product of relevant vectors (per OpenMM docs) but, if the atoms are puckered up (for example) then the cross products of three of the permutations will point up, and the other three will point down. Essentially, they fight with one another. The magnitudes differ in the different directions (up vs down), and the end result is that the force becomes zero (and the center stabilizes) ONLY if it is planar.

To put it another way, it's impossible to impropers that have minima away from zero if you're applying all six permutations; it's only possible to have minima away from zero if you use only terms with the same "handedness".

if we have infinitive periodicities, we should be able to approximate any arbitrary functions of the dihedral angles. we just picked a number that's close enough to infinity.

Of course you could, yes. However, that can introduce a lot of terms. More conventionally force fields have used a SINGLE periodicity for these, though. Impropers are certainly non-periodic so it's unclear to me why you would want to use an expansion with many different periodicities. The two most important scenarios you have to deal with are:

You have a single minimum at an improper angle of 0, and it is roughly harmonic around that, and
You have two symmetric minima at improper angles somewhat away from zero (e.g. +/- 30 degrees) and the potential is roughly harmonic near that

A smal fraction of the time you may see a third scenario where the potential is roughly flat between +/-30 degrees.

yuanqing-wang · 2022-02-22T03:19:20Z

To put it another way, it's impossible to impropers that have minima away from zero if you're applying all six permutations; it's only possible to have minima away from zero if you use only terms with the same "handedness".

Sorry I still don't quite understand what's wrong with this little thought experiment of mine:

Imagine I used all six permutations. This argument says that it's impossible to have minima away from zero. So the minima has to be at zero.

Now I flip the sign of all of my parameters---zero is now a maxima. I don't know where my minima is. But it sure isn't at zero.

jchodera · 2022-02-22T16:08:05Z

@davidlmobley : Thanks for the detailed points here!

I don't think there is a concern with using multiple periodicities here---they're all symmetric and periodic, and there is no reason to stick to the lowest-order (n=1) term like the openff-* force fields do.

Instead, the question is whether six-fold impropers can be used for the same class of interactions that the SMIRNOFF three-fold impropers are used (for enforcing planarity or pyramidality). Because the six-fold impropers almost cancel (and would cancel exactly if bond lengths and angles are symmetric), I'm not even sure they can even be used to enforce planarity. I think they end up adding a kind of local multibody term that espaloma may be exploiting advantageously.

@kaminow is adding the standard SMIRNOFF three-fold impropers for us to refit and compare both three-fold and six-fold models, but now you have me wondering if we may not want both: By including three-fold and six-fold impropers (with separate parameters generated for each), we could potentially have the best of both worlds. We should also try ablating impropers entirely to see how much this degrades performance.

For now, I'll proceed by adding support for six-fold impropers to OpenMM as well, so that we can deploy all of these combinations to test the real impact on binding free energies.

…x.y.z.pt`

jchodera · 2022-04-07T15:05:35Z

@yuanqing-wang : Looks like this is finally working! I'll get this merged and then we can do some testing with perses.

jchodera added 7 commits October 28, 2021 18:45

Refactor OpenMM System -> ffxml conversion into mixin

e027de0

Initial implementation of EspalomaTemplateGenerator and unit tests

243709b

Fix test env YAML

5597b73

Add espaloma to SystemGenerator tests

3268267

Fix indentation in test environment YAML

f7d0142

Update README with espaloma use instructions

0b9803d

Add qcportal to conda env requirements, since espaloma seems to requi…

d06c944

…re it?

jchodera added 6 commits November 14, 2021 16:31

Merge remote-tracking branch 'origin/master' into espaloma

036762b

Update conda env to test against espaloma master

94d86fa

Update espaloma charge generation to use new API to allow charging fr…

54d1f95

…om nn or molecule

Log charge discrepancies as INFO

a965bea

Fix error in EspalomaTemplateGenerator.INSTALLED_FORCEFIELDS

23dc2af

Update INSTALLED_FORCEFIELDS

b3abca3

Ensure we use a sufficiently recent qcportal

1dab2f1

jchodera mentioned this pull request Jan 21, 2022

Add SMIRNOFF improper handling choderalab/espaloma#101

Closed

jchodera mentioned this pull request Feb 23, 2022

Add additional improper torsion classes to ForceField openmm/openmm#3482

Open

jchodera added 3 commits March 2, 2022 06:44

Debugging espaloma

99a4b43

Handle espaloma with SMIRNOFF torsions.

4b4c3e0

Remove debug code

d7b1b2e

jchodera added this to the 0.11.0 milestone Mar 8, 2022

jchodera mentioned this pull request Apr 6, 2022

Update for upcoming OpenFF Toolkit API breaks #191

Merged

jchodera added 5 commits April 5, 2022 20:32

Update to espaloma-0.2.2

617c5d9

Fix typo in test calls to ForceField.createSystem()

8263899

Test on Python 3.7, 3.8, 3.9

f4fe71a

We're using espaloma-x.y.z.pt for naming models now, not `espaloma_…

2af3603

…x.y.z.pt`

Fix retrieve forcefield test

f8d9890

Update SystemGenerator tests to use espaloma-0.2.2

9816f01

jchodera marked this pull request as ready for review April 7, 2022 15:11

Fix detection of whether tests are running on CI

5b219a4

jchodera merged commit 1fabf43 into master Apr 7, 2022

jchodera deleted the espaloma branch April 7, 2022 18:30

epretti mentioned this pull request Nov 12, 2025

Discrepancy in espaloma energies when using template generator, which calls regenerate_impropers #412

Open

Add support for espaloma small molecule parameters #182

Add support for espaloma small molecule parameters #182

Uh oh!

Conversation

jchodera commented Oct 29, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jchodera commented Oct 29, 2021

Uh oh!

jchodera commented Oct 29, 2021

Uh oh!

jchodera commented Nov 14, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jchodera commented Nov 15, 2021

Uh oh!

yuanqing-wang commented Nov 16, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yuanqing-wang commented Nov 18, 2021

Uh oh!

jchodera commented Nov 19, 2021

Uh oh!

jchodera commented Dec 21, 2021

Uh oh!

davidlmobley commented Feb 21, 2022

Uh oh!

davidlmobley commented Feb 21, 2022

Uh oh!

yuanqing-wang commented Feb 22, 2022

Uh oh!

yuanqing-wang commented Feb 22, 2022

Uh oh!

davidlmobley commented Feb 22, 2022

Uh oh!

yuanqing-wang commented Feb 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jchodera commented Feb 22, 2022

Uh oh!

jchodera commented Apr 7, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jchodera commented Oct 29, 2021 •

edited

Loading

jchodera commented Nov 14, 2021 •

edited

Loading

yuanqing-wang commented Nov 16, 2021 •

edited

Loading

yuanqing-wang commented Feb 22, 2022 •

edited

Loading