-
Notifications
You must be signed in to change notification settings - Fork 86
Improve molecule filtering in tests #370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #370 +/- ##
=======================================
Coverage 71.41% 71.41%
=======================================
Files 5 5
Lines 822 822
=======================================
Hits 587 587
Misses 235 235 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I found you may also need to add another exception to exclude from the list of forcefields. Looks like there are some other actual failures in there in terms of energies not matching up: |
|
So just taking a look at the one molecule linked above that is failing, This could be fixed by changing the energy difference check openmmforcefields/openmmforcefields/tests/test_template_generators.py Lines 199 to 201 in e6e6248
to also include a relative as well as an absolute tolerance, and/or running a minimization between these lines openmmforcefields/openmmforcefields/tests/test_template_generators.py Lines 1025 to 1026 in e6e6248
Perhaps that would fix some other failures as well. The minimized geometry for this molecule looks extremely suspect to me. The root issue for this particular molecule is that this force field is assigning a 4 angstrom (!) bond length. |
|
Heh, I was just in the process of tidying up a small number of details in that file before passing it off to you for a more serious analysis. I hoped to find an easy way to get I think I speak for the entirety of the initiative in saying that smirnoff99Frosst was a essential stepping stone in the development of our force fields but was never meant to be used in perpetuity. Sage force fields mostly (but not completely) replace Parsley force fields which themselves mostly replaced smirnoff99Frosst. Skipping over some details, there's just no reason for somebody to use it as a "starting point" force field in the middle of 2025. I don't think we should necessarily make it hard for people to use it in downstream packages like this, but it's also not worth the trouble of digging deep into its particular issues or putting a ton of CI time into testing it. I suggest skipping over it either in tests (my preference) or altogether. |
|
Happy to take a look at it whenever you're ready. Thanks for the note on the context of the development of smirnoff99Frosst. |
|
Okay, the failure is with 2.2.1RC1 but it's the same as the released force field: $ diff ~/software/openff-forcefields/openforcefields/offxml/openff-2.2.1-rc1.offxml ~/software/openff-forcefields/openforcefields/offxml/openff-2.2.1.offxml
4c4
< <Date>2024-07-22</Date>
---
> <Date>2024-09-11</Date>@epretti what should we do here? |
|
OK, I can reproduce, and I'm seeing this with a few force fields. It's the same kind of problem as above with the other molecule I looked at. In this case I think nothing's wrong with the force field, but the generated conformer is bad. A close contact between two atoms creates large forces, the dynamics become unstable, and the resulting configuration can have extremely large energies. Though they match to high precision in the test, the absolute value of the difference can be large. openmmforcefields/openmmforcefields/tests/test_template_generators.py Lines 1025 to 1026 in e6e6248
to context.setPositions(molecule.conformers[0].to_openmm())
openmm.LocalEnergyMinimizer.minimize(context)
integrator.step(nsteps) |
| molecules = molecules[:MAX_MOLECULES] | ||
| ligand_structures = ligand_structures[:MAX_MOLECULES] | ||
| print(f"{len(molecules)} molecules remain after filtering") | ||
| ligand_structures = self.filter_molecules(molecules, max_molecules=3 if CI else 6) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will cause a problem because the ligand_structures are ParmEd Structure objects, and not Molecule objects. Adding them to Structure objects below, as protein_structure + ligand_structure, puts ParmEd into an infinite recursion. This is a ParmEd bug as that should really raise TypeError, but in any case, the addition should be Structure + Structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, didn't even know that was the type of that object. Only was reading portions of the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have caused some test(s) to fail
I think we can leverage an accident here in which these classes have similar and similarly-named attributes? ff351ba
In [1]: import parmed
In [2]: from openff.toolkit import Molecule
In [3]: molecule = Molecule.from_smiles(1000 * "C")
In [4]: structure = parmed.Structure()
In [5]: isinstance(molecule.atoms, list)
Out[5]: True
In [6]: isinstance(structure.atoms, list)
Out[6]: True
In [7]: len(molecule.atoms)
Out[7]: 3002
In [8]: len(structure.atoms)
Out[8]: 0Fix typing Fix
89abb60 to
78550eb
Compare
|
@epretti I think this is mostly ready for you to take over |
|
This should be working now. I updated the troublesome GAFF partial charge test that we discussed recently to just create example user-defined charges that sum to the specified amount. I'm creating a separate issue for what to do about this problem in general, and if it eventually leads to a PR, I can add separate tests there. |



Improves molecule filtering in CI tests by
Might fix #330