-
Notifications
You must be signed in to change notification settings - Fork 164
Allow some 1-4 interactions to be unscaled #1149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Looks like master needs to be merged first? |
|
What do you mean? |
|
I don't get that message! It just tells me merging is restricted to authorized users. I've hopefully done the merge it wants. |
|
@jchodera - do you have a test for this? |
|
@peastman: I'm a bit confused by the implementation. It looks like you go through the effort of tracking which atom types have nonstandard scee/scnb and modifying the 1-4 interactions for those atoms, but only for those that set scee/scnb to 1 exactly. Anything else still throws an exception. Am I missing something, or wouldn't it be straightforward to extend this to track any nonstandard scee/scnb and apply those nonstandard values in the script?
@swails: I'm guessing that the easiest way to properly test this is to convert The difficulty here is that preparing even simple glycosylated peptide systems in Amber is an incredible pain. It would be much easier if there was a pre-prepped system ready to go. Update: I think I may have found a pre-prepared system in ambpdb -p Glycoprotein.parm7.save -c Glycoprotein.rst7.save -conect > Glycoprotein.pdbI can work on this test, but I'll have to open a separate PR since this is contributed from @peastman's branch. Note that the GLYCAM force field conversion may be a problem in general since they inexplicably seem to have made the choice to use a different definition of what constitutes a "residue" than the PDB, so while we can go from AMBER PDB -> OpenMM System, we won't be able to go from RCSB PDB -> OpenMM System with these definitions. Really looking forward to OpenFF adding support for glycans so all of this will "just work"... |
|
@peastman : I realize I'm running really short on time, and don't want to hold this PR up further, so here is a ZIP archive of a directory you can use in this test:
Note that the test uses the older ff10 force field and matching glycam force field because more modern test systems were not already pre-prepared in the AmberTools pool of tests. |
See the discussion at #1136 (comment). |
|
@peastman: I'm not sure I quite follow. Couldn't we throw an exception only if we run into an ambiguous case, such as those cases you enumerate? This PR seems to do the opposite, and only handles the special case where the scee/scnb are exactly unity. |
This PR is quite narrow in that it supports any combination of force fields where there is only one non-unity set of scaling factors supported. While narrow, it does provide support for all Amber force fields and combinations out there: GLYCAM uses unity scaling factors and the protein, RNA, and Lipid force fields all have the same scaling factors (1/1.2 electrostatic and 1/2 Lennard-Jones). So this PR should support GLYCAM + any/all of the other Amber force fields, since it will support the single scaling factor from the protein, RNA, and Lipid force fields along with the unity (unscaled) GLYCAM force field. |
|
OK, let's run with it, then. @peastman: Here are the files you can use in a test! |
|
I'll try to get to it within the next few days. I've been busy with other things. |
|
Thanks! Once this is done, I can quickly convert the GLYCAM force field, which will really help our group prepare a bunch of the new SARS-CoV-2 mutants and get them on Folding@home ASAP so we can assess their impact on ACE2 binding! The lack of GLYCAM is blocking them. |
|
I'm not sure what I'm supposed to do with those files? There's a PDB file, and a bash script that just sets a single variable, and then some other files of types I'm not familiar with. I'm attaching the converted force field. I edited the |
@peastman: Apologies for the miscommunication. Here's a summary:
@peastman: Can you also review the important details in openmm/openmmforcefields#156 where my group attempted to use this PR? There are two main issues:
Thanks so much for your help in wrapping this up, @peastman---the group is trying to get these simulations of SARS-CoV-2 mutants set up ASAP so we can help our experimental collaborators determine whether we need to worry about emerging variants in developing therapies. |
There are no prmtop or inpcrd files in your zip. Can you check that you included the right set of files?
You left out the biggest problem. Quoting from that issue,
I can confirm that only those residues are present in GLYCAM_amino_06j_12SB.lib. I don't know enough about the force field to comment on whether others should be present as well, or where it should get them from.
Possibly so. Alternatively it could be done with a modification to convert_amber.py. See my comment at the top of this thread. That's a separate feature though. |
Sorry about the information being spread out in multiple comments---I mentioned this here. Older versions of amber sometimes use different file naming conventions, so
I think I can fix that problem! The residue definitions are loaded by The issue appears to be that ParmEd does not implement the The GLYCAM leaprc files seem to be the only files in the current AmberTools distribution that use It appears I can manually create a where I have to list the whole set of 100+ residue names in It looks like I can do this and include the modified @swails: How would we get such a change eventually made upstream in the AmberTools distribution?
I think it would simplify things greatly to add this feature to ParmEd so we don't have to hack XML files. |
|
I think we can simply do a regression test here. Convert a GLYCAM force field into an OpenMM XML file and then just compare the generated XML with what this version of the code produces.
It's not really fair to try and force the GLYCAM maintainers to abandon PREP files to make this particular task easier, although they may be amenable if it increases adoption of GLYCAM. It may be time to simply bite the bullet and add a PREP file parser (but no writer, since I think Amber is actively trying to kill off that format). I'll also need to break down and actually write code (i.e., copy code from somewhere) to interconvert between internal and cartesian coordinates. |
The |
The PDB format is over 40 years old... 🤷♂️ |
That's why the RCSB was sensible enough to formally deprecate it and force everyone to use a more modern, flexible format. :) Supporting old formats should be "best effort", but can't drive development into a corner. |
Absolutely. If everyone stopped using the PDB format tomorrow I would clap my hands and jump for joy. My point however is that as of right now it is still one of the most widely-used formats, so supporting it remains a necessary evil. If this issue can be solved without adding support for the PREP format, that would be great. However, if supporting PREP is the only (or most straightforward) way to address this issue, then as Jason said it might be time to bite the bullet. I think the next step is to reach out to the Glycam devs directly and get their take. If they are willing to abandon the PREP format, then this becomes a non-issue and that old format can finally be put out to pasture... |
My understanding is that a two-line LEaP script is all that is needed to generate the requisite In the meantime, I can run the two-line LEaP script manually and bundle the files with |
|
@peastman Yes, this is the first issue I encountered when trying to convert the forcefields. For the long term, we'll have to make a decision on whether we should enforce protein atom names in ffxml/GLYCAM_06j-1.xml to contain the |
|
The problem seems to be from missing definitions. I examined one particular bond that is present in
The prefix is essential. The force field won't work correctly without it. See my comments at #1136 (comment). |
Isn't the prefix only essential when using |
|
Looking at GLYCAM_06j.dat, I see that it also contains no parameters for bonds involving |
|
The prefix is essential when combining multiple types of molecules. The protein and DNA force fields each define a type called "C", for example. Without the prefixes, OpenMM will report an error when you try to combine them, since you aren't allowed to define two different types with the same name. |
|
At the top of I also don't see bond parameters defined for CX, but I do see angle and torsion parameters. Not sure why this is. Perhaps we |
|
That's defining the atom type. Scroll down to the next section which defines bond parameters. There are no definitions involving CX. |
|
Sorry, missed the second half of your post. I don't think Amber supports any kind of subclassing for atom types? If an atom has type CX, so far as I know it will only use definitions for that specific type. |
For the test case above, I noted that I am only using ff14SB.xml and GLYCAM_06j-1_trunc.xml, so there won't be any naming clashes for parametrizing this system. |
|
I think I see. The problem is in the prefixes. CX is not a glycam type. It's the standard type used for protein CA atoms. So the type of that atom should be In the definition of the NLN residue, there's a mix of atoms with and without the |
I used this function in openmmforcefields: https://github.com/openmm/openmmforcefields/blob/master/amber/convert_amber.py#L548 |
|
Ok, I think that clears it up. We already knew that method would need to change to support this force field. |
|
Ok, it seems like there are a lot of atom types that appear in both |
|
I suggest we merge this, then open a separate issue to discuss prefixes. There are several issues that need to be considered, and it may require changes in both ParmEd and openmmforcefields. |
|
Sure, I'd be happy to open an issue and start the discussion. My only hesitation with merging this PR is that the nonbonded energies are still very discrepant: #1149 (comment) |
|
That's expected. The code to handle scaling of 1-4 interactions depends on atom types. Since the prefixes are currently being applied inconsistently, it won't work correctly. |
|
Right, but we should probably wait until the atom type prefixes have been fixed (and then run my test again to confirm that the energy discrepancy is no longer present) before merging this. Otherwisse, we cannot assess whether the changes in this PR are scaling energies appropriately. |
|
This PR wasn't intended to fix all problems with GLYCAM, just one particular one. Getting it working will take multiple PRs across multiple repositories. That's going to be hard to do if we don't merge any of them until all of them are finished. I already verified that this produces correct scaling when given a force field where all atom types are defined correctly. |
|
Is this ok to merge? |
|
Yes go ahead! |
|
@swails is the one who can merge it. |
|
Merged. |

This partially address #1136. This code has only been minimally tested! To fully convert a force field and do proper tests, we need to also update the openforcefields conversion script, and possibly add a feature to ParmEd so you can specify a prefix for atom types.