Skip to content

More robust conformer embed in QM module#2272

Merged
amarkpayne merged 4 commits intomainfrom
qm_mol
Mar 1, 2022
Merged

More robust conformer embed in QM module#2272
amarkpayne merged 4 commits intomainfrom
qm_mol

Conversation

@xiaoruiDong
Copy link
Copy Markdown
Contributor

This PR aims to partially solve issue #2138, by implementing a more robust conformed embedding scheme. The RDKit Default algorithm ETKDG may fail some time, which causes no conformers actually created in rd_embed functions that further raises an error when GetConformer at line 196.

The fix implements two backup algorithms to embed conformers: embedding using random coordinates, and embedding using 2D geometries. Since there is a following force field optimization step, it shouldn't be too bad an idea.

Copy link
Copy Markdown
Contributor

@kspieks kspieks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR that addresses the corresponding issue! The code works perfectly on my end for some test smiles. Can you please create a moleculeTest.py file that just tests the rd_embed method? Maybe test 1 or 2 smiles that ETKDG handles just fine, such as CC or any easy example. And also test 1 or 2 smiles, such as [N:1](=[C-:2][C@@:3]1([H:8])[O:4][C@@:5]2([H:9])[C:6]([H:10])([H:11])[C+:7]12)[H:12] and [N+:1]1#[C:2][C@@:3]2([H:8])[O:4][C-:5]([H:9])[C:6]([H:10])([H:11])[C@@:7]12[H:12], that ETKDG fails to embed so that your new code for handling ETKDG failures can be tested as well.

AllChem.EmbedMultipleConfs(rdmol, num_conf_attempts, randomSeed=1)

if rdmol.GetNumConformers() == 0:
# There are molecules that the default ETKDG method fails.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you reword this to say "Occasionally, ETKDG fails to embed some molecules" (sorry super minor I know)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Corrected.

@xiaoruiDong
Copy link
Copy Markdown
Contributor Author

xiaoruiDong commented Feb 28, 2022

@kspieks Thank you for reviewing this PR. I have

  • corrected the wording
  • added unit tests for rd_embed
  • found rdkit force field optimization may fail some time, so I added a try...except... to catch the error. Since, only one conformed is needed in the end, it is okay to have some failures. I also added two flags to check the quality of embedding and optimization, but I diden't implement anything other than logging to prevent the calculation if both of them are bad. It requires some refactoring which is be beyond the scope of my original intention.

I will squash fixes and rebase the commits once you think proper.

@kspieks
Copy link
Copy Markdown
Contributor

kspieks commented Feb 28, 2022

The new changes look great. Can you address the very minor comment about subprocess and then squash and rebase? This PR looks good to me :) Thanks again!

@xiaoruiDong xiaoruiDong force-pushed the qm_mol branch 2 times, most recently from df23080 to e20160a Compare February 28, 2022 19:19
@xiaoruiDong xiaoruiDong linked an issue Feb 28, 2022 that may be closed by this pull request
@kspieks kspieks self-requested a review February 28, 2022 19:27
kspieks
kspieks previously approved these changes Feb 28, 2022
@lgtm-com
Copy link
Copy Markdown

lgtm-com bot commented Feb 28, 2022

This pull request fixes 1 alert when merging e20160a into 387fe33 - view on LGTM.com

fixed alerts:

  • 1 for Variable defined multiple times

@amarkpayne
Copy link
Copy Markdown
Member

@xiaoruiDong can you go ahead and update this branch when you get a chance?

@xiaoruiDong
Copy link
Copy Markdown
Contributor Author

@amarkpayne Rebased the branch, thanks!

@lgtm-com
Copy link
Copy Markdown

lgtm-com bot commented Mar 1, 2022

This pull request fixes 1 alert when merging 28a6973 into 32c2b92 - view on LGTM.com

fixed alerts:

  • 1 for Variable defined multiple times

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Quantum calulation problem

3 participants