More robust conformer embed in QM module#2272
Conversation
kspieks
left a comment
There was a problem hiding this comment.
Thanks for this PR that addresses the corresponding issue! The code works perfectly on my end for some test smiles. Can you please create a moleculeTest.py file that just tests the rd_embed method? Maybe test 1 or 2 smiles that ETKDG handles just fine, such as CC or any easy example. And also test 1 or 2 smiles, such as [N:1](=[C-:2][C@@:3]1([H:8])[O:4][C@@:5]2([H:9])[C:6]([H:10])([H:11])[C+:7]12)[H:12] and [N+:1]1#[C:2][C@@:3]2([H:8])[O:4][C-:5]([H:9])[C:6]([H:10])([H:11])[C@@:7]12[H:12], that ETKDG fails to embed so that your new code for handling ETKDG failures can be tested as well.
rmgpy/qm/molecule.py
Outdated
| AllChem.EmbedMultipleConfs(rdmol, num_conf_attempts, randomSeed=1) | ||
|
|
||
| if rdmol.GetNumConformers() == 0: | ||
| # There are molecules that the default ETKDG method fails. |
There was a problem hiding this comment.
Can you reword this to say "Occasionally, ETKDG fails to embed some molecules" (sorry super minor I know)
There was a problem hiding this comment.
Thanks. Corrected.
|
@kspieks Thank you for reviewing this PR. I have
I will squash fixes and rebase the commits once you think proper. |
|
The new changes look great. Can you address the very minor comment about subprocess and then squash and rebase? This PR looks good to me :) Thanks again! |
df23080 to
e20160a
Compare
|
This pull request fixes 1 alert when merging e20160a into 387fe33 - view on LGTM.com fixed alerts:
|
|
@xiaoruiDong can you go ahead and update this branch when you get a chance? |
|
@amarkpayne Rebased the branch, thanks! |
|
This pull request fixes 1 alert when merging 28a6973 into 32c2b92 - view on LGTM.com fixed alerts:
|
This PR aims to partially solve issue #2138, by implementing a more robust conformed embedding scheme. The RDKit Default algorithm ETKDG may fail some time, which causes no conformers actually created in
rd_embedfunctions that further raises an error whenGetConformerat line 196.The fix implements two backup algorithms to embed conformers: embedding using random coordinates, and embedding using 2D geometries. Since there is a following force field optimization step, it shouldn't be too bad an idea.