Skip to content

Conversation

@AlexHeide
Copy link
Contributor

@AlexHeide AlexHeide commented Sep 26, 2022

Description

Remove c++ optking. Add new python optking driver.

User API & Changelog headlines

  • RN 1 The fixed_* optimization keywords have been changed to ranged_* options
  • RN 2 output will be changed. Check output.dat for simple convergence / step info. output.log for detailed info
  • RN 3 IRC convergence behavior different for minima and substep.
  • Downstream plugin users who were still getting wfn from globals will find it has now departed. Please follow the advice it's been issuing for years to do wfn passing. [LAB added]

Dev notes & details

  • New optimizer. Most of driver is very similar
  • Hessian Updating and optimization logic is in optking as much as possible
  • Restarting optimizations now supported. Optking can write entire state to disc (json)
  • Driver attempts to symmetrize geometries
  • New keywords added. Some removed / updated.
  • old optking removed from CMakeLists.txt from /psi4/src/psi4
  • optking py_funcs removed from core.cc
  • All tests have been verified for equal or better convergence. (except opt-irc-2)
  • globals legacy gradient, wfn, and molecule removed [LAB added]

Questions

  • A document describing differences between the optimzers may be necessary
  • Unclear if I have fully removed old optking and building with new optking is fully working.

Checklist

  • test15 tests restart
  • Can remove old deprecated set_gradient
  • ctest -L opt run

Status

  • Ready for review
  • Ready for merge

For documentation on the new optimizer please see https://optking.readthedocs.io/en/latest/
The lines changed is almost entirely due to new test output.

@JonathonMisiewicz
Copy link
Contributor

Just pointing out that this PR doesn't remove C-side optking code yet. Is that coming?

@AlexHeide
Copy link
Contributor Author

Just pointing out that this PR doesn't remove C-side optking code yet. Is that coming?

Just an oversight. deleted but not comitted.

@susilehtola
Copy link
Member

Just pointing out that this PR doesn't remove C-side optking code yet. Is that coming?

Just an oversight. deleted but not comitted.

The Optking is dead. Long live the Pyoptking.

@AlexHeide
Copy link
Contributor Author

The test that is currently failing is a test in gcp/pbeh3c/. This is due to an optking side issue where the CustomHelper class being used by optking is not accepting a psi4.core.Molecule the type checking was looking for qcdb.Molecule. As a backup optking defaulted to psi4's active molecule.

This is the call.
E = optimize('pbeh3c/def2-msvp', molecule=unopethene)

optking takes the molecule here in the driver and uses a default fallback instead:
opt_object = optking.opt_helper.CustomHelper(molecule, params=optimizer_params)

The type check will get changed in optking to include core.Molecule. An alternative question this raises for me is whether the active_molecule should get updated at some point in the optimization. Updating the active molecule patches the issue but is that desired?

@psi-rking
Copy link
Contributor

The test that is currently failing is a test in gcp/pbeh3c/. This is due to an optking side issue where the CustomHelper class being used by optking is not accepting a psi4.core.Molecule the type checking was looking for qcdb.Molecule. As a backup optking defaulted to psi4's active molecule.

This is the call. E = optimize('pbeh3c/def2-msvp', molecule=unopethene)

optking takes the molecule here in the driver and uses a default fallback instead: opt_object = optking.opt_helper.CustomHelper(molecule, params=optimizer_params)

The type check will get changed in optking to include core.Molecule. An alternative question this raises for me is whether the active_molecule should get updated at some point in the optimization. Updating the active molecule patches the issue but is that desired?

My vote would be "no"; it's too much like changing the input. However, I do think that some users would expect the final active molecule to be updated for them. And workflows inside the python input may seem more intuitive that way.

@loriab
Copy link
Member

loriab commented Sep 27, 2022

An alternative question this raises for me is whether the active_molecule should get updated at some point in the optimization. Updating the active molecule patches the issue but is that desired?

My vote would be "no"; it's too much like changing the input. However, I do think that some users would expect the final active molecule to be updated for them. And workflows inside the python input may seem more intuitive that way.

I may not be following this right. I'd say the state of the psi4 active mol during an optimization is arbitrary -- whatever works for you. In cpp-optking, I think the communication was through legacymolecule anyways. The molecule optking is acting on should be updated by the time control returns to the user at the end of the opt. I thought this was already happening through https://github.com/psi4/psi4/pull/2727/files#diff-acf663ccea13592c4c656cf89c7b62e6f5bd3b2e8b4f12ba354129bd39d096f8R1296-R1297 . That's consistent with cpp-optking, and I think that must be happening b/c many of the tests check NRE before and after opt.

from addons import *

@ctest_labeler("opt;cart;noc1")
def test_opt14():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def test_opt14():
def test_opt15():

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops

@psi-rking
Copy link
Contributor

An alternative question this raises for me is whether the active_molecule should get updated at some point in the optimization. Updating the active molecule patches the issue but is that desired?

My vote would be "no"; it's too much like changing the input. However, I do think that some users would expect the final active molecule to be updated for them. And workflows inside the python input may seem more intuitive that way.

I may not be following this right. I'd say the state of the psi4 active mol during an optimization is arbitrary -- whatever works for you. In cpp-optking, I think the communication was through legacymolecule anyways. The molecule optking is acting on should be updated by the time control returns to the user at the end of the opt. I thought this was already happening through https://github.com/psi4/psi4/pull/2727/files#diff-acf663ccea13592c4c656cf89c7b62e6f5bd3b2e8b4f12ba354129bd39d096f8R1296-R1297 . That's consistent with cpp-optking, and I think that must be happening b/c many of the tests check NRE before and after opt.

OK, you changed my hasty mind on that. What concerns me is that the user may, in some instances, not realize that the default active molecule is the one that optking is acting on and changing. But the upside convenience wins, I agree.

@loriab
Copy link
Member

loriab commented Sep 27, 2022

OK, you changed my hasty mind on that. What concerns me is that the user may, in some instances, not realize that the default active molecule is the one that optking is acting on and changing. But the upside convenience wins, I agree.

If changing the mol over the course of an opt was new behavior, I think it'd be very much up for debate. I know FAE is not keen on background/globals changes. And they are weird in a Jupyter notebook. But for now, I'm for minimal surprise upon cpp-optking -> py-optking.

@JonathonMisiewicz
Copy link
Contributor

Please ping me for review once Lori approves. She knows this part of the code better than I do.

@AlexHeide
Copy link
Contributor Author

AlexHeide commented Sep 29, 2022

In cases where reference values are not matched perfectly (but the test should pass) is it better to loosen the comparison or update the reference value. I assume updating the reference value is the way to go but I want to have some record of asking before I start slightly changing reference values.

Secondarily, should the reference values be updated in general at some point so that users don't stumble across an instance where the value is slightly off and wonder why?

Explanation:
I've expanded the number of tests I'm running since I started cleaning up core.cc. In the opt specific tests the convergence is usually tight enough that the nuclear repulsion energies match the reference values just fine for both optimizers. There are some tests like cc1-3 that are failing due to being just above threshold. atol = 0.001 the difference is ~ 0.0017. These tests are using the default qchem convergence criteria ~ 3e-4 max_force. Both optimizers finish well below the criteria and geometries match to 1e-4 Angstroms and 0.001 degrees. Geometries are the same. There are around 5 or 6 tests failing like this.

@AlexHeide
Copy link
Contributor Author

Would y'all like this PR to include updated output.ref files for the tests or would a separate test updating PR be better? I don't see anything in the" adding tests documentation" about reference log files. I can include those as well if desired. It will just greatly increase the number of lines changed in this PR.

@JonathonMisiewicz
Copy link
Contributor

Would y'all like this PR to include updated output.ref files for the tests or would a separate test updating PR be better? I don't see anything in the" adding tests documentation" about reference log files. I can include those as well if desired. It will just greatly increase the number of lines changed in this PR.

Please update .ref files. Probably not worth adding .log files, but that's a @loriab question.

There are some tests like cc1-3 that are failing due to being just above threshold. atol = 0.001 the difference is ~ 0.0017. These tests are using the default qchem convergence criteria ~ 3e-4 max_force. Both optimizers finish well below the criteria and geometries match to 1e-4 Angstroms and 0.001 degrees.

Could you elaborate on why tests are failing at all, and what numbers are differing? Are these Cartesian coordinates?

@loriab
Copy link
Member

loriab commented Sep 29, 2022

In cases where reference values are not matched perfectly (but the test should pass) is it better to loosen the comparison or update the reference value. I assume updating the reference value is the way to go but I want to have some record of asking before I start slightly changing reference values.

Secondarily, should the reference values be updated in general at some point so that users don't stumble across an instance where the value is slightly off and wonder why?

General guidance to for ref values to be from a tightly converged/optimized calc, then loosen the comparison check to accommodate the default/existing conv crit (https://psicode.org/psi4manual/master/add_tests.html#test-contents). Tests checking opt status at a certain cycle exempt of course. That's the principle, but do feel free to change as you see fit --- the reference values (agreed, preferred thing to change if the ref is the culprit) or the comparison crit (if it's the optimizer behavior that's the instigator).

Explanation:
I've expanded the number of tests I'm running since I started cleaning up core.cc. In the opt specific tests the convergence is usually tight enough that the nuclear repulsion energies match the reference values just fine for both optimizers. There are some tests like cc1-3 that are failing due to being just above threshold. atol = 0.001 the difference is ~ 0.0017. These tests are using the default qchem convergence criteria ~ 3e-4 max_force. Both optimizers finish well below the criteria and geometries match to 1e-4 Angstroms and 0.001 degrees. Geometries are the same. There are around 5 or 6 tests failing like this.

Thanks for the explanation. I'd view reference NRE values as less venerable. For one thing, only those that caused trouble were even updated when physical constants changed, iirc.

Would y'all like this PR to include updated output.ref files for the tests or would a separate test updating PR be better? I don't see anything in the" adding tests documentation" about reference log files. I can include those as well if desired. It will just greatly increase the number of lines changed in this PR.

Separate, please. Like updating samples/, better to keep the not-for-visual-inspection changes aside. Update: I see Jonathon thinks differently. Including is ok with me, now that GH allows files to be folded up rather than scrolled through. Is .log where all the optking detailed output goes now? I guess we ought to start collecting them. output.log, perhaps. But this can also be deferred to a grand regenerate-the-refs script and PR.

@JonathonMisiewicz
Copy link
Contributor

I'm perfectly happy to leave .ref to a separate PR, so long as @AlexHeide agrees to take responsibility for getting that in by PsiCon at the latest. 🙂

@AlexHeide
Copy link
Contributor Author

AlexHeide commented Sep 29, 2022

Could you elaborate on why tests are failing at all, and what numbers are differing? Are these Cartesian coordinates?

I poorly worded my explanation. I was attempting to say that the geometries are virtually the same but not numerically, they are both well converged. I assumed the discrepancy was just a numeric difference and there have been various tweaks as well over the last few years to the algorithm.

There is 1 algorithmic difference I'd be concerned about, from looking at the two outputs. New optking doesn't consider the most recent step in the hessian updating procedure - cpp-optking did. I've found three pieces of logic that explicitly or implicitly prevent updating with the most recent step in all or specific cases. I'd have to ask @psi-rking if this is a bug or was changed due to some stability concern.

Is .log where all the optking detailed output goes now? I guess we ought to start collecting them. output.log, perhaps. But this can also be deferred to a grand regenerate-the-refs script and PR.

Yes optking's detailed output goes to .log but it isn't strictly speaking necessary for the user to see the detailed logs I would say. However, if the test is failing it might be nice to have a more detailed optimization record to compare against if the optimizer is at fault.

@JonathonMisiewicz
Copy link
Contributor

Just updating the reference is fine with me, in that case.

@psi-rking
Copy link
Contributor

psi-rking commented Sep 29, 2022

There is 1 algorithmic difference I'd be concerned about, from looking at the two outputs. New optking doesn't consider the most recent step in the hessian updating procedure - cpp-optking did. I've found three pieces of logic that explicitly or implicitly prevent updating with the most recent step in all or specific cases. I'd have to ask @psi-rking if this is a bug or was changed due to some stability concern.

Interesting. This does not ring a bell for me. I take the question to be "given a hessian and the current forces (beyond the first step) should you update the hessian with those forces before using them to calculate the step?" Can you point to the code? IDK, probably best answer is whatever works better in practice and doesn't cause problems.

Actually, it's possible I did this because I learned to avoid hessian updating when displacements are very small (or the geometries are very close). Perhaps I didn't want to update hessian until I knew the step size.

@AlexHeide
Copy link
Contributor Author

I may not be following this right. I'd say the state of the psi4 active mol during an optimization is arbitrary -- whatever works for you. In cpp-optking, I think the communication was through legacymolecule anyways. The molecule optking is acting on should be updated by the time control returns to the user at the end of the opt. I thought this was already happening through https://github.com/psi4/psi4/pull/2727/files#diff-acf663ccea13592c4c656cf89c7b62e6f5bd3b2e8b4f12ba354129bd39d096f8R1296-R1297 . That's consistent with cpp-optking, and I think that must be happening b/c many of the tests check NRE before and after opt.

I want to make sure that the active molecule behavior is as expected. Whatever molecule the driver uses will be updated. This will be either the active molecule OR the passed molecule. If the molecule is passed the active molecule is not updated in any way. All the asserts pass here.

import math

molecule h2o {
    pubchem:water
}

molecule h2o2 {
    pubchem:hydrogen peroxide
}

# quick comparison. h2o2 is active molecule
h2o2_nre = h2o2.nuclear_repulsion_energy()
active_nre = core.get_active_molecule().nuclear_repulsion_energy()

assert math.isclose(h2o2_nre, active_nre)

# optimize h2o2 (active molecule) expect repulsion energy to match
E = optimize("scf/sto-3g")
h2o2_opt_nre = h2o2.nuclear_repulsion_energy()
active_nre = core.get_active_molecule().nuclear_repulsion_energy()

assert math.isclose(h2o2_opt_nre, active_nre)

# optimize h2o. nuclear repulsion does not match. active molecule is still h2o2
# Currently (next commit will fix) h2o2 would be optimized by this call. (optking side issue)
E = optimize("scf/sto-3g", molecule=h2o)
h2o_opt_nre = h2o.nuclear_repulsion_energy()
active_nre = core.get_active_molecule().nuclear_repulsion_energy()

assert math.isclose(h2o2_opt_nre, active_nre)

The whole wrong molecule being optimized thing is an optking side issue that is fixed on optking/master.

@loriab loriab added this to the Psi4 1.7 milestone Sep 30, 2022
@loriab loriab added optking external-interface For issues about interfaces with external programs: ADCC, CheMPS2, GDMA, MRCC... driver For issues that are specifically about Psi's driver. labels Sep 30, 2022
Copy link
Member

@loriab loriab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the legacy cleanup! I've now looked over everything but the actual meat, driver.py . That's next.

@@ -0,0 +1,32 @@
if(NOT (${CMAKE_DISABLE_FIND_PACKAGE_optking}))
include(FindPythonModule)
find_python_module(optking ATLEAST 1.0.0 QUIET)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A note so reminded to switch back before merge. And switch the tarball to a tag.

Copy link
Member

@loriab loriab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking good to me, incl. driver.py. There's a few code suggestions, and I think you're still working on tweaking some test refs, then I think its good to merge for broader trial.

@AlexHeide AlexHeide force-pushed the optking branch 2 times, most recently from 53006da to 54af027 Compare November 22, 2022 23:18
@JonathonMisiewicz
Copy link
Contributor

Let me be more explicit about why I find the "multi-fragment optimizations" section confusing.

  • The first sentence talks about "the metric for connecting atoms" without explaining what "connecting atoms" means. Even worse, the first sentence is not obviously about dimers.
  • It isn't clear to me what a "reference point" signifies. While I can tell whether something is an acceptable reference point, what are these used for? It looks like these are atoms used to define intermolecular internal coordinates.
  • I don't know what it means to talk about a linear combination of atoms, or how to interpret [[3], [1], [2]], [[1, 2, 3, 4, 5, 6], [2], [6]] in your first example.

The third point is crucial - I can't follow your examples. I won't insist on a figure, but I do insist on examples I can understand.


extern std::map<std::string, plugin_info> plugins;

namespace opt {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Farewell old friend!

core.def("set_active_molecule", py_psi_set_active_molecule, "molecule"_a,
"Activates a previously defined *molecule* in global memory so next computations use it.");
core.def("get_active_molecule", &py_psi_get_active_molecule, "Returns the currently active molecule object.");
core.def("set_legacy_molecule", py_psi_set_legacy_molecule, "molecule"_a,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and good riddance.

compare_values(nre3_ref[1], variable('nuclear repulsion energy'), 3, '[2] NRE: zmat, external values, w/dummy') #TEST
clean()
opt_clean()
# molecule NH3 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why aren't we running this test? As long as there are no Cartesian constraints, we can still optimize it, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep we have tests elsewhere for NH3. Only a small amount of the tests in mints5 are active to begin with. For this one we would either need to just copy new optkings nuclear repulsion energy into the test, which is kinda cheating, or remove it for now. At some point, we should come back and try to make these tests 1. more reasonable and 2. repeatable. Unfortunately, when I started working on 1 cfour refused to converge the scf. So I went with removal for any that were failing.

@@ -17,6 +17,7 @@ set {
scf_type pk
basis 6-31G(d)
reference rhf
ensure_bt_convergence True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, if this keyword is necessary, then convergence could also be achieved by reducing the starting or maximum step size.

from addons import *

@ctest_labeler("opt;cart;noc1")
def test_opt14():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops

Setting |optking__frag_mode| to `multi` will now add a special
set of intermolecular coordinates between fragments - internally referred to as DimerFrag
coordinates. For a set of molecular fragments, a set of reference points are chosen on each
fragment. The reference points may be an atom or a linear combination of atomic positions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fragment. The reference points may be an atom or a linear combination of atomic positions.
fragment. Each reference point will be either an atom or a linear combination of positions of atoms within that fragment.

these reference points. Up to six reference points will be created (depending upon the number
of atoms in each fragment). See :ref:`Dimer coordinate table <table:DimerFrag>` for how
reference points are created. For a set of three dimers A, B, and C, sets of coordinates are
created between each pair: AB, AC, and BC. Each DimerFrag will likely use different reference
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
created between each pair: AB, AC, and BC. Each DimerFrag will likely use different reference
created between each pair: AB, AC, and BC. Each DimerFrag may use different reference

coordinates. For a set of molecular fragments, a set of reference points are chosen on each
fragment. The reference points may be an atom or a linear combination of atomic positions.
Stretches, bends, and dihedral angles between two of the fragments will be created using
these reference points. Up to six reference points will be created (depending upon the number
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest deleting the sentence that begins with "Up to six..."

points. Creation of the intermolecular coordinates can be controlled through
|optking__interfrag_mode|, |optking__frag_ref_atoms|, and |optking__interfrag_coords|.

.. note:: Manually creating the interfragment coordinates is meant for power users.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.. note:: Manually creating the interfragment coordinates is meant for power users.
.. note:: Manual specification of the interfragment coordinates is supported for power users, and provides complete control of fragments' relative orientation.

set {
basis 6-31+G
frag_mode MULTI
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
frag_mode MULTI
frag_mode MULTI
The line below specifies the reference points that will be used to construct the interfragment coordinates between the two fragments (called A and B). The format is the following:
[[A-1], [A-2], [A-3]], [[B-1], [B-2], [B-3]]
In terms of atoms within each fragment, the line below chooses, for water:
H3 of water for the first reference point, O1 of water for the second reference point, H2 of water for the third reference point
and for benzene:
the mean of the positions of all the C atoms, C2, one of the Carbon atoms, C6, another one of the carbon atoms.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlexHeide , I am concerned that this example may now not be correct. In order to support trimers, etc., I changed the atoms numbering to the total atom number - not the number within the fragment. See e.g., test_dimers_ne2.py for the simplest example that also uses frag_ref_atoms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. Tests got updated but my examples didn't. Good catch.

@loriab
Copy link
Member

loriab commented Dec 5, 2022

looks like typos in options snagging up the docs:

/home/runner/work/psi4/psi4/code/objdir/doc/sphinxman/source/optking.rst:359: ERROR: Undefined substitution referenced: "optking_frag_ref_atoms".
/home/runner/work/psi4/psi4/code/objdir/doc/sphinxman/source/optking.rst:359: ERROR: Undefined substitution referenced: "otking__frag_ref_atoms".

Copy link
Contributor

@JonathonMisiewicz JonathonMisiewicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Way clearer, thank you. Two minor typos I spotted and would like fixed before merge-in.

I'll leave final merge to Lori.


.. _DimerSection:

In previous versions of optking, the metric for connecting atoms was increased until all atoms,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In previous versions of optking, the metric for connecting atoms was increased until all atoms,
In previous versions of optking, the metric for connecting atoms was increased until all atoms

set of intermolecular coordinates between fragments - internally referred to as DimerFrag
coordinates (see `here <DimerIntro_>` for the brief description).
For each pair of molecular fragments, a set of up to 3 reference points
are chosen on each fragment. Each reference points will be either an atom or a linear combination
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
are chosen on each fragment. Each reference points will be either an atom or a linear combination
are chosen on each fragment. Each reference point will be either an atom or a linear combination

@loriab loriab merged commit e6e8deb into psi4:master Dec 5, 2022
@loriab loriab mentioned this pull request Apr 18, 2023
21 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

driver For issues that are specifically about Psi's driver. external-interface For issues about interfaces with external programs: ADCC, CheMPS2, GDMA, MRCC... optking

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants