Skip to content

Clone Solution objects used by Reactor / ReactorSurface#1921

Merged
speth merged 23 commits intoCantera:mainfrom
speth:clone-solution
Oct 8, 2025
Merged

Clone Solution objects used by Reactor / ReactorSurface#1921
speth merged 23 commits intoCantera:mainfrom
speth:clone-solution

Conversation

@speth
Copy link
Copy Markdown
Member

@speth speth commented Jul 5, 2025

Changes proposed in this pull request

  • Partially implements changes envisioned in Eliminate traps associated with accessing ThermoPhase objects in use by Reactors enhancements#201:
    • Implements clone method for Solution, ThermoPhase, Kinetics, and Transport, based on AnyMap serialization
    • constructors for Reactor objects now take an optional clone argument, which determines whether the Solution object used should be cloned. The default is currently false, which will raise a warning. After Cantera 3.2, this default will be changed to true, with no warning (so, for the duration of Cantera 3.2, this argument is not optional if you don't want a warning).
    • ReactorNet::initialize checks all Solution and Interface objects used by Reactor and ReactorSurface objects in the network to make sure that they are all unique. If not, a warning is issued. This warning will be an error after Cantera 3.2.
  • Provides user interface side of Improve ReactorNet handling of bulk phase interactions enhancements#202
    • Constructing cloned Interface objects requires having access to the (probably cloned) Solution objects for the adjacent phases at the time the ReactorSurface is created, which necessitates the introduction of a new constructor for ReactorSurface. In doing so, we also resolve the issue of linking a single ReactorSurface to multiple adjacent Reactors.
  • Include reaction rate multipliers in serialized Solution state.

If applicable, fill in the issue number this pull request is fixing

After Cantera 3.2, when we can guarantee that the ReactorNet contains no re-used Solution objects, there is additional refactoring to the logic for how ReactorNet::eval works that can be implemented to complete these two enhancements.

Checklist

  • The pull request includes a clear description of this code change
  • Commit messages have short titles and reference relevant issues
  • Build passes (scons build & scons test) and unit tests address code coverage
  • Style & formatting of contributed code follows contributing guidelines
  • The pull request is ready for review

@speth speth added the Reactors label Jul 5, 2025
@speth speth force-pushed the clone-solution branch 3 times, most recently from 76fc4be to 2038ba6 Compare July 5, 2025 22:23
@codecov
Copy link
Copy Markdown

codecov bot commented Jul 5, 2025

Codecov Report

❌ Patch coverage is 74.26471% with 70 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.58%. Comparing base (99550d0) to head (3d6ff95).
⚠️ Report is 40 commits behind head on main.

Files with missing lines Patch % Lines
src/zeroD/ReactorFactory.cpp 45.65% 25 Missing ⚠️
src/zeroD/ReactorSurface.cpp 70.27% 5 Missing and 6 partials ⚠️
src/zeroD/ReactorBase.cpp 63.15% 7 Missing ⚠️
interfaces/cython/cantera/reactor.pyx 84.21% 6 Missing ⚠️
src/oneD/MultiNewton.cpp 0.00% 5 Missing ⚠️
src/zeroD/Reactor.cpp 37.50% 5 Missing ⚠️
src/kinetics/Kinetics.cpp 89.74% 0 Missing and 4 partials ⚠️
src/base/Solution.cpp 90.32% 1 Missing and 2 partials ⚠️
src/zeroD/ReactorNet.cpp 88.23% 0 Missing and 2 partials ⚠️
include/cantera/transport/Transport.h 0.00% 1 Missing ⚠️
... and 1 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1921      +/-   ##
==========================================
- Coverage   75.58%   75.58%   -0.01%     
==========================================
  Files         451      451              
  Lines       56362    56550     +188     
  Branches     9299     9331      +32     
==========================================
+ Hits        42602    42744     +142     
- Misses      10616    10648      +32     
- Partials     3144     3158      +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@speth speth force-pushed the clone-solution branch 3 times, most recently from 9244690 to f410da9 Compare July 7, 2025 17:59
@speth
Copy link
Copy Markdown
Member Author

speth commented Jul 7, 2025

With this implementation now largely working, I have a couple of observations in relation to the discussion in Cantera/enhancements#201:

  • We cannot rely on always cloning the Solution objects used in a reactor network. The most severe issue with this is that CustomRate objects implemented as Python lambdas cannot be serialized. This is currently the reason the custom_reactions.py example fails to run.
  • We cannot make cloning Solution objects the default behavior in this Cantera version. Based on the changes to various test cases and examples, it is clear that there is a fair amount of existing code that relies on the incidental sharing of Solution states. While identifying these cases in the test suite is relatively easy, it is not so easy elsewhere and would be a significant breaking change.

As such, I think the next steps for this PR are:

  • Implement an additional clone argument to the Reactor and ReactorSurface constructors, which will default to None, In Cantera 3.2, if no value for this argument is provided, the Solution will not be cloned and a warning will be issued. In Cantera 3.3, we can make clone=True the default, while clone=False will still be available for special cases.
  • Add a check during ReactorNet::initialize to identify whether any Solution objects are being used multiple times within the network. In Cantera 3.2, doing so will result in a warning, which can then be converted to an error in Cantera 3.3.

@ischoegl
Copy link
Copy Markdown
Member

@speth ... thanks for your work on this, as well as the thoughts for the roadmap, which looks 👍.

I was somewhat surprised that CustomRate objects with Python lambdas cause failures, which is not among the consequences that I would have expected. On second thought, I assume that it has to do with those not being part of the YAML serialization, and thus cannot be cloned? If this were the case, I'd suggest simply throwing an error if there's an attempt to clone a mechanism that includes CustomRate (I have not started a review, so it may already be there). I assume this is purely limited to Python lambdas, as C++ has no way of ensuring that Python objects persist in memory? I.e., CustomRate with Func1 objects should work?

Beyond, it appears that #1923 fixes other sample tests that are currently failing.

@speth speth force-pushed the clone-solution branch 3 times, most recently from 87263f0 to b30373c Compare July 11, 2025 18:34
@speth
Copy link
Copy Markdown
Member Author

speth commented Jul 11, 2025

I was somewhat surprised that CustomRate objects with Python lambdas cause failures, which is not among the consequences that I would have expected. On second thought, I assume that it has to do with those not being part of the YAML serialization, and thus cannot be cloned?

I didn't anticipate this either, but it's pretty clear in retrospect. Indeed, the issue is that such lambdas cannot be serialized.

If this were the case, I'd suggest simply throwing an error if there's an attempt to clone a mechanism that includes CustomRate (I have not started a review, so it may already be there). I assume this is purely limited to Python lambdas, as C++ has no way of ensuring that Python objects persist in memory? I.e., CustomRate with Func1 objects should work?

This affects all CustomRate objects, including ones built on Func1, since the CustomFunc1Rate::getParameters method is not implemented (or rather, explicitly raises a NotImplementedError). In principle, we could implement serialization of Func1-based rates, but it's not strictly necessary. The fallback is for the user to instantiate independent Solution objects and call the Reactor constructor with the clone=False option.

Likewise, ExtensibleRate objects that don't implement the set_parameters / get_parameters pair will require this alternative approach, as will any other user-defined model that doesn't fully implement serialization.


The "next steps" I mentioned before are now completed, but I'm currently struggling with how to get this to play nice with the generated CLib. I added a new method, shared_ptr<Solution> ReactorBase::solution() for accessing the Solution object used by a Reactor. Depending on whether a clone was created or not, this is either the new cloned object or just the same object that was passed into the constructor. In the first case, I understand why CLib is struggling -- that cloned Solution isn't already in the cabinet, but I'm not sure what the right way to update the code generation for that is. In the second case, this accessor also doesn't seem to be working, and I can't figure out why.

@ischoegl
Copy link
Copy Markdown
Member

ischoegl commented Jul 12, 2025

Thanks, @speth for further context.

[...] In principle, we could implement serialization of Func1-based rates, but it's not strictly necessary. The fallback is for the user to instantiate independent Solution objects and call the Reactor constructor with the clone=False option.
Likewise, ExtensibleRate objects that don't implement the set_parameters / get_parameters pair will require this alternative approach, as will any other user-defined model that doesn't fully implement serialization.

It should be pretty straight-forward to come up with Func1 serialization so things work out of the box. For ExtensibleRate, it's obviously up to the users. This clearly goes beyond this PR.

The "next steps" I mentioned before are now completed, but I'm currently struggling with how to get this to play nice with the generated CLib. I added a new method, shared_ptr<Solution> ReactorBase::solution() for accessing the Solution object used by a Reactor. Depending on whether a clone was created or not, this is either the new cloned object or just the same object that was passed into the constructor. In the first case, I understand why CLib is struggling -- that cloned Solution isn't already in the cabinet, but I'm not sure what the right way to update the code generation for that is. In the second case, this accessor also doesn't seem to be working, and I can't figure out why.

Hmm. This makes sense - it wasn't straightforward to get the initial accessors to work properly (most relevant example: sol_adjacent); I managed to address this with the uses field. While the Python code should be relatively easy to follow (I did go back and revised), the conditionals in Jinja are quite difficult to make readable (this may be improved by Cantera/enhancements#232).

At the same time, the existing logic may be sufficient to handle this situation using your new shared_ptr<Solution> ReactorBase::solution() method. The cabinet entries should be created during constructor calls for Reactor and Domain1D objects. Fortunately, the Jinja logic for constructors is among the more straightforward/better commented ones:

clib-constructor: |-
  ## CLib constructor template: instantiates new object
  // constructor: {{ cxx_wraps }}
  try {
      {% for line in lines %}
      ## add lines used for CLib/C++ variable crosswalk
      {{ line }}
      {% endfor %}
      {% if shared %}
      ## instantiated object manages associated objects (see: sol3_newSolution)
      ## storage of associated objects requires id (handle) of new object
      auto obj = {{ cxx_name }}({{ ', '.join(cxx_args) }});
      int id = {{ base }}Cabinet::add(obj);
      {% for typ, getter in shared %}
      ## add associated objects (see: sol3_newSolution, sol3_adjacent)
      if (obj->{{ getter }}()) {
          {{ typ }}Cabinet::add(obj->{{ getter }}(), id);
      }
      {% endfor %}
      return id;
      {% else %}{# not shared #}
      ## instantiated object has no associated objects; no id is needed
      return {{ base }}Cabinet::add({{ cxx_name }}({{ ', '.join(cxx_args) }}));
      {% endif %}{# shared #}
  } catch (...) {
      return handleAllExceptions({{ error[0] }}, {{ error[1] }});
  }

Specifically, you'd need to get the Solution object clones into the following portion via the shared argument:

      {% for typ, getter in shared %}
      ## add associated objects (see: sol3_newSolution, sol3_adjacent)
      if (obj->{{ getter }}()) {
          {{ typ }}Cabinet::add(obj->{{ getter }}(), id);
      }
      {% endfor %}

As the cloned Solution is a managed object, the CLib destructors need to use the uses field as well.

As an aside, I noticed that there are some stray sol3_* references left in the ## Jinja comment blocks 😢 (should be sol_*).

PS: In a nutshell, you may need to update ctreactor_auto.yaml as follows:

- name: new
  wraps: newReactorBase
  uses: [solution]
  [...]
- name: del
  uses: [solution]

I haven't tested it, though.

void setSolution(shared_ptr<Solution> sol);

//! Access the Solution object used to represent the contents of this reactor.
shared_ptr<Solution> solution() { return m_solution; }
Copy link
Copy Markdown
Member

@ischoegl ischoegl Jul 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other contexts (specifically Python), we use contents. I don't have strong opinions, though.

Fwiw, we're not consistent in Python, either:

    property thermo:
        """
        The `ThermoPhase` object representing the reactor's contents.
        """
        def __get__(self):
            self.rbase.restoreState()
            return self._contents

The docstring here isn't correct any longer as it has become the Solution object (it refers to the same object as the new C++ method). The situation is the also same for an existing kinetics property. It still all works due to inheritance, but is overall misleading.

Obviously, this is a pre-existing condition. Introducing the C++ method does, however, provide an opportunity to make things consistent across all APIs.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, ReactorBase::contents already exists, returning a ThermoPhase&.

I think getting to a consistent state with contents as the name will take a couple of iterations (versions) -- we can deprecate the existing ReactorBase::contents now, along with the Python ReactorBase.thermo and ReactorSurface.kinetics properties, but at least for now we need a distinct name for this method.

Copy link
Copy Markdown
Member

@ischoegl ischoegl Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. I'd say use ReactorBase::contents4 as the intermediate name so it's clear that it's temporary? Having consistent naming is one of my pet peeves ...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand what the "4" means in this context, and would rather stick with "solution" as a reasonably interpretable name for now, with a "TODO" to rename it later. Do you want me to introduce the other changes as part of this PR or as a separate follow-on?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 4 is a placeholder similar to the 3 I have used extensivly during transitions. Right now, we have a Reactor4 ... my idea is that this is a feature that goes closer to a future Cantera, rather than the current major version. I'm 👍 with solution but would like to deprecate ReactorBase::contents regardless. This PR or another one - up to you as long as it goes in for 3.2.

@speth
Copy link
Copy Markdown
Member Author

speth commented Aug 24, 2025

Hmm. This makes sense - it wasn't straightforward to get the initial accessors to work properly (most relevant example: sol_adjacent); I managed to address this with the uses field. While the Python code should be relatively easy to follow (I did go back and revised), the conditionals in Jinja are quite difficult to make readable (this may be improved by Cantera/enhancements#232).

Thanks for pointing me to the uses field, and the template that's use for the newReactorBase method. This seems to get some of the necessary behavior right with minimal effort. However, it's not quite there for both cases of either cloning or not cloning the Solution:

  • In the case where the Solution object is not cloned, this results in an additional reference to the Solution being added to the cabinet, with a new ID. This new Solution ID then cannot be used to obtain the thermo member using sol_thermo.
  • In the case where the Solution is cloned, we get a new ID which is correctly retrieved via reactor_solution. However, the components of this Solution like the thermo object are not themselves being added to the corresponding cabinets, again causing sol_thermo to fail, and I'm not sure where that logic should live.

I think I don't fully understand what the whole "parent" logic within Cabinet does, which is maybe contributing to my confusion here. The current situation is a bit complicated by the fact in one case, the Solution should be "owned" by the Reactor and (presumably) deleted when the Reactor is deleted, but in the other case the Solution should outlive the Reactor.

@speth speth marked this pull request as ready for review September 29, 2025 22:29
@speth
Copy link
Copy Markdown
Member Author

speth commented Sep 29, 2025

@ischoegl, thanks for the help sorting out the new clib aspect of this. I think this is ready for review now.

The complaint from Codecov is at least partly incorrect, due to it miscounting coverage of the lambda functions in ReactorFactory, and much of the other uncovered code is in deprecated methods that are only needed for transitional purposes.

Copy link
Copy Markdown
Member

@ischoegl ischoegl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@speth ... thanks for taking this on: this is a pretty important feature.

I have minor comments and one larger issue: I believe that long-term, we'd like to keep treating a ReactorSurface just like any other reactor, and instead implement a new Connector to object to create the linkage? I personally believe that this would a cleaner solution, but would require severing the internal stacking of the work vector. This would likely happen in a future release, but would also reverse some of the current deprecation decisions.

void setSolution(shared_ptr<Solution> sol);

//! Access the Solution object used to represent the contents of this reactor.
shared_ptr<Solution> solution() { return m_solution; }
Copy link
Copy Markdown
Member

@ischoegl ischoegl Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. I'd say use ReactorBase::contents4 as the intermediate name so it's clear that it's temporary? Having consistent naming is one of my pet peeves ...

Comment on lines +44 to +45
//! @deprecated To be removed after Cantera 3.2. Replaced by constructor where
//! contents and adjacent reactors are specified
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we should go there yet. Python has a 'pending deprecation' ... I'd prefer to use a comment like this here.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is impossible to clone the Interface object without access to the Solution objects which define the adjacent phases.

While I suppose making these links later is possible in the case where the Interface object isn't being cloned, I think requiring these Solution objects to be provided at construction time is consistent with the changes that you've introduced to require Solution objects when constructing Reactors and requiring Reactor objects to be provided when constructing Connectors.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This goes back to me viewing ReactorNet as a bipartite graph, where ReactorBase specializations are connected by Connector specializations. Ultimately, I envision using a BiLayer (or similar) object to establish the connection. From that perspective, this isn't an extrapolation of what I've done before ... I'm 👍 with this for the time being, but the deprecation may be temporary.

It is true that cloning is impossible without access, but this is due to the solution vector being stacked in a way that I don't think is ideal.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the coupling between phases involved in an interface kinetic mechanism is intrinsically a tighter relationship than that, but we can continue to work on how this is represented after Cantera 3.2. In any case, this PR provides a necessary first step: the ability to simultaneously set all of the Solution and Interface objects involved in a ReactorNet to match the state of the network, and to then evaluate properties and rates as needed.

Copy link
Copy Markdown
Member

@ischoegl ischoegl Oct 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, referencing #1874. Definitely beyond the scope of this PR.

@ischoegl
Copy link
Copy Markdown
Member

ischoegl commented Sep 30, 2025

P.S.: Thinking about this some more, I'm curious about the rationale for not using Solution clones in the future: do you see any applications where people would like to use a single copy? I'm honestly leaning towards just using clones (i.e., no shared objects). Apart from Python users, whom we can warn with the None workaround, anyone using implicit sharing as a 'feature' is in for a surprise once the default changes to 'clone'. I'd rather just rip that band-aid off and avoid introducing the additional parameter ...

@speth
Copy link
Copy Markdown
Member Author

speth commented Sep 30, 2025

P.S.: Thinking about this some more, I'm curious about the rationale for not using Solution clones in the future: do you see any applications where people would like to use a single copy? I'm honestly leaning towards just using clones (i.e., no shared objects).

Yes, I believe there are cases where not cloning the Solution object is preferable. Besides the relatively small performance benefit, the main reason is to handle cases where the Solution is not (correctly) serializable. There are currently several places where this may happen:

  • Any use of CustomRate, where the rate is defined as a Python lambda function
  • Any use of ExtensibleRate where the user has not gotten as far as implementing the get_parameters and set_parameters methods to allow their rate class to be serialized. While users should be able to implement these methods, I think it is an unnecessary burden in some cases, and it would significantly increase the complexity of just trying out a new rate parameterization to see if it's useful before going to the effort of implementing these additional utility methods.
  • Any in-development thermodynamic or kinetics/reaction models implemented in C++, where again, it should be possible to start with implementing the methods that describe the physics rather than having to have all of the utility methods working right from the start.

Apart from Python users, whom we can warn with the None workaround, anyone using implicit sharing as a 'feature' is in for a surprise once the default changes to 'clone'. I'd rather just rip that band-aid off and avoid introducing the additional parameter ...

What existing (as of Cantera 3.1) use case doesn't provide a warning or notification about this change?

  • make_shared<Reactor>(soln, name) issues a deprecation warning about "clone argument not specified and notes the pending change in behavior"
  • newReactor("Reactor", soln, name) issues a deprecation warning about the return type change and redirects the user to newReactor4 where the clone argument is added. I guess the warning here should also mention the change in the cloning behavior, given the path is not removal of this method but to change the existing behavior.
  • newReactor4 is "new in Cantera 3.2" and defaults to clone=true
  • newReservoir is "new in Cantera 3.2" and defaults to clone=true
  • newReactorSurface is "new in Cantera 3.2" and defaults to clone=true

Copy link
Copy Markdown
Member

@ischoegl ischoegl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@speth ... thanks for your detailed comments: your explanations address most of my concerns. The only sticking points are the partial cloning option, where I'd rather see a clean break, and the deprecation of ReactorBase::contents, which can be addressed in a follow-on PR. Regarding the pending deprecations, I explained that I envision an alternative way of attaching a Reactor to a ReactorSurface with a future Connector object at some point in the future. If we need to reactivate what's deprecated now is necessary, so be it.

Regarding your feedback on my 'PS' - yes, with the C++ deprecation warnings in place, this is good by me.

void setSolution(shared_ptr<Solution> sol);

//! Access the Solution object used to represent the contents of this reactor.
shared_ptr<Solution> solution() { return m_solution; }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 4 is a placeholder similar to the 3 I have used extensivly during transitions. Right now, we have a Reactor4 ... my idea is that this is a feature that goes closer to a future Cantera, rather than the current major version. I'm 👍 with solution but would like to deprecate ReactorBase::contents regardless. This PR or another one - up to you as long as it goes in for 3.2.

Comment on lines +44 to +45
//! @deprecated To be removed after Cantera 3.2. Replaced by constructor where
//! contents and adjacent reactors are specified
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This goes back to me viewing ReactorNet as a bipartite graph, where ReactorBase specializations are connected by Connector specializations. Ultimately, I envision using a BiLayer (or similar) object to establish the connection. From that perspective, this isn't an extrapolation of what I've done before ... I'm 👍 with this for the time being, but the deprecation may be temporary.

It is true that cloning is impossible without access, but this is due to the solution vector being stacked in a way that I don't think is ideal.

Copy link
Copy Markdown
Member

@ischoegl ischoegl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only remaining concern was to start the deprecation cycle for ReactorBase::contents so we can start moving things. I'm 👍 for this to be a separate PR.

@speth
Copy link
Copy Markdown
Member Author

speth commented Oct 7, 2025

My only remaining concern was to start the deprecation cycle for ReactorBase::contents so we can start moving things. I'm 👍 for this to be a separate PR.

I'm happy to add that to this PR; just haven't had time to work on this in the past week. Please hold off on merging this.

Actually, scratch that. I started in on migrating ReactorBase::contents along with renaming the Python Reactor.thermo and ReactorSurface.kinetics methods to also be contents, and it touches a lot of code, so I'll take that up in a separate PR.

@speth speth merged commit b614984 into Cantera:main Oct 8, 2025
49 of 51 checks passed
@speth speth deleted the clone-solution branch October 8, 2025 15:01
@ischoegl
Copy link
Copy Markdown
Member

@speth ... in order to handle this consistently going forward, do you believe it to be necessary to extend cloning to Domain1D and SolutionArray?

@ischoegl ischoegl mentioned this pull request Oct 13, 2025
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Negative molar fractions appear when using ReactorSurface

2 participants