Skip to content

Change valid-digits to min-timestep#1788

Merged
fsimonis merged 15 commits intoprecice:developfrom
NiklasKotarsky:changeSignificantDigitToSmallestTimeStep
Oct 26, 2023
Merged

Change valid-digits to min-timestep#1788
fsimonis merged 15 commits intoprecice:developfrom
NiklasKotarsky:changeSignificantDigitToSmallestTimeStep

Conversation

@NiklasKotarsky
Copy link
Copy Markdown
Contributor

@NiklasKotarsky NiklasKotarsky commented Sep 1, 2023

Main changes of this PR

Changes the tag valid-digits tag to a min-timestep tag. Also ensures that the same tolerance is used to check for the end of the time window in Storage, Bspline and BaseCouplingScheme.

Motivation and additional information

Resolves: #1712, #1751, #1776

Author's checklist

  • I used the pre-commit hook to prevent dirty commits and used pre-commit run --all to format old commits.
  • I added a changelog file with make changelog if there are user-observable changes since the last release.
  • I added a test to cover the proposed changes in our test suite.
  • For breaking changes: I documented the changes in the appropriate porting guide.
  • I sticked to C++17 features.
  • I sticked to CMake version 3.16.3.
  • I squashed / am about to squash all commits that should be seen as one.

Reviewers' checklist

  • Does the changelog entry make sense? Is it formatted correctly?
  • Do you understand the code changes?

Copy link
Copy Markdown
Contributor

@BenjaminRodenberg BenjaminRodenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice that this PR resolves so many problems 👍 What's missing in my opinion:

  • Can you write a test that reproduces the error you describe in #1776? If I understand everything correctly, the test should trigger the assertion mentioned in #1776, if you use develop and it should succeed, if you use this branch. This also helps us to not run into the same problem again in the future.
  • Please add corresponding documentation (changelog entry and porting guide)

@NiklasKotarsky
Copy link
Copy Markdown
Contributor Author

Yes, I also have to check that the errors raised in 1751 are actual precice errors with error messages. I should also at some point also add some documentation with the recomended value for min-timestep tag. For time adaptive solvers one would want to exchange the line "currentDt = dt > maxDt ? maxDt : dt" with "currentDt = dt > maxDt-minTimestep ? maxDt : dt" to avoid that the last time window is smaller than minTimestep.

@NiklasKotarsky
Copy link
Copy Markdown
Contributor Author

I also checked now that the added test ReadWriteScalarDataFirstParticipantHighTolerance fails on develop e256a11 with the same error message as #1776 if the min-timestep tag is exchanged to validDigits=1.

@BenjaminRodenberg BenjaminRodenberg added this to the Version 3.0.0 milestone Sep 4, 2023
@NiklasKotarsky NiklasKotarsky force-pushed the changeSignificantDigitToSmallestTimeStep branch 2 times, most recently from 76c3ae5 to cb572ea Compare September 11, 2023 09:16
@uekerman
Copy link
Copy Markdown
Member

@NiklasKotarsky Status-quo here is still that you trigger me once I should have a look, right? Or should I have a look now already?

@NiklasKotarsky
Copy link
Copy Markdown
Contributor Author

@NiklasKotarsky Status-quo here is still that you trigger me once I should have a look, right? Or should I have a look now already?

It might be ready for a first review. It is in a state where the bugs in #1751 and #1776 now emit proper preCICE errors with a recommendation of how to change the solvers to aviod them. I am still trying to figure out where to place the tag min-time-step-size in the config file, since it should only occur once in the whole file.

The implementation also currently forces all solvers to hit the end of the time window within numerical accuracy. The problem is that if a solver stops before reaching the end of the time window then the Bspline algorithm would need to extrapolate the remaining part of the time window, which is currently not supported. The simple fix would be to only return the last value, but fixing this here feels a little bit out of scope for this pull request.

@NiklasKotarsky NiklasKotarsky marked this pull request as ready for review September 18, 2023 14:24
Copy link
Copy Markdown
Member

@uekerman uekerman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks good, also the configuration. Appreciated 👍 (and sorry for my slow reply)
But, the implemented behavior now is different than what we had previously and not 100% what we need I am afraid of.

  • Previously: Participant can do do smaller time step sizes than eps, but preCICE does never return a next max time step size smaller than eps. If the remainder in the window is smaller than eps, preCICE simply rounds up to the window end (hence the term valid digits we used before).
  • Currently in this PR: Participant is simply not allowed to hand over a time step size smaller than eps. I am not sure if this is what we want. The motivation is to prevent participants from doing to small time step sizes (which could lead to stability problems sometimes). The current behavior one could also implement directly in the solver.

Fixing this should not be too complicated.

  • Could you then please add an integration test that uses a min-time-step-size larger than 1e-14?

@NiklasKotarsky
Copy link
Copy Markdown
Contributor Author

Implementation looks good, also the configuration. Appreciated 👍 (and sorry for my slow reply) But, the implemented behavior now is different than what we had previously and not 100% what we need I am afraid of.

* Previously: Participant can do do smaller time step sizes than eps, but preCICE does never return a next max time step size smaller than eps. If the remainder in the window is smaller than eps, preCICE simply rounds up to the window end (hence the term valid digits we used before).

* Currently in this PR: Participant is simply not allowed to hand over a time step size smaller than eps. I am not sure if this is what we want. The motivation is to prevent participants from doing to small time step sizes (which could lead to stability problems sometimes). The current behavior one could also implement directly in the solver.

I can change it such that participant can do do smaller time step sizes than eps
whithout preCICE throwing an error. The reason for removing the rounding up of the time window causes problems with the waveform iteration. The problem boils down to that we either have to set the time in the last sample to the end of the time window manually or we have to allow for extrapolation in the Bsplines interpolation, since the other solver otherwise could sample a time behind the last timestep. I have not yet found a nice way of making the old functionality consistent with the interpolation. There seems to be also something wrong with the Open-foam adapter related to waveform iterations and valid-digits, see precice/openfoam-adapter#305 (comment).

I also had a very hard time coming up with a situation where one would need to use eps and not fixing it on the solver side. The only valid use case I could come up with is some solvers use 16-bit floating point arithmetic and some solvers use 32 bit floating point arithmetic. I have no idea if this is a realistic use case?

@uekerman
Copy link
Copy Markdown
Member

The problem boils down to that we either have to set the time in the last sample to the end of the time window manually

This should be the way to go in my opinion.

I also had a very hard time coming up with a situation where one would need to use eps and not fixing it on the solver side.

Imagine you have a solver that has issues with too small time step sizes. Say you don't want a dt smaller than 1e-10. Now, it could happen that preCICE tells you that the last time step within a time window needs to be smaller than 1e-12. What do you do then. Of course, you could throw an error on the side of the solver, but that is no solution.
Instead, with this "valid digits / min time step" functionality, we can define in preCICE that the minimum should be 1e-10. Then, preCICE does no tell "do one more with 1e-12", but treats the last one the solver did as the window end already (i.e. constant extrapolation).
Understandable?

@NiklasKotarsky NiklasKotarsky force-pushed the changeSignificantDigitToSmallestTimeStep branch 3 times, most recently from 2be3946 to 22e03d2 Compare September 26, 2023 15:06
@fsimonis
Copy link
Copy Markdown
Member

Then, preCICE does no tell "do one more with 1e-12", but treats the last one the solver did as the window end already (i.e. constant extrapolation).

I don't think that we correctly handle this case in the coupling logic.
The $\epsilon$ derived by the valid-digits is used to check if $|\text{timeWindowSize}-\text{computedTime}| < \epsilon$.

if (hasTimeWindowSize()) {
const bool isAtWindowEnd = math::equals(getComputedTimeWindowPart(), getTimeWindowSize(), _eps);
return isAtWindowEnd;
} else { // using participant first method

We advance timeWindowStart by the computedTime instead of timeWindowSize:

_timeWindowStartTime += _computedTimeWindowPart;

_timeWindowStartTime += _computedTimeWindowPart;

This leads to a desync of the time across participants.
It also means that samples are not at the end of the time window.

@BenjaminRodenberg
Copy link
Copy Markdown
Contributor

Imagine you have a solver that has issues with too small time step sizes. Say you don't want a dt smaller than 1e-10. Now, it could happen that preCICE tells you that the last time step within a time window needs to be smaller than 1e-12. What do you do then. Of course, you could throw an error on the side of the solver, but that is no solution.
Instead, with this "valid digits / min time step" functionality, we can define in preCICE that the minimum should be 1e-10. Then, preCICE does no tell "do one more with 1e-12", but treats the last one the solver did as the window end already (i.e. constant extrapolation).
Understandable?

Couldn't we have a test representing this case? I think the use-case of the feature is still hard to understand and if I search for valid-digits in our code base, there are only two configs actually using this option. One is PrimaryRankSockets.xml (does not sound like the main purpose of this test is to test this feature) and the other is MultiCoupling.xml (as far as I understand already a bi-coupling scheme should be sufficient).

Writing a test also helps to more clearly define what we actually want here.

@NiklasKotarsky NiklasKotarsky force-pushed the changeSignificantDigitToSmallestTimeStep branch 5 times, most recently from 6fc2d35 to e7fd30f Compare September 27, 2023 16:15
@NiklasKotarsky
Copy link
Copy Markdown
Contributor Author

Nice catch @fsimonis, it should be possible to fix the coupling logic by adjusting the time in BaseCouplingScheme and moving the last sample to the end of the time window. However as far as I under stand it I do not have write access to the samples and stamples directly in baseCouplingScheme, which means I should not be able to move the samples directly. Therefore I would propose to just adding the last sample to the end of the time window as well, which should ensure that we do not have to extrapolate. The concern I have with this solution is that it introduces a time integration error, since preCICE is moving time steps around, which is not apparent or easily understandable from the user side. The proper way to avoid this integration error is to adjust the last time step that the solver takes and force it to hit the end of the time window exactly meaning to make the last timestep 10e-12 bigger, but I can also see the argument that this feature would be a nice convenience when that is difficult to achieve.

@uekerman
Copy link
Copy Markdown
Member

Couldn't we have a test representing this case?

Yes, that's what I am suggesting above.

The proper way to avoid this integration error is to adjust the last time step that the solver takes and force it to hit the end of the time window exactly meaning to make the last timestep 10e-12 bigger,

We cannot do this as this is already in the past. The solver only tells preCICE about the time step size it used after it did the time step.

@NiklasKotarsky
Copy link
Copy Markdown
Contributor Author

NiklasKotarsky commented Sep 28, 2023

I have implemented a test case now documenting the effects of a large min time step on the waveform iteration https://github.com/precice/precice/pull/1788/files#diff-0a12f7d580510ad191bb2b2d985a1ddc12bae9ef5f7b87c6187afeb6b1cd5164. The test currently fails on develop, whit the error ´PRECICE_ASSERT(math::equals(timesAscending(nTimeSteps - 1), time::Storage::WINDOW_END))´. I am still struggling a bit with accuracy of the interpolation, since it seems to me that the interpolation no longer conserves polynomials, but this might be a coding error in the feature.

@fsimonis
Copy link
Copy Markdown
Member

Regarding #1788 (comment)

The problem is not that the CouplingScheme controls the time, it's how the time is controlled, when, and why.
Having periods of incorrect function returns essentially means that we need something like #816 but for coupling schemes.

I still strongly suggest to keep time consistent. There is also no reason not to return more information in addComputedTime. Example:

struct TimeStepInfo {
  double userTimeStep;
  double adjustedTimeStep;
  bool   hasBeenAdjusted;
  bool   atEndOfTimeWindow;
};

Therefore I would suggest fixing the documentation and then merging here and doing this refactoring later when it is more clear if this is a useful feature or if one should change it or even remove it entirely.

No matter the decision, we need to clean this up before the release of v3 along the remaining coupling-related topics.

@fsimonis
Copy link
Copy Markdown
Member

I would also like to point out that no one has answered the question "Is this even necessary?" in my previous comment #1788 (comment)

@BenjaminRodenberg BenjaminRodenberg added the breaking change Breaks backwards compatibility and users need to act label Oct 16, 2023
@NiklasKotarsky
Copy link
Copy Markdown
Contributor Author

I would also like to point out that no one has answered the question "Is this even necessary?"

The only use case which I can come up with where it is necessary would be mixed precision, where one solver has a lower precision than the other. Other than that this would be more of a nice to have feature when the time step is difficult to change or if the solver itself does not really have a time step. For example a controller which has a fixed time step that interfaces to some FSI simulation.

I still strongly suggest to keep time consistent
We adjust the computed time inside firstExchange(), meaning that at least getTime() and getNextTimeStepMaxSize() are easy to use incorrectly between addComputedTime() and firstExchange().

If I understand this correctly then we have to modify the time in addComputedTime() in order to keep the time consistent?

I am a little bit unsure of how to proceed here. I could spend some time to move everything to advance in ParticipantIMPL and addComputedTime in order to make it more robust. The tricky part for me is that this feature will have unintended consequences especially for multicoupling with different time window lengths.

@uekerman
Copy link
Copy Markdown
Member

Is all of this necessary?

Yes, but it will not be a widely-used feature.

If [the solver] decides to make a step of getMaxTimeStepSize() - x, then it needs to deal with the fact that the next time step can be at most x.

The whole discussion goes a bit into a fundamental direction: "What is black-box coupling?" Typically, solvers have methods like solverDt = computeTimeStepSize(). We don't know what the method does exactly. Underlying might be a hard stability criterion. Meaning, the solver cannot simply do a time step with solverDt + eps.

What's very real are the following factors:

True, but an important effect of the feature is also to save computational time. The most expensive part is the solveTimeStep of the solver. We are avoiding a unnecessary (since too small) time step of the solver.

I would suggest we merge here and if necessary continue discussion in #1832 after v3.0.
This PR promises to fix critical bugs and should IMO, therefore, not be postponed further than really necessary.

@fsimonis
Copy link
Copy Markdown
Member

Other than that this would be more of a nice to have feature when the time step is difficult to change or if the solver itself does not really have a time step. For example a controller which has a fixed time step that interfaces to some FSI simulation.

&

The whole discussion goes a bit into a fundamental direction: "What is black-box coupling?" Typically, solvers have methods like solverDt = computeTimeStepSize(). We don't know what the method does exactly. Underlying might be a hard stability criterion. Meaning, the solver cannot simply do a time step with solverDt + eps.

If it is a hard criterion or the time step cannot be changed, then why wouldn't the solver prescribe the time window size using a serial scheme?

The current implementation artificially extends the time step, which is completely intransparent for the adapter and coupled solver.
preCICE only tells the solver how long the next time step can be, there is no way of the solver finding out what timestep preCICE actually took. This leads to a shift in the time points of the solver and preCICE.

Let's say that this repeats every time window, then this time shift can build up.
Is this a problem? Do we need some way of telling this to the adapter?

@NiklasKotarsky
Copy link
Copy Markdown
Contributor Author

NiklasKotarsky commented Oct 19, 2023

If it is a hard criterion or the time step cannot be changed, then why wouldn't the solver prescribe the time window size using a serial scheme?

Using participant first might be a good idea for some systems and setups, but it only works in serial coupling. Meaning that we need to provide something else for the parallel and multicoupling case. The min-time-step tag is more meant to be used to avoid round up errors that could arise when doing integer time steps or as a quick fix when the time step in the solver is difficult to modify.

Let's say that this repeats every time window, then this time shift can build up. Is this a problem? Do we need some way of telling this to the adapter?

We are assuming that deltaCorrection << deltaT. Using this feature we are going to make an additional time integration error that scales with deltaCorrection, which in most cases will be smaller than the time integration error. Therefore we should be able to just ignore this error completely on both the user and coupling side.

There might be a case for an api function in preCICE that returns the current time, since we might run into different round of errors in the solver and preCICE resulting in a time difference between preCICE and the solver. This would also cover the problems this feature might cause for the solver, but this is out of scope for the current pull request.

@fsimonis
Copy link
Copy Markdown
Member

After the extensive discussion here, I think the rationale for the implementation is sufficiently documented.

@NiklasKotarsky are there any more changes you would like to do?
If not, then I would volunteer to squash and merge this PR.

@NiklasKotarsky
Copy link
Copy Markdown
Contributor Author

After the extensive discussion here, I think the rationale for the implementation is sufficiently documented.

@NiklasKotarsky are there any more changes you would like to do? If not, then I would volunteer to squash and merge this PR.

No, this is ready to be merged.

@fsimonis
Copy link
Copy Markdown
Member

I'll deal with the conflicts and merge the PR.

@fsimonis fsimonis merged commit 077e881 into precice:develop Oct 26, 2023
@NiklasKotarsky NiklasKotarsky deleted the changeSignificantDigitToSmallestTimeStep branch October 26, 2023 14:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking change Breaks backwards compatibility and users need to act

Projects

None yet

4 participants