Skip to content

Conversation

@maflcko
Copy link
Member

@maflcko maflcko commented Jul 6, 2023

The timeout was unlimited before, so just restore that value for now: #27988 (comment) .

(Strictly speaking, this is a behavior change for the blockfilterindex and txindex tests, because it only restores the coinstatsindex behavior.)

The timeout was unlimited before, so just restore that value for now:
bitcoin#27988 (comment)
@DrahtBot
Copy link
Contributor

DrahtBot commented Jul 6, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK furszy, ajtowns, mzumsande

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28026 (test: bugfix, synchronize indexes synchronously by furszy)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Member

@furszy furszy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fabed7e

I'm a bit more inclined for #28026 but it is fine either way.

Note:
Would be good to mention in the PR description that this is a behavior change for the blockfilterindex and txindex tests. It only restores the coinstatsindex behavior.

@maflcko maflcko requested a review from mzumsande July 6, 2023 14:08
@maflcko
Copy link
Member Author

maflcko commented Jul 6, 2023

Thx, edited OP

@furszy
Copy link
Member

furszy commented Jul 6, 2023

Little note: I think that the only downside here with respect to increasing the timeout to a much higher number or #28026 is that the tests could run forever if the index m_synced flag is never set by the thread. I'm not sure how the CI will behave in this scenario, guess that we will realize that something like this happened just by reading the last executed test name.

@maflcko
Copy link
Member Author

maflcko commented Jul 6, 2023

We've been plagued by intermittent timeouts locally (for months/years) and now in CI (for days), but never a thread that never finished. So I think it should be evident that this or ##28026 should be preferred. After all, it is no different than any other deadlock or infinite loop anywhere else in the code.

@furszy
Copy link
Member

furszy commented Jul 6, 2023

Actually, it is not when the thread never finishes. The thread could have finished, but never set the m_synced flag to true. Which would make BlockUntilSyncedToCurrentChain never return true. Which could happen on all the index sync errors.

@ajtowns
Copy link
Contributor

ajtowns commented Jul 6, 2023

utACK fabed7e

Could also restore the original behaviour by doing something like:

void IndexWaitSynced(BaseIndex& index, std::chrono::microseconds timeout)
{
    const auto end = SteadyClock::now() + timeout;
    while (!index.BlockUntilSyncedToCurrentChain()) {
        Assert(timeout <= 0s || SteadyClock::now < end);
        UninterruptibleSleep(100ms);
    }
}

and calling IndexWaitSynced(index, 10s) normally, but IndexWaitSynced(index, -1s) for coinstatsindex.

@mzumsande
Copy link
Contributor

ACK fabed7e

as per #28026 (review) I like this approach more than running the sync synchronously.

Could also restore the original behaviour by doing something like:

The problem in the first place was that that the tests for txindex and blockfilterindex would have rare intermittent timeouts. So instead of making guesses about the slowest environment the tests could be run in I think it's ok to just have no timeout.

@DrahtBot DrahtBot removed the request for review from mzumsande July 6, 2023 15:07
@maflcko
Copy link
Member Author

maflcko commented Jul 6, 2023

Which could happen on all the index sync errors.

Good point. Though, that seems like a bigger problem of the unit tests not handling AbortNode/FatalError/StartShutdown at all?

@furszy
Copy link
Member

furszy commented Jul 6, 2023

Which could happen on all the index sync errors.

Good point. Though, that seems like a bigger problem of the unit tests not handling AbortNode/FatalError/StartShutdown at all?

The quickest fix would be to add a ShutdownRequested() call into the IndexWaitSynced() loop. And verify at the end of the function that the index class is synced by calling:

BOOST_CHECK(index.GetSummary().synced);

In this way, we can be guarded against the infinite loop. And also verify the index synchronization result (which must be always successful).

The "AbortNode/FatalError/StartShutdown" treatment on the tests topic seems to be a different problem that we should aboard in the future, yeah. Here the test main thread will actively-wait for a flag to be set, and the flag could never be set. So, even if we start handling "AbortNode/FatalError/StartShutdown" on the base tests class, the error would need to be handled by a different thread (not sure if we already have something for it).

@fanquake fanquake merged commit 299f17a into bitcoin:master Jul 7, 2023
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jul 7, 2023
fabed7e test: Restore unlimited timeout in IndexWaitSynced (MarcoFalke)

Pull request description:

  The timeout was unlimited before, so just restore that value for now: bitcoin#27988 (comment) .

  (Strictly speaking, this is a behavior change for the blockfilterindex and txindex tests, because it only restores the coinstatsindex behavior.)

ACKs for top commit:
  ajtowns:
    utACK fabed7e
  mzumsande:
    ACK fabed7e
  furszy:
    ACK fabed7e

Tree-SHA512: 66a878be58bbe53ad8e0c23f05569dd42df688be747551fbd202ada22d20a8285714e58fa2a71664deadb070ddf86cfad88c01042ff95ed26f6b40e4a10cec0a
@maflcko maflcko deleted the 2307-test-restore-timeout- branch July 8, 2023 09:19
ryanofsky added a commit to bitcoin-core/gui that referenced this pull request Jul 11, 2023
89ba890 test: indexes, fix on error infinite loop (furszy)

Pull request description:

  Coming from bitcoin/bitcoin#28036 (comment), I thought that we were going to fix it there but seems that got merged without it for some reason.

  As index sync failures trigger a shutdown request without notifying `BaseIndex::BlockUntilSyncedToCurrentChain` in any way, we also need to check whether a shutdown was requested or not inside 'IndexWaitSynced'.

  Otherwise, any error inside the index sync process will hang the test forever.

ACKs for top commit:
  MarcoFalke:
    lgtm ACK 89ba890
  jamesob:
    ACK 89ba890
  ryanofsky:
    Code review ACK 89ba890. Just comment update since last review

Tree-SHA512: 1f6daf34e51d3fbc802799bfa4ac0ef0d8f774db5f9e2f5d35df18a77679778475c94efc3da1fb723ebaf3583e4075e4a5cbe4a5104ad0c50e2b32076e247b29
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jul 12, 2023
89ba890 test: indexes, fix on error infinite loop (furszy)

Pull request description:

  Coming from bitcoin#28036 (comment), I thought that we were going to fix it there but seems that got merged without it for some reason.

  As index sync failures trigger a shutdown request without notifying `BaseIndex::BlockUntilSyncedToCurrentChain` in any way, we also need to check whether a shutdown was requested or not inside 'IndexWaitSynced'.

  Otherwise, any error inside the index sync process will hang the test forever.

ACKs for top commit:
  MarcoFalke:
    lgtm ACK 89ba890
  jamesob:
    ACK 89ba890
  ryanofsky:
    Code review ACK 89ba890. Just comment update since last review

Tree-SHA512: 1f6daf34e51d3fbc802799bfa4ac0ef0d8f774db5f9e2f5d35df18a77679778475c94efc3da1fb723ebaf3583e4075e4a5cbe4a5104ad0c50e2b32076e247b29
@bitcoin bitcoin locked and limited conversation to collaborators Jul 7, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants