Skip to content

Conversation

@jnewbery
Copy link
Contributor

This PR removes two sources of unreliability that were causing extended test cases to fail intermittently on travis:

  • bip9-softforks stop-starts bitcoind twice, but does not wait for the p2p connection to mininode to re-open. This would lead to race conditions where the following call to getblocktemplate() would sometimes fail because there were no p2p connections open to the node. This commit also removes the shutil.rmtree() call that was blatting the test_framework.log file.
  • forknotify asserts that an alert has been written to a file and fails immediately if the file is empty. This would lead to race conditions where the assert would sometimes be hit before bitcoind had written to the file.

Copy link
Contributor

@JeremyRubin JeremyRubin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utack! A couple suggestions but nothing major.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want the same behavior on L44?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait_for_verack() is called by the comparison test framework at the start of the test run (comptest.py line 300).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any work we can do in-between this and the wait for verack?

Maybe have a comment here about having to wait for verack rather than just deleting the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we need to wait for the network thread to start before receiving the verack (the network thread is what catches the verack and calls the callback).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a more idiomatic way to get the dirname?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I know of.

I don't like the fact that this test is using shutil and removing files itself. This change just makes it slightly less damaging (ie it's only removing the datadir rather than the entire tmpdir)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe raise the error yourself rather than assert, unless you only want it in debug mode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's any difference. No-one runs the test cases with -O

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using asserts as control flow is generally an antipattern (I think that the for/else loop is a bit clearer in any case).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah. I see what you're saying now. Yes, I'll remove the assert from the control flow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a bit cleaner to do something like (a couple of changes, feel free to pick and choose -- I was torn on the os.path.exists+getsize v.s. continual reading with open).

        for t in xrange(100):
            if os.path.exists(self.alert_filename) and os.path.getsize(self.alert_filename):
                break
            if t != 99: time.sleep(0.1) 
        else:
            raise AssertionError("-alertnotify did not warn of up-version blocks")
        with open(self.alert_filename, 'r', encoding='utf8') as f:
            alert_text = f.read()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is equivalent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah; mostly this was motivated by being able to get rid of the stateful timeout = 10; assert timeout > 0; timeout -= 0.1 combo.
This is also fine.

        for t in xrange(100):
            with open(self.alert_filename, 'r', encoding='utf8') as f:
                alert_text = f.read()
            if alert_text:
                break
            time.sleep(0.1) 
        else:
            raise AssertionError("-alertnotify did not warn of up-version blocks")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should rarely actually take 10 seconds, if the code is correct, right? Maybe 1 second will be a bit better in the case where it is actually broken & you're trying to fix it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends. Travis catches a lot of timing edge cases. I don't think there's any problem making the timeout something high like 10 seconds since the while loop will break as soon as the file is written.

bip9-sofforks.py stop-starts the bitcoind node twice during the test
run, but it doesn't wait for the connection from mininode to open before
continuing with the test. This leads to race conditions where the test
can fail getblocktemplate() because it has no p2p connections.
forknotify would intermittently fail because the alert file was not
being written fast enough. This commit adds a timeout so the test does
not fail immediately.
@jnewbery jnewbery force-pushed the extended_test_unreliablility branch from baeecbb to a4fd89f Compare March 28, 2017 20:22
@jnewbery
Copy link
Contributor Author

Pushed a new version using while/else and not using assert for control flow.

@maflcko
Copy link
Member

maflcko commented Apr 2, 2017

utACK a4fd89f. Going to merge this, so that the failures are a thing of the past.

time.sleep(0.1)
timeout -= 0.1
else:
assert False, "-alertnotify did not warn of up-version blocks"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnewbery nit: Any reason you changed the raise AssertionError to assert False?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No good reason. Equivalent behaviour but you're probably right that raise AssertionError is better style here.

@maflcko maflcko merged commit a4fd89f into bitcoin:master Apr 2, 2017
maflcko pushed a commit that referenced this pull request Apr 2, 2017
… tests

a4fd89f Make forknotify.py more robust (John Newbery)
1f3d78b Wait for connection to open in bip9-softforks.py (John Newbery)

Tree-SHA512: de7d0002ee62ad97059b6f6c89b11f6e9901e3b4164ef6906bcd61e4ca499c277d9034784755966e5baf599869fad611b0b18f5547a384ceb5b7db3cc5bbd132
@jnewbery
Copy link
Contributor Author

jnewbery commented Apr 2, 2017

🎉 first successful daily run of extended tests on Travis: https://travis-ci.org/bitcoin/bitcoin/builds/217786785

Thanks for merging @MarcoFalke

PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request May 20, 2019
…ctional tests

a4fd89f Make forknotify.py more robust (John Newbery)
1f3d78b Wait for connection to open in bip9-softforks.py (John Newbery)

Tree-SHA512: de7d0002ee62ad97059b6f6c89b11f6e9901e3b4164ef6906bcd61e4ca499c277d9034784755966e5baf599869fad611b0b18f5547a384ceb5b7db3cc5bbd132
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants