-
Notifications
You must be signed in to change notification settings - Fork 38.7k
Remove sources of unreliablility in extended functional tests #10072
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
447e5ec to
baeecbb
Compare
JeremyRubin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utack! A couple suggestions but nothing major.
test/functional/bip9-softforks.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want the same behavior on L44?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait_for_verack() is called by the comparison test framework at the start of the test run (comptest.py line 300).
test/functional/bip9-softforks.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any work we can do in-between this and the wait for verack?
Maybe have a comment here about having to wait for verack rather than just deleting the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we need to wait for the network thread to start before receiving the verack (the network thread is what catches the verack and calls the callback).
test/functional/bip9-softforks.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a more idiomatic way to get the dirname?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I know of.
I don't like the fact that this test is using shutil and removing files itself. This change just makes it slightly less damaging (ie it's only removing the datadir rather than the entire tmpdir)
test/functional/forknotify.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe raise the error yourself rather than assert, unless you only want it in debug mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there's any difference. No-one runs the test cases with -O
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using asserts as control flow is generally an antipattern (I think that the for/else loop is a bit clearer in any case).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah. I see what you're saying now. Yes, I'll remove the assert from the control flow.
test/functional/forknotify.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a bit cleaner to do something like (a couple of changes, feel free to pick and choose -- I was torn on the os.path.exists+getsize v.s. continual reading with open).
for t in xrange(100):
if os.path.exists(self.alert_filename) and os.path.getsize(self.alert_filename):
break
if t != 99: time.sleep(0.1)
else:
raise AssertionError("-alertnotify did not warn of up-version blocks")
with open(self.alert_filename, 'r', encoding='utf8') as f:
alert_text = f.read()There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is equivalent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah; mostly this was motivated by being able to get rid of the stateful timeout = 10; assert timeout > 0; timeout -= 0.1 combo.
This is also fine.
for t in xrange(100):
with open(self.alert_filename, 'r', encoding='utf8') as f:
alert_text = f.read()
if alert_text:
break
time.sleep(0.1)
else:
raise AssertionError("-alertnotify did not warn of up-version blocks")
test/functional/forknotify.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should rarely actually take 10 seconds, if the code is correct, right? Maybe 1 second will be a bit better in the case where it is actually broken & you're trying to fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends. Travis catches a lot of timing edge cases. I don't think there's any problem making the timeout something high like 10 seconds since the while loop will break as soon as the file is written.
bip9-sofforks.py stop-starts the bitcoind node twice during the test run, but it doesn't wait for the connection from mininode to open before continuing with the test. This leads to race conditions where the test can fail getblocktemplate() because it has no p2p connections.
forknotify would intermittently fail because the alert file was not being written fast enough. This commit adds a timeout so the test does not fail immediately.
baeecbb to
a4fd89f
Compare
|
Pushed a new version using while/else and not using assert for control flow. |
|
utACK a4fd89f. Going to merge this, so that the failures are a thing of the past. |
| time.sleep(0.1) | ||
| timeout -= 0.1 | ||
| else: | ||
| assert False, "-alertnotify did not warn of up-version blocks" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jnewbery nit: Any reason you changed the raise AssertionError to assert False?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No good reason. Equivalent behaviour but you're probably right that raise AssertionError is better style here.
|
🎉 first successful daily run of extended tests on Travis: https://travis-ci.org/bitcoin/bitcoin/builds/217786785 Thanks for merging @MarcoFalke |
This PR removes two sources of unreliability that were causing extended test cases to fail intermittently on travis: