fix a flaky test in advanced TLS#8474
Conversation
|
Oh, gosh, sleeps. No, we need to remove the sleeps. This is an API issue and will impact users as well. I don't understand what code you are talking about with "we scheduled the thread for the client". You only changed I think all the sleeps need to be removed though, and we need to change the behavior to immediately load the file as soon as updateIdentityCredentialsFromFile and the like are called. The other option is to block within a handshake if the files haven't been loaded (doesn't seem good) or to have a method that can load from file without polling (seems we really should have that now even) and require/expect users to load from file manually the first time. |
| GeneralSecurityException { | ||
| UpdateResult newResult = readAndUpdate(keyFile, certFile, 0, 0); | ||
| if (!newResult.success) { | ||
| throw new GeneralSecurityException("The initial update was not successful"); |
There was a problem hiding this comment.
As a user, it isn't clear what to do with this error. As we discussed, success really has more of updated semantics. Maybe use the language, "Files were unmodified before their initial update. Probably a bug."? Ditto for trust manager.
|
@ejona86 Hi Eric, the newly added code contains some conditions, e.g. the first time readAndUpdate(), that I couldn't think of a way to deliberately make them fail to cover all code execution paths. That makes the code coverage(codecov/patch) a bit low for this PR. Can we proceed without considering about the coverage at this moment? Or can you think of some other ways to enhance the code coverage? |
|
Yeah, @ZhenLian, it is fine as-is. |
|
Cool, thank you so much! |
This is intended to fix #8454.
The problem was because at the time we scheduled the thread for the client, we didn't reserve enough time before starting the connection, so there could be a situation where the handshake has established, while the client hasn't finished its first read/reload yet.
I added the code to sleep for 5 seconds and then begin our connection. Ideally, to fully remove the race condition, we might need some ways for the reloading thread telling us that the first read has been completed in the main test. But I can't think of any way that won't introduce any changes to the main code(I assume it is a bad practice to change code under "src" to better fit code under "test").
I've verified internally that this changes worked fine by running it for 1000 times(it failed previously).