MINOR: Fix now that kafka.apache.org resolves to 3 IP addresses#9294
MINOR: Fix now that kafka.apache.org resolves to 3 IP addresses#9294mumrah merged 2 commits intoapache:trunkfrom
Conversation
abbccdda
left a comment
There was a problem hiding this comment.
These tests are really fragile, instead of verifying the number of addresses, could we also check the pattern of addresses or something similar to verify the correctness?
|
@abbccdda I agree that they are fragile. Could you give a little more detail about checking the patterns? I'm not sure if I follow. |
|
Netty used a in-memory DNS server for such tests: https://github.com/netty/netty/blob/master/resolver-dns/src/test/java/io/netty/resolver/dns/TestDnsServer.java. I suppose that we could do something similar in AK as well. What about merging this one and filing a JIRA to improve this in the future? All these flaky tests due to this are quite annoying so we could already fix them. What do you think? It would be great if we could backport it to previous versions as well. |
|
Agreed with @dajac's suggestion. Maybe for this change we could have the assertion as > 1 rather than equalling a specific value |
| @Test | ||
| public void testResolveDnsLookup() throws UnknownHostException { | ||
| // Note that kafka.apache.org resolves to 2 IP addresses | ||
| // Note that kafka.apache.org resolves to at least 2 IP addresses |
There was a problem hiding this comment.
Comment here and below refer to 2 IP addresses still.
There was a problem hiding this comment.
Are you saying to change it back? I thought it was fair to say "at least 2" to reflect we are asserting greater than 1.
|
|
||
| connectionStates.connecting(nodeId1, time.milliseconds(), hostTwoIps, ClientDnsLookup.USE_ALL_DNS_IPS); | ||
| InetAddress addr3 = connectionStates.currentAddress(nodeId1); | ||
| assertSame(addr1, addr3); |
There was a problem hiding this comment.
Hmm, so previously (and also with this change) we are assuming that we will get different resolved IPs for each connection to apache.kafka.org? This seems to rely on round-robbin DNS resolution that we can't really control. I think these assertions will always be prone to failure unless we can control the DNS server like @dajac suggested.
I guess we can commit this to unblock the tests, but we should definitely prioritize a better fix.
There was a problem hiding this comment.
Yeah. I was wondering if we should even keep this test in the current form, and I'm hoping that a DNS we can control would resolve this issue.
There was a problem hiding this comment.
The test is verifying the round robin that our code does, not the DNS server. We should ensure we still have that coverage in the meantime.
There was a problem hiding this comment.
Thanks for clarifying, @ijuma. Seems like we could verify the round-robin behavior without relying on specific number of IPs.
There was a problem hiding this comment.
I see. I changed it to show that the first three are different but I didn't include the check that it loops back to the first IP since that part seems the most fragile.
mumrah
left a comment
There was a problem hiding this comment.
Tests looking much better, just the one or two flaky tests as usual rather than dozens of failures :)
|
Thanks for the patch @jolshan! |
|
Have we cherry picked this to older branches? It seems like they will be broken otherwise. |
Reviewers: David Jacot <[email protected]>, Boyang Chen <[email protected]>, David Arthur <[email protected]>
|
2.6 backport applied cleanly, kicked off a build here https://ci-builds.apache.org/job/Kafka/job/kafka-2.6-jdk8/15/
|
…he#9294) Reviewers: David Jacot <[email protected]>, Boyang Chen <[email protected]>, David Arthur <[email protected]>
|
@mumrah I think 2.0 and higher would be a good target. We don't need separate PRs as long as the cherry pick is clean. |
…he#9294) Reviewers: David Jacot <[email protected]>, Boyang Chen <[email protected]>, David Arthur <[email protected]>
ClusterConnectStatesTest and ClientUtilsTest were failing because they expected kafka.apache.org to resolve to 2 IP addresses. This updates the tests so they reflect that DNS resolves to 3 addresses.