Skip to content

Feature/bump netlink#main#43718

Merged
thaJeztah merged 1 commit intomoby:masterfrom
s4ke:feature/bump-netlink#main
Jun 22, 2022
Merged

Feature/bump netlink#main#43718
thaJeztah merged 1 commit intomoby:masterfrom
s4ke:feature/bump-netlink#main

Conversation

@s4ke
Copy link
Copy Markdown
Contributor

@s4ke s4ke commented Jun 12, 2022

Attempting to fix #43551

Will update according to contribution guidlines if the fix is fine.

- What I did

Bumped netlink dependencies to newest release (currently in beta)

- How to verify it

still in progress, but if everyone agrees that this is the correct course of action, I will verify this on currently affected servers.

- Description for the changelog

Bump netlink dependency to fix issues where encrypted overlay networks did not work with newer Linux kernel releases.

- A picture of a cute animal (not mandatory but encouraged)

@s4ke s4ke force-pushed the feature/bump-netlink#main branch 2 times, most recently from 127545c to 39012bd Compare June 12, 2022 22:41
@s4ke
Copy link
Copy Markdown
Contributor Author

s4ke commented Jun 12, 2022

Since I am new to working on this project, I would greatly appreciate some input on these:

  1. Is moby/moby the correct repo for this fix or should I have done this in libnetwork?
  2. Can I test the encryption behaviour with two docker daemons running in dind or do I have to spin up VMs for this?
  3. Did I manipulate vendor.mod correctly?

@s4ke
Copy link
Copy Markdown
Contributor Author

s4ke commented Jun 13, 2022

I feel like a unit test in https://github.com/moby/moby/tree/master/libnetwork/test/integration/dnet would be helpful to create a test to validate that encryption works properly, as I haven't found any other test (new to this project).

@thaJeztah what do you think?

@s4ke
Copy link
Copy Markdown
Contributor Author

s4ke commented Jun 13, 2022

closing this as the issue seems to be more complex

@s4ke s4ke closed this Jun 13, 2022
vendor.mod Outdated
github.com/prometheus/client_golang => github.com/prometheus/client_golang v1.6.0
github.com/prometheus/procfs => github.com/prometheus/procfs v0.0.11
github.com/vishvananda/netlink => github.com/vishvananda/netlink v1.1.0
github.com/vishvananda/netlink => github.com/vishvananda/netlink v1.2.1-beta.2
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove this replace rule if we decide to update to a new release; ISTR we added this to not update to a non-released version if there was no need. As there's a (beta) release now, perhaps we should consider updating to a newer version either way 🤔

@s4ke
Copy link
Copy Markdown
Contributor Author

s4ke commented Jun 16, 2022

@thaJeztah Thanks for the review :).

Even though this probably might not be directly fixing #43551 I think this is still a useful upgrade, right? If yes, I can improve the PR according to your review? wdyt?

@thaJeztah
Copy link
Copy Markdown
Member

Yes, I think it'd be ok to update the dependency to a more current version; if you have time to work on updating the PR; please do 👍 (please amend the existing commit so that there's only a single commit in the PR - but happy to help with that if needed!)

@s4ke s4ke reopened this Jun 16, 2022
@thaJeztah
Copy link
Copy Markdown
Member

Looks like there's some linting failures, because .Delete() was deprecated in favour of .Close() (the former now being just a wrapper for it, so easy to resolve by just s/Delete/Close/

libnetwork/drivers/bridge/setup_device_test.go:23:8: SA1019: nh.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
	defer nh.Delete()
	      ^
libnetwork/drivers/bridge/setup_device_test.go:49:8: SA1019: nh.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
	defer nh.Delete()
	      ^
libnetwork/drivers/bridge/setup_device_test.go:71:8: SA1019: nh.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
	defer nh.Delete()
	      ^
libnetwork/drivers/bridge/setup_ipv4_test.go:37:8: SA1019: nh.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
	defer nh.Delete()
	      ^
libnetwork/drivers/bridge/setup_ipv4_test.go:70:8: SA1019: nh.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
	defer nh.Delete()
	      ^
libnetwork/drivers/bridge/setup_ipv6_test.go:24:8: SA1019: nh.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
	defer nh.Delete()
	      ^
libnetwork/drivers/bridge/setup_ipv6_test.go:74:8: SA1019: nh.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
	defer nh.Delete()
	      ^
libnetwork/drivers/overlay/ov_network.go:439:10: SA1019: nlh.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
			defer nlh.Delete()
			      ^
libnetwork/drivers/overlay/ov_utils.go:142:9: SA1019: nlh.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
		defer nlh.Delete()
		      ^
libnetwork/osl/namespace_linux.go:237:3: SA1019: n.nlHandle.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
		n.nlHandle.Delete()
		^
libnetwork/osl/namespace_linux.go:290:3: SA1019: n.nlHandle.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
		n.nlHandle.Delete()
		^
libnetwork/osl/namespace_linux.go:472:3: SA1019: n.nlHandle.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
		n.nlHandle.Delete()
		^
libnetwork/osl/sandbox_linux_test.go:137:8: SA1019: nh.Delete is deprecated: use Close instead which is in line with typical resource release patterns for files and other resources. (staticcheck)
	defer nh.Delete()
	      ^

@s4ke
Copy link
Copy Markdown
Contributor Author

s4ke commented Jun 16, 2022

locally the tests are failing for me, so I will have to check this out further:

=== Skipped
=== SKIP: libnetwork/iptables TestFirewalldInit (0.00s)
    firewalld_test.go:14: firewalld is not running

=== Failed
=== FAIL: libnetwork/drivers/bridge TestPortMappingV6Config (0.12s)
time="2022-06-16T17:32:48Z" level=warning msg="bridge store not initialized. kv object docker/network/v1.0/bridge/dummy/ is not added to the store"
time="2022-06-16T17:32:48Z" level=warning msg="bridge store not initialized. kv object docker/network/v1.0/bridge-endpoint/ep1/ is not added to the store"
time="2022-06-16T17:32:48Z" level=warning msg="bridge store not initialized. kv object docker/network/v1.0/bridge-endpoint/ep1/ is not added to the store"
    port_mapping_test.go:163: Failed to store the port bindings into the sandbox info. Found: [{udp 172.19.0.11 400 0.0.0.0 54000 54000} {tcp 172.19.0.11 500 0.0.0.0 65000 65000} {sctp 172.19.0.11 500 0.0.0.0 65000 65000}]

=== FAIL: libnetwork/ipam TestParallelPredefinedRequest2 (0.00s)
panic: send on closed channel [recovered]
        panic: send on closed channel

goroutine 27 [running]:
testing.tRunner.func1.2({0x64b7c0, 0x6dfb10})
        /usr/local/go/src/testing/testing.go:1389 +0x24e
testing.tRunner.func1()
        /usr/local/go/src/testing/testing.go:1392 +0x39f
panic({0x64b7c0, 0x6dfb10})
        /usr/local/go/src/runtime/panic.go:838 +0x207
github.com/docker/docker/libnetwork/ipam.runParallelTests(0xc000503520, 0x1)
        /go/src/github.com/docker/docker/libnetwork/ipam/allocator_test.go:1503 +0x28b
github.com/docker/docker/libnetwork/ipam.TestParallelPredefinedRequest2(0x5ed747?)
        /go/src/github.com/docker/docker/libnetwork/ipam/allocator_test.go:1607 +0x1e
testing.tRunner(0xc000503520, 0x6a2390)
        /usr/local/go/src/testing/testing.go:1439 +0x102
created by testing.(*T).Run
        /usr/local/go/src/testing/testing.go:1486 +0x35f

=== FAIL: libnetwork/ipam TestParallelPredefinedRequest4 (unknown)

=== FAIL: libnetwork/ipam TestParallelPredefinedRequest1 (unknown)

=== FAIL: libnetwork/ipam TestParallelPredefinedRequest3 (unknown)

DONE 326 tests, 1 skipped, 5 failures in 202.256s
make: *** [Makefile:235: test-unit] Error 1

@thaJeztah
Copy link
Copy Markdown
Member

thaJeztah commented Jun 16, 2022

Let's see what CI does; I know we inherited some, erm, "weird" tests from libnetwork, some of which are not-so-very-unit-tests that may require some setting up, so perhaps they're ok.

@thaJeztah
Copy link
Copy Markdown
Member

Hmm... funny; so for some reason, go modules now expects the github.com/Azure/go-ansiterm package to be a direct requires, no longer an indirect, making validate fail;

The result of go mod vendor differs

 M vendor.mod

Please vendor your package with hack/vendor.sh.
diff --git a/vendor.mod b/vendor.mod
index b72c3c3398..a5018087e0 100644
--- a/vendor.mod
+++ b/vendor.mod
@@ -86,9 +86,10 @@ require (
 	gotest.tools/v3 v3.2.0
 )
 
+require github.com/Azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1
+
 require (
 	code.cloudfoundry.org/clock v1.0.0 // indirect
-	github.com/Azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1 // indirect
 	github.com/agext/levenshtein v1.2.3 // indirect
 	github.com/armon/circbuf v0.0.0-20190214190532-5111143e8da2 // indirect

Let me know if you need a hand fixing that

@thaJeztah
Copy link
Copy Markdown
Member

(fwiw; don't put it as a separate require line, but move it into the first require section if the script doesn't automatically put it there)

@s4ke s4ke force-pushed the feature/bump-netlink#main branch from 67b2596 to 0e88a6b Compare June 16, 2022 20:15
bump netlink to 1.2.1
change usages of netlink handle .Delete() to Close()
remove superfluous replace in vendor.mod
make requires of github.com/Azure/go-ansiterm direct

Signed-off-by: Martin Braun <[email protected]>
@s4ke s4ke force-pushed the feature/bump-netlink#main branch from 0e88a6b to 5edfd6d Compare June 16, 2022 20:26
@s4ke
Copy link
Copy Markdown
Contributor Author

s4ke commented Jun 16, 2022

I think I got it now. I greatly appreciate the handholding here :). Great first commiter experience I must say :)

@s4ke
Copy link
Copy Markdown
Contributor Author

s4ke commented Jun 17, 2022

CI seems happy with the latest changes :)

EDIT: seems like I misinterpreted GitHub CI. There seem to be some issues it seems. But they look unrelated-ish?

@s4ke
Copy link
Copy Markdown
Contributor Author

s4ke commented Jun 20, 2022

Hmm, seems like CI is failing for windows-2022? Tbh, I am having a hard time reading the test logs here on GitHub, but to me these errors look unrelated to the PR?

@thaJeztah
Copy link
Copy Markdown
Member

I gave CI a kick to run again 👍 🤞

@s4ke
Copy link
Copy Markdown
Contributor Author

s4ke commented Jun 22, 2022

CI is green 🤖 🥳

Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoop!

LGTM

@thaJeztah
Copy link
Copy Markdown
Member

@tianon @corhere PTAL

@thaJeztah
Copy link
Copy Markdown
Member

Thx! Let me bring this one in

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Docker Engine 20.10.14 breaks with debian kernel linux-image-5.10.0-13-amd64

3 participants