Skip to content

Fix NPE in dialer#2022

Merged
crosbymichael merged 1 commit intocontainerd:masterfrom
denverdino:master
Jan 18, 2018
Merged

Fix NPE in dialer#2022
crosbymichael merged 1 commit intocontainerd:masterfrom
denverdino:master

Conversation

@denverdino
Copy link
Copy Markdown
Contributor

@denverdino denverdino commented Jan 18, 2018

Signed-off-by: Li Yi [email protected]

In one testing with Docker Engine 17.12, I found the NPE in dialer of containerd. The log of docker engine is as following:

Jan 16 11:55:18 iZbp12bxxhtwowgjmozk40Z dockerd[10893]: time="2018-01-16T11:55:17.970922424+08:00" level=error msg="Handler for GET /containers/json returned error: write unix /var/run/docker.sock->@: write: broken pipe"
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z dockerd[10893]: panic: runtime error: invalid memory address or nil pointer dereference
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z dockerd[10893]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x1c09b69]
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z dockerd[10893]: goroutine 36453382 [running]:
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z dockerd[10893]: github.com/docker/docker/vendor/github.com/containerd/containerd/dialer.Dialer.func2(0xc44bb1aba0)
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z dockerd[10893]:         /go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/dialer/dialer.go:46 +0x59
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z dockerd[10893]: created by github.com/docker/docker/vendor/github.com/containerd/containerd/dialer.Dialer
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z dockerd[10893]:         /go/src/github.com/docker/docker/vendor/github.com/containerd/containerd/dialer/dialer.go:43 +0x1c7
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z systemd[1]: docker.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z systemd[1]: docker.service: Unit entered failed state.
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z systemd[1]: docker.service: Failed with result 'exit-code'.
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Jan 16 11:55:19 iZbp12bxxhtwowgjmozk40Z systemd[1]: Stopped Docker Application Container Engine.

Signed-off-by: Li Yi <[email protected]>
@codecov-io
Copy link
Copy Markdown

Codecov Report

Merging #2022 into master will increase coverage by 5.02%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2022      +/-   ##
==========================================
+ Coverage   40.48%   45.51%   +5.02%     
==========================================
  Files          73       94      +21     
  Lines        7998     9349    +1351     
==========================================
+ Hits         3238     4255    +1017     
- Misses       4245     4384     +139     
- Partials      515      710     +195
Flag Coverage Δ
#linux 50.06% <ø> (?)
#windows 40.48% <ø> (ø) ⬆️
Impacted Files Coverage Δ
oci/spec_unix.go 98.37% <0%> (ø)
fs/diff_unix.go 56.52% <0%> (ø)
cio/io.go 45.09% <0%> (ø)
fs/dtype_linux.go 50% <0%> (ø)
mount/mount.go 0% <0%> (ø)
oci/spec.go 50% <0%> (ø)
fs/hardlink_unix.go 60% <0%> (ø)
mount/mount_linux.go 0% <0%> (ø)
snapshots/btrfs/btrfs.go 57.39% <0%> (ø)
oci/spec_opts.go 0% <0%> (ø)
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update acc6011...adfa9a2. Read the comment docs.

Copy link
Copy Markdown
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment thread dialer/dialer.go
go func() {
dr := <-synC
if dr != nil {
if dr != nil && dr.c != nil {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems reasonable; given this means that dialer() returns an error I wonder if there is any reason to log this error, which is effectively coming from net.DialTimeout()? But given a timeout error is already being returned and this is in a goroutine it could be messy to try and log.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given we reached the timeout, I don't think it's useful to log any error the dialer might return.

Copy link
Copy Markdown
Contributor

@mlaventure mlaventure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

@dnephin dnephin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@crosbymichael
Copy link
Copy Markdown
Member

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants