Description
Hello, I found a possible goroutine leak in the function clusterLeave() (moby/libnetwork/networkdb/cluster.go, line 222 in ceefb7d):
    func (nDB *NetworkDB) clusterLeave() error {
        mlist := nDB.memberlist

        if err := nDB.sendNodeEvent(NodeEventTypeLeave); err != nil {
            log.G(context.TODO()).Errorf("failed to send node leave: %v", err)
        }

        if err := mlist.Leave(time.Second); err != nil {
            return err
        }

        // cancel the context
        nDB.cancelCtx()

        for _, t := range nDB.tickers {
            t.Stop()
        }

        return mlist.Shutdown()
    }
If mlist.Leave() returns an error, clusterLeave() returns early and the nDB.cancelCtx() call below it is never executed (lines 229 to 234 in ceefb7d):
    if err := mlist.Leave(time.Second); err != nil {
        return err
    }

    // cancel the context
    nDB.cancelCtx()
As a result, the <-nDB.ctx.Done() case in triggerFunc() never fires, so the goroutine started here never exits and leaks (line 176 in ceefb7d):
    go nDB.triggerFunc(trigger.interval, t.C, trigger.fn)
Blocking position (lines 251 to 258 in ceefb7d):
    for {
        select {
        case <-C:
            f()
        case <-nDB.ctx.Done():
            return
        }
    }
Reproduce
I reproduced the bug with goleak.
First, I changed the condition from err != nil to err == nil. I do not know how to make mlist.Leave() return an error, so this change exists only to make the return err path easy to reach. I am not sure whether the change has other side effects.
Original:
    if err := mlist.Leave(time.Second); err != nil {
        return err
    }
Modified:
    if err := mlist.Leave(time.Second); err == nil {
        return err
    }
Then I ran goleak in the test functions that exercise this function, for example (moby/libnetwork/networkdb/networkdb_test.go, line 180 in ceefb7d):
    func TestNetworkDBSimple(t *testing.T) {
Like this:

The result shows a leaked goroutine blocked at <-nDB.ctx.Done().
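The leak can also be observed without touching the real code. The following is a minimal standalone model, not the NetworkDB code itself: leaveBuggy, worker, waitForCount and leakedAfter are hypothetical names, and the goroutine count from runtime.NumGoroutine() is used as a rough standard-library stand-in for the more precise check that goleak performs.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"runtime"
	"time"
)

var errLeave = errors.New("leave failed")

// Hypothetical stand-ins for mlist.Leave() failing and succeeding.
func failingLeave() error { return errLeave }
func okLeave() error      { return nil }

// leaveBuggy models the early return in clusterLeave: when leave fails,
// cancel is never called.
func leaveBuggy(cancel context.CancelFunc, leave func() error) error {
	if err := leave(); err != nil {
		return err
	}
	cancel()
	return nil
}

// worker has the same select shape as triggerFunc in the report.
func worker(ctx context.Context, c <-chan time.Time) {
	for {
		select {
		case <-c:
		case <-ctx.Done():
			return
		}
	}
}

// waitForCount polls until at most target goroutines exist or the timeout
// expires, then returns the current goroutine count.
func waitForCount(target int, timeout time.Duration) int {
	deadline := time.Now().Add(timeout)
	for runtime.NumGoroutine() > target && time.Now().Before(deadline) {
		time.Sleep(5 * time.Millisecond)
	}
	return runtime.NumGoroutine()
}

// leakedAfter starts one worker, runs leaveBuggy, and reports how many
// extra goroutines are still alive afterwards.
func leakedAfter(leave func() error) int {
	before := runtime.NumGoroutine()
	ctx, cancel := context.WithCancel(context.Background())
	go worker(ctx, make(chan time.Time))
	_ = leaveBuggy(cancel, leave)
	leaked := waitForCount(before, 200*time.Millisecond) - before
	cancel() // clean up so the demo itself does not leak
	waitForCount(before, time.Second)
	return leaked
}

func main() {
	fmt.Println("leaked when leave fails:", leakedAfter(failingLeave))
	fmt.Println("leaked when leave succeeds:", leakedAfter(okLeave))
}
```

In this model the failing path leaves the worker running because nothing ever cancels its context, which matches the blocking position shown above.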
Expected behavior
No response
docker version
docker info
Additional Info
In short, I think the bug is caused by returning from clusterLeave() without calling the cancel function (nDB.cancelCtx()). I have tried to describe it in detail.
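One possible direction for a fix, sketched under assumptions rather than as a patch for moby: defer the cleanup so it runs on every return path. The db type, startTrigger, leave, stopped and runDemo below are hypothetical stand-ins, and the memberlist Leave call is passed in as a function so the failure can be simulated. Whether mlist.Shutdown() should still be called after a failed Leave() is left open here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// db is a hypothetical stand-in for NetworkDB with only the fields needed
// to illustrate the shutdown ordering.
type db struct {
	ctx       context.Context
	cancelCtx context.CancelFunc
	tickers   []*time.Ticker
	wg        sync.WaitGroup // lets the example observe goroutine exit
}

func newDB() *db {
	ctx, cancel := context.WithCancel(context.Background())
	return &db{ctx: ctx, cancelCtx: cancel}
}

// startTrigger starts a goroutine with the same loop shape as triggerFunc.
func (d *db) startTrigger(interval time.Duration, f func()) {
	t := time.NewTicker(interval)
	d.tickers = append(d.tickers, t)
	d.wg.Add(1)
	go func() {
		defer d.wg.Done()
		for {
			select {
			case <-t.C:
				f()
			case <-d.ctx.Done():
				return
			}
		}
	}()
}

// leave defers the cleanup, so the context is cancelled and the tickers
// are stopped even when memberLeave returns an error.
func (d *db) leave(memberLeave func() error) error {
	defer func() {
		d.cancelCtx()
		for _, t := range d.tickers {
			t.Stop()
		}
	}()
	return memberLeave()
}

// stopped reports whether all trigger goroutines exit within the timeout.
func (d *db) stopped(timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		d.wg.Wait()
		close(done)
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

// runDemo simulates a failing Leave and reports whether the trigger
// goroutine still shut down.
func runDemo() (stopped bool, err error) {
	d := newDB()
	d.startTrigger(10*time.Millisecond, func() {})
	err = d.leave(func() error { return errors.New("leave failed") })
	return d.stopped(time.Second), err
}

func main() {
	stopped, err := runDemo()
	fmt.Println("leave error:", err)
	fmt.Println("goroutines stopped:", stopped)
}
```

With the deferred cleanup, the trigger goroutine exits even though the simulated Leave fails, so the early return no longer leaves it waiting on the context.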