
Fix task load.#2151

Merged
crosbymichael merged 1 commit into containerd:master from Random-Liu:fix-load-task
Feb 27, 2018

Conversation

@Random-Liu
Member

@Random-Liu Random-Liu commented Feb 22, 2018

Fixes #2150.

Option 3 of #2150.

Signed-off-by: Lantao Liu [email protected]

@codecov-io

Codecov Report

Merging #2151 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #2151   +/-   ##
=======================================
  Coverage   45.41%   45.41%           
=======================================
  Files          82       82           
  Lines        8749     8749           
=======================================
  Hits         3973     3973           
  Misses       4130     4130           
  Partials      646      646
| Flag | Coverage Δ |
|----------|--------------------|
| #linux | 50.07% <ø> (ø) ⬆️ |
| #windows | 41.09% <ø> (ø) ⬆️ |

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3b4fcf7...afb3f39. Read the comment docs.

@mlaventure
Contributor

What about calling close after client.TaskService().Delete() instead?

@Random-Liu
Member Author

What about calling close after client.TaskService().Delete() instead?

That is an option, too. It depends on how we define the order.
If we think that a deleted task means all associated resources are released, then we should put close in front.
If we don't care about the fifo directory, I think we can do that after. :)

@mlaventure
Contributor

If we don't care about the fifo directory. I think we can do that after. :)

Well, it's a temp directory; a new one is created every time. My position is to delete the resources only once the task is actually gone, especially since we assume those resources are there on attach.

@dnephin
Contributor

dnephin commented Feb 23, 2018

I don't think we should have the mkdir in two places. I think I agree with @mlaventure. How about something like this?

diff --git a/task.go b/task.go
index 68308b62..5ee36337 100644
--- a/task.go
+++ b/task.go
@@ -267,7 +267,6 @@ func (t *task) Delete(ctx context.Context, opts ...ProcessDeleteOpts) (*ExitStat
 	if t.io != nil {
 		t.io.Cancel()
 		t.io.Wait()
-		t.io.Close()
 	}
 	r, err := t.client.TaskService().Delete(ctx, &tasks.DeleteTaskRequest{
 		ContainerID: t.id,
@@ -275,6 +274,10 @@ func (t *task) Delete(ctx context.Context, opts ...ProcessDeleteOpts) (*ExitStat
 	if err != nil {
 		return nil, errdefs.FromGRPC(err)
 	}
+	// Only close the IO after a successful Delete
+	if t.io != nil {
+		t.io.Close()
+	}
 	return &ExitStatus{code: r.ExitStatus, exitedAt: r.ExitedAt}, nil
 }

@Random-Liu
Member Author

@dnephin OK. Will do so. :)

@Random-Liu
Member Author

Done

Contributor

@dnephin dnephin left a comment


LGTM

Contributor

@mlaventure mlaventure left a comment


LGTM, one nit

Comment thread task.go Outdated
Contributor


nit: s/close/cleanup ?

If it was just closing the IO it wouldn't need to be done so late after a Cancel/Wait combo.

Member Author


Done.

@Random-Liu
Member Author

Random-Liu commented Feb 23, 2018

The test failure seems like something wrong with golang 1.10 to me:


    The stdout of exec should be hello
    Expected
        <string>: hello
        
    to equal
        <string>: hello
        

It does equal; even the trailing \n is right...

@Random-Liu
Member Author

Hm, it constantly fails with golang 1.10...
Let me take a look.

@Random-Liu
Member Author

golang 1.10 has a bug... I'll file an issue and send a PR.

@Random-Liu
Member Author

Random-Liu commented Feb 24, 2018

This is a golang 1.10 bug, see:

Let's not use golang 1.10 for now... This bug is really easy to hit...

@Random-Liu
Member Author

This turns out to be a race condition exposed by golang 1.10. Sent out a fix for cri-tools: kubernetes-sigs/cri-tools#257.

Will update cri-tools as soon as that is fixed to unblock the submit queue.

@crosbymichael
Member

Rebase now that the other PR is merged?

@Random-Liu
Member Author

Will do now.

@Random-Liu
Member Author

@crosbymichael Done.

@Random-Liu
Member Author

Seems like another race condition for concurrent image pull.

E0226 22:26:15.401313    5139 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to get sandbox image "gcr.io/google_containers/pause:3.0": failed to pull image "gcr.io/google_containers/pause:3.0": failed to pull image "gcr.io/google_containers/pause:3.0": failed to write config: ref k8s.io/11/config-sha256:5fbdc14c118c31ddd5c553f66892b846469341287a0d356485b477467cae6dd0 locked: unavailable
Feb 26 22:26:15.401: INFO: Unexpected error occurred: rpc error: code = Unknown desc = failed to get sandbox image "gcr.io/google_containers/pause:3.0": failed to pull image "gcr.io/google_containers/pause:3.0": failed to pull image "gcr.io/google_containers/pause:3.0": failed to write config: ref k8s.io/11/config-sha256:5fbdc14c118c31ddd5c553f66892b846469341287a0d356485b477467cae6dd0 locked: unavailable

@crosbymichael
Member

LGTM

@crosbymichael crosbymichael merged commit 25c4034 into containerd:master Feb 27, 2018
@Random-Liu Random-Liu deleted the fix-load-task branch March 5, 2018 19:01