Skip to content

Comments

#3301 adding lock around worker placement#3902

Merged
ddadlani merged 3 commits intomasterfrom
issue/3301
Jun 19, 2019
Merged

#3301 adding lock around worker placement#3902
ddadlani merged 3 commits intomasterfrom
issue/3301

Conversation

@deniseyu
Copy link
Contributor

Signed-off-by: Denise Yu [email protected]
Co-authored-by: Krishna Mannem [email protected]

@deniseyu deniseyu requested review from ddadlani and vito May 22, 2019 19:43
@kcmannem
Copy link
Contributor

dont merge yet, we messed up image pulling

@jwntrs
Copy link
Contributor

jwntrs commented May 24, 2019

@kcmannem @deniseyu

This seems kinda roundabout to me. You are basically saying find me a worker and create a container, then you expose back the worker just so that we can look up the container all over again.

Why don't we just add a CreateContainer method in worker.Client, which returns a Creating container so that we can move closer to not caring about worker selection. Then we can move a bunch of the worker.FindOrCreateContainer logic, such as input streaming, into a method on the creating container itself (or something that takes in a creating container).

It will probably also help to separate the FindOrCreate logic into Find or Create and push that orchestration into core. Once we move to ephemeral check containers we will never be looking up an existing container, only ever creating new ones.

\cc @ddadlani

@kcmannem
Copy link
Contributor

@pivotal-jwinters I think the goal is to get to what you were saying. I agree its kind of weird right now. But this wasn't meant to be that big refactor but rather way to get things correct again and ease some pain points for users. This gives them something while we can work on longer term solutions after.

@marco-m
Copy link
Contributor

marco-m commented Jun 15, 2019

@wagdav FYI

@wagdav
Copy link
Contributor

wagdav commented Jun 17, 2019

@vito @ddadlani would you please take a look at this PR? My understanding is that this would be a fix for #3301 and possibly other scheduler related problems. Thanks!

@kcmannem
Copy link
Contributor

@wagdav this change exposed a larger issue where checks error every now and then. Another recent change in how check sessions are handled also makes it difficult to patch this issue. We are still working on a fix for that error, after which we will merge this into master

@kcmannem
Copy link
Contributor

@wagdav we're gunna workaround the issue, this should be going in considering the tests pass.

Signed-off-by: Divya Dadlani <[email protected]>
Co-authored-by: Krishna Mannem <[email protected]>
@kcmannem
Copy link
Contributor

@ddadlani fixed the tests, we accidentally commit changes to the docker-compose. its ready for review now.

Signed-off-by: Divya Dadlani <[email protected]>
Co-authored-by: Krishna Mannem <[email protected]>
- -c
- |
exit 2
- in_parallel:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ddadlani instead of making a new topgun test cause its heavy, I decided to just update the existing one to now assert that in_parallel tasks get distributed evenly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

- -c
- |
exit 2
- in_parallel:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

@ddadlani ddadlani merged commit 24e3a3d into master Jun 19, 2019
@vito vito deleted the issue/3301 branch June 19, 2019 14:32
@ddadlani
Copy link
Contributor

fixes #3301

@aledegano
Copy link
Contributor

fixes #3301

@ddadlani your comment didn't really close issue #3301, was that intended? (I thought that GitHub should close the issue with this type of comments...)

@ddadlani
Copy link
Contributor

@aledeganopix4d yes it should have. I think I had to comment that before I merged this PR, though. Thanks for pointing it out!

kcmannem pushed a commit that referenced this pull request Jun 27, 2019
jamieklassen pushed a commit that referenced this pull request Jun 27, 2019
@jamieklassen jamieklassen added the release/no-impact This is an issue that never affected released versions (i.e. a regression caught prior to shipping). label Jul 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release/no-impact This is an issue that never affected released versions (i.e. a regression caught prior to shipping).

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants