Skip to content

Increase graceful timeout and hardcode AWS_PROFILE#306

Merged
squeakymouse merged 4 commits intomainfrom
katiewu/increase-graceful-timeout-and-aws-profile
Oct 3, 2023
Merged

Increase graceful timeout and hardcode AWS_PROFILE#306
squeakymouse merged 4 commits intomainfrom
katiewu/increase-graceful-timeout-and-aws-profile

Conversation

@squeakymouse
Copy link
Copy Markdown
Contributor

No description provided.

main_env = []
if isinstance(flavor, RunnableImageLike) and flavor.env:
main_env = [{"name": key, "value": value} for key, value in flavor.env.items()]
main_env.append({"name": "AWS_PROFILE", "value": build_endpoint_request.aws_role})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we put a test for this (to test various merging logic with user-provided aws profiles)

def start_server():
parser = argparse.ArgumentParser()
parser.add_argument("--graceful-timeout", type=int, default=600)
parser.add_argument("--graceful-timeout", type=int, default=1800)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this what's being used in async tasks?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought we are going to patch llm-engine/model-engine/model_engine_server/inference/async_inference/celery.py to listen to sigterm?

Copy link
Copy Markdown
Contributor

@seanshi-scale seanshi-scale Oct 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah iirc this code is used to serve the user container for both sync and async tasks for artifact-like bundles, and async_inference/celery.py shouldn't be used anymore for the user containers at least (not sure about celery-forwarder though)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'd also have to patch celery-forwarder to listen to sigterm

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this what's being used in async tasks?

This is what Frances's async endpoint uses

we'd also have to patch celery-forwarder to listen to sigterm

From the celery documentation: "When shutdown is initiated the worker will finish all currently executing tasks before it actually terminates" which I think means we're good?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possible to simulate such an scenario by sending traffic while restarting a pod?

@squeakymouse squeakymouse merged commit 4817ffc into main Oct 3, 2023
@squeakymouse squeakymouse deleted the katiewu/increase-graceful-timeout-and-aws-profile branch October 3, 2023 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants