Increase graceful timeout and hardcode AWS_PROFILE #306
Conversation
```diff
 main_env = []
 if isinstance(flavor, RunnableImageLike) and flavor.env:
     main_env = [{"name": key, "value": value} for key, value in flavor.env.items()]
+main_env.append({"name": "AWS_PROFILE", "value": build_endpoint_request.aws_role})
```
Should we add a test for this (to cover the various merging behaviors with user-provided AWS profiles)?
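A test along these lines could pin down the merging behavior. This is only a hedged sketch: `build_main_env` is a hypothetical stand-in for the logic in the diff, not the repo's actual API, and the conflict behavior (appending `AWS_PROFILE` after any user-provided entry) mirrors what the diff does.

```python
# Hypothetical helper mirroring the diff above; the real code builds
# main_env inline from flavor.env and build_endpoint_request.aws_role.
def build_main_env(flavor_env, aws_role):
    """Copy user env vars, then append AWS_PROFILE from the endpoint's role."""
    main_env = []
    if flavor_env:
        main_env = [{"name": k, "value": v} for k, v in flavor_env.items()]
    main_env.append({"name": "AWS_PROFILE", "value": aws_role})
    return main_env


def test_user_aws_profile_then_role_appended():
    env = build_main_env({"AWS_PROFILE": "user-profile", "FOO": "bar"}, "endpoint-role")
    # Both entries survive; the endpoint role is appended last, so a
    # last-wins consumer (e.g. container env ordering) sees the role.
    assert env[-1] == {"name": "AWS_PROFILE", "value": "endpoint-role"}
    assert {"name": "FOO", "value": "bar"} in env


def test_no_user_env():
    assert build_main_env(None, "role") == [{"name": "AWS_PROFILE", "value": "role"}]
```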
```diff
 def start_server():
     parser = argparse.ArgumentParser()
-    parser.add_argument("--graceful-timeout", type=int, default=600)
+    parser.add_argument("--graceful-timeout", type=int, default=1800)
```
Is this what's being used in async tasks?
I thought we were going to patch llm-engine/model-engine/model_engine_server/inference/async_inference/celery.py to listen for SIGTERM?
Yeah, IIRC this code serves the user container for both sync and async tasks for artifact-like bundles, and async_inference/celery.py shouldn't be used anymore for the user containers at least (not sure about celery-forwarder, though).
We'd also have to patch celery-forwarder to listen for SIGTERM.
> is this what's being used in async tasks?

This is what Frances's async endpoint uses.

> we'd also have to patch celery-forwarder to listen to sigterm

From the celery documentation: "When shutdown is initiated the worker will finish all currently executing tasks before it actually terminates", which I think means we're good?
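For intuition, the warm-shutdown behavior the celery docs describe can be sketched in a few lines. This is not celery's implementation, just a minimal illustration of the semantics: on SIGTERM, stop taking new tasks but let the currently executing one run to completion.

```python
import signal

# Toy worker illustrating warm shutdown (an assumption-laden sketch,
# not celery code): SIGTERM sets a flag; the loop checks it between
# tasks, so an in-flight task always finishes.
class WarmShutdownWorker:
    def __init__(self, tasks):
        self.tasks = list(tasks)
        self.completed = []
        self.stopping = False
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame):
        # Warm shutdown: finish the current task, accept no new ones.
        self.stopping = True

    def run(self):
        while self.tasks and not self.stopping:
            task = self.tasks.pop(0)
            self.completed.append(task())
```

The open question from the thread still applies: this only helps if the forwarder actually receives SIGTERM and Kubernetes waits out the graceful period (hence the 1800s `--graceful-timeout`).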
Possible to simulate such a scenario by sending traffic while restarting a pod?
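A rough way to run that experiment: hammer the endpoint in a loop and count failures while the pod restarts in parallel (e.g. via `kubectl rollout restart`). The sketch below is hypothetical; `send_fn` is injectable so the loop itself is testable, and in a real run it would wrap an HTTP POST to the endpoint.

```python
# Hedged sketch of a traffic probe during a pod restart. In practice
# you would run `kubectl rollout restart deployment/<name>` in another
# terminal and pass a send_fn that issues a real request; any exception
# (connection refused, 5xx raised by the client) counts as a failure.
def probe_during_restart(send_fn, n_requests=100):
    failures = 0
    for _ in range(n_requests):
        try:
            send_fn()
        except Exception:
            failures += 1
    return failures
```

With graceful shutdown working, the failure count should stay at (or near) zero across the restart.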