[Bug]: idle_duration causes instance recreate loop in non-elastic fleet #3128

@un-def

Description

Steps to reproduce

Create a non-elastic fleet with idle_duration set, for example:

# .dstack.yml
type: fleet
name: fleet-aws
nodes: 2
backends: [aws]
idle_duration: 5m
resources:
  cpu: 1..
  gpu: 0
  memory: 1GB..
  disk: 10GB..

dstack apply --yes --detach && dstack fleet --watch

Actual behaviour

 FLEET      INSTANCE  BACKEND  RESOURCES  PRICE  STATUS   CREATED
 fleet-aws  0                                    pending  10 sec ago
            1                                    pending  10 sec ago

 FLEET      INSTANCE  BACKEND          RESOURCES                 PRICE    STATUS        CREATED
 fleet-aws  0         aws (us-east-1)  cpu=2 mem=1GB disk=100GB  $0.0104  provisioning  40 sec ago
            1         aws (us-east-1)  cpu=2 mem=1GB disk=100GB  $0.0104  provisioning  40 sec ago

 FLEET      INSTANCE  BACKEND          RESOURCES                 PRICE    STATUS        CREATED
 fleet-aws  0         aws (us-east-1)  cpu=2 mem=1GB disk=100GB  $0.0104  idle          5 min ago
            1         aws (us-east-1)  cpu=2 mem=1GB disk=100GB  $0.0104  idle          5 min ago

After 5 minutes:

 FLEET      INSTANCE  BACKEND          RESOURCES                 PRICE    STATUS       CREATED
 fleet-aws  0         aws (us-east-1)  cpu=2 mem=1GB disk=100GB  $0.0104  terminating  5 mins ago
            1         aws (us-east-1)  cpu=2 mem=1GB disk=100GB  $0.0104  terminating  5 mins ago

 FLEET      INSTANCE  BACKEND  RESOURCES  PRICE  STATUS   CREATED
 fleet-aws  0                                    pending  26 sec ago
            1                                    pending  26 sec ago

...and so on: every 5 minutes (the idle_duration), the instances are terminated and then provisioned again.
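The loop can be illustrated with a toy model. This is only a sketch of the interaction suggested by the output above and the server logs below, not dstack's actual code: it assumes one task that terminates instances once their idle time reaches idle_duration, and a second task that tops the fleet back up to its fixed node count. The names `Instance`, `simulate`, `IDLE_DURATION_S` and `TARGET_NODES` are hypothetical.

```python
from dataclasses import dataclass

IDLE_DURATION_S = 300  # idle_duration: 5m, as in the fleet config above
TARGET_NODES = 2       # nodes: 2, as in the fleet config above


@dataclass
class Instance:
    num: int
    created_at: int  # simulated seconds since the start


def simulate(total_seconds: int, tick: int = 5) -> list[str]:
    """Toy model (an assumption, not dstack internals) of two independent tasks."""
    events = []
    instances = [Instance(n, 0) for n in range(TARGET_NODES)]
    for now in range(tick, total_seconds + 1, tick):
        # Hypothetical task 1: terminate instances whose idle time has expired.
        # No jobs ever run in this model, so idle time equals instance age.
        expired = [i for i in instances if now - i.created_at >= IDLE_DURATION_S]
        for inst in expired:
            instances.remove(inst)
            events.append(f"t={now}s terminated instance {inst.num}")
        # Hypothetical task 2: keep the fleet at its fixed node count.
        present = {i.num for i in instances}
        for num in range(TARGET_NODES):
            if num not in present:
                instances.append(Instance(num, now))
                events.append(f"t={now}s created instance {num}")
    return events


events = simulate(total_seconds=900)
for line in events:
    print(line)
```

In this model neither task knows about the other, so each idle expiry is immediately followed by a replacement, and the cycle repeats every idle_duration for as long as the fleet exists.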

Expected behaviour

No response

dstack version

b29c55e

Server logs

dstack._internal.server.services.backends:347 Requesting instance offers from backends: ['aws']
dstack._internal.server.services.backends:347 Requesting instance offers from backends: ['aws']
dstack._internal.server.background.tasks.process_instances:582 Trying t3.micro in aws/us-east-1 for $0.0104 per hour
dstack._internal.core.backends.aws.compute:290 Trying provisioning t3.micro in us-east-1d
dstack._internal.server.background.tasks.process_instances:582 Trying t3.micro in aws/us-east-1 for $0.0104 per hour
dstack._internal.core.backends.aws.compute:290 Trying provisioning t3.micro in us-east-1d
dstack._internal.server.background.tasks.process_instances:629 Created instance fleet-aws-0
dstack._internal.server.background.tasks.process_instances:629 Created instance fleet-aws-1
...
dstack._internal.server.background.tasks.process_instances:244 Instance fleet-aws-1 idle duration expired: idle time 300s. Terminating
dstack._internal.server.background.tasks.process_instances:244 Instance fleet-aws-0 idle duration expired: idle time 305s. Terminating
dstack._internal.server.background.tasks.process_instances:931 Terminating runner instance 18.234.230.76
dstack._internal.server.background.tasks.process_instances:969 Instance fleet-aws-1 terminated
dstack._internal.server.background.tasks.process_instances:931 Terminating runner instance 13.218.152.136
dstack._internal.server.background.tasks.process_instances:969 Instance fleet-aws-0 terminated
dstack._internal.server.background.tasks.process_fleets:175 Added 2 instances to fleet fleet-aws
dstack._internal.server.services.backends:347 Requesting instance offers from backends: ['aws']
dstack._internal.server.services.backends:347 Requesting instance offers from backends: ['aws']
dstack._internal.server.background.tasks.process_instances:582 Trying t3.micro in aws/us-east-1 for $0.0104 per hour
dstack._internal.server.background.tasks.process_instances:582 Trying t3.micro in aws/us-east-1 for $0.0104 per hour
dstack._internal.core.backends.aws.compute:290 Trying provisioning t3.micro in us-east-1d
dstack._internal.core.backends.aws.compute:290 Trying provisioning t3.micro in us-east-1d
dstack._internal.server.background.tasks.process_instances:629 Created instance fleet-aws-1
dstack._internal.server.background.tasks.process_instances:629 Created instance fleet-aws-0
...
dstack._internal.server.background.tasks.process_instances:244 Instance fleet-aws-0 idle duration expired: idle time 310s. Terminating
dstack._internal.server.background.tasks.process_instances:244 Instance fleet-aws-1 idle duration expired: idle time 315s. Terminating

Additional information

No response
