SagemakerProcessingOperator ThrottlingException #16763

@jimmycfa

Description

Apache Airflow version: 2.0.2

Kubernetes version (if you are using kubernetes) (use kubectl version): NA

Environment: MWAA and Locally

  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): NA
  • Kernel (e.g. uname -a): NA
  • Install tools: NA
  • Others: NA

What happened:

Calling the SagemakerProcessingOperator sometimes fails with "botocore.exceptions.ClientError: An error occurred (ThrottlingException)" because the operator issues an excessive number of ListProcessingJobs calls.

What you expected to happen:

The job should have started without timing out. I believe one fix would be to use the NameContains parameter of boto3's list_processing_jobs, so the operator doesn't have to paginate through every existing job as is occurring here.
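The suggested fix could look roughly like the sketch below. It is not the operator's actual code: `processing_job_exists` is a hypothetical helper, and `client` is assumed to be a boto3 SageMaker client. It passes `NameContains` so the service filters by name server-side, which should keep the number of ListProcessingJobs calls small even when the account holds many jobs.

```python
from typing import Any, Optional


def processing_job_exists(client: Any, job_name: str) -> bool:
    """Return True if a processing job named exactly `job_name` exists.

    Hypothetical helper: `client` is assumed to be a boto3 SageMaker client
    (or any object with a compatible `list_processing_jobs` method).
    """
    next_token: Optional[str] = None
    while True:
        # Ask the service to filter by name instead of listing every job.
        kwargs = {"NameContains": job_name, "MaxResults": 100}
        if next_token:
            kwargs["NextToken"] = next_token
        response = client.list_processing_jobs(**kwargs)

        for summary in response.get("ProcessingJobSummaries", []):
            # NameContains is a substring match, so confirm the exact name.
            if summary["ProcessingJobName"] == job_name:
                return True

        # Only follow further pages if the filtered result is still paginated.
        next_token = response.get("NextToken")
        if not next_token:
            return False
```

With real credentials this would be called as `processing_job_exists(boto3.client("sagemaker"), "my-job")`. Because the filter is a substring match, a short or common job name can still return several pages, so the loop keeps the `NextToken` handling.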

How to reproduce it:

If you keep creating Sagemaker Processing jobs, you will eventually see the ThrottlingException as the number of pages the operator has to list grows.

Anything else we need to know:

This appears to happen when the account already has a large number of past Sagemaker Processing jobs.

Metadata

    Labels

    kind:bug (This is a clearly a bug), provider:amazon (AWS/Amazon - related issues)
