Description
Apache Airflow version: 2.0.2
Kubernetes version (if you are using kubernetes) (use kubectl version): NA
Environment: MWAA and Locally
- Cloud provider or hardware configuration: AWS
- OS (e.g. from /etc/os-release): NA
- Kernel (e.g. uname -a): NA
- Install tools: NA
- Others: NA
What happened:
When calling the SagemakerProcessingOperator, I sometimes get "botocore.exceptions.ClientError: An error occurred (ThrottlingException)" due to excessive ListProcessingJobs operations.
What you expected to happen:
The job should have started without timing out. I believe one fix would be to use the NameContains parameter of boto3's list_processing_jobs so that the operator does not have to paginate through every job, as it currently does here.
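A minimal sketch of the suggested fix, not the operator's actual code: it passes NameContains to list_processing_jobs so that SageMaker filters server-side and only the matching jobs are paged through. The helper name find_processing_job_names and its arguments are hypothetical; NameContains, MaxResults, NextToken, and ProcessingJobSummaries are the boto3 parameter and response names.

```python
def find_processing_job_names(client, name_fragment, max_results=100):
    """Return names of processing jobs whose name contains name_fragment.

    `client` is assumed to be a boto3 SageMaker client. The helper itself is
    hypothetical and only illustrates the NameContains filter.
    """
    names = []
    # NameContains filters on the service side, so far fewer pages
    # (and ListProcessingJobs calls) are needed than when listing every job.
    kwargs = {"NameContains": name_fragment, "MaxResults": max_results}
    while True:
        response = client.list_processing_jobs(**kwargs)
        names.extend(
            summary["ProcessingJobName"]
            for summary in response["ProcessingJobSummaries"]
        )
        next_token = response.get("NextToken")
        if not next_token:
            return names
        # Follow the pagination token for the remaining matching jobs only.
        kwargs["NextToken"] = next_token
```

With a real client this would be called as find_processing_job_names(boto3.client("sagemaker"), "my-job-name"). Pagination is still handled, but it is limited to jobs matching the filter rather than every job in the account.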
How to reproduce it:
If you keep creating SageMaker Processing jobs, you will eventually see the throttling as the number of pages to list grows.
Anything else we need to know:
This appears to happen when the account already has a large number of previous SageMaker Processing jobs.