Description
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
The Okta source plugin gets rate limited, and loses some data.
The logs have messages like this:
{
"level": "error",
"module": "okta-src",
"client": "okta",
"error": "too many requests",
"message": "table resolver finished with error",
"table": "okta_application_group_assignments",
"time": "2023-05-05T14:12:50Z"
}
Expected Behavior
If the service (Okta in this case) rate limits CQ, it should back off and retry, and not complete until all the data is gathered (or retries are exhausted).
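The expected behavior could look roughly like the sketch below: retry a rate-limited call with exponential backoff, and raise once retries are exhausted rather than finishing with partial data. This is not the plugin's actual code. `RateLimitError`, `call_with_backoff`, and all parameter names are hypothetical, and the plugin itself is not written in Python.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error for an HTTP 429 ("too many requests") response."""

    def __init__(self, retry_after=None):
        super().__init__("too many requests")
        # Seconds the server asked us to wait, if it said so (assumption).
        self.retry_after = retry_after


def call_with_backoff(fetch, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fetch() until it succeeds or max_retries retries are used up."""
    attempt = 0
    while True:
        try:
            return fetch()
        except RateLimitError as err:
            if attempt >= max_retries:
                # Retries exhausted: surface the error instead of
                # silently completing with missing rows.
                raise
            # Exponential backoff: base_delay * 2^attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # If the server suggested a longer wait, honor it.
            if err.retry_after is not None:
                delay = max(delay, err.retry_after)
            # Add up to 10% jitter so parallel resolvers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
            attempt += 1


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_fetch():
        # Fails twice with a rate-limit error, then returns data.
        calls["n"] += 1
        if calls["n"] < 3:
            raise RateLimitError()
        return ["row-1", "row-2"]

    print(call_with_backoff(flaky_fetch, base_delay=0.01))
```

The `sleep` parameter is injectable only so the retry schedule can be exercised without real waiting.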
CloudQuery (redacted) config
kind: source
spec:
  # Source spec section
  name: okta
  path: cloudquery/okta
  version: "v2.2.4"
  tables: ["*"]
  destinations: ["s3"]
  spec:
    # Required. Your Okta domain name
    domain: "https://${OKTA_DOMAIN}/"
    # Optional. Okta Token to access API, you can set this with OKTA_API_TOKEN environment variable
    # ⚠️ Warning - Your token should be kept secret and not committed to source control
    token: $${OKTA_TOKEN}
---
kind: destination
spec:
  name: "s3"
  path: "cloudquery/s3"
  version: "v3.1.2"
  write_mode: "append" # s3 only supports 'append' mode
  # batch_size: 10000 # optional
  # batch_size_bytes: 5242880 # optional
  spec:
    bucket: "${CQ_S3_BUCKET}"
    region: "${AWS_REGION}" # Example: us-east-1
    path: "cloudquery/{{TABLE}}/{{YEAR}}/{{MONTH}}/{{DAY}}/{{UUID}}.json"
    format: "json"
    athena: false # <- set this to true for Athena compatibility
Steps To Reproduce
- Run cloudquery sync
- Observe errors in the log
- Examine the output tables and find some missing rows (I noticed them in the join tables, e.g. okta_group_users, okta_application_group_assignments, etc.)
CloudQuery (redacted) logs
{
"level": "error",
"module": "okta-src",
"client": "okta",
"error": "too many requests",
"message": "table resolver finished with error",
"table": "okta_application_group_assignments",
"time": "2023-05-05T14:12:50Z"
}
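To see which tables were affected, a small script along these lines can count the rate-limit errors per table. It is only a sketch: it assumes the log is written as one JSON object per line with the same fields as the entry above, and `rate_limited_tables` is a hypothetical helper, not part of CloudQuery.

```python
import json
from collections import Counter


def rate_limited_tables(log_lines):
    """Count "too many requests" errors per table in JSON-lines log output."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not log objects
        if (entry.get("level") == "error"
                and entry.get("error") == "too many requests"):
            counts[entry.get("table", "<unknown>")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '{"level": "error", "module": "okta-src", "client": "okta", '
        '"error": "too many requests", '
        '"message": "table resolver finished with error", '
        '"table": "okta_application_group_assignments", '
        '"time": "2023-05-05T14:12:50Z"}',
    ]
    for table, n in rate_limited_tables(sample).items():
        print(table, n)
```

Any table listed by such a script is a candidate for missing rows in the destination.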
CloudQuery version
2.5.3
Additional Context
Running in AWS Fargate, using the ghcr.io/cloudquery/cloudquery:2.5 image
Pull request (optional)
- I can submit a pull request