Search before asking
- I have searched the issues and found no similar feature request.
Description
Why remote logging?
- Avoid losing task logs after the worker is torn down
- Make logs easier to obtain and troubleshoot once they are uploaded to remote storage
- Enhanced cloud-native support for DS
Feature Design
Connect to different remote targets
DS can support a variety of common remote storage backends, and the design should be extensible enough to support other types of remote storage:
- S3
- OSS
- ElasticSearch
- Azure Blob Storage
- Google Cloud Storage
- ...
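One way to keep the set of targets extensible is a small handler interface with one implementation per backend. The following is only a sketch: `RemoteLogHandler` and `RemoteLogHandlerRegistry` are hypothetical names chosen for illustration, not existing DS classes.

```java
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical extension point: one implementation per remote target (S3, OSS, ...).
interface RemoteLogHandler {
    // Copy a finished task log from the worker's local disk to remote storage.
    void upload(Path localLogFile, String remoteKey) throws Exception;

    // Fetch a task log from remote storage into a local file.
    void download(String remoteKey, Path localLogFile) throws Exception;
}

// Hypothetical registry that maps a configured target name to its handler.
final class RemoteLogHandlerRegistry {
    private final Map<String, RemoteLogHandler> handlers = new HashMap<>();

    void register(String target, RemoteLogHandler handler) {
        handlers.put(target.toLowerCase(Locale.ROOT), handler);
    }

    // Lookup is case-insensitive; an unknown target yields an empty Optional.
    Optional<RemoteLogHandler> find(String target) {
        return Optional.ofNullable(handlers.get(target.toLowerCase(Locale.ROOT)));
    }
}
```

Adding a new backend would then mean writing one more `RemoteLogHandler` implementation and registering it, without touching the callers.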
When to write logs to remote storage
Like Airflow, DS writes the task logs to remote storage after the task completes, whether it succeeds or fails.
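A minimal sketch of "upload on completion, regardless of outcome" is shown below. `TaskLogUploader` is a hypothetical class, and a local directory stands in for the remote storage so that the example stays self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; remoteRoot is a local directory standing in for remote storage.
final class TaskLogUploader {
    private final Path remoteRoot;

    TaskLogUploader(Path remoteRoot) {
        this.remoteRoot = remoteRoot;
    }

    // Runs the task, then uploads its log whether the task succeeded or failed.
    // Returns true if the task finished without throwing.
    boolean runAndUpload(Runnable task, Path localLog) throws IOException {
        boolean success = true;
        try {
            task.run();
        } catch (RuntimeException e) {
            success = false;
        } finally {
            upload(localLog);
        }
        return success;
    }

    private void upload(Path localLog) throws IOException {
        Files.createDirectories(remoteRoot);
        Files.copy(localLog, remoteRoot.resolve(localLog.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
    }
}
```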
How to read logs
Since the task log is stored both on the worker's local disk and in remote storage, the api-server needs a reading strategy when it fetches the log of a task instance.
Airflow first tries to read the logs stored remotely and falls back to the local logs if that fails. I would prefer the opposite order: try to read the local log first, and read the remote log only if the local log file does not exist, because this reduces network bandwidth consumption.
We could discuss this further.
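The local-first strategy proposed above could look roughly like this. `TaskLogReader` and `RemoteFetcher` are hypothetical names, and the remote fetch is left abstract because it depends on the chosen backend.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical reader implementing "local first, remote as fallback".
final class TaskLogReader {

    // Hypothetical hook for fetching a log from whatever remote storage is configured.
    interface RemoteFetcher {
        String fetch(String remoteKey) throws IOException;
    }

    private final RemoteFetcher remote;

    TaskLogReader(RemoteFetcher remote) {
        this.remote = remote;
    }

    String read(Path localLog, String remoteKey) throws IOException {
        // Prefer the local file: no network traffic is needed when it still exists.
        if (Files.isRegularFile(localLog)) {
            return new String(Files.readAllBytes(localLog), StandardCharsets.UTF_8);
        }
        // The local file is gone (for example, the worker was torn down): go remote.
        return remote.fetch(remoteKey);
    }
}
```

Swapping the two branches would give the remote-first order that Airflow uses, so the choice stays a small, local decision.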
Log retention strategy
For example, a maximum capacity could be set for the remote storage, with the oldest logs deleted on a rolling basis once that capacity is exceeded.
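As a sketch of such a capacity-based rolling policy, the selection step could be written as follows. `RemoteLogEntry` and `RollingRetention` are hypothetical names, and the actual deletion call is omitted because it is backend-specific.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical metadata for one log object in remote storage.
final class RemoteLogEntry {
    final String key;
    final long sizeBytes;
    final long uploadedAtMillis;

    RemoteLogEntry(String key, long sizeBytes, long uploadedAtMillis) {
        this.key = key;
        this.sizeBytes = sizeBytes;
        this.uploadedAtMillis = uploadedAtMillis;
    }
}

final class RollingRetention {
    // Returns the keys to delete, oldest first, so that the remaining
    // total size is at most maxBytes.
    static List<String> selectForDeletion(List<RemoteLogEntry> entries, long maxBytes) {
        List<RemoteLogEntry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingLong((RemoteLogEntry e) -> e.uploadedAtMillis));

        long total = 0;
        for (RemoteLogEntry e : sorted) {
            total += e.sizeBytes;
        }

        List<String> toDelete = new ArrayList<>();
        for (RemoteLogEntry e : sorted) {
            if (total <= maxBytes) {
                break;
            }
            toDelete.add(e.key);
            total -= e.sizeBytes;
        }
        return toDelete;
    }
}
```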
Sub-tasks
- [Feature-13331][Remote Logging] Add support for writing task logs to OSS #13332
- [Feature-13419][Remote Logging] Add support for writing task logs to S3 #13649
- [Feature-13429][Remote Logging] Add support for writing task logs to Google Cloud Storage #13777
Ref
Any comments or suggestions are welcome.
Use case
Discussed above.
Related issues
- [Feature][Logging] Add support for logging into remote storage #8543
- This issue is a sub-task of [RoadMap][Year 2022 Q4] Community RoadMap #12436
Are you willing to submit a PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct
