
[DSIP-18][Remote Logging] Add support for writing task logs to remote storage #13017

@rickchengx


Search before asking

  • I had searched in the issues and found no similar feature requirement.

Description

Why remote logging?

  • Avoid losing task logs after a worker is torn down
  • Make it easier to obtain logs and troubleshoot once task logs are uploaded to remote storage
  • Enhance cloud-native support for DS

Feature Design

(Screenshot: feature design diagram, 2023-01-18)

Connect to different remote targets

DS can support a variety of common remote storage backends, and is easily extensible to other types of remote storage:

  • S3
  • OSS
  • ElasticSearch
  • Azure Blob Storage
  • Google Cloud Storage
  • ...
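One way to keep the design extensible across these backends is a single pluggable interface that every storage type implements. The sketch below is illustrative only: the interface and method names are assumptions, not the actual DolphinScheduler API, and the in-memory implementation stands in for a real SDK-backed one (e.g. an S3 client).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical pluggable interface; each backend above (S3, OSS,
// ElasticSearch, ...) would ship its own implementation.
interface RemoteLogHandler {
    // Upload a finished task's log content under the given object key.
    void sendRemoteLog(String objectKey, String content);

    // Fetch a previously uploaded log, or null if it does not exist.
    String getRemoteLog(String objectKey);
}

// In-memory stand-in used only to show the shape of an implementation;
// a real backend would call the corresponding SDK (e.g. the AWS S3 client).
public class InMemoryLogHandler implements RemoteLogHandler {
    private final Map<String, String> store = new HashMap<>();

    @Override
    public void sendRemoteLog(String objectKey, String content) {
        store.put(objectKey, content);
    }

    @Override
    public String getRemoteLog(String objectKey) {
        return store.get(objectKey);
    }

    public static void main(String[] args) {
        RemoteLogHandler handler = new InMemoryLogHandler();
        handler.sendRemoteLog("logs/task-1.log", "task finished");
        System.out.println(handler.getRemoteLog("logs/task-1.log")); // prints "task finished"
    }
}
```

Because the worker and api-server only depend on the interface, adding a new storage type is a matter of dropping in a new implementation.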

When to write logs to remote storage

Like Airflow, DS writes the task logs to remote storage after the task completes (success or failure).
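This policy could be wired into the worker as a task-finish hook that uploads only when the task reaches a terminal state. All names below are hypothetical, for illustration only:

```java
import java.nio.file.Path;

// Hypothetical worker-side hook illustrating the Airflow-like policy:
// upload the local log once the task reaches a terminal state.
public class TaskFinishedHook {
    enum TaskState { SUCCESS, FAILURE, RUNNING }

    // Stand-in for the remote storage client.
    interface Uploader { void upload(Path localLog, String objectKey); }

    private final Uploader uploader;

    public TaskFinishedHook(Uploader uploader) {
        this.uploader = uploader;
    }

    // Called on every state change; returns true if an upload happened.
    public boolean onTaskStateChange(TaskState state, Path localLog, String objectKey) {
        if (state == TaskState.SUCCESS || state == TaskState.FAILURE) {
            uploader.upload(localLog, objectKey);
            return true;
        }
        // Running tasks keep writing locally; nothing is uploaded yet.
        return false;
    }
}
```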

How to read logs

Since the task log is stored both on the worker's local disk and in remote storage, the api-server needs a reading strategy when it fetches the log of a given task instance.

Airflow first tries to read the remotely stored logs, and falls back to the local logs if that fails. I prefer the opposite: try the local log first, and read the remote log only if the local log file does not exist, since this reduces network bandwidth consumption.

We could discuss this further.
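The local-first strategy could look like the sketch below. The class and its names are assumptions for illustration; the remote fetch is abstracted behind a `Function` so the snippet stays self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Function;

// Hypothetical api-server read path: serve the worker-local file when it
// exists, and fall back to remote storage only when it does not, saving
// network bandwidth in the common case.
public class TaskLogReader {
    private final Function<String, String> remoteFetcher; // stands in for the remote storage client

    public TaskLogReader(Function<String, String> remoteFetcher) {
        this.remoteFetcher = remoteFetcher;
    }

    // Local-first: read the local file if present, else fetch remotely.
    public String readLog(Path localPath, String objectKey) throws IOException {
        if (Files.exists(localPath)) {
            return Files.readString(localPath);
        }
        return remoteFetcher.apply(objectKey);
    }

    public static void main(String[] args) throws IOException {
        TaskLogReader reader = new TaskLogReader(key -> "remote log for " + key);
        // This path does not exist locally, so the remote fallback is used.
        System.out.println(reader.readLog(Path.of("/nonexistent/task.log"), "logs/task-1.log"));
    }
}
```

Swapping the order of the two branches would give the Airflow behavior, so the strategy itself could even be made configurable.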

Log retention strategy

For example, a maximum capacity for remote storage can be configured, and old logs can be deleted on a rolling basis.
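A capacity-based rolling policy could be sketched as below. This is an illustration of the idea only, not the proposed implementation; names and structure are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical rolling retention: once total stored size exceeds maxBytes,
// the oldest uploaded logs are evicted first.
public class RollingRetention {
    private record Entry(String key, long bytes) {}

    private final long maxBytes;
    private long usedBytes;
    private final Deque<Entry> oldestFirst = new ArrayDeque<>();

    public RollingRetention(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Record a newly uploaded log; returns the keys evicted to stay under the cap.
    public List<String> onUpload(String key, long bytes) {
        oldestFirst.addLast(new Entry(key, bytes));
        usedBytes += bytes;
        List<String> evicted = new ArrayList<>();
        // Keep at least the newest entry, evicting oldest logs while over capacity.
        while (usedBytes > maxBytes && oldestFirst.size() > 1) {
            Entry oldest = oldestFirst.removeFirst();
            usedBytes -= oldest.bytes();
            evicted.add(oldest.key()); // a real handler would also delete the remote object
        }
        return evicted;
    }

    public static void main(String[] args) {
        RollingRetention retention = new RollingRetention(100);
        System.out.println(retention.onUpload("task-1.log", 60)); // prints "[]"
        System.out.println(retention.onUpload("task-2.log", 60)); // prints "[task-1.log]"
    }
}
```

Time-based retention (delete logs older than N days) would be an equally valid policy and could coexist with the capacity cap.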

Sub-tasks

Ref

Any comments or suggestions are welcome.

Use case

Discussed above.

Related issues

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct
