Scheduler cpu usage. #512

@csepulveda

Chart Version

8.5.3

Kubernetes Version

Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.5", GitCommit:"aea7bbadd2fc0cd689de94a54e5b7b758869d691", GitTreeState:"clean", BuildDate:"2021-09-15T21:10:45Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"darwin/arm64"}
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.11-eks-f17b81", GitCommit:"f17b810c9e5a82200d28b6210b458497ddfcf31b", GitTreeState:"clean", BuildDate:"2021-10-15T21:46:21Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}

Helm Version

version.BuildInfo{Version:"v3.7.2", GitCommit:"663a896f4a815053445eec4153677ddc24a0a361", GitTreeState:"clean", GoVersion:"go1.17.3"}

Description

Hello guys.
I don't know if this behaviour is normal, but the Airflow scheduler uses a lot of CPU.
I run on EKS with m5n.large instances, and the scheduler uses more than 30% of CPU without any DAGs. (I thought it could be caused by my DAGs, but I removed all of them and the CPU usage is almost the same.)

kubectl top pod -n airflow --sort-by=cpu --use-protocol-buffers=true
NAME                                      CPU(cores)   MEMORY(bytes)   
airflow-scheduler-5d5998697c-8vv7x        228m         194Mi           
airflow-pgbouncer-86578bd5b6-vfjgn        10m          11Mi            
airflow-flower-75c46d996f-n4pk7           9m           129Mi           
airflow-postgresql-0                      8m           39Mi            
airflow-redis-master-0                    8m           6Mi             
airflow-web-5cc6d98f65-9pkj6              5m           1026Mi          
airflow-worker-0                          3m           973Mi           
airflow-db-migrations-59fd758646-8qsrv    1m           218Mi           
airflow-sync-users-57989db584-2gn4h       1m           219Mi           
airflow-sync-variables-695d855c88-slvf4   1m           91Mi 
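For comparison with the percentage quoted above, the millicore reading can be converted to a share of the node. This is only a rough sketch: the helper name `millicores_to_percent` is made up for illustration, and the figure of 2 vCPUs for an m5n.large is an assumption that should be checked against the actual node capacity.

```python
def millicores_to_percent(reading: str, vcpus: int) -> float:
    """Convert a kubectl CPU reading such as '228m' to a percentage of `vcpus` cores."""
    if reading.endswith("m"):
        millicores = float(reading[:-1])   # '228m' means 228 thousandths of a core
    else:
        millicores = float(reading) * 1000  # a bare number means whole cores
    return 100.0 * millicores / (vcpus * 1000)


# Assumption: an m5n.large exposes 2 vCPUs.
share_of_node = millicores_to_percent("228m", vcpus=2)
share_of_one_core = millicores_to_percent("228m", vcpus=1)
print(f"{share_of_node:.1f}% of the node, {share_of_one_core:.1f}% of one core")
```

Under that assumption the 228m reading is about 11% of the node and about 23% of a single core, so the 30% figure may come from a different measurement window or tool.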

Relevant Logs

  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
[2022-01-27 14:10:19,518] {scheduler_job.py:662} INFO - Starting the scheduler
[2022-01-27 14:10:19,518] {scheduler_job.py:667} INFO - Processing each file at most -1 times
[2022-01-27 14:10:20,606] {manager.py:254} INFO - Launched DagFileProcessorManager with pid: 50
[2022-01-27 14:10:20,608] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs
[2022-01-27 14:15:20,729] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs
[2022-01-27 14:20:20,865] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs
[2022-01-27 14:25:20,989] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs
[2022-01-27 14:30:21,126] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs
[2022-01-27 14:35:21,244] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs
[2022-01-27 14:40:21,367] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs
[2022-01-27 14:45:21,480] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs
[2022-01-27 14:50:21,587] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs
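The only recurring message in this log is the orphaned-task reset. A small sketch like the one below can confirm how regular it is by measuring the gap between consecutive lines. The helper `parse_timestamp` is hypothetical, and only the first three matching lines from the log above are used as sample input.

```python
from datetime import datetime

# Sample input: the first three "Resetting orphaned tasks" lines from the log above.
LOG_LINES = [
    "[2022-01-27 14:10:20,608] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs",
    "[2022-01-27 14:15:20,729] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs",
    "[2022-01-27 14:20:20,865] {scheduler_job.py:1217} INFO - Resetting orphaned tasks for active dag runs",
]


def parse_timestamp(line: str) -> datetime:
    """Extract the leading '[YYYY-MM-DD HH:MM:SS,mmm]' timestamp from a log line."""
    stamp = line[1:line.index("]")]
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")


timestamps = [parse_timestamp(line) for line in LOG_LINES]
# Seconds elapsed between each pair of consecutive log lines.
intervals = [
    (later - earlier).total_seconds()
    for earlier, later in zip(timestamps, timestamps[1:])
]
print(intervals)
```

The gaps come out at roughly 300 seconds each, which looks like a fixed periodic check rather than load-driven activity. Nothing else is logged at INFO level, so the log alone does not show where the CPU time goes.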

Custom Helm Values

airflow:

  config:
    AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL: 120
    AIRFLOW__SCHEDULER__PROCESSOR_POLL_INTERVAL: 10
    AIRFLOW__SCHEDULER__PARSING_PROCESSES: 4
    AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL: 300
    AIRFLOW__SCHEDULER__SCHEDULER_HEARTBEAT_SEC: 10


dags:
  gitSync:
    enabled: true
    repo: "git@gitlab.com:xxxxx/data_team/etl/airflow.git"
    branch: "xxxx"
    sshSecret: "airflow-ssh-git-secret"
    sshSecretKey: "id_rsa"
    sshKnownHosts: |-
      gitlab.com ssh-rsa xxxxx

webserverSecretKey: xxxxx
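The keys under `airflow.config` in the values above follow Airflow's `AIRFLOW__{SECTION}__{KEY}` environment-variable convention, so each one overrides a setting in the `[scheduler]` section of `airflow.cfg`. As a minimal sketch of that mapping (the function name `split_airflow_env_var` is hypothetical, not part of Airflow or the chart):

```python
def split_airflow_env_var(name: str) -> tuple[str, str]:
    """Split 'AIRFLOW__SECTION__KEY' into the (section, key) pair it overrides."""
    prefix = "AIRFLOW__"
    if not name.startswith(prefix):
        raise ValueError(f"not an Airflow config variable: {name!r}")
    # The section and key are separated by the first double underscore after the prefix.
    section, _, key = name[len(prefix):].partition("__")
    if not section or not key:
        raise ValueError(f"expected AIRFLOW__SECTION__KEY, got {name!r}")
    return section.lower(), key.lower()


# The scheduler settings from the custom values in this issue.
overrides = {
    "AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL": 120,
    "AIRFLOW__SCHEDULER__PROCESSOR_POLL_INTERVAL": 10,
    "AIRFLOW__SCHEDULER__PARSING_PROCESSES": 4,
    "AIRFLOW__SCHEDULER__DAG_DIR_LIST_INTERVAL": 300,
    "AIRFLOW__SCHEDULER__SCHEDULER_HEARTBEAT_SEC": 10,
}

for env_name, value in overrides.items():
    section, key = split_airflow_env_var(env_name)
    print(f"[{section}] {key} = {value}")
```

Printing the settings this way makes it easier to compare them with the defaults in the Airflow configuration reference for the version the chart deploys.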

Labels: kind/bug (things not working properly)
