Skip to content

Conversation

@bitsofdave
Copy link
Contributor

@bitsofdave bitsofdave commented Feb 25, 2022

What issues does your PR fix?

The double-setting of CONNECTION_CHECK_MAX_COUNT was preventing the airflow scheduler from adopting pods due to k8s thinking the env var value was changing when the pod was being patched:

[2022-02-24 22:24:48,385] {kubernetes_executor.py:665} INFO - attempting to adopt pod podname.69b7727977014e948bc0b31cb2946803
[2022-02-24 22:24:48,399] {kubernetes_executor.py:683} INFO - Failed to adopt pod podname.69b7727977014e948bc0b31cb2946803. Reason: (422)
Reason: Unprocessable Entity
... is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations`

The change in question that is preventing the patch request:

Name:      \"CONNECTION_CHECK_MAX_COUNT\",
Value:     \"0\",
Value:     \"20\",
ValueFrom: nil

What does your PR do?

  • Ensures that the kubernetesExecutor pod_template only has one CONNECTION_CHECK_MAX_COUNT in its env:
    • NOTE: the default CONNECTION_CHECK_MAX_COUNT is still 20 (when airflow.legacyCommands=false)

Checklist

For all Pull Requests

For releasing ONLY

@bitsofdave bitsofdave force-pushed the fix_double_setting_connection branch from a0b0c7b to 8d0d00f Compare February 25, 2022 16:49
@avinashparchuri
Copy link

Thanks @bitsofdave for the PR - we are running into the same issue with our deployment. @thesuperzapper would it be possible to get this in the next release?

@MarkusTeufelberger
Copy link

Same here, this breaks pod adoption by the scheduler. Please merge this and ideally do a release as well.

@SoumyadipAuddy
Copy link

Even we faced the same problem where the scheduler cannot adopt the pod after the scheduler restarts. Please merge this ⚡

@thesuperzapper thesuperzapper added this to the airflow-8.6.0 milestone Mar 21, 2022
@thesuperzapper thesuperzapper force-pushed the fix_double_setting_connection branch from 8d0d00f to 5043e60 Compare March 21, 2022 07:57
Copy link
Member

@thesuperzapper thesuperzapper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bitsofdave thanks for the PR, I have rebased and made some small changes (so we can merge and release ASAP).

cc @avinashparchuri @MarkusTeufelberger @SoumyadipAuddy

@thesuperzapper thesuperzapper removed the request for review from gsemet March 21, 2022 08:02
@thesuperzapper thesuperzapper added the status/ready-to-merge status - this will be merged into next release label Mar 22, 2022
@SoumyadipAuddy
Copy link

@thesuperzapper by when this code can be expected to be merged and be present as a separate release 🤔

@thesuperzapper
Copy link
Member

@SoumyadipAuddy this change will be in version 8.6.0 of the chart.

I am currently working through the remaining issues for 8.6.0 so I can cut a release sometime before Friday.

See the milestone for 8.6.0 to see what remains.

@MarkusTeufelberger
Copy link

Personally, I really don't care about any feature that's planned for 8.6.0 at the moment other than this single fix. As this is a clear bug, not a new feature, please either cut a patch release or at least merge it to main so I can stop vendoring the chart and pin my pipelines back onto this repository again.

@thesuperzapper thesuperzapper changed the title fix: set CONNECTION_CHECK_MAX_COUNT once fix: only set CONNECTION_CHECK_MAX_COUNT once Mar 31, 2022
@thesuperzapper thesuperzapper merged commit 3d6f944 into airflow-helm:main Apr 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status/ready-to-merge status - this will be merged into next release

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Double setting of CONNECTION_CHECK_MAX_COUNT preventing pod adoption

5 participants