Add Kubernetes cleanup-pods CLI command for Helm Chart #11802
Conversation
closes: apache#11146 Adds a CLI command `airflow kubernetes cleanup-pods --namespace NAMESPACE` to clean up Kubernetes pods in evicted/failed/succeeded states for the Helm Chart.
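For context, a minimal sketch of the cleanup logic such a command could run, using the standard `kubernetes` Python client. The in-cluster config loading and the exact phase/reason checks are assumptions here, not necessarily the implementation in this PR:

```python
from kubernetes import client, config

def cleanup_pods(namespace):
    """Delete pods left in evicted/failed/succeeded states in the given namespace."""
    config.load_incluster_config()  # assumes the command runs inside the cluster
    core_v1 = client.CoreV1Api()
    print(f'Listing pods in namespace {namespace}')
    pod_list = core_v1.list_namespaced_pod(namespace)
    for pod in pod_list.items:
        phase = pod.status.phase
        reason = pod.status.reason
        # Completed (Succeeded/Failed) pods and pods evicted by the kubelet are safe to remove
        if phase in ('Succeeded', 'Failed') or reason == 'Evicted':
            print(f'Deleting pod {pod.metadata.name}')
            core_v1.delete_namespaced_pod(pod.metadata.name, namespace)
```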
CI is green @mik-laj
config.load_incluster_config()
core_v1 = client.CoreV1Api()
print(f'Listing pods in namespace {namespace}')
pod_list = core_v1.list_namespaced_pod(namespace)
On large clusters, retrieving the collection of some resource types may result in very large responses that can impact the server and client. For instance, a cluster may have tens of thousands of pods, each of which is 1-2kb of encoded JSON. Retrieving all pods across all namespaces may result in a very large response (10-20MB) and consume a large amount of server resources.
Should we use chunking?
This is only for a specific namespace, but yes chunking would be good - do you have any examples to hand?
continue_token = None
while True:
    pod_list = core_v1.list_namespaced_pod(namespace, limit=500, _continue=continue_token)
    for pod in pod_list.items:
        # Process item
        ...
    continue_token = pod_list.metadata._continue
    if not continue_token:
        break

Not tested, but should work.
Done in 5e4a078
Are we planning to backport this change to Airflow 1.10? If not, can you add documentation noting that this option is not supported in Airflow 1.10? We maintain compatibility with Airflow 1.10 and 2.0, so when this is not possible, we should have documentation to inform the user about these limitations.
imagePullPolicy: {{ .Values.images.airflow.pullPolicy }}
# Don't use entry point here, we don't need to wait on pg-bouncer etc being available.
command: ["airflow-cleanup-pods", "--namespace={{ .Release.Namespace }}"]
args: ["kubernetes", "cleanup-pods", "--namespace={{ .Release.Namespace }}"]
One comment left: #11802 (comment)
Yes, I am planning to backport this; I will add the 1.10.13 milestone.
The PR should be OK to be merged with just a subset of tests as it does not modify the core of Airflow. The committers might merge it, or can add the label 'full tests needed' and re-run it to run all tests if they see it is needed!
- BugFix: Tasks with ``depends_on_past`` or ``task_concurrency`` are stuck (apache#12663)
- Fix issue with empty Resources in executor_config (apache#12633)
- Fix: Deprecated config ``force_log_out_after`` was not used (apache#12661)
- Fix empty asctime field in JSON formatted logs (apache#10515)
- [AIRFLOW-2809] Fix security issue regarding Flask SECRET_KEY (apache#3651)
- [AIRFLOW-2884] Fix Flask SECRET_KEY security issue in www_rbac (apache#3729)
- [AIRFLOW-2886] Generate random Flask SECRET_KEY in default config (apache#3738)
- Add missing comma in setup.py (apache#12790)
- Bugfix: Unable to import Airflow plugins on Python 3.8 (apache#12859)
- Fix setup.py missing comma in ``setup_requires`` (apache#12880)
- Don't emit first_task_scheduling_delay metric for only-once dags (apache#12835)
- Update setup.py to get non-conflicting set of dependencies (apache#12636)
- Rename ``[scheduler] max_threads`` to ``[scheduler] parsing_processes`` (apache#12605)
- Add metric for scheduling delay between first run task & expected start time (apache#9544)
- Add new-style 2.0 command names for Airflow 1.10.x (apache#12725)
- Add Kubernetes cleanup-pods CLI command for Helm Chart (apache#11802)
- Don't let webserver run with dangerous config (apache#12747)
- Replace pkg_resources with importlib.metadata to avoid VersionConflict errors (apache#12694)
- Clarified information about supported Databases
closes: apache#11146 (cherry picked from commit 980c725)
def _delete_pod(name, namespace):
    """Helper Function for cleanup_pods"""
    core_v1 = client.CoreV1Api()
Should we use get_kube_client here?
Yup we should
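A rough sketch of that change, assuming `get_kube_client` lives in `airflow.kubernetes.kube_client` and returns a configured `CoreV1Api` instance (the import path and call signature here are assumptions, not taken from this PR):

```python
from kubernetes import client
from airflow.kubernetes.kube_client import get_kube_client  # assumed import path

def _delete_pod(name, namespace):
    """Helper function for cleanup_pods: delete a single pod by name."""
    # Reuse Airflow's configured Kubernetes client instead of building a fresh CoreV1Api
    kube_client = get_kube_client()
    delete_options = client.V1DeleteOptions()
    print(f'Deleting pod "{name}" in namespace "{namespace}"')
    kube_client.delete_namespaced_pod(name=name, namespace=namespace, body=delete_options)
```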
closes: #11146
Adds a CLI command `airflow kubernetes cleanup-pods --namespace NAMESPACE` to clean up Kubernetes pods in evicted/failed/succeeded states for the Helm Chart.
cc @alikhtag
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.