-
Notifications
You must be signed in to change notification settings - Fork 16.3k
Add CeleryKubernetesExecutor #10654
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CeleryKubernetesExecutor #10654
Conversation
|
Hi @kaxil and @dimberman could you please take a look at this PR? I think I also need another pr to indicate which methods are private in the |
|
@potiuk Worth resurrecting your "multi-executor" PR from earlier this year to make it more generic, or do you think this is Good Enough? |
I think Multi-executor was too-generic. There are hardly use cases for a lot of combinations of the executors and the increased complexity of the configuration and potential side effects and issues if someone tries to combine for example Sequential Executor with KubernetesExecutor are well.... scary. Also, it gives a very good message IMHO -> if you are in Production use Celery and Kubernetes executor. Full Stop. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be configurable? What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree the default should be "kubernetes" though.
|
+1 for this approach instead of "generic-multi-executor". @pingzh can you add please some documentation under |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| CeleryKubernetesExecutor consists of celery_executor and kubernetes_executor. | |
| CeleryKubernetesExecutor consists of CeleryExecutor and KubernetesExecutor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| When the queue is `kubernetes`, kubernetes_executor is selected to run the task, | |
| otherwise, celery_executor is used. | |
| When the queue is `kubernetes`, KubernetesExecutor is selected to run the task, | |
| otherwise, CeleryExecutor is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I have to do this 😄 I was thinking about this yesterday but was not sure... Do you think that this name is good? On one hand, it's explicit on the other... I can imagine questions like "Is this for Celery on Kubernetes?", "Can I use CeleryExecutor on kubernetes? Or do I have to use this one?"
Do I have any better ideas? CeleryAndKubernetesExecutor? Is this better? I don't know just want to hear what others think 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point about the naming. i like it, CeleryAndKubernetesExecutor which indicates celery and kubernetes are both used in this class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No strong opinion on the naming of the class, I am fine with either since we have the first line of the docstring explaining it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@turbaszek @pingzh Maybe we can name it the "MultiExecutor" or "JoinedExecutor"? I could see a case where in the future we route even more executors into this (like Dask)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dimberman thanks for the suggestion. I think we can go further to have a BaseComposeddExecutor, which serves as the base class for any composed executor. The only two methods for the subclass to override is the __init__ and also the _route method.
However, I am thinking of whether we over-optimize it in this case. I don't want to have a too generic name at the beginning. We can come back to refactor the logic when we have another use case like this.
it consists of CeleryExecutor and KubernetesExecutor, which allows users to route their tasks to either Kubernetes or Celery based on the queue defined on a task
|
Strange, i started to get this static check error after I rebased the master: |
Please rebase it again. We had a MyPy error kicking in and it was fixed in #10879 |
|
(I re-forked the apache/airflow repo, mine was the incubator-airflow) the new PR is: #10901 |
|
@pingzh should we close this one? |
Let us close this one. Sorry about this as it caused the comments not showing up in the new PR. |
it consists of CeleryExecutor and KubernetesExecutor, which allows users
to route their tasks to either Kubernetes or Celery based on the queue
defined on a task
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.