-
Notifications
You must be signed in to change notification settings - Fork 16.3k
[AIRFLOW-4835] Refactor render_template #5461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #5461 +/- ##
==========================================
- Coverage 79.13% 79.13% -0.01%
==========================================
Files 488 488
Lines 30608 30601 -7
==========================================
- Hits 24222 24215 -7
Misses 6386 6386
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #5461 +/- ##
==========================================
- Coverage 79.04% 79.04% -0.01%
==========================================
Files 489 489
Lines 30685 30678 -7
==========================================
- Hits 24256 24249 -7
Misses 6429 6429
Continue to review full report at Codecov.
|
turbaszek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you can remove refactored models from pylint TODO list?
@nuclearpinguin Maybe I don't understand correctly, but this only refactors the template rendering function, not the entire models.taskinstance script. |
|
I just wanted to suggest that maybe after your refactor the files could be remove from the todo list. If there's only few errors left then maybe it's worth to refactor them too? 🙂 |
|
Side comment @nuclearpinguin @BasPH -> I think we should adapt our approach for pylint changes a bit after first experiences with it. This (and other) cases shows that we rarely naturally fix all the pylint problems per-file. While it worked fine for GCP operators/tests, I think the approach proposed by @mik-laj works a bit nicer: #5587 (comment), however maybe we can modify it slightly. I like the idea that you won't fix it file-by-file but warning-by-warning. It sounds much more natural. But I do not like the idea that this means that files will not disappear from the todo list as quickly as we would like them to, and will not prevent people from adding new pylint errors in the meantime to files that are already pretty much cleared. @mik-laj already has a script that produces updated pylint_todo.txt so we can regularly re-run and update it (it does not have to be always when we fix some errors - it can be independent and regular). I think what might make sense is to combine this approach - maybe we could extend the script that will run pylint on the whole codebase and produce updated pylint_todo.txt list + list of candidates that can be cleaned really quickly (basically with a few pylint problems left). This way our natural process of fixing pylint problems would look as follows:
What do you think @BasPH @mik-laj @nuclearpinguin ? @mik-laj - I know you have some tools/scripts that you use to automate some of that work (like creating JIRA issues etc). Maybe you could create a PR with those tools that others (us?) could use as well and we could setup a small 'process' for that. We are down to 4000 errors or so from 5000 but I feel that if we establish this kind of small process, we can invite others to help with the effort and get it done much more quickly. Especially we could encourage new contributors of Airflow to help with that if we give them easy to use process/tools to do it. |
|
First part of discussion: #5587 |
|
Commented in #5587. Any remarks on this PR? |
ashb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small changes, but much much nicer overall!
airflow/models/taskinstance.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we are refactoring this: How about we move all this logic in to a function in BaseOperator -- it all seems to operate on self.task.X which is probably a good indicator that's where this belongs. (We should keep this for back-compat though?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, will move around some code tonight.
|
@ashb I've addressed all your comments:
|
ashb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like the tests now, much easier to follow
A few small changes I just noticed this time around. One bug too I think
airflow/models/baseoperator.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to create a new jinja env for each template field - we probably want to reuse a jinja env for all fields don't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow that's pretty serious, I missed this one when moving code to the BaseOperator. Thanks for pointing it out.
airflow/models/baseoperator.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| return self.dag.get_template_env() if hasattr(self, 'dag') else jinja2.Environment(cache_size=0) | |
| return self.dag.get_template_env() if self.has_dag() else jinja2.Environment(cache_size=0) |
Especially given the attribute is _dag :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
airflow/models/baseoperator.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't need to do this -- as dag.get_template_env() will handle this for us (once we fix the bug there).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed 👍
airflow/models/baseoperator.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think we can drop this if and just iterate over the self.template_fields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, the BaseOperator holds an empty list so should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth changing these sort of uses in the tests to op.render_template_fields({'ds': DATE}) now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, no point in calling render_tempate anymore directly because it's now more of an internal method being called by render_template_fields.
tests/models/test_dag.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ({"user_defined_filters": {"hello": lambda name: 'Hello %s' % name}}, "{{ 'world' | hello }}", {}, "Hello world"), # noqa: E501, pylint: disable=line-too-long | |
| ({"user_defined_filters": {"hello": lambda name: 'Hello %s' % name}}, | |
| "{{ 'world' | hello }}", | |
| {}, | |
| "Hello world"), |
perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we moved the templating logic to the BaseOperator, I also moved the tests for the respective methods with it. There was no test_baseoperator.py yet so all code in there is now nicely formatted with Black.
tests/models/test_dag.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look at all these deletions 😍
|
@ashb addressed your comments. I've created I think a few bits and pieces could still improve (e.g. |
|
Yeah I think most of the tests that should be in test_baseoperator are currently in test_taskinstance or core. We can move them later |
|
I can't easily check on my phone but it looks good. Will check on a bigger screen tonight |
|
I resolved @bjoernpollex-sc's comment and while I was at it, added support for templating sets, since it was such a small change. |
|
@ashb any other remarks on this PR, or is it good to go? |
|
@bazph I took a week off, will look again in the morning |
- Refactors `BaseOperator.render_template()` and removes `render_template_from_field()`. The functionality could be greatly simplified into a single `render_template()` function. - Improves performance by removing two `hasattr` calls and avoiding recreating Jinja environments. - Removes the argument `attr` to `render_template()` which wasn't used. - Squashes multiple similar tests into two parameterized tests. - Adheres to 110 line length. - Adds support for templating sets. - Adds Pydoc. - Adds typing. (cherry picked from commit 47dd4c9)
- Refactors `BaseOperator.render_template()` and removes `render_template_from_field()`. The functionality could be greatly simplified into a single `render_template()` function. - Improves performance by removing two `hasattr` calls and avoiding recreating Jinja environments. - Removes the argument `attr` to `render_template()` which wasn't used. - Squashes multiple similar tests into two parameterized tests. - Adheres to 110 line length. - Adds support for templating sets. - Adds Pydoc. - Adds typing. (cherry picked from commit 47dd4c9)
Make sure you have checked all steps below.
Jira
Description
I got started on making airflow/models Pylint compatible, and bumped into the render_template function(s) which were real hard to understand at first sight. Because of the large change required to improve it, I decided to split it off in a separate PR.
This PR:
BaseOperator.render_template()and removesrender_template_from_field(). The functionality could be greatly simplified into a singlerender_template()function.hasattrcalls and avoiding recreating Jinja environments.attrtorender_template()which wasn't used.Tests
No unit tests added, but multiple similar tests merged into a single, parameterized, test.
Commits
Documentation
Code Quality
flake8