Skip to content

Conversation

@hoesler
Copy link
Contributor

@hoesler hoesler commented Nov 27, 2019

Make sure you have checked all steps below.

Jira

Description

  • Here are some details about my PR, including screenshots of any UI changes:
    runtime var is now a duration object, not a float

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to a appropriate release

@codecov-io
Copy link

codecov-io commented Nov 27, 2019

Codecov Report

Merging #6682 into master will decrease coverage by 0.53%.
The diff coverage is 66.66%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #6682      +/-   ##
==========================================
- Coverage   83.82%   83.29%   -0.54%     
==========================================
  Files         668      668              
  Lines       37543    37542       -1     
==========================================
- Hits        31472    31270     -202     
- Misses       6071     6272     +201
Impacted Files Coverage Δ
airflow/utils/dag_processing.py 55.18% <0%> (-2.97%) ⬇️
airflow/models/dagbag.py 86.47% <100%> (-0.06%) ⬇️
airflow/models/dagrun.py 96.58% <100%> (ø) ⬆️
airflow/operators/postgres_operator.py 0% <0%> (-100%) ⬇️
airflow/kubernetes/volume_mount.py 44.44% <0%> (-55.56%) ⬇️
airflow/executors/sequential_executor.py 47.61% <0%> (-52.39%) ⬇️
airflow/kubernetes/volume.py 52.94% <0%> (-47.06%) ⬇️
airflow/kubernetes/pod_launcher.py 45.25% <0%> (-46.72%) ⬇️
airflow/kubernetes/refresh_config.py 50.98% <0%> (-23.53%) ⬇️
...rflow/contrib/operators/kubernetes_pod_operator.py 78.2% <0%> (-20.52%) ⬇️
... and 10 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e82008d...f2ac8cb. Read the comment docs.

Copy link
Member

@XD-DENG XD-DENG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a nice catch.
I cross-checked the code & documentation, and LGTM.
Thanks for your contribution!

@XD-DENG
Copy link
Member

XD-DENG commented Nov 28, 2019

@hoesler Thanks again for the PR! LGTM and I would be more than happy to merge it to master.

Before I merge, may we request a favor from you: may you check if there is similar/same issue for other statsd-related code? If yes, we can address them together in this PR.

I will also do the same check from my side. After that we are good to go.

@XD-DENG
Copy link
Member

XD-DENG commented Nov 28, 2019

Hi @hoesler ,
I found one more:

duration = (timezone.utcnow() - start_dttm).total_seconds() * 1000
Stats.timing("dagrun.dependency-check.{}".format(self.dag_id), duration)

.total_seconds() * 1000 can be removed to avoid unnecessary back & forth conversion.

May you address this together in this PR? After that I can merge immediately.

All other related lines in the code base seems ok.

Cheers!

@hoesler
Copy link
Contributor Author

hoesler commented Nov 28, 2019

Sure, I will add that

@hoesler
Copy link
Contributor Author

hoesler commented Nov 28, 2019

found one more

Copy link
Member

@XD-DENG XD-DENG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the follow-ups @hoesler .
The CI is failing though after the further changes. Let's figure it out first.


td = timezone.utcnow() - ts
td = td.total_seconds() + (
float(td.microseconds) / 1000000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the error I found in the CI log is below

1) ERROR: test_cli_list_dags (tests.cli.commands.test_dag_command.TestCliDags)
----------------------------------------------------------------------
   Traceback (most recent call last):
    tests/cli/commands/test_dag_command.py line 361 in test_cli_list_dags
      dag_command.dag_list_dags(args)
    airflow/utils/cli.py line 80 in wrapper
      return f(*args, **kwargs)
    airflow/cli/commands/dag_command.py line 231 in dag_list_dags
      print(dagbag.dagbag_report())
    airflow/models/dagbag.py line 480 in dagbag_report
      duration=sum([o.duration for o in stats]),
   TypeError: unsupported operand type(s) for +: 'int' and 'datetime.timedelta'

It should be due to this change in airflow/models/dagbag.py

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this specific file/change, in order to avoid breaking other logics in other files, I would like to suggest to change in Stats.timing() line instead.

I.e., revert your change in line 422, then change

filename = file_stat.file.split('/')[-1].replace('.py', '')
Stats.timing('dag.loading-duration.{}'.
             format(filename),
             format(filename),
             file_stat.duration)

into

filename = file_stat.file.split('/')[-1].replace('.py', '')
Stats.timing('dag.loading-duration.{}'.
             format(filename),
             format(filename),
             timedelta(seconds=file_stat.duration))

(timedelta needs to be imported)

Let me know your opinion? Cheers

@hoesler
Copy link
Contributor Author

hoesler commented Nov 29, 2019

The problem was the sum function, which is basically a reduce(array, +, 0). I had to change the initial value to timedelta()

@hoesler
Copy link
Contributor Author

hoesler commented Nov 29, 2019

All tests passed, except test_on_kill using sqlite and python 3.6. No idea how my code changes should be involved in this. Maybe a randomly race condition, as this test heavily depends on correct execution order.

@potiuk
Copy link
Member

potiuk commented Nov 29, 2019

test_on_kill is known to be flaky. In the upcoming pytest change it's fixed to be less flaky :).

@XD-DENG
Copy link
Member

XD-DENG commented Nov 29, 2019

Cool, thanks @hoesler .

To be safe, I restarted the failing test. Will merge once it turns green.

@XD-DENG XD-DENG merged commit df356b8 into apache:master Nov 29, 2019
@XD-DENG
Copy link
Member

XD-DENG commented Nov 29, 2019

Hi @hoesler , please note I updated the subject of your JIRA ticket to better describe the code change here, also for clearer change log.

galuszkak pushed a commit to FlyrInc/apache-airflow that referenced this pull request Mar 5, 2020
mjpieters added a commit to mjpieters/airflow that referenced this pull request Nov 21, 2020
kaxil pushed a commit that referenced this pull request Jan 22, 2021
* Backport pull request #6682 to v1-10
* Backport pull request #10629 to v1-10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants