Skip to content

Conversation

@kunalyogenshah
Copy link
Contributor

@kunalyogenshah kunalyogenshah commented Apr 23, 2019

Jira

Description

  • Here are some details about my PR, including screenshots of any UI changes:
    This PR adds a new button to the Airflow UI next to "View log" ("View log in ES"). It provides an external URL for visualizing the log, and renders the latest try for the Task, DAG and execution being looked up. This button is controlled by a new elasticsearch_frontend conf param, which provides the template URL for viewing the corresponding log. Screenshot below :

UI without frontend provided:
Screen Shot 2019-05-03 at 2 37 50 PM

This is the UI when a frontend is provided:
Screen Shot 2019-05-03 at 2 39 40 PM

Documentation update is pending code review. Will update documentation after.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Unit tested the UI and checked the console for errors in the JS. Button doesn't show up if the param is not defined.

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to a appropriate release

Code Quality

  • Passes flake8

@KevinYang21
Copy link
Member

This is nice, thank you @kunalyogenshah ! Users using Airflow with Elasticsearch backend do likely want to utilize more those aggregation features Elasticsearch frontends, e.g. Kibana, provide.

Before looking into the code, my opinion on the UI change is that we make the layout similar to how we do the log downloading--we allow users to choose the try number of the task instances. Do you think that make sense?

@feng-tao
Copy link
Member

@kunalyogenshah , not sure if this is the right way to show the UI. I think another approach(Lyft) is to leverage #5059 and #5094 which allows you to customize your task instance panel in generic way.

@feng-tao
Copy link
Member

feng-tao commented Apr 23, 2019

@kunalyogenshah
Copy link
Contributor Author

Ah. Thanks @feng-tao . This seems to be exactly similar to what we've been trying to do. Let me look into using extra links instead.

@kunalyogenshah
Copy link
Contributor Author

@feng-tao : A quick naive clarification question. Does this mean that the DAG operator itself must specify the link to Kibana as an extra link in its operator definition? Or is there a way to make that universal for all DAGs and tasks?

@milton0825
Copy link
Contributor

@kunalyogenshah I think this is what you are looking for: #5094
We can register an extra link that is global to all the operators.

@feng-tao
Copy link
Member

+1 for @milton0825 , cc @Fokko as well

@kunalyogenshah
Copy link
Contributor Author

Perfect. Thanks @milton0825 . I'll move my questions over to that PR for better context :)

@kunalyogenshah
Copy link
Contributor Author

Bringing this thread back up to share some alternate thoughts. I'd like to advocate for this approach specifically for just logs look up for the following reasons :

  • Using elasticsearch for now is quite well baked into the log handling, and Kibana is possibly the most common frontend for this. Using that as a frontend for visualizing / searching logs seems like quite a useful central feature to have as compared to a plugin. Semantically we'd use extra links plugin for features that are more specific to each individual Airflow DAG / Task / Provider.
  • This can be quite easily integrated and extended like the existing download logs / view logs, and seems on the same plane w.r.t. utility.
  • Also referring to the proposed metrics changes in this PR ([AIRFLOW-4360] Add performance metrics for Webserver UI. #5139), having a standalone endpoint for serving logs allows us to track metrics for just that endpoint better than a common extra links section. (like tracking views and failures, etc).

Thoughts?

@milton0825
Copy link
Contributor

@kunalyogenshah I am thinking about making extra_links aware of try-number. Once that is done, we can probably consolidate all the download logs and view logs and register them through extra_links. download logs and view logs will by default be there and can be disabled through airflow.cfg. What do you think?

I am fine with shipping this PR now and rework later to support what I proposed.

@kunalyogenshah kunalyogenshah force-pushed the kunal_add_es_logs branch 7 times, most recently from 13b3ea1 to bb9334a Compare May 7, 2019 19:50
@codecov-io
Copy link

codecov-io commented May 7, 2019

Codecov Report

Merging #5164 into master will decrease coverage by 0.02%.
The diff coverage is 50%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5164      +/-   ##
==========================================
- Coverage   78.67%   78.65%   -0.03%     
==========================================
  Files         470      470              
  Lines       30026    30044      +18     
==========================================
+ Hits        23623    23631       +8     
- Misses       6403     6413      +10
Impacted Files Coverage Δ
airflow/www/views.py 75.88% <50%> (-0.34%) ⬇️
airflow/models/taskinstance.py 92.42% <0%> (-0.18%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e187625...b599d8b. Read the comment docs.

Copy link
Member

@KevinYang21 KevinYang21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm, this has been tested on your local environment and you've confirmed it was working right? Do you mind share the test setup( e.g. the frontend url w/o sensitive info) please?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: We can have an example comment here to make it easier to adapt.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@kunalyogenshah
Copy link
Contributor Author

Yeah, I tested this by just changing the cfg file to include the frontend URL. The frontend URL I used is also included in the CFG file.

@kunalyogenshah
Copy link
Contributor Author

Tailored example URL from default cfg file. Caused parsing errors for the default file. For Kibana specifically, the format I used in testing was :

elasticsearch_frontend = <Domain>/_plugin/kibana/app/kibana#/discover?_g=()&_a=(filters:!((query:(match:(log_id:(query:'{log_id}',type:phrase))))))

Copy link
Member

@KevinYang21 KevinYang21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Baking it for 2 more days to allow other committers to review. Then I'll merge

@KevinYang21 KevinYang21 merged commit 52a9d4e into apache:master May 20, 2019
@mik-laj
Copy link
Member

mik-laj commented May 22, 2019

@kunalyogenshah This features is very interesting. Can you write documentation for this feature? The description in the configuration file is very little visible. I think you should add a note to the file: doc/howto/write-logs.rst.

@kunalyogenshah
Copy link
Contributor Author

Hey @mik-laj . I can definitely write documentation if needed. Will send a followup PR.

@kurtqq
Copy link
Contributor

kurtqq commented May 27, 2019

@kaxil can this PR cheery picked to 1.10.4?

@ashb
Copy link
Member

ashb commented Jun 6, 2019

I am thinking about making extra_links aware of try-number. Once that is done, we can probably consolidate all the download logs and view logs and register them through extra_links. download logs and view logs will by default be there and can be disabled through airflow.cfg. What do you think?

I am fine with shipping this PR now and rework later to support what I proposed.

@milton0825 Yes please, that would be great.

@ashb
Copy link
Member

ashb commented Jun 6, 2019

@KevinYang21 When you merge PRs can you double check the commit message correctly references the jira - the release scripts I use to make sure we get everything in to the release need it. In this case it was [AIRFLOW] rather than [AIRFLOW-4396] - you can edit the commit message in the github UI when merging. Thanks

ashb pushed a commit that referenced this pull request Jun 7, 2019
andriisoldatenko pushed a commit to andriisoldatenko/airflow that referenced this pull request Jul 26, 2019
wmorris75 pushed a commit to modmed-external/incubator-airflow that referenced this pull request Jul 29, 2019
dharamsk pushed a commit to postmates/airflow that referenced this pull request Aug 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants