Skip to content

[AIRFLOW-3971] Add Google Cloud Natural Language operators#4980

Merged
kaxil merged 1 commit intoapache:masterfrom
PolideaInternal:feature/gcp-natural-lanaguage-operators
Apr 5, 2019
Merged

[AIRFLOW-3971] Add Google Cloud Natural Language operators#4980
kaxil merged 1 commit intoapache:masterfrom
PolideaInternal:feature/gcp-natural-lanaguage-operators

Conversation

@mik-laj
Copy link
Copy Markdown
Member

@mik-laj mik-laj commented Mar 26, 2019

Make sure you have checked all steps below.

Jira

Description

Hello,

I would like to create operators for Google Natural Language:

  • CloudLanguageAnalyzeEntitiesOperator
  • CloudLanguageAnalyzeEntitySentimentOperator
  • CloudLanguageAnalyzeSentimentOperator
  • CloudLanguageClassifyTextOperator

Tests

  • Unit tests for hook
  • Unit tests for operators
  • System test

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • How-to Guide
  • Integratio.rst
  • API Reference

Code Quality

  • Passes flake8

@mik-laj mik-laj force-pushed the feature/gcp-natural-lanaguage-operators branch 2 times, most recently from 3873ac6 to b66ea5b Compare March 29, 2019 14:03
Comment thread airflow/contrib/hooks/gcp_natural_language_hook.py Outdated
Comment thread airflow/contrib/operators/gcp_natural_language_operator.py Outdated
Comment thread airflow/contrib/operators/gcp_natural_language_operator.py Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the full name of the API: Google Cloud Natural Language API.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do it.

Do you think it's worth updating the google-cloud-python library to use the correct name?
https://github.com/googleapis/google-cloud-python/blob/master/docs/index.rst

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, yes, I think that would make sense.

Comment thread docs/code.rst Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is just a rendering problem during review, but all of these autoclass:: entries don't seem to be rendering correctly for me here:
https://github.com/apache/airflow/blob/b66ea5b2a1dd87572e2ac5626fce9ea164bd8097/docs/code.rst

They just show up as text literals in a code-style text box.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github does not support additional directives provided by Sphinx. Ever since I made this PR, the format of the documentation has also changed.

Reference:
https://lists.apache.org/thread.html/7c4a303a9de81c9b49de2f47589c799db12ad4bcfd66530309a09dc1@%3Cdev.airflow.apache.org%3E

Comment thread docs/howto/operator/gcp/natural_language.rst Outdated
Comment thread docs/howto/operator/gcp/natural_language.rst Outdated
Comment thread docs/howto/operator/gcp/natural_language.rst Outdated
@mik-laj
Copy link
Copy Markdown
Member Author

mik-laj commented Mar 29, 2019

@wwlian If you are interested, I will prepare you preview of the built-in documentation in HTML format.

@mik-laj mik-laj force-pushed the feature/gcp-natural-lanaguage-operators branch 2 times, most recently from 979aa0c to 49b98ab Compare March 29, 2019 19:07
@mik-laj
Copy link
Copy Markdown
Member Author

mik-laj commented Mar 29, 2019

PTAL @kaxil I considered all the suggestions of the community.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Example Airflow DAG for Google Natural Language service
Example Airflow DAG for Google Cloud Natural Language service

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add gcp here... as other cloud providers might have similar services and can conflict in future?

Suggested change
# [START howto_operator_natural_language_analyze_entities]
# [START howto_gcp_operator_natural_language_analyze_entities]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And same in all the comments below

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conflict will never happen because the tags are always associated with the file. Prefixes are completely unnecessary. However, I will introduce your suggestion in a moment.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aah.. yes you right. :-) Probably I shouldn't be reviewing so late... brain-fade moment 😄

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to use 'howto_operator_gcp_natural_language_analyze_entities` to mark the file that contains these sections. I did not add the word "GCP" earlier because I thought it was a sufficient level of depth to easily find a file.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea agree.. Can you resolve conflicts with docs/conf.py

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I'm working on it now.

@mik-laj mik-laj force-pushed the feature/gcp-natural-lanaguage-operators branch from 49b98ab to 7d4f7ca Compare March 31, 2019 23:37
@mik-laj
Copy link
Copy Markdown
Member Author

mik-laj commented Mar 31, 2019

@kaxil I apply all your suggestion. In addition, I changed all literalinclude to exampleinclude, so full code of example is available now. :-)
Reference:
#4894

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pip install airflow[gcp]
pip install apache-airflow[gcp]

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I updated a PR.

Comment thread docs/exts/docroles.py Outdated
Comment thread setup.py Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update this to httplib2~=0.9.2, so that the major version update doesn't break Airflow.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I will do it.

@mik-laj
Copy link
Copy Markdown
Member Author

mik-laj commented Apr 1, 2019

@kaxil Do you have any comments about the documentation? In this PR I introduce a new structure of documentation, which will be a lot user-friendly. It contains more information, and information that already exists is presented in a different form. I would like the documentation to be understandable for a novice user. The previous form did not meet my expectations. While creating this documentation, I tried to enter all the suggestions that were reported to the previous documentation from my client. I will be grateful for your comments and opinions..

Preview of latest version is available: http://two-pain.surge.sh/howto/operator/gcp/natural_language.html

@mik-laj mik-laj force-pushed the feature/gcp-natural-lanaguage-operators branch from 7d4f7ca to 447de8a Compare April 1, 2019 22:12
Copy link
Copy Markdown
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. And the documentation looks quite detailed as well. Well done @mik-laj :)

@kaxil
Copy link
Copy Markdown
Member

kaxil commented Apr 4, 2019

I don't know why CI has not been triggered for this PR. 🤔

@mik-laj
Copy link
Copy Markdown
Member Author

mik-laj commented Apr 4, 2019

@kaxil I do not understand it either. I can run the task on my forks if it helps you.

@mik-laj mik-laj force-pushed the feature/gcp-natural-lanaguage-operators branch from 447de8a to 4868ad6 Compare April 4, 2019 09:45
@mik-laj
Copy link
Copy Markdown
Member Author

mik-laj commented Apr 4, 2019

A trigger a new build on my Travis. Status is available: PolideaInternal#74

@mik-laj mik-laj force-pushed the feature/gcp-natural-lanaguage-operators branch 6 times, most recently from 37db3a8 to 0e1f0d6 Compare April 4, 2019 21:26
@kaxil
Copy link
Copy Markdown
Member

kaxil commented Apr 4, 2019

the CI is failing @mik-laj

@mik-laj
Copy link
Copy Markdown
Member Author

mik-laj commented Apr 4, 2019

@kaxil Thanks. I working on it now.

@mik-laj mik-laj force-pushed the feature/gcp-natural-lanaguage-operators branch from 0e1f0d6 to 81b9ce2 Compare April 5, 2019 00:10
@codecov-io
Copy link
Copy Markdown

codecov-io commented Apr 5, 2019

Codecov Report

Merging #4980 into master will increase coverage by <.01%.
The diff coverage is 78.22%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4980      +/-   ##
==========================================
+ Coverage   76.23%   76.24%   +<.01%     
==========================================
  Files         466      469       +3     
  Lines       30101    30225     +124     
==========================================
+ Hits        22949    23045      +96     
- Misses       7152     7180      +28
Impacted Files Coverage Δ
airflow/contrib/hooks/gcp_api_base_hook.py 84.76% <ø> (ø) ⬆️
...ntrib/example_dags/example_gcp_natural_language.py 0% <0%> (ø)
...contrib/operators/gcp_natural_language_operator.py 100% <100%> (ø)
airflow/contrib/hooks/gcp_natural_language_hook.py 91.17% <91.17%> (ø)
airflow/models/__init__.py 92.95% <0%> (-0.05%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 463c183...d04c780. Read the comment docs.

@mik-laj
Copy link
Copy Markdown
Member Author

mik-laj commented Apr 5, 2019

@kaxil it's green now. I replace a method import_module with a method from autodoc sphinx extension (it's part of core) and add a mock support based on autodoc code.

@mik-laj mik-laj force-pushed the feature/gcp-natural-lanaguage-operators branch from 81b9ce2 to d04c780 Compare April 5, 2019 09:55
Copy link
Copy Markdown
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, LGTM

@kaxil kaxil merged commit 99c8a6f into apache:master Apr 5, 2019
cthenderson pushed a commit to cthenderson/apache-airflow that referenced this pull request Apr 16, 2019
andriisoldatenko pushed a commit to andriisoldatenko/airflow that referenced this pull request Jul 26, 2019
wmorris75 pushed a commit to modmed-external/incubator-airflow that referenced this pull request Jul 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants