[AIRFLOW-7014] Add Apache Kylin operator #9149
|
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
|
Force-pushed: 47d426a to 263a804
Is it used somewhere? What does it do? Could you add a comment?
It is used by the Kylin operator; I have moved it into the Kylin operator.
Wouldn't it be simpler if the user could pass a datetime object or ISO datetime?
This is consistent with the Kylin API, and some users may use Jinja templates, so a millisecond timestamp is more appropriate (kylinpy uses datetime, though, so the value has to be converted to a datetime).
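The two positions in this thread (millisecond timestamps for the Kylin API and Jinja templates, datetime objects for kylinpy) could in principle be bridged by a small normalization step. The following is only a minimal sketch, not the code in this PR: `to_datetime` is a hypothetical helper name, and treating naive datetimes as UTC is an assumption made for the example.

```python
from datetime import datetime, timezone
from typing import Union


def to_datetime(value: Union[int, float, str, datetime]) -> datetime:
    """Normalize a time argument to a timezone-aware UTC datetime.

    Hypothetical helper for illustration; not part of the operator in this PR.
    """
    if isinstance(value, datetime):
        # Assumption of this sketch: naive datetimes are interpreted as UTC.
        return value if value.tzinfo else value.replace(tzinfo=timezone.utc)
    if isinstance(value, str):
        text = value.strip()
        if text.isdigit():
            # A templated field renders to a string such as "1577836800000".
            value = int(text)
        else:
            # Otherwise expect an ISO 8601 string such as "2020-01-01T00:00:00".
            parsed = datetime.fromisoformat(text)
            return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
    # Remaining case: milliseconds since the Unix epoch.
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)
```

With a helper along these lines, a rendered Jinja value, a raw millisecond timestamp, an ISO string, and a datetime object would all end up as the datetime that kylinpy expects.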
Is it used somewhere?
It is not needed; I have removed it.
Is it used somewhere?
It is not needed; I have removed it.
Is it used somewhere?
It is not needed; I have removed it.
Force-pushed: 9f6736e to 50c7279
Force-pushed: 50c7279 to 40f6b3e
Force-pushed: 4c74b3d to 1b92bac
|
@mik-laj Hi, can you give me some advice on how to deal with the CI build errors? They are different from the ones in my local CI build, and I have no idea what causes them. |
You need to add your integration to index: |
setup.py
It's a duplicate. I think airflow.kylin is enough.
Thank you very much! Should the key be airflow.kylin or apache.kylin?
apache.kylin.
Hi mik-laj, the test step 'CI Build / Backport packages (pull_request)' still does not pass. Can you give me some suggestions? Thanks!
liuyonghengheng
left a comment
Thank you very much! I will test it right now. Should the key be airflow.kylin or apache.kylin?
Force-pushed: 1b92bac to ff23d46
|
I don't have an idea. @potiuk Can you help? |
|
I think it's the lack of a README.md file or the "CHANGES" file. This is the first time we are adding a new provider, so this is kind of expected. I will take a look and add a fixup shortly. |
|
@liuyonghengheng -> I pushed a fixup to see whether it works, but I will commit it to master regardless |
|
@potiuk Thank you very much. I guess the README file is generated automatically, but I don't know which scripts I should run. |
|
The related #9739 merged. @liuyonghengheng -> just rebase yours on top of master and remove my commits and you should be good to go! |
Force-pushed: a1059a9 to 3c7d6b1
|
@nichunen @wangrupeng @zhangayqian @hit-lacus I don't have experience with Apache Kylin. Will you find some time to check that this integration looks fine? |
Suggested change:
- class KylinOperator(BaseOperator):
+ class KylinSubmitRequestOperator(BaseOperator):
@shaofengshi @zhaoyongjie @nichunen @wangrupeng @zhangayqian @hit-lacus Do you have any suggestions? Thanks!
@mik-laj Hi Mik, thanks for reviewing the PR! This is Shaofeng Shi; I'm an Apache Kylin PMC member and committer. Yongheng is a member of my team, and this Airflow operator has been developed and deployed for a while. Regarding the naming convention, I suggest naming it "KylinCubeOperator", because the actions in it relate to cubes (not just submitting requests). I hope it can be accepted. Thanks!
+1. From my side, SubmitRequest is too general; KylinCubeOperator may be better because almost all requests interact with a cube.
I have renamed it to 'KylinCubeOperator'.
|
Also, it would be great (though it is not a hard requirement) if the example DAG were used to create a how-to guide for the Kylin operators. You can see quite a number of examples in the Google how-to guides. It's a great way to promote your operators; if you feel up to it, @liuyonghengheng, it is highly recommended. |
Typo, should be "job's".
Do not use Chinese punctuation characters; for example, use ' rather than ‘
build_streaming or merge_streaming?
Thank you very much Xiaoxiang, I'll fix it right away
|
+1 from my side |
Force-pushed: 01917ee to 3173537
Force-pushed: 3173537 to 4f499f2
|
Awesome work, congrats on your first merged pull request! |
|
Merged. We had a temporary problem with Python 3.8.4, released last night, breaking SQLAlchemy; that's why you got the errors. Thanks for your patience! |
|
@potiuk Thank you very much! |
|
Hi @potiuk, I want to use the command 'pip install apache-airflow-backport-providers-apache-kylin', but this package doesn't exist. Is this backport package generated automatically and uploaded to PyPI by CI? Thanks! |
|
Nope. We need to release the software officially, including PMC voting (as described in https://www.apache.org/foundation/voting.html#ReleaseVotes). I think we will have the next series of backport packages released in one to two weeks. |
|
OK, thank you very much. |