Add information about Amazon Elastic MapReduce Connection #26687
potiuk merged 7 commits into apache:main
Fighting with a system test at the moment, I'll check it out tomorrow (Tuesday) 👍
@ferruzzi much appreciated! Almost all of the changes relate to documentation or inform users about something that we previously did silently. Like if Also today I overwrite
Force-pushed from cfa6f5f to a3576f7
I am personally a big fan of transparency. 👍
o-nikolas left a comment
Thanks for the contribution, great stuff as always! Just some minor changes requested 👍
Cool. I will add this suggestion in the morning.
Force-pushed from 2136aa6 to a36e4de
Force-pushed from a36e4de to 5b72026
Force-pushed from 5b72026 to afe3a6e
mock_emr = None

@unittest.skipIf(mock_emr is None, 'moto package not present')
Nice! Love the conversion to pytest here.
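For readers unfamiliar with the conversion being praised, here is a minimal sketch of how a `unittest.skipIf` guard like the one above might look in pytest style. The class name `TestEmrHookSketch` and the `MOTO_AVAILABLE` flag are hypothetical and are not taken from the PR; the actual change may be structured differently.

```python
import importlib.util

import pytest

# True when the optional `moto` package can be imported in this environment.
# (Hypothetical flag; the PR itself may detect moto differently.)
MOTO_AVAILABLE = importlib.util.find_spec("moto") is not None


@pytest.mark.skipif(not MOTO_AVAILABLE, reason="moto package not present")
class TestEmrHookSketch:
    """Hypothetical test class: plain class, no unittest.TestCase base."""

    def test_placeholder(self):
        # A real test would start a moto mock and exercise the hook here.
        assert MOTO_AVAILABLE
```

With a plain class, pytest fixtures and `pytest.mark.parametrize` become available, which `unittest.TestCase` subclasses do not fully support.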
Yeah, I try to replace unittest with pytest; in most cases this does not require a huge effort.
However, there are still a lot of unittest-based tests:
# Total number of files which use `unittest.TestCase`
❯ grep -rl 'unittest.TestCase' ./tests | wc -l
528
# By tests packages
❯ grep -rl 'unittest.TestCase' ./tests | cut -d"/" -f3 | sort | uniq -c | sort -nr
379 providers
36 charts
23 utils
20 cli
10 ti_deps
10 api_connexion
6 sensors
6 operators
6 executors
6 core
5 www
5 always
4 models
3 api
2 task
2 kubernetes
1 security
1 plugins
1 macros
1 hooks
1 dag_processing
# By provider ("apache", "microsoft", "common" has subpackages which is separate provider)
❯ grep -rl 'unittest.TestCase' ./tests/providers/ | cut -d"/" -f5 | sort | uniq -c | sort -nr
113 google
101 amazon
32 apache
26 microsoft
6 databricks
4 redis
4 mysql
4 alibaba
3 trino
3 tableau
3 qubole
3 oracle
3 jenkins
3 http
3 docker
3 atlassian
3 arangodb
3 airbyte
2 yandex
2 vertica
2 telegram
2 sqlite
2 snowflake
2 sftp
2 segment
2 salesforce
2 presto
2 postgres
2 opsgenie
2 neo4j
2 mongo
2 jdbc
2 influxdb
2 imap
2 grpc
2 ftp
2 exasol
2 discord
2 dingding
2 datadog
2 common
2 cncf
2 asana
1 ssh
1 singularity
1 sendgrid
1 samba
1 papermill
1 openfaas
1 elasticsearch
1 cloudant
1 celery

except AirflowNotFoundException:
config = {}
config = {}
if self.emr_conn_id:
Nitpicking: should we have a separate function?
def _validate_params_emr_conn_id(emr_conn_id: str):
IMHO it increases readability by dropping a few levels of indentation.
Actually, get_conn() is usually used for this purpose; however, we cannot override this method because it is used to obtain AWS credentials (aws_conn_id).
We could create this method, but it would only be used in one place. The current implementation does not contain any complex logic, so personally I do not see any benefit in a separate method.


Right now there is no information on how to use the Amazon Elastic MapReduce Connection.
Personally, I find this connection a bit odd because it only contains parameters for a single boto3 method.
However, this connection has existed for a long time and might be used by someone, so I add:
emr_conn_id and job_flow_overrides work together
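
As an illustration of that last point, the following sketch shows one plausible way the two sources could combine, assuming the connection's extra supplies a base config and `job_flow_overrides` is applied on top with a shallow `dict.update`. The function name `build_job_flow_config` and the sample values are hypothetical; check the provider documentation for the actual merge behaviour.

```python
from typing import Any, Dict


def build_job_flow_config(
    conn_extra: Dict[str, Any],
    job_flow_overrides: Dict[str, Any],
) -> Dict[str, Any]:
    """Combine connection defaults with per-task overrides (shallow merge)."""
    config = dict(conn_extra)          # copy so the connection data is untouched
    config.update(job_flow_overrides)  # overrides win on top-level keys
    return config


# Hypothetical values stored in the EMR connection's extra field.
conn_extra = {
    "Name": "default-cluster",
    "ReleaseLabel": "emr-6.7.0",
    "Instances": {"InstanceCount": 3},
}
# Hypothetical overrides passed by the task.
overrides = {
    "Name": "nightly-etl",
    "Instances": {"KeepJobFlowAliveWhenNoSteps": False},
}

config = build_job_flow_config(conn_extra, overrides)
```

Under this assumption the merge is shallow: a nested key such as `Instances` is replaced as a whole rather than merged field by field, so an override must repeat any nested settings it wants to keep.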