Add RdsStopDbOperator and RdsStartDbOperator #27076

hankehly · 2022-10-16T05:46:21Z

Closes: #25952
Related: #26003

Summary

This PR adds the following operators to the amazon provider package.

RdsStopDbOperator - Stops an RDS instance or cluster, optionally creates a snapshot.
RdsStartDbOperator - Starts an RDS instance or cluster.

Todo

hankehly · 2022-10-16T06:18:04Z

airflow/providers/amazon/aws/operators/rds.py

+        self.wait_for_completion = wait_for_completion
+
+    def execute(self, context: Context) -> str:
+        self.db_type = RdsDbType(self.db_type)


db_type is a templated field. Casting to RdsDbType in the constructor would raise an exception when users pass template strings, so the cast is done here in execute instead.

Can you describe the reason for making db_type a templated field?
It could be either instance or cluster so do we need to make it more complex?

Other RDS settings are templated, so including db_type seemed natural. A user could generate a list of RDS instances/clusters to switch on/off at particular times of the day. Templating could be helpful in that case. Given it's a single-line change, the tradeoff between added flexibility and increased complexity seems appropriate to me here.
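To illustrate the point about casting after templating, here is a minimal, self-contained sketch. It does not use Airflow: the RdsDbType enum below is a stand-in whose member values are assumed, both operator classes are hypothetical, and template rendering is simulated by assigning the rendered string by hand.

```python
from enum import Enum


class RdsDbType(Enum):
    """Stand-in for the provider's enum; the member values are assumed."""

    INSTANCE = "instance"
    CLUSTER = "cluster"


class EagerCastOperator:
    """Hypothetical operator that casts in __init__, before any rendering."""

    def __init__(self, db_type: str):
        self.db_type = RdsDbType(db_type)


class LazyCastOperator:
    """Hypothetical operator that keeps the raw string and casts in execute()."""

    def __init__(self, db_type: str):
        self.db_type = db_type

    def execute(self) -> RdsDbType:
        # By this point a real scheduler would have rendered the template.
        return RdsDbType(self.db_type)


template = "{{ params.db_type }}"

# Casting in the constructor fails: the template string is not an enum value.
try:
    EagerCastOperator(db_type=template)
    eager_failed = False
except ValueError:
    eager_failed = True

# Casting in execute() works once the field holds the rendered value.
op = LazyCastOperator(db_type=template)
op.db_type = "cluster"  # simulate template rendering
result = op.execute()
```

The trade-off is that an invalid db_type is only detected at execution time rather than when the DAG is parsed.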

hankehly · 2022-10-16T06:19:20Z

docs/apache-airflow-providers-amazon/operators/rds.rst

+
+.. exampleinclude:: /../../tests/system/providers/amazon/aws/example_rds_instance.py
+    :language: python
+    :dedent: 4


hankehly · 2022-10-16T06:20:39Z

airflow/providers/amazon/aws/operators/rds.py

        elif item_type == 'event_subscription':
            subscriptions = self.hook.conn.describe_event_subscriptions(SubscriptionName=item_name)
            return subscriptions['EventSubscriptionsList']
+        elif item_type == "db_instance":


These are the same changes as those made to RdsBaseSensor in #26003.
Related: #25952 (comment)

hankehly · 2022-10-16T06:22:11Z

airflow/providers/amazon/aws/operators/rds.py

+            self.hook.conn.get_waiter("db_cluster_available").wait(DBClusterIdentifier=self.db_identifier)
+
+
+class RdsStopDbOperator(RdsBaseOperator):


Stop an RDS instance/cluster. Optionally create a snapshot.

hankehly · 2022-10-16T06:22:19Z

airflow/providers/amazon/aws/operators/rds.py

        return json.dumps(delete_db_instance, default=str)


+class RdsStartDbOperator(RdsBaseOperator):


Start an RDS instance/cluster.

hankehly · 2022-10-16T08:37:44Z

@potiuk @ferruzzi @kazanzhy @o-nikolas @vincbeck
Please review this PR at your earliest convenience.

o-nikolas

Just a couple small nits but otherwise a very clean PR!

Also note: There are some static check failures and doc build failures that need to be addressed (mostly mypy type related stuff for static checks). You can enable pre-commit to run static checks on your code before submitting a PR (readme) and you can also build docs locally as well (breeze build-docs --package-filter apache-airflow-providers-amazon).

o-nikolas · 2022-10-16T15:13:05Z

tests/providers/amazon/aws/operators/test_rds.py

+        instance_snapshots = snapshot_result.get("DBSnapshots")
+        assert instance_snapshots
+        assert len(instance_snapshots) == 1
+


It might be worth adding a test for stopping the DB cluster with a snapshot and use the caplog fixture (there are examples in other airflow tests) to ensure your warning message is logged.

Nice idea. Please see test_stop_db_cluster_create_snapshot_logs_warning_message in the same file.
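A test along the lines suggested above might look like the following sketch. It is not the test from this PR: stop_db is a hypothetical stand-in for the operator logic, the logger name and the warning text are assumptions, and only the caplog usage (caplog.at_level and caplog.text) reflects the pytest fixture itself.

```python
import logging
from typing import Optional

log = logging.getLogger("rds_example")  # hypothetical logger name


def stop_db(db_type: str, db_identifier: str, db_snapshot_identifier: Optional[str] = None) -> str:
    """Hypothetical stand-in for the stop logic; no AWS calls are made."""
    if db_type == "cluster" and db_snapshot_identifier:
        # Assumed wording: the source only says a warning is logged here.
        log.warning("'db_snapshot_identifier' does not apply to db clusters.")
    return f"stopping {db_type} {db_identifier}"


def test_stop_cluster_with_snapshot_logs_warning(caplog):
    # caplog is pytest's built-in fixture for capturing log records.
    with caplog.at_level(logging.WARNING, logger="rds_example"):
        stop_db("cluster", "my-cluster", db_snapshot_identifier="my-snapshot")
    assert "does not apply to db clusters" in caplog.text
```

Run under pytest, the fixture is injected automatically; no extra plugin is needed.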

o-nikolas · 2022-10-16T15:14:46Z

tests/system/providers/amazon/aws/example_rds_instance.py

+    stop_db_instance = RdsStopDbOperator(
+        task_id="stop_db_instance",
+        db_identifier=rds_db_identifier,
+        wait_for_completion=True,


wait_for_completion=True is the default value, so you can drop it from this call (similar to how RdsCreateDbInstanceOperator does above).

kazanzhy · 2022-10-16T20:09:51Z

airflow/providers/amazon/aws/operators/rds.py

+            response = self.hook.conn.start_db_cluster(DBClusterIdentifier=self.db_identifier)
+        return response
+
+    def _wait_until_db_available(self):


Maybe we could use here already written _await_status method.
It will reduce the code and unify all operators within this file.
WDYT?

@kazanzhy
Thanks for your review. I agree, that would be more consistent. What concerns me is that when using _await_status we need to specify all possible wait_statuses. I observed the following states when creating an RDS instance, but depending on the settings, there may be more. (And they might change in the future?)

Creating

Backing-up

Configuring-enhanced-monitoring

Rather than listing every possible state or risk "missing one," I think using the builtin "waiter" implementation is more appropriate here. We only need to check for the "available" state.

@o-nikolas
I noticed you work at Amazon, is there anything you'd like to add to this discussion?

I agree, we should use waiters as much as possible. They come with nice features such as retry with exponential backoff, ...

Our preference for AWS code is to use the boto Waiters wherever possible. I don't have the context to know whether the existing _await_status method could be updated to use waiters or not, but that would be my preference rather than the other direction. But I also think that's out of scope for this PR. So how you've got it now is fine, IMHO.

I'm confused about why _await_status is at the operator level rather than at the hook level.

What concerns me is that when using _await_status we need to specify all possible wait_statuses

Why? We only need to wait for the Available status, don't we?

Yupp, will do 👍

Just dogpiling on the "we should be using waiters everywhere" train. There are a few other places (The EKS operators I wrote before I knew about the boto waiters, for example) in the package where we have custom wait methods but the official boto ones would be cleaner and easier alternatives. It might make a nice task/project for new contributors to go through and see where they can be swapped out?

@eladkal I remember that I implemented _await_status so that a single method could be used by different operators. It was the best solution in my opinion at that time. I knew about waiters but I guess not all the waiters I needed existed.

I agree with @ferruzzi, let's use _await_status in this PR and then refactor the module and add waiters where it's possible. WDYT?

Thank you all for the feedback. Consensus:

Leave current changes as-is

Refactor waiting logic in #27096
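For readers following the waiter discussion, here is a rough sketch of the waiter-based approach. The helper name wait_until_db_available and the lookup table are illustrative and not taken from the PR; the waiter names db_instance_available and db_cluster_available are boto3 RDS waiters (the latter appears in this PR's diff). The client is passed in so the helper can be exercised without AWS credentials.

```python
from typing import Any, Dict, Tuple

# Maps db_type to (waiter name, identifier keyword argument).
_AVAILABLE_WAITERS: Dict[str, Tuple[str, str]] = {
    "instance": ("db_instance_available", "DBInstanceIdentifier"),
    "cluster": ("db_cluster_available", "DBClusterIdentifier"),
}


def wait_until_db_available(client: Any, db_type: str, db_identifier: str) -> None:
    """Block until the database reports 'available' (illustrative helper).

    The waiter polls the describe call itself, so intermediate states such
    as 'creating' or 'backing-up' do not have to be enumerated by the caller.
    """
    waiter_name, id_param = _AVAILABLE_WAITERS[db_type]
    client.get_waiter(waiter_name).wait(**{id_param: db_identifier})
```

With a real client this would be called as wait_until_db_available(boto3.client("rds"), "instance", "my-db"), which needs boto3 and valid AWS credentials.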

kazanzhy · 2022-10-16T20:10:37Z

airflow/providers/amazon/aws/operators/rds.py

+            response = self.hook.conn.stop_db_cluster(DBClusterIdentifier=self.db_identifier)
+        return response
+
+    def _wait_until_db_stopped(self):


The same suggestion. Let's use _await_status method here.

It's used around lines 802-808. (Could there be a misunderstanding?)

vincbeck · 2022-10-17T16:53:42Z

airflow/providers/amazon/aws/operators/rds.py

+            response = self.hook.conn.start_db_cluster(DBClusterIdentifier=self.db_identifier)
+        return response
+
+    def _wait_until_db_available(self):


I agree, we should use waiters as much as possible. They come with nice features such as retry with exponential backoff, ...

hankehly · 2022-10-18T02:05:41Z

@o-nikolas @ferruzzi @eladkal @kazanzhy
Please see/approve the latest feedback changes at your earliest convenience.

docs/apache-airflow-providers-amazon/operators/rds.rst

Co-authored-by: Vincent <[email protected]>

eladkal

LGTM

hankehly added 3 commits October 2, 2022 12:19

Add operator classes

5144707

Merge branch 'main' into issue-25952-add-rds-operators

1d2809b

Add system/unit tests

0bbabed

boring-cyborg bot added area:providers area:system-tests provider:amazon AWS/Amazon - related issues labels Oct 16, 2022

hankehly added 2 commits October 16, 2022 15:03

Update docs

b36659e

Cast db_type after templating

d78a519

hankehly commented Oct 16, 2022

Remove todo comments

91cd3cc

hankehly marked this pull request as ready for review October 16, 2022 08:32

hankehly requested a review from eladkal as a code owner October 16, 2022 08:32

hankehly mentioned this pull request Oct 16, 2022

Add RDS operators/sensors #25952

Closed

2 tasks

o-nikolas requested changes Oct 16, 2022

kazanzhy reviewed Oct 16, 2022

hankehly added 2 commits October 17, 2022 09:09

Do not specify default values on operator init

5ce7530

Add unit test to check for warning message

b65f9fe

vincbeck approved these changes Oct 17, 2022

o-nikolas mentioned this pull request Oct 17, 2022

Use Boto waiters instead of customer _await_status method for RDS Operators #27096

Closed

2 tasks

hankehly added 3 commits October 18, 2022 09:38

Fix static checks

d0784b9

Add db_snapshot_identifier to list of templated fields

f3f91c1

Merge branch 'main' into issue-25952-add-rds-operators

313a8ee

hankehly requested review from ferruzzi and removed request for eladkal October 18, 2022 02:02

hankehly requested review from ferruzzi, kazanzhy and o-nikolas and removed request for ferruzzi, kazanzhy and o-nikolas October 18, 2022 02:02

o-nikolas approved these changes Oct 18, 2022

eladkal requested review from eladkal and removed request for kazanzhy October 18, 2022 16:55

eladkal requested changes Oct 18, 2022

eladkal and others added 2 commits October 19, 2022 00:49

Update docs/apache-airflow-providers-amazon/operators/rds.rst

6da1325

Co-authored-by: Vincent <[email protected]>

Update docs/apache-airflow-providers-amazon/operators/rds.rst

9bc1d80

Co-authored-by: Vincent <[email protected]>

eladkal approved these changes Oct 18, 2022

eladkal merged commit a2413cf into apache:main Oct 19, 2022

hankehly deleted the issue-25952-add-rds-operators branch October 19, 2022 05:36

hankehly mentioned this pull request Oct 31, 2022

Use Boto waiters instead of customer _await_status method for RDS Operators #27410

Merged

2 tasks

potiuk mentioned this pull request Nov 15, 2022

Status of testing Providers that were prepared on November 15, 2022 #27674

Closed
