Skip to content

Conversation

@borismo
Copy link
Contributor

@borismo borismo commented Sep 13, 2024

Recently, AWS added session reuse to the Redshift Data API. It allows to, say, create a temporary table in one statement, and select from it in a subsequent one.
I think it would be useful that RedshiftDataOperator supports this new feature.

Decisions

  • the hook and operator's database arguments now optional because when the session ID is provided, boto3 doesn't allow database to be specified. Because this becomes a named argument, I had to move it after sql. If users of the operator and hook rely on the position to pass the argument that could break things ⚠️ . Note that boto3 only requires Sql.
  • the hook now returns an object containing the statement ID and session ID instead of just a statement ID. I choose a dataclass instead of a dict to make it harder to make a mistake when indexing the keys. If users use the hook in their tasks or custom operators, this could also break things ⚠️
  • fixed the typo in the parse_statement_resposne method. It's a public one. So, again, there is a risk of breaking change. ⚠️
  • boto3's validation error messages are not very useful when the session ID is a non-UUID string, or if no database, workgroup or session ID was provided, so I added checks in the hook.

To do

  • update docs
  • update changelog
  • update and run examples
    • example_redshift.py
    • example_redshift_s3_transfers.py

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added area:providers provider:amazon AWS/Amazon - related issues labels Sep 13, 2024
Copy link
Contributor

@vincbeck vincbeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from your new feature, you seem to fix many bugs, thanks for doing it!

Copy link
Contributor

@vincbeck vincbeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting "request changes" so that it is not merged out of the blue. Since this PR contains a breaking change, that means the next release of the amazon provider package will be a major one. We (AWS team) want to take this opportunity to remove deprecated stuff. This is just a way to control when this PR will be merged

@vincbeck
Copy link
Contributor

@borismo Just checking in, are you working on this one?

@borismo
Copy link
Contributor Author

borismo commented Sep 21, 2024

@vincbeck , yes, but on my free time. I'll try to make some progress soon.

@borismo
Copy link
Contributor Author

borismo commented Sep 22, 2024

The Compat 2. checks fail if the test expects execution_date=None to be pushed to the Xcom.
Non-DB::3.8 and All:LowestDeps-Postgres:12:3.8 fail when I add it.
Does anyone know how to solve this?

@borismo borismo marked this pull request as ready for review September 22, 2024 15:40
@borismo borismo requested a review from eladkal as a code owner September 22, 2024 15:40
@vincbeck
Copy link
Contributor

The Compat 2. checks fail if the test expects execution_date=None to be pushed to the Xcom. Non-DB::3.8 and All:LowestDeps-Postgres:12:3.8 fail when I add it. Does anyone know how to solve this?

Interesting, I'd like to see the error from compat tests. Could you solve the non-db tests please?

@borismo
Copy link
Contributor Author

borismo commented Sep 24, 2024

Could you solve the non-db tests please?

Sure, done in 87da7b9

@vincbeck
Copy link
Contributor

That worked :) Merging it now

@vincbeck vincbeck merged commit 8580e6d into apache:main Sep 24, 2024
@vincbeck
Copy link
Contributor

@eladkal This is a breaking change in Amazon provider package. So you know, the next release should be a major version bump. AWS team will take advantage of this to remove deprecated stuff from Amazon provider package

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:providers provider:amazon AWS/Amazon - related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants