Skip to content

Conversation

@dimon222
Copy link
Contributor

@dimon222 dimon222 commented Nov 20, 2021

What is this?
This PR is to help with consequences of 867e930 (#17987)

Currently out of the box base_aws implementation is going to show warnings whenever connection cache is invoked (that is potentially always when you use any AWS-based resource in Airflow like S3 as storage for logging or sensors)

Why?
Newly added warning for deprecation is shown every time anyone is passing not None value client_type to get_client_type method. Same is happening with not None value resource_type being passed to get_resource_type.
While this new deprecation warning might have been created with good intentions, the out of the box calls to mentioned methods weren't modified accordingly (missed during review?), so right now Airflow is barking on its own code for deprecation on any AWS-hook based component.

Condition 1:

if client_type:
warnings.warn(
"client_type is deprecated. Set client_type from class attribute.",
DeprecationWarning,
stacklevel=2,
)

Calls:

if self.client_type:
return self.get_client_type(self.client_type, region_name=self.region_name)

else:
return self.get_client_type("iam").get_role(RoleName=role)["Role"]["Arn"]

(for iam call please read the notes on the very end of this PR message)

Condition 2:

if resource_type:
warnings.warn(
"resource_type is deprecated. Set resource_type from class attribute.",
DeprecationWarning,
stacklevel=2,
)

Calls:

elif self.resource_type:
return self.get_resource_type(self.resource_type, region_name=self.region_name)


Now, to mitigate this I took next approach: I pass temporarily None in mentioned arguments until the code is refactored to get rid of entire argument in method (I assume somewhen in the future?).

Few notes:

  1. One piece of code that I randomly noticed is that IAM request is not standard (see call 2 in above mentioned cuts, and is using special hardcoded value "iam"). As I'm not certain about all the specifics of Airflow internals I have added "iam" to list of exceptions for deprecation for time being. Feel free to modify to proper approach.
  2. There might be other invocations of mentioned methods with "not None" being passed by Airflow codebase, but unfortunately, I haven't done deep inspection of entire codebase to verify.

@boring-cyborg boring-cyborg bot added area:providers provider:amazon AWS/Amazon - related issues labels Nov 20, 2021
@potiuk
Copy link
Member

potiuk commented Nov 20, 2021

@john-jac @ferruzzi - maybe you can help with sorting out what's the best approach here ?

@john-jac
Copy link
Contributor

@john-jac @ferruzzi - maybe you can help with sorting out what's the best approach here ?

@potiuk what are the Airflow policies around removing parameters after depreciation? At a minimum we could drop the warning to INFO or DEBUG, but I would want to make sure any changes meet the community's backwards compatibility standards for DAGs.

@ferruzzi
Copy link
Contributor

@potiuk I'd ask you 😛 Basically the same thing John said. I've seen some deprecation warning in other places in the provider packages as well.

It looks like these were flagged for deprecation a couple months ago in 867e930 What's Airflow's policy on this? How long do we support something after a deprecation warning is pushed? I presume it's there until the next major version of the provider package?

If they were flagged as deprecated without an actual plan to follow though with removing them, then I guess that's an Issue to tack onto the list?

@potiuk
Copy link
Member

potiuk commented Nov 22, 2021

Yeah feel free to raise it on the list :).

I think we have no "hard" rules defined for the providers - except the SamVer which tell that you MAY introduce breaaking change (i.e. drop parameters) when you release major upgrade.

So technically, we could drop those deprecated parameters with any MAJOR release of provider. However I am of a position that not everything that can be done - should be.

I think keeping backwards compatibility is extremely important - and except stuff that we HAVE have as backwards incompatibility, I think we should drop something only when we have very good reason - i.e. when it slows us down on releasing something or when it costs a lot to maintain it. In vast majority of cases I see no reason for dropping a parameter if it is just renaming and keeping old one for backwards compatibility, This has very little benefit (two lines of code less) but can potentially wreak havoc for an unsuspecting user.

So my approach is: We should drop any such deprecated parameters when we "have to" (because we have good reason) not when we "can"

However this is a different case @dimon222 talks about (please confirm @dimon222)

From what @dimon222 understand says is that deprecation warnings are raised accidentaly whenever there is a legitimate use of the Operator and it cannot be avoided. Usually when you have such depracation there should be an easy way to change your use of it and deprecation warning should disappear.

I do not know that much about the the operator and it's parameter to judge it - but I find the if iam logic a bit suspicious and some questions should be answered:

  1. should the client_type be deprecated (as it was in AwsBaseHook make client_type & resource_type optional params for get_client_type & get_resource_type #17987) - seems that 'if client_type == 'iam' suggests that there is still legitimate use of it ? Why?
  2. is there a way to skip client_type and do what @dimon222 wants to do without raising deprecation warnining
  3. finally is the deprecation warning clear-enough and explains the user how to convert their code to avoid the deprecation warning?

As I wrote - I do not know much about the reasons for deprecating client_type - but if we answer the three questions above, we should be able to decide if deprecation should really be there. Also @eladkal maybe you can chime in here?

@dimon222
Copy link
Contributor Author

@potiuk
Correct.

In general, I see no problem in any kind of refactoring if there's a legitimate reason for that. In our current scenario, however, we see that iam case that potentially wasn't considered (most likely just accidentally missed). I'm not much worried about iam as end Airflos user, but what I'm worried is about enormous spam of deprecation warnings in logs for components that I haven't touched - aws hook in cache (in this case the actual method calls for general scenario, skipping the IAM case).

So the middleground I'm proposing here is:

  1. Main methods of hook will no longer raise false positives for deprecation and use new mechanic (see PR code itself). If user is utilizing these methods in his hook/aws based component implementations, (s)he will validly receive a proper deprecation warning. Once right time comes, for example for 3.x.x version cutoff the hook can be adjusted to drop the argument altogether and methods that are called in it will be modified to not pass anything in arg.
  2. iam scenario becomes "corner case" like it or not. If we want to polish that fragment we could consider stopping calls to respective method for iam, this way avoiding any corner case considerations for method's logic.

@eladkal
Copy link
Contributor

eladkal commented Nov 23, 2021

however, we see that iam case that potentially wasn't considered (most likely just accidentally missed).

It was missed

iam scenario becomes "corner case" like it or not. If we want to polish that fragment we could consider stopping calls to respective method for iam, this way avoiding any corner case considerations for method's logic.

This is probably the better solution as #17987 solves a real problem so apparently it's just incomplete.

@potiuk
Copy link
Member

potiuk commented Nov 23, 2021

Just to clarify @eladkal - you think the proposed change is ok :) ? I am about to start preparing the providers release so we should merge that one I guess ? Sorry i have been swamped a bit recently and had no tone to look in detail

@eladkal
Copy link
Contributor

eladkal commented Nov 23, 2021

Just to clarify @eladkal - you think the proposed change is ok :) ? I am about to start preparing the providers release so we should merge that one I guess ? Sorry i have been swamped a bit recently and had no tone to look in detail

The current PR doesnt solve the issue. It just hide warnning. The solution is doing a small refactor as suggested so get_client_type(IAM) is not needed. To my understanding this is what @dimon222 is going to do? (Then we dont need to bump major version)

@potiuk
Copy link
Member

potiuk commented Nov 23, 2021

The current PR doesnt solve the issue. It just hide warnning. The solution is doing a small refactor as suggested so get_client_type(IAM) is not needed. To my understanding this is what @dimon222 is going to do? (Then we dont need to bump major version)

Works for me! @dimon222 - I will be releasing providers later this week, so If you could make the change before and we could merge it before, that would be awesome!

@dimon222
Copy link
Contributor Author

Done, up for another review.

@dimon222
Copy link
Contributor Author

Applied review corrections + modified all identified references with my basic search. Please verify the new corrections.

@uranusjr uranusjr dismissed their stale review November 23, 2021 17:24

Resolved

Copy link
Contributor

@eladkal eladkal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Wonder if some unit tests can be added?

@github-actions
Copy link

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the okay to merge It's ok to merge this PR as it does not require more tests label Nov 23, 2021
@uranusjr
Copy link
Member

These are all already covered by tests (and working), but this removes depretion warnings emitted by them. Do you mean tests on not emitting deprecation warnings? That might be a good thing to do globally (IIRC there’s a pytest flag for that).

@eladkal
Copy link
Contributor

eladkal commented Nov 23, 2021

Do you mean tests on not emitting deprecation warnings? That might be a good thing to do globally (IIRC there’s a pytest flag for that).

yep - to avoid problems like we have here. If we had something like that I would have handle it in the first PR. but yeah that is out of scope here

@uranusjr
Copy link
Member

Splitting the testing issue to #19788.

@uranusjr uranusjr merged commit 4be0414 into apache:main Nov 23, 2021
@boring-cyborg
Copy link

boring-cyborg bot commented Nov 23, 2021

Awesome work, congrats on your first merged pull request!

@uranusjr
Copy link
Member

Arrgh! I misclicked the merge. Submitting a fix to the test failures now…

@uranusjr
Copy link
Member

Trying to fix this in #19789

@dimon222 dimon222 deleted the bugfix/adjust_cache_to_not_show_deprecation branch November 23, 2021 20:28
uranusjr added a commit to uranusjr/airflow that referenced this pull request Nov 23, 2021
eladkal pushed a commit that referenced this pull request Nov 23, 2021
uranusjr added a commit to uranusjr/airflow that referenced this pull request Nov 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:providers okay to merge It's ok to merge this PR as it does not require more tests provider:amazon AWS/Amazon - related issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants