-
Notifications
You must be signed in to change notification settings - Fork 16.3k
Description
Apache Airflow Provider(s)
amazon
Versions of Apache Airflow Providers
I am using apache-airflow-providers-amazon==2.4.0, but this issue is still relevant as of the most recent version, apache-airflow-providers-amazon==4.0.0.
Apache Airflow version
2.2.2
Operating System
OS X
Deployment
MWAA
Deployment details
These are my relevant configuration details:
[secrets]
backend = airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend
backend_kwargs = {"connections_prefix": "", "full_url_mode": false}
What happened
When full_url_mode=False, values still need to be URL-encoded.
This creates a break in behavior between how the Airflow UI works and how SecretsManagerBackend() works. When I was porting over my config from the Airflow UI to the AWS Secrets Manager, I could not literally copy+paste my config as a JSON, otherwise this resulted in an error. It was also unclear and surprising that it works this aay.
What you think should happen instead
This behavior makes sense for full_url_mode=True in order to ensure that the URL serialization can be parsed properly.
However, this behavior is not actually required for when full_url_mode=False because the quotes in the JSON delimit the values, which is what one would reasonably expect to be happening behind the scenes.
This is why I consider this a bug, or at least bug-like:
- it's unexpected behavior (because it's not a URL, it's a JSON)
- It deviates from how a core implementation works (i.e. the Airflow UI)
- It is not required to behave like this.
How to reproduce
The way I struggled with this was in creating a Slack hook. The Slack hook takes an input something like this:
{
"conn_type": "http",
"host": "https://hooks.slack.com/services",
"password": "/path/to/hook"
}This works fine using the Airflow UI without URL-encoding, however it is necessary to URL encode everything before passing it through the SecretsManagerBackend().
Anything else
The SecretsManagerBackend uses the BaseSecretsBackend implementation of the get_connection() method. This method takes a conn_id: str and returns an Optional[Connection].
Possible Airflow 2.3 implementation
The base implementation gets some value from the method self.get_conn_value(conn_id=conn_id), then passes it to self.deserialize_connection(conn_id=conn_id, value=value). This method accepts either a URL or a JSON (a string that starts with {).
However, the get_conn_value() method of the SecretsManagerBackend never returns a JSON. Even if the value it retrieves from the cloud is a JSON, it will stick it into a URL at first. At least as of Airflow 2.3.0, the SecretManagerBackend behavior of avoiding JSONs is unnecessary as get_conn_value() is allowed to return a JSON, and the base implementation of get_conn_value`.
Possible Airflow 2.2 implementation
The Apache Airflow Amazon provider package supports Airflow 2.2. Supporting this version of Airflow is also possible, but is slightly more involved: the SecretsManagerBackend would need to override the base implementation for get_connection such that the JSON kwargs are passed into Connection.__init__. (Without overridding this method, only the uri=? kwarg is used to construct the URL).
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct
