-
Notifications
You must be signed in to change notification settings - Fork 16.3k
Get Airflow configs with sensitive data from Secret Backends #9645
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get Airflow configs with sensitive data from Secret Backends #9645
Conversation
airflow/configuration.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmm, not a huge fan of the duplication here and in airflow/secrets/__init__.py.
Since this would only be called after the class is constructed would we instead do
| @cached_property | |
| def _secrets_backend_client(self): | |
| if super().has_option('secrets', 'backend') is False: | |
| return None | |
| secrets_backend_cls = super().get('secrets', 'backend') | |
| if not secrets_backend_cls: | |
| return None | |
| print("secrets_backend_cls", secrets_backend_cls) | |
| secrets_backend = import_string(secrets_backend_cls) | |
| if secrets_backend: | |
| try: | |
| alternative_secrets_config_dict = json.loads( | |
| super().get('secrets', 'backend_kwargs', fallback='{}') | |
| ) | |
| except json.JSONDecodeError: | |
| alternative_secrets_config_dict = {} | |
| return secrets_backend(**alternative_secrets_config_dict) | |
| @cached_property | |
| def _secrets_backend_client(self): | |
| if super().has_option('secrets', 'backend') is False: | |
| return None | |
| from airflow.secrets import secrets_backend_list | |
| return secrets_backend_list |
(give or take).
By delaying the import to inside the function no import loop is created.
This probably also means adding a get_config(section, key) to airflow.secrets, and possibly changing the default implemenation from NotImplementedError to return None.
Note: we should pass they key and section in directly, and let the secret backends convert that to a single parameter if needed (for instance we could have /config/core/sql_alchemy_conn, rather than config/core__sql_alchemy_conn)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main reason there is some code duplication here is we can't use airflow.secrets as otherwise we would have infinite loop (even when the import is inside the loop) since the DEFAULT_SECRET_BACKEND ( Environment Variable and Connection ) needs sql_achemy_conn value.
I can separate the code to a different file and import that file at both places if the concern is just duplication
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sql_alchemy_conn_secret = conn/sql_alchemy_connshould already allow that.
Whatever they pass to sql_alchemy_conn_secret will be directly used to get secret
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated with TYPE_CHECKING in ed8cc1a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| And if configurations_prefix is ``airflow/configurations/sql_alchemy_conn``, this would be accessible | |
| And if configurations_prefix is ``airflow/config/core/sql_alchemy_conn``, this would be accessible |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if you provide ``{"configurations_prefix": "airflow/configurations"}`` and request variable | |
| if you provide ``{"configurations_prefix": "airflow/config"}`` and request variable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| configurations_prefix: str = 'airflow/configuration', | |
| config_prefix: str = 'airflow/config', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
etc (haven't made this change everywhere)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:D I went back and forth with this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
airflow/secrets/base_secrets.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def get_configuration(self, key: str) -> Optional[str]: | |
| def get_configuration(self, section: str, key: str) -> Optional[str]: |
…rom Secret Backends
… data from Secret Backends
ada4ac6 to
3b38109
Compare
|
thank you for picking this one and sorting it quickly @kaxil - this is super useful, hoping to see this in the next 1.10.* release |
airflow/configuration.py
Outdated
| env_var_cmd = env_var + '_SECRET' | ||
| if env_var_cmd in os.environ: | ||
| # if this is a valid command key... | ||
| if (section, key) in self.sensitive_config_values: | ||
| return _get_config_value_from_secret_backend(os.environ[env_var_cmd]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| env_var_cmd = env_var + '_SECRET' | |
| if env_var_cmd in os.environ: | |
| # if this is a valid command key... | |
| if (section, key) in self.sensitive_config_values: | |
| return _get_config_value_from_secret_backend(os.environ[env_var_cmd]) | |
| env_var_secret_path = env_var + '_SECRET' | |
| if env_var_secret_path in os.environ: | |
| # if this is a valid secret key... | |
| if (section, key) in self.sensitive_config_values: | |
| return _get_config_value_from_secret_backend(os.environ[env_var_secret_path]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 032d33f
| """ | ||
| raise NotImplementedError() | ||
|
|
||
| def get_config(self, key: str) -> Optional[str]: # pylint: disable=unused-argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be
| def get_config(self, key: str) -> Optional[str]: # pylint: disable=unused-argument | |
| def get_config(self, section: str, key: str) -> Optional[str]: # pylint: disable=unused-argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, except it comes from a single value in the config file. Hmmmm.
Perhaps update the example to show include a / in it?
[core]
sql_alchemy_conn_secret = core/sql_alchemy_connDunno.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added docs in 032d33f
| .. code-block:: ini | ||
| [core] | ||
| sql_alchemy_conn_secret = sql_alchemy_conn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmmm, I wonder if this will be confusing for people running in Kube, who might expect this to come from a Kube secret.
(I don't have a suggestion on how to deal with this.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs/howto/set-config.rst
Outdated
| The universal order of precedence for all configuration options is as follows: | ||
|
|
||
| #. set as an environment variable | ||
| #. set as a command environment variable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| #. set as a command environment variable | |
| #. set as a command environment variable | |
| #. set as a secret environment variable |
Though the wording on this isn't that clear :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in 032d33f
…nsitive data from Secret Backends
| return output | ||
|
|
||
|
|
||
| def _get_config_value_from_secret_backend(config_key): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why this is a function in the module, and not a method on the AirflowConfiguration class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to keep it similar to def run_command(command):, no other reason
| If variables prefix is ``airflow/variables/hello``, this would be accessible | ||
| if you provide ``{"variables_prefix": "airflow/variables"}`` and request variable key ``hello``. | ||
| And if configs_prefix is ``airflow/configs/sql_alchemy_conn``, this would be accessible | ||
| if you provide ``{"configs_prefix": "airflow/configs"}`` and request variable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if you provide ``{"configs_prefix": "airflow/configs"}`` and request variable | |
| if you provide ``{"configs_prefix": "airflow/configs"}`` and request config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in d8d5b8f
| self, | ||
| connections_prefix: str = 'airflow/connections', | ||
| variables_prefix: str = 'airflow/variables', | ||
| configs_prefix: str = 'airflow/configs', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably just be config, not configs -- as config is both singular and plural, at least in British English. (Configurations/configs) sounds odd to my ears.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in 35d75e5
…with sensitive data from Secret Backends
…onfigs with sensitive data from Secret Backends
Adds support to AWS SSM for feature added in apache#9645
Adds support to Google Cloud CloudSecretManagerBackend for feature added in apache#9645
) Adds support to AWS SSM for feature added in #9645
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc
…023) Adds support to AWS SSM for feature added in apache/airflow#9645 GitOrigin-RevId: 2410f592a4ab160b377f1a9e5de3b7262b9851cc

We should be able to get the followings config containing passwords and other sensitive data from Secret Backends:
Make sure to mark the boxes below before creating PR: [x]
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.