Skip to content

Conversation

@elwinarens
Copy link

Add secrets backend for microsoft azure key vault. Related issue #8258.

Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@boring-cyborg
Copy link

boring-cyborg bot commented Jul 3, 2020

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, pylint and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it’s a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: [email protected]
    Slack: https://apache-airflow-slack.herokuapp.com/

@turbaszek turbaszek requested review from kaxil and potiuk July 3, 2020 10:49
@elwinarens elwinarens force-pushed the azure_key_vault_secrets_backend branch from 1b55972 to e9f6a0b Compare July 3, 2020 10:53
@elwinarens elwinarens force-pushed the azure_key_vault_secrets_backend branch 2 times, most recently from 07e9bb6 to fbfe3c8 Compare July 3, 2020 12:43
Copy link
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI Failed:

airflow/providers/microsoft/azure/secrets/azure_key_vault.py:42: error: Incompatible default for argument "vault_url" (default has type "None", argument has type "str")
Found 1 error in 1 file (checked 2279 source files)

There were some mypy errors. Exiting


Fix End of Files..................................................................................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook

Fixing airflow/providers/microsoft/azure/secrets/__init__.py


Run isort to sort imports.........................................................................................................Failed
- hook id: isort
- files were modified by this hook

Fixing /home/runner/work/airflow/airflow/airflow/providers/microsoft/azure/secrets/azure_key_vault.py



************* Module airflow.providers.microsoft.azure.secrets.azure_key_vault
airflow/providers/microsoft/azure/secrets/azure_key_vault.py:123:12: E1205: Too many arguments for logging format string (logging-too-many-args)

------------------------------------


                   EXITING /opt/airflow/scripts/ci/in_container/run_pylint_main.sh WITH STATUS CODE 1
###########################################################################################
************* Module airflow.providers.microsoft.azure.secrets.azure_key_vault
airflow/providers/microsoft/azure/secrets/azure_key_vault.py:123:12: E1205: Too many arguments for logging format string (logging-too-many-args)


@elwinarens elwinarens force-pushed the azure_key_vault_secrets_backend branch 3 times, most recently from fc98cb6 to e17e26c Compare July 6, 2020 12:56
@elwinarens elwinarens force-pushed the azure_key_vault_secrets_backend branch from e17e26c to 6bc7c1f Compare July 9, 2020 11:10
@elwinarens elwinarens force-pushed the azure_key_vault_secrets_backend branch 2 times, most recently from f6a1df0 to 4a5f447 Compare July 10, 2020 05:38
@elwinarens elwinarens force-pushed the azure_key_vault_secrets_backend branch from 4a5f447 to 0e86265 Compare July 10, 2020 05:41
@elwinarens
Copy link
Author

@kaxil In the aws secrets manager in the docstring it says:

   :param config_prefix: Specifies the prefix of the secret to read to get Variables.
    :type config_prefix: str

(Variables should be Config i think)

Should i add config to this backend too?

path = f'{path_prefix}{sep}{secret_id}'
return path.replace('_', sep)

def get_secret_value(self, name: str) -> Optional[str]:
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've refactored _get_secret to get_secret_value to allow use of this backend for secrets that do not conform to the prefixes provided for connections and variables. We have a use case for this as secret names are sometimes internally created by another department that conform to different naming standards.

Copy link
Member

@kaxil kaxil Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

@elwinarens elwinarens Jul 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will make it a private method again and will create a hook later on.

@kaxil
Copy link
Member

kaxil commented Jul 10, 2020

Should i add config to this backend too?

Yes please, it would allow you to retrieve some of the Airflow configs from Secrets Backend, details here https://airflow.readthedocs.io/en/latest/howto/set-config.html

And if variables prefix is ``airflow-variables-hello``, this would be accessible
if you provide ``{"variables_prefix": "airflow-variables"}`` and request variable key ``hello``.

:param vault_url: The URL of an Azure Key Vault to use
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more: also put vault_url after variables_prefix here too

@kaxil
Copy link
Member

kaxil commented Jul 20, 2020

Do you need any help in completing this @elwinarens?

@elwinarens
Copy link
Author

elwinarens commented Jul 25, 2020

@kaxil Could use help, yes. Don’t think i can work on this the coming weeks, unfortunately. Feel free to fork this and put up a new PR

@alexbegg
Copy link
Contributor

Not sure if anyone who uses this noticed but the module azure.core.pipeline.policies._universal (which handles logging of HTTP requests for client.get_secret) is spitting out a lot of INFO-level logs for every request, no matter if it is a 200 or 404 response.

There might be a better way to stop it from spamming the info logs, but one workaround I found is to change the logging level for the azure logger to WARNING, I just not sure how it affects things other than this secret backend. I suggest the following gets added to azure_key_vault.py:

import logging

logger = logging.getLogger('azure')
logger.setLevel(logging.WARNING)

Here is an example of just one request shown in the INFO-level logs:

{_universal.py:412} INFO - Request URL: '\x1b[1mhttps://**********/secrets/**********/?api-version=REDACTED\x1b[0m'
{_universal.py:413} INFO - Request method: '\x1b[1mGET\x1b[0m'
{_universal.py:414} INFO - Request headers:
{_universal.py:417} INFO -     '\x1b[1mAccept\x1b[0m': '\x1b[1mapplication/json\x1b[0m'
{_universal.py:417} INFO -     '\x1b[1mx-ms-client-request-id\x1b[0m': '\x1b[1me76166a4-d2b0-11ea-8042-f2d2fd64bfce\x1b[0m'
{_universal.py:417} INFO -     '\x1b[1mUser-Agent\x1b[0m': '\x1b[1mazsdk-python-keyvault-secrets/4.1.0 Python/3.6.10 (Linux-4.15.0-1089-azure-x86_64-with-debian-10.3)\x1b[0m'
{_universal.py:417} INFO -     '\x1b[1mAuthorization\x1b[0m': '\x1b[1mREDACTED\x1b[0m'
{_universal.py:431} INFO - Response status: 200
{_universal.py:432} INFO - Response headers:
{_universal.py:435} INFO -     '\x1b[1mCache-Control\x1b[0m': '\x1b[1mno-cache\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mPragma\x1b[0m': '\x1b[1mno-cache\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mContent-Type\x1b[0m': '\x1b[1mapplication/json; charset=utf-8\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mExpires\x1b[0m': '\x1b[1m-1\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mx-ms-keyvault-region\x1b[0m': '\x1b[1mREDACTED\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mx-ms-request-id\x1b[0m': '\x1b[1maa7b60bd-292a-4f33-a034-21690c75463f\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mx-ms-keyvault-service-version\x1b[0m': '\x1b[1mREDACTED\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mx-ms-keyvault-network-info\x1b[0m': '\x1b[1mconn_type=Ipv4;addr=**********;act_addr_fam=InterNetwork;\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mX-AspNet-Version\x1b[0m': '\x1b[1mREDACTED\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mX-Powered-By\x1b[0m': '\x1b[1mREDACTED\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mStrict-Transport-Security\x1b[0m': '\x1b[1mREDACTED\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mX-Content-Type-Options\x1b[0m': '\x1b[1mREDACTED\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mDate\x1b[0m': '\x1b[1mThu, 30 Jul 2020 22:06:21 GMT\x1b[0m'
{_universal.py:435} INFO -     '\x1b[1mContent-Length\x1b[0m': '\x1b[1m355\x1b[0m'

P.S. I don't really have the time either to finish this up, but I made a very similar secrets backend for Azure Key Vault that used ClientSecretCredential instead of DefaultAzureCredential, but I like this method better because instead of having to put the client_id, client_secret, and tenant_id in the kwargs you can just add them as environment variables

@elwinarens
Copy link
Author

@kaxil @alexbegg Will try and make some time this week

@elwinarens
Copy link
Author

elwinarens commented Aug 3, 2020

@alexbegg After investigating the verbose logging: while i understand that the logs are verbose and annoying, i don't think the responsibility of changing the default behaviour of the azure python sdk implementation should be changed in this backend. Because changing the logging behaviour here affects other parts (there's no logger for this particular client, i think) Related Azure/azure-sdk-for-python#9422

@flolas
Copy link
Contributor

flolas commented Aug 19, 2020

I'm interested in this functionality. Do you need help finishing the PR?

Thanks!

@kaxil
Copy link
Member

kaxil commented Aug 26, 2020

@flolas Please feel free to create a new PR, looks like @elwinarens might be busy

@kaxil
Copy link
Member

kaxil commented Sep 10, 2020

ping @elwinarens @flolas

@alexbegg
Copy link
Contributor

alexbegg commented Sep 10, 2020

@kaxil If neither are able to continue the work, what is remaining to do here? Does it need a hook? has config secrets been added to this backend yet? I am already using this same backend for my own Airflow installation but as a custom secrets backend as a standalone class, not incorporated into the apache-airflow-backport-providers-microsoft-azure package. I was working on developing my own PR for this but I was still working on the hook and other parts when I left off.

@kaxil
Copy link
Member

kaxil commented Sep 10, 2020

@kaxil If neither are able to continue the work, what is remaining to do here? Does it need a hook? has config secrets been added to this backend yet? I am already using this same backend for my own Airflow installation but as a custom secrets backend as a standalone class, not incorporated into the apache-airflow-backport-providers-microsoft-azure package. I was working on developing my own PR for this but I was still working on the hook and other parts when I left off.

It needs rebase on Master, I tried but the PR author hasn't allowed "Edit from Maintainers"

@kaxil
Copy link
Member

kaxil commented Sep 10, 2020

@elwinarens
Copy link
Author

elwinarens commented Sep 11, 2020

@kaxil that option is not available, because this PR is not under a personal user. Feel free to cherry pick and to put up a new PR. Let me know if you need anything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants