@ashb
Member

@ashb ashb commented Jun 7, 2020

It's fairly common to say whitelisting and blacklisting to describe
desirable and undesirable things in cyber security. Just because
it is common doesn't mean it's right.

However, there's an issue with the terminology. It only makes sense if
you equate white with 'good, permitted, safe' and black with 'bad,
dangerous, forbidden'. There are some obvious problems with this.

You may not see why this matters. If you're not adversely affected by
racial stereotyping yourself, then please count yourself lucky. For some
of your friends and colleagues (and potential future colleagues), this
really is a change worth making.

From now on, we will use 'allow list' and 'deny list' in place of
'whitelist' and 'blacklist' wherever possible. These terms are, in fact,
clearer and less ambiguous, so as well as being more inclusive, this is
a net benefit to our understandability.

(Words mostly borrowed from
https://www.ncsc.gov.uk/blog-post/terminology-its-not-black-and-white)

Closes #9175


Make sure to mark the boxes below before creating PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

@boring-cyborg boring-cyborg bot added the area:Scheduler and provider:Apache labels Jun 7, 2020
@ashb
Member Author

ashb commented Jun 7, 2020

There are a few references left over.

airflow/config_templates/default_webserver_config.py:#     'whitelist': ['@YOU_COMPANY_DOMAIN'],  # optional
airflow/providers/apache/cassandra/hooks/cassandra.py:    DCAwareRoundRobinPolicy, RoundRobinPolicy, TokenAwarePolicy, WhiteListRoundRobinPolicy,
airflow/providers/apache/cassandra/hooks/cassandra.py:Policy = Union[DCAwareRoundRobinPolicy, RoundRobinPolicy, TokenAwarePolicy, WhiteListRoundRobinPolicy]
airflow/providers/apache/cassandra/hooks/cassandra.py:        - WhiteListRoundRobinPolicy
airflow/providers/apache/cassandra/hooks/cassandra.py:                'load_balancing_policy': 'WhiteListRoundRobinPolicy',
airflow/providers/apache/cassandra/hooks/cassandra.py:        if policy_name == 'WhiteListRoundRobinPolicy':
airflow/providers/apache/cassandra/hooks/cassandra.py:                raise Exception('Hosts must be specified for WhiteListRoundRobinPolicy')
airflow/providers/apache/cassandra/hooks/cassandra.py:            return WhiteListRoundRobinPolicy(hosts)
airflow/providers/apache/cassandra/hooks/cassandra.py:                                      'WhiteListRoundRobinPolicy',)
airflow/providers/apache/hive/operators/hive_stats.py:        if 'col_blacklist' in kwargs:
airflow/providers/apache/hive/operators/hive_stats.py:                'col_blacklist kwarg passed to {c} (task_id: {t}) is deprecated, please rename it to '
airflow/providers/apache/hive/operators/hive_stats.py:            excluded_columns = kwargs.pop('col_blacklist')
pylintrc:extension-pkg-whitelist=setproctitle
tests/providers/apache/cassandra/hooks/test_cassandra.py:    DCAwareRoundRobinPolicy, RoundRobinPolicy, TokenAwarePolicy, WhiteListRoundRobinPolicy,
tests/providers/apache/cassandra/hooks/test_cassandra.py:        # test WhiteListRoundRobinPolicy with args
tests/providers/apache/cassandra/hooks/test_cassandra.py:            self._assert_get_lb_policy('WhiteListRoundRobinPolicy',
tests/providers/apache/cassandra/hooks/test_cassandra.py:                                       WhiteListRoundRobinPolicy)
tests/providers/apache/cassandra/hooks/test_cassandra.py:                {'child_load_balancing_policy': 'WhiteListRoundRobinPolicy',
tests/providers/apache/cassandra/hooks/test_cassandra.py:                 }, TokenAwarePolicy, expected_child_policy_type=WhiteListRoundRobinPolicy)
tests/providers/apache/cassandra/hooks/test_cassandra.py:        # test host not specified for WhiteListRoundRobinPolicy should throw exception
tests/providers/apache/cassandra/hooks/test_cassandra.py:        self._assert_get_lb_policy('WhiteListRoundRobinPolicy',
tests/providers/apache/cassandra/hooks/test_cassandra.py:                                   WhiteListRoundRobinPolicy,
tests/providers/apache/cassandra/hooks/test_cassandra.py:                                   {'child_load_balancing_policy': 'WhiteListRoundRobinPolicy'},

It is probably worth adding a check for this: git grep -Ei '(black|white)[_-]?list' sort of thing, if we can easily exclude some files/occurrences.
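
A check along those lines could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the check that was eventually added: the function name `find_flagged_terms` and the `EXCLUDED_SUBSTRINGS` tuple are hypothetical, and the single exclusion shown is the Cassandra driver's `WhiteListRoundRobinPolicy`, a third-party name the project cannot rename.

```python
import re

# Same pattern as the git grep above: case-insensitive, optional "_" or "-".
PATTERN = re.compile(r"(black|white)[_-]?list", re.IGNORECASE)

# Hypothetical exclusions: identifiers owned by third-party libraries.
EXCLUDED_SUBSTRINGS = ("WhiteListRoundRobinPolicy",)


def find_flagged_terms(lines):
    """Return (line_number, line) pairs for lines that use a flagged term."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        # Strip excluded identifiers first so they do not trigger a match.
        cleaned = line
        for excluded in EXCLUDED_SUBSTRINGS:
            cleaned = cleaned.replace(excluded, "")
        if PATTERN.search(cleaned):
            flagged.append((number, line))
    return flagged


if __name__ == "__main__":
    sample = [
        "return WhiteListRoundRobinPolicy(hosts)",
        "excluded_columns = kwargs.pop('col_blacklist')",
        "extension-pkg-whitelist=setproctitle",
    ]
    for number, line in find_flagged_terms(sample):
        print(f"{number}: {line}")
```

Excluding by substring rather than by whole file means a new occurrence in an otherwise exempt file would still be reported.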

@ashb ashb removed the area:Scheduler and provider:Apache labels Jun 7, 2020
@potiuk potiuk mentioned this pull request Jun 8, 2020
6 tasks
@potiuk
Member

potiuk commented Jun 8, 2020

Added a PR to automate the check: #9175. I managed to remove the pylintrc entry there (and removed the optional whitelist from a comment in the webserver_config example for Google OAuth).

@nullhack nullhack mentioned this pull request Jun 8, 2020
6 tasks
@ashb
Member Author

ashb commented Jun 8, 2020

Thanks @potiuk -- I've pulled that commit in to here, and added a description field suggesting alternate words to use. It isn't displayed anywhere, but it is at least something in the repo.

@potiuk
Member

potiuk commented Jun 8, 2020

Yep. Closing this one.

@potiuk potiuk closed this Jun 8, 2020
@potiuk potiuk reopened this Jun 8, 2020
@ashb ashb reopened this Jun 8, 2020
@ashb
Member Author

ashb commented Jun 8, 2020

Wrong one :)

@potiuk
Member

potiuk commented Jun 8, 2020

Yep

@ashb
Member Author

ashb commented Jun 8, 2020

Just quarantined tests failing.

@ashb ashb merged commit 6350fd6 into apache:master Jun 8, 2020
@ashb ashb deleted the language-matters branch June 8, 2020 09:01
@madison-ookla
Contributor

Thanks Ash :) This is great to see!

@leahecole
Contributor

I was just going to make this PR - so glad you beat me to it!!

@apache apache deleted a comment Jun 10, 2020
@leahecole
Contributor

For anyone who is doubting this change or curious as to the history of the words, the U.S. National Library of Medicine has a detailed etymology.

@apache apache locked as spam and limited conversation to collaborators Jun 15, 2020
@apache apache deleted a comment from Oseryx Jun 15, 2020
@potiuk potiuk added this to the Airflow 1.10.11 milestone Jun 20, 2020