Skip to content

Conversation

@potiuk
Copy link
Member

@potiuk potiuk commented Feb 20, 2021

This change attempts to stabilize pylint checks. Since we have
recently added self-hosted runners with multiple CPUS, seems that
re-enabling parallel mode for pylint makes perfect sense as we will
finally be able to use the parallelism and speed up static checks
significantly.

Previously the tests were run in single-processor mode in attempt
to avoid random mistakes where different files were processed in
different processes. This led to random pylint or mypy problems
and aither false-positives or false negatives before especially
when it came to circular dependencies. but since we are now past
heavy refactoring, this should be no problem for future changes
and occasional false positive/negative is less disruptive than
long checks.

The attempt is made to apply sorting order in the files passed
to pylint. This should provide more stability in the results
of running the tests in PR and in master.

We had some custom pylint plugins that prevented using of pylint
parallelism. For now we are giving up on one of the plugins
(no asserts use) and we rely on committer's review on that (we
have a rule in place to only use asserts in tests). The other
plugin was replaced by coming back to separation of "main code"
and "test code" and applying different rules to those - we have
now two different configuration files|: pylintrc and
pylintrc-tests to control settings for those two different cases.

Mypy and flake8 have been parallelized at the level of pre-commits.

By implementing those changes we are likely to speed up the
tests on self-hosted runners 6x-8x times.

Also caching of pre-commit environments (including the
pre-commit installation) is improved. Previously we only cached the
environment created by the pre-commit, but with this change we
also cache the pre-commit installation in .local directory - this
should save 1-2 minutes of the job.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@potiuk potiuk requested a review from ashb as a code owner February 20, 2021 13:38
@potiuk potiuk requested review from kaxil and mik-laj February 20, 2021 13:39
@potiuk
Copy link
Member Author

potiuk commented Feb 20, 2021

cc: @xinbinhuang

@potiuk
Copy link
Member Author

potiuk commented Feb 20, 2021

The first results show that in self-hosted runner environment we can decrease pylint checks run from ~20 minutes to ~6 minutes when this one gets implemented. So I'd say - let's merge it quickly. I think stability will be improved as well as we again separate out test pylints from the "main" pylint and I think some of the recent stability issues were connected with tests accidentally being checked with "main" configuration.

@potiuk potiuk force-pushed the improve-static-checks-speed branch from 5e8c9e4 to f39b5f5 Compare February 20, 2021 14:37
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Feb 20, 2021
@github-actions
Copy link

The Workflow run is cancelling this PR. Building images for the PR has failed. Follow the the workflow link to check the reason.

@potiuk potiuk force-pushed the improve-static-checks-speed branch from f39b5f5 to d985734 Compare February 20, 2021 15:01
Copy link
Member

@turbaszek turbaszek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pylintrc and pylintrc-tests

👏

@github-actions
Copy link

The Workflow run is cancelling this PR. Building images for the PR has failed. Follow the the workflow link to check the reason.

@potiuk potiuk force-pushed the improve-static-checks-speed branch from d985734 to 91b2ca3 Compare February 20, 2021 17:37
@github-actions
Copy link

The Workflow run is cancelling this PR. Building images for the PR has failed. Follow the the workflow link to check the reason.

@potiuk potiuk force-pushed the improve-static-checks-speed branch from 91b2ca3 to a62b7ef Compare February 20, 2021 19:23
@github-actions
Copy link

The Workflow run is cancelling this PR. Building images for the PR has failed. Follow the the workflow link to check the reason.

@potiuk potiuk force-pushed the improve-static-checks-speed branch from a62b7ef to 6265130 Compare February 20, 2021 23:28
@github-actions
Copy link

The Workflow run is cancelling this PR. Building images for the PR has failed. Follow the the workflow link to check the reason.

@potiuk potiuk force-pushed the improve-static-checks-speed branch from 6265130 to 38e4f06 Compare February 20, 2021 23:47
This change attempts to stabilize pylint checks. Since we have
recently added self-hosted runners with multiple CPUS, seems that
re-enabling parallel mode for pylint makes perfect sense as we will
finally be able to use the parallelism and speed up static checks
significantly.

Previously the tests were run in single-processor mode in attempt
to avoid random mistakes where different files were processed in
different processes. This led to random pylint or mypy problems
and aither false-positives or false negatives before especially
when it came to circular dependencies. but since we are now past
heavy refactoring, this should be no problem for future changes
and occasional false positive/negative is less disruptive than
long checks.

The attempt is made to apply sorting order in the files passed
to pylint. This should provide more stability in the results
of running the tests in PR and in master.

We had some custom pylint plugins that prevented using of pylint
parallelism. For now we are giving up on one of the plugins
(no asserts use) and we rely on committer's review on that (we
have a rule in place to only use asserts in tests). The other
plugin was replaced by coming back to separation of "main code"
and "test code" and applying different rules to those - we have
now two different configuration files|: pylintrc and
pylintrc-tests to control settings for those two different cases.

Mypy and flake8 have been parallelized at the level of pre-commits.

By implementing those changes we are likely to speed up the
tests on self-hosted runners 6x-8x times.

Also caching of pre-commit environments (including the
pre-commit installation) is improved. Previously we only cached the
environment created by the pre-commit, but with this change we
also cache the pre-commit installation in .local directory - this
should save 1-2 minutes of the job.
@potiuk potiuk force-pushed the improve-static-checks-speed branch from 38e4f06 to 8fb7b3b Compare February 21, 2021 09:15
@potiuk potiuk merged commit 82cb041 into apache:master Feb 21, 2021
@potiuk potiuk deleted the improve-static-checks-speed branch February 21, 2021 09:22
potiuk added a commit to potiuk/airflow that referenced this pull request Feb 25, 2021
The apache#14332 change introduced --local flag for static checks and
caching of the pre-commit virtualenv but the necessary PATH change
was not added to `basic_static_checks.sh`. This PR fixes it.
potiuk added a commit that referenced this pull request Feb 25, 2021
The #14332 change introduced --local flag for static checks and
caching of the pre-commit virtualenv but the necessary PATH change
was not added to `basic_static_checks.sh`. This PR fixes it.
potiuk added a commit that referenced this pull request Mar 3, 2021
This change attempts to stabilize pylint checks. Since we have
recently added self-hosted runners with multiple CPUS, seems that
re-enabling parallel mode for pylint makes perfect sense as we will
finally be able to use the parallelism and speed up static checks
significantly.

Previously the tests were run in single-processor mode in attempt
to avoid random mistakes where different files were processed in
different processes. This led to random pylint or mypy problems
and aither false-positives or false negatives before especially
when it came to circular dependencies. but since we are now past
heavy refactoring, this should be no problem for future changes
and occasional false positive/negative is less disruptive than
long checks.

The attempt is made to apply sorting order in the files passed
to pylint. This should provide more stability in the results
of running the tests in PR and in master.

We had some custom pylint plugins that prevented using of pylint
parallelism. For now we are giving up on one of the plugins
(no asserts use) and we rely on committer's review on that (we
have a rule in place to only use asserts in tests). The other
plugin was replaced by coming back to separation of "main code"
and "test code" and applying different rules to those - we have
now two different configuration files|: pylintrc and
pylintrc-tests to control settings for those two different cases.

Mypy and flake8 have been parallelized at the level of pre-commits.

By implementing those changes we are likely to speed up the
tests on self-hosted runners 6x-8x times.

Also caching of pre-commit environments (including the
pre-commit installation) is improved. Previously we only cached the
environment created by the pre-commit, but with this change we
also cache the pre-commit installation in .local directory - this
should save 1-2 minutes of the job.

(cherry picked from commit 82cb041)
potiuk added a commit that referenced this pull request Mar 3, 2021
The #14332 change introduced --local flag for static checks and
caching of the pre-commit virtualenv but the necessary PATH change
was not added to `basic_static_checks.sh`. This PR fixes it.

(cherry picked from commit c4da66c)
potiuk added a commit that referenced this pull request Mar 3, 2021
The #14332 change introduced --local flag for static checks and
caching of the pre-commit virtualenv but the necessary PATH change
was not added to `basic_static_checks.sh`. This PR fixes it.

(cherry picked from commit c4da66c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:dev-tools full tests needed We need to run full set of tests for this PR to merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants