Background
When we started using pre-commit-dbt (the original project) we found that some hooks did not exhibit what we believe is the "correct" behavior. We made updates to the way some hooks worked and merged those changes. Now that we have moved to dbt-checkpoint and officially updated the version, some people are running into issues due to the new behavior.
What were we trying to solve for?
We noticed that a hook we were using check_model_has_all_columns was not always catching issues. The root cause was that it only checked that a property (yml) file contained all the columns in a model (sql) file if the model was changed. However, if the property file was the only file changed then it was not compared to the model file. This would allow someone to delete or add a column to the property file even though it did not match the model file. We felt this check should assure that these two files are always in sync and as such it should check if either file is changed.
How was this issue addressed?
We "fill in" the missing files so that if the yml file is changed we find the corresponding sql file so we can make sure we do the proper check.
Side effects
@followingell reported an issue with check-model-has-tests where excluded files were being included in the check. @karabulute found what we had implemented and submitted a PR that removed the functionality we added.
Our Opinion
We don't believe we should remove get_missing_file_paths from all the hooks because without this we are indirectly not complying with the spirit of the hook.
Proposal
We propose we add a parameter to the hooks that have this behavior so that files can be excluded, but we cannot use the pre-commit exclude parameter because we don't have that information in dbt-checkpoint.
Instead of doing this
- id: check-model-has-tests
description: "Ensures that the model has a number of tests"
args: ["--test-cnt", "1", "--"]
exclude: |
(?x)(
models/demo
)
We propose this
- id: check-model-has-tests
description: "Ensures that the model has a number of tests"
args: ["--test-cnt", "1", "--exclude","models/demo", "--"]
Which hooks are Impacted
Hooks that implement yml/sql file discovery:
- Discover SQL:
- Discover YML:
- Discover both:
Background
When we started using pre-commit-dbt (the original project) we found that some hooks did not exhibit what we believe is the "correct" behavior. We made updates to the way some hooks worked and merged those changes. Now that we have moved to dbt-checkpoint and officially updated the version, some people are running into issues due to the new behavior.
What were we trying to solve for?
We noticed that a hook we were using
check_model_has_all_columnswas not always catching issues. The root cause was that it only checked that a property (yml) file contained all the columns in a model (sql) file if the model was changed. However, if the property file was the only file changed then it was not compared to the model file. This would allow someone to delete or add a column to the property file even though it did not match the model file. We felt this check should assure that these two files are always in sync and as such it should check if either file is changed.How was this issue addressed?
We "fill in" the missing files so that if the yml file is changed we find the corresponding sql file so we can make sure we do the proper check.
Side effects
@followingell reported an issue with check-model-has-tests where
excludedfiles were being included in the check. @karabulute found what we had implemented and submitted a PR that removed the functionality we added.Our Opinion
We don't believe we should remove
get_missing_file_pathsfrom all the hooks because without this we are indirectly not complying with the spirit of the hook.Proposal
We propose we add a parameter to the hooks that have this behavior so that files can be excluded, but we cannot use the pre-commit
excludeparameter because we don't have that information in dbt-checkpoint.Instead of doing this
We propose this
Which hooks are Impacted
Hooks that implement yml/sql file discovery: