Skip to content

feat(check-builtin-literals): implement builtin hook#890

Closed
lmmx wants to merge 1 commit intoj178:masterfrom
lmmx:check-builtin-literals
Closed

feat(check-builtin-literals): implement builtin hook#890
lmmx wants to merge 1 commit intoj178:masterfrom
lmmx:check-builtin-literals

Conversation

@lmmx
Copy link
Copy Markdown
Collaborator

@lmmx lmmx commented Oct 15, 2025

Description

I picked this from the checklist of builtin hooks to support in #880 because Apache Airflow is using it, and it's the slowest unported running one of their hooks now.

This hook catches unnecessary builtin type constructor calls and suggests replacing them with their literal equivalents (list() to [] etc.).

Binary Size Change
+9.38% (.text: 16.0 MiB → 17.5 MiB)

Demo

Running this on apache/airflow (who use prek!) takes quite a long time for a hook:

louis 🌟 ~/tmp/airflow $ prek run -a --config .pre-commit-config.yaml --verbose
Require literal syntax when initializing builtins........................Passed
- hook id: check-builtin-literals
- duration: 0.70s

pre-commit is about the same speed (4% faster)

louis 🌟 ~/tmp/airflow $ pre-commit run -a --config .pre-commit-config.yaml --verbose                                                                                                                               
[WARNING] Unexpected key(s) present at root: minimum_prek_version                                         
Check that merge conflicts are not being committed.......................Passed                           
- hook id: check-merge-conflict                      
- duration: 0.08s                                    
Detect accidentally committed debug statements...........................Passed                           
- hook id: debug-statements                          
- duration: 0.79s                                    
Require literal syntax when initializing builtins........................Passed                           
- hook id: check-builtin-literals                    
- duration: 0.67s        

With prek installed from this feature branch, it runs in exactly the same amount of time! And also fails!

louis 🌟 ~/tmp/airflow $ cargo install --path ~/dev/prek 2>/dev/null
louis 🌟 ~/tmp/airflow $ prek run -a --config .pre-commit-config.yaml --verbose
Require literal syntax when initializing builtins........................Failed
- hook id: check-builtin-literals
- duration: 0.60s
- exit code: 1
  airflow-core/tests/unit/serialization/test_serde.py:97:45: replace list() with []
  airflow-core/tests/unit/utils/test_db_cleanup.py:91:25: replace dict() with {}
  airflow-core/tests/unit/utils/test_db_cleanup.py:115:25: replace dict() with {}
  providers/google/tests/unit/google/cloud/sensors/test_dataproc_metastore.py:111:48: replace dict() with {}
  providers/google/tests/unit/google/cloud/sensors/test_dataproc_metastore.py:111:56: replace list() with []
  providers/google/tests/unit/google/cloud/sensors/test_dataproc_metastore.py:111:64: replace tuple() with ()
  airflow-core/tests/unit/dag_processing/test_collection.py:371:69: replace dict() with {}
  airflow-core/tests/unit/dag_processing/test_collection.py:470:65: replace dict() with {}
  airflow-core/tests/unit/dag_processing/test_collection.py:583:65: replace dict() with {}
  airflow-core/tests/unit/dag_processing/test_collection.py:613:65: replace dict() with {}
  providers/apache/cassandra/tests/unit/apache/cassandra/sensors/test_record.py:38:29: replace dict() with {}
  providers/apache/cassandra/tests/unit/apache/cassandra/sensors/test_record.py:53:29: replace dict() with {}
  providers/apache/cassandra/tests/unit/apache/cassandra/sensors/test_record.py:70:29: replace dict() with {}
  providers/google/tests/unit/google/cloud/transfers/test_sql_to_gcs.py:119:42: replace dict() with {}
  providers/google/tests/unit/google/cloud/transfers/test_sql_to_gcs.py:174:42: replace dict() with {}
  providers/google/tests/unit/google/cloud/transfers/test_sql_to_gcs.py:218:42: replace dict() with {}
  providers/google/tests/unit/google/cloud/transfers/test_sql_to_gcs.py:262:42: replace dict() with {}
  providers/google/tests/unit/google/cloud/transfers/test_sql_to_gcs.py:307:42: replace dict() with {}
  providers/google/tests/unit/google/cloud/transfers/test_sql_to_gcs.py:369:42: replace dict() with {}
  providers/cncf/kubernetes/tests/unit/cncf/kubernetes/test_pod_generator.py:748:25: replace dict() with {}
  providers/apache/cassandra/tests/unit/apache/cassandra/sensors/test_table.py:37:29: replace dict() with {}
  providers/apache/cassandra/tests/unit/apache/cassandra/sensors/test_table.py:53:29: replace dict() with {}
  providers/apache/cassandra/tests/unit/apache/cassandra/sensors/test_table.py:67:29: replace dict() with {}
  task-sdk/tests/task_sdk/bases/test_operator.py:881:12: replace dict() with {}
  providers/amazon/src/airflow/providers/amazon/aws/executors/ecs/ecs_executor_config.py:52:17: replace dict() with {}
  providers/amazon/src/airflow/providers/amazon/aws/executors/batch/batch_executor_config.py:48:74: replace dict() with {}
  providers/openlineage/tests/unit/openlineage/test_conf.py:267:11: replace dict() with {}
  providers/apache/hdfs/tests/unit/apache/hdfs/sensors/test_web_hdfs.py:39:29: replace dict() with {}
  providers/apache/hdfs/tests/unit/apache/hdfs/sensors/test_web_hdfs.py:55:29: replace dict() with {}
  providers/apache/hdfs/tests/unit/apache/hdfs/sensors/test_web_hdfs.py:75:29: replace dict() with {}
  providers/apache/hdfs/tests/unit/apache/hdfs/sensors/test_web_hdfs.py:93:29: replace dict() with {}
  dev/breeze/tests/test_selective_checks.py:1173:18: replace tuple() with ()
  providers/amazon/tests/unit/amazon/aws/sensors/test_s3.py:558:36: replace dict() with {}
  providers/amazon/tests/unit/amazon/aws/sensors/test_s3.py:560:36: replace dict() with {}
  providers/amazon/tests/unit/amazon/aws/sensors/test_s3.py:562:32: replace dict() with {}

I was confused why this gives different results, but they don't look wrong. In fact they seem to have just been missed by the Python version!

These are all valid hits:

louis 🌟 ~/tmp/airflow $ rg 'dict\(\)' airflow-core/tests/unit/utils/test_db_cleanup.py
91:            pytest.param(dict(), True, id="not supplied"),
115:            pytest.param(dict(), False, id="not supplied"),
louis 🌟 ~/tmp/airflow $ rg 'dict\(\)' airflow-core/tests/unit/dag_processing/test_collection.py
371:            update_dag_parsing_results_in_db("testing", None, [dag], dict(), None, set(), session)
470:        update_dag_parsing_results_in_db("testing", None, [dag], dict(), parse_duration, set(), session)
583:        update_dag_parsing_results_in_db("testing", None, [dag], dict(), None, set(), session)
613:        update_dag_parsing_results_in_db("testing", None, [dag], dict(), None, set(), session)

As well as finding more hits, the Rust version finishes 15% faster, in just 0.60s 🎉

@codecov
Copy link
Copy Markdown

codecov Bot commented Oct 15, 2025

Codecov Report

❌ Patch coverage is 96.08541% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.00%. Comparing base (7069606) to head (7a953dc).
⚠️ Report is 131 commits behind head on master.

Files with missing lines Patch % Lines
...builtin/pre_commit_hooks/check_builtin_literals.rs 96.05% 11 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #890      +/-   ##
==========================================
- Coverage   90.02%   90.00%   -0.02%     
==========================================
  Files          64       65       +1     
  Lines       11999    12222     +223     
==========================================
+ Hits        10802    11001     +199     
- Misses       1197     1221      +24     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lmmx lmmx force-pushed the check-builtin-literals branch from 640b596 to 02dd3ed Compare October 15, 2025 11:51
@lmmx lmmx mentioned this pull request Oct 15, 2025
34 tasks
@lmmx lmmx force-pushed the check-builtin-literals branch 2 times, most recently from 6fcfb6e to 0b88169 Compare October 15, 2025 13:06
@lmmx lmmx mentioned this pull request Oct 15, 2025
2 tasks
@lmmx lmmx force-pushed the check-builtin-literals branch 4 times, most recently from 5e43fa7 to 7eb1764 Compare October 19, 2025 08:58
@lmmx lmmx force-pushed the check-builtin-literals branch from 7eb1764 to 7a953dc Compare October 19, 2025 10:24
@j178
Copy link
Copy Markdown
Owner

j178 commented Nov 19, 2025

Similar to #884 (comment), I’m gonna close this one too. Thanks for your effort though!

@j178 j178 closed this Nov 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants