Implement check-xml as builtin hook#894
Merged
j178 merged 2 commits intoj178:masterfrom Oct 17, 2025
Merged
Conversation
34 tasks
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #894 +/- ##
==========================================
+ Coverage 89.98% 90.14% +0.15%
==========================================
Files 63 64 +1
Lines 11756 11939 +183
==========================================
+ Hits 10579 10762 +183
Misses 1177 1177 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Owner
|
Thank you! I’m not really keen on pulling in a big dependency just for one hook, unless it’s something super commonly used (like check-toml). Like other hook ports, could you check how much the binary size changes? |
Collaborator
Author
|
Sure let me take a look... cargo build --release
du -h prek
So +0.18% bigger binary, not much. It is based on things we already have in the compiled tree I think: see its Cargo.toml |
Owner
|
That's great! |
2 tasks
55deba8 to
c00932e
Compare
Collaborator
Author
|
Merge conflicts resolved |
# Conflicts: # tests/builtin_hooks.rs
check-xml as builtin hook
Owner
|
Thank you! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
To go with the existing format hooks, this is a hook to fast path the check XML hook. Follows #880 where it was noted as one of the hooks used by Airflow.
I tried a few different libraries: Python's XML checker uses a SAX parser (
xml.sax.handlerin the original), from the standard library (here)Given that none of them matched the behaviour of the Python SAX parser, I then just modified the quick-xml result to reject that, and also to reject empty files (which the Python one rejects but this didn't).
I used this script to test in a
~/tmpsubdirClick to show repro script
I then ran it with this branch of prek and regular pre-commit and compared their results to ensure we maintain the same outcomes
The same files are rejected
The run takes 0.00s instead of 0.05s
New quick-xml dependency increases compilation time from 21.8s to 22.3s (+2%)
Unit and snapshot tests added and compared against check-xml behaviour
Demo
With the Apache Airflow repo, pre-commit takes 0.1s
louis 🌟 ~/tmp/airflow $ pre-commit run -a --verbose Check XML files with xmllint.............................................Passed - hook id: check-xml - duration: 0.1sWhile prek takes 0.01s (10x faster) 🎉
louis 🌟 ~/tmp/airflow $ prek run -a --verbose Check XML files with xmllint.............................................Passed - hook id: check-xml - duration: 0.01s