Merged
Add support for matching files by their content type (extension, shebang,
or magic number) rather than just glob patterns. This addresses user
requests for matching scripts without file extensions.
**Problem:**
Users want to match files based on their content type (e.g., Python
scripts without .py extension, shell scripts identified by shebang)
rather than relying solely on filename patterns.
**Solution:**
- Added `types` field to Step configuration (OR logic - matches ANY type)
- Implemented file type detection in new `src/file_type.rs` module:
  - Extension-based detection (e.g., .py → python)
  - Shebang parsing for scripts (e.g., #!/usr/bin/env python3)
  - Magic number detection using `infer` crate for binary files
  - Special filename detection (e.g., Dockerfile)
- Results are cached for performance
**Supported types include:**
- Languages: python, javascript, typescript, ruby, go, rust, etc.
- Shells: shell, bash, zsh, fish
- Data formats: json, yaml, toml, xml
- Special: text, binary, executable, dockerfile
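The name-based part of the detection described above can be pictured roughly as follows. This is only a minimal sketch, not the code in `src/file_type.rs`: the helper name `types_by_name` and the three-entry extension table are assumptions made for illustration, and shebang and magic number detection are left out.

```rust
use std::collections::BTreeSet;
use std::path::Path;

/// Hypothetical helper (not from the PR): derive type tags from the
/// filename and extension only.
fn types_by_name(path: &Path) -> BTreeSet<&'static str> {
    let mut types = BTreeSet::new();

    // Special filenames carry a type even though they have no extension.
    if path.file_name().and_then(|n| n.to_str()) == Some("Dockerfile") {
        types.insert("dockerfile");
    }

    // Illustrative extension table; the real module covers many more types.
    match path.extension().and_then(|e| e.to_str()) {
        Some("py") => {
            types.insert("python");
        }
        Some("sh") => {
            types.insert("shell");
        }
        Some("json") => {
            types.insert("json");
        }
        _ => {}
    }

    types
}
```

An extensionless script such as `bin/tool` yields an empty set here, which is exactly the case the shebang and content checks are meant to cover.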
**Example usage:**
```pkl
["black"] {
  types = List("python") // Matches .py files AND Python scripts
  fix = "black {{files}}"
}
["shellcheck"] {
  types = List("shell") // Matches shell files by extension & shebang
  check = "shellcheck {{files}}"
}
```
**Changes:**
- Added `infer` crate dependency for magic number detection
- Created `src/file_type.rs` module with detection logic and tests
- Updated `pkl/Config.pkl` schema with `types` field
- Updated `src/step.rs` to apply type filtering after glob/regex
- Updated builtins (black, shfmt) to use types field
- Added comprehensive bats tests in `test/types.bats`
- Updated documentation with examples and supported types
Fixes: #413
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
pkl/builtins/shfmt.pkl
Outdated
stage = glob
// Matches shell files by extension and shebang (including extensionless scripts)
types = List("shell")
glob = List("**/*.bats") // Also include .bats files which may not be detected by shebang
Bug: Misclassified .bats files excluded by filters.
The `types` and `glob` filters combine with AND logic, but `.bats` files won't match `types = List("shell")`: the `.bats` extension isn't recognized as shell in `get_types_by_extension`, and the `#!/usr/bin/env bats` shebang isn't handled in `detect_shebang`. As a result, `.bats` files are excluded despite matching the glob pattern, which breaks shfmt for bats test files.
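To make the interaction concrete, here is a small sketch of the filter semantics the comment describes: OR across the requested types, AND between the type filter and the glob result. The function names and the idea that a `.bats` file is detected only as `text` are assumptions for illustration, not the actual code in `src/step.rs`.

```rust
use std::collections::HashSet;

/// Hypothetical: a file passes the type filter if it has ANY requested type.
/// An empty `wanted` list means no type filter is configured.
fn type_matches(detected: &HashSet<&str>, wanted: &[&str]) -> bool {
    wanted.is_empty() || wanted.iter().any(|t| detected.contains(*t))
}

/// Hypothetical: glob and type filters must BOTH accept the file.
fn step_matches(glob_hit: bool, detected: &HashSet<&str>, wanted: &[&str]) -> bool {
    glob_hit && type_matches(detected, wanted)
}
```

Under these assumptions, a `.bats` file that matches `**/*.bats` but is detected as `{"text"}` fails `step_matches(true, &detected, &["shell"])`, so the glob cannot widen the set of files that `types` already rejected.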
- Use next_back() instead of last() for DoubleEndedIterator
- Remove unnecessary trim() before split_whitespace()
- Replace vec! with slices in tests
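The first two cleanups can be illustrated with a shebang parser. This is a sketch under assumptions: `shebang_interpreter` is a hypothetical name, and the handling of `env` is simplified (flags such as `env -S` are not considered).

```rust
/// Hypothetical helper: extract the interpreter name from a file's first line.
fn shebang_interpreter(first_line: &str) -> Option<&str> {
    let rest = first_line.strip_prefix("#!")?;

    // split_whitespace() already skips leading and trailing whitespace,
    // so calling trim() first would be redundant.
    let mut parts = rest.split_whitespace();
    let program = parts.next()?;

    // split('/') is a DoubleEndedIterator, so next_back() fetches the final
    // path component directly instead of walking every component like last().
    let name = program.split('/').next_back()?;

    // For `#!/usr/bin/env python3`, the interpreter is env's first argument.
    if name == "env" {
        parts.next()
    } else {
        Some(name)
    }
}
```

For example, `shebang_interpreter("#!/usr/bin/env bats")` yields `Some("bats")`, which a caller would still need to map to a type such as `shell`.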
Keep builtins using glob patterns for backwards compatibility. Users can opt-in to types in their own configurations.
**Problem:**
The function unconditionally inserted "binary" after infer detected a file type, preventing the fallback text-vs-binary detection from executing. Files without recognized magic numbers wouldn't get proper text/binary classification.
**Root Cause:**
1. Line 171 inserted "binary" immediately after infer returned a result
2. This made types non-empty, so the condition at line 212 (if types.is_empty()) would never be true
3. The fallback null-byte scanning never ran for unrecognized file types
4. The file variable was opened but unused since infer::get_from_path opens the file independently
**Solution:**
- Early return after magic number detection succeeds
- Only open file once for the null-byte scanning fallback
- Now files without magic numbers properly fall through to text/binary detection
**Tests:**
- Added test_text_file_without_extension() to verify plain text detection
- Added test_binary_file_detection() to verify null-byte detection
- All existing tests still pass
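The control flow after this fix might look roughly like the sketch below. It works on an in-memory byte slice rather than a file, and `magic_type` is a stand-in that checks only the PNG signature; the PR itself uses the `infer` crate, and the `png` tag and function names here are assumptions.

```rust
use std::collections::BTreeSet;

/// Stand-in for magic number detection (the PR uses the `infer` crate).
/// Only the PNG signature prefix is checked, purely for illustration.
fn magic_type(bytes: &[u8]) -> Option<&'static str> {
    if bytes.starts_with(&[0x89, b'P', b'N', b'G']) {
        Some("png")
    } else {
        None
    }
}

/// Hypothetical classifier showing the early return plus null-byte fallback.
fn classify(bytes: &[u8]) -> BTreeSet<&'static str> {
    let mut types = BTreeSet::new();

    if let Some(kind) = magic_type(bytes) {
        types.insert(kind);
        types.insert("binary");
        // Early return: "binary" is only added when a magic number matched.
        return types;
    }

    // Fallback for unrecognized content: a null byte suggests binary data.
    if bytes.contains(&0) {
        types.insert("binary");
    } else {
        types.insert("text");
    }
    types
}
```

Because "binary" is inserted only inside the magic number branch, content with no recognized signature always reaches the null-byte scan.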
**Problem:**
Symlinks were detected and returned early with only the "symlink" type,
preventing them from being matched by content-based type filters. When
allow_symlinks is enabled and a user specifies types = List("python"),
symlinks pointing to Python files wouldn't match because get_file_types
never checked the target's extension, shebang, or content.
**Root Cause:**
The early return at line 26 bypassed all subsequent type detection logic.
Symlinks were classified as {"symlink"} only, losing information about
their target file type.
**Solution:**
1. Don't return early for symlinks - continue detection
2. Read the symlink target path when checking filename and extension
3. Use std::fs::metadata (follows symlinks) for executable check
4. Keep symlink_metadata only for detecting the symlink itself
**Result:**
Symlinks now get both "symlink" type AND their target's types:
- Link to .py file: {"symlink", "python", "text"}
- Link to shell script: {"symlink", "shell", "bash", "text"}
**Tests:**
- test_symlink_to_python_file: Verifies symlinks get target types
- test_symlink_matches_target_type: Verifies type filtering works
- All 114 tests pass
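A rough sketch of the symlink handling described above, assuming a hypothetical `types_following_symlink` helper. It uses `std::fs::canonicalize` to resolve the target (the PR may read the link differently), checks only the `.py` extension, and omits the executable check that would go through `std::fs::metadata`.

```rust
use std::collections::BTreeSet;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: tag a path as "symlink" and keep detecting on its target.
fn types_following_symlink(path: &Path) -> io::Result<BTreeSet<&'static str>> {
    let mut types = BTreeSet::new();

    // symlink_metadata() inspects the link itself and does not follow it.
    let is_link = fs::symlink_metadata(path)?.file_type().is_symlink();
    if is_link {
        types.insert("symlink");
    }

    // No early return: resolve the target so name-based checks see its extension.
    let target = if is_link {
        fs::canonicalize(path)?
    } else {
        path.to_path_buf()
    };

    // Illustrative single-entry extension check.
    if target.extension().and_then(|e| e.to_str()) == Some("py") {
        types.insert("python");
    }

    Ok(types)
}
```

With this shape, a link named `run` that points at `script.py` reports both `symlink` and `python`, so a `types = List("python")` filter can match it.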
jdx added a commit that referenced this pull request on Nov 15, 2025
## [1.21.0](https://github.com/jdx/hk/compare/v1.20.0..v1.21.0) - 2025-11-15

### 🚀 Features

- **(dprint)** new builtin by [@scop](https://github.com/scop) in [#402](#402)
- **(mypy,ruff,ruff_format)** associate with .pyi by [@scop](https://github.com/scop) in [#404](#404)
- **(prettier)** support Vue files by [@minusfive](https://github.com/minusfive) in [#388](#388)
- **(terraform,tofu)** include .tftest.hcl in glob by [@scop](https://github.com/scop) in [#397](#397)
- **(tflint)** add fix command by [@scop](https://github.com/scop) in [#401](#401)
- **(typos)** new builtin by [@scop](https://github.com/scop) in [#400](#400)
- use recursive glob patterns in all builtins by [@jdx](https://github.com/jdx) in [#383](#383)
- shfmt improvements by [@scop](https://github.com/scop) in [#410](#410)
- add content-based file type matching by [@jdx](https://github.com/jdx) in [#416](#416)
- add clap-sort unit test and sort CLI flags alphabetically by [@jdx](https://github.com/jdx) in [#419](#419)
- Add alternate config directory support with tests by [@jdx](https://github.com/jdx) in [#407](#407)

### 🐛 Bug Fixes

- **(golangci-lint)** check with --fix=false by [@scop](https://github.com/scop) in [#399](#399)
- **(shfmt)** don't pass -s by [@scop](https://github.com/scop) in [#398](#398)
- **(tf_lint)** don't pass filenames by [@scop](https://github.com/scop) in [#396](#396)
- Import elixir builtins by [@arthurcogo](https://github.com/arthurcogo) in [#390](#390)
- Add warning for existing Git hooks path by [@jdx](https://github.com/jdx) in [#409](#409)
- prevent untracked files from being staged with <JOB_FILES> by [@jdx](https://github.com/jdx) in [#408](#408)

### 📚 Documentation

- Add Elixir builtins to docs by [@arthurcogo](https://github.com/arthurcogo) in [#389](#389)
- glossary grammar fix by [@scop](https://github.com/scop) in [#395](#395)
- fix link to Pkl language docs by [@scop](https://github.com/scop) in [#394](#394)

### 📦️ Dependency Updates

- update anthropics/claude-code-action digest to 8a1c437 by [@renovate[bot]](https://github.com/renovate[bot]) in [#391](#391)
- update jdx/mise-action digest to be3be22 by [@renovate[bot]](https://github.com/renovate[bot]) in [#392](#392)
- update github artifact actions (major) by [@renovate[bot]](https://github.com/renovate[bot]) in [#393](#393)
- update rust crate infer to 0.19 by [@renovate[bot]](https://github.com/renovate[bot]) in [#418](#418)
- update jdx/mise-action digest to 9dc7d5d by [@renovate[bot]](https://github.com/renovate[bot]) in [#417](#417)

### New Contributors

- @scop made their first contribution in [#410](#410)
- @arthurcogo made their first contribution in [#390](#390)
- @minusfive made their first contribution in [#388](#388)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Cut v1.21.0: bump versions, add changelog, refresh CLI docs/usage with alphabetized flags, update templates/messages, and tweak deps.
>
> - **Release 1.21.0**:
>   - Bump `hk` version to `1.21.0` across `Cargo.toml`, `Cargo.lock`, `hk.usage.kdl`, docs, and example configs.
>   - Add `CHANGELOG.md` entry for 1.21.0.
> - **CLI/Docs**:
>   - Regenerate `docs/cli/commands.json`, CLI markdown, and `hk.usage.kdl` with alphabetized flags and reordered options (e.g., `--fix`, `--from-ref`, `--to-ref`, `--skip-step`).
>   - Update migration docs flag order (`--force` before `--output`).
>   - Update `hk test` flags (add `--list`, reorder `--name`/`--step`).
> - **Templates/References**:
>   - Update Pkl `amends/import` URLs in examples and generator (`src/cli/init.rs`) to `v1.21.0`.
>   - Update Pkl error hint in `src/config.rs` to reference `v1.21.0`.
> - **Dependencies**:
>   - Bump crates (`bytes` 1.11.0, `serde_with` 3.16.0) in `Cargo.lock`.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 93af5e6. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Co-authored-by: mise-en-dev <[email protected]>
tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request on Nov 21, 2025
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [hk](https://github.com/jdx/hk) | minor | `1.20.0` -> `1.22.0` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>jdx/hk (hk)</summary>

### [`v1.22.0`](https://github.com/jdx/hk/blob/HEAD/CHANGELOG.md#1220---2025-11-19)

[Compare Source](jdx/hk@v1.21.1...v1.22.0)

##### 🚀 Features

- Add `stdin` to step config by [@thejcannon](https://github.com/thejcannon) in [#435](jdx/hk#435)

##### 🐛 Bug Fixes

- save patch backup files when using git as stash method by [@jdx](https://github.com/jdx) in [#434](jdx/hk#434)

##### 📚 Documentation

- Clarify `stash` default (behavior) by [@thejcannon](https://github.com/thejcannon) in [#431](jdx/hk#431)
- Clarify hook fix default by [@thejcannon](https://github.com/thejcannon) in [#433](jdx/hk#433)

### [`v1.21.1`](https://github.com/jdx/hk/blob/HEAD/CHANGELOG.md#1211---2025-11-19)

[Compare Source](jdx/hk@v1.21.0...v1.21.1)

##### 🐛 Bug Fixes

- **(ruff)** Make `ruff` respect user config `exclude` by [@thejcannon](https://github.com/thejcannon) in [#421](jdx/hk#421)
- **(ruff\_format)** Pass `--force-exclude` to `ruff format` (as well) by [@thejcannon](https://github.com/thejcannon) in [#428](jdx/hk#428)
- Fix --check docstring by [@thejcannon](https://github.com/thejcannon) in [#423](jdx/hk#423)
- Configuration Read Support YML File Extension by [@hcoona](https://github.com/hcoona) in [#427](jdx/hk#427)
- treat check\_list\_files stderr as informational, not an error by [@jdx](https://github.com/jdx) in [#425](jdx/hk#425)
- remove trailing whitespace in ruff\_format.pkl by [@jdx](https://github.com/jdx) in [9f4abdc](jdx/hk@9f4abdc)

##### 🚜 Refactor

- Enable `trailing-whitespace` in this repo by [@thejcannon](https://github.com/thejcannon) in [#429](jdx/hk#429)

##### 📚 Documentation

- Don't suggest configuring hk in config env by [@thejcannon](https://github.com/thejcannon) in [#424](jdx/hk#424)

##### New Contributors

- [@thejcannon](https://github.com/thejcannon) made their first contribution in [#428](jdx/hk#428)
- [@hcoona](https://github.com/hcoona) made their first contribution in [#427](jdx/hk#427)

### [`v1.21.0`](https://github.com/jdx/hk/blob/HEAD/CHANGELOG.md#1210---2025-11-15)

[Compare Source](jdx/hk@v1.20.0...v1.21.0)

##### 🚀 Features

- **(dprint)** new builtin by [@scop](https://github.com/scop) in [#402](jdx/hk#402)
- **(mypy,ruff,ruff\_format)** associate with .pyi by [@scop](https://github.com/scop) in [#404](jdx/hk#404)
- **(prettier)** support Vue files by [@minusfive](https://github.com/minusfive) in [#388](jdx/hk#388)
- **(terraform,tofu)** include .tftest.hcl in glob by [@scop](https://github.com/scop) in [#397](jdx/hk#397)
- **(tflint)** add fix command by [@scop](https://github.com/scop) in [#401](jdx/hk#401)
- **(typos)** new builtin by [@scop](https://github.com/scop) in [#400](jdx/hk#400)
- use recursive glob patterns in all builtins by [@jdx](https://github.com/jdx) in [#383](jdx/hk#383)
- shfmt improvements by [@scop](https://github.com/scop) in [#410](jdx/hk#410)
- add content-based file type matching by [@jdx](https://github.com/jdx) in [#416](jdx/hk#416)
- add clap-sort unit test and sort CLI flags alphabetically by [@jdx](https://github.com/jdx) in [#419](jdx/hk#419)
- Add alternate config directory support with tests by [@jdx](https://github.com/jdx) in [#407](jdx/hk#407)

##### 🐛 Bug Fixes

- **(golangci-lint)** check with --fix=false by [@scop](https://github.com/scop) in [#399](jdx/hk#399)
- **(shfmt)** don't pass -s by [@scop](https://github.com/scop) in [#398](jdx/hk#398)
- **(tf\_lint)** don't pass filenames by [@scop](https://github.com/scop) in [#396](jdx/hk#396)
- Import elixir builtins by [@arthurcogo](https://github.com/arthurcogo) in [#390](jdx/hk#390)
- Add warning for existing Git hooks path by [@jdx](https://github.com/jdx) in [#409](jdx/hk#409)
- prevent untracked files from being staged with \<JOB\_FILES> by [@jdx](https://github.com/jdx) in [#408](jdx/hk#408)

##### 📚 Documentation

- Add Elixir builtins to docs by [@arthurcogo](https://github.com/arthurcogo) in [#389](jdx/hk#389)
- glossary grammar fix by [@scop](https://github.com/scop) in [#395](jdx/hk#395)
- fix link to Pkl language docs by [@scop](https://github.com/scop) in [#394](jdx/hk#394)

##### 📦️ Dependency Updates

- update anthropics/claude-code-action digest to [`8a1c437`](jdx/hk@8a1c437) by [@renovate\[bot\]](https://github.com/renovate\[bot]) in [#391](jdx/hk#391)
- update jdx/mise-action digest to [`be3be22`](jdx/hk@be3be22) by [@renovate\[bot\]](https://github.com/renovate\[bot]) in [#392](jdx/hk#392)
- update github artifact actions (major) by [@renovate\[bot\]](https://github.com/renovate\[bot]) in [#393](jdx/hk#393)
- update rust crate infer to 0.19 by [@renovate\[bot\]](https://github.com/renovate\[bot]) in [#418](jdx/hk#418)
- update jdx/mise-action digest to [`9dc7d5d`](jdx/hk@9dc7d5d) by [@renovate\[bot\]](https://github.com/renovate\[bot]) in [#417](jdx/hk#417)

##### New Contributors

- [@scop](https://github.com/scop) made their first contribution in [#410](jdx/hk#410)
- [@arthurcogo](https://github.com/arthurcogo) made their first contribution in [#390](jdx/hk#390)
- [@minusfive](https://github.com/minusfive) made their first contribution in [#388](jdx/hk#388)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
## Summary

Adds support for matching files by their content type (extension, shebang, or magic number) rather than just glob patterns. This addresses the feature request in #413 for matching scripts without file extensions.

## Problem

Users previously had to use complex glob patterns with excludes to match files, making it difficult to handle scripts without file extensions. For example, matching Python scripts that use `#!/usr/bin/env python3` but don't have a `.py` extension required error-prone regex patterns.

## Solution

Added a new `types` field to the Step configuration that uses OR logic (a file matches if it has ANY of the specified types).

## Implementation

- File type detection in new `src/file_type.rs` module with caching:
  - Extension-based detection (e.g., `.py` → `python`)
  - Shebang parsing (e.g., `#!/usr/bin/env python3`)
  - Magic number detection using the `infer` crate for binary files
  - Special filename detection (e.g., `Dockerfile`)
- Supported types include:
  - Languages: python, javascript, typescript, ruby, go, rust, etc.
  - Shells: shell, bash, zsh, fish
  - Data formats: json, yaml, toml, xml
  - Special: text, binary, executable, dockerfile
- Updated builtins (black, shfmt) to use the `types` field

## Test Plan

- Unit tests in `src/file_type.rs` for all detection methods
- Integration tests in `test/types.bats`

## Migration

This is fully backwards compatible: existing glob patterns continue to work. The `types` field is optional and can be used alongside `glob` patterns for more precise filtering.

🤖 Generated with Claude Code

> [!NOTE]
> Introduces file type detection (extension, shebang, magic) and a new `Step.types` filter, with docs/tests and `infer` dependency.
>
> - Add `src/file_type.rs` with cached detection by extension, shebang, and magic numbers (via `infer`); handles symlinks/executables.
> - Extend `Step` with `types: List<String>` and filter files accordingly; update skip logic; wire module in `src/main.rs`.
> - Add `types` field to `pkl/Config.pkl` `Step` schema.
> - Document `<STEP>.types` in `docs/configuration.md` with examples and supported types.
> - Add unit tests in `src/file_type.rs` and integration tests in `test/types.bats`.
> - Add `infer` (plus transitive `cfb`, `uuid`) in `Cargo.toml`/`Cargo.lock`.
>
> Written by Cursor Bugbot for commit 74ac521. This will update automatically on new commits.