Apply fsspec config in HfFileSystem metaclass #4062
The HfFileSystem metaclass _Cached overrides fsspec.spec._Cached.__call__ but never calls apply_config, so fsspec config values (from environment variables or config files) are silently dropped for HfFileSystem while they work for every other fsspec filesystem. Call apply_config on kwargs at the start of __call__, matching the upstream fsspec._Cached behavior. Fixes huggingface#3996
Hi @joaquinhuigomez, thanks for the PR! Do you have a minimal example showcasing the fix? i.e. a simple test that I could run both on also cc @lhoestq for viz'
lhoestq
left a comment
LGTM for consistency with fsspec
it means that fsspec config takes precedence over huggingface_hub constants, but that makes sense imo (it's also the case for s3fs, for example)
apply_config is also an old/mature function so I'm fine with using it as is
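To make the precedence point above concrete, here is a minimal sketch of the layering it implies: explicit keyword arguments first, then fsspec config, then library constants. It is not the huggingface_hub or fsspec implementation; `LIBRARY_DEFAULTS`, `resolve_options`, and the example URLs are hypothetical names chosen for illustration.

```python
from collections import ChainMap

# Hypothetical stand-in for library-level constants (not real huggingface_hub values).
LIBRARY_DEFAULTS = {"endpoint": "https://library-default.example"}


def resolve_options(explicit, config, defaults=LIBRARY_DEFAULTS):
    """Merge three option layers; the first mapping that defines a key wins.

    Assumed order: explicit kwargs > fsspec config > library constants.
    """
    return dict(ChainMap(explicit, config, defaults))


# Hypothetical config entry, as if read from an env var or config file.
config = {"endpoint": "https://from-config.example"}

no_config = resolve_options({}, {})
with_config = resolve_options({}, config)
with_explicit = resolve_options({"endpoint": "https://explicit.example"}, config)
```

Under this assumed ordering, a value set in config replaces the library constant, while a caller who passes the option explicitly still gets what they asked for.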
Thanks for the fix @joaquinhuigomez and thanks for reviewing @lhoestq! I'm fine with merging as soon as the CI is green :)
This PR has been shipped as part of the v1.12.0 release.
HfFileSystem's _Cached metaclass overrides fsspec.spec._Cached.__call__ but skips the apply_config call that upstream fsspec uses to read config values from environment variables and config files. As a result, HfFileSystem silently ignores fsspec config that works on every other filesystem backend.

This patch calls apply_config(cls, kwargs) at the start of __call__, matching upstream fsspec behavior.

Fixes #3996
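The change described above can be sketched with a standard-library stand-in. This is not the huggingface_hub or fsspec source: `CONF`, `apply_config_sketch`, `CachedSketch`, and `DemoFileSystem` are hypothetical names, and the merge rule (config supplies defaults, explicit kwargs win) is an assumption modeled on how fsspec's apply_config is described in this PR.

```python
import hashlib

# Hypothetical stand-in for fsspec's global config: protocol -> default kwargs.
CONF = {}


def apply_config_sketch(cls, kwargs, conf_dict=None):
    """Return a new kwargs dict with config defaults filled in for cls.protocol."""
    conf_dict = CONF if conf_dict is None else conf_dict
    protos = cls.protocol if isinstance(cls.protocol, (tuple, list)) else [cls.protocol]
    merged = {}
    for proto in protos:
        merged.update(conf_dict.get(proto, {}))
    merged.update(kwargs)  # explicit kwargs override config defaults
    return merged


class CachedSketch(type):
    """Metaclass that reuses instances keyed by a token of class, args and kwargs."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._cache = {}

    def __call__(cls, *args, **kwargs):
        # The step this PR adds: apply config defaults before tokenizing.
        kwargs = apply_config_sketch(cls, kwargs)
        raw = repr((cls.__name__, args, sorted(kwargs.items())))
        token = hashlib.sha256(raw.encode()).hexdigest()
        if token not in cls._cache:
            obj = super().__call__(*args, **kwargs)
            obj.storage_options = kwargs
            cls._cache[token] = obj
        return cls._cache[token]


class DemoFileSystem(metaclass=CachedSketch):
    protocol = "demo"

    def __init__(self, **storage_options):
        self.endpoint = storage_options.get("endpoint", "https://default.example")
```

Because the config defaults are merged before the token is computed, changing the config yields a different cache key and therefore a fresh instance, while repeated calls under the same config still return the cached one.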
Note
Medium Risk
Changes HfFileSystem instantiation/caching by applying fsspec env/config defaults before tokenization, which can alter cache keys and runtime options for existing users relying on implicit defaults.

Overview
Ensures HfFileSystem honors standard fsspec configuration (env vars / config files) by calling apply_config in the custom _Cached.__call__ before computing the instance cache token.

This aligns HfFileSystem behavior with upstream fsspec filesystems so default options from config are applied consistently when creating/reusing cached filesystem instances.

Reviewed by Cursor Bugbot for commit d89bbf8.