[CLI] [API] Add HfApi.copy_files method to copy files remotely and update 'hf buckets cp' #3874
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Current status: cross-bucket copies do not work yet. An extra call to CAS is needed to register the Xet hash in the destination bucket.
    else:
        all_adds.append((_download_from_repo(file.path), target_path))
Sub-optimal: this could be parallelized, but that's not something we want to optimize for now.
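For reference, a parallel version of this fallback could look roughly like the sketch below. It is only an illustration, not code from the PR: `download_all` is a hypothetical helper, and the `download` callable stands in for the PR's `_download_from_repo`, whose real signature is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def download_all(
    items: List[Tuple[str, str]],
    download: Callable[[str], bytes],
    max_workers: int = 4,
) -> List[Tuple[bytes, str]]:
    """Download every (source_path, target_path) pair concurrently.

    `download` is a hypothetical stand-in for a per-file download helper.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so each payload
        # stays aligned with its target path.
        payloads = list(pool.map(download, (src for src, _ in items)))
    return [(payload, target) for payload, (_, target) in zip(payloads, items)]
```

Threads fit here because the work is I/O-bound; the sequential loop in the PR remains the simpler choice while this path is not a bottleneck.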
def _parse_hf_copy_handle(hf_handle: str) -> _BucketCopyHandle | _RepoCopyHandle:
    # TODO: Harmonize hf:// parsing. See https://github.com/huggingface/huggingface_hub/issues/3971
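For readers unfamiliar with the handle format, here is a minimal, self-contained sketch of what parsing `hf://` copy handles could look like. It is not the PR's `_parse_hf_copy_handle`. The names `BucketHandle`, `RepoHandle` and `parse_copy_handle` are hypothetical, and the URL layout (`hf://buckets/<namespace>/<name>/<path>` for buckets, an optional `models/`, `datasets/` or `spaces/` prefix followed by `<namespace>/<name>[@revision]/<path>` for repos) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Union

HF_PREFIX = "hf://"
# Assumed plural prefixes -> repo type; a missing prefix is treated as a model repo.
REPO_TYPE_PREFIXES = {"models": "model", "datasets": "dataset", "spaces": "space"}


@dataclass(frozen=True)
class BucketHandle:  # hypothetical stand-in for the PR's _BucketCopyHandle
    bucket_id: str
    path: str


@dataclass(frozen=True)
class RepoHandle:  # hypothetical stand-in for the PR's _RepoCopyHandle
    repo_type: str
    repo_id: str
    revision: Optional[str]
    path: str


def parse_copy_handle(handle: str) -> Union[BucketHandle, RepoHandle]:
    """Parse an hf:// handle into a bucket or repo handle (assumed layout)."""
    if not handle.startswith(HF_PREFIX):
        raise ValueError(f"Expected an hf:// handle, got: {handle!r}")
    parts = [p for p in handle[len(HF_PREFIX):].split("/") if p]

    if parts and parts[0] == "buckets":
        if len(parts) < 3:
            raise ValueError(f"Bucket handle needs a namespace and a name: {handle!r}")
        return BucketHandle(bucket_id="/".join(parts[1:3]), path="/".join(parts[3:]))

    repo_type = "model"
    if parts and parts[0] in REPO_TYPE_PREFIXES:
        repo_type = REPO_TYPE_PREFIXES[parts[0]]
        parts = parts[1:]
    if len(parts) < 2:
        raise ValueError(f"Repo handle needs a namespace and a name: {handle!r}")

    namespace, name = parts[0], parts[1]
    revision: Optional[str] = None
    if "@" in name:
        # Simplification: revisions containing "/" (e.g. refs/pr/1) are not handled.
        name, revision = name.split("@", 1)
    return RepoHandle(repo_type, f"{namespace}/{name}", revision or None, "/".join(parts[2:]))
```

Returning two distinct handle types lets the caller reject unsupported directions (such as a repo destination) with a simple `isinstance` check.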
yes, #3971 is getting high in my priorities 🙈
hanouticelina left a comment:
Made a first pass!
Co-authored-by: célina <[email protected]>
…gface_hub into feat/hfapi-copy-files
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 9db2cec.
Notes:
    - Bucket-to-repo copy is not supported.

Note: requires https://github.com/huggingface-internal/moon-landing/pull/17593 to be merged first. EDIT: merged!

This PR adds a new `HfApi.copy_files` API and extends `hf buckets cp` to support remote HF-handle copy workflows.

If the source is a file, it is copied. If it is a directory, all files under the source folder are copied recursively.

- Files with a `xet_hash`: copied directly by hash
- Files without a `xet_hash` (regular small file): download then re-upload

See the PR description at https://github.com/huggingface-internal/moon-landing/pull/17593#issue-4201288199 for a working test.

Tested on https://huggingface.co/buckets/Wauplin/bucket-raw
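
The two copy paths described above can be sketched as a small planning step. This is a simplified illustration under stated assumptions, not the PR's implementation: `SourceFile` and `plan_copy` are hypothetical names, and the tuples returned here merely stand in for whatever operation objects the real batch API expects.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class SourceFile:  # hypothetical: one entry from a remote file listing
    path: str
    xet_hash: Optional[str] = None


def _relative(path: str, root: str) -> str:
    """Return `path` relative to `root` (a file or a folder)."""
    if not root:
        return path
    if path == root:  # source is a single file: keep only its name
        return path.rsplit("/", 1)[-1]
    if path.startswith(root + "/"):  # source is a folder: keep the sub-path
        return path[len(root) + 1:]
    raise ValueError(f"{path!r} is not under {root!r}")


def plan_copy(
    files: List[SourceFile], source_root: str, dest_prefix: str
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Split files into (copy-by-hash, download-then-reupload) operations."""
    root = source_root.strip("/")
    prefix = dest_prefix.strip("/")
    by_hash: List[Tuple[str, str]] = []   # (xet_hash, target_path)
    reupload: List[Tuple[str, str]] = []  # (source_path, target_path)
    for f in files:
        rel = _relative(f.path, root)
        target = f"{prefix}/{rel}" if prefix else rel
        if f.xet_hash is not None:
            by_hash.append((f.xet_hash, target))  # no data transfer needed
        else:
            reupload.append((f.path, target))     # fallback for regular small files
    return by_hash, reupload
```

Separating the plan from its execution keeps the hash-copy path free of any download, which is the point of the server-side copy.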
Note

Medium Risk

Introduces new path parsing and copy semantics (including revision handling and mixed copy/download code paths) plus changes bucket batch operation payloads/chunking, which could impact correctness and performance of bucket file operations.

Overview

Adds `HfApi.copy_files` (exported as `copy_files`) to copy a file or folder from an `hf://` bucket or repo (model/dataset/space, with optional `@revision`) into a bucket destination, using server-side hash copies when possible and falling back to download+reupload for non-Xet repo files.

Extends `batch_bucket_files` with a new `copy` operation type (NDJSON `copyFile`) and updates internal batching/chunk sizing and upload logic so copy-by-hash operations can be sent without uploading data.

Updates `hf buckets cp` to accept generic `hf://...` handles and enable remote-to-remote copies via `api.copy_files`, plus adds docs and tests covering bucket↔bucket, repo→bucket, and rejected bucket→repo cases.

Reviewed by Cursor Bugbot for commit 5508120. Bugbot is set up for automated code reviews on this repo.