Conversation
Signed-off-by: jakmro <[email protected]>
Signed-off-by: jakmro <[email protected]>
Signed-off-by: jakmro <[email protected]>
There was a problem hiding this comment.
Pull request overview
This PR updates the CLI and publishing workflow to support downloading pre-converted model weights from a Hugging Face org by default (with an option to force reconversion), and refactors the Hugging Face publishing script/workflow to publish models individually (optionally including Apple exports).
Changes:
- Default CLI
downloadpath now attempts to fetch pre-converted weights from a Hugging Face org, with--reconvertto force source conversion. - Refactors
publish_to_hf.pyto publish a single model per invocation (via--task export_model) and optionally export Apple weights (--apple). - Updates VAD default model ID to
snakers4/silero-vadand modernizes the publish GitHub Action to accept workflow inputs + publish lists from env.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
tests/run.sh |
Updates default VAD model ID used by the test runner. |
python/src/publish_to_hf.py |
Refactors publishing into explicit tasks; adds per-model publish flow and optional Apple export zip generation. |
python/src/cli.py |
Adds HF pre-converted weights download flow and --reconvert flag plumbing across commands. |
.github/workflows/publish_to_hf.yml |
Adds workflow_dispatch inputs, moves model lists into env, publishes models in a loop, and updates org README separately. |
Comments suppressed due to low confidence (2)
python/src/publish_to_hf.py:250
main()returns status codes, but__main__callsmain()withoutsys.exit(...), so the process will always exit 0 and the GitHub Action can succeed even when publishing fails. Importsysand callsys.exit(main())(or otherwise propagate the return code).
if __name__ == "__main__":
main()
python/src/publish_to_hf.py:107
- The log message "Failed to export pro weights" is now triggered by the
--applepath and is confusing. Also,export_pro_weights()can returnNone, which will raise insideshutil.move(...)and get swallowed. Consider explicitly handling theNonecase and updating the message to reflect Apple export, optionally including the underlying exception for debugging.
if export_apple:
try:
mlpackage = export_pro_weights(model_id, bits)
shutil.move(mlpackage, weights_out)
model_pro_zip = stage / "weights" / f"{model_name_lower}-apple.zip"
zip_dir(weights_out, model_pro_zip)
fingerprint.update(sha256(model_pro_zip).encode())
config["bits"] = bits
except Exception:
print("Failed to export pro weights")
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| print_color(YELLOW, "Extracting model weights...") | ||
| with zipfile.ZipFile(zip_path, 'r') as zip_ref: | ||
| zip_ref.extractall(weights_dir) |
There was a problem hiding this comment.
zipfile.ZipFile.extractall() is used on a downloaded zip without validating member paths. A malicious zip can write outside weights_dir (Zip Slip). Validate that each member resolves within weights_dir (or implement a safe extraction helper) before extracting.
| weights_dir.mkdir(parents=True, exist_ok=True) | ||
|
|
||
| print_color(YELLOW, "Extracting model weights...") | ||
| with zipfile.ZipFile(zip_path, 'r') as zip_ref: | ||
| zip_ref.extractall(weights_dir) |
There was a problem hiding this comment.
download_from_hf() extracts into weights_dir without clearing it first. If weights_dir already exists but is incomplete/corrupt (e.g., missing config.txt), this can mix old and new files and still fail later. Consider removing the directory (or extracting into a temp dir and atomically replacing) before extractall().
| model_name = get_model_dir_name(model_id) | ||
| org = "cactus-compute" | ||
| repo_id = f"{org}/{model_id.split('/')[-1]}" |
There was a problem hiding this comment.
The Hugging Face org is hard-coded as cactus-compute here, while the publish workflow defaults to Cactus-Compute. If the namespace casing differs, downloads will fail. Consider making the org configurable (flag/env) and/or using a single constant shared with the publishing path.
| org: | ||
| description: Organization | ||
| required: false | ||
| default: Cactus-Compute |
There was a problem hiding this comment.
The workflow defaults the publish org to Cactus-Compute, but the CLI download path expects pre-converted weights under cactus-compute. If these differ on Hugging Face, publishing and downloads won’t line up. Align the default org (and casing) across tooling, or pass the same org value into both publishing and download logic.
| default: Cactus-Compute | |
| default: cactus-compute |
* pre-converted model downloads from HuggingFace Signed-off-by: jakmro <[email protected]> * clean Signed-off-by: jakmro <[email protected]> --------- Signed-off-by: jakmro <[email protected]>
* pre-converted model downloads from HuggingFace Signed-off-by: jakmro <[email protected]> * clean Signed-off-by: jakmro <[email protected]> --------- Signed-off-by: jakmro <[email protected]>
No description provided.