When globbing for wheels, only consider most recent version; do not clean up dist/build/__pycache__#2982
When globbing for wheels, only consider most recent version; do not clean up dist/build/__pycache__#2982
Conversation
5dfdb25 to
45dc590
Compare
pietern
left a comment
There was a problem hiding this comment.
Looks good. The sync exclude can be omitted, IMO.
| === Expecting 1 wheel to be uploaded | ||
| >>> jq .path | ||
| "/api/2.0/workspace-files/import-file/Workspace/Users/[USERNAME]/.bundle/test-bundle/default/artifacts/.internal/my_test_code-0.1.0-py3-none-any.whl" | ||
| "/api/2.0/workspace-files/import-file/Workspace/Users/[USERNAME]/.bundle/test-bundle/default/artifacts/.internal/my_test_code-0.2.0-py3-none-any.whl" |
There was a problem hiding this comment.
| dist/my_test_code-0.2.0-py3-none-any.whl | ||
|
|
||
| === Expecting 1 wheel in libraries section in /jobs/create | ||
| >>> jq -s .[] | select(.path=="/api/2.2/jobs/create") | .body.tasks out.requests.txt |
There was a problem hiding this comment.
This will call the update API.
| >>> [CLI] bundle deploy | ||
| Building python_artifact... | ||
| Uploading dist/my_test_code-0.1.0-py3-none-any.whl... | ||
| Uploading dist/my_test_code-0.2.0-py3-none-any.whl... |
There was a problem hiding this comment.
Why are both uploaded if the glob only matches the 0.1.0 one?
There was a problem hiding this comment.
The glob in libraries matches 0.1.0 but the glob in artifacts section matches 0.2.0.
Those filters operate independently from different mutators so each makes their own decision.
Perhaps it's a sign that glob expansion should be done from a single place with a global tracker of wheel versions.
There was a problem hiding this comment.
Is the glob in artifacts implicit or a default? If so, that seems surprising.
There was a problem hiding this comment.
Is the glob in artifacts implicit or a default?
Not sure what you mean? implicit == default, no?
If so, that seems surprising.
Somewhat. It would also be surprising if default *.whl behaved differently from explicit *.whl
bundle/deploy/files/sync.go
Outdated
|
|
||
| // We used to delete __pycache__ and build and most of the dist, so now we're excluding it manually | ||
| // TODO: if users include those manually, then we should not exclude it? | ||
| excludes := append(b.Config.Sync.Exclude, "__pycache__", "build", "dist") |
There was a problem hiding this comment.
This can be a regression. Most users will exclude dist in their .gitignore anyway. If they have an artifact build and include it in their sync configuration, then it would have been synchronized just fine previously.
By not deleting it, we can potentially synchronize more than we need, but still should observe the .gitignore rules and let it happen, so that the remote tree matches what you have locally.
There was a problem hiding this comment.
My concern was that we might start uploading a bunch of stuff for some users that we did not do before.
However, there is self-serving fix (update .gitignore or sync.exclude) so it's not a big deal.
It's hard to come up with sync patterns that match old behaviour exactly, so I removed this.
| @@ -0,0 +1,57 @@ | |||
| package patchwheel | |||
There was a problem hiding this comment.
Feels like this belongs in libs/python more than patchwheel. WDYT?
There was a problem hiding this comment.
libs/python is about finding python interpreter.
libs/patchwheel is about patching wheel but also parsing wheel names. It's a better fit but perhaps the name should be changed to something like wheel.
There was a problem hiding this comment.
I'd also say that it better belongs to libs/python or renamed to libs/wheel
| print_status | ||
|
|
||
| title "Update glob to target 0.1.0 version\n" | ||
| update_file.py databricks.yml './dist/*.whl' './dist/my*0.1.0*.whl' |
There was a problem hiding this comment.
With this new glob it doesn't exercise the latest-version-matching code.
There was a problem hiding this comment.
Yes, this tests that old wheel is still present, as in #2969
Still, good point, I changed the test to exercise new code as well.
| dist/my_test_code-0.1.0-py3-none-any.whl | ||
|
|
||
| === Expecting 1 wheel in libraries section in /jobs/create | ||
| >>> jq -s .[] | select(.path=="/api/2.2/jobs/create") | .body.tasks out.requests.txt |
There was a problem hiding this comment.
It seems like nothing is matched here, is it expected?
There was a problem hiding this comment.
fixed to look for /reset instead.
## Changes - Insert regular glob into default settings instead of custom function for finding wheels. - Remove code to explicitly fail the build if wheels are not found. ## Why - Helps in #2982 where we modify glob expansion to filter out same-name wheels. - Failed build will stop deploy regardless, because there is return code check and glob expansion check, no need for custom code there. - Simpler & more consistent behaviour: whl behave like those without 'type: whl'. ## Tests Existing tests. --------- Co-authored-by: Pieter Noordhuis <[email protected]>
45dc590 to
b009173
Compare
| @@ -0,0 +1,57 @@ | |||
| package patchwheel | |||
There was a problem hiding this comment.
I'd also say that it better belongs to libs/python or renamed to libs/wheel
| >>> [CLI] bundle deploy | ||
| Building python_artifact... | ||
| Uploading dist/my_test_code-0.1.0-py3-none-any.whl... | ||
| Uploading dist/my_test_code-0.2.0-py3-none-any.whl... |
There was a problem hiding this comment.
Is the glob in artifacts implicit or a default? If so, that seems surprising.
libs/patchwheel/filter_test.go
Outdated
| "other-0.9.0-py3-none-any.whl", | ||
| "project_name_bvs7tide6bhhpjy4dmcsb2qg44-0.0.1+20250604.74804-py3-none-any.whl", | ||
| "not-a-wheel.whl", | ||
|
|
…ously deleted stuff when syncing
0d60bc5 to
c0d9c66
Compare
## Release v0.256.0 ### Bundles * When building Python artifacts as part of "bundle deploy" we no longer delete `dist`, `build`, `*egg-info` and `__pycache__` directories ([#2982](#2982)) * When glob for wheels is used, like "\*.whl", it will filter out different version of the same package and will only take the most recent version ([#2982](#2982)) * Add preset `presets.artifacts_dynamic_version` that automatically enables `dynamic_version: true` on all "whl" artifacts ([#3074](#3074)) * Update client version to "2" for the serverless variation of the default-python template ([#3083](#3083)) * Fix reading dashboard contents when the sync root is different than the bundle root ([#3006](#3006)) * Fix variable resolution for lookup variables with other references ([#3054](#3054)) * Allow users to override the Terraform version to use by setting the `DATABRICKS_TF_VERSION` environment variable ([#3069](#3069))
Changes
__pycache__folders on every deploy.Why
Deleting dist folder is surprising and invasive (e.g. see #2969).
In order to avoid doing that but keep old "dist/*.whl" globs working, we're modifying glob behaviour to do additional filtering so that if there are more than 1 wheel from the same package, we'll only upload the one with most recent version.
Tests
New acceptance test simulates situation in #2969
Existing acceptance test show that wheels that were uploaded twice (from .internal and dist folders) are not uploaded once, which is more inline with the expectations.