Conversation
c882225 to
66a318b
Compare
66a318b to
2f36255
Compare
2f36255 to
b01cadd
Compare
b01cadd to
7d3810d
Compare
7d3810d to
92a3b4c
Compare
92a3b4c to
7c82324
Compare
7c82324 to
31b42d0
Compare
31b42d0 to
c4484b3
Compare
c4484b3 to
bfa5241
Compare
bfa5241 to
d3eb299
Compare
acceptance/acceptance_test.go
Outdated
| return latestWheel | ||
| } | ||
|
|
||
| func applyBundleConfig(t *testing.T, tmpDir string, bundleConfig map[string]any) string { |
There was a problem hiding this comment.
Can you include a docstring saying what this function returns?
There was a problem hiding this comment.
Done. Also changed it to return bool. 3a991f4
acceptance/acceptance_test.go
Outdated
| if configValue == "" { | ||
| continue | ||
| } | ||
| // either "" or a map are allowed |
There was a problem hiding this comment.
What is the significance of the empty string?
There was a problem hiding this comment.
Disables the setting. Added a comment.
acceptance/acceptance_test.go
Outdated
| } | ||
|
|
||
| var configPath, configData string | ||
| filenames := []string{"databricks.yml", "databricks.yml.tmpl"} |
There was a problem hiding this comment.
Why the tmpl file?
Also, the configuration could include the filename directly to make this not bundle-specific (or specific to databricks.yml but applicable to other YAML files in the test directory).
There was a problem hiding this comment.
Great idea, added BundleConfigTarget for this: 3a991f4
acceptance/acceptance_test.go
Outdated
|
|
||
| for _, filename := range filenames { | ||
| path := filepath.Join(tmpDir, filename) | ||
| exists := false |
There was a problem hiding this comment.
We need to know whether to exclude this file from the comparison to avoid test failing because there is a new file that is unaccounted for.
| newConfigData := configData | ||
| var applied []string | ||
|
|
||
| for _, configName := range utils.SortedKeys(validConfig) { |
There was a problem hiding this comment.
The keys are the "path" in YAML to apply the override to (e.g. config2.resources.jobs.example_job)?
If so, please include comments or examples of the shape of the input to this function to clarify.
There was a problem hiding this comment.
config2 is not part of yaml, it is the name of the config (for override purposes). There is selftest that shows how it works.
| if len(key) > 0 && key[0] == "BundleConfig" { | ||
| continue | ||
| } | ||
| t.Errorf("Undecoded key in %s[%d]: %#v", path, ind, key) |
There was a problem hiding this comment.
What does "undecoded" mean? How do these keys now end up in the config struct?
There was a problem hiding this comment.
It's means a key in TOML file was not mapped to any struct field.
There was a problem hiding this comment.
But how does the "BundleConfig" map get populated if it is not decoded?
I would expect either:
- The "BundleConfig" field to be populated with the config and this code to never execute
- This code to execute and the "BundleConfig" field to remain empty
There was a problem hiding this comment.
Good question! This seems to be an issue with toml parser's handling of map[string]any type.
Somehow it decodes it but also complains about it, This is what happens if I remove that check:
--- FAIL: TestAccept/selftest/bundleconfig/override (0.02s)
config.go:221: Undecoded key in selftest/bundleconfig/test.toml[0]: toml.Key{"BundleConfig", "config1", "bundle"}
config.go:221: Undecoded key in selftest/bundleconfig/test.toml[1]: toml.Key{"BundleConfig", "config1", "bundle", "name"}
config.go:221: Undecoded key in selftest/bundleconfig/test.toml[2]: toml.Key{"BundleConfig", "config2", "resources", "jobs", "example_job"}
config.go:221: Undecoded key in selftest/bundleconfig/test.toml[3]: toml.Key{"BundleConfig", "config2", "resources", "jobs", "example_job", "name"}
config.go:221: Undecoded key in selftest/bundleconfig/test.toml[4]: toml.Key{"BundleConfig", "config2", "resources", "jobs", "example_job", "new_string"}
config.go:221: Undecoded key in selftest/bundleconfig/test.toml[5]: toml.Key{"BundleConfig", "config2", "resources", "jobs", "example_job", "new_list"}
config.go:221: Undecoded key in selftest/bundleconfig/test.toml[6]: toml.Key{"BundleConfig", "config2", "resources", "jobs", "example_job", "new_map"}
config.go:221: Undecoded key in selftest/bundleconfig/test.toml[7]: toml.Key{"BundleConfig", "config2", "resources", "jobs", "example_job", "new_map", "key"}
config.go:221: Undecoded key in selftest/bundleconfig/test.toml[8]: toml.Key{"BundleConfig", "config2", "resources", "jobs", "example_job", "list2"}
config.go:221: Undecoded key in selftest/bundleconfig/test.toml[9]: toml.Key{"BundleConfig", "config2", "resources", "jobs", "example_job", "string2"}
|
|
||
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| out, err := MergeBundleConfig(tt.initialYaml, tt.bundleConfig) |
There was a problem hiding this comment.
Can this function take a map[string]any as input instead?
There was a problem hiding this comment.
it already does?
bundle_config.go:func MergeBundleConfig(source string, bundleConfig map[string]any) (string, error) {
There was a problem hiding this comment.
I mean for "source". The mix of unmarshalling and receiving an object is confusing.
There was a problem hiding this comment.
I see what you mean, but there is some logic to it: the function receives config as a string and returns new updated config as a string. This allows it to encapsulate (and potentially have test cases for) details related to yaml parsing, such as whether it is strict or not, whether it preserves comments. The test runner on the other hand knows which file we're reading and writing and handles all I/O.
The fact that it also receives unmarshalled bundleConfig is just because we do all of unmarshalling of config in one place and there is no other way to receive bundleConfig.
There was a problem hiding this comment.
There is a lot of unnecessary marshalling / unmarshalling going on with this implementation though. It works for now and does not cause noticeable overhead, but worth rewriting a bit (for later).
andrewnester
left a comment
There was a problem hiding this comment.
I think we discussed it offline that it's important for this change to also include materialised test.toml configuration which will contain actual bundle config set up for this test (e.g. all parent BundleConfig merged). Is it something you plan to work on?
Not in this PR, that's separate issue. |
pietern
left a comment
There was a problem hiding this comment.
Minor remaining comments.
acceptance/acceptance_test.go
Outdated
| continue | ||
| } | ||
| // either "" or a map are allowed | ||
| // Empty string can be used to disable an update that was defined in parent config |
There was a problem hiding this comment.
Comment out of date; empty string will cause fatal now.
There was a problem hiding this comment.
Not sure what you mean, empty string is explicitly handled below to skip this setting:
configValue := bundleConfig[configName]
if configValue == "" {
continue
}
Let me move the comment about it to where the code is, to make it more obvious: ed731f4
|
|
||
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| out, err := MergeBundleConfig(tt.initialYaml, tt.bundleConfig) |
There was a problem hiding this comment.
I mean for "source". The mix of unmarshalling and receiving an object is confusing.
| if len(key) > 0 && key[0] == "BundleConfig" { | ||
| continue | ||
| } | ||
| t.Errorf("Undecoded key in %s[%d]: %#v", path, ind, key) |
There was a problem hiding this comment.
But how does the "BundleConfig" map get populated if it is not decoded?
I would expect either:
- The "BundleConfig" field to be populated with the config and this code to never execute
- This code to execute and the "BundleConfig" field to remain empty
af48f8b to
17f103c
Compare
## Changes Using BundleConfig option (#2809), specify boilerplate bundle.name once and clean up individual test cases. The tests with diagnostics (lineno, column) are left alone. Test acceptance/bundle/volume_path is split into invidual tests, one databricks.yml per test (since BundleConfig only updates databricks.yml[.tmpl] in the root of the test. ## Why Simpler to read and write, make each test focussed on what's important. ## Tests Existing tests. Benchmarked acceptance test suite (local) to ensure this does not add a lot of overhead (due to parsing/marshalling every config): ``` This branch: Time (mean ± σ): 18.364 s ± 0.857 s [User: 32.707 s, System: 27.281 s] Range (min … max): 17.002 s … 19.922 s 10 runs Main: Time (mean ± σ): 18.337 s ± 0.789 s [User: 32.613 s, System: 27.235 s] Range (min … max): 17.078 s … 19.342 s 10 runs ```
## Changes - Remove BundleConfig setting (#2809) which allows post-processing of databricks.yml - Add bundle.name section to all config that need it explicitly. ## Why It is not used outside of original 'name' use case and I don't think that use case alone warrants the complexity. This makes test runner simpler and understanding the test output simpler. In particular line numbers in the output become correct. There is also log noise that is removed. Dynamically generated configs are useful sometimes, but that can be done by a script, no need to support it on test runner level. Test runner can provide input to configuration via EnvMatrix.
Changes
New setting BundleConfig, to define portions of bundle config in test.toml which is then propagated to all child configs (like any other config setting).
Why
This feature allows structuring tests better, by having common section defined at the parent level and re-used by all children. This is also a foundation for BundleConfigMatrix (similar to EnvMatrix) which will allow to test variations on the config without duplicating test.
Some use cases:
Tests