Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve helm dependency update performance #11726

Merged
merged 10 commits into from
Aug 24, 2024

Conversation

JvD-Ericsson
Copy link
Contributor

What this PR does / why we need it:
This PR was created to improve performance of the dependency update command by skipping unnecessary downloading and loading of index files that have already been downloaded and loaded

I believe this would close refs #9865

Signed-off-by: Jeff van Dam [email protected]

@pull-request-size pull-request-size bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jan 13, 2023
What this PR does / why we need it:
This PR was created to improve performance of the dependency update command by
skipping unnecessary downloading and loading of index files that have already
been downloaded and loaded

I believe this would close refs helm#9865

Signed-off-by: Jeff van Dam <[email protected]>
@yxxhero yxxhero added this to the 3.11.1 milestone Jan 17, 2023
@joejulian joejulian modified the milestones: 3.11.1, 3.12.0 Jan 28, 2023
@joejulian
Copy link
Contributor

Changed the milestone because this isn't a bug fix or security issue.

@SuperSandro2000
Copy link

Could I motivate people to review this? It would be much appreciated.

baseChartUrls[repoURL] = strings.Join(slice[:len(slice)-1], "")
}
} else {
url = fmt.Sprintf("%s%s-%s.tgz", baseURL, name, version)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes assumptions about the structure of the chart URL, are we sure the assumptions are correct?

An alternative suggestion:
The chart URL is knowable earlier in the flow, in Resolve (resolver.go) at the point where we set the version (inside loop on lines 197-200), the url is available from ver.URLs[0]. We could store (perhaps in a map also returned from the method) the chart URLs which can be used during the download rather than using additional logic to again find the url

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, yeah makes more sense to do it that way. Have that updated now

What this PR does / why we need it:
This PR was created to improve performance of the dependency update command by
skipping unnecessary downloading and loading of index files that have already
been downloaded and loaded

I believe this would close refs helm#9865

Signed-off-by: Jeff van Dam <[email protected]>
@pull-request-size pull-request-size bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 24, 2023
@joejulian joejulian modified the milestones: 3.12.0, 3.13.0 May 5, 2023
Signed-off-by: Jeff van Dam <[email protected]>
@pull-request-size pull-request-size bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 11, 2023
@JvD-Ericsson
Copy link
Contributor Author

Hey @joejulian any plan to merge this anytime soon? Happy to make any changes if necessary

@JvD-Ericsson
Copy link
Contributor Author

Sorry to bother again @joejulian but is there any chance of this being merged anytime soon?

@rolfberkenbosch
Copy link

Thank you @JvD-Ericsson for your work. We have currently a lot of performance problemen with ArgoCD in combination with Helm. We have investigated the performance problems and we found out that this is due the helm dependencies. We have tested this pr and it will improve the performance a lot! @joejulian can you merge this pr ?

Copy link
Contributor

@joejulian joejulian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. I've tried some manual testing and everything seems to work. I don't have any examples that take a long time, but at least everything I use regularly seems to keep working.

Copy link
Collaborator

@mattfarina mattfarina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly makes sense from a performance perspective.

The one issue I found is a way to possibly send credentials to the wrong URL by including the URL the credentials are associated with in an annotation or description in a malicious index.yaml file. I think that part needs to be walked back for the sake of security.

@@ -52,21 +52,23 @@ func New(chartpath, cachepath string, registryClient *registry.Client) *Resolver
}

// Resolve resolves dependencies and returns a lock file with the resolution.
func (r *Resolver) Resolve(reqs []*chart.Dependency, repoNames map[string]string) (*chart.Lock, error) {
func (r *Resolver) Resolve(reqs []*chart.Dependency, repoNames map[string]string) (*chart.Lock, map[string]string, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, since this is in internal it's not part of the public API so a change is ok.

Comment on lines 384 to 386
file := string(yamlFile[:])
if strings.Contains(file, u) {
return rc, nil
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this change. The string.Contains is run over a file with annotations and descriptions. The u could be in one of those fields which would return an improper result. While this is unlikely, the code that calls scanReposForURL grabs the credentials if there are any to pass on. I wonder if there is an edge case that could leak credentials with misuse of this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have that reverted now, so hopefully it can be merged now! If anything else needs changing let me know

@lapastillaroja
Copy link

It would be great to get this PR merged!

@joejulian joejulian requested a review from mattfarina August 28, 2023 19:05
@mattfarina mattfarina requested a review from joejulian August 31, 2023 16:48
@JvD-Ericsson
Copy link
Contributor Author

Logically, this looks good, however, I am receiving the following errors when running make test-style

make test-style
GO111MODULE=on golangci-lint run
WARN [runner] The linter 'deadcode' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused. 
WARN [runner] The linter 'varcheck' is deprecated (since v1.49.0) due to: The owner seems to have abandoned the linter. Replaced by unused. 
internal/test/ensure/ensure.go:60: File is not `gofmt`-ed with `-s` (gofmt)
//      tempdir := TempFile(t, "foo", []byte("bar"))
//      defer os.RemoveAll(tempdir)
//      filename := filepath.Join(tempdir, "foo")
pkg/lint/support/doc.go:17: File is not `gofmt`-ed with `-s` (gofmt)
/*Package support contains tools for linting charts.
pkg/gates/doc.go:16: File is not `gofmt`-ed with `-s` (gofmt)
/*Package gates provides a general tool for working with experimental feature gates.
pkg/chartutil/dependencies.go:48:13: superfluous-else: if block ends with a break statement, so drop this else and outdent its block (move short variable declaration to its own line if necessary) (revive)
                                        } else {
                                                log.Printf("Warning: Condition path '%s' for chart %s returned non-bool value", c, r.Name)
                                        }
pkg/releaseutil/kind_sorter.go:133:17: unused-parameter: parameter 'a' seems to be unused, consider removing or renaming it as _ (revive)
func lessByKind(a interface{}, b interface{}, kindA string, kindB string, o KindSortOrder) bool {
                ^
pkg/lint/rules/template.go:48:89: unused-parameter: parameter 'strict' seems to be unused, consider removing or renaming it as _ (revive)
func Templates(linter *support.Linter, values map[string]interface{}, namespace string, strict bool) {
                                                                                        ^
pkg/storage/driver/mock_test.go:252:38: unused-parameter: parameter 'releases' seems to be unused, consider removing or renaming it as _ (revive)
func newTestFixtureSQL(t *testing.T, releases ...*rspb.Release) (*SQL, sqlmock.Sqlmock) {
                                     ^
pkg/pusher/pusher.go:92:10: unused-parameter: parameter 'settings' seems to be unused, consider removing or renaming it as _ (revive)
func All(settings *cli.EnvSettings) Providers {
         ^
pkg/storage/storage_test.go:296:39: unused-parameter: parameter 'key' seems to be unused, consider removing or renaming it as _ (revive)
func (d *MaxHistoryMockDriver) Delete(key string) (*rspb.Release, error) {
                                      ^
pkg/plugin/installer/http_installer_test.go:47:30: unused-parameter: parameter 'href' seems to be unused, consider removing or renaming it as _ (revive)
func (t *TestHTTPGetter) Get(href string, _ ...getter.Option) (*bytes.Buffer, error) {
                             ^
cmd/helm/search/search_test.go:104:20: unused-parameter: parameter 't' seems to be unused, consider removing or renaming it as _ (revive)
func loadTestIndex(t *testing.T, all bool) *Index {
                   ^
pkg/repo/index_test.go:436:2: redefines-builtin-id: redefinition of the built-in function real (revive)
        var expected, real []string
        ^
pkg/repo/index_test.go:444:3: redefines-builtin-id: redefinition of the built-in function real (revive)
                real = append(real, scanner.Text())
                ^
pkg/repo/chartrepo_test.go:120:41: unused-parameter: parameter 'options' seems to be unused, consider removing or renaming it as _ (revive)
func (g *CustomGetter) Get(href string, options ...getter.Option) (*bytes.Buffer, error) {
                                        ^
pkg/kube/fake/printer.go:51:63: unused-parameter: parameter 'reader' seems to be unused, consider removing or renaming it as _ (revive)
func (p *PrintingKubeClient) Get(resources kube.ResourceList, reader io.Reader) (map[string][]runtime.Object, error) {
                                                              ^
pkg/kube/client.go:152:46: unused-parameter: parameter 'reader' seems to be unused, consider removing or renaming it as _ (revive)
func (c *Client) Get(resources ResourceList, reader io.Reader) (map[string][]runtime.Object, error) {
                                             ^
pkg/action/package.go:58:36: unused-parameter: parameter 'vals' seems to be unused, consider removing or renaming it as _ (revive)
func (p *Package) Run(path string, vals map[string]interface{}) (string, error) {
                                   ^
pkg/action/registry_logout.go:36:30: unused-parameter: parameter 'out' seems to be unused, consider removing or renaming it as _ (revive)
func (a *RegistryLogout) Run(out io.Writer, hostname string) error {
                             ^
pkg/action/registry_login.go:38:29: unused-parameter: parameter 'out' seems to be unused, consider removing or renaming it as _ (revive)
func (a *RegistryLogin) Run(out io.Writer, hostname string, username string, password string, insecure bool) error {
                            ^
cmd/helm/plugin_list.go:78:22: unused-parameter: parameter 'toComplete' seems to be unused, consider removing or renaming it as _ (revive)
func compListPlugins(toComplete string, ignoredPluginNames []string) []string {
                     ^
cmd/helm/docs.go:80:27: unused-parameter: parameter 'out' seems to be unused, consider removing or renaming it as _ (revive)
func (o *docsOptions) run(out io.Writer) error {
                          ^
cmd/helm/history.go:187:24: unused-parameter: parameter 'toComplete' seems to be unused, consider removing or renaming it as _ (revive)
func compListRevisions(toComplete string, cfg *action.Configuration, releaseName string) ([]string, cobra.ShellCompDirective) {
                       ^
cmd/helm/flags.go:196:39: unused-parameter: parameter 'toComplete' seems to be unused, consider removing or renaming it as _ (revive)
func compVersionFlag(chartRef string, toComplete string) ([]string, cobra.ShellCompDirective) {
                                      ^
cmd/helm/repo_list.go:126:20: unused-parameter: parameter 'prefix' seems to be unused, consider removing or renaming it as _ (revive)
func compListRepos(prefix string, ignoredRepoNames []string) []string {
                   ^
cmd/helm/completion.go:213:20: unused-parameter: parameter 'cmd' seems to be unused, consider removing or renaming it as _ (revive)
func noCompletions(cmd *cobra.Command, args []string, toComplete string) ([]string, cobra.ShellCompDirective) {
                   ^
pkg/downloader/manager.go:726:5: naked return in func `findChartURL` with 45 lines of code (nakedret)
                                return
                                ^
pkg/downloader/manager.go:731:5: naked return in func `findChartURL` with 45 lines of code (nakedret)
                                return
                                ^
pkg/downloader/manager.go:735:5: naked return in func `findChartURL` with 45 lines of code (nakedret)
                                return
                                ^
make: *** [test-style] Error 1```

Should be fixed now

Copy link
Contributor

@sabre1041 sabre1041 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MichaelMorrisEst
Copy link
Contributor

Thanks a million for reviewing @sabre1041, @scottrigby , can this now be merged as it has 2 maintainer approvals

@scottrigby scottrigby modified the milestones: 3.15.0, 3.16.0 May 17, 2024
@scottrigby
Copy link
Member

scottrigby commented May 17, 2024

@MichaelMorrisEst 3.15.0 was already released, so I just moved this PR to the 3.16.0 milestone.

@@ -191,13 +191,13 @@ func (m *Manager) Update() error {

// Now we need to find out which version of a chart best satisfies the
// dependencies in the Chart.yaml
lock, err := m.resolve(req, repoNames)
lock, urls, err := m.resolve(req, repoNames)
if err != nil {
return err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check that urls is empty?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

urls is ultimately read on line 756, if it is empty the key will not be found and the flow will proceed to the else statement and proceed as per existing implementation without any issues, so I am not seeing a reason to check for empty

@joejulian joejulian added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 25, 2024
@MichaelMorrisEst
Copy link
Contributor

Hi @joejulian I believe the 'needs-rebase' label can be removed again now. If there is anything else needed from my side let me know

@mattfarina
Copy link
Collaborator

@joejulian @scottrigby @sabre1041 can you re-review since new changes were pushed?

@mattfarina mattfarina removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 23, 2024
@joejulian joejulian merged commit 3337544 into helm:main Aug 24, 2024
5 checks passed
@MichaelMorrisEst
Copy link
Contributor

Thanks @joejulian for reviewing/merging and also everyone else who reviewed.
Thanks @JvD-Ericsson for implementing.
Great to get this over the line!

@yxxhero
Copy link
Member

yxxhero commented Aug 24, 2024

Awesome

mattfarina added a commit to mattfarina/helm that referenced this pull request Sep 12, 2024
The change in helm#11726 caused a regression where `helm dependency udpate`
stopped working. The format of the internal representation of the data
changed causing errors of "non-absolute URLs should be in form of
repo_name/path_to_chart". See helm#13324 for more details.

Since this change is in released Helm and it's a regression, reverting
the original change was the fastest and safest route to deliver a
fix as quickly as possible.

Closes helm#13324

Signed-off-by: Matt Farina <[email protected]>
@reneleonhardt
Copy link

Can the tests be improved to cover the unexpected problems and retry this PR?

@kickthemooon
Copy link

I updated my helm version to 3.16.1, tried out helm dep update but didn't see an improvement unfortunately. is there something I'm missing? is the lock file required for the improvement?

@kickthemooon
Copy link

Oh so it was reverted, so I'll try 3.16.0

@kickthemooon
Copy link

or could it be that it hasn't been released in any of the versions?

@MichaelMorrisEst
Copy link
Contributor

It was in 3.16.0 but reverted in 3.16.1 (see #13327 for details)

@reneleonhardt
Copy link

Is a fix / new version of these performance improvements planned with better test coverage?

@MichaelMorrisEst
Copy link
Contributor

Is a fix / new version of these performance improvements planned with better test coverage?

A new PR has been created to re-introduce the change and fix the issue found in the reverted change. This is waiting for review. See: #13342

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.