
prototype of configurable parallelism #3934

Closed
jskeet wants to merge 3 commits into googleapis:main from jskeet:parallelism

Conversation

@jskeet (Contributor) commented Feb 6, 2026

No description provided.

@gemini-code-assist (bot, Contributor) left a comment


Code Review

This pull request introduces configurable parallelism for the generate command, breaking it down into parallel clean, generate, and format steps. The implementation correctly uses errgroup.WithContext and SetLimit to control concurrency. My review focuses on a critical correctness issue with loop variable capture in the new parallel execution helper and a potential reintroduction of a race condition during the clean step that was previously guarded against. Addressing these points will improve the robustness and correctness of the new parallel processing logic.

Comment on lines +173 to +177
for _, lib := range libraries {
    g.Go(func() error {
        return function(gctx, lib)
    })
}


critical

There's a common Go pitfall here with loop variable capture in a closure. The lib variable is reused across iterations of the loop, so the goroutines created by g.Go may all observe the value of lib from the final iteration, processing the same library multiple times while skipping the others. To fix this, create a new variable inside the loop scope to capture the current value for each goroutine. (Note: since Go 1.22, loop variables are scoped per iteration, so this pitfall only applies if the module's go directive targets an earlier version.)

Suggested change
for _, lib := range libraries {
    g.Go(func() error {
        return function(gctx, lib)
    })
}
for _, lib := range libraries {
    lib := lib // Capture loop variable.
    g.Go(func() error {
        return function(gctx, lib)
    })
}

Comment on lines +141 to 145
if err := executeInParallel(ctx, libraries, parallelization.Clean, func(gctx context.Context, lib *config.Library) error {
    return cleanLibrary(cfg.Language, lib)
}); err != nil {
    return err
}


high

The previous implementation performed cleaning sequentially with a comment explaining it was to avoid race conditions when output directories are nested. This change introduces parallel cleaning, which could re-introduce those race conditions if a user configures parallelization.clean to be greater than 1. For example, if one library's output directory is a subdirectory of another's, cleaning them in parallel could lead to unpredictable behavior. Consider either forcing the clean operation to be sequential or adding a strong warning in the configuration documentation about this potential issue.

Comment on lines +257 to +260
if err := python.Generate(ctx, library, googleapisDir); err != nil {
    return err
}
return nil


drive-by: simplify to:

Suggested change
if err := python.Generate(ctx, library, googleapisDir); err != nil {
    return err
}
return nil
return python.Generate(ctx, library, googleapisDir)

@jskeet (Contributor, author) replied:


This change shouldn't be in here at all, in fact - it crept in when I was replacing the existing code with just "log and sleep". But the same change could still be made, in fact for all branches...

@jskeet (Contributor, author) replied:


I've refactored the cleanLibrary function, though, given that it was changing anyway...

// parallelization; a positive number represents a specific limit of
// parallelization; 0 is treated as equivalent to 1 (for the sake of
// convenience, as a default value).
type Parallelization struct {
A Member commented:


I don't think we should put this in the librarian.yaml. We should just hardcode it in internal/librarian/{language} since it is an implementation detail of the function.

@julieqiu (Member) commented Feb 7, 2026

I’m not convinced this needs to be configurable right now.

If languages need different behavior, I’d rather handle that at the language implementation level than introduce orchestration knobs in librarian.yaml.

For debugging, I think we can cross that bridge when we get there, and maybe adding a simple --sequential or --dev flag can solve the problem.

My preference would be to solve this problem in code right now, and we can always add configuration later if there are real use cases that can't be solved without it.

@jskeet (Contributor, author) commented Feb 8, 2026

@julieqiu Fair enough. My guess is that the simplest way to do that is to change the sort of interface we've got with the languages to pass in the whole slice of libraries to clean (generate, format) and let the language decide whether to do it in series or parallel.

That means a bit of duplication (looping, basically) but is probably simpler than a sort of "parallelism negotiation" where the agnostic code asks the language code whether or not to parallelize - but you may have had something else in mind.

If you're happy with the "pass the slice in" approach I'll create a new PR tomorrow.

@jskeet (Contributor, author) commented Feb 11, 2026

Closing in favor of #4005 (which still may not be what we want).

@jskeet closed this Feb 11, 2026