-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Concurrent Rego parsing in bundle loader #8067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Concurrent Rego parsing in bundle loader #8067
Conversation
✅ Deploy Preview for openpolicyagent ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Parsing is generally fast, so this mainly improves performance of creating big bundles with many Rego files in them. For Regal's embedded bundle, loading it from memory would previously take 16 ms on my laptop, and now it takes 9 ms. There are other things in this process that could be concurrent too, like JSON unmarshalling of multiple data files. But starting with parsing modules. This PR adds `errgroup` as a direct dependency (previously indirect) as it is a nicer way to work with wait groups, and one that can be useful elsewhere in the codebase (like in the compiler). Also, and as usual, went off on a bit of a tangent refactoring code related to the bundle build process, and made sure to use some common helpers in code where available. Signed-off-by: Anders Eknert <[email protected]>
e045403 to
9c77ff0
Compare
srenatus
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the nice side-track cleanups. Couple of questions, no blockers.
| } | ||
|
|
||
| func (tgw *TarGzWriter) Close() error { | ||
| return errors.Join(tgw.Writer.Close(), tgw.gw.Close()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's clever, but it is safe? Can we close tgw.gw when tgw.Writer.Close() returns an error? Probably, just wondering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess worst case would be that we get two errors reported instead of one? I can certainly change this, but it does look rather neat :P
| return fileInfo.Size(), nil | ||
| } | ||
|
|
||
| func normalizePath(p string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm I actually liked that one. It looks silly on the surface, but centralizes all our normalization efforts. If we were to add to them, do something beyond filepath.ToSlash, we could do that here. Now, we'll need to centralize them again if that happens. But it's a simple S&R, so fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this because "normalize" means nothing I understand without looking at the implementation, whereas I don't need to do that reading ToSlash. We'll find a better name should we need to extend this later 🙂
| import "strings" | ||
|
|
||
| // WithPrefix ensures that the string s starts with the given prefix. | ||
| // If s already starts with prefix, it is returned unchanged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there cases where this could be undesired? What if a user-provided file happens to have the prefix, too -- is that OK or would we want it to have two prefixes for some reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If so, yes, this helper is probably not what you want. But a nice convenience when that's not the case.
Parsing is generally fast, so this mainly improves performance of creating big bundles with many Rego files in them. For Regal's embedded bundle, loading it from memory would previously take 16 ms on my laptop, and now it takes 9 ms. There are other things in this process that could be concurrent too, like JSON unmarshalling of multiple data files. But starting with parsing modules.
This PR adds
errgroupas a direct dependency (previously indirect) as it is a nicer way to work with wait groups, and one that can be useful elsewhere in the codebase (like in the compiler).Also, and as usual, went off on a bit of a tangent refactoring code related to the bundle build process, and made sure to use some common helpers in code where available.