allow verifier log size to reach max to prevent infinite loop#1679
allow verifier log size to reach max to prevent infinite loop#1679brycekahle wants to merge 2 commits intocilium:mainfrom
Conversation
Signed-off-by: Bryce Kahle <[email protected]>
4fccb08 to
bc35014
Compare
|
If it wasn't clear, we are seeing the infinite loop in one of our tools. |
prog.go
Outdated
| const factor = 2 | ||
| logSize = internal.Between(logSize, minVerifierLogSize, maxVerifierLogSize/factor) | ||
| logSize *= factor | ||
| logSize = internal.Between(logSize*2, minVerifierLogSize, maxVerifierLogSize) |
There was a problem hiding this comment.
Can you bring back the overflow check that used to be here?
e8b05c5#diff-348ab86651d6b4babd4e62cbe504d85db37f9aea172e2b72a570b8d5d73b8bc5L443-L445
There was a problem hiding this comment.
I don't think it is possible to overflow? maxVerifierLogSize is 2^30, so 2^31 is the max it can be, which fits in a uint32.
There was a problem hiding this comment.
Good point. There are two problems with the current code:
- The maxVerifierLogSize may change at some point, as the comment points out. It'd be nice to not have this problem again.
- There is no protection against an infinite loop, as you've discovered.
As I wrote the code originally the intention was that:
- Passing a too large number for the buffer would give EINVAL, aborting the loop.
- The buffer exceeding uint32 would also abort it.
(1) doesn't happen anymore since we clamp to the maximum, (2) was removed. IMO I'd prefer bringing back the original code, which doesn't clamp to a max value (because hardcoding other system's limit makes code brittle) and checks for overflow. We can keep the LogSizeStart logic, although that might need adjustment because with LogLevel == 0 we actually use LogSizeStart * 2 for the first call.
There was a problem hiding this comment.
Updated, please take a look.
There was a problem hiding this comment.
Actually, I thought about this some more. Having the knowledge about the kernel max allows us to work up to that value before erroring with EINVAL. If we just keep doubling, then we could blow right by a value that would actually function.
There was a problem hiding this comment.
Maybe we should do that only for older kernels where log_true_size is not available?
Signed-off-by: Bryce Kahle <[email protected]>
d3e0ac3 to
544a72a
Compare
|
Ugh, the amount of time I've spent on these 30 lines of code over the years is getting ridiculous.. Discussed this offline with Lorenz, we came up with the following steps:
That should, hopefully, make this a bit more robust. I'll come up with something this week. |
|
@brycekahle Something I'm wondering about, though. How is |
I believe it is, because if you use the max size the kernel will not error, even if it actually needed more space. |
|
@brycekahle I just put up #1693, please give it a try. Closing this one for now. |
Hmm, interesting. It's a shame this is so hard to test due to the memory requirements and having to produce a program large and complex enough to trigger the limit.. Anyway, I tried covering as many cases as possible in my PR, please give it a try. |
In the case of:
log_true_sizeyetebpf/prog.go
Lines 453 to 455 in 29fedc5
maxVerifierLogSize is
1073741823, so integer divided by 2 =536870911. Multiplying by 2 equals1073741822. Thus we never reachmaxVerifierLogSizein the loop. The kernel keeps returningENOSPCbut the size never changes and you get an infinite loop.