Fixing longTermMax stats corruption in Regulator #1365
Merged
When auto headroom is updated dynamically, it could cause a race condition where calcAuto() was called before any measurements were made (ctr == 0). This would cause a recording of "max" with a negative value (the reset() value of -999999.0). This is such a large (or small?) value that it corrupted the longTermMax rolling average for a long duration of time.

The result of longTermMax corruption is that tolerance would be reset to a minimum value equal to the duration of a single audio buffer. This resulted in audio received from the participant being very garbled until longTermMax was finally able to recover.

It's a pretty nasty bug. A bit hard to find and unwind, but the fix is pretty clear and obvious. I just added some extra sanity checks to calcAuto to ensure that max (and min) are only used with sane values.

The "good" news is that I think this was only recently introduced by adding the ability to adjust headroom dynamically (via OSC). The "bad" news is that we currently have this running in production and I'm pretty sure vs-agent is calling that method at least once on startup for all studio sessions.
Author
I'm actually not entirely sure yet how this gets triggered. I've been able to reproduce it even without dynamically updating headroom. I thought that was making it easier to repro at least, but it may have been a coincidence. I think it may have more to do with the peer having a different buffer size, because I haven't been able to repro (at least yet) when they are the same. So the impact surface may not be all that big. Whatever the specific triggers may be, I'm still confident in the fix.