Fixing longTermMax stats corruption in Regulator by mikedickey · Pull Request #1365 · jacktrip/jacktrip

mikedickey · 2024-12-23T20:15:47Z

When auto headroom is updated dynamically, it could cause a race condition where calcAuto() was called before any measurements were made (ctr == 0). This would record "max" as the reset() value of -999999.0. That value is negative and so large in magnitude that it corrupted the longTermMax rolling average for a long time.

The result of longTermMax corruption is that tolerance would be reset to a minimum value equal to the duration of a single audio buffer. This resulted in audio received from the participant being very garbled until longTermMax was finally able to recover.

It's a pretty nasty bug. A bit hard to find and unwind, but the fix is pretty clear and obvious. I just added some extra sanity checks to calcAuto to ensure that max (and min) are only used with sane values.
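To make the failure mode and the guard concrete, here is a minimal, self-contained sketch. It is not the actual Regulator code: the `WindowStats` struct, the smoothing coefficient, the positive sentinel for min, and the exact form of the sanity check are all assumptions made for illustration. Only the -999999.0 reset value for max and the `ctr == 0` condition come from the description above.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the Regulator's per-window stats. Names and
// constants are illustrative and not copied from JackTrip.
constexpr double kResetMax = -999999.0;  // reset() value for max, per the PR description
constexpr double kResetMin = 999999.0;   // assumed mirror-image sentinel for min

struct WindowStats {
    double min = kResetMin;
    double max = kResetMax;
    int ctr = 0;               // measurements seen since the last reset()
    double longTermMax = 0.0;  // rolling average of per-window max
    double smoothing = 0.9;    // assumed smoothing coefficient

    void reset() {
        min = kResetMin;
        max = kResetMax;
        ctr = 0;
    }

    // Record one measurement (in ms) in the current window.
    void tick(double ms) {
        min = std::min(min, ms);
        max = std::max(max, ms);
        ++ctr;
    }

    // Fold the current window into longTermMax, then start a new window.
    // With guarded == false this mimics the bug: an empty window feeds the
    // reset() sentinel straight into the rolling average.
    // Returns true if the window contributed to longTermMax.
    bool calcAuto(bool guarded) {
        assert(smoothing > 0.0 && smoothing < 1.0);
        const bool sane = ctr > 0 && max >= 0.0 && min <= max;  // assumed check
        if (guarded && !sane) {
            reset();
            return false;  // skip the update, keep the previous longTermMax
        }
        longTermMax = smoothing * longTermMax + (1.0 - smoothing) * max;
        reset();
        return true;
    }
};

// Five windows of steady 10 ms measurements, except window 2, which is
// empty (calcAuto() runs before any measurement arrives).
double simulate(bool guarded) {
    WindowStats s;
    s.longTermMax = 10.0;
    for (int w = 0; w < 5; ++w) {
        if (w != 2) {
            s.tick(10.0);
        }
        s.calcAuto(guarded);
    }
    return s.longTermMax;
}
```

In this sketch, `simulate(true)` stays at about 10 ms, while `simulate(false)` is still hugely negative two windows after the empty one. That is the kind of slow recovery described above, during which anything derived from longTermMax would sit pinned at its minimum.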

The "good" news is that I think this was only recently introduced by adding the ability to adjust headroom dynamically (via OSC). The "bad" news is that we currently have this running in production and I'm pretty sure vs-agent is calling that method at least once on startup for all studio sessions.

mikedickey · 2024-12-23T20:45:05Z

I'm actually not entirely sure yet how this gets triggered. I've been able to reproduce it even without dynamically updating headroom. I thought that was at least making it easier to repro, but it may have been a coincidence. I think it may have more to do with the peer having a different buffer size, because I haven't been able to repro (at least not yet) when the buffer sizes are the same. So the impact surface may not be all that big. Whatever the specific triggers may be, I'm still confident in the fix.

cchafe

Good catch and fix!

mikedickey requested review from cchafe and nwang92 December 23, 2024 20:15

cchafe approved these changes Dec 23, 2024

cchafe merged commit 42b57fa into dev Dec 23, 2024
19 checks passed

mikedickey deleted the bugfix/regulator-stats-corruption branch December 24, 2024 18:17
