Commit ab05ace

Fix race condition in config-manager when label is unset
When the node label (nvidia.com/device-plugin.config) is not set, a race condition could cause the config-manager to hang indefinitely on startup.

The issue occurred when the informer's AddFunc fired before the first Get() call, setting current="" and broadcasting. When Get() was subsequently called, it found lastRead == current (both empty strings) and waited forever, as no future events would wake it up.

This fix adds an 'initialized' flag to SyncableConfig to ensure the first Get() call never waits, regardless of timing. Subsequent Get() calls still wait properly when the value hasn't changed.

Signed-off-by: Uri Sternik <[email protected]>
1 parent 624b771 commit ab05ace

1 file changed

Lines changed: 5 additions & 4 deletions

cmd/config-manager/main.go

@@ -82,7 +82,7 @@ type SyncableConfig struct {
 	cond     *sync.Cond
 	mutex    sync.Mutex
 	current  string
-	lastRead string
+	lastRead *string
 }
 
 // NewSyncableConfig creates a new SyncableConfig
@@ -106,11 +106,12 @@ func (m *SyncableConfig) Set(value string) {
 func (m *SyncableConfig) Get() string {
 	m.mutex.Lock()
 	defer m.mutex.Unlock()
-	if m.lastRead == m.current {
+	if m.lastRead != nil && *m.lastRead == m.current {
 		m.cond.Wait()
 	}
-	m.lastRead = m.current
-	return m.lastRead
+	val := m.current
+	m.lastRead = &val
+	return *m.lastRead
 }
 
 func main() {
