Commit 270a4d4
committed
libn/networkdb: stop table events from racing network leaves
When a node leaves a network or the cluster, or memberlist considers the
node as failed, NetworkDB atomically deletes all table entries (for the
left network) owned by the node. This maintains the invariant that table
entries owned by a node are present in the local database indices iff
that node is an active cluster member which is participating in the
network the entries pertain to.
(*NetworkDB).handleTableEvent() is written in a way which attempts to
minimize the amount of time it is in a critical section with the mutex
locked for writing. It first checks under a read-lock whether both the
local node and the node where the event originated are participating in
the network which the event pertains to. If the check passes, the mutex
is unlocked for reading and locked for writing so the local database
state is mutated in a critical section. That leaves a window of time
between the participation check the write-lock being acquired for a
network or node event to arrive and be processed. If a table event for a
node+network races a node or network event which triggers the purge of
all table entries for the same node+network, the invariant could be
violated. The table entry described by the table event may be reinserted
into the local database state after being purged by the node's leaving,
resulting in an orphaned table entry which the local node will bulk-sync
to other nodes indefinitely.
It's not completely wrong to perform a pre-flight check outside of the
critical section. It allows for an early return in the no-op case
without having to bear the cost of synchronization. But such optimistic
concurrency control is only sound if the condition is double-checked
inside the critical section. It is tricky to get right, and this
instance of optimistic concurrency control smells like a case of
premature optimization. Move the pre-flight check into the critical
section to ensure that the invariant is maintained.
Signed-off-by: Cory Snider <[email protected]>1 parent a3aea15 commit 270a4d4
1 file changed
Lines changed: 6 additions & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
148 | 148 | | |
149 | 149 | | |
150 | 150 | | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
151 | 157 | | |
152 | | - | |
153 | 158 | | |
154 | 159 | | |
155 | 160 | | |
| |||
161 | 166 | | |
162 | 167 | | |
163 | 168 | | |
164 | | - | |
165 | 169 | | |
166 | 170 | | |
167 | 171 | | |
168 | 172 | | |
169 | 173 | | |
170 | 174 | | |
171 | | - | |
172 | | - | |
173 | | - | |
174 | | - | |
175 | | - | |
176 | 175 | | |
177 | 176 | | |
178 | 177 | | |
| |||
0 commit comments