libnet: don't lock *Endpoint in the void#47216
libnet: don't lock *Endpoint in the void#47216akerouanton wants to merge 1 commit intomoby:masterfrom
*Endpoint in the void#47216Conversation
|
@akerouanton I think the failure may be related (not a flaky); |
|
Yep, they're. I'll convert this one as a draft for now. |
66455fb to
ada1cb5
Compare
`(*Endpoint).sbJoin()` `(*Endpoint).sbLeave()` start by calling `(*Network).getEndpointFromStore()` before doing anything else. In turn this method calls `(*Store).GetObject()` with a freshly created `*Endpoint` instance, meaning the `*Endpoint` instance returned by `getEndpointFromStore` is a totally different reference than the one already living in memory. Thus, any mutex locks on these new instances actually lock *nothing* at all. Another property of `GetObject()` is to fully hydrate a given object based on the "canonical" reference that lives in its cache. Since both `sbJoin()` and `sbLeave()` are always called on an `*Endpoint` receiver which were either just created, or just retrieved from the datastore, there's no chance that the receiver was partially hydrated. Signed-off-by: Albin Kerouanton <[email protected]>
ada1cb5 to
69513a3
Compare
|
@thaJeztah Turns out, the same fix had to be applied to |
Silly question; are there other code-paths that could have been executed concurrently that could have modified the endpoint (so the hydrated copy we have being a stale / outdated copy)? (as there's no lock before we retrieved the hydrated copy, so curious if there's any synchronisation to prevent that from happening). |
Consider the call chain The endpoint receiver being partially hydrated is not the most important concern; if it is out of date w.r.t. the KVObject in the datastore, the |
- What I did
(*Endpoint).sbJoin(),(*Endpoint).sbLeave()start by calling(*Network).getEndpointFromStore()before doing anything else. In turn this method calls(*Store).GetObject()with a freshly created*Endpointinstance, meaning the*Endpointinstance returned bygetEndpointFromStoreis a totally different reference than the one already living in memory. Thus, any mutex locks on these new instances actually lock nothing at all.Another property of
GetObject()is to fully hydrate a given object based on the "canonical" reference that lives in its cache.Since both
sbJoin()andsbLeave()are always called on an*Endpointreceiver which were either just created, or just retrieved from the datastore, there's no chance that the receiver was partially hydrated.- How I did it
By conducting a static analysis of
sbJoin(),sbLeave()and their callsites.- How to verify it
CI