[BUG] Potential Data Race on server.master_repl_offset

**Describe the bug**

We are developing a static data race detection tool and detected a potential data race in redis version 6.0.7 
We found a potential race on `server.master_repl_offset` between the main thread and an IO thread.

**To reproduce**

```C
==== Found a race between: 
line 4988 in server.c AND line 164 in replication.c
Shared variable:
 at line 72 of server.c
 72|struct redisServer server; /* Server global state */
Thread 1:
 4986| {
 4987|     memcpy(server.replid,rsi.repl_id,sizeof(server.replid));
>4988|     server.master_repl_offset = rsi.repl_offset;
 4989|     /* If we are a slave, create a cached master from this
 4990|      * information, in order to allow partial resynchronizations*/
>>>Stacktrace:
>>>main
>>>  loadDataFromDisk [server.c:5272]
Thread 2:
 162|    unsigned char *p = ptr;
 163|
>164|    server.master_repl_offset += len;
 165|
 166|    /* This is a circular buffer, so write as much data we can at every*/
>>>Stacktrace:
>>>pthread_create [networking.c:3016]
>>>  IOThreadMain [networking.c:3016]
>>>    readQueryFromClient [networking.c:2979]
>>>      processInputBuffer [networking.c:1996]
>>>        processGopherRequest [networking.c:1886]
>>>          lookupKeyRead [gopher.c:53]
>>>            lookupKeyReadWithFlags [db.c:146]
>>>              expireIfNeeded [db.c:101]
>>>                propagateExpire [db.c:1311]
>>>                  replicationFeedSlaves [db.c:1233]
>>>                    feedReplicationBacklog [replication.c:270]
```


Inside of `main` in server.c, the function [`InitServerLast`](https://github.com/redis/redis/blob/6.0.7/src/server.c#L5271) creates IOThreads which can eventually call [`feedReplicationBacklog`](https://github.com/redis/redis/blob/6.0.7/src/replication.c#L164)  and write to `server.master_repl_offset` through the call stack shown for thread 2: 
`IOThreadMain -> readQueryFromClient -> ... -> replicationFeedSlaves -> feedReplicationBacklog`

Meanwhile, immediately after the main thread finishes spawning IOThreads and returns from `InitServerLast`, it calls [`loadDataFromDisk`](https://github.com/redis/redis/blob/6.0.7/src/server.c#L5272) which also writes to [`server.master_repl_offset`](https://github.com/redis/redis/blob/6.0.7/src/server.c#L4988).

We did not see any synchronization preventing these two accesses from happening in parallel, but we are not entirely sure. 

Could someone with more knowledge on this part of the code confirm? If there is some synchronization we missed, we would like to update our tool accordingly.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Potential Data Race on server.master_repl_offset #7779

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[BUG] Potential Data Race on server.master_repl_offset #7779

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions