Make asynchronous replica re-initialization reliable #8324

Merged
dyemanov merged 1 commit into v5.0-release from reliable-replica-reinit on Nov 25, 2024

Conversation

@dyemanov
Member

Currently, when a physical backup is performed, the journal segment is switched from N to N+1 at the start of the backup, so that the backup file is guaranteed to contain only data up to and including sequence N. However, a long-running writeable transaction may already have some of its changes stored in segments <= N, while its commit event is stored in a later segment. After re-initialization on the replica side, we continue with segment N+1, so (a) the older changes are lost and (b) the error "Transaction X is not found" usually occurs. This means that the replica is inconsistent and must be re-initialized again. If the primary is under high load, this may happen over and over.

The solution is not to delete segments <= N immediately, but instead to scan them to find the transactions that are still active at the end of N, calculate the new replication OAT (oldest active transaction), delete everything < OAT, and replay the journal (active transactions only) starting with OAT, then proceed normally with N+1 and beyond.
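The steps described above can be sketched roughly as follows. This is only an illustrative Python model, not the actual Firebird code: the `Event` type, the event kinds, and the functions `plan_reinit` and `reinit_replay` are hypothetical names, and the sketch assumes that the replication OAT is expressed as the sequence number of the oldest segment in which a still-active transaction started.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical journal record: what happened and in which transaction."""
    kind: str  # "start", "change", "commit" or "rollback"
    txn: int   # transaction identifier


def plan_reinit(segments: dict[int, list[Event]], n: int) -> tuple[int, set[int]]:
    """Scan segments <= n and return (oat, active transactions at the end of n).

    Assumption: oat is the sequence of the oldest segment that holds the
    start of a transaction still active at the end of n (n + 1 if none).
    """
    started_in: dict[int, int] = {}  # txn -> segment where it started
    for seq in sorted(s for s in segments if s <= n):
        for ev in segments[seq]:
            if ev.kind == "start":
                started_in[ev.txn] = seq
            elif ev.kind in ("commit", "rollback"):
                started_in.pop(ev.txn, None)
    oat = min(started_in.values(), default=n + 1)
    return oat, set(started_in)


def reinit_replay(segments: dict[int, list[Event]], n: int):
    """Return (oat, events to replay) after re-initializing from a backup at n."""
    oat, active = plan_reinit(segments, n)
    # Segments older than the OAT are not needed by any active transaction.
    kept = {seq: evs for seq, evs in segments.items() if seq >= oat}
    replayed = []
    for seq in sorted(kept):
        for ev in kept[seq]:
            # In OAT..n only the still-active transactions are replayed;
            # from n + 1 onwards every event is applied as usual.
            if seq > n or ev.txn in active:
                replayed.append((seq, ev))
    return oat, replayed
```

In this model, a transaction that started in segment 1 and commits in segment N+1 keeps segment 1 and everything after it alive, so its commit event finds the earlier changes instead of failing with "Transaction X is not found".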

@dyemanov
Member Author

It appears something went wrong with the diff, sorry. Will fix ASAP.

@dyemanov changed the base branch from master to v5.0-release on November 21, 2024 15:47
@dyemanov
Member Author

The wrong branch was initially selected. The patch is against v5 but can be (should be, I'd say) back- and forward-ported.

@pavel-zotov

pavel-zotov commented Dec 10, 2024

::: QA NOTE :::
Implemented within the group of tests related to replication, see:
functional/replication/test_make_async_reinit_reliable.py

@mrotteveel added the "rlsnotes60: no" label (Intentionally not added to the Firebird 6.0 release notes) on Mar 3, 2026