Description
I'm paraphrasing heavily because I can't read Go well, but at the point where the tool is ready to swap tables, there are these connections:
- GET_LOCK('something', 0);
- SELECT RELEASE_LOCK('something') FROM original_table WHERE GET_LOCK('something',999)>=0 LIMIT 1; # or similar
- RENAME TABLE original_table TO original_old, original_new TO original;
Then there's a check that the parsed binlog writes have all been applied to original_new, after which the locks are released and the rename happens.
However, if either the GET_LOCK() connection or the SELECT RELEASE_LOCK() connection gets killed, the RENAME happens anyway. So there are two places where a killed connection could result in a too-early swap and some binlog events being lost.
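To make the failure mode concrete, here is a toy in-memory model of the swap in Go. It is only a sketch of the behaviour described above, not gh-ost's code: the `cutOver` type, its fields, and `simulate` are all hypothetical names, and a single `lockHeld` flag stands in for whatever keeps the RENAME blocked. The one assumption it encodes is that a killed connection loses its lock just as a deliberate release would.

```go
package main

import "fmt"

// cutOver is a toy model of the swap step. It is NOT gh-ost's actual code;
// every name here is hypothetical.
type cutOver struct {
	lockHeld      bool // stands in for whatever keeps the RENAME blocked
	pendingEvents int  // binlog events not yet applied to original_new
}

// applyPending models draining the remaining binlog events into original_new.
func (c *cutOver) applyPending() { c.pendingEvents = 0 }

// releaseLock models both a deliberate release and a killed connection:
// in either case the lock is gone and nothing blocks the rename.
func (c *cutOver) releaseLock() { c.lockHeld = false }

// rename reports whether the rename ran and how many events it left behind.
func (c *cutOver) rename() (ran bool, lost int) {
	if c.lockHeld {
		return false, 0 // still blocked
	}
	return true, c.pendingEvents
}

// simulate runs one cut-over that starts with 3 pending events and returns
// how many of them were lost when the rename went through.
func simulate(killedEarly bool) int {
	c := &cutOver{lockHeld: true, pendingEvents: 3}
	if killedEarly {
		c.releaseLock() // connection killed before the events were applied
	} else {
		c.applyPending() // intended order: drain events first...
		c.releaseLock()  // ...then let the rename proceed
	}
	_, lost := c.rename()
	return lost
}

func main() {
	fmt.Println("normal cut-over, events lost:", simulate(false))
	fmt.Println("lock connection killed, events lost:", simulate(true))
}
```

The point of the model is that the rename itself never checks whether the events were applied; it only waits for the lock, so anything that removes the lock early also removes the safety check.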
I'd like to be able to say with confidence that gh-ost can't result in unintentionally lost rows or writes. Maybe in this case that would involve locking the now-original table after the rename and writing the missed events then (e.g. we'd know which ones because the REPLACE/DELETE against original_new would have failed). Just brainstorming.