@lzshlzsh
Contributor

Because the callbacks of BinaryLogClient (eventListeners and lifecycleListeners) are kept in lists, and a BinaryLogClient may be reused (see MySqlSplitReader#checkSplitOrStartNext), when multiple snapshot splits are submitted to a SnapshotSplitReader, the callback list still contains the MySqlBinlogSplitReadTask#handleEvent callbacks of snapshot splits that have already been processed. When a binlog event arrives, the callbacks of the processed snapshot splits are invoked as well, which causes the execute function of the current snapshot split's BackfillBinlogReadTask to end before it receives the BINLOG_END watermark event. As a result, the snapshot phase hangs.
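The accumulation described above can be illustrated with a minimal, self-contained sketch. It does not use the real BinaryLogClient or Debezium classes: `FakeBinlogClient` and `FakeReadTask` are hypothetical stand-ins that only model a client which keeps its listeners in a list and tasks which register a callback without ever removing it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class StaleListenerDemo {

    /** Hypothetical stand-in for a binlog client that stores its listeners in a list. */
    static class FakeBinlogClient {
        private final List<Consumer<String>> eventListeners = new ArrayList<>();

        void registerEventListener(Consumer<String> listener) {
            eventListeners.add(listener);
        }

        void emit(String event) {
            // Every registered listener is invoked, including those left behind by finished tasks.
            for (Consumer<String> listener : new ArrayList<>(eventListeners)) {
                listener.accept(event);
            }
        }

        int listenerCount() {
            return eventListeners.size();
        }
    }

    /** Hypothetical read task: registers a callback and records the events it handled. */
    static class FakeReadTask {
        final List<String> handled = new ArrayList<>();

        void start(FakeBinlogClient client) {
            // The callback is registered but never unregistered (assumption of this sketch).
            client.registerEventListener(handled::add);
        }
    }

    public static void main(String[] args) {
        FakeBinlogClient client = new FakeBinlogClient(); // one client reused for both splits

        FakeReadTask split1 = new FakeReadTask();
        split1.start(client);
        client.emit("event-A"); // meant for split 1

        FakeReadTask split2 = new FakeReadTask();
        split2.start(client);
        client.emit("event-B"); // meant for split 2 only

        System.out.println(client.listenerCount()); // 2
        System.out.println(split1.handled);         // [event-A, event-B]
        System.out.println(split2.handled);         // [event-B]
    }
}
```

In this sketch the task for the first split still receives `event-B`, even though that split has already finished. This is the same kind of stale callback that the description above blames for ending the current task early.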

The following log is from our production environment. It shows multiple MySqlStreamingChangeEventSource (the superclass of MySqlBinlogSplitReadTask) callbacks registered by different snapshot splits.

io.debezium.connector.mysql.MySqlStreamingChangeEventSource - XXX: eventListeners(7): com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@61540cca,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@352b5758,io.debezium.connector.mysql.MySqlStreamingChangeEventSource$$Lambda$1014/1247290871@703f0cf,io.debezium.connector.mysql.MySqlStreamingChangeEventSource$$Lambda$1015/190751860@5a253136,io.debezium.connector.mysql.MySqlStreamingChangeEventSource$$Lambda$1016/10641269@12fef255,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@18c84a61,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@55443f, lifecycleListeners(5): com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@61540cca,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@352b5758,io.debezium.connector.mysql.MySqlStreamingChangeEventSource$ReaderThreadLifecycleListener@730a6982,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@18c84a61,com.github.shyiko.mysql.binlog.jmx.BinaryLogClientStatistics@55443f

We believe the improper use of the MySQL BinaryLogClient is the root cause of some task hang issues, such as #1156.
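One possible way to avoid stale callbacks on a reused client is for each task to remove its own listener when it finishes. The sketch below shows that idea only; it is not necessarily what this PR changes. `FakeBinlogClient`, `FakeReadTask`, and the `unregisterEventListener` method on the stand-in are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ListenerCleanupDemo {

    /** Hypothetical stand-in for a reusable client that stores its listeners in a list. */
    static class FakeBinlogClient {
        private final List<Consumer<String>> eventListeners = new ArrayList<>();

        void registerEventListener(Consumer<String> listener) {
            eventListeners.add(listener);
        }

        void unregisterEventListener(Consumer<String> listener) {
            eventListeners.remove(listener);
        }

        void emit(String event) {
            // Iterate over a copy so listeners can be removed while events are delivered.
            for (Consumer<String> listener : new ArrayList<>(eventListeners)) {
                listener.accept(event);
            }
        }

        int listenerCount() {
            return eventListeners.size();
        }
    }

    /** Hypothetical read task that unregisters its own callback when closed. */
    static class FakeReadTask implements AutoCloseable {
        final List<String> handled = new ArrayList<>();
        private final FakeBinlogClient client;
        // Keep a reference to the exact listener instance so it can be removed later.
        private final Consumer<String> listener = handled::add;

        FakeReadTask(FakeBinlogClient client) {
            this.client = client;
            client.registerEventListener(listener);
        }

        @Override
        public void close() {
            client.unregisterEventListener(listener);
        }
    }

    public static void main(String[] args) {
        FakeBinlogClient client = new FakeBinlogClient(); // reused across splits

        FakeReadTask split1 = new FakeReadTask(client);
        client.emit("event-A");
        split1.close(); // split 1 is done, so its callback is removed

        FakeReadTask split2 = new FakeReadTask(client);
        client.emit("event-B");
        split2.close();

        System.out.println(split1.handled);         // [event-A]
        System.out.println(split2.handled);         // [event-B]
        System.out.println(client.listenerCount()); // 0
    }
}
```

Holding on to the listener instance matters here: removing a listener from a list relies on equality, and a freshly created lambda would not be equal to the one that was registered.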

@lzshlzsh
Contributor Author

@leonardBang @kylemeow @minchowang Could you help take a look at this problem?

@lzshlzsh lzshlzsh changed the title Fix the hung up of snapshot phase when reuse binaryLogClient [mysql-cdc] Fix the hung up of snapshot phase when reuse binaryLogClient Feb 14, 2023
@lzshlzsh
Contributor Author

We just encountered this problem in production. The snapshot phase was stuck, and the problem was resolved after applying this fix. @minchowang

@leonardBang leonardBang self-requested a review February 22, 2023 13:27
@leonardBang
Contributor

Thanks @lzshlzsh for the detailed report and fix! I'll review this PR ASAP.

@ruanhang1993 ruanhang1993 added this to the V2.5.0 milestone Jul 5, 2023
@yuxiqian
Member

Hi @lzshlzsh, thanks for your contribution! Before this PR can be merged, could you please rebase it onto the latest master branch?

cc @leonardBang @PatrickRen

@github-actions

This pull request has been automatically marked as stale because it has not had recent activity for 60 days. It will be closed in 30 days if no further activity occurs.

@github-actions github-actions bot added Stale and removed Stale labels Sep 24, 2024
@github-actions github-actions bot added the Stale label Nov 24, 2024
@github-actions

This pull request has been closed because it has not had recent activity. You can reopen it if you want to continue your work, and anyone who is interested in it is encouraged to continue work on this pull request.

@github-actions github-actions bot closed this Dec 25, 2024
@suxinglee

@yuxiqian This needs attention.

@misaya295

@yuxiqian Is it fixed in another PR? I can't find this code in flink-cdc master.

@yuxiqian
Member

Hi Misaya, it seems the original author of this PR has been inactive for a long time, and this PR is way off HEAD branch and not ready for merge. Feel free to open another PR for this if it's still reproducible in the latest version. :)
