
Resume issue with big row events #85


Description

@bpolaszek
  • Operating System: Not relevant
  • PHP Version: 7.4
  • php-mysql-replication Version: 7.0.1
  • MySQL Version: 5.7 / 8

Hello there,

We have a running application which:

  • Listens to MySQLReplication\Event\DTO\RowsDTO events (inserts, updates, deletes)
  • Loops over RowsDTO::getValues() to process them
  • After the foreach over $rows->getValues() completes, stores the binlog position of the event: $event->getEventInfo()->getBinLogCurrent()->getBinFileName() / $event->getEventInfo()->getBinLogCurrent()->getBinLogPosition() (see the sketch below).

When the app starts, it resumes from the previously stored filename / position.
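
For reference, the consumer looks roughly like this. This is a minimal sketch based on the subscriber style from the library's README; loadCheckpoint() / saveCheckpoint() / processRow() are hypothetical placeholders standing in for our storage and business logic:

```php
<?php

use MySQLReplication\Config\ConfigBuilder;
use MySQLReplication\Event\DTO\EventDTO;
use MySQLReplication\Event\DTO\RowsDTO;
use MySQLReplication\Event\EventSubscribers;
use MySQLReplication\MySQLReplicationFactory;

class RowsSubscriber extends EventSubscribers
{
    public function allEvents(EventDTO $event): void
    {
        if (!$event instanceof RowsDTO) {
            return;
        }

        foreach ($event->getValues() as $row) {
            processRow($row); // hypothetical: our business logic
        }

        // Checkpoint only once the whole event has been processed.
        $binLogCurrent = $event->getEventInfo()->getBinLogCurrent();
        saveCheckpoint( // hypothetical: persists to our storage
            $binLogCurrent->getBinFileName(),
            $binLogCurrent->getBinLogPosition()
        );
    }
}

// On startup, resume from the last stored filename / position.
[$file, $position] = loadCheckpoint(); // hypothetical

$binLogStream = new MySQLReplicationFactory(
    (new ConfigBuilder())
        ->withHost('127.0.0.1')
        ->withUser('repl')
        ->withPassword('secret')
        ->withBinLogFileName($file)
        ->withBinLogPosition($position)
        ->build()
);
$binLogStream->registerSubscriber(new RowsSubscriber());
$binLogStream->run();
```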
Most of the time this works well. However, we found a typical case where it doesn't.

Consider an INSERT affecting several thousand rows (e.g. INSERT INTO [table] SELECT * FROM [some_other_table]). Contrary to what we expected, this doesn't fire a single RowsDTO event with a huge getValues() array, but several RowsDTO events with just a few dozen items in getValues() each.

We didn't understand why until we discovered the binlog_row_event_max_size option on the replica server, but that's not the point.
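
For anyone trying to reproduce this, the splitting threshold can be checked on the server like below (a plain PDO sketch; host and credentials are placeholders):

```php
<?php

// Check the server-side threshold behind the splitting described above.
// Host and credentials are placeholders.
$pdo = new PDO('mysql:host=127.0.0.1;port=3306', 'root', 'secret');

$row = $pdo->query("SHOW GLOBAL VARIABLES LIKE 'binlog_row_event_max_size'")
    ->fetch(PDO::FETCH_ASSOC);

// Defaults to 8192 (bytes): a rows event larger than this is written as
// several consecutive rows events in the binlog, hence several RowsDTO events.
echo $row['Variable_name'], ' = ', $row['Value'], PHP_EOL;
```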

To reproduce:

  • Consider that before running your big INSERT query, your position is 00001
  • Consider that the position increments to 01000 at the end of the insert
  • Imagine that you stop your app in the middle of processing one of the RowsDTO events
  • Consider that you stored 00500 as the last processed position when you stopped the app (i.e. you had only processed half of the rows of that insert)

Expected Result:

  • When restarting the app, it should resume at position 00500 and fire the remaining RowsDTO events.

Actual Result:

  • The remaining RowsDTO events of that INSERT query are simply dismissed, and the app resumes at the next query in the binlog.

The only solution we have right now is to pray that the application is not shut down (and doesn't crash) while processing that kind of big statement. We could increase binlog_row_event_max_size on the replica server, but that wouldn't guarantee RowsDTO events won't be split, and on the other hand a huge array in RowsDTO::getValues() might lead to memory issues.

As a quick fix, we now temporarily store the filename / position of the QueryDTO events that are fired before and after the RowsDTO events, so that if not all RowsDTO events can be processed, they can be replayed from the former position (00001 in our example). This leads to duplicate processing, but we prefer that over missing events.
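
In code, the workaround is something along these lines (a sketch of the idea rather than our exact implementation; saveCheckpoint() and processRow() are the same hypothetical helpers as above):

```php
<?php

use MySQLReplication\Event\DTO\EventDTO;
use MySQLReplication\Event\DTO\QueryDTO;
use MySQLReplication\Event\DTO\RowsDTO;
use MySQLReplication\Event\EventSubscribers;

class SafeCheckpointSubscriber extends EventSubscribers
{
    public function allEvents(EventDTO $event): void
    {
        if ($event instanceof QueryDTO) {
            // QueryDTO events frame the rows events of a statement. When one
            // arrives, everything before it has been fully processed, so its
            // position is a safe resume point.
            $binLogCurrent = $event->getEventInfo()->getBinLogCurrent();
            saveCheckpoint( // hypothetical: persists to our storage
                $binLogCurrent->getBinFileName(),
                $binLogCurrent->getBinLogPosition()
            );
            return;
        }

        if ($event instanceof RowsDTO) {
            foreach ($event->getValues() as $row) {
                processRow($row); // hypothetical: our business logic
            }
            // Deliberately no checkpoint here: a big INSERT is split into
            // several RowsDTO events, and resuming between two of them makes
            // the remaining ones get skipped. Crashing here means the whole
            // statement is replayed from the last QueryDTO position, so we
            // get duplicates rather than missed events.
        }
    }
}
```

With this in place, a crash mid-statement costs us re-processing at most one statement's rows.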

Thank you,
Ben
