
Resume issue with big row events #85


Description

@bpolaszek
  • Operating System: Not relevant
  • PHP Version: 7.4
  • php-mysql-replication Version: 7.0.1
  • MySQL Version: 5.7 / 8

Hello there,

We have a running application which:

  • Listens to MySQLReplication\Event\DTO\RowsDTO events (inserts, updates, deletes)
  • Loops over RowsDTO::getValues() to process them
  • After the foreach over $rows->getValues() completes, stores the binlog position of the event: $event->getEventInfo()->getBinLogCurrent()->getBinFileName() / $event->getEventInfo()->getBinLogCurrent()->getBinLogPosition() (see the sketch below).

When the app starts, it resumes from the previously stored filename / position.
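
For reference, the consumer looks roughly like this. This is a minimal sketch based on the subscriber style from the library's README; loadCheckpoint() / saveCheckpoint() / processRow() are hypothetical placeholders standing in for our storage and business logic:

```php
<?php

use MySQLReplication\Config\ConfigBuilder;
use MySQLReplication\Event\DTO\EventDTO;
use MySQLReplication\Event\DTO\RowsDTO;
use MySQLReplication\Event\EventSubscribers;
use MySQLReplication\MySQLReplicationFactory;

class RowsSubscriber extends EventSubscribers
{
    public function allEvents(EventDTO $event): void
    {
        if (!$event instanceof RowsDTO) {
            return;
        }

        foreach ($event->getValues() as $row) {
            processRow($row); // hypothetical: our business logic
        }

        // Checkpoint only once the whole event has been processed.
        $binLogCurrent = $event->getEventInfo()->getBinLogCurrent();
        saveCheckpoint( // hypothetical: persists to our storage
            $binLogCurrent->getBinFileName(),
            $binLogCurrent->getBinLogPosition()
        );
    }
}

// On startup, resume from the last stored filename / position.
[$file, $position] = loadCheckpoint(); // hypothetical

$binLogStream = new MySQLReplicationFactory(
    (new ConfigBuilder())
        ->withHost('127.0.0.1')
        ->withUser('repl')
        ->withPassword('secret')
        ->withBinLogFileName($file)
        ->withBinLogPosition($position)
        ->build()
);
$binLogStream->registerSubscriber(new RowsSubscriber());
$binLogStream->run();
```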
Most of the time this works well. However, we found a typical case where it doesn't.

Consider an INSERT affecting several thousand rows (e.g. INSERT INTO [table] SELECT * FROM [some_other_table]). Contrary to what we expected, this doesn't fire a single RowsDTO event with a huge getValues() array, but several RowsDTO events with just a few dozen items in getValues() each.

We didn't understand why until we discovered the binlog_row_event_max_size option on the replica server, but that's not the point.
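
For anyone trying to reproduce this, the splitting threshold can be checked on the server like below (a plain PDO sketch; host and credentials are placeholders):

```php
<?php

// Check the server-side threshold behind the splitting described above.
// Host and credentials are placeholders.
$pdo = new PDO('mysql:host=127.0.0.1;port=3306', 'root', 'secret');

$row = $pdo->query("SHOW GLOBAL VARIABLES LIKE 'binlog_row_event_max_size'")
    ->fetch(PDO::FETCH_ASSOC);

// Defaults to 8192 (bytes): a rows event larger than this is written as
// several consecutive rows events in the binlog, hence several RowsDTO events.
echo $row['Variable_name'], ' = ', $row['Value'], PHP_EOL;
```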

To reproduce:

  • Consider that before running your big INSERT query, your position is 00001
  • Consider that the position increments to 01000 at the end of the insert
  • Imagine that you stop your app in the middle of processing one of the RowsDTO events
  • Consider that you stored 00500 as the last processed position when you stopped the app (i.e. you had only processed half of the rows of that insert)

Expected Result:

  • When restarting the app, it should resume at position 00500 and fire the remaining RowsDTO events.

Actual Result:

  • The remaining RowsDTO events of that INSERT query are simply dismissed, and the app resumes at the next query in the binlog.

The only solution we have right now is to pray that the application is not shut down (and doesn't crash) while processing that kind of big statement. We could increase binlog_row_event_max_size on the replica server, but that wouldn't guarantee RowsDTO events won't be split, and on the other hand a huge array in RowsDTO::getValues() might lead to memory issues.

As a quick fix, we now temporarily store the filename / position of the QueryDTO events that are fired before and after the RowsDTO events, so that if not all RowsDTO events can be processed, they can be replayed from the former position (00001 in our example). This leads to duplicate processing, but we prefer that over missing events.
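
In code, the workaround is something along these lines (a sketch of the idea rather than our exact implementation; saveCheckpoint() and processRow() are the same hypothetical helpers as above):

```php
<?php

use MySQLReplication\Event\DTO\EventDTO;
use MySQLReplication\Event\DTO\QueryDTO;
use MySQLReplication\Event\DTO\RowsDTO;
use MySQLReplication\Event\EventSubscribers;

class SafeCheckpointSubscriber extends EventSubscribers
{
    public function allEvents(EventDTO $event): void
    {
        if ($event instanceof QueryDTO) {
            // QueryDTO events frame the rows events of a statement. When one
            // arrives, everything before it has been fully processed, so its
            // position is a safe resume point.
            $binLogCurrent = $event->getEventInfo()->getBinLogCurrent();
            saveCheckpoint( // hypothetical: persists to our storage
                $binLogCurrent->getBinFileName(),
                $binLogCurrent->getBinLogPosition()
            );
            return;
        }

        if ($event instanceof RowsDTO) {
            foreach ($event->getValues() as $row) {
                processRow($row); // hypothetical: our business logic
            }
            // Deliberately no checkpoint here: a big INSERT is split into
            // several RowsDTO events, and resuming between two of them makes
            // the remaining ones get skipped. Crashing here means the whole
            // statement is replayed from the last QueryDTO position, so we
            // get duplicates rather than missed events.
        }
    }
}
```

With this in place, a crash mid-statement costs us re-processing at most one statement's rows.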

Thank you,
Ben
