
pw.io.mssql.read needs LSN persistence #218

@zxqfd555


Is your feature request related to a problem? Please describe.

pw.io.mssql.read does not persist its position in any mode. When persistence is enabled, the connector immediately aborts with an error stating that persistence is not supported. Without persistence, on restart it re-reads everything from the beginning.

Describe the solution you'd like

Use the position in MS SQL Server's Change Data Capture (CDC) log as the persistence offset. Concretely, store the last successfully processed LSN (Log Sequence Number) in Pathway's persistence layer after each committed batch. On restart, resume CDC replication from the first change after that LSN.
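
A minimal sketch of the checkpoint-and-resume logic described above, written in plain Python with no database access. `LsnOffsetStore`, `changes_after` and `make_lsn` are hypothetical names used only for illustration and are not part of Pathway's API. The sketch assumes that batches end on transaction boundaries and that LSNs are handled as 10-byte values whose byte-wise ordering matches the server's ordering.

```python
from typing import Iterable, List, Optional, Tuple

LSN_SIZE = 10  # SQL Server LSNs are binary(10) values

Change = Tuple[bytes, str]  # (commit LSN, payload); a simplified change row


def make_lsn(n: int) -> bytes:
    """Build a 10-byte big-endian LSN for illustration."""
    return n.to_bytes(LSN_SIZE, "big")


class LsnOffsetStore:
    """Hypothetical stand-in for the persistence layer; holds one LSN."""

    def __init__(self) -> None:
        self.saved_lsn: Optional[bytes] = None

    def commit(self, lsn: bytes) -> None:
        # Called only after a batch has been fully processed and committed.
        if len(lsn) != LSN_SIZE:
            raise ValueError(f"expected a {LSN_SIZE}-byte LSN, got {len(lsn)} bytes")
        if self.saved_lsn is not None and lsn < self.saved_lsn:
            raise ValueError("LSN offset must not move backwards")
        self.saved_lsn = lsn


def changes_after(changes: Iterable[Change], saved_lsn: Optional[bytes]) -> List[Change]:
    """Return the changes that still have to be delivered after a restart."""
    if saved_lsn is None:
        return list(changes)  # no checkpoint yet: start from the beginning
    # Python compares bytes lexicographically, i.e. byte by byte.
    return [change for change in changes if change[0] > saved_lsn]
```

Against a real server, the strictly-greater filter would presumably correspond to passing `sys.fn_cdc_increment_lsn(saved_lsn)` as the lower bound of the CDC query, but that wiring is outside the scope of this sketch.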

If the saved LSN is no longer available on the server — because the CDC log retention window has expired and the corresponding log records have been purged — the connector must raise a descriptive error at startup, for example:

"Saved CDC offset (LSN ...) is no longer available on the MS SQL Server. The CDC log may have been purged. Manual re-snapshot or persistence reset is required."

Silently falling back to reading from the start or from the current position would produce incorrect results (duplicate emission or data loss respectively) and must not be done. The error path should have a code comment referencing this issue.
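
The startup check could look roughly like the sketch below. It is an illustration only: `CdcOffsetUnavailableError`, `format_lsn` and `ensure_lsn_available` are hypothetical names, and `min_available_lsn` is assumed to have been fetched from the server beforehand (for example via `sys.fn_cdc_get_min_lsn` for the capture instance). The comparison is deliberately conservative: any saved LSN below the minimum available one is treated as purged.

```python
from typing import Optional


class CdcOffsetUnavailableError(RuntimeError):
    """Hypothetical error type for a saved LSN that the server no longer holds."""


def format_lsn(lsn: bytes) -> str:
    """Render an LSN the way SQL Server tools usually print binary values."""
    return "0x" + lsn.hex().upper()


def ensure_lsn_available(saved_lsn: Optional[bytes], min_available_lsn: bytes) -> None:
    """Raise if the saved offset points before the oldest retained CDC record."""
    if saved_lsn is None:
        return  # first run: nothing to validate
    if saved_lsn < min_available_lsn:
        # See issue #218: do NOT fall back to the start (duplicates) or to the
        # current position (data loss); fail loudly instead.
        raise CdcOffsetUnavailableError(
            f"Saved CDC offset (LSN {format_lsn(saved_lsn)}) is no longer available "
            "on the MS SQL Server. The CDC log may have been purged. "
            "Manual re-snapshot or persistence reset is required."
        )
```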

Describe alternatives you've considered

Storing a row-level timestamp or a row count as the offset is unreliable: timestamps are not guaranteed to be unique or monotonic across transactions, and row counts do not survive schema changes or deletes. The LSN is the only correct and stable resumption point for CDC-based connectors, and is the same approach used by pw.io.mongodb.read.
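
To illustrate the timestamp problem, here is a toy example with made-up data. LSNs are shown as small integers for readability, and two separate transactions are assumed to share the same commit timestamp.

```python
from typing import List, Tuple

Row = Tuple[int, int, str]  # (lsn, commit timestamp in seconds, payload)

log: List[Row] = [
    (101, 1000, "a"),
    (102, 1000, "b"),  # different transaction, same timestamp as "a"
    (103, 1001, "c"),
]

# Suppose the connector checkpointed right after delivering "a".
saved_lsn, saved_ts = 101, 1000

resume_by_lsn = [p for lsn, ts, p in log if lsn > saved_lsn]
resume_by_ts_exclusive = [p for lsn, ts, p in log if ts > saved_ts]
resume_by_ts_inclusive = [p for lsn, ts, p in log if ts >= saved_ts]

print(resume_by_lsn)           # ['b', 'c']
print(resume_by_ts_exclusive)  # ['c']            ("b" is lost)
print(resume_by_ts_inclusive)  # ['a', 'b', 'c']  ("a" is duplicated)
```

Neither timestamp comparison recovers exactly the undelivered rows, whereas the LSN comparison does.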

Additional context

MS SQL Server's CDC retention is controlled by a SQL Server Agent cleanup job (default retention: 3 days), rather than a replication slot that holds WAL on behalf of the consumer. This means the server does not guarantee log availability for slow or offline consumers — hence the hard error on missing LSN rather than any silent fallback.

Testing should extend the existing MS SQL Server integration test suite with two kinds of tests: restart/resume tests that verify correct incremental delivery after a checkpoint, and a test that verifies that the descriptive error is raised when the connector is restarted with an LSN that has been purged from the CDC tables.
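
The shape of these two tests can be sketched against an in-memory fake, as below. This is not the real integration suite: `FakeCdcSource`, `PurgedLsnError` and `run_once` are hypothetical stand-ins, LSNs are plain integers for brevity, and the real tests would run against an actual SQL Server instance with CDC enabled.

```python
from typing import List, Optional, Tuple

Change = Tuple[int, str]  # (LSN as int for brevity, payload)


class PurgedLsnError(RuntimeError):
    """Hypothetical error raised when the saved LSN has been purged."""


class FakeCdcSource:
    """In-memory stand-in for a CDC-enabled table."""

    def __init__(self) -> None:
        self.changes: List[Change] = []
        self.min_lsn = 1

    def append(self, lsn: int, payload: str) -> None:
        self.changes.append((lsn, payload))

    def purge_before(self, lsn: int) -> None:
        # Mimics the cleanup job dropping change rows older than `lsn`.
        self.changes = [c for c in self.changes if c[0] >= lsn]
        self.min_lsn = lsn

    def read_after(self, saved_lsn: Optional[int]) -> List[Change]:
        if saved_lsn is not None and saved_lsn < self.min_lsn:
            raise PurgedLsnError(
                f"Saved CDC offset (LSN {saved_lsn}) is no longer available"
            )
        return [c for c in self.changes if saved_lsn is None or c[0] > saved_lsn]


def run_once(source: FakeCdcSource, saved_lsn: Optional[int]):
    """One connector run: deliver pending changes, return (payloads, new offset)."""
    batch = source.read_after(saved_lsn)
    new_lsn = batch[-1][0] if batch else saved_lsn
    return [payload for _, payload in batch], new_lsn


def test_resume_delivers_only_new_changes() -> None:
    source = FakeCdcSource()
    source.append(1, "a")
    source.append(2, "b")
    delivered, offset = run_once(source, None)
    assert delivered == ["a", "b"] and offset == 2
    source.append(3, "c")
    delivered, offset = run_once(source, offset)  # simulated restart
    assert delivered == ["c"] and offset == 3


def test_purged_lsn_raises() -> None:
    source = FakeCdcSource()
    source.append(1, "a")
    source.append(2, "b")
    _, offset = run_once(source, None)  # checkpoint at LSN 2
    for lsn, payload in [(3, "c"), (4, "d"), (5, "e")]:
        source.append(lsn, payload)
    source.purge_before(4)  # LSN 3 is gone before the restart
    try:
        run_once(source, offset)
    except PurgedLsnError as exc:
        assert "no longer available" in str(exc)
    else:
        raise AssertionError("expected PurgedLsnError")
```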

Labels: enhancement