Is your feature request related to a problem? Please describe.
Pathway currently has pw.io.sqlite.read, which is useful for exchanging data between Pathway and external processes (other Pathway pipelines, third-party applications, etc.). However, there is no corresponding pw.io.sqlite.write, making SQLite a read-only integration and breaking the symmetry that exists for all other database connectors (PostgreSQL, MySQL, MS SQL Server, MongoDB). Users who want to use SQLite as a shared data store between two Pathway processes have no native way to do so.
Describe the solution you'd like
Add a native SQLite output connector implemented in Rust using the existing rusqlite crate (MIT license), exposed as pw.io.sqlite.write.
The connector must be consistent with the interface established by the other database output connectors (e.g. pw.io.mysql.write, pw.io.postgres.write). Specifically:
Crate: rusqlite — MIT license. Already used by pw.io.sqlite.read; no new dependency needed.
API:
pw.io.sqlite.write(
    table,
    path,
    table_name,
    *,
    max_batch_size=None,
    init_mode="default",  # "default" | "create_if_not_exists" | "replace"
    output_table_type="stream_of_changes",  # "stream_of_changes" | "snapshot"
    # ... other parameters consistent with other DB connectors (primary_key, name, sort_by, etc.)
)
init_mode options (matching the convention from other DB connectors):
- "default" — write to an existing table; raise an error if it does not exist.
- "create_if_not_exists" — create the table, with columns inferred from the Pathway schema, if it does not already exist.
- "replace" — drop and recreate the table at startup.
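The three startup behaviors can be sketched against plain SQLite using Python's built-in sqlite3 module (the function name, toy schema, and error type here are illustrative, not part of the proposed API):

```python
import sqlite3

def prepare_table(conn: sqlite3.Connection, table_name: str, init_mode: str) -> None:
    """Illustrative startup behavior for each init_mode variant."""
    # In the real connector the CREATE TABLE statement would be inferred
    # from the Pathway schema; a fixed toy schema is used here.
    create_sql = f"CREATE TABLE {table_name} (key INTEGER PRIMARY KEY, value TEXT)"
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone() is not None

    if init_mode == "default":
        if not exists:
            raise RuntimeError(f"table {table_name!r} does not exist")
    elif init_mode == "create_if_not_exists":
        if not exists:
            conn.execute(create_sql)
    elif init_mode == "replace":
        conn.execute(f"DROP TABLE IF EXISTS {table_name}")
        conn.execute(create_sql)
    else:
        raise ValueError(f"unknown init_mode: {init_mode!r}")

conn = sqlite3.connect(":memory:")
prepare_table(conn, "events", "create_if_not_exists")  # creates the table
prepare_table(conn, "events", "replace")               # drops and recreates it
```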
output_table_type options:
- "stream_of_changes" (default) — append a log of all changes to the target table, with additional time (INTEGER) and diff (INTEGER, +1 for insertion, -1 for deletion) columns, consistent with other connectors.
- "snapshot" — maintain the current state of the data, applying upserts and deletes atomically per minibatch. Requires primary_key to be specified.
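In snapshot mode, each minibatch can be applied atomically as a single transaction; a minimal sketch with the stdlib sqlite3 module (the schema and the (diff, key, value) change representation are illustrative, not what the engine would use internally):

```python
import sqlite3

def apply_minibatch(conn, table_name, changes):
    """Apply one minibatch of (diff, key, value) changes atomically.

    diff == +1 upserts the row for its primary key; diff == -1 deletes it.
    """
    with conn:  # one transaction per minibatch: all or nothing
        for diff, key, value in changes:
            if diff == +1:
                conn.execute(
                    f"INSERT OR REPLACE INTO {table_name} (key, value) VALUES (?, ?)",
                    (key, value),
                )
            elif diff == -1:
                conn.execute(f"DELETE FROM {table_name} WHERE key = ?", (key,))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE state (key INTEGER PRIMARY KEY, value TEXT)")
apply_minibatch(conn, "state", [(+1, 1, "a"), (+1, 2, "b")])
apply_minibatch(conn, "state", [(-1, 1, None), (+1, 2, "b2")])
# state now holds only the row (2, "b2")
```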
Type mapping and round-trip guarantee: the type serialization must be consistent with how pw.io.sqlite.read currently deserializes values, so that any value written via pw.io.sqlite.write survives a round-trip through pw.io.sqlite.read unchanged. The finalized type mapping table should be added to the pw.io.sqlite API docs page as part of this work, analogously to how it is done for pw.io.mysql.
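The round-trip property at the SQLite storage level can be demonstrated with the stdlib sqlite3 module (this only shows the mechanics for a few basic types; the actual Pathway-type-to-SQLite-type pairings must be taken from what pw.io.sqlite.read implements today, not from this sketch):

```python
import sqlite3

# Assumed basic pairings: int -> INTEGER, float -> REAL, str -> TEXT, bytes -> BLOB.
samples = [
    ("i", 42),
    ("f", 3.5),
    ("s", "héllo"),
    ("b", b"\x00\xff"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rt (name TEXT, val)")  # untyped column: SQLite is dynamically typed
conn.executemany("INSERT INTO rt VALUES (?, ?)", samples)
read_back = dict(conn.execute("SELECT name, val FROM rt").fetchall())
assert read_back == dict(samples)  # values survive the round trip unchanged
```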
Describe alternatives you've considered
Users can currently write to SQLite via a Python output connector, but this bypasses the engine-level batching and transaction management and is significantly more error-prone. There is no reason SQLite should be treated differently from the other database connectors.
Additional context
Testing: Since SQLite requires no external server or Docker container, the integration tests do not need to go into a separate Docker Compose-based test suite. They should be added directly to the existing unit test suite (e.g. test_io), alongside the existing pw.io.sqlite.read tests. The test coverage should include:
- Writing in stream_of_changes mode and reading back, asserting equality.
- Writing in snapshot mode and reading back, asserting equality.
- All three init_mode variants (default, create_if_not_exists, replace).
- Round-trip tests for every Pathway type with a defined SQLite mapping — write via pw.io.sqlite.write, read back via pw.io.sqlite.read, assert the value is unchanged. These are analogous to the *_parsing tests that exist for other connectors.
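The invariant the stream_of_changes tests should enforce can be stated without Pathway at all: replaying the change log in time order must reproduce the snapshot. A minimal stdlib-only sketch (table and column names, and the tie-breaking by rowid, are illustrative assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A change log shaped like stream_of_changes output: payload + time + diff.
conn.execute("CREATE TABLE log (key INTEGER, value TEXT, time INTEGER, diff INTEGER)")
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?, ?)",
    [
        (1, "a", 0, +1),
        (2, "b", 0, +1),
        (1, "a", 1, -1),   # retraction of (1, "a")
        (1, "a2", 1, +1),  # replacement inserted at the same time
    ],
)

# Replaying the log in time order yields the current snapshot.
state = {}
for key, value, _time, diff in conn.execute(
    "SELECT key, value, time, diff FROM log ORDER BY time, rowid"
):
    if diff == +1:
        state[key] = value
    else:
        del state[key]

assert state == {1: "a2", 2: "b"}
```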
Since this is a completely new connector being added, no backward compatibility issues are expected.