6 changes: 6 additions & 0 deletions src/current/_data/redirects.yml
@@ -299,6 +299,12 @@
- destination: molt/migrate-data-load-and-replication.md
sources: [':version/migrate-from-postgres.md']

- destination: molt/migrate-load-replicate.md
sources: ['molt/migrate-data-load-replicate-only.md']

- destination: molt/migrate-resume-replication.md
sources: ['molt/migrate-replicate-only.md']

- destination: molt/migration-overview.md
sources: [':version/migration-overview.md']

3 changes: 3 additions & 0 deletions src/current/_includes/molt/crdb-to-crdb-migration.md
@@ -0,0 +1,3 @@
{{site.data.alerts.callout_info}}
For CockroachDB-to-CockroachDB migrations, use [backup and restore]({% link {{ site.current_cloud_version }}/backup.md %}) with MOLT Replicator. Contact your account team for guidance.
{{site.data.alerts.end}}
6 changes: 3 additions & 3 deletions src/current/_includes/molt/fetch-data-load-output.md
@@ -68,7 +68,7 @@
~~~

{% if page.name != "migrate-bulk-load.md" %}
This message includes a `cdc_cursor` value. You must set the `--defaultGTIDSet` replication flag to this value when starting [`replication-only` mode](#replicate-changes-to-cockroachdb):
This message includes a `cdc_cursor` value. You must set the `--defaultGTIDSet` replication flag to this value when [starting Replicator](#start-replicator):

{% include_cached copy-clipboard.html %}
~~~
@@ -83,9 +83,9 @@
~~~
</section>

{% if page.name == "migrate-data-load-replicate-only.md" %}
{% if page.name == "migrate-load-replicate.md" %}
<section class="filter-content" markdown="1" data-scope="oracle">
The following message shows the appropriate values for the `--backfillFromSCN` and `--scn` replication flags to use when [starting`replication-only` mode](#replicate-changes-to-cockroachdb):
The following message shows the appropriate values for the `--backfillFromSCN` and `--scn` replication flags to use when [starting Replicator](#start-replicator):

{% include_cached copy-clipboard.html %}
~~~
2 changes: 2 additions & 0 deletions src/current/_includes/molt/fetch-metrics.md
@@ -1,3 +1,5 @@
### Fetch metrics

By default, MOLT Fetch exports [Prometheus](https://prometheus.io/) metrics at `http://127.0.0.1:3030/metrics`. You can override the address with `--metrics-listen-addr '{host}:{port}'`, where the endpoint will be `http://{host}:{port}/metrics`.

Cockroach Labs recommends monitoring the following metrics during data load:
4 changes: 2 additions & 2 deletions src/current/_includes/molt/fetch-replication-output.md
@@ -19,8 +19,8 @@
DEBUG [Jan 22 13:52:40] upserted rows conflicts=0 duration=7.620208ms proposed=1 target="\"molt\".\"migration_schema\".\"employees\"" upserted=1
~~~

{% if page.name != "migrate-replicate-only.md" %}
{% if page.name != "migrate-resume-replication.md" %}
{{site.data.alerts.callout_success}}
If replication is interrupted, you can [resume replication]({% link molt/migrate-replicate-only.md %}).
If replication is interrupted, you can [resume replication]({% link molt/migrate-resume-replication.md %}).
{{site.data.alerts.end}}
{% endif %}
4 changes: 2 additions & 2 deletions src/current/_includes/molt/fetch-table-filter-userscript.md
@@ -34,8 +34,8 @@ api.configureSource("defaultdb.migration_schema", {
});
~~~

Pass the userscript to MOLT Fetch with the `--userscript` [replication flag](#replication-flags):
Pass the userscript to MOLT Replicator with the `--userscript` [flag](#replication-flags):

~~~
--replicator-flags "--userscript table_filter.ts"
--userscript table_filter.ts
~~~
172 changes: 162 additions & 10 deletions src/current/_includes/molt/migration-prepare-database.md
@@ -61,7 +61,12 @@ GRANT EXECUTE_CATALOG_ROLE TO C##MIGRATION_USER;
GRANT SELECT_CATALOG_ROLE TO C##MIGRATION_USER;

-- Access to necessary V$ views
GRANT SELECT ON V_$LOG TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOGFILE TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOGMNR_CONTENTS TO C##MIGRATION_USER;
GRANT SELECT ON V_$ARCHIVED_LOG TO C##MIGRATION_USER;
GRANT SELECT ON V_$DATABASE TO C##MIGRATION_USER;
GRANT SELECT ON V_$LOG_HISTORY TO C##MIGRATION_USER;

-- Direct grants to specific DBA views
GRANT SELECT ON ALL_USERS TO C##MIGRATION_USER;
@@ -125,31 +130,160 @@ GRANT SELECT, FLASHBACK ON migration_schema.tbl TO MIGRATION_USER;
#### Configure source database for replication

<section class="filter-content" markdown="1" data-scope="postgres">
{{site.data.alerts.callout_info}}
Connect to the primary PostgreSQL instance, **not** a read replica. Read replicas cannot create or manage logical replication slots. To verify that you are connected to the primary, run `SELECT pg_is_in_recovery();` and confirm that it returns `false`, as shown in the example below.
{{site.data.alerts.end}}
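
For example, on the primary the following query returns `f` (false); on a read replica it returns `t`:

{% include_cached copy-clipboard.html %}
~~~ sql
SELECT pg_is_in_recovery();
~~~

~~~
 pg_is_in_recovery
-------------------
 f
~~~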

Enable logical replication by setting `wal_level` to `logical` in `postgresql.conf` or in the SQL shell. For example:

{% include_cached copy-clipboard.html %}
~~~ sql
ALTER SYSTEM SET wal_level = 'logical';
~~~
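
Note that `wal_level` takes effect only after a server restart. After restarting, you can confirm the setting:

{% include_cached copy-clipboard.html %}
~~~ sql
SHOW wal_level;
~~~

~~~
 wal_level
-----------
 logical
~~~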

Create a publication for the tables you want to replicate. Do this **before** creating the replication slot.

To create a publication for all tables:

{% include_cached copy-clipboard.html %}
~~~ sql
CREATE PUBLICATION molt_publication FOR ALL TABLES;
~~~

To create a publication for specific tables:

{% include_cached copy-clipboard.html %}
~~~ sql
CREATE PUBLICATION molt_publication FOR TABLE employees, payments, orders;
~~~
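
If you need to replicate another table later, you can add it to the existing publication. For example, with a hypothetical `invoices` table:

{% include_cached copy-clipboard.html %}
~~~ sql
ALTER PUBLICATION molt_publication ADD TABLE invoices;
~~~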

Create a logical replication slot:

{% include_cached copy-clipboard.html %}
~~~ sql
SELECT pg_create_logical_replication_slot('molt_slot', 'pgoutput');
~~~

##### Verify logical replication setup

Verify the publication was created successfully:

{% include_cached copy-clipboard.html %}
~~~ sql
SELECT * FROM pg_publication;
~~~

~~~
oid | pubname | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate | pubviaroot
-------+------------------+----------+--------------+-----------+-----------+-----------+-------------+------------
59084 | molt_publication | 10 | t | t | t | t | t | f
~~~

Verify the replication slot was created:

{% include_cached copy-clipboard.html %}
~~~ sql
SELECT * FROM pg_replication_slots;
~~~

~~~
slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | safe_wal_size | two_phase
-----------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+---------------+-----------
molt_slot | pgoutput | logical | 16385 | molt | f | f | | | 2261 | 0/49913A20 | 0/49913A58 | reserved | | f
~~~
</section>

<section class="filter-content" markdown="1" data-scope="mysql">
For MySQL **8.0 and later** sources, enable [global transaction identifiers (GTID)](https://dev.mysql.com/doc/refman/8.0/en/replication-options-gtids.html) consistency. Set the following values in `mysql.cnf`, in the SQL shell, or as flags in the `mysql` start command:
Enable [global transaction identifiers (GTID)](https://dev.mysql.com/doc/refman/8.0/en/replication-options-gtids.html) and configure binary logging. Set `binlog-row-metadata` or `binlog-row-image` to `full` to provide complete metadata for replication.

I think it may be worth calling out that it's also important to tune binlog retention: https://dba.stackexchange.com/a/206602

This affects whether the data for the GTID you specify is still available or has already been purged/rotated. It's also important to note that managed services like AWS RDS or GCP Cloud SQL have provider-specific ways of handling binlog retention:
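
For example, a sketch of checking and raising retention (assuming a self-managed MySQL 8.0+ server for the system variable, and the RDS stored procedure for the managed case):

~~~ sql
-- Self-managed MySQL 8.0+: retention is controlled in seconds
-- (default 2592000 = 30 days).
SELECT @@GLOBAL.binlog_expire_logs_seconds;
SET GLOBAL binlog_expire_logs_seconds = 604800;  -- retain 7 days

-- AWS RDS / Aurora MySQL: retention is managed with a stored procedure.
CALL mysql.rds_set_configuration('binlog retention hours', 168);
CALL mysql.rds_show_configuration;
~~~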


{{site.data.alerts.callout_info}}
GTID replication sends all database changes to Replicator. To limit replication to specific tables or schemas, use the `--table-filter` and `--schema-filter` flags in the `replicator` command.

Just a note that schema-filter and table-filter are not supported for replicator. This use case will actually require a userscript. Given we don't have userscripts documented right now, wondering how you want to proceed here? CC @Jeremyyang920 @rohan-joshi

{{site.data.alerts.end}}

| Version | Configuration |
|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| MySQL 5.6 | `--gtid-mode=on`<br>`--enforce-gtid-consistency=on`<br>`--server-id={unique_id}`<br>`--log-bin=mysql-binlog`<br>`--binlog-format=row`<br>`--binlog-row-image=full`<br>`--log-slave-updates=ON` |
| MySQL 5.7 | `--gtid-mode=on`<br>`--enforce-gtid-consistency=on`<br>`--binlog-row-image=full`<br>`--server-id={unique_id}`<br>`--log-bin=log-bin` |
| MySQL 8.0+ | `--gtid-mode=on`<br>`--enforce-gtid-consistency=on`<br>`--binlog-row-metadata=full` |
| MariaDB | `--log-bin`<br>`--server_id={unique_id}`<br>`--log-basename=master1`<br>`--binlog-format=row`<br>`--binlog-row-metadata=full` |
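
After applying the configuration, you can confirm that the settings took effect. The variable names below assume MySQL 8.0+; on 5.6/5.7, check `binlog_row_image` instead of `binlog_row_metadata`:

{% include_cached copy-clipboard.html %}
~~~ sql
SELECT @@GLOBAL.gtid_mode, @@GLOBAL.enforce_gtid_consistency, @@GLOBAL.binlog_row_metadata;
~~~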

- `--enforce-gtid-consistency=ON`
- `--gtid-mode=ON`
- `--binlog-row-metadata=full`
##### Verify MySQL GTID setup

For MySQL **5.7** sources, set the following values. Note that `binlog-row-image` is used instead of `binlog-row-metadata`. Set `server-id` to a unique integer that differs from any other MySQL server you have in your cluster (e.g., `3`).
Get the current GTID set to use as the starting point for replication:

- `--enforce-gtid-consistency=ON`
- `--gtid-mode=ON`
- `--binlog-row-image=full`
- `--server-id={ID}`
- `--log-bin=log-bin`
{% include_cached copy-clipboard.html %}
~~~ sql
-- For MySQL 8.1 and earlier:
SHOW MASTER STATUS;
-- For MySQL 8.2 and later:
SHOW BINARY LOG STATUS;
~~~

~~~
+---------------+----------+--------------+------------------+-------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------+----------+--------------+------------------+-------------------------------------------+
| binlog.000005 | 197 | | | 77263736-7899-11f0-81a5-0242ac120002:1-38 |
+---------------+----------+--------------+------------------+-------------------------------------------+
~~~

Use the `Executed_Gtid_Set` value for the `--defaultGTIDSet` flag in MOLT Replicator.

Just a note that this value will only be used if there is no GTID in the memo table in the staging database (i.e., `_replicator`). Otherwise, it will use the one in the memo table and keep track of advancing GTID checkpoints in memo. Is this called out elsewhere, or can we add a line about this here?

To force the system to respect the `defaultGTIDSet` you pass in, you can just clear the memo table and it will be as if it's a fresh run.
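
For example (a sketch; this assumes the staging schema uses its default name `_replicator` and that the checkpoint lives in its `memo` table, as described above):

~~~ sql
-- Run against the CockroachDB cluster that hosts the staging schema.
-- Clearing the memo table discards the stored GTID checkpoint, so the
-- next run falls back to the --defaultGTIDSet value.
DELETE FROM _replicator.memo WHERE true;
~~~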


To verify that a GTID set is valid and not purged, use the following queries:

Great, this section will be really helpful and would have helped some folks sanity check before raising an issue.


{% include_cached copy-clipboard.html %}
~~~ sql
-- Verify the GTID set is in the executed set
SELECT GTID_SUBSET('77263736-7899-11f0-81a5-0242ac120002:1-38', @@GLOBAL.gtid_executed) AS in_executed;

-- Verify the GTID set is not in the purged set
SELECT GTID_SUBSET('77263736-7899-11f0-81a5-0242ac120002:1-38', @@GLOBAL.gtid_purged) AS in_purged;
~~~

If `in_executed` returns `1` and `in_purged` returns `0`, the GTID set is valid for replication.
</section>

<section class="filter-content" markdown="1" data-scope="oracle">
##### Enable ARCHIVELOG and FORCE LOGGING

Deferring to @noelcrl to review the correctness here.

Looks correct overall, will clean up these commands a bit and clarify what is happening.


Enable `ARCHIVELOG` mode for LogMiner to access archived redo logs:

{% include_cached copy-clipboard.html %}
~~~ sql
-- Check current log mode
SELECT log_mode FROM v$database;

-- Enable ARCHIVELOG (requires database restart)
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

-- Verify ARCHIVELOG is enabled
SELECT log_mode FROM v$database; -- Expected: ARCHIVELOG
~~~

Enable supplemental logging for primary keys:

{% include_cached copy-clipboard.html %}
~~~ sql
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;

-- Verify supplemental logging
SELECT supplemental_log_data_min, supplemental_log_data_pk FROM v$database;
-- Expected: SUPPLEMENTAL_LOG_DATA_MIN: IMPLICIT (or YES), SUPPLEMENTAL_LOG_DATA_PK: YES
Comment on lines +270 to +274

Suggested change
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
-- Verify supplemental logging
SELECT supplemental_log_data_min, supplemental_log_data_pk FROM v$database;
-- Expected: SUPPLEMENTAL_LOG_DATA_MIN: IMPLICIT (or YES), SUPPLEMENTAL_LOG_DATA_PK: YES
-- Enable minimal supplemental logging for primary keys
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
-- Verify supplemental logging status
SELECT supplemental_log_data_min, supplemental_log_data_pk FROM v$database;
-- Expected:
-- SUPPLEMENTAL_LOG_DATA_MIN: IMPLICIT (or YES)
-- SUPPLEMENTAL_LOG_DATA_PK: YES

~~~

Enable `FORCE LOGGING` to ensure all changes are logged:

{% include_cached copy-clipboard.html %}
~~~ sql
ALTER DATABASE FORCE LOGGING;

-- Verify FORCE LOGGING is enabled
SELECT force_logging FROM v$database; -- Expected: YES
~~~

##### Create source sentinel table

Create a checkpoint table called `_replicator_sentinel` in the Oracle schema you will migrate:
@@ -243,6 +377,24 @@ CURRENT_SCN
1 row selected.
~~~

##### Get SCNs for replication startup

If you plan to use [initial data load](#start-fetch) followed by [replication](#start-replicator), obtain the correct SCNs **before** starting the initial data load to ensure no active transactions are missed. Run the following queries on the PDB in the order shown:

{% include_cached copy-clipboard.html %}
~~~ sql
-- Query the current SCN from Oracle
SELECT CURRENT_SCN FROM V$DATABASE;

-- Query the starting SCN of the earliest active transaction
SELECT MIN(t.START_SCNB) FROM V$TRANSACTION t;
~~~
Comment on lines +386 to +391

@noelcrl noelcrl Oct 7, 2025

There could be a correctness issue here; the following should work instead:

-- 1) Capture an SCN before inspecting active transactions
SELECT CURRENT_SCN AS before_active_scn FROM V$DATABASE;

-- 2) Find the earliest active transaction start SCN
SELECT MIN(t.START_SCNB) AS earliest_active_scn FROM V$TRANSACTION t;

-- 3) Capture the snapshot SCN after the checks
SELECT CURRENT_SCN AS snapshot_scn FROM V$DATABASE;


Use the results as follows:

@noelcrl noelcrl Oct 7, 2025

Suggested change
Use the results as follows:
Use the query results by providing the following flag values to `replicator`:


- `--scn`: Use the result from the first query (current SCN)
- `--backfillFromSCN`: Use the result from the second query (earliest active transaction SCN). If the second query returns no results, use the result from the first query instead.
Comment on lines +395 to +396

@noelcrl noelcrl Oct 7, 2025

Suggested change
- `--scn`: Use the result from the first query (current SCN)
- `--backfillFromSCN`: Use the result from the second query (earliest active transaction SCN). If the second query returns no results, use the result from the first query instead.
Compute the flags for replicator as follows:
--backfillFromSCN: use the smaller value between `before_active_scn` and `earliest_active_scn`. If `earliest_active_scn` has no value, use `before_active_scn`.
--scn: use `snapshot_scn`.
Make sure --scn is greater than or equal to --backfillFromSCN.


Add the redo log files to LogMiner, using the redo log file paths you queried:

{% include_cached copy-clipboard.html %}
2 changes: 2 additions & 0 deletions src/current/_includes/molt/migration-stop-replication.md
@@ -1,4 +1,6 @@
{% if page.name != "migrate-failback.md" %}
1. Stop application traffic to your source database. **This begins downtime.**
{% endif %}

1. Wait for replication to drain, which means that all transactions that occurred on the source database have been fully processed and replicated to CockroachDB. There are two ways to determine that replication has fully drained:
- When replication is caught up, you will not see new `upserted rows` logs.
2 changes: 2 additions & 0 deletions src/current/_includes/molt/molt-connection-strings.md
@@ -14,6 +14,8 @@ For example:
~~~
--source 'postgres://migration_user:password@localhost:5432/molt?sslmode=verify-full'
~~~

The source connection must point to the PostgreSQL primary instance, not a read replica.

Well, we do have a flag that can skip replication setup for cases where folks just want a data load and don't need any replication setup or information. Should we clarify this? CC @Jeremyyang920

</section>

<section class="filter-content" markdown="1" data-scope="mysql">
20 changes: 10 additions & 10 deletions src/current/_includes/molt/molt-docker.md
@@ -1,8 +1,8 @@
For details on pulling Docker images, see [Docker image](#docker-image).
For details on pulling Docker images, refer to [Docker images](#docker-images).

### Performance

MOLT Fetch and Verify are likely to run more slowly in a Docker container than on a local machine. To improve performance, increase the memory or compute resources, or both, on your Docker container.
MOLT Fetch, Verify, and Replicator are likely to run more slowly in a Docker container than on a local machine. To improve performance, increase the memory or compute resources, or both, on your Docker container.

{% if page.name == "molt-fetch.md" %}
### Authentication
@@ -67,14 +67,14 @@ When testing locally, specify the host as follows:

- For macOS, use `host.docker.internal`. For example:

~~~
--source 'postgres://postgres:postgres@host.docker.internal:5432/molt?sslmode=disable'
--target "postgres://root@host.docker.internal:26257/molt?sslmode=disable"
~~~
~~~
--source 'postgres://postgres:postgres@host.docker.internal:5432/molt?sslmode=disable'
--target "postgres://root@host.docker.internal:26257/molt?sslmode=disable"
~~~

- For Linux and Windows, use `172.17.0.1`. For example:

~~~
--source 'postgres://postgres:postgres@172.17.0.1:5432/molt?sslmode=disable'
--target "postgres://root@172.17.0.1:26257/molt?sslmode=disable"
~~~
~~~
--source 'postgres://postgres:postgres@172.17.0.1:5432/molt?sslmode=disable'
--target "postgres://root@172.17.0.1:26257/molt?sslmode=disable"
~~~
43 changes: 26 additions & 17 deletions src/current/_includes/molt/molt-install.md
@@ -14,20 +14,7 @@ The following binaries are included:
- `molt`
- `replicator`

Both `molt` and `replicator` must be in your current **working directory**. To use replication features, `replicator` must be located either in the same directory as `molt` or in a directory directly beneath `molt`. For example, either of the following would be valid:

~~~
/migration-project/ # Your current working directory
├── molt # MOLT binary
└── replicator # Replicator binary
~~~

~~~
/migration-project/ # Your current working directory
├── molt # MOLT binary
└── bin/ # Subdirectory
└── replicator # Replicator binary
~~~
Both `molt` and `replicator` must be in your current **working directory**.

After separating the two, technically these two no longer need to be together in the same working directory. But it's still easier to keep them together.


To display the current version of each binary, run `molt --version` and `replicator --version`.

@@ -39,16 +26,19 @@ MOLT Fetch is supported on Red Hat Enterprise Linux (RHEL) 9 and above.
{{site.data.alerts.end}}
{% endif %}

### Docker image
### Docker images

[Docker multi-platform images](https://hub.docker.com/r/cockroachdb/molt/tags) containing both the AMD and ARM binaries are available. To pull the latest image for PostgreSQL and MySQL:
{% if page.name != "molt-replicator.md" %}
#### MOLT Fetch

[Docker multi-platform images](https://hub.docker.com/r/cockroachdb/molt/tags) containing both the AMD and ARM `molt` and `replicator` binaries are available. To pull the latest image for PostgreSQL and MySQL:

{% include_cached copy-clipboard.html %}
~~~ shell
docker pull cockroachdb/molt
~~~

To pull a specific version (e.g., `1.1.3`):
To pull a specific version (for example, `1.1.3`):

{% include_cached copy-clipboard.html %}
~~~ shell
@@ -61,5 +51,24 @@ To pull the latest image for Oracle (note that only `linux/amd64` is supported):
~~~ shell
docker pull cockroachdb/molt:oracle-latest
~~~
{% endif %}

{% if page.name != "molt-fetch.md" %}
#### MOLT Replicator

[Docker images for MOLT Replicator](https://hub.docker.com/r/cockroachdb/replicator/tags) are also available; these contain the standalone `replicator` binary:

{% include_cached copy-clipboard.html %}
~~~ shell
docker pull cockroachdb/replicator
~~~

To pull a specific version (for example, `v1.1.1`):

{% include_cached copy-clipboard.html %}
~~~ shell
docker pull cockroachdb/replicator:v1.1.1
~~~
{% endif %}

{% if page.name != "molt.md" %}For details on running in Docker, refer to [Docker usage](#docker-usage).{% endif %}