@@ -1147,14 +1147,39 @@ export const database: NavMenuConstant = {
items: [
{ name: 'Overview', url: '/guides/database/replication' },
{
name: 'Setting up replication',
url: '/guides/database/replication/setting-up-replication' as `/${string}`,
name: 'ETL Replication',
url: '/guides/database/replication/etl-replication-setup' as `/${string}`,
items: [
{
name: 'Setting up',
url: '/guides/database/replication/etl-replication-setup' as `/${string}`,
},
{
name: 'Destinations',
url: '/guides/database/replication/etl-destinations' as `/${string}`,
},
{
name: 'Monitoring',
url: '/guides/database/replication/etl-replication-monitoring' as `/${string}`,
},
{ name: 'FAQ', url: '/guides/database/replication/etl-replication-faq' },
],
},
{
name: 'Monitoring replication',
url: '/guides/database/replication/monitoring-replication' as `/${string}`,
name: 'Manual Replication',
url: '/guides/database/replication/manual-replication-setup' as `/${string}`,
items: [
{
name: 'Setting up',
url: '/guides/database/replication/manual-replication-setup' as `/${string}`,
},
{
name: 'Monitoring',
url: '/guides/database/replication/manual-replication-monitoring' as `/${string}`,
},
{ name: 'FAQ', url: '/guides/database/replication/manual-replication-faq' },
],
},
{ name: 'FAQ', url: '/guides/database/replication/faq' },
],
},
{
42 changes: 29 additions & 13 deletions apps/docs/content/guides/database/replication.mdx
@@ -1,39 +1,55 @@
---
title: 'Replication and change data capture'
description: 'An introduction to logical replication and change data capture'
id: 'replication'
title: 'Database Replication'
description: 'Replicate your database to external destinations using ETL or manual replication.'
subtitle: 'An introduction to database replication and change data capture.'
sidebar_label: 'Overview'
---

Replication is the process of copying changes from your database to another location. It's also referred to as change data capture (CDC): capturing all the changes that occur to your data.

## Use cases

You might use replication for:
You might use database replication for:

- **Analytics and Data Warehousing**: Replicate your operational database to analytics platforms for complex analysis without impacting your application's performance.
- **Data Integration**: Keep your data synchronized across different systems and services in your tech stack.
- **Backup and Disaster Recovery**: Maintain up-to-date copies of your data in different locations.
- **Read Scaling**: Distribute read operations across multiple database instances to improve performance.

## Replication in Postgres
## Replication methods

Postgres comes with built-in support for replication via publications and replication slots. Refer to the [Concepts and terms](#concepts-and-terms) section to learn how replication works.
Supabase supports two methods for replicating your database to external destinations:

## Setting up and monitoring replication in Supabase
### ETL replication

- [Setting up replication](/docs/guides/database/replication/setting-up-replication)
- [Monitoring replication](/docs/guides/database/replication/monitoring-replication)
<Admonition type="caution" label="Private Alpha">

<Admonition type="tip">

If you want to set up a read replica, see [Read Replicas](/docs/guides/platform/read-replicas) instead. If you want to sync your data in real time to a client such as a browser or mobile app, see [Realtime](/docs/guides/realtime) instead. For configuring replication to an ETL destination, use the [Dashboard](/dashboard/project/_/database/replication).
ETL Replication is currently in private alpha. Access is limited and features may change.

</Admonition>

Use Supabase ETL to automatically replicate data to supported systems.

- [Set up ETL Replication](/docs/guides/database/replication/etl-replication-setup)

### Manual replication

Configure your own replication using external tools and Postgres's native logical replication. This gives you full control over the replication process and allows you to use any tool that supports Postgres logical replication.

- [Set up Manual Replication](/docs/guides/database/replication/manual-replication-setup)
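As a minimal sketch of the Postgres side of a manual setup (object names below are illustrative), you typically create a publication for the tables you want to replicate; the external tool then creates a logical replication slot that streams those changes:

```sql
-- Publish changes from the tables you want to replicate
-- (publication and table names are illustrative).
CREATE PUBLICATION example_publication
  FOR TABLE public.orders, public.customers;

-- External tools usually create a logical replication slot that streams the
-- published changes; you can inspect existing slots at any time:
SELECT slot_name, plugin, slot_type, active
FROM pg_replication_slots;
```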

## Related features

Choose the data syncing method based on your use case:

- For realtime features and syncing data to clients (browsers, mobile apps), see [Realtime](/docs/guides/realtime)
- For deploying read-only databases across multiple regions, see [Read Replicas](/docs/guides/platform/read-replicas)

## Concepts and terms

### Write-Ahead Log (WAL)

Postgres uses a system called the Write-Ahead Log (WAL) to manage changes to the database. As you make changes, they are appended to the WAL (which is a series of files (also called "segments"), where the file size can be specified). Once one segment is full, Postgres will start appending to a new segment. After a period of time, a checkpoint occurs and Postgres synchronizes the WAL with your database. Once the checkpoint is complete, then the WAL files can be removed from disk and free up space.
Postgres uses a system called the Write-Ahead Log (WAL) to manage changes to the database. As you make changes, they are appended to the WAL, which is a series of files (also called "segments") whose size can be configured. Once one segment is full, Postgres starts appending to a new segment. After a period of time, a checkpoint occurs and Postgres synchronizes the WAL with your database. Once the checkpoint is complete, the WAL files can be removed from disk to free up space.
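For example, you can inspect the WAL configuration and current write position directly from SQL:

```sql
-- How much information is written to the WAL
-- (logical replication requires wal_level = 'logical')
SHOW wal_level;

-- The size of each WAL segment file
SHOW wal_segment_size;

-- The current write position (LSN) in the WAL
SELECT pg_current_wal_lsn();
```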

### Logical replication and WAL

126 changes: 126 additions & 0 deletions apps/docs/content/guides/database/replication/etl-bigquery.mdx
@@ -0,0 +1,126 @@
---
id: 'etl-bigquery'
title: 'ETL to BigQuery'
description: 'Replicate your Supabase database to Google BigQuery using ETL Replication.'
subtitle: 'Stream data changes to BigQuery in real-time.'
sidebar_label: 'BigQuery'
---

<Admonition type="caution" label="Private Alpha">

ETL Replication is currently in private alpha. Access is limited and features may change.

</Admonition>

BigQuery is Google's fully managed data warehouse. ETL Replication allows you to automatically sync your Supabase database tables to BigQuery for analytics and reporting.

<Admonition type="tip">

This page covers BigQuery-specific configuration. For complete setup instructions including publications, general settings, and pipeline management, see the [ETL Replication Setup guide](/docs/guides/database/replication/etl-replication-setup).

</Admonition>

### Setup

Setting up BigQuery replication requires preparing your GCP resources, then configuring BigQuery as an ETL destination.

#### Step 1: Prepare GCP resources

Before configuring BigQuery as a destination, set up the following in Google Cloud Platform:

1. **Google Cloud Platform (GCP) account**: [Sign up for GCP](https://cloud.google.com/gcp) if you don't have one
2. **BigQuery dataset**: Create a [BigQuery dataset](https://cloud.google.com/bigquery/docs/datasets-intro) in your GCP project
3. **GCP service account key**: Create a [service account](https://cloud.google.com/iam/docs/keys-create-delete) with the **BigQuery Data Editor** role and download the JSON key file

#### Step 2: Add BigQuery as an ETL destination

After preparing your GCP resources, configure BigQuery as an ETL destination:

1. Navigate to [Database](/dashboard/project/_/database/etl) → **ETL Replication** in your Supabase Dashboard
2. Click **Add destination**
3. Configure the destination:

<Image
alt="BigQuery Configuration Settings"
src="/docs/img/database/replication/etl-bigquery-details.png"
zoomable
/>

- **Destination type**: Select **BigQuery**
- **Project ID**: Your BigQuery project identifier (found in the GCP Console)
- **Dataset ID**: The name of your BigQuery dataset (without the project ID)

<Admonition type="note">

In the GCP Console, the dataset is shown as `project-id.dataset-id`. Enter only the part after the dot. For example, if you see `my-project.my_dataset`, enter `my_dataset`.

</Admonition>

- **Service Account Key**: Your GCP service account key in JSON format. The service account must have the following permissions:
- `bigquery.datasets.get`
- `bigquery.tables.create`
- `bigquery.tables.get`
- `bigquery.tables.getData`
- `bigquery.tables.update`
- `bigquery.tables.updateData`

4. Complete the remaining configuration following the [ETL Replication Setup guide](/docs/guides/database/replication/etl-replication-setup)

### How it works

Once configured, ETL Replication to BigQuery:

1. Captures changes from your Postgres database (INSERT, UPDATE, DELETE operations)
2. Batches changes for optimal performance
3. Creates BigQuery tables automatically to match your Postgres schema
4. Streams data to BigQuery with CDC metadata

<Admonition type="note">

Due to ingestion latency in BigQuery's streaming API, there may be a delay (typically seconds to minutes) before data appears. This is normal and expected for BigQuery's architecture.

</Admonition>

#### BigQuery CDC format

BigQuery tables include additional columns for change tracking:

- `_change_type`: The type of change (`INSERT`, `UPDATE`, `DELETE`)
- `_commit_timestamp`: When the change was committed in Postgres
- `_stream_id`: Internal identifier for the replication stream

### Querying replicated data

Once replication is running, you can query your data in BigQuery:

```sql
-- Query the replicated table
SELECT * FROM `your-project.your_dataset.users`
WHERE created_at > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY);

-- View CDC changes
SELECT
_change_type,
_commit_timestamp,
id,
name,
email
FROM `your-project.your_dataset.users`
ORDER BY _commit_timestamp DESC
LIMIT 100;
```

### Limitations

BigQuery-specific limitations:

- **Ingestion latency**: BigQuery's streaming API has inherent latency (typically seconds to minutes)
- **Row size**: Limited to 10 MB per row due to BigQuery Storage Write API constraints
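
If you're unsure whether a source table might hit the row-size limit, you can check its largest stored rows on the Postgres side before enabling replication. A rough sketch (the table and `id` column are illustrative, and stored size is only an approximation of what is sent to BigQuery):

```sql
-- Approximate per-row stored size for the largest rows in a source table
-- (table and column names are illustrative). Stored size reflects Postgres
-- compression/TOAST, so treat it as an estimate.
SELECT id, pg_column_size(t.*) AS row_bytes
FROM public.users AS t
ORDER BY row_bytes DESC
LIMIT 10;
```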

For general ETL Replication limitations that apply to all destinations, see the [ETL Replication Setup guide](/docs/guides/database/replication/etl-replication-setup#limitations).

### Next steps

- [Set up ETL Replication](/docs/guides/database/replication/etl-replication-setup)
- [Monitor ETL Replication](/docs/guides/database/replication/etl-replication-monitoring)
- [View ETL Replication FAQ](/docs/guides/database/replication/etl-replication-faq)
32 changes: 32 additions & 0 deletions apps/docs/content/guides/database/replication/etl-destinations.mdx
@@ -0,0 +1,32 @@
---
id: 'etl-destinations'
title: 'ETL Destinations'
description: 'Choose where to replicate your database with ETL Replication.'
subtitle: 'Available destinations for ETL Replication.'
sidebar_label: 'Destinations'
---

<Admonition type="caution" label="Private Alpha">

ETL Replication is currently in private alpha. Access is limited and features may change.

</Admonition>

ETL Replication supports multiple destination types for syncing your database. Choose the destination that best fits your analytics and integration needs.

<Admonition type="note">

Some destinations may not be available for all users. Additional destinations are planned, but we don't have public timelines to share at this time.

</Admonition>

### Available destinations

| Destination | Description | Configuration |
| ------------------------------- | ---------------------------------------------- | -------------------------------------------------------------------- |
| **Iceberg (Analytics Buckets)** | Apache Iceberg tables in S3-compatible storage | [Configure Iceberg →](/docs/guides/database/replication/etl-iceberg) |

### Next steps

- [Set up ETL Replication](/docs/guides/database/replication/etl-replication-setup)
- [Monitor ETL Replication](/docs/guides/database/replication/etl-replication-monitoring)
88 changes: 88 additions & 0 deletions apps/docs/content/guides/database/replication/etl-iceberg.mdx
@@ -0,0 +1,88 @@
---
id: 'etl-iceberg'
title: 'ETL to Iceberg (Analytics Buckets)'
description: 'Replicate your Supabase database to Iceberg format using Analytics Buckets.'
subtitle: 'Stream data to Analytics Buckets.'
sidebar_label: 'Iceberg'
---

<Admonition type="caution" label="Private Alpha">

ETL Replication is currently in private alpha. Access is limited and features may change.

</Admonition>

<Admonition type="caution" label="Current Limitation">

Iceberg replication is currently incomplete. It provides an append-only log listing all your data changes with an additional column explaining the type of operation (INSERT, UPDATE, DELETE).

</Admonition>

Apache Iceberg is an open table format for analytic datasets. ETL Replication to Iceberg uses Supabase [Analytics Buckets](/docs/guides/storage/analytics) to store your replicated data.

<Admonition type="tip">

This page covers Iceberg-specific configuration. For complete setup instructions including publications, general settings, and pipeline management, see the [ETL Replication Setup guide](/docs/guides/database/replication/etl-replication-setup).

</Admonition>

### Setup

Setting up Iceberg replication requires two steps: creating an Analytics Bucket, then configuring it as an ETL destination.

#### Step 1: Create an Analytics bucket

First, create an Analytics Bucket to store your replicated data:

1. Navigate to [Storage](/dashboard/project/_/storage/buckets) → **Analytics** in your Supabase Dashboard
2. Click **New bucket**

<Image
alt="Create New Analytics Bucket"
src="/docs/img/database/replication/etl-iceberg-new-bucket.png"
zoomable
/>

After clicking **New bucket**, fill in the bucket details and copy the credentials:

3. Fill in the bucket details:

<Image
alt="Analytics Bucket Details"
src="/docs/img/database/replication/etl-iceberg-details.png"
zoomable
/>

- **Name**: A unique name for your bucket
- **Region**: Select the region where your data will be stored

4. Click **Create bucket**
5. **Copy the credentials** displayed after bucket creation (Catalog Token, S3 Access Key ID, S3 Secret Access Key). You'll need these in the next step.

#### Step 2: Add Iceberg as an ETL destination

With the bucket created, configure it as an ETL destination:

1. Navigate to [Database](/dashboard/project/_/database/etl) → **ETL Replication** in your Supabase Dashboard
2. Click **Add destination**
3. Configure the destination:
- **Destination type**: Select **Iceberg (Analytics Bucket)**
- **Bucket**: The name of your Analytics Bucket from Step 1
- **Namespace**: The schema name where your tables will be replicated (e.g., `public`)
- **Catalog Token**: Authentication token for accessing the Iceberg catalog (copied in Step 1)
- **S3 Access Key ID**: Access key for S3-compatible storage (copied in Step 1)
- **S3 Secret Access Key**: Secret key for S3-compatible storage (copied in Step 1)
4. Complete the remaining configuration following the [ETL Replication Setup guide](/docs/guides/database/replication/etl-replication-setup)

For more information about Analytics Buckets, see the [Analytics Buckets documentation](/docs/guides/storage/analytics).

### Limitations

Iceberg-specific limitations:

- **Append-only log**: Currently provides an append-only log format rather than a full table representation
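
In practice this means consumers read a changelog rather than the table's current state, so downstream queries typically filter or aggregate on the operation-type column. A hypothetical sketch, run from an Iceberg-aware SQL engine; the namespace, table, and column names are placeholders rather than the actual replicated schema:

```sql
-- Hypothetical: summarize the append-only changelog by operation type.
-- Namespace, table, and column names are placeholders; check the replicated
-- table's actual schema for the real names.
SELECT operation_type, COUNT(*) AS changes
FROM example_namespace.users
GROUP BY operation_type;
```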

For general ETL Replication limitations that apply to all destinations, see the [ETL Replication Setup guide](/docs/guides/database/replication/etl-replication-setup#limitations).

### Next steps

- [Set up ETL Replication](/docs/guides/database/replication/etl-replication-setup)
- [Monitor ETL Replication](/docs/guides/database/replication/etl-replication-monitoring)
- [View ETL Replication FAQ](/docs/guides/database/replication/etl-replication-faq)