ticdc: add header line for CSV protocol #20478

Status: Open. 3 commits to be merged into base branch `master`.
20 changes: 20 additions & 0 deletions ticdc/ticdc-csv.md
@@ -29,6 +29,7 @@ null = '\N'
include-commit-ts = true
binary-encoding-method = 'base64'
output-old-value = false
output-field-header = false
```

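## Transactional constraints on data storage
@@ -52,6 +53,12 @@ In the CSV file, each column of a single row is defined as follows:
- Column 5: `is-update`. This column exists only when `output-old-value` is true. It indicates whether the row change comes from an Update event (value true) or from an Insert/Delete event (value false).
- Column 6 to the last column: the columns of the changed data, which can be one or more columns.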
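
The positional layout above can be sketched in Python as follows. This is a minimal illustration, not TiCDC code: `split_row` is a hypothetical helper, and it assumes the first four columns are operation, table, schema, and commit-ts (present only when `include-commit-ts` is true), in the order the `ticdc-meta$` header names suggest.

```python
import csv

def split_row(fields, include_commit_ts=True, output_old_value=False):
    """Split one parsed CSV row into metadata and changed-data columns.

    Hypothetical helper: the column order is assumed from the documented layout.
    """
    meta = {"operation": fields[0], "table": fields[1], "schema": fields[2]}
    idx = 3
    if include_commit_ts:
        # Assumed: column 4 exists only when include-commit-ts is true.
        meta["commit-ts"] = int(fields[idx])
        idx += 1
    if output_old_value:
        # Column 5 exists only when output-old-value is true.
        meta["is-update"] = fields[idx] == "true"
        idx += 1
    # Everything that remains is changed data, one or more columns.
    return meta, fields[idx:]

line = '"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"'
fields = next(csv.reader([line]))
meta, data = split_row(fields, include_commit_ts=True, output_old_value=True)
print(meta)
print(data)
```

Because the optional columns shift every later position, a consumer that reads files without a header row needs to know the changefeed's `include-commit-ts` and `output-old-value` settings.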

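When `output-field-header = true` is set in the configuration, the CSV file includes a header row. The column names in the header row are as follows:

| Column 1 | Column 2 | Column 3 | Column 4 (optional) | Column 5 (optional) | Column 6 | ... | Last column |
| --- | --- | --- | --- | --- | --- | --- | --- |
| `ticdc-meta$operation` | `ticdc-meta$table` | `ticdc-meta$schema` | `ticdc-meta$commit-ts` | `ticdc-meta$is-update` | Name of the first column with data changes | ... | Name of the last column with data changes |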
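
As a rough sketch of how the header row is composed, the following Python function assembles the column names from the table above. `build_header` is a hypothetical name for illustration, not a TiCDC function. It assumes the two optional columns are controlled by `include-commit-ts` and `output-old-value` respectively.

```python
def build_header(data_columns, include_commit_ts=True, output_old_value=False):
    """Return the header column names for one table's CSV file (illustrative)."""
    header = ["ticdc-meta$operation", "ticdc-meta$table", "ticdc-meta$schema"]
    if include_commit_ts:
        # Assumed: the optional column 4 follows include-commit-ts.
        header.append("ticdc-meta$commit-ts")
    if output_old_value:
        # The optional column 5 follows output-old-value.
        header.append("ticdc-meta$is-update")
    # The remaining names are the table's changed-data columns, in order.
    header.extend(data_columns)
    return header

employee_columns = ["Id", "LastName", "FirstName", "HireDate", "OfficeLocation"]
header_line = ",".join(build_header(employee_columns, True, True))
print(header_line)
```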

Suppose the table `hr.employee` is defined as follows:

```sql
@@ -86,6 +93,19 @@ CREATE TABLE `employee` (
"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"
```

When `include-commit-ts = true`, `output-old-value = true`, and `output-field-header = true` are all set, the DML events on this table are stored in CSV format as follows:

```
ticdc-meta$operation,ticdc-meta$table,ticdc-meta$schema,ticdc-meta$commit-ts,ticdc-meta$is-update,Id,LastName,FirstName,HireDate,OfficeLocation
"I","employee","hr",433305438660591626,false,101,"Smith","Bob","2014-06-04","New York"
"D","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Shanghai"
"I","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Los Angeles"
"D","employee","hr",433305438660591629,false,101,"Smith","Bob","2017-03-13","Dallas"
"I","employee","hr",433305438660591630,false,102,"Alex","Alice","2017-03-14","Shanghai"
"D","employee","hr",433305438660591630,true,102,"Alex","Alice","2017-03-14","Beijing"
"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"
```
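
With the header row present, a consumer can address columns by name instead of by position. The sketch below reads the sample above with Python's standard `csv.DictReader` and pairs the `D` and `I` rows that carry `is-update = true`. The pairing rule (a `D` row followed by an `I` row with the same schema, table, and commit-ts) is an assumption read off the sample rows, and `read_changes` and `pair_updates` are hypothetical helper names.

```python
import csv
import io

SAMPLE = "\n".join([
    'ticdc-meta$operation,ticdc-meta$table,ticdc-meta$schema,ticdc-meta$commit-ts,ticdc-meta$is-update,Id,LastName,FirstName,HireDate,OfficeLocation',
    '"I","employee","hr",433305438660591626,false,101,"Smith","Bob","2014-06-04","New York"',
    '"D","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Shanghai"',
    '"I","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Los Angeles"',
    '"D","employee","hr",433305438660591629,false,101,"Smith","Bob","2017-03-13","Dallas"',
    '"I","employee","hr",433305438660591630,false,102,"Alex","Alice","2017-03-14","Shanghai"',
    '"D","employee","hr",433305438660591630,true,102,"Alex","Alice","2017-03-14","Beijing"',
    '"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"',
])

def read_changes(text):
    """Parse CSV text whose first line is the header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def pair_updates(rows):
    """Pair old-value (D) and new-value (I) rows flagged as updates.

    Simplification: rows are matched on (schema, table, commit-ts) only, so
    several updates to one table in the same transaction would need a finer key.
    """
    pending = {}
    updates = []
    for row in rows:
        if row["ticdc-meta$is-update"] != "true":
            continue
        key = (row["ticdc-meta$schema"], row["ticdc-meta$table"], row["ticdc-meta$commit-ts"])
        if row["ticdc-meta$operation"] == "D":
            pending[key] = row
        elif row["ticdc-meta$operation"] == "I" and key in pending:
            updates.append((pending.pop(key), row))
    return updates

rows = read_changes(SAMPLE)
updates = pair_updates(rows)
for old, new in updates:
    print(old["Id"], old["OfficeLocation"], "->", new["OfficeLocation"])
```

All values come back as strings, so numeric columns such as `Id` and `ticdc-meta$commit-ts` need explicit conversion where the consumer requires it.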

## Data type mapping

| MySQL type | CSV type | Example | Description |