|
| 1 | +# CloudSQL MySQL Batch Sink |
| 2 | + |
| 3 | + |
| 4 | +Description |
| 5 | +----------- |
| 6 | +Writes records to a CloudSQL MySQL table. Each record will be written to a row in the table. |
| 7 | + |
| 8 | + |
| 9 | +Use Case |
| 10 | +-------- |
| 11 | +This sink is used whenever you need to write to a CloudSQL MySQL table. |
| 12 | +Suppose you periodically build a recommendation model for products on your online store. |
| 13 | +The model is stored in a GCS bucket and you want to export the contents |
| 14 | +of the bucket to a CloudSQL MySQL table where it can be served to your users. |
| 15 | + |
| 16 | +Column names are autodetected from the input schema. |
| 17 | + |
| 18 | +Properties |
| 19 | +---------- |
| 20 | +**Reference Name:** Name used to uniquely identify this sink for lineage, annotating metadata, etc. |
| 21 | + |
| 22 | +**Driver Name:** Name of the JDBC driver to use. |
| 23 | + |
| 24 | +**Database:** MySQL database name. |
| 25 | + |
| 26 | +**Connection Name:** The CloudSQL instance to connect to, in the format `<PROJECT_ID>:<REGION>:<INSTANCE_NAME>`. |
| 27 | +It can be found on the instance overview page. |
| 28 | + |
| 29 | +**CloudSQL Instance Type:** Whether the CloudSQL instance to connect to is private or public. Defaults to 'Public'. |
| 30 | + |
| 31 | +**Table Name:** Name of the table to export to. Table must exist prior to running the pipeline. |
| 32 | + |
| 33 | +**Username:** User identity for connecting to the specified database. |
| 34 | + |
| 35 | +**Password:** Password to use to connect to the specified database. |
| 36 | + |
| 37 | +**Transaction Isolation Level:** Transaction isolation level for queries run by this sink. |
| 38 | + |
| 39 | +**Connection Timeout:** The timeout value (in seconds) used for socket connect operations. If connecting to the server |
| 40 | +takes longer than this value, the connection attempt fails. A value of 0 disables the timeout. |
| 41 | + |
| 42 | +**Connection Arguments:** A list of arbitrary string key/value pairs as connection arguments. These arguments |
| 43 | +will be passed to the JDBC driver as connection arguments for JDBC drivers that may need additional configurations. |
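As a minimal sketch of the Connection Name format described above, the shell snippet below assembles the value from its three parts. The project, region, and instance values are hypothetical placeholders and are not taken from this plugin. If the Cloud SDK is installed, `gcloud sql instances describe INSTANCE_NAME --format="value(connectionName)"` should print the same string for a real instance.

```shell
# Hypothetical placeholder values; substitute your own.
PROJECT_ID="my-project"
REGION="us-central1"
INSTANCE_NAME="my-mysql-instance"

# The Connection Name property joins the three parts with colons.
CONNECTION_NAME="${PROJECT_ID}:${REGION}:${INSTANCE_NAME}"
echo "${CONNECTION_NAME}"
```

The resulting string is what goes into the Connection Name property, whether the instance is public or private.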
| 44 | + |
| 45 | + |
| 46 | +Data Types Mapping |
| 47 | +------------------ |
| 48 | + |
| 49 | + | MySQL Data Type | CDAP Schema Data Type | |
| 50 | + | ------------------------------ | --------------------- | |
| 51 | + | BIT | boolean | |
| 52 | + | TINYINT | int | |
| 53 | + | BOOL, BOOLEAN | boolean | |
| 54 | + | SMALLINT | int | |
| 55 | + | MEDIUMINT | double | |
| 56 | + | INT, INTEGER | int | |
| 57 | + | BIGINT | long | |
| 58 | + | FLOAT | float | |
| 59 | + | DOUBLE | double | |
| 60 | + | DECIMAL | decimal | |
| 61 | + | DATE | date | |
| 62 | + | DATETIME | timestamp | |
| 63 | + | TIMESTAMP | timestamp | |
| 64 | + | TIME | time | |
| 65 | + | YEAR | date | |
| 66 | + | CHAR | string | |
| 67 | + | VARCHAR | string | |
| 68 | + | BINARY | bytes | |
| 69 | + | VARBINARY | bytes | |
| 70 | + | TINYBLOB | bytes | |
| 71 | + | TINYTEXT | string | |
| 72 | + | BLOB | bytes | |
| 73 | + | TEXT | string | |
| 74 | + | MEDIUMBLOB | bytes | |
| 75 | + | MEDIUMTEXT | string | |
| 76 | + | LONGBLOB | bytes | |
| 77 | + | LONGTEXT | string | |
| 78 | + | ENUM | string | |
| 79 | + | SET | string | |
| 80 | + |
| 81 | + |
| 82 | + |
| 83 | +Examples |
| 84 | +-------- |
| 85 | +**Connecting to a public CloudSQL MySQL instance** |
| 86 | + |
| 87 | +Suppose you want to write output records to the "users" table of a CloudSQL MySQL database named "prod" as the "root" user with |
| 88 | +the "root" password. Get the latest version of the CloudSQL socket factory jar with driver and dependencies |
| 89 | +[here](https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/releases), then configure the plugin with: |
| 90 | + |
| 91 | + |
| 92 | +``` |
| 93 | +Reference Name: "sink1" |
| 94 | +Driver Name: "cloudsql-mysql" |
| 95 | +Connection Name: [PROJECT_ID]:[REGION_ID]:[INSTANCE_NAME] |
| 96 | +CloudSQL Instance Type: "Public" |
| 97 | +Database: "prod" |
| 98 | +Table Name: "users" |
| 99 | +Username: "root" |
| 100 | +Password: "root" |
| 101 | +``` |
| 102 | + |
| 103 | + |
| 104 | +**Connecting to a private CloudSQL MySQL instance** |
| 105 | + |
| 106 | +If you want to connect to a private CloudSQL MySQL instance, create a Compute Engine VM that runs the CloudSQL Proxy |
| 107 | +Docker image using the following command: |
| 108 | + |
| 109 | +``` |
| 110 | +# Set the environment variables |
| 111 | +export PROJECT=[project_id] |
| 112 | +export REGION=[vm-region] |
| 113 | +export ZONE=`gcloud compute zones list --filter="name=${REGION}" --limit \ |
| 114 | +1 --uri --project=${PROJECT}| sed 's/.*\///'` |
| 115 | +export SUBNET=[vpc-subnet-name] |
| 116 | +export NAME=[gce-vm-name] |
| 117 | +export MYSQL_CONN=[mysql-instance-connection-name] |
| 118 | + |
| 119 | +# Create a Compute Engine VM |
| 120 | +gcloud beta compute --project=${PROJECT} instances create ${NAME} \ |
| 121 | +--zone=${ZONE} --machine-type=g1-small --subnet=${SUBNET} --no-address \ |
| 122 | +--metadata=startup-script="docker run -d -p 0.0.0.0:3306:3306 \ |
| 123 | +gcr.io/cloudsql-docker/gce-proxy:1.16 /cloud_sql_proxy \ |
| 124 | +-instances=${MYSQL_CONN}=tcp:0.0.0.0:3306" --maintenance-policy=MIGRATE \ |
| 125 | +--scopes=https://www.googleapis.com/auth/cloud-platform \ |
| 126 | +--image=cos-69-10895-385-0 --image-project=cos-cloud |
| 127 | +``` |
| 128 | + |
| 129 | +Optionally, you can promote the internal IP address of the VM running the Proxy image to a static IP using: |
| 130 | + |
| 131 | +``` |
| 132 | +# Get the VM internal IP |
| 133 | +export IP=`gcloud compute instances describe ${NAME} --zone ${ZONE} | |
| 134 | +grep "networkIP" | awk '{print $2}'` |
| 135 | + |
| 136 | +# Promote the VM internal IP to static IP |
| 137 | +gcloud compute addresses create mysql-proxy --addresses ${IP} --region \ |
| 138 | +${REGION} --subnet ${SUBNET} |
| 139 | +``` |
| 140 | + |
| 141 | +Get the latest version of the CloudSQL socket factory jar with driver and dependencies from |
| 142 | +[here](https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/releases), then configure the plugin with: |
| 143 | + |
| 144 | +``` |
| 145 | +Reference Name: "sink1" |
| 146 | +Driver Name: "cloudsql-mysql" |
| 147 | +Connection Name: [PROJECT_ID]:[REGION_ID]:[INSTANCE_NAME] |
| 148 | +CloudSQL Instance Type: "Private" |
| 149 | +Database: "prod" |
| 150 | +Table Name: "users" |
| 151 | +Username: "root" |
| 152 | +Password: "root" |
| 153 | +``` |