Commit f30c56b

Authored by Bhooshan Mogal
Merge pull request #67 from melburnerodrigues/develop
Added CloudSQL MySQL source, sink and action plugins.
2 parents 0a075ce + c11412a commit f30c56b

18 files changed: +1629 -0 lines changed
Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
# CloudSQL MySQL Action


Description
-----------
Action that runs a MySQL command on a CloudSQL MySQL instance.


Use Case
--------
The action can be used whenever you want to run a MySQL command before or after a data pipeline.
For example, you may want to run a SQL update command on a database before the pipeline source pulls data from tables.


Properties
----------
**Driver Name:** Name of the JDBC driver to use.

**Database Command:** Database command to execute.

**Database:** MySQL database name.

**Connection Name:** The CloudSQL instance to connect to, in the format \<PROJECT_ID>:\<REGION>:\<INSTANCE_NAME>.
It can be found on the instance overview page.

**CloudSQL Instance Type:** Whether the CloudSQL instance to connect to is private or public. Defaults to 'Public'.

**Username:** User identity for connecting to the specified database.

**Password:** Password to use to connect to the specified database.

**Connection Timeout:** The timeout value (in seconds) used for socket connect operations. If connecting to the server
takes longer than this value, the connection attempt is abandoned. A value of 0 disables the timeout.

**Connection Arguments:** A list of arbitrary string key/value pairs as connection arguments. These arguments
are passed to the JDBC driver for drivers that need additional configuration.



Examples
--------
**Connecting to a public CloudSQL MySQL instance**

Suppose you want to execute a query against a CloudSQL MySQL database named "prod" as the "root" user with the "root" password.
Get the latest version of the CloudSQL socket factory jar with driver and dependencies
[here](https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/releases), then configure the plugin with:

```
Driver Name: "cloudsql-mysql"
Database Command: "UPDATE table_name SET price = 20 WHERE ID = 6"
Connection Name: [PROJECT_ID]:[REGION]:[INSTANCE_NAME]
CloudSQL Instance Type: "Public"
Database: "prod"
Username: "root"
Password: "root"
```
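
The `[PROJECT_ID]:[REGION]:[INSTANCE_NAME]` value above is a single colon-separated string. As a minimal sketch of how such a value could be sanity-checked before it is pasted into the plugin, the following Bash function splits it into its three parts. The function name `parse_connection_name` and the sample value are hypothetical and are not part of the plugin; legacy domain-scoped project IDs that themselves contain a colon are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the plugin): split a connection name of the
# form PROJECT_ID:REGION:INSTANCE_NAME into its three parts.
parse_connection_name() {
  local conn="$1"
  local project region instance extra
  IFS=':' read -r project region instance extra <<< "$conn"
  # Reject values with a missing field or with more than three fields.
  if [ -z "$project" ] || [ -z "$region" ] || [ -z "$instance" ] || [ -n "$extra" ]; then
    echo "invalid connection name: $conn" >&2
    return 1
  fi
  echo "project=$project region=$region instance=$instance"
}

# Placeholder value, not a real instance.
parse_connection_name "my-project:us-central1:my-instance"
```

The function returns a non-zero status for malformed input, so it can guard a deployment script with `parse_connection_name "$CONN" || exit 1`.
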

**Connecting to a private CloudSQL MySQL instance**

If you want to connect to a private CloudSQL MySQL instance, create a Compute Engine VM that runs the CloudSQL Proxy
Docker image using the following commands:

```
# Set the environment variables
export PROJECT=[project_id]
export REGION=[vm-region]
# Take the first zone returned for the region filter and keep only its name (strip the URI prefix)
export ZONE=$(gcloud compute zones list --filter="name=${REGION}" --limit 1 \
  --uri --project=${PROJECT} | sed 's/.*\///')
export SUBNET=[vpc-subnet-name]
export NAME=[gce-vm-name]
export MYSQL_CONN=[mysql-instance-connection-name]

# Create a Compute Engine VM without an external IP; its startup script runs the CloudSQL Proxy on port 3306
gcloud beta compute --project=${PROJECT} instances create ${NAME} \
  --zone=${ZONE} --machine-type=g1-small --subnet=${SUBNET} --no-address \
  --metadata=startup-script="docker run -d -p 0.0.0.0:3306:3306 \
gcr.io/cloudsql-docker/gce-proxy:1.16 /cloud_sql_proxy \
-instances=${MYSQL_CONN}=tcp:0.0.0.0:3306" --maintenance-policy=MIGRATE \
  --scopes=https://www.googleapis.com/auth/cloud-platform \
  --image=cos-69-10895-385-0 --image-project=cos-cloud
```

Optionally, you can promote the internal IP address of the VM running the Proxy image to a static IP using:

```
# Get the VM internal IP
export IP=$(gcloud compute instances describe ${NAME} --zone ${ZONE} | \
  grep "networkIP" | awk '{print $2}')

# Promote the VM internal IP to static IP
gcloud compute addresses create mysql-proxy --addresses ${IP} \
  --region ${REGION} --subnet ${SUBNET}
```
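
Before configuring the plugin, it can help to confirm that the proxy port on the VM is reachable from a machine in the same VPC. The following is a minimal sketch, assuming Bash with `/dev/tcp` support and the coreutils `timeout` command are available; the function name `check_port` and the `127.0.0.1` fallback are hypothetical, not something the plugin provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if a TCP connection to host:port
# can be opened within 5 seconds.
check_port() {
  local host="$1" port="$2"
  timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# IP is the VM internal IP exported in the previous step; the fallback
# address is only a placeholder so the snippet runs on its own.
PROXY_IP="${IP:-127.0.0.1}"
if check_port "$PROXY_IP" 3306; then
  echo "proxy port 3306 is reachable on ${PROXY_IP}"
else
  echo "proxy port 3306 is NOT reachable on ${PROXY_IP}"
fi
```

A successful check only shows that something is listening on port 3306; it does not verify MySQL credentials.
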

Get the latest version of the CloudSQL socket factory jar with driver and dependencies from
[here](https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/releases), then configure the plugin with:

```
Driver Name: "cloudsql-mysql"
Database Command: "UPDATE table_name SET price = 20 WHERE ID = 6"
Connection Name: [PROJECT_ID]:[REGION]:[INSTANCE_NAME]
CloudSQL Instance Type: "Private"
Database: "prod"
Username: "root"
Password: "root"
```
Lines changed: 153 additions & 0 deletions
@@ -0,0 +1,153 @@
# CloudSQL MySQL Batch Sink


Description
-----------
Writes records to a CloudSQL MySQL table. Each record is written to a row in the table.


Use Case
--------
This sink is used whenever you need to write to a CloudSQL MySQL table.
Suppose you periodically build a recommendation model for products on your online store.
The model is stored in a GCS bucket and you want to export the contents
of the bucket to a CloudSQL MySQL table where it can be served to your users.

Column names are auto-detected from the input schema.

Properties
----------
**Reference Name:** Name used to uniquely identify this sink for lineage, annotating metadata, etc.

**Driver Name:** Name of the JDBC driver to use.

**Database:** MySQL database name.

**Connection Name:** The CloudSQL instance to connect to, in the format \<PROJECT_ID>:\<REGION>:\<INSTANCE_NAME>.
It can be found on the instance overview page.

**CloudSQL Instance Type:** Whether the CloudSQL instance to connect to is private or public. Defaults to 'Public'.

**Table Name:** Name of the table to export to. The table must exist prior to running the pipeline.

**Username:** User identity for connecting to the specified database.

**Password:** Password to use to connect to the specified database.

**Transaction Isolation Level:** Transaction isolation level for queries run by this sink.

**Connection Timeout:** The timeout value (in seconds) used for socket connect operations. If connecting to the server
takes longer than this value, the connection attempt is abandoned. A value of 0 disables the timeout.

**Connection Arguments:** A list of arbitrary string key/value pairs as connection arguments. These arguments
are passed to the JDBC driver for drivers that need additional configuration.


Data Types Mapping
------------------

| MySQL Data Type                | CDAP Schema Data Type |
| ------------------------------ | --------------------- |
| BIT                            | boolean               |
| TINYINT                        | int                   |
| BOOL, BOOLEAN                  | boolean               |
| SMALLINT                       | int                   |
| MEDIUMINT                      | double                |
| INT, INTEGER                   | int                   |
| BIGINT                         | long                  |
| FLOAT                          | float                 |
| DOUBLE                         | double                |
| DECIMAL                        | decimal               |
| DATE                           | date                  |
| DATETIME                       | timestamp             |
| TIMESTAMP                      | timestamp             |
| TIME                           | time                  |
| YEAR                           | date                  |
| CHAR                           | string                |
| VARCHAR                        | string                |
| BINARY                         | bytes                 |
| VARBINARY                      | bytes                 |
| TINYBLOB                       | bytes                 |
| TINYTEXT                       | string                |
| BLOB                           | bytes                 |
| TEXT                           | string                |
| MEDIUMBLOB                     | bytes                 |
| MEDIUMTEXT                     | string                |
| LONGBLOB                       | bytes                 |
| LONGTEXT                       | string                |
| ENUM                           | string                |
| SET                            | string                |


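
The mapping table can be expressed as a small lookup, which is handy when checking that an input schema lines up with an existing table definition. This is only a sketch that mirrors the table row by row; the function name `cdap_type` is hypothetical and the plugin does not expose such a command.

```shell
#!/usr/bin/env bash
# Hypothetical lookup mirroring the mapping table:
# prints the CDAP schema type for a given MySQL column type.
cdap_type() {
  local t
  # Normalise the argument to upper case so "varchar" and "VARCHAR" both match.
  t="$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
  case "$t" in
    BIT|BOOL|BOOLEAN) echo boolean ;;
    TINYINT|SMALLINT|INT|INTEGER) echo int ;;
    MEDIUMINT|DOUBLE) echo double ;;
    BIGINT) echo long ;;
    FLOAT) echo float ;;
    DECIMAL) echo decimal ;;
    DATE|YEAR) echo date ;;
    DATETIME|TIMESTAMP) echo timestamp ;;
    TIME) echo time ;;
    CHAR|VARCHAR|TINYTEXT|TEXT|MEDIUMTEXT|LONGTEXT|ENUM|SET) echo string ;;
    BINARY|VARBINARY|TINYBLOB|BLOB|MEDIUMBLOB|LONGBLOB) echo bytes ;;
    *) echo "unknown MySQL type: $1" >&2; return 1 ;;
  esac
}

cdap_type varchar     # prints "string"
cdap_type MEDIUMINT   # prints "double"
```
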

Examples
--------
**Connecting to a public CloudSQL MySQL instance**

Suppose you want to write output records to the "users" table of a CloudSQL MySQL database named "prod" as the "root" user with
the "root" password. Get the latest version of the CloudSQL socket factory jar with driver and dependencies
[here](https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/releases), then configure the plugin with:

```
Reference Name: "sink1"
Driver Name: "cloudsql-mysql"
Connection Name: [PROJECT_ID]:[REGION]:[INSTANCE_NAME]
CloudSQL Instance Type: "Public"
Database: "prod"
Table Name: "users"
Username: "root"
Password: "root"
```
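
Because the sink requires the target table to exist before the pipeline runs, the "users" table has to be created up front. The following sketch writes an illustrative `CREATE TABLE` statement to a temporary file; the column names and types are assumptions chosen only to match entries in the mapping table, and the commented `mysql` invocation assumes the standard MySQL command-line client with a placeholder host.

```shell
#!/usr/bin/env bash
# Write an illustrative table definition to a temporary file.
# The columns are hypothetical; adapt them to your pipeline's output schema.
SQL_FILE="$(mktemp)"
cat > "$SQL_FILE" <<'SQL'
CREATE TABLE IF NOT EXISTS users (
  id BIGINT NOT NULL PRIMARY KEY,   -- CDAP long
  name VARCHAR(255),                -- CDAP string
  created_at TIMESTAMP              -- CDAP timestamp
);
SQL

# Show the statement that would be applied.
cat "$SQL_FILE"

# To apply it with the MySQL client (host is a placeholder; prompts for the password):
# mysql --host=<instance-or-proxy-ip> --user=root --password prod < "$SQL_FILE"
```
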

**Connecting to a private CloudSQL MySQL instance**

If you want to connect to a private CloudSQL MySQL instance, create a Compute Engine VM that runs the CloudSQL Proxy
Docker image using the following commands:

```
# Set the environment variables
export PROJECT=[project_id]
export REGION=[vm-region]
# Take the first zone returned for the region filter and keep only its name (strip the URI prefix)
export ZONE=$(gcloud compute zones list --filter="name=${REGION}" --limit 1 \
  --uri --project=${PROJECT} | sed 's/.*\///')
export SUBNET=[vpc-subnet-name]
export NAME=[gce-vm-name]
export MYSQL_CONN=[mysql-instance-connection-name]

# Create a Compute Engine VM without an external IP; its startup script runs the CloudSQL Proxy on port 3306
gcloud beta compute --project=${PROJECT} instances create ${NAME} \
  --zone=${ZONE} --machine-type=g1-small --subnet=${SUBNET} --no-address \
  --metadata=startup-script="docker run -d -p 0.0.0.0:3306:3306 \
gcr.io/cloudsql-docker/gce-proxy:1.16 /cloud_sql_proxy \
-instances=${MYSQL_CONN}=tcp:0.0.0.0:3306" --maintenance-policy=MIGRATE \
  --scopes=https://www.googleapis.com/auth/cloud-platform \
  --image=cos-69-10895-385-0 --image-project=cos-cloud
```

Optionally, you can promote the internal IP address of the VM running the Proxy image to a static IP using:

```
# Get the VM internal IP
export IP=$(gcloud compute instances describe ${NAME} --zone ${ZONE} | \
  grep "networkIP" | awk '{print $2}')

# Promote the VM internal IP to static IP
gcloud compute addresses create mysql-proxy --addresses ${IP} \
  --region ${REGION} --subnet ${SUBNET}
```
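
The internal IP lookup above depends on the text layout of the `describe` output. As an alternative sketch, gcloud's `--format` flag can extract the field directly. The function name `vm_internal_ip` is hypothetical, and the function is only defined here, not called, since it needs an authenticated gcloud and an existing VM.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the internal IP of a VM's first network interface
# using gcloud's own output formatting instead of grep/awk.
vm_internal_ip() {
  local name="$1" zone="$2"
  gcloud compute instances describe "$name" --zone "$zone" \
    --format='get(networkInterfaces[0].networkIP)'
}

# Usage, with NAME and ZONE exported as in the earlier step:
# export IP="$(vm_internal_ip "${NAME}" "${ZONE}")"
```
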

Get the latest version of the CloudSQL socket factory jar with driver and dependencies from
[here](https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory/releases), then configure the plugin with:

```
Reference Name: "sink1"
Driver Name: "cloudsql-mysql"
Connection Name: [PROJECT_ID]:[REGION]:[INSTANCE_NAME]
CloudSQL Instance Type: "Private"
Database: "prod"
Table Name: "users"
Username: "root"
Password: "root"
```
