The source code of SQLFlow is written in Go, Java, protobuf, yacc, and Python. To build from source, we need the toolchains for all of these languages. In addition, we need to install MySQL, Hive, and the MaxCompute client for unit tests. To ease software installation and configuration, we provide a Dockerfile that contains all the required software for building and testing.
- Git for checking out the source code.
- Docker CE >= 18.x for building the Docker image of development tools.
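Before continuing, it can help to confirm that both tools are installed and that Docker is recent enough. The following is a minimal sketch, not part of SQLFlow; the helper names `docker_version_ok` and `check_prerequisites` are hypothetical, and the check only compares the major version number.

```shell
# Hypothetical helper: succeeds if the given version string (e.g. "18.09.1")
# has a major version of at least 18.
docker_version_ok() {
    major=${1%%.*}                 # keep everything before the first dot
    case "$major" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a plain number
    esac
    [ "$major" -ge 18 ]
}

# Hypothetical helper: reports the first prerequisite that is missing or too old.
check_prerequisites() {
    command -v git >/dev/null 2>&1 || { echo "git not found"; return 1; }
    command -v docker >/dev/null 2>&1 || { echo "docker not found"; return 1; }
    # Ask the Docker client for its own version string.
    version=$(docker version --format '{{.Client.Version}}' 2>/dev/null)
    docker_version_ok "$version" || { echo "Docker >= 18.x required, found: $version"; return 1; }
    echo "prerequisites OK"
}
```

After defining both functions in a shell session, running `check_prerequisites` prints either the first problem it finds or `prerequisites OK`.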
We can clone the source code to any working directory, say, ~/sqlflow.
cd ~
git clone https://github.com/sql-machine-learning/sqlflow
We can build the Docker image from the Dockerfile.
cd sqlflow
docker build -t sqlflow .
Or, we can pull the Docker image pre-built by the CI system from DockerHub.
docker pull sqlflow/sqlflow
docker tag sqlflow/sqlflow:latest sqlflow:latest
Let us start a container running the development Docker image.
docker run --rm -it -v $HOME/sqlflow:/sqlflow -w /sqlflow sqlflow bash
In the Docker container, we need to start a MySQL server for testing.
service mysql start
Then, we can build and run tests.
go generate ./...
PYTHONPATH=/sqlflow/python SQLFLOW_TEST_DB=mysql gotest -v -p 1 ./...
The command go generate is necessary because it calls protoc to translate the gRPC interface definition and goyacc to generate the parser.
The environment variable PYTHONPATH=/sqlflow/python points Python at the mounted source tree, which ensures that the Python part of SQLFlow used in the Docker container is up to date.
The environment variable SQLFLOW_TEST_DB=mysql specifies MySQL as the SQL engine during testing. You can also choose hive for Apache Hive or maxcompute for Alibaba MaxCompute.
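To avoid typos when switching engines, the test command can be generated from the engine name. This is a minimal sketch; the helper name `sqlflow_test_cmd` is hypothetical, and it only validates the name against the three engines mentioned above. It does not start or configure Hive or MaxCompute, which may need setup that is not described here.

```shell
# Hypothetical helper: prints the test command for the given SQL engine,
# or fails if the engine is not one of mysql, hive, or maxcompute.
sqlflow_test_cmd() {
    case "$1" in
        mysql|hive|maxcompute)
            echo "PYTHONPATH=/sqlflow/python SQLFLOW_TEST_DB=$1 gotest -v -p 1 ./..."
            ;;
        *)
            echo "unsupported SQLFLOW_TEST_DB: $1" >&2
            return 1
            ;;
    esac
}
```

For example, `sqlflow_test_cmd hive` prints `PYTHONPATH=/sqlflow/python SQLFLOW_TEST_DB=hive gotest -v -p 1 ./...`, which can then be copied into the container shell.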
The -p 1 argument to gotest, which runs the test binaries of different packages one at a time, is necessary to run all tests; otherwise, you will encounter the same problem as this issue. Feel free to use go test instead of gotest; we use the latter for its colorized output.
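Since gotest and go test are interchangeable here, the choice can be made automatically. This is a minimal sketch with a hypothetical helper name, `pick_test_runner`, which falls back to go test when gotest is not on the PATH.

```shell
# Hypothetical helper: prints "gotest" if it is installed, otherwise "go test".
pick_test_runner() {
    if command -v gotest >/dev/null 2>&1; then
        echo "gotest"
    else
        echo "go test"
    fi
}

# Usage (left as a comment so that defining the helper runs nothing):
#   PYTHONPATH=/sqlflow/python SQLFLOW_TEST_DB=mysql $(pick_test_runner) -v -p 1 ./...
```

The command substitution is deliberately left unquoted in the usage line so that "go test" is split into the command and its subcommand.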
Because the above docker run command binds the source code directory on the host computer to the container, we can edit the source code on the host using any editor, such as VS Code or Emacs.
After editing and before committing with Git, please install the pre-commit tool. SQLFlow needs it to run pre-commit checks.
SQLFlow provides a command-line tool, repl, for evaluating SQL statements, which makes debugging easier. To build and run it, use the following commands.
cd cmd/repl
go install
~/go/bin/repl --datasource="mysql://root:root@tcp(localhost:3306)/?maxAllowedPacket=0"
Please follow the REPL tutorial to understand what we can do with the REPL.
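The --datasource value above combines a user, a password, a host, and a port. When testing against a MySQL server with different settings, the string can be assembled from its parts. This is a minimal sketch; the helper name `mysql_datasource` is hypothetical, and it simply reproduces the pattern of the example above without escaping special characters in the password.

```shell
# Hypothetical helper: builds a MySQL datasource string in the same pattern
# as the example above. Arguments: user, password, host, port.
mysql_datasource() {
    printf 'mysql://%s:%s@tcp(%s:%s)/?maxAllowedPacket=0\n' "$1" "$2" "$3" "$4"
}

# Usage (left as a comment so that defining the helper runs nothing):
#   ~/go/bin/repl --datasource="$(mysql_datasource root root localhost 3306)"
```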