# Apache Kyuubi
-
-
-[](https://www.apache.org/licenses/LICENSE-2.0.html)
-[](https://github.com/apache/kyuubi/releases)
-[](https://github.com/apache/kyuubi)
-[](https://codecov.io/gh/apache/kyuubi)
-
-[](https://travis-ci.com/apache/kyuubi)
-[](https://kyuubi.readthedocs.io/en/master/)
-
-[](https://github.com/apache/kyuubi/graphs/commit-activity)
-[](http://isitmaintained.com/project/apache/kyuubi "Average time to resolve an issue")
-[](http://isitmaintained.com/project/apache/kyuubi "Percentage of issues still open")
-
-
-## What is Kyuubi?
-
Apache Kyuubi™ is a distributed and multi-tenant gateway to provide serverless
SQL on data warehouses and lakehouses.
+## What is Kyuubi?
+
Kyuubi provides a pure SQL gateway through Thrift JDBC/ODBC interface for end-users to manipulate large-scale data with pre-programmed and extensible Spark SQL engines. This "out-of-the-box" model minimizes the barriers and costs for end-users to use Spark at the client side. At the server-side, Kyuubi server and engines' multi-tenant architecture provides the administrators a way to achieve computing resource isolation, data security, high availability, high client concurrency, etc.

@@ -45,19 +59,16 @@ Kyuubi provides a pure SQL gateway through Thrift JDBC/ODBC interface for end-us
- [x] Multi-tenant Spark Support
- [x] Running Spark in a serverless way
-
### Target Users
Kyuubi's goal is to make it easy and efficient for `anyone` to use Spark(maybe other engines soon) and facilitate users to handle big data like ordinary data. Here, `anyone` means that users do not need to have a Spark technical background but a human language, SQL only. Sometimes, SQL skills are unnecessary when integrating Kyuubi with Apache Superset, which supports rich visualizations and dashboards.
-
In typical big data production environments with Kyuubi, there should be system administrators and end-users.
- System administrators: A small group consists of Spark experts responsible for Kyuubi deployment, configuration, and tuning.
- End-users: Focus on business data of their own, not where it stores, how it computes.
-Additionally, the Kyuubi community will continuously optimize the whole system with various features, such as History-Based Optimizer, Auto-tuning, Materialized View, SQL Dialects, Functions, e.t.c.
-
+Additionally, the Kyuubi community will continuously optimize the whole system with various features, such as History-Based Optimizer, Auto-tuning, Materialized View, SQL Dialects, Functions, etc.
### Usage scenarios
@@ -71,8 +82,7 @@ HiveServer2 can identify and authenticate a caller, and then if the caller also
Kyuubi extends the use of STS in a multi-tenant model based on a unified interface and relies on the concept of multi-tenancy to interact with cluster managers to finally gain the ability of resources sharing/isolation and data security. The loosely coupled architecture of the Kyuubi server and engine dramatically improves the client concurrency and service stability of the service itself.
-
-#### DataLake/LakeHouse Support
+#### DataLake/Lakehouse Support
The vision of Kyuubi is to unify the portal and become an easy-to-use data lake management platform. Different kinds of workloads, such as ETL processing and BI analytics, can be supported by one platform, using one copy of data, with one SQL interface.
@@ -80,30 +90,20 @@ The vision of Kyuubi is to unify the portal and become an easy-to-use data lake
- Multiple Catalogs support
- SQL Standard Authorization support for DataLake(coming)
-
#### Cloud Native Support
Kyuubi can deploy its engines on different kinds of Cluster Managers, such as, Hadoop YARN, Kubernetes, etc.
-

-
### The Kyuubi Ecosystem(present and future)
-
The figure below shows our vision for the Kyuubi Ecosystem. Some of them have been realized, some in development,
and others would not be possible without your help.

-
-
-## Online Documentation
-
-Since Kyuubi 1.3.0-incubating, the Kyuubi online documentation is hosted by [https://kyuubi.apache.org/](https://kyuubi.apache.org/).
-You can find the latest Kyuubi documentation on [this web page](https://kyuubi.readthedocs.io/en/master/).
-For 1.2 and earlier versions, please check the [Readthedocs](https://kyuubi.readthedocs.io/en/v1.2.0/) directly.
+## Online Documentation
## Quick Start
@@ -111,9 +111,32 @@ Ready? [Getting Started](https://kyuubi.readthedocs.io/en/master/quick_start/) w
## [Contributing](./CONTRIBUTING.md)
-## Contributor over time
-
-[](https://api7.ai/contributor-graph?chart=contributorOverTime&repo=apache/kyuubi)
+## Project & Community Status
+
+
## Aside
@@ -121,7 +144,3 @@ The project took its name from a character of a popular Japanese manga - `Naruto
The character is named `Kyuubi Kitsune/Kurama`, which is a nine-tailed fox in mythology.
`Kyuubi` spread the power and spirit of fire, which is used here to represent the powerful [Apache Spark](http://spark.apache.org).
Its nine tails stand for end-to-end multi-tenancy support of this project.
-
-## License
-
-This project is licensed under the Apache 2.0 License. See the [LICENSE](./LICENSE) file for details.
diff --git a/bin/docker-image-tool.sh b/bin/docker-image-tool.sh
index e9e4338b5..14d5fe7b0 100755
--- a/bin/docker-image-tool.sh
+++ b/bin/docker-image-tool.sh
@@ -27,19 +27,21 @@ function error {
if [ -z "${KYUUBI_HOME}" ]; then
KYUUBI_HOME="$(cd "`dirname "$0"`"/..; pwd)"
fi
-
-CTX_DIR="$KYUUBI_HOME/target/tmp/docker"
+KYUUBI_IMAGE_NAME="kyuubi"
function is_dev_build {
[ ! -f "$KYUUBI_HOME/RELEASE" ]
}
-function cleanup_ctx_dir {
- if is_dev_build; then
- rm -rf "$CTX_DIR"
- fi
-}
-trap cleanup_ctx_dir EXIT
+if is_dev_build; then
+ cat <"
exit 1
fi
diff --git a/build/Dockerfile b/build/Dockerfile
index b53b6716e..8ecc6c8b7 100644
--- a/build/Dockerfile
+++ b/build/Dockerfile
@@ -29,15 +29,15 @@
# Declare the BASE_IMAGE argument in the first line, for more detail
# see: https://github.com/moby/moby/issues/38379
-ARG BASE_IMAGE=openjdk:8-jdk
+ARG BASE_IMAGE=eclipse-temurin:8-jdk-focal
-FROM maven:3.6-jdk-8 as builder
+FROM eclipse-temurin:8-jdk-focal as builder
ARG MVN_ARG
# Pass the environment variable `CI` into container, for internal use only.
#
-# Continuous integration(aka. CI) services like GitHub Actions, Travis always provide
+# Continuous integration(aka. CI) services like GitHub Actions always provide
# an environment variable `CI` in runners, and we detect this variable to run some
# specific actions, e.g. run `mvn` in batch mode to suppress noisy logs.
ARG CI
@@ -48,7 +48,8 @@ WORKDIR /workspace/kyuubi
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive \
- apt-get install -y python3 && \
+ apt-get install -y bash python3 && \
+ ln -snf /bin/bash /bin/sh && \
./build/dist ${MVN_ARG} && \
mv /workspace/kyuubi/dist /opt/kyuubi && \
# Removing stuff saves time because docker creates a temporary layer
@@ -71,7 +72,8 @@ COPY --from=builder /opt/kyuubi ${KYUUBI_HOME}
RUN set -ex && \
apt-get update && \
DEBIAN_FRONTEND=noninteractive \
- apt install -y bash tini libc6 libpam-modules krb5-user libnss3 procps && \
+ apt-get install -y bash tini libc6 libpam-modules krb5-user libnss3 procps && \
+ ln -snf /bin/bash /bin/sh && \
useradd -u ${kyuubi_uid} -g root kyuubi && \
mkdir -p ${KYUUBI_HOME} ${KYUUBI_LOG_DIR} ${KYUUBI_PID_DIR} ${KYUUBI_WORK_DIR_ROOT} && \
chmod ug+rw -R ${KYUUBI_HOME} && \
diff --git a/build/dist b/build/dist
index 7b51886df..df9498008 100755
--- a/build/dist
+++ b/build/dist
@@ -31,6 +31,7 @@ set -x
KYUUBI_HOME="$(cd "`dirname "$0"`/.."; pwd)"
DISTDIR="$KYUUBI_HOME/dist"
MAKE_TGZ=false
+ENABLE_WEBUI=false
FLINK_PROVIDED=false
SPARK_PROVIDED=false
HIVE_PROVIDED=false
@@ -42,15 +43,16 @@ function usage {
echo "./build/dist - Tool for making binary distributions of Kyuubi"
echo ""
echo "Usage:"
- echo "+------------------------------------------------------------------------------------------------------+"
- echo "| ./build/dist [--name ] [--tgz] [--flink-provided] [--spark-provided] [--hive-provided] |"
- echo "| [--mvn ] |"
- echo "+------------------------------------------------------------------------------------------------------+"
+ echo "+----------------------------------------------------------------------------------------------+"
+ echo "| ./build/dist [--name ] [--tgz] [--web-ui] [--flink-provided] [--hive-provided] |"
+ echo "| [--spark-provided] [--mvn ] |"
+ echo "+----------------------------------------------------------------------------------------------+"
echo "name: - custom binary name, using project version if undefined"
echo "tgz: - whether to make a whole bundled package"
+ echo "web-ui: - whether to include web ui"
echo "flink-provided: - whether to make a package without Flink binary"
- echo "spark-provided: - whether to make a package without Spark binary"
echo "hive-provided: - whether to make a package without Hive binary"
+ echo "spark-provided: - whether to make a package without Spark binary"
echo "mvn: - external maven executable location"
echo ""
}
@@ -67,6 +69,9 @@ while (( "$#" )); do
--tgz)
MAKE_TGZ=true
;;
+ --web-ui)
+ ENABLE_WEBUI=true
+ ;;
--flink-provided)
FLINK_PROVIDED=true
;;
@@ -210,7 +215,11 @@ else
echo "Making distribution for Kyuubi $VERSION in '$DISTDIR'..."
fi
-MVN_DIST_OPT="-DskipTests"
+MVN_DIST_OPT="-DskipTests -Dmaven.javadoc.skip=true -Dmaven.scaladoc.skip=true -Dmaven.source.skip"
+
+if [[ "$ENABLE_WEBUI" == "true" ]]; then
+ MVN_DIST_OPT="$MVN_DIST_OPT -Pweb-ui"
+fi
if [[ "$SPARK_PROVIDED" == "true" ]]; then
MVN_DIST_OPT="$MVN_DIST_OPT -Pspark-provided"
@@ -238,14 +247,16 @@ echo -e "\$ ${BUILD_COMMAND[@]}\n"
rm -rf "$DISTDIR"
mkdir -p "$DISTDIR/pid"
mkdir -p "$DISTDIR/logs"
-mkdir -p "$DISTDIR/jars"
mkdir -p "$DISTDIR/work"
+mkdir -p "$DISTDIR/jars"
+mkdir -p "$DISTDIR/beeline-jars"
+mkdir -p "$DISTDIR/web-ui"
mkdir -p "$DISTDIR/externals/engines/flink"
mkdir -p "$DISTDIR/externals/engines/spark"
mkdir -p "$DISTDIR/externals/engines/trino"
mkdir -p "$DISTDIR/externals/engines/hive"
mkdir -p "$DISTDIR/externals/engines/jdbc"
-mkdir -p "$DISTDIR/beeline-jars"
+mkdir -p "$DISTDIR/externals/engines/chat"
echo "Kyuubi $VERSION $GITREVSTRING built for" > "$DISTDIR/RELEASE"
echo "Java $JAVA_VERSION" >> "$DISTDIR/RELEASE"
echo "Scala $SCALA_VERSION" >> "$DISTDIR/RELEASE"
@@ -303,6 +314,18 @@ for jar in $(ls "$DISTDIR/jars/"); do
fi
done
+# Copy chat engines
+cp "$KYUUBI_HOME/externals/kyuubi-chat-engine/target/kyuubi-chat-engine_${SCALA_VERSION}-${VERSION}.jar" "$DISTDIR/externals/engines/chat/"
+cp -r "$KYUUBI_HOME"/externals/kyuubi-chat-engine/target/scala-$SCALA_VERSION/jars/*.jar "$DISTDIR/externals/engines/chat/"
+
+# Share the jars w/ server to reduce binary size
+# shellcheck disable=SC2045
+for jar in $(ls "$DISTDIR/jars/"); do
+ if [[ -f "$DISTDIR/externals/engines/chat/$jar" ]]; then
+ (cd $DISTDIR/externals/engines/chat; ln -snf "../../../jars/$jar" "$DISTDIR/externals/engines/chat/$jar")
+ fi
+done
+
# Copy kyuubi tools
if [[ -f "$KYUUBI_HOME/tools/spark-block-cleaner/target/spark-block-cleaner_${SCALA_VERSION}-${VERSION}.jar" ]]; then
mkdir -p "$DISTDIR/tools/spark-block-cleaner/kubernetes"
@@ -312,7 +335,7 @@ if [[ -f "$KYUUBI_HOME/tools/spark-block-cleaner/target/spark-block-cleaner_${SC
fi
# Copy Kyuubi Spark extension
-SPARK_EXTENSION_VERSIONS=('3-1' '3-2' '3-3')
+SPARK_EXTENSION_VERSIONS=('3-1' '3-2' '3-3' '3-4' '3-5')
# shellcheck disable=SC2068
for SPARK_EXTENSION_VERSION in ${SPARK_EXTENSION_VERSIONS[@]}; do
if [[ -f $"$KYUUBI_HOME/extensions/spark/kyuubi-extension-spark-$SPARK_EXTENSION_VERSION/target/kyuubi-extension-spark-${SPARK_EXTENSION_VERSION}_${SCALA_VERSION}-${VERSION}.jar" ]]; then
@@ -321,6 +344,11 @@ for SPARK_EXTENSION_VERSION in ${SPARK_EXTENSION_VERSIONS[@]}; do
fi
done
+if [[ "$ENABLE_WEBUI" == "true" ]]; then
+ # Copy web ui dist
+ cp -r "$KYUUBI_HOME/kyuubi-server/web-ui/dist" "$DISTDIR/web-ui/"
+fi
+
if [[ "$FLINK_PROVIDED" != "true" ]]; then
# Copy flink binary dist
FLINK_BUILTIN="$(find "$KYUUBI_HOME/externals/kyuubi-download/target" -name 'flink-*' -type d)"
@@ -356,7 +384,11 @@ if [[ "$MAKE_TGZ" == "true" ]]; then
TARDIR="$KYUUBI_HOME/$TARDIR_NAME"
rm -rf "$TARDIR"
cp -R "$DISTDIR" "$TARDIR"
- tar czf "$TARDIR_NAME.tgz" -C "$KYUUBI_HOME" "$TARDIR_NAME"
+ TAR="tar"
+ if [ "$(uname -s)" = "Darwin" ]; then
+ TAR="tar --no-mac-metadata --no-xattrs --no-fflags"
+ fi
+ $TAR -czf "$TARDIR_NAME.tgz" -C "$KYUUBI_HOME" "$TARDIR_NAME"
rm -rf "$TARDIR"
echo "The Kyuubi tarball $TARDIR_NAME.tgz is successfully generated in $KYUUBI_HOME."
fi
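
The `--web-ui` handling in this file follows the same pattern as the existing `--*-provided` flags: a boolean is set while parsing arguments and is later mapped to a Maven profile. Below is a minimal sketch of that pattern in isolation. The helper name `build_mvn_dist_opt` is hypothetical (`build/dist` does this inline), only two of the flags are covered, and the extra `-Dmaven.*.skip` options are left out for brevity.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only; not part of build/dist.
build_mvn_dist_opt() {
  local enable_webui=false
  local spark_provided=false

  # Consume flags one at a time, like the while/case loop in build/dist.
  while (( "$#" )); do
    case "$1" in
      --web-ui) enable_webui=true ;;
      --spark-provided) spark_provided=true ;;
    esac
    shift
  done

  # Start from the base options and append one Maven profile per enabled flag.
  local opt="-DskipTests"
  if [[ "$enable_webui" == "true" ]]; then
    opt="$opt -Pweb-ui"
  fi
  if [[ "$spark_provided" == "true" ]]; then
    opt="$opt -Pspark-provided"
  fi
  echo "$opt"
}

build_mvn_dist_opt --web-ui --spark-provided   # prints: -DskipTests -Pweb-ui -Pspark-provided
```

Because the profile is appended only when the flag is present, builds that omit `--web-ui` keep their previous Maven options.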
diff --git a/build/kyuubi-build-info.cmd b/build/kyuubi-build-info.cmd
index 7717b48e4..d9e8e6c6a 100755
--- a/build/kyuubi-build-info.cmd
+++ b/build/kyuubi-build-info.cmd
@@ -36,6 +36,7 @@ echo kyuubi_trino_version=%~9
echo user=%username%
FOR /F %%i IN ('git rev-parse HEAD') DO SET "revision=%%i"
+FOR /F "delims=" %%i IN ('git show -s --format^=%%ci HEAD') DO SET "revision_time=%%i"
FOR /F %%i IN ('git rev-parse --abbrev-ref HEAD') DO SET "branch=%%i"
FOR /F %%i IN ('git config --get remote.origin.url') DO SET "url=%%i"
@@ -44,6 +45,7 @@ FOR /f %%i IN ("%TIME%") DO SET current_time=%%i
set date=%current_date%_%current_time%
echo revision=%revision%
+echo revision_time=%revision_time%
echo branch=%branch%
echo date=%date%
echo url=%url%
diff --git a/build/mvn b/build/mvn
index d67638ba2..cd6c0c796 100755
--- a/build/mvn
+++ b/build/mvn
@@ -35,7 +35,7 @@ fi
## Arg2 - Tarball Name
## Arg3 - Checkable Binary
install_app() {
- local remote_tarball="$1/$2"
+ local remote_tarball="$1/$2$4"
local local_tarball="${_DIR}/$2"
local binary="${_DIR}/$3"
@@ -76,13 +76,26 @@ install_mvn() {
fi
# See simple version normalization: http://stackoverflow.com/questions/16989598/bash-comparing-version-numbers
function version { echo "$@" | awk -F. '{ printf("%03d%03d%03d\n", $1,$2,$3); }'; }
- if [ $(version $MVN_DETECTED_VERSION) -lt $(version $MVN_VERSION) ]; then
- local APACHE_MIRROR=${APACHE_MIRROR:-'https://archive.apache.org/dist/'}
+ if [ $(version $MVN_DETECTED_VERSION) -ne $(version $MVN_VERSION) ]; then
+ local APACHE_MIRROR=${APACHE_MIRROR:-'https://www.apache.org/dyn/closer.lua'}
+ local MIRROR_URL_QUERY="?action=download"
+ local MVN_TARBALL="apache-maven-${MVN_VERSION}-bin.tar.gz"
+ local FILE_PATH="maven/maven-3/${MVN_VERSION}/binaries"
+
+ if [ $(command -v curl) ]; then
+ if ! curl -L --output /dev/null --silent --head --fail "${APACHE_MIRROR}/${FILE_PATH}/${MVN_TARBALL}${MIRROR_URL_QUERY}" ; then
+ # Fall back to archive.apache.org for older Maven
+ echo "Falling back to archive.apache.org to download Maven"
+ APACHE_MIRROR="https://archive.apache.org/dist"
+ MIRROR_URL_QUERY=""
+ fi
+ fi
install_app \
- "${APACHE_MIRROR}/maven/maven-3/${MVN_VERSION}/binaries" \
- "apache-maven-${MVN_VERSION}-bin.tar.gz" \
- "apache-maven-${MVN_VERSION}/bin/mvn"
+ "${APACHE_MIRROR}/${FILE_PATH}" \
+ "${MVN_TARBALL}" \
+ "apache-maven-${MVN_VERSION}/bin/mvn" \
+ "${MIRROR_URL_QUERY}"
MVN_BIN="${_DIR}/apache-maven-${MVN_VERSION}/bin/mvn"
fi
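
Switching the comparison from `-lt` to `-ne` means that any mismatch between the detected and the required Maven version triggers a download, including a newer local Maven. The comparison relies on the `version` helper shown in the context lines, which pads each dotted component to three digits. Below is a small standalone sketch of that helper. The wrapper `needs_install` is hypothetical and the version numbers are arbitrary sample values.

```shell
#!/usr/bin/env bash
# Same normalization as in build/mvn: 3.8.7 -> 003008007.
version() { echo "$@" | awk -F. '{ printf("%03d%03d%03d\n", $1,$2,$3); }'; }

# Hypothetical wrapper: succeeds when the detected version ($1)
# differs from the required version ($2).
needs_install() {
  # `[` compares the zero-padded strings as decimal integers.
  [ "$(version "$1")" -ne "$(version "$2")" ]
}

version 3.8.7   # prints: 003008007

if needs_install 3.9.6 3.8.7; then
  echo "detected Maven differs from the required one, downloading"
fi
```

The single-bracket `[` is relevant here: inside `[[ ... ]]` bash evaluates the operands as arithmetic, where a leading zero would mark the padded values as octal.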
diff --git a/build/release/create-package.sh b/build/release/create-package.sh
index c98e7c0f8..28a89165e 100755
--- a/build/release/create-package.sh
+++ b/build/release/create-package.sh
@@ -75,7 +75,7 @@ package_binary() {
echo "Creating binary release tarball ${BIN_TGZ_FILE}"
- ${KYUUBI_DIR}/build/dist --tgz --spark-provided --flink-provided --hive-provided
+ ${KYUUBI_DIR}/build/dist --tgz --web-ui --spark-provided --flink-provided --hive-provided
cp "${BIN_TGZ_FILE}" "${RELEASE_DIR}"
diff --git a/build/release/release.sh b/build/release/release.sh
index 4afac3865..49fef9f8b 100755
--- a/build/release/release.sh
+++ b/build/release/release.sh
@@ -52,6 +52,21 @@ if [[ ${RELEASE_VERSION} =~ .*-SNAPSHOT ]]; then
exit 1
fi
+if [ -n "${JAVA_HOME}" ]; then
+ JAVA="${JAVA_HOME}/bin/java"
+elif [ "$(command -v java)" ]; then
+ JAVA="java"
+else
+ echo "JAVA_HOME is not set" >&2
+ exit 1
+fi
+
+JAVA_VERSION=$($JAVA -version 2>&1 | awk -F '"' '/version/ {print $2}')
+if [[ $JAVA_VERSION != 1.8.* ]]; then
+ echo "Unexpected Java version: $JAVA_VERSION. Java 8 is required for release."
+ exit 1
+fi
+
RELEASE_TAG="v${RELEASE_VERSION}-rc${RELEASE_RC_NO}"
SVN_STAGING_REPO="https://dist.apache.org/repos/dist/dev/kyuubi"
@@ -85,7 +100,7 @@ upload_svn_staging() {
svn add "${SVN_STAGING_DIR}/${RELEASE_TAG}"
- echo "Uploading release tarballs to ${SVN_STAGING_DIR}/${RELEASE_TAG}"
+ echo "Uploading release tarballs to ${SVN_STAGING_REPO}/${RELEASE_TAG}"
(
cd "${SVN_STAGING_DIR}" && \
svn commit --username "${ASF_USERNAME}" --password "${ASF_PASSWORD}" --message "Apache Kyuubi ${RELEASE_TAG}"
@@ -94,17 +109,34 @@ upload_svn_staging() {
}
upload_nexus_staging() {
- ${KYUUBI_DIR}/build/mvn clean deploy -DskipTests -Papache-release,flink-provided,spark-provided,hive-provided \
- -s "${KYUUBI_DIR}/build/release/asf-settings.xml"
+ # Spark Extension Plugin for Spark 3.1
${KYUUBI_DIR}/build/mvn clean deploy -DskipTests -Papache-release,flink-provided,spark-provided,hive-provided,spark-3.1 \
-s "${KYUUBI_DIR}/build/release/asf-settings.xml" \
-pl extensions/spark/kyuubi-extension-spark-3-1 -am
+
+ # Spark Extension Plugin for Spark 3.2
${KYUUBI_DIR}/build/mvn clean deploy -DskipTests -Papache-release,flink-provided,spark-provided,hive-provided,spark-3.2 \
-s "${KYUUBI_DIR}/build/release/asf-settings.xml" \
-pl extensions/spark/kyuubi-extension-spark-3-2 -am
+
+ # Spark Extension Plugin for Spark 3.3
${KYUUBI_DIR}/build/mvn clean deploy -DskipTests -Papache-release,flink-provided,spark-provided,hive-provided,spark-3.3 \
-s "${KYUUBI_DIR}/build/release/asf-settings.xml" \
-pl extensions/spark/kyuubi-extension-spark-3-3 -am
+
+ # Spark Extension Plugin for Spark 3.5
+ ${KYUUBI_DIR}/build/mvn clean deploy -DskipTests -Papache-release,flink-provided,spark-provided,hive-provided,spark-3.5 \
+ -s "${KYUUBI_DIR}/build/release/asf-settings.xml" \
+ -pl extensions/spark/kyuubi-extension-spark-3-5 -am
+
+ # Spark TPC-DS/TPC-H Connector built with default Spark version (3.4) and Scala 2.13
+ ${KYUUBI_DIR}/build/mvn clean deploy -DskipTests -Papache-release,flink-provided,spark-provided,hive-provided,spark-3.4,scala-2.13 \
+ -s "${KYUUBI_DIR}/build/release/asf-settings.xml" \
+ -pl extensions/spark/kyuubi-spark-connector-tpcds,extensions/spark/kyuubi-spark-connector-tpch -am
+
+ # All modules including Spark Extension Plugin and Connectors built with default Spark version (3.4) and default Scala version (2.12)
+ ${KYUUBI_DIR}/build/mvn clean deploy -DskipTests -Papache-release,flink-provided,spark-provided,hive-provided,spark-3.4 \
+ -s "${KYUUBI_DIR}/build/release/asf-settings.xml"
}
finalize_svn() {
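
The Java check added near the top of this script extracts the quoted version string from `java -version` output and requires it to start with `1.8.`. Below is a minimal sketch of those two steps, run against sample strings instead of a real JVM. The function names `parse_java_version` and `is_java8` and the sample version line are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Read `java -version`-style text on stdin, split on double quotes, and
# print the second field of the line that contains "version".
parse_java_version() {
  awk -F '"' '/version/ {print $2}'
}

# Succeeds only for Java 8 style version strings such as 1.8.0_392.
is_java8() {
  [[ "$1" == 1.8.* ]]
}

v=$(echo 'openjdk version "1.8.0_392"' | parse_java_version)
if is_java8 "$v"; then
  echo "Java 8 detected: $v"
fi
```

Java 9 and later report versions such as `11.0.21`, which do not match the `1.8.*` pattern, so the release script exits early on those JDKs.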
diff --git a/build/release/script/announce.sh b/build/release/script/announce.sh
old mode 100644
new mode 100755
diff --git a/build/release/script/dev_kyuubi_vote.sh b/build/release/script/dev_kyuubi_vote.sh
old mode 100644
new mode 100755
diff --git a/charts/kyuubi/Chart.yaml b/charts/kyuubi/Chart.yaml
index 6b377ecc5..56abc9edc 100644
--- a/charts/kyuubi/Chart.yaml
+++ b/charts/kyuubi/Chart.yaml
@@ -20,7 +20,7 @@ name: kyuubi
description: A Helm chart for Kyuubi server
type: application
version: 0.1.0
-appVersion: "master-snapshot"
+appVersion: 1.7.3
home: https://kyuubi.apache.org
icon: https://raw.githubusercontent.com/apache/kyuubi/master/docs/imgs/logo.png
sources:
diff --git a/charts/kyuubi/README.md b/charts/kyuubi/README.md
new file mode 100644
index 000000000..dfec578dd
--- /dev/null
+++ b/charts/kyuubi/README.md
@@ -0,0 +1,57 @@
+
+
+# Helm Chart for Apache Kyuubi
+
+[Apache Kyuubi](https://kyuubi.apache.org) is a distributed and multi-tenant gateway to provide serverless SQL on Data Warehouses and Lakehouses.
+
+
+## Introduction
+
+This chart will bootstrap a [Kyuubi](https://kyuubi.apache.org) deployment on a [Kubernetes](http://kubernetes.io)
+cluster using the [Helm](https://helm.sh) package manager.
+
+## Requirements
+
+- Kubernetes cluster
+- Helm 3.0+
+
+## Template rendering
+
+To test template rendering without actually installing anything, [Debugging templates](https://helm.sh/docs/chart_template_guide/debugging/) provides a quick way of viewing the generated content without YAML parse errors blocking.
+
+There are two ways to render templates. Both return the rendered template so you can see the output.
+
+- Render chart templates locally
+```shell
+helm template --debug ../kyuubi
+```
+- Render chart templates on the server side
+```shell
+helm install --dry-run --debug --generate-name ../kyuubi
+```
+
+
+## Documentation
+
+The Kyuubi configuration guide lives [on the website](https://kyuubi.readthedocs.io/en/master/configuration/settings.html#kyuubi-configurations). It applies to Kyuubi in general, not just to the Helm chart.
+
+## Contributing
+
+Want to help build Apache Kyuubi? Check out our [contributing documentation](https://kyuubi.readthedocs.io/en/master/community/CONTRIBUTING.html).
\ No newline at end of file
diff --git a/charts/kyuubi/templates/NOTES.txt b/charts/kyuubi/templates/NOTES.txt
index 44a35b6b7..2693f5ef6 100644
--- a/charts/kyuubi/templates/NOTES.txt
+++ b/charts/kyuubi/templates/NOTES.txt
@@ -1,21 +1,47 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
-Get kyuubi expose URL by running these commands:
- export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "kyuubi.fullname" . }}-nodeport)
- export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
- echo $NODE_IP:$NODE_PORT
\ No newline at end of file
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
+
+The chart has been installed!
+
+In order to check the release status, use:
+ helm status {{ .Release.Name }} -n {{ .Release.Namespace }}
+ or for more detailed info
+ helm get all {{ .Release.Name }} -n {{ .Release.Namespace }}
+
+************************
+******* Services *******
+************************
+{{- range $name, $frontend := .Values.server }}
+{{- if $frontend.enabled }}
+{{ $name | snakecase | upper }}:
+- To access {{ $.Release.Name }}-{{ $name | kebabcase }} service within the cluster, use the following URL:
+ {{ $.Release.Name }}-{{ $name | kebabcase }}.{{ $.Release.Namespace }}.svc.cluster.local
+{{- if $.Values.kyuubiConf.kyuubiDefaults }}
+{{- if regexMatch "(^|\\s)kyuubi.frontend.bind.host\\s*=?\\s*(localhost|127\\.0\\.0\\.1)($|\\s)" $.Values.kyuubiConf.kyuubiDefaults }}
+- To access {{ $.Release.Name }}-{{ $name | kebabcase }} service from outside the cluster for debugging, run the following command:
+ kubectl port-forward svc/{{ $.Release.Name }}-{{ $name | kebabcase }} {{ tpl $frontend.service.port $ }}:{{ tpl $frontend.service.port $ }} -n {{ $.Release.Namespace }}
+ and use 127.0.0.1:{{ tpl $frontend.service.port $ }}
+{{- end }}
+{{- end }}
+{{- if eq $frontend.service.type "NodePort" }}
+- To access {{ $.Release.Name }}-{{ $name | kebabcase }} service from outside the cluster through configured NodePort, run the following commands:
+ export NODE_PORT=$(kubectl get service {{ $.Release.Name }}-{{ $name | kebabcase }} -n {{ $.Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}")
+ export NODE_IP=$(kubectl get nodes -n {{ $.Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}")
+ echo http://$NODE_IP:$NODE_PORT
+{{- end }}
+{{- end }}
+{{- end }}
diff --git a/charts/kyuubi/templates/_helpers.tpl b/charts/kyuubi/templates/_helpers.tpl
index 684c1f354..502bf4646 100644
--- a/charts/kyuubi/templates/_helpers.tpl
+++ b/charts/kyuubi/templates/_helpers.tpl
@@ -16,33 +16,36 @@
*/}}
{{/*
-Expand the name of the chart.
+A comma separated string of enabled frontend protocols, e.g. "REST,THRIFT_BINARY".
+For details, see 'kyuubi.frontend.protocols': https://kyuubi.readthedocs.io/en/master/configuration/settings.html#frontend
*/}}
-{{- define "kyuubi.name" -}}
-{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
+{{- define "kyuubi.frontend.protocols" -}}
+ {{- $protocols := list }}
+ {{- range $name, $frontend := .Values.server }}
+ {{- if $frontend.enabled }}
+ {{- $protocols = $name | snakecase | upper | append $protocols }}
+ {{- end }}
+ {{- end }}
+ {{- if not $protocols }}
+ {{ fail "At least one frontend protocol must be enabled!" }}
+ {{- end }}
+ {{- $protocols | join "," }}
{{- end }}
{{/*
-Create a default fully qualified app name.
-We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
-If release name contains chart name it will be used as a full name.
+Selector labels
*/}}
-{{- define "kyuubi.fullname" -}}
-{{- if .Values.fullnameOverride }}
-{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }}
-{{- else }}
-{{- $name := default .Chart.Name .Values.nameOverride }}
-{{- if contains $name .Release.Name }}
-{{- .Release.Name | trunc 63 | trimSuffix "-" }}
-{{- else }}
-{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }}
-{{- end }}
-{{- end }}
-{{- end }}
+{{- define "kyuubi.selectorLabels" -}}
+app.kubernetes.io/name: {{ .Chart.Name }}
+app.kubernetes.io/instance: {{ .Release.Name }}
+{{- end -}}
{{/*
-Create chart name and version as used by the chart label.
+Common labels
*/}}
-{{- define "kyuubi.chart" -}}
-{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
-{{- end }}
\ No newline at end of file
+{{- define "kyuubi.labels" -}}
+helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
+{{ include "kyuubi.selectorLabels" . }}
+app.kubernetes.io/version: {{ .Values.image.tag | default .Chart.AppVersion | quote }}
+app.kubernetes.io/managed-by: {{ .Release.Service }}
+{{- end -}}
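
The `kyuubi.frontend.protocols` helper above turns the enabled keys under `.Values.server` (for example `thriftBinary`) into upper snake case names, joins them with commas, and fails when none is enabled. The shell sketch below mimics that transformation outside Helm. It is only an approximation: the `sed` expression is a simplified stand-in for the `snakecase` template function, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Approximate `snakecase | upper`: thriftBinary -> thrift_Binary -> THRIFT_BINARY.
to_protocol() {
  echo "$1" | sed -E 's/([a-z0-9])([A-Z])/\1_\2/g' | tr '[:lower:]' '[:upper:]'
}

# Join the converted names of all enabled frontends with commas.
# Arguments: the keys of the enabled frontends, in the order to emit them.
frontend_protocols() {
  local out="" name
  for name in "$@"; do
    out="${out:+$out,}$(to_protocol "$name")"
  done
  if [[ -z "$out" ]]; then
    # Mirrors the `fail` call in the template.
    echo "At least one frontend protocol must be enabled!" >&2
    return 1
  fi
  echo "$out"
}

frontend_protocols rest thriftBinary   # prints: REST,THRIFT_BINARY
```

The result is the value written to `kyuubi.frontend.protocols` in the rendered `kyuubi-defaults.conf`.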
diff --git a/charts/kyuubi/templates/kyuubi-alert.yaml b/charts/kyuubi/templates/kyuubi-alert.yaml
new file mode 100644
index 000000000..89fd11dc7
--- /dev/null
+++ b/charts/kyuubi/templates/kyuubi-alert.yaml
@@ -0,0 +1,28 @@
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
+
+{{- if and .Values.monitoring.prometheus.enabled (eq .Values.metricsReporters "PROMETHEUS") .Values.prometheusRule.enabled }}
+apiVersion: monitoring.coreos.com/v1
+kind: PrometheusRule
+metadata:
+ name: {{ .Release.Name }}
+ labels:
+ {{- include "kyuubi.labels" . | nindent 4 }}
+spec:
+ groups:
+ {{- toYaml .Values.prometheusRule.groups | nindent 4 }}
+{{- end }}
diff --git a/charts/kyuubi/templates/kyuubi-configmap.yaml b/charts/kyuubi/templates/kyuubi-configmap.yaml
index ada9e3dc8..62413567d 100644
--- a/charts/kyuubi/templates/kyuubi-configmap.yaml
+++ b/charts/kyuubi/templates/kyuubi-configmap.yaml
@@ -1,47 +1,51 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
apiVersion: v1
kind: ConfigMap
metadata:
name: {{ .Release.Name }}
labels:
- helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
- app.kubernetes.io/name: {{ .Chart.Name }}
- app.kubernetes.io/instance: {{ .Release.Name }}
- app.kubernetes.io/version: {{ .Values.image.tag | default .Chart.AppVersion | quote }}
- app.kubernetes.io/managed-by: {{ .Release.Service }}
+ {{- include "kyuubi.labels" . | nindent 4 }}
data:
- {{- with .Values.server.conf.kyuubiEnv }}
+ {{- with .Values.kyuubiConf.kyuubiEnv }}
kyuubi-env.sh: |
#!/usr/bin/env bash
{{- tpl . $ | nindent 4 }}
{{- end }}
kyuubi-defaults.conf: |
## Helm chart provided Kyuubi configurations
- kyuubi.frontend.bind.host={{ .Values.server.bind.host }}
- kyuubi.frontend.bind.port={{ .Values.server.bind.port }}
kyuubi.kubernetes.namespace={{ .Release.Namespace }}
+ kyuubi.frontend.connection.url.use.hostname=false
+ kyuubi.frontend.thrift.binary.bind.port={{ .Values.server.thriftBinary.port }}
+ kyuubi.frontend.thrift.http.bind.port={{ .Values.server.thriftHttp.port }}
+ kyuubi.frontend.rest.bind.port={{ .Values.server.rest.port }}
+ kyuubi.frontend.mysql.bind.port={{ .Values.server.mysql.port }}
+ kyuubi.frontend.protocols={{ include "kyuubi.frontend.protocols" . }}
+
+ # Kyuubi Metrics
+ kyuubi.metrics.enabled={{ .Values.monitoring.prometheus.enabled }}
+ kyuubi.metrics.reporters={{ .Values.metricsReporters }}
## User provided Kyuubi configurations
- {{- with .Values.server.conf.kyuubiDefaults }}
- {{- tpl . $ | nindent 4 }}
+ {{- with .Values.kyuubiConf.kyuubiDefaults }}
+ {{- tpl . $ | nindent 4 }}
{{- end }}
- {{- with .Values.server.conf.log4j2 }}
+ {{- with .Values.kyuubiConf.log4j2 }}
log4j2.xml: |
{{- tpl . $ | nindent 4 }}
{{- end }}
diff --git a/charts/kyuubi/templates/kyuubi-deployment.yaml b/charts/kyuubi/templates/kyuubi-deployment.yaml
deleted file mode 100644
index 941fdf164..000000000
--- a/charts/kyuubi/templates/kyuubi-deployment.yaml
+++ /dev/null
@@ -1,113 +0,0 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-apiVersion: apps/v1
-kind: Deployment
-metadata:
- name: {{ .Release.Name }}
- labels:
- helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
- app.kubernetes.io/name: {{ .Chart.Name }}
- app.kubernetes.io/instance: {{ .Release.Name }}
- app.kubernetes.io/version: {{ .Values.image.tag | default .Chart.AppVersion | quote }}
- app.kubernetes.io/managed-by: {{ .Release.Service }}
-spec:
- replicas: {{ .Values.replicaCount }}
- selector:
- matchLabels:
- app.kubernetes.io/name: {{ .Chart.Name }}
- app.kubernetes.io/instance: {{ .Release.Name }}
- template:
- metadata:
- labels:
- app.kubernetes.io/name: {{ .Chart.Name }}
- app.kubernetes.io/instance: {{ .Release.Name }}
- annotations:
- checksum/conf: {{ include (print $.Template.BasePath "/kyuubi-configmap.yaml") . | sha256sum }}
- spec:
- {{- with .Values.imagePullSecrets }}
- imagePullSecrets: {{- toYaml . | nindent 8 }}
- {{- end }}
- serviceAccountName: {{ .Values.serviceAccount.name | default .Release.Name }}
- {{- with .Values.initContainers }}
- initContainers: {{- tpl (toYaml .) $ | nindent 8 }}
- {{- end }}
- containers:
- - name: kyuubi-server
- image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
- imagePullPolicy: {{ .Values.image.pullPolicy }}
- {{- with .Values.env }}
- env: {{- tpl (toYaml .) $ | nindent 12 }}
- {{- end }}
- {{- with .Values.envFrom }}
- envFrom: {{- tpl (toYaml .) $ | nindent 12 }}
- {{- end }}
- ports:
- - name: frontend-port
- containerPort: {{ .Values.server.bind.port }}
- protocol: TCP
- {{- if .Values.probe.liveness.enabled }}
- livenessProbe:
- tcpSocket:
- port: {{ .Values.server.bind.port }}
- initialDelaySeconds: {{ .Values.probe.liveness.initialDelaySeconds }}
- periodSeconds: {{ .Values.probe.liveness.periodSeconds }}
- timeoutSeconds: {{ .Values.probe.liveness.timeoutSeconds }}
- failureThreshold: {{ .Values.probe.liveness.failureThreshold }}
- successThreshold: {{ .Values.probe.liveness.successThreshold }}
- {{- end }}
- {{- if .Values.probe.readiness.enabled }}
- readinessProbe:
- tcpSocket:
- port: {{ .Values.server.bind.port }}
- initialDelaySeconds: {{ .Values.probe.readiness.initialDelaySeconds }}
- periodSeconds: {{ .Values.probe.readiness.periodSeconds }}
- timeoutSeconds: {{ .Values.probe.readiness.timeoutSeconds }}
- failureThreshold: {{ .Values.probe.readiness.failureThreshold }}
- successThreshold: {{ .Values.probe.readiness.successThreshold }}
- {{- end }}
- {{- with .Values.resources }}
- resources: {{- toYaml . | nindent 12 }}
- {{- end }}
- volumeMounts:
- - name: conf
- mountPath: {{ .Values.server.confDir }}
- {{- with .Values.volumeMounts }}
- {{- tpl (toYaml .) $ | nindent 12 }}
- {{- end }}
- {{- with .Values.containers }}
- {{- tpl (toYaml .) $ | nindent 8 }}
- {{- end }}
- volumes:
- - name: conf
- configMap:
- name: {{ .Release.Name }}
- {{- with .Values.volumes }}
- {{- tpl (toYaml .) $ | nindent 8 }}
- {{- end }}
- {{- with .Values.nodeSelector }}
- nodeSelector: {{- toYaml . | nindent 8 }}
- {{- end }}
- {{- with .Values.affinity }}
- affinity: {{- toYaml . | nindent 8 }}
- {{- end }}
- {{- with .Values.tolerations }}
- tolerations: {{- toYaml . | nindent 8 }}
- {{- end }}
- {{- with .Values.securityContext }}
- securityContext: {{- toYaml . | nindent 8 }}
- {{- end }}
diff --git a/charts/kyuubi/templates/kyuubi-headless-service.yaml b/charts/kyuubi/templates/kyuubi-headless-service.yaml
new file mode 100644
index 000000000..fa04ffeef
--- /dev/null
+++ b/charts/kyuubi/templates/kyuubi-headless-service.yaml
@@ -0,0 +1,40 @@
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
+
+apiVersion: v1
+kind: Service
+metadata:
+ name: {{ .Release.Name }}-headless
+ labels:
+ {{- include "kyuubi.labels" $ | nindent 4 }}
+spec:
+ type: ClusterIP
+ clusterIP: None
+ ports:
+ {{- range $name, $frontend := .Values.server }}
+ - name: {{ $name | kebabcase }}
+ port: {{ tpl $frontend.service.port $ }}
+ targetPort: {{ $frontend.port }}
+ {{- end }}
+ {{- if .Values.monitoring.prometheus.enabled }}
+ - name: prometheus
+ port: {{ .Values.monitoring.prometheus.port }}
+ targetPort: {{ .Values.monitoring.prometheus.port }}
+ {{- end }}
+ selector:
+ {{- include "kyuubi.selectorLabels" $ | nindent 4 }}
+
diff --git a/charts/kyuubi/templates/kyuubi-podmonitor.yaml b/charts/kyuubi/templates/kyuubi-podmonitor.yaml
new file mode 100644
index 000000000..458ff66ed
--- /dev/null
+++ b/charts/kyuubi/templates/kyuubi-podmonitor.yaml
@@ -0,0 +1,31 @@
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
+
+{{- if and .Values.monitoring.prometheus.enabled (eq .Values.metricsReporters "PROMETHEUS") .Values.podMonitor.enabled }}
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+ name: {{ .Release.Name }}
+ labels:
+ {{- include "kyuubi.labels" . | nindent 4 }}
+spec:
+ selector:
+ matchLabels:
+ {{- include "kyuubi.selectorLabels" . | nindent 6 }}
+ podMetricsEndpoints:
+ {{- toYaml .Values.podMonitor.podMetricsEndpoint | nindent 4 }}
+{{- end }}
diff --git a/charts/kyuubi/templates/kyuubi-priorityclass.yaml b/charts/kyuubi/templates/kyuubi-priorityclass.yaml
new file mode 100644
index 000000000..c756108ae
--- /dev/null
+++ b/charts/kyuubi/templates/kyuubi-priorityclass.yaml
@@ -0,0 +1,26 @@
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
+
+{{- if .Values.priorityClass.create }}
+apiVersion: scheduling.k8s.io/v1
+kind: PriorityClass
+metadata:
+ name: {{ .Values.priorityClass.name | default .Release.Name }}
+ labels:
+ {{- include "kyuubi.labels" . | nindent 4 }}
+value: {{ .Values.priorityClass.value }}
+{{- end }}
diff --git a/charts/kyuubi/templates/kyuubi-role.yaml b/charts/kyuubi/templates/kyuubi-role.yaml
index fcb5a9f6e..5ee8c1dff 100644
--- a/charts/kyuubi/templates/kyuubi-role.yaml
+++ b/charts/kyuubi/templates/kyuubi-role.yaml
@@ -1,19 +1,19 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
{{- if .Values.rbac.create }}
apiVersion: rbac.authorization.k8s.io/v1
@@ -21,10 +21,6 @@ kind: Role
metadata:
name: {{ .Release.Name }}
labels:
- helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
- app.kubernetes.io/name: {{ .Chart.Name }}
- app.kubernetes.io/instance: {{ .Release.Name }}
- app.kubernetes.io/version: {{ .Values.image.tag | default .Chart.AppVersion | quote }}
- app.kubernetes.io/managed-by: {{ .Release.Service }}
+ {{- include "kyuubi.labels" . | nindent 4 }}
rules: {{- toYaml .Values.rbac.rules | nindent 2 }}
{{- end }}
diff --git a/charts/kyuubi/templates/kyuubi-rolebinding.yaml b/charts/kyuubi/templates/kyuubi-rolebinding.yaml
index 8f74efc2d..0f9dbd049 100644
--- a/charts/kyuubi/templates/kyuubi-rolebinding.yaml
+++ b/charts/kyuubi/templates/kyuubi-rolebinding.yaml
@@ -1,19 +1,19 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
{{- if .Values.rbac.create }}
apiVersion: rbac.authorization.k8s.io/v1
@@ -21,11 +21,7 @@ kind: RoleBinding
metadata:
name: {{ .Release.Name }}
labels:
- helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
- app.kubernetes.io/name: {{ .Chart.Name }}
- app.kubernetes.io/instance: {{ .Release.Name }}
- app.kubernetes.io/version: {{ .Values.image.tag | default .Chart.AppVersion | quote }}
- app.kubernetes.io/managed-by: {{ .Release.Service }}
+ {{- include "kyuubi.labels" . | nindent 4 }}
subjects:
- kind: ServiceAccount
name: {{ .Values.serviceAccount.name | default .Release.Name }}
diff --git a/charts/kyuubi/templates/kyuubi-service.yaml b/charts/kyuubi/templates/kyuubi-service.yaml
index 0152bd23d..64c8b06ac 100644
--- a/charts/kyuubi/templates/kyuubi-service.yaml
+++ b/charts/kyuubi/templates/kyuubi-service.yaml
@@ -1,41 +1,42 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
+
+{{- range $name, $frontend := .Values.server }}
+{{- if $frontend.enabled }}
apiVersion: v1
kind: Service
metadata:
- name: {{ .Release.Name }}
+ name: {{ $.Release.Name }}-{{ $name | kebabcase }}
labels:
- helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
- app.kubernetes.io/name: {{ .Chart.Name }}
- app.kubernetes.io/instance: {{ .Release.Name }}
- app.kubernetes.io/version: {{ .Values.image.tag | default .Chart.AppVersion | quote }}
- app.kubernetes.io/managed-by: {{ .Release.Service }}
- {{- with .Values.service.annotations }}
- annotations:
- {{- toYaml . | nindent 4 }}
+ {{- include "kyuubi.labels" $ | nindent 4 }}
+ {{- with $frontend.service.annotations }}
+ annotations: {{- toYaml . | nindent 4 }}
{{- end }}
spec:
+ type: {{ $frontend.service.type }}
ports:
- - name: http
- nodePort: {{ .Values.service.port }}
- port: {{ .Values.server.bind.port }}
- protocol: TCP
- type: {{ .Values.service.type }}
+ - name: {{ $name | kebabcase }}
+ port: {{ tpl $frontend.service.port $ }}
+ targetPort: {{ $frontend.port }}
+ {{- if and (eq $frontend.service.type "NodePort") ($frontend.service.nodePort) }}
+ nodePort: {{ $frontend.service.nodePort }}
+ {{- end }}
selector:
- app.kubernetes.io/name: {{ .Chart.Name }}
- app.kubernetes.io/instance: {{ .Release.Name }}
+ {{- include "kyuubi.selectorLabels" $ | nindent 4 }}
+---
+{{- end }}
+{{- end }}
diff --git a/charts/kyuubi/templates/kyuubi-serviceaccount.yaml b/charts/kyuubi/templates/kyuubi-serviceaccount.yaml
index 770d50136..a8e282a1f 100644
--- a/charts/kyuubi/templates/kyuubi-serviceaccount.yaml
+++ b/charts/kyuubi/templates/kyuubi-serviceaccount.yaml
@@ -1,19 +1,19 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
{{- if .Values.serviceAccount.create }}
apiVersion: v1
@@ -21,9 +21,5 @@ kind: ServiceAccount
metadata:
name: {{ .Values.serviceAccount.name | default .Release.Name }}
labels:
- helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
- app.kubernetes.io/name: {{ .Chart.Name }}
- app.kubernetes.io/instance: {{ .Release.Name }}
- app.kubernetes.io/version: {{ .Values.image.tag | default .Chart.AppVersion | quote }}
- app.kubernetes.io/managed-by: {{ .Release.Service }}
+ {{- include "kyuubi.labels" . | nindent 4 }}
{{- end }}
diff --git a/charts/kyuubi/templates/kyuubi-servicemonitor.yaml b/charts/kyuubi/templates/kyuubi-servicemonitor.yaml
new file mode 100644
index 000000000..11098a0ea
--- /dev/null
+++ b/charts/kyuubi/templates/kyuubi-servicemonitor.yaml
@@ -0,0 +1,31 @@
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
+
+{{- if and .Values.monitoring.prometheus.enabled (eq .Values.metricsReporters "PROMETHEUS") .Values.serviceMonitor.enabled }}
+apiVersion: monitoring.coreos.com/v1
+kind: ServiceMonitor
+metadata:
+ name: {{ .Release.Name }}
+ labels:
+ {{- include "kyuubi.labels" . | nindent 4 }}
+spec:
+ selector:
+ matchLabels:
+ {{- include "kyuubi.selectorLabels" . | nindent 6 }}
+ endpoints:
+ {{- toYaml .Values.serviceMonitor.endpoints | nindent 4 }}
+{{- end }}
diff --git a/charts/kyuubi/templates/kyuubi-statefulset.yaml b/charts/kyuubi/templates/kyuubi-statefulset.yaml
new file mode 100644
index 000000000..309ef8ec9
--- /dev/null
+++ b/charts/kyuubi/templates/kyuubi-statefulset.yaml
@@ -0,0 +1,132 @@
+{{/*
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+*/}}
+
+apiVersion: apps/v1
+kind: StatefulSet
+metadata:
+ name: {{ .Release.Name }}
+ labels:
+ {{- include "kyuubi.labels" . | nindent 4 }}
+spec:
+ selector:
+ matchLabels:
+ {{- include "kyuubi.selectorLabels" . | nindent 6 }}
+ serviceName: {{ .Release.Name }}-headless
+ minReadySeconds: {{ .Values.minReadySeconds }}
+ replicas: {{ .Values.replicaCount }}
+ revisionHistoryLimit: {{ .Values.revisionHistoryLimit }}
+ podManagementPolicy: {{ .Values.podManagementPolicy }}
+ {{- with .Values.updateStrategy }}
+ updateStrategy: {{- toYaml . | nindent 4 }}
+ {{- end }}
+ template:
+ metadata:
+ labels:
+ {{- include "kyuubi.selectorLabels" . | nindent 8 }}
+ annotations:
+ checksum/conf: {{ include (print $.Template.BasePath "/kyuubi-configmap.yaml") . | sha256sum }}
+ spec:
+ {{- with .Values.imagePullSecrets }}
+ imagePullSecrets: {{- toYaml . | nindent 8 }}
+ {{- end }}
+ {{- if or .Values.serviceAccount.name .Values.serviceAccount.create }}
+ serviceAccountName: {{ .Values.serviceAccount.name | default .Release.Name }}
+ {{- end }}
+ {{- if or .Values.priorityClass.name .Values.priorityClass.create }}
+ priorityClassName: {{ .Values.priorityClass.name | default .Release.Name }}
+ {{- end }}
+ {{- with .Values.initContainers }}
+ initContainers: {{- tpl (toYaml .) $ | nindent 8 }}
+ {{- end }}
+ containers:
+ - name: kyuubi-server
+ image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
+ imagePullPolicy: {{ .Values.image.pullPolicy }}
+ {{- with .Values.command }}
+ command: {{- tpl (toYaml .) $ | nindent 12 }}
+ {{- end }}
+ {{- with .Values.args }}
+ args: {{- tpl (toYaml .) $ | nindent 12 }}
+ {{- end }}
+ {{- with .Values.env }}
+ env: {{- tpl (toYaml .) $ | nindent 12 }}
+ {{- end }}
+ {{- with .Values.envFrom }}
+ envFrom: {{- tpl (toYaml .) $ | nindent 12 }}
+ {{- end }}
+ ports:
+ {{- range $name, $frontend := .Values.server }}
+ {{- if $frontend.enabled }}
+ - name: {{ $name | kebabcase }}
+ containerPort: {{ $frontend.port }}
+ {{- end }}
+ {{- end }}
+ {{- if .Values.monitoring.prometheus.enabled }}
+ - name: prometheus
+ containerPort: {{ .Values.monitoring.prometheus.port }}
+ {{- end }}
+ {{- if .Values.livenessProbe.enabled }}
+ livenessProbe:
+ exec:
+ command: ["/bin/bash", "-c", "$KYUUBI_HOME/bin/kyuubi status"]
+ initialDelaySeconds: {{ .Values.livenessProbe.initialDelaySeconds }}
+ periodSeconds: {{ .Values.livenessProbe.periodSeconds }}
+ timeoutSeconds: {{ .Values.livenessProbe.timeoutSeconds }}
+ failureThreshold: {{ .Values.livenessProbe.failureThreshold }}
+ successThreshold: {{ .Values.livenessProbe.successThreshold }}
+ {{- end }}
+ {{- if .Values.readinessProbe.enabled }}
+ readinessProbe:
+ exec:
+ command: ["/bin/bash", "-c", "$KYUUBI_HOME/bin/kyuubi status"]
+ initialDelaySeconds: {{ .Values.readinessProbe.initialDelaySeconds }}
+ periodSeconds: {{ .Values.readinessProbe.periodSeconds }}
+ timeoutSeconds: {{ .Values.readinessProbe.timeoutSeconds }}
+ failureThreshold: {{ .Values.readinessProbe.failureThreshold }}
+ successThreshold: {{ .Values.readinessProbe.successThreshold }}
+ {{- end }}
+ {{- with .Values.resources }}
+ resources: {{- toYaml . | nindent 12 }}
+ {{- end }}
+ volumeMounts:
+ - name: conf
+ mountPath: {{ .Values.kyuubiConfDir }}
+ {{- with .Values.volumeMounts }}
+ {{- tpl (toYaml .) $ | nindent 12 }}
+ {{- end }}
+ {{- with .Values.containers }}
+ {{- tpl (toYaml .) $ | nindent 8 }}
+ {{- end }}
+ volumes:
+ - name: conf
+ configMap:
+ name: {{ .Release.Name }}
+ {{- with .Values.volumes }}
+ {{- tpl (toYaml .) $ | nindent 8 }}
+ {{- end }}
+ {{- with .Values.nodeSelector }}
+ nodeSelector: {{- toYaml . | nindent 8 }}
+ {{- end }}
+ {{- with .Values.affinity }}
+ affinity: {{- toYaml . | nindent 8 }}
+ {{- end }}
+ {{- with .Values.tolerations }}
+ tolerations: {{- toYaml . | nindent 8 }}
+ {{- end }}
+ {{- with .Values.securityContext }}
+ securityContext: {{- toYaml . | nindent 8 }}
+ {{- end }}
diff --git a/charts/kyuubi/values.yaml b/charts/kyuubi/values.yaml
index 22ae9d5a9..faa854b10 100644
--- a/charts/kyuubi/values.yaml
+++ b/charts/kyuubi/values.yaml
@@ -22,61 +22,143 @@
# Kyuubi server numbers
replicaCount: 2
+# Controls how Kyuubi server pods are created during initial scale-up,
+# when replacing pods on nodes, or when scaling down.
+# The default policy is `OrderedReady`; the alternative policy is `Parallel`.
+podManagementPolicy: OrderedReady
+
+# Minimum number of seconds for which a newly created Kyuubi server pod
+# should be ready without any of its containers crashing for it to be considered available.
+minReadySeconds: 30
+
+# Maximum number of revisions that will be maintained in the StatefulSet's revision history.
+revisionHistoryLimit: 10
+
+# Indicates the StatefulSetUpdateStrategy that will be employed to update Kyuubi server pods in the StatefulSet
+# when a revision is made to the pod template.
+updateStrategy:
+ type: RollingUpdate
+ rollingUpdate:
+ maxUnavailable: 1
+ partition: 0
+
image:
repository: apache/kyuubi
- pullPolicy: Always
+ pullPolicy: IfNotPresent
tag: ~
imagePullSecrets: []
-# ServiceAccount used for Kyuubi create/list/delete pod in kubernetes
+# ServiceAccount used by Kyuubi to create/list/delete pods in Kubernetes
serviceAccount:
+ # Specifies whether a ServiceAccount should be created
create: true
+ # Specifies ServiceAccount name to be used (created if `create: true`)
+ name: ~
+
+# PriorityClass used for Kyuubi server pods
+priorityClass:
+ # Specifies whether a PriorityClass should be created
+ create: false
+ # Specifies PriorityClass name to be used (created if `create: true`)
name: ~
+ # Half of the system-cluster-critical value (2000000000) by default
+ value: 1000000000
+# Role-based access control
rbac:
+ # Specifies whether RBAC resources should be created
create: true
+ # RBAC rules
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["create", "list", "delete"]
-probe:
- liveness:
- enabled: true
- initialDelaySeconds: 30
- periodSeconds: 10
- timeoutSeconds: 2
- failureThreshold: 10
- successThreshold: 1
- readiness:
- enabled: true
- initialDelaySeconds: 30
- periodSeconds: 10
- timeoutSeconds: 2
- failureThreshold: 10
- successThreshold: 1
-
server:
- bind:
- host: 0.0.0.0
+ # Thrift Binary protocol (HiveServer2 compatible)
+ thriftBinary:
+ enabled: true
port: 10009
- confDir: /opt/kyuubi/conf
- conf:
- # The value (templated string) is used for kyuubi-env.sh file
- # See https://kyuubi.apache.org/docs/latest/deployment/settings.html#environments for more details
- kyuubiEnv: ~
-
- # The value (templated string) is used for kyuubi-defaults.conf file
- # See https://kyuubi.apache.org/docs/latest/deployment/settings.html#kyuubi-configurations for more details
- kyuubiDefaults: ~
-
- # The value (templated string) is used for log4j2.xml file
- # See https://kyuubi.apache.org/docs/latest/deployment/settings.html#logging for more details
- log4j2: ~
+ service:
+ type: ClusterIP
+ port: "{{ .Values.server.thriftBinary.port }}"
+ nodePort: ~
+ annotations: {}
+
+ # Thrift HTTP protocol (HiveServer2 compatible)
+ thriftHttp:
+ enabled: false
+ port: 10010
+ service:
+ type: ClusterIP
+ port: "{{ .Values.server.thriftHttp.port }}"
+ nodePort: ~
+ annotations: {}
+
+ # REST API protocol (experimental)
+ rest:
+ enabled: true
+ port: 10099
+ service:
+ type: ClusterIP
+ port: "{{ .Values.server.rest.port }}"
+ nodePort: ~
+ annotations: {}
+
+ # MySQL compatible text protocol (experimental)
+ mysql:
+ enabled: false
+ port: 3309
+ service:
+ type: ClusterIP
+ port: "{{ .Values.server.mysql.port }}"
+ nodePort: ~
+ annotations: {}
+
+monitoring:
+ # Exposes metrics in Prometheus format
+ prometheus:
+ enabled: true
+ port: 10019
+
+# $KYUUBI_CONF_DIR directory
+kyuubiConfDir: /opt/kyuubi/conf
+# Kyuubi configuration files
+kyuubiConf:
+ # The value (templated string) is used for kyuubi-env.sh file
+ # See example at conf/kyuubi-env.sh.template and https://kyuubi.readthedocs.io/en/master/configuration/settings.html#environments for more details
+ kyuubiEnv: ~
+ # kyuubiEnv: |
+ # export JAVA_HOME=/usr/jdk64/jdk1.8.0_152
+ # export SPARK_HOME=/opt/spark
+ # export FLINK_HOME=/opt/flink
+ # export HIVE_HOME=/opt/hive
+
+ # The value (templated string) is used for kyuubi-defaults.conf file
+ # See https://kyuubi.readthedocs.io/en/master/configuration/settings.html#kyuubi-configurations for more details
+ kyuubiDefaults: ~
+ # kyuubiDefaults: |
+ # kyuubi.authentication=NONE
+ # kyuubi.frontend.bind.host=10.0.0.1
+ # kyuubi.engine.type=SPARK_SQL
+ # kyuubi.engine.share.level=USER
+ # kyuubi.session.engine.initialize.timeout=PT3M
+ # kyuubi.ha.addresses=zk1:2181,zk2:2181,zk3:2181
+ # kyuubi.ha.namespace=kyuubi
+
+ # The value (templated string) is used for log4j2.xml file
+ # See example at conf/log4j2.xml.template and https://kyuubi.readthedocs.io/en/master/configuration/settings.html#logging for more details
+ log4j2: ~
+
+# Command to launch Kyuubi server (templated)
+command: ~
+# Arguments to launch Kyuubi server (templated)
+args: ~
# Environment variables (templated)
env: []
+# Environment variables from ConfigMaps and Secrets (templated)
envFrom: []
# Additional volumes for Kyuubi pod (templated)
@@ -89,30 +171,67 @@ initContainers: []
# Additional containers for Kyuubi pod (templated)
containers: []
-service:
- type: NodePort
- # The default port limit of kubernetes is 30000-32767
- # to change:
- # vim kube-apiserver.yaml (usually under path: /etc/kubernetes/manifests/)
- # add or change line 'service-node-port-range=1-32767' under kube-apiserver
- port: 30009
- annotations: {}
-
+# Resource requests and limits for Kyuubi pods
resources: {}
- # Used to specify resource, default unlimited.
- # If you do want to specify resources:
- # 1. remove the curly braces after 'resources:'
- # 2. uncomment the following lines
- # limits:
- # cpu: 4
- # memory: 10Gi
- # requests:
- # cpu: 2
- # memory: 4Gi
-
-# Constrain Kyuubi server pods to specific nodes
+# resources:
+# requests:
+# cpu: 2
+# memory: 4Gi
+# limits:
+# cpu: 4
+# memory: 10Gi
+
+# Liveness probe
+livenessProbe:
+ enabled: true
+ initialDelaySeconds: 30
+ periodSeconds: 10
+ timeoutSeconds: 2
+ failureThreshold: 10
+ successThreshold: 1
+
+# Readiness probe
+readinessProbe:
+ enabled: true
+ initialDelaySeconds: 30
+ periodSeconds: 10
+ timeoutSeconds: 2
+ failureThreshold: 10
+ successThreshold: 1
+
+# Constrain Kyuubi pods to nodes with specific node labels
nodeSelector: {}
+# Allow Kyuubi pods to be scheduled on nodes with matching taints
tolerations: []
+# Constrain Kyuubi pods to nodes by complex affinity/anti-affinity rules
affinity: {}
+# Kyuubi pods security context
securityContext: {}
+
+# Monitoring Kyuubi - Server Metrics
+# PROMETHEUS - PrometheusReporter which exposes metrics in Prometheus format
+metricsReporters: ~
+
+# Prometheus pod monitor
+podMonitor:
+ # If enabled, a PodMonitor resource for Prometheus Operator is created
+ enabled: false
+ # The podMetricsEndpoint contains metrics information such as port, interval, scheme, and possibly other relevant details.
+ # This information is used to configure the endpoint from which Prometheus can scrape and collect metrics for a specific Pod in Kubernetes.
+ podMetricsEndpoint: []
+
+# Prometheus service monitor
+serviceMonitor:
+ # If enabled, ServiceMonitor resources for Prometheus Operator are created
+ enabled: false
+ # The endpoints section in a ServiceMonitor specifies the metrics information for each target endpoint.
+ # This allows you to collect metrics from multiple Services across your Kubernetes cluster in a standardized and automated way.
+ endpoints: []
+
+# Rules for the Prometheus Operator
+prometheusRule:
+ # If enabled, a PrometheusRule resource for Prometheus Operator is created
+ enabled: false
+ # Contents of Prometheus rules file
+ groups: []
diff --git a/codecov.yml b/codecov.yml
index 6267ea380..1be776f58 100644
--- a/codecov.yml
+++ b/codecov.yml
@@ -16,4 +16,11 @@
#
codecov:
- token: b624e642-b0c8-4d45-94a1-a370888435bb
+ token: 5115fd3e-2ef2-40ed-b012-376a2afdc382
+
+coverage:
+ status:
+ project:
+ default:
+ target: auto # auto compares coverage to the previous base commit
+ threshold: 2% # allows coverage to drop by up to 2% relative to the previous base commit
diff --git a/conf/kyuubi-defaults.conf.template b/conf/kyuubi-defaults.conf.template
index d3e6026d9..eef36ad10 100644
--- a/conf/kyuubi-defaults.conf.template
+++ b/conf/kyuubi-defaults.conf.template
@@ -18,9 +18,19 @@
## Kyuubi Configurations
#
-# kyuubi.authentication NONE
-# kyuubi.frontend.bind.host localhost
-# kyuubi.frontend.bind.port 10009
+# kyuubi.authentication NONE
+#
+# kyuubi.frontend.bind.host 10.0.0.1
+# kyuubi.frontend.protocols THRIFT_BINARY,REST
+# kyuubi.frontend.thrift.binary.bind.port 10009
+# kyuubi.frontend.rest.bind.port 10099
+#
+# kyuubi.engine.type SPARK_SQL
+# kyuubi.engine.share.level USER
+# kyuubi.session.engine.initialize.timeout PT3M
+#
+# kyuubi.ha.addresses zk1:2181,zk2:2181,zk3:2181
+# kyuubi.ha.namespace kyuubi
#
-# Details in https://kyuubi.readthedocs.io/en/master/deployment/settings.html
+# Details in https://kyuubi.readthedocs.io/en/master/configuration/settings.html
diff --git a/conf/log4j2.xml.template b/conf/log4j2.xml.template
index 37fc8acf0..215fddf47 100644
--- a/conf/log4j2.xml.template
+++ b/conf/log4j2.xml.template
@@ -21,19 +21,30 @@
Set to debug or trace if log4j initialization is failing. -->
+ ${env:KYUUBI_LOG_DIR}rest-audit.logrest-audit-%d{yyyy-MM-dd}-%i.log
+ k8s-audit.log
+ k8s-audit-%d{yyyy-MM-dd}-%i.log
-
+
-
-
+
+
+
+
+
+
+
+
@@ -58,5 +69,8 @@
+
+
+
diff --git a/dev/dependencyList b/dev/dependencyList
index 9b8064e42..ede67c961 100644
--- a/dev/dependencyList
+++ b/dev/dependencyList
@@ -16,38 +16,40 @@
#
HikariCP/4.0.3//HikariCP-4.0.3.jar
+ST4/4.3.4//ST4-4.3.4.jar
animal-sniffer-annotations/1.21//animal-sniffer-annotations-1.21.jar
annotations/4.1.1.4//annotations-4.1.1.4.jar
+antlr-runtime/3.5.3//antlr-runtime-3.5.3.jar
antlr4-runtime/4.9.3//antlr4-runtime-4.9.3.jar
aopalliance-repackaged/2.6.1//aopalliance-repackaged-2.6.1.jar
-automaton/1.11-8//automaton-1.11-8.jar
+arrow-format/12.0.0//arrow-format-12.0.0.jar
+arrow-memory-core/12.0.0//arrow-memory-core-12.0.0.jar
+arrow-memory-netty/12.0.0//arrow-memory-netty-12.0.0.jar
+arrow-vector/12.0.0//arrow-vector-12.0.0.jar
classgraph/4.8.138//classgraph-4.8.138.jar
commons-codec/1.15//commons-codec-1.15.jar
commons-collections/3.2.2//commons-collections-3.2.2.jar
commons-lang/2.6//commons-lang-2.6.jar
-commons-lang3/3.12.0//commons-lang3-3.12.0.jar
+commons-lang3/3.13.0//commons-lang3-3.13.0.jar
commons-logging/1.1.3//commons-logging-1.1.3.jar
-curator-client/2.12.0//curator-client-2.12.0.jar
-curator-framework/2.12.0//curator-framework-2.12.0.jar
-curator-recipes/2.12.0//curator-recipes-2.12.0.jar
derby/10.14.2.0//derby-10.14.2.0.jar
error_prone_annotations/2.14.0//error_prone_annotations-2.14.0.jar
failsafe/2.4.4//failsafe-2.4.4.jar
failureaccess/1.0.1//failureaccess-1.0.1.jar
+flatbuffers-java/1.12.0//flatbuffers-java-1.12.0.jar
fliptables/1.0.2//fliptables-1.0.2.jar
-generex/1.0.2//generex-1.0.2.jar
-grpc-api/1.48.0//grpc-api-1.48.0.jar
-grpc-context/1.48.0//grpc-context-1.48.0.jar
-grpc-core/1.48.0//grpc-core-1.48.0.jar
-grpc-grpclb/1.48.0//grpc-grpclb-1.48.0.jar
-grpc-netty/1.48.0//grpc-netty-1.48.0.jar
-grpc-protobuf-lite/1.48.0//grpc-protobuf-lite-1.48.0.jar
-grpc-protobuf/1.48.0//grpc-protobuf-1.48.0.jar
-grpc-stub/1.48.0//grpc-stub-1.48.0.jar
+grpc-api/1.53.0//grpc-api-1.53.0.jar
+grpc-context/1.53.0//grpc-context-1.53.0.jar
+grpc-core/1.53.0//grpc-core-1.53.0.jar
+grpc-grpclb/1.53.0//grpc-grpclb-1.53.0.jar
+grpc-netty/1.53.0//grpc-netty-1.53.0.jar
+grpc-protobuf-lite/1.53.0//grpc-protobuf-lite-1.53.0.jar
+grpc-protobuf/1.53.0//grpc-protobuf-1.53.0.jar
+grpc-stub/1.53.0//grpc-stub-1.53.0.jar
gson/2.9.0//gson-2.9.0.jar
-guava/31.1-jre//guava-31.1-jre.jar
-hadoop-client-api/3.3.4//hadoop-client-api-3.3.4.jar
-hadoop-client-runtime/3.3.4//hadoop-client-runtime-3.3.4.jar
+guava/32.0.1-jre//guava-32.0.1-jre.jar
+hadoop-client-api/3.3.6//hadoop-client-api-3.3.6.jar
+hadoop-client-runtime/3.3.6//hadoop-client-runtime-3.3.6.jar
hive-common/3.1.3//hive-common-3.1.3.jar
hive-metastore/3.1.3//hive-metastore-3.1.3.jar
hive-serde/3.1.3//hive-serde-3.1.3.jar
@@ -63,16 +65,16 @@ httpclient/4.5.14//httpclient-4.5.14.jar
httpcore/4.4.16//httpcore-4.4.16.jar
httpmime/4.5.14//httpmime-4.5.14.jar
j2objc-annotations/1.3//j2objc-annotations-1.3.jar
-jackson-annotations/2.14.1//jackson-annotations-2.14.1.jar
-jackson-core/2.14.1//jackson-core-2.14.1.jar
-jackson-databind/2.14.1//jackson-databind-2.14.1.jar
-jackson-dataformat-yaml/2.14.1//jackson-dataformat-yaml-2.14.1.jar
-jackson-datatype-jdk8/2.12.3//jackson-datatype-jdk8-2.12.3.jar
-jackson-datatype-jsr310/2.14.1//jackson-datatype-jsr310-2.14.1.jar
-jackson-jaxrs-base/2.14.1//jackson-jaxrs-base-2.14.1.jar
-jackson-jaxrs-json-provider/2.14.1//jackson-jaxrs-json-provider-2.14.1.jar
-jackson-module-jaxb-annotations/2.14.1//jackson-module-jaxb-annotations-2.14.1.jar
-jackson-module-scala_2.12/2.14.1//jackson-module-scala_2.12-2.14.1.jar
+jackson-annotations/2.15.0//jackson-annotations-2.15.0.jar
+jackson-core/2.15.0//jackson-core-2.15.0.jar
+jackson-databind/2.15.0//jackson-databind-2.15.0.jar
+jackson-dataformat-yaml/2.15.0//jackson-dataformat-yaml-2.15.0.jar
+jackson-datatype-jdk8/2.15.0//jackson-datatype-jdk8-2.15.0.jar
+jackson-datatype-jsr310/2.15.0//jackson-datatype-jsr310-2.15.0.jar
+jackson-jaxrs-base/2.15.0//jackson-jaxrs-base-2.15.0.jar
+jackson-jaxrs-json-provider/2.15.0//jackson-jaxrs-json-provider-2.15.0.jar
+jackson-module-jaxb-annotations/2.15.0//jackson-module-jaxb-annotations-2.15.0.jar
+jackson-module-scala_2.12/2.15.0//jackson-module-scala_2.12-2.15.0.jar
jakarta.annotation-api/1.3.5//jakarta.annotation-api-1.3.5.jar
jakarta.inject/2.6.1//jakarta.inject-2.6.1.jar
jakarta.servlet-api/4.0.4//jakarta.servlet-api-4.0.4.jar
@@ -81,77 +83,85 @@ jakarta.ws.rs-api/2.1.6//jakarta.ws.rs-api-2.1.6.jar
jakarta.xml.bind-api/2.3.2//jakarta.xml.bind-api-2.3.2.jar
javassist/3.25.0-GA//javassist-3.25.0-GA.jar
jcl-over-slf4j/1.7.36//jcl-over-slf4j-1.7.36.jar
-jersey-client/2.38//jersey-client-2.38.jar
-jersey-common/2.38//jersey-common-2.38.jar
-jersey-container-servlet-core/2.38//jersey-container-servlet-core-2.38.jar
-jersey-entity-filtering/2.38//jersey-entity-filtering-2.38.jar
-jersey-hk2/2.38//jersey-hk2-2.38.jar
-jersey-media-json-jackson/2.38//jersey-media-json-jackson-2.38.jar
-jersey-media-multipart/2.38//jersey-media-multipart-2.38.jar
-jersey-server/2.38//jersey-server-2.38.jar
+jersey-client/2.39.1//jersey-client-2.39.1.jar
+jersey-common/2.39.1//jersey-common-2.39.1.jar
+jersey-container-servlet-core/2.39.1//jersey-container-servlet-core-2.39.1.jar
+jersey-entity-filtering/2.39.1//jersey-entity-filtering-2.39.1.jar
+jersey-hk2/2.39.1//jersey-hk2-2.39.1.jar
+jersey-media-json-jackson/2.39.1//jersey-media-json-jackson-2.39.1.jar
+jersey-media-multipart/2.39.1//jersey-media-multipart-2.39.1.jar
+jersey-server/2.39.1//jersey-server-2.39.1.jar
jetcd-api/0.7.3//jetcd-api-0.7.3.jar
jetcd-common/0.7.3//jetcd-common-0.7.3.jar
jetcd-core/0.7.3//jetcd-core-0.7.3.jar
jetcd-grpc/0.7.3//jetcd-grpc-0.7.3.jar
-jetty-http/9.4.50.v20221201//jetty-http-9.4.50.v20221201.jar
-jetty-io/9.4.50.v20221201//jetty-io-9.4.50.v20221201.jar
-jetty-security/9.4.50.v20221201//jetty-security-9.4.50.v20221201.jar
-jetty-server/9.4.50.v20221201//jetty-server-9.4.50.v20221201.jar
-jetty-servlet/9.4.50.v20221201//jetty-servlet-9.4.50.v20221201.jar
-jetty-util-ajax/9.4.50.v20221201//jetty-util-ajax-9.4.50.v20221201.jar
-jetty-util/9.4.50.v20221201//jetty-util-9.4.50.v20221201.jar
+jetty-client/9.4.52.v20230823//jetty-client-9.4.52.v20230823.jar
+jetty-http/9.4.52.v20230823//jetty-http-9.4.52.v20230823.jar
+jetty-io/9.4.52.v20230823//jetty-io-9.4.52.v20230823.jar
+jetty-proxy/9.4.52.v20230823//jetty-proxy-9.4.52.v20230823.jar
+jetty-security/9.4.52.v20230823//jetty-security-9.4.52.v20230823.jar
+jetty-server/9.4.52.v20230823//jetty-server-9.4.52.v20230823.jar
+jetty-servlet/9.4.52.v20230823//jetty-servlet-9.4.52.v20230823.jar
+jetty-util-ajax/9.4.52.v20230823//jetty-util-ajax-9.4.52.v20230823.jar
+jetty-util/9.4.52.v20230823//jetty-util-9.4.52.v20230823.jar
jline/0.9.94//jline-0.9.94.jar
jul-to-slf4j/1.7.36//jul-to-slf4j-1.7.36.jar
-kubernetes-client/5.12.1//kubernetes-client-5.12.1.jar
-kubernetes-model-admissionregistration/5.12.1//kubernetes-model-admissionregistration-5.12.1.jar
-kubernetes-model-apiextensions/5.12.1//kubernetes-model-apiextensions-5.12.1.jar
-kubernetes-model-apps/5.12.1//kubernetes-model-apps-5.12.1.jar
-kubernetes-model-autoscaling/5.12.1//kubernetes-model-autoscaling-5.12.1.jar
-kubernetes-model-batch/5.12.1//kubernetes-model-batch-5.12.1.jar
-kubernetes-model-certificates/5.12.1//kubernetes-model-certificates-5.12.1.jar
-kubernetes-model-common/5.12.1//kubernetes-model-common-5.12.1.jar
-kubernetes-model-coordination/5.12.1//kubernetes-model-coordination-5.12.1.jar
-kubernetes-model-core/5.12.1//kubernetes-model-core-5.12.1.jar
-kubernetes-model-discovery/5.12.1//kubernetes-model-discovery-5.12.1.jar
-kubernetes-model-events/5.12.1//kubernetes-model-events-5.12.1.jar
-kubernetes-model-extensions/5.12.1//kubernetes-model-extensions-5.12.1.jar
-kubernetes-model-flowcontrol/5.12.1//kubernetes-model-flowcontrol-5.12.1.jar
-kubernetes-model-metrics/5.12.1//kubernetes-model-metrics-5.12.1.jar
-kubernetes-model-networking/5.12.1//kubernetes-model-networking-5.12.1.jar
-kubernetes-model-node/5.12.1//kubernetes-model-node-5.12.1.jar
-kubernetes-model-policy/5.12.1//kubernetes-model-policy-5.12.1.jar
-kubernetes-model-rbac/5.12.1//kubernetes-model-rbac-5.12.1.jar
-kubernetes-model-scheduling/5.12.1//kubernetes-model-scheduling-5.12.1.jar
-kubernetes-model-storageclass/5.12.1//kubernetes-model-storageclass-5.12.1.jar
+kafka-clients/3.5.1//kafka-clients-3.5.1.jar
+kubernetes-client-api/6.8.1//kubernetes-client-api-6.8.1.jar
+kubernetes-client/6.8.1//kubernetes-client-6.8.1.jar
+kubernetes-httpclient-okhttp/6.8.1//kubernetes-httpclient-okhttp-6.8.1.jar
+kubernetes-model-admissionregistration/6.8.1//kubernetes-model-admissionregistration-6.8.1.jar
+kubernetes-model-apiextensions/6.8.1//kubernetes-model-apiextensions-6.8.1.jar
+kubernetes-model-apps/6.8.1//kubernetes-model-apps-6.8.1.jar
+kubernetes-model-autoscaling/6.8.1//kubernetes-model-autoscaling-6.8.1.jar
+kubernetes-model-batch/6.8.1//kubernetes-model-batch-6.8.1.jar
+kubernetes-model-certificates/6.8.1//kubernetes-model-certificates-6.8.1.jar
+kubernetes-model-common/6.8.1//kubernetes-model-common-6.8.1.jar
+kubernetes-model-coordination/6.8.1//kubernetes-model-coordination-6.8.1.jar
+kubernetes-model-core/6.8.1//kubernetes-model-core-6.8.1.jar
+kubernetes-model-discovery/6.8.1//kubernetes-model-discovery-6.8.1.jar
+kubernetes-model-events/6.8.1//kubernetes-model-events-6.8.1.jar
+kubernetes-model-extensions/6.8.1//kubernetes-model-extensions-6.8.1.jar
+kubernetes-model-flowcontrol/6.8.1//kubernetes-model-flowcontrol-6.8.1.jar
+kubernetes-model-gatewayapi/6.8.1//kubernetes-model-gatewayapi-6.8.1.jar
+kubernetes-model-metrics/6.8.1//kubernetes-model-metrics-6.8.1.jar
+kubernetes-model-networking/6.8.1//kubernetes-model-networking-6.8.1.jar
+kubernetes-model-node/6.8.1//kubernetes-model-node-6.8.1.jar
+kubernetes-model-policy/6.8.1//kubernetes-model-policy-6.8.1.jar
+kubernetes-model-rbac/6.8.1//kubernetes-model-rbac-6.8.1.jar
+kubernetes-model-resource/6.8.1//kubernetes-model-resource-6.8.1.jar
+kubernetes-model-scheduling/6.8.1//kubernetes-model-scheduling-6.8.1.jar
+kubernetes-model-storageclass/6.8.1//kubernetes-model-storageclass-6.8.1.jar
libfb303/0.9.3//libfb303-0.9.3.jar
libthrift/0.9.3//libthrift-0.9.3.jar
-log4j-1.2-api/2.19.0//log4j-1.2-api-2.19.0.jar
-log4j-api/2.19.0//log4j-api-2.19.0.jar
-log4j-core/2.19.0//log4j-core-2.19.0.jar
-log4j-slf4j-impl/2.19.0//log4j-slf4j-impl-2.19.0.jar
+log4j-1.2-api/2.20.0//log4j-1.2-api-2.20.0.jar
+log4j-api/2.20.0//log4j-api-2.20.0.jar
+log4j-core/2.20.0//log4j-core-2.20.0.jar
+log4j-slf4j-impl/2.20.0//log4j-slf4j-impl-2.20.0.jar
logging-interceptor/3.12.12//logging-interceptor-3.12.12.jar
+lz4-java/1.8.0//lz4-java-1.8.0.jar
metrics-core/4.2.8//metrics-core-4.2.8.jar
metrics-jmx/4.2.8//metrics-jmx-4.2.8.jar
metrics-json/4.2.8//metrics-json-4.2.8.jar
metrics-jvm/4.2.8//metrics-jvm-4.2.8.jar
mimepull/1.9.15//mimepull-1.9.15.jar
-netty-all/4.1.87.Final//netty-all-4.1.87.Final.jar
-netty-buffer/4.1.87.Final//netty-buffer-4.1.87.Final.jar
-netty-codec-dns/4.1.87.Final//netty-codec-dns-4.1.87.Final.jar
-netty-codec-http/4.1.87.Final//netty-codec-http-4.1.87.Final.jar
-netty-codec-http2/4.1.87.Final//netty-codec-http2-4.1.87.Final.jar
-netty-codec-socks/4.1.87.Final//netty-codec-socks-4.1.87.Final.jar
-netty-codec/4.1.87.Final//netty-codec-4.1.87.Final.jar
-netty-common/4.1.87.Final//netty-common-4.1.87.Final.jar
-netty-handler-proxy/4.1.87.Final//netty-handler-proxy-4.1.87.Final.jar
-netty-handler/4.1.87.Final//netty-handler-4.1.87.Final.jar
-netty-resolver-dns/4.1.87.Final//netty-resolver-dns-4.1.87.Final.jar
-netty-resolver/4.1.87.Final//netty-resolver-4.1.87.Final.jar
-netty-transport-classes-epoll/4.1.87.Final//netty-transport-classes-epoll-4.1.87.Final.jar
-netty-transport-native-epoll/4.1.87.Final/linux-aarch_64/netty-transport-native-epoll-4.1.87.Final-linux-aarch_64.jar
-netty-transport-native-epoll/4.1.87.Final/linux-x86_64/netty-transport-native-epoll-4.1.87.Final-linux-x86_64.jar
-netty-transport-native-unix-common/4.1.87.Final//netty-transport-native-unix-common-4.1.87.Final.jar
-netty-transport/4.1.87.Final//netty-transport-4.1.87.Final.jar
+netty-all/4.1.93.Final//netty-all-4.1.93.Final.jar
+netty-buffer/4.1.93.Final//netty-buffer-4.1.93.Final.jar
+netty-codec-dns/4.1.93.Final//netty-codec-dns-4.1.93.Final.jar
+netty-codec-http/4.1.93.Final//netty-codec-http-4.1.93.Final.jar
+netty-codec-http2/4.1.93.Final//netty-codec-http2-4.1.93.Final.jar
+netty-codec-socks/4.1.93.Final//netty-codec-socks-4.1.93.Final.jar
+netty-codec/4.1.93.Final//netty-codec-4.1.93.Final.jar
+netty-common/4.1.93.Final//netty-common-4.1.93.Final.jar
+netty-handler-proxy/4.1.93.Final//netty-handler-proxy-4.1.93.Final.jar
+netty-handler/4.1.93.Final//netty-handler-4.1.93.Final.jar
+netty-resolver-dns/4.1.93.Final//netty-resolver-dns-4.1.93.Final.jar
+netty-resolver/4.1.93.Final//netty-resolver-4.1.93.Final.jar
+netty-transport-classes-epoll/4.1.93.Final//netty-transport-classes-epoll-4.1.93.Final.jar
+netty-transport-native-epoll/4.1.93.Final/linux-aarch_64/netty-transport-native-epoll-4.1.93.Final-linux-aarch_64.jar
+netty-transport-native-epoll/4.1.93.Final/linux-x86_64/netty-transport-native-epoll-4.1.93.Final-linux-x86_64.jar
+netty-transport-native-unix-common/4.1.93.Final//netty-transport-native-unix-common-4.1.93.Final.jar
+netty-transport/4.1.93.Final//netty-transport-4.1.93.Final.jar
okhttp-urlconnection/3.14.9//okhttp-urlconnection-3.14.9.jar
okhttp/3.12.12//okhttp-3.12.12.jar
okio/1.15.0//okio-1.15.0.jar
@@ -161,7 +171,7 @@ perfmark-api/0.25.0//perfmark-api-0.25.0.jar
proto-google-common-protos/2.9.0//proto-google-common-protos-2.9.0.jar
protobuf-java-util/3.21.7//protobuf-java-util-3.21.7.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-scala-library/2.12.17//scala-library-2.12.17.jar
+scala-library/2.12.18//scala-library-2.12.18.jar
scopt_2.12/4.1.0//scopt_2.12-4.1.0.jar
simpleclient/0.16.0//simpleclient-0.16.0.jar
simpleclient_common/0.16.0//simpleclient_common-0.16.0.jar
@@ -172,16 +182,18 @@ simpleclient_tracer_common/0.16.0//simpleclient_tracer_common-0.16.0.jar
simpleclient_tracer_otel/0.16.0//simpleclient_tracer_otel-0.16.0.jar
simpleclient_tracer_otel_agent/0.16.0//simpleclient_tracer_otel_agent-0.16.0.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
-snakeyaml/1.33//snakeyaml-1.33.jar
+snakeyaml-engine/2.6//snakeyaml-engine-2.6.jar
+snakeyaml/2.2//snakeyaml-2.2.jar
+snappy-java/1.1.10.1//snappy-java-1.1.10.1.jar
+sqlite-jdbc/3.42.0.0//sqlite-jdbc-3.42.0.0.jar
swagger-annotations/2.2.1//swagger-annotations-2.2.1.jar
swagger-core/2.2.1//swagger-core-2.2.1.jar
swagger-integration/2.2.1//swagger-integration-2.2.1.jar
swagger-jaxrs2/2.2.1//swagger-jaxrs2-2.2.1.jar
swagger-models/2.2.1//swagger-models-2.2.1.jar
-swagger-ui/4.9.1//swagger-ui-4.9.1.jar
trino-client/363//trino-client-363.jar
units/1.6//units-1.6.jar
vertx-core/4.3.2//vertx-core-4.3.2.jar
vertx-grpc/4.3.2//vertx-grpc-4.3.2.jar
zjsonpatch/0.3.0//zjsonpatch-0.3.0.jar
-zookeeper/3.4.14//zookeeper-3.4.14.jar
+zstd-jni/1.5.5-1//zstd-jni-1.5.5-1.jar
diff --git a/dev/gen/gen_all_config_docs.sh b/dev/gen/gen_all_config_docs.sh
new file mode 100755
index 000000000..2a5dca7f9
--- /dev/null
+++ b/dev/gen/gen_all_config_docs.sh
@@ -0,0 +1,27 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Golden result file:
+# docs/configuration/settings.md
+
+KYUUBI_UPDATE="${KYUUBI_UPDATE:-1}" \
+build/mvn clean test \
+ -pl kyuubi-server -am \
+ -Pflink-provided,spark-provided,hive-provided \
+ -Dtest=none \
+ -DwildcardSuites=org.apache.kyuubi.config.AllKyuubiConfiguration
diff --git a/dev/gen/gen_hive_kdf_docs.sh b/dev/gen/gen_hive_kdf_docs.sh
new file mode 100755
index 000000000..b670dc3c5
--- /dev/null
+++ b/dev/gen/gen_hive_kdf_docs.sh
@@ -0,0 +1,26 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Golden result file:
+# docs/extensions/engines/hive/functions.md
+
+KYUUBI_UPDATE="${KYUUBI_UPDATE:-1}" \
+build/mvn clean test \
+ -pl externals/kyuubi-hive-sql-engine -am \
+ -Pflink-provided,spark-provided,hive-provided \
+ -DwildcardSuites=org.apache.kyuubi.engine.hive.udf.KyuubiDefinedFunctionSuite
diff --git a/dev/gen/gen_ranger_policy_json.sh b/dev/gen/gen_ranger_policy_json.sh
new file mode 100755
index 000000000..1f4193d3e
--- /dev/null
+++ b/dev/gen/gen_ranger_policy_json.sh
@@ -0,0 +1,27 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Golden result file:
+# extensions/spark/kyuubi-spark-authz/src/test/resources/sparkSql_hive_jenkins.json
+
+KYUUBI_UPDATE="${KYUUBI_UPDATE:-1}" \
+build/mvn clean test \
+ -pl extensions/spark/kyuubi-spark-authz \
+ -Pgen-policy \
+ -Dtest=none \
+ -DwildcardSuites=org.apache.kyuubi.plugin.spark.authz.gen.PolicyJsonFileGenerator
diff --git a/dev/gen/gen_ranger_spec_json.sh b/dev/gen/gen_ranger_spec_json.sh
new file mode 100755
index 000000000..e00857f8f
--- /dev/null
+++ b/dev/gen/gen_ranger_spec_json.sh
@@ -0,0 +1,27 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Golden result file:
+# extensions/spark/kyuubi-spark-authz/src/main/resources/*_spec.json
+
+KYUUBI_UPDATE="${KYUUBI_UPDATE:-1}" \
+build/mvn clean test \
+ -pl extensions/spark/kyuubi-spark-authz \
+ -Pgen-policy \
+ -Dtest=none \
+ -DwildcardSuites=org.apache.kyuubi.plugin.spark.authz.gen.JsonSpecFileGenerator
diff --git a/dev/gen/gen_spark_kdf_docs.sh b/dev/gen/gen_spark_kdf_docs.sh
new file mode 100755
index 000000000..ac13082e3
--- /dev/null
+++ b/dev/gen/gen_spark_kdf_docs.sh
@@ -0,0 +1,26 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Golden result file:
+# docs/extensions/engines/spark/functions.md
+
+KYUUBI_UPDATE="${KYUUBI_UPDATE:-1}" \
+build/mvn clean test \
+ -pl externals/kyuubi-spark-sql-engine -am \
+ -Pflink-provided,spark-provided,hive-provided \
+ -DwildcardSuites=org.apache.kyuubi.engine.spark.udf.KyuubiDefinedFunctionSuite
diff --git a/dev/gen/gen_tpcds_output_schema.sh b/dev/gen/gen_tpcds_output_schema.sh
new file mode 100755
index 000000000..49f8d7798
--- /dev/null
+++ b/dev/gen/gen_tpcds_output_schema.sh
@@ -0,0 +1,27 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Golden result file:
+# extensions/spark/kyuubi-spark-authz/src/test/resources/*.output.schema
+
+KYUUBI_UPDATE="${KYUUBI_UPDATE:-1}" \
+build/mvn clean install \
+ -pl kyuubi-server -am \
+ -Dmaven.plugin.scalatest.exclude.tags="" \
+ -Dtest=none \
+ -DwildcardSuites=org.apache.kyuubi.operation.tpcds.OutputSchemaTPCDSSuite
diff --git a/dev/gen/gen_tpcds_queries.sh b/dev/gen/gen_tpcds_queries.sh
new file mode 100755
index 000000000..07f075b7a
--- /dev/null
+++ b/dev/gen/gen_tpcds_queries.sh
@@ -0,0 +1,27 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Golden result file:
+# kyuubi-spark-connector-tpcds/src/main/resources/kyuubi/tpcds_*/*.sql
+
+KYUUBI_UPDATE="${KYUUBI_UPDATE:-1}" \
+build/mvn clean install \
+ -pl extensions/spark/kyuubi-spark-connector-tpcds -am \
+ -Dmaven.plugin.scalatest.exclude.tags="" \
+ -Dtest=none \
+ -DwildcardSuites=org.apache.kyuubi.spark.connector.tpcds.TPCDSQuerySuite
diff --git a/dev/gen/gen_tpch_queries.sh b/dev/gen/gen_tpch_queries.sh
new file mode 100755
index 000000000..d0c65256f
--- /dev/null
+++ b/dev/gen/gen_tpch_queries.sh
@@ -0,0 +1,27 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Golden result file:
+# kyuubi-spark-connector-tpch/src/main/resources/kyuubi/tpch_*/*.sql
+
+KYUUBI_UPDATE="${KYUUBI_UPDATE:-1}" \
+build/mvn clean install \
+ -pl extensions/spark/kyuubi-spark-connector-tpch -am \
+ -Dmaven.plugin.scalatest.exclude.tags="" \
+ -Dtest=none \
+ -DwildcardSuites=org.apache.kyuubi.spark.connector.tpch.TPCHQuerySuite
diff --git a/dev/kyuubi-codecov/pom.xml b/dev/kyuubi-codecov/pom.xml
index 1d1dcb574..0f22c3316 100644
--- a/dev/kyuubi-codecov/pom.xml
+++ b/dev/kyuubi-codecov/pom.xml
@@ -21,16 +21,28 @@
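   <parent>
     <groupId>org.apache.kyuubi</groupId>
     <artifactId>kyuubi-parent</artifactId>
-    <version>1.7.0-SNAPSHOT</version>
+    <version>1.9.0-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
-  <artifactId>kyuubi-codecov_2.12</artifactId>
+  <artifactId>kyuubi-codecov_${scala.binary.version}</artifactId>
   <packaging>pom</packaging>
   <name>Kyuubi Dev Code Coverage</name>
   <url>https://kyuubi.apache.org/</url>
 
   <dependencies>
+    <dependency>
+      <groupId>org.apache.kyuubi</groupId>
+      <artifactId>kyuubi-util</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
+    <dependency>
+      <groupId>org.apache.kyuubi</groupId>
+      <artifactId>kyuubi-util-scala_${scala.binary.version}</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+
     <dependency>
       <groupId>org.apache.kyuubi</groupId>
       <artifactId>kyuubi-common_${scala.binary.version}</artifactId>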
@@ -199,7 +211,17 @@
org.apache.kyuubi
- kyuubi-spark-connector-kudu_${scala.binary.version}
+ kyuubi-spark-connector-hive_${scala.binary.version}
+ ${project.version}
+
+
+
+
+ spark-3.4
+
+
+ org.apache.kyuubi
+ kyuubi-extension-spark-3-4_${scala.binary.version}${project.version}
@@ -209,5 +231,15 @@
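+    <profile>
+      <id>spark-3.5</id>
+      <dependencies>
+        <dependency>
+          <groupId>org.apache.kyuubi</groupId>
+          <artifactId>kyuubi-extension-spark-3-5_${scala.binary.version}</artifactId>
+          <version>${project.version}</version>
+        </dependency>
+      </dependencies>
+    </profile>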
diff --git a/dev/kyuubi-tpcds/README.md b/dev/kyuubi-tpcds/README.md
index adffb6726..a9a6487aa 100644
--- a/dev/kyuubi-tpcds/README.md
+++ b/dev/kyuubi-tpcds/README.md
@@ -1,21 +1,22 @@
 <!--
+- Licensed to the Apache Software Foundation (ASF) under one or more
+- contributor license agreements. See the NOTICE file distributed with
+- this work for additional information regarding copyright ownership.
+- The ASF licenses this file to You under the Apache License, Version 2.0
+- (the "License"); you may not use this file except in compliance with
+- the License. You may obtain a copy of the License at
+-
+- http://www.apache.org/licenses/LICENSE-2.0
+-
+- Unless required by applicable law or agreed to in writing, software
+- distributed under the License is distributed on an "AS IS" BASIS,
+- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+- See the License for the specific language governing permissions and
+- limitations under the License.
+-->
# Introduction
+
This module includes a TPC-DS data generator and a benchmark tool.
# How to use
@@ -27,12 +28,12 @@ package jar with following command:
Supported options:
-| key | default | description |
-|--------------|-----------------|-----------------------------------|
-| db | default | the database to write data |
-| scaleFactor | 1 | the scale factor of TPC-DS |
-| format | parquet | the format of table to store data |
-| parallel | scaleFactor * 2 | the parallelism of Spark job |
+| key | default | description |
+|-------------|-----------------|-----------------------------------|
+| db | default | the database to write data |
+| scaleFactor | 1 | the scale factor of TPC-DS |
+| format | parquet | the format of table to store data |
+| parallel | scaleFactor * 2 | the parallelism of Spark job |
Example: the following command generates 10GB of data in a new database `tpcds_sf10`.
@@ -47,7 +48,7 @@ $SPARK_HOME/bin/spark-submit \
Supported options:
-| key | default | description |
+| key | default | description |
|-------------|------------------------|---------------------------------------------------------------|
| db | none(required) | the TPC-DS database |
| benchmark | tpcds-v2.4-benchmark | the name of application |
@@ -65,6 +66,7 @@ $SPARK_HOME/bin/spark-submit \
```
We also support running a single TPC-DS query:
+
```shell
$SPARK_HOME/bin/spark-submit \
--class org.apache.kyuubi.tpcds.benchmark.RunBenchmark \
@@ -73,6 +75,7 @@ $SPARK_HOME/bin/spark-submit \
The result of the TPC-DS benchmark looks like:
-| name | minTimeMs | maxTimeMs | avgTimeMs | stdDev | stdDevPercent |
-|---------|-----------|-------------|------------|----------|----------------|
-| q1-v2.4 | 50.522384 | 868.010383 | 323.398267 | 471.6482 | 145.8413108576 |
+| name | minTimeMs | maxTimeMs | avgTimeMs | stdDev | stdDevPercent |
+|---------|-----------|------------|------------|----------|----------------|
+| q1-v2.4 | 50.522384 | 868.010383 | 323.398267 | 471.6482 | 145.8413108576 |
+
diff --git a/dev/kyuubi-tpcds/pom.xml b/dev/kyuubi-tpcds/pom.xml
index 2921cbe8b..b80c1227f 100644
--- a/dev/kyuubi-tpcds/pom.xml
+++ b/dev/kyuubi-tpcds/pom.xml
@@ -21,11 +21,11 @@
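<parent>
<groupId>org.apache.kyuubi</groupId>
<artifactId>kyuubi-parent</artifactId>
-<version>1.7.0-SNAPSHOT</version>
+<version>1.9.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>
-<artifactId>kyuubi-tpcds_2.12</artifactId>
+<artifactId>kyuubi-tpcds_${scala.binary.version}</artifactId>
<packaging>jar</packaging>
<name>Kyuubi Dev TPCDS Generator</name>
<url>https://kyuubi.apache.org/</url>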
diff --git a/dev/merge_kyuubi_pr.py b/dev/merge_kyuubi_pr.py
index cb3696d1f..fe8893748 100755
--- a/dev/merge_kyuubi_pr.py
+++ b/dev/merge_kyuubi_pr.py
@@ -30,9 +30,9 @@
import re
import subprocess
import sys
-from urllib.request import urlopen
-from urllib.request import Request
from urllib.error import HTTPError
+from urllib.request import Request
+from urllib.request import urlopen
KYUUBI_HOME = os.environ.get("KYUUBI_HOME", os.getcwd())
PR_REMOTE_NAME = os.environ.get("PR_REMOTE_NAME", "apache")
@@ -248,6 +248,8 @@ def main():
user_login = pr["user"]["login"]
base_ref = pr["head"]["ref"]
pr_repo_desc = "%s/%s" % (user_login, base_ref)
+ assignees = pr["assignees"]
+ milestone = pr["milestone"]
# Merged pull requests don't appear as merged in the GitHub API;
# Instead, they're closed by asfgit.
@@ -276,6 +278,17 @@ def main():
print("\n=== Pull Request #%s ===" % pr_num)
print("title:\t%s\nsource:\t%s\ntarget:\t%s\nurl:\t%s\nbody:\n\n%s" %
(title, pr_repo_desc, target_ref, url, body))
+
+ if assignees is None or len(assignees)==0:
+ continue_maybe("Assignees have NOT been set. Continue?")
+ else:
+ print("assignees: %s" % [assignee["login"] for assignee in assignees])
+
+ if milestone is None:
+ continue_maybe("Milestone has NOT been set. Continue?")
+ else:
+ print("milestone: %s" % milestone["title"])
+
continue_maybe("Proceed with merging pull request #%s?" % pr_num)
merged_refs = [target_ref]
diff --git a/dev/reformat b/dev/reformat
index 7c6ef7124..7ad26ae2e 100755
--- a/dev/reformat
+++ b/dev/reformat
@@ -20,7 +20,7 @@ set -x
KYUUBI_HOME="$(cd "`dirname "$0"`/.."; pwd)"
-PROFILES="-Pflink-provided,hive-provided,spark-provided,spark-block-cleaner,spark-3.3,spark-3.2,spark-3.1,tpcds"
+PROFILES="-Pflink-provided,hive-provided,spark-provided,spark-block-cleaner,spark-3.5,spark-3.4,spark-3.3,spark-3.2,spark-3.1,tpcds,kubernetes-it"
# python style checks rely on `black` in path
if ! command -v black &> /dev/null
diff --git a/docker/Dockerfile b/docker/Dockerfile
index 588f99b1f..0440022de 100644
--- a/docker/Dockerfile
+++ b/docker/Dockerfile
@@ -24,7 +24,7 @@
# -t the target repo and tag name
# more options can be found with -h
-ARG BASE_IMAGE=openjdk:8-jre-slim
+ARG BASE_IMAGE=eclipse-temurin:8-jdk-focal
ARG spark_provided="spark_builtin"
FROM ${BASE_IMAGE} as builder_spark_provided
@@ -34,7 +34,7 @@ ONBUILD ENV SPARK_HOME ${spark_home_in_docker}
FROM ${BASE_IMAGE} as builder_spark_builtin
ONBUILD ENV SPARK_HOME /opt/spark
-ONBUILD RUN mkdir -p ${SPARK_HOME}
+ONBUILD RUN mkdir -p ${SPARK_HOME}
ONBUILD COPY spark-binary ${SPARK_HOME}
FROM builder_${spark_provided}
@@ -50,7 +50,8 @@ ENV KYUUBI_WORK_DIR_ROOT ${KYUUBI_HOME}/work
RUN set -ex && \
sed -i 's/http:\/\/deb.\(.*\)/https:\/\/deb.\1/g' /etc/apt/sources.list && \
apt-get update && \
- apt install -y bash tini libc6 libpam-modules krb5-user libnss3 procps && \
+ apt-get install -y bash tini libc6 libpam-modules krb5-user libnss3 procps && \
+ ln -snf /bin/bash /bin/sh && \
useradd -u ${kyuubi_uid} -g root kyuubi -d /home/kyuubi -m && \
mkdir -p ${KYUUBI_HOME} ${KYUUBI_LOG_DIR} ${KYUUBI_PID_DIR} ${KYUUBI_WORK_DIR_ROOT} && \
rm -rf /var/cache/apt/*
@@ -59,6 +60,7 @@ COPY LICENSE NOTICE RELEASE ${KYUUBI_HOME}/
COPY bin ${KYUUBI_HOME}/bin
COPY jars ${KYUUBI_HOME}/jars
COPY beeline-jars ${KYUUBI_HOME}/beeline-jars
+COPY web-ui ${KYUUBI_HOME}/web-ui
COPY externals/engines/spark ${KYUUBI_HOME}/externals/engines/spark
WORKDIR ${KYUUBI_HOME}
diff --git a/docker/kyuubi-configmap.yaml b/docker/kyuubi-configmap.yaml
index 13835493b..6a6d430ce 100644
--- a/docker/kyuubi-configmap.yaml
+++ b/docker/kyuubi-configmap.yaml
@@ -52,4 +52,4 @@ data:
# kyuubi.frontend.bind.port 10009
#
- # Details in https://kyuubi.apache.org/docs/latest/deployment/settings.html
+ # Details in https://kyuubi.readthedocs.io/en/master/configuration/settings.html
diff --git a/docker/playground/.env b/docker/playground/.env
index d50e964cf..24284bd39 100644
--- a/docker/playground/.env
+++ b/docker/playground/.env
@@ -15,16 +15,16 @@
# limitations under the License.
#
-AWS_JAVA_SDK_VERSION=1.12.239
-HADOOP_VERSION=3.3.1
+AWS_JAVA_SDK_VERSION=1.12.367
+HADOOP_VERSION=3.3.6
HIVE_VERSION=2.3.9
-ICEBERG_VERSION=1.1.0
-KYUUBI_VERSION=1.6.1-incubating
-KYUUBI_HADOOP_VERSION=3.3.4
+ICEBERG_VERSION=1.3.1
+KYUUBI_VERSION=1.7.3
+KYUUBI_HADOOP_VERSION=3.3.5
POSTGRES_VERSION=12
POSTGRES_JDBC_VERSION=42.3.4
SCALA_BINARY_VERSION=2.12
-SPARK_VERSION=3.3.1
+SPARK_VERSION=3.3.3
SPARK_BINARY_VERSION=3.3
SPARK_HADOOP_VERSION=3.3.2
ZOOKEEPER_VERSION=3.6.3
diff --git a/docker/playground/README.md b/docker/playground/README.md
index d9e227c2c..66dca2af0 100644
--- a/docker/playground/README.md
+++ b/docker/playground/README.md
@@ -1,5 +1,5 @@
Playground
-===
+==========
## For Users
@@ -45,3 +45,4 @@ Kyuubi supply some built-in dataset, after Kyuubi started, you can run the follo
1. Build images `docker/playground/build-image.sh`;
2. Optional to use `buildx` to build and publish cross-platform images `BUILDX=1 docker/playground/build-image.sh`;
+
diff --git a/docker/playground/compose.yml b/docker/playground/compose.yml
index 069624ee2..362b3505b 100644
--- a/docker/playground/compose.yml
+++ b/docker/playground/compose.yml
@@ -17,11 +17,11 @@
services:
minio:
- image: alekcander/bitnami-minio-multiarch:RELEASE.2022-05-26T05-48-41Z
+ image: bitnami/minio:2023-debian-11
environment:
MINIO_ROOT_USER: minio
MINIO_ROOT_PASSWORD: minio_minio
- MINIO_DEFAULT_BUCKETS: spark-bucket,iceberg-bucket
+ MINIO_DEFAULT_BUCKETS: spark-bucket
container_name: minio
hostname: minio
ports:
@@ -68,6 +68,7 @@ services:
ports:
- 4040-4050:4040-4050
- 10009:10009
+ - 10099:10099
volumes:
- ./conf/core-site.xml:/etc/hadoop/conf/core-site.xml
- ./conf/hive-site.xml:/etc/hive/conf/hive-site.xml
diff --git a/docker/playground/conf/kyuubi-defaults.conf b/docker/playground/conf/kyuubi-defaults.conf
index 4906c5de4..e4a674634 100644
--- a/docker/playground/conf/kyuubi-defaults.conf
+++ b/docker/playground/conf/kyuubi-defaults.conf
@@ -18,8 +18,10 @@
## Kyuubi Configurations
kyuubi.authentication=NONE
-kyuubi.frontend.thrift.binary.bind.host=0.0.0.0
+kyuubi.frontend.bind.host=0.0.0.0
+kyuubi.frontend.protocols=THRIFT_BINARY,REST
kyuubi.frontend.thrift.binary.bind.port=10009
+kyuubi.frontend.rest.bind.port=10099
kyuubi.ha.addresses=zookeeper:2181
kyuubi.session.engine.idle.timeout=PT5M
kyuubi.operation.incremental.collect=true
@@ -28,4 +30,4 @@ kyuubi.operation.progress.enabled=true
kyuubi.engine.session.initialize.sql \
show namespaces in tpcds; \
show namespaces in tpch; \
- show namespaces in postgres;
+ show namespaces in postgres
diff --git a/docker/playground/conf/kyuubi-log4j2.xml b/docker/playground/conf/kyuubi-log4j2.xml
index 6aedf7652..313c121bc 100644
--- a/docker/playground/conf/kyuubi-log4j2.xml
+++ b/docker/playground/conf/kyuubi-log4j2.xml
@@ -22,7 +22,7 @@
-
+
diff --git a/docker/playground/conf/spark-defaults.conf b/docker/playground/conf/spark-defaults.conf
index 9d1d4a602..7983b5e70 100644
--- a/docker/playground/conf/spark-defaults.conf
+++ b/docker/playground/conf/spark-defaults.conf
@@ -38,7 +38,3 @@ spark.sql.catalog.postgres.url=jdbc:postgresql://postgres:5432/metastore
spark.sql.catalog.postgres.driver=org.postgresql.Driver
spark.sql.catalog.postgres.user=postgres
spark.sql.catalog.postgres.password=postgres
-
-spark.sql.catalog.iceberg=org.apache.iceberg.spark.SparkCatalog
-spark.sql.catalog.iceberg.type=hadoop
-spark.sql.catalog.iceberg.warehouse=s3a://iceberg-bucket/iceberg-warehouse
diff --git a/docker/playground/image/kyuubi-playground-base.Dockerfile b/docker/playground/image/kyuubi-playground-base.Dockerfile
index 6ee4ed405..e8375eb68 100644
--- a/docker/playground/image/kyuubi-playground-base.Dockerfile
+++ b/docker/playground/image/kyuubi-playground-base.Dockerfile
@@ -20,4 +20,4 @@ RUN set -x && \
mkdir /opt/busybox && \
busybox --install /opt/busybox
-ENV PATH=/opt/java/openjdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/busybox
+ENV PATH=${PATH}:/opt/busybox
diff --git a/docs/appendix/terminology.md b/docs/appendix/terminology.md
index 21b8cb1b6..b349d77c7 100644
--- a/docs/appendix/terminology.md
+++ b/docs/appendix/terminology.md
@@ -129,9 +129,9 @@ As an enterprise service, SLA commitment is essential. Deploying Kyuubi in High
-## DataLake & LakeHouse
+## DataLake & Lakehouse
-Kyuubi unifies DataLake & LakeHouse access in the simplest pure SQL way, meanwhile it's also the securest way with authentication and SQL standard authorization.
+Kyuubi unifies DataLake & Lakehouse access in the simplest pure SQL way, meanwhile it's also the securest way with authentication and SQL standard authorization.
### Apache Iceberg
@@ -139,7 +139,7 @@ Kyuubi unifies DataLake & LakeHouse access in the simplest pure SQL way, meanwhi
diff --git a/docs/client/advanced/kerberos.md b/docs/client/advanced/kerberos.md
index 4962dd2c8..a9cb55812 100644
--- a/docs/client/advanced/kerberos.md
+++ b/docs/client/advanced/kerberos.md
@@ -242,5 +242,5 @@ jdbc:hive2://:/;kyuubiServerPrinc
- `principal` is inherited from Hive JDBC Driver and is a little ambiguous, and we could use `kyuubiServerPrincipal` as its alias.
- `kyuubi_server_principal` is the value of `kyuubi.kinit.principal` set in `kyuubi-defaults.conf`.
- As a command line argument, JDBC URL should be quoted to avoid being split into 2 commands by ";".
-- As to DBeaver, `;principal=` should be set as the `Database/Schema` argument.
+- As to DBeaver, `;principal=` or `;kyuubiServerPrincipal=` should be set as the `Database/Schema` argument.
diff --git a/docs/client/cli/hive_beeline.rst b/docs/client/cli/hive_beeline.rst
index fda925aa1..f75e00819 100644
--- a/docs/client/cli/hive_beeline.rst
+++ b/docs/client/cli/hive_beeline.rst
@@ -17,7 +17,7 @@ Hive Beeline
============
Kyuubi supports Apache Hive beeline that works with Kyuubi server.
-Hive beeline is a `SQLLine CLI `_ based on the `Hive JDBC Driver <../jdbc/hive_jdbc.html>`_.
+Hive beeline is a `SQLLine CLI `_ based on the `Hive JDBC Driver <../jdbc/hive_jdbc.html>`_.
Prerequisites
-------------
diff --git a/docs/client/cli/index.rst b/docs/client/cli/index.rst
index 61be9ad8c..19122ced4 100644
--- a/docs/client/cli/index.rst
+++ b/docs/client/cli/index.rst
@@ -21,3 +21,4 @@ Command Line Interface(CLI)s
kyuubi_beeline
hive_beeline
+ trino_cli
diff --git a/docs/client/cli/trino_cli.md b/docs/client/cli/trino_cli.md
new file mode 100644
index 000000000..68ebd8300
--- /dev/null
+++ b/docs/client/cli/trino_cli.md
@@ -0,0 +1,88 @@
+
+
+# Trino command line interface
+
+The Trino CLI provides a terminal-based, interactive shell for running queries. It can now be used to connect to the Kyuubi server.
+
+## Start Kyuubi Trino Server
+
+First, configure the Trino protocol and the service port in `kyuubi-defaults.conf`:
+
+```
+kyuubi.frontend.protocols TRINO
+# 10999 is the default port
+kyuubi.frontend.trino.bind.port 10999
+```
+
+## Install
+
+Download [trino-cli-363-executable.jar](https://repo1.maven.org/maven2/io/trino/trino-cli/363/trino-cli-363-executable.jar), rename it to `trino`, make it executable with `chmod +x`, and run it to show the version of the CLI:
+
+```
+wget https://repo1.maven.org/maven2/io/trino/trino-cli/363/trino-cli-363-executable.jar
+mv trino-cli-363-executable.jar trino
+chmod +x trino
+./trino --version
+```
+
+## Running the CLI
+
+The minimal command to start the CLI in interactive mode specifies the URL of the Kyuubi server with the Trino protocol:
+
+```
+./trino --server http://localhost:10999
+```
+
+If successful, you will get a prompt to execute commands. Use the `help` command to see a list of supported commands. Use the `clear` command to clear the terminal. To stop and exit the CLI, run `exit` or `quit`:
+
+```
+trino> help
+
+Supported commands:
+QUIT
+EXIT
+CLEAR
+EXPLAIN [ ( option [, ...] ) ]
+ options: FORMAT { TEXT | GRAPHVIZ | JSON }
+ TYPE { LOGICAL | DISTRIBUTED | VALIDATE | IO }
+DESCRIBE
+SHOW COLUMNS FROM
+SHOW FUNCTIONS
+SHOW CATALOGS [LIKE ]
+SHOW SCHEMAS [FROM ] [LIKE ]
+SHOW TABLES [FROM ] [LIKE ]
+USE [.]
+```
+
+You can now run SQL statements. After processing, the CLI will show results and statistics.
+
+```
+trino> select 1;
+ _col0
+-------
+ 1
+(1 row)
+
+Query 20230216_125233_00806_examine_6hxus, FINISHED, 1 node
+Splits: 1 total, 1 done (100.00%)
+0.29 [0 rows, 0B] [0 rows/s, 0B/s]
+
+trino>
+```
+
+Many other options are available to further configure the CLI in interactive mode; refer to
+https://trino.io/docs/current/client/cli.html#running-the-cli
diff --git a/docs/client/jdbc/hive_jdbc.md b/docs/client/jdbc/hive_jdbc.md
index 42d2f7b5a..00498dfaa 100644
--- a/docs/client/jdbc/hive_jdbc.md
+++ b/docs/client/jdbc/hive_jdbc.md
@@ -19,14 +19,18 @@
## Instructions
-Kyuubi does not provide its own JDBC Driver so far,
-as it is fully compatible with Hive JDBC and ODBC drivers that let you connect to popular Business Intelligence (BI) tools to query,
-analyze and visualize data though Spark SQL engines.
+Kyuubi is fully compatible with Hive JDBC and ODBC drivers that let you connect to popular Business Intelligence (BI)
+tools to query, analyze and visualize data through Spark SQL engines.
+
+It's recommended to use [Kyuubi JDBC driver](./kyuubi_jdbc.html) for new applications.
## Install Hive JDBC
For programming, the easiest way to get `hive-jdbc` is from [the maven central](https://mvnrepository.com/artifact/org.apache.hive/hive-jdbc). For example,
+The following sections demonstrate how to use Hive JDBC driver 2.3.8 to connect to Kyuubi Server; in practice, any version
+up to and including 3.1.x should work fine.
+
- **maven**
```xml
@@ -76,7 +80,3 @@ jdbc:hive2://:/;?#<[spark|hive]Var
jdbc:hive2://localhost:10009/default;hive.server2.proxy.user=proxy_user?kyuubi.engine.share.level=CONNECTION;spark.ui.enabled=false#var_x=y
```
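
The URL layout above (schema, then `;`-separated client properties, then `?` session configurations, then `#` variables) can be assembled programmatically. The following is a minimal Python sketch, not part of Kyuubi or the Hive driver; the helper name `build_kyuubi_jdbc_url` and its parameters are hypothetical and exist only for illustration.

```python
def build_kyuubi_jdbc_url(host, port, schema="default", client_props=None,
                          session_confs=None, variables=None, subprotocol="hive2"):
    """Assemble jdbc:<subprotocol>://host:port/schema;client?session#vars.

    Hypothetical helper: each section is a dict rendered as ';'-separated
    key=value pairs, and empty sections are left out entirely.
    """
    def join(props):
        return ";".join(f"{key}={value}" for key, value in props.items())

    url = f"jdbc:{subprotocol}://{host}:{port}/{schema}"
    if client_props:
        url += ";" + join(client_props)      # client-side properties
    if session_confs:
        url += "?" + join(session_confs)     # session / engine configurations
    if variables:
        url += "#" + join(variables)         # Spark or Hive variables
    return url


url = build_kyuubi_jdbc_url(
    "localhost", 10009,
    client_props={"hive.server2.proxy.user": "proxy_user"},
    session_confs={"kyuubi.engine.share.level": "CONNECTION",
                   "spark.ui.enabled": "false"},
    variables={"var_x": "y"},
)
print(url)
```

With these inputs the helper reproduces the example URL shown above. Keeping the three sections as separate dictionaries makes it harder to put a session configuration into the client-property section by accident.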
-## Unsupported Hive Features
-
-- Connect to HiveServer2 using HTTP transport. ```transportMode=http```
-
diff --git a/docs/client/jdbc/index.rst b/docs/client/jdbc/index.rst
index 31871f138..abcd6a452 100644
--- a/docs/client/jdbc/index.rst
+++ b/docs/client/jdbc/index.rst
@@ -22,4 +22,5 @@ JDBC Drivers
kyuubi_jdbc
hive_jdbc
mysql_jdbc
+ trino_jdbc
diff --git a/docs/client/jdbc/kyuubi_jdbc.rst b/docs/client/jdbc/kyuubi_jdbc.rst
index fdc40d599..7a63dbd98 100644
--- a/docs/client/jdbc/kyuubi_jdbc.rst
+++ b/docs/client/jdbc/kyuubi_jdbc.rst
@@ -17,14 +17,14 @@ Kyuubi Hive JDBC Driver
=======================
.. versionadded:: 1.4.0
- Since 1.4.0, kyuubi community maintains a forked hive jdbc driver module and provides both shaded and non-shaded packages.
+ Kyuubi community maintains a forked Hive JDBC driver module and provides both shaded and non-shaded packages.
-This packages aims to support some missing functionalities of the original hive jdbc.
-For kyuubi engines that support multiple catalogs, it provides meta APIs for better support.
-The behaviors of the original hive jdbc have remained.
+This package aims to support some functionalities missing from the original Hive JDBC driver.
+For Kyuubi engines that support multiple catalogs, it provides meta APIs for better support.
+The behaviors of the original Hive JDBC driver are retained.
-To access a Hive data warehouse or new lakehouse formats, such as Apache Iceberg/Hudi, delta lake using the kyuubi jdbc driver for Apache kyuubi, you need to configure
-the following:
+To access a Hive data warehouse or new Lakehouse formats, such as Apache Iceberg/Hudi and Delta Lake, using the Kyuubi JDBC driver
+for Apache Kyuubi, you need to configure the following:
- The list of driver library files - :ref:`referencing-libraries`.
- The Driver or DataSource class - :ref:`registering_class`.
@@ -46,28 +46,28 @@ In the code, specify the artifact `kyuubi-hive-jdbc-shaded` from `Maven Central`
Maven
^^^^^
-.. code-block:: xml
+.. parsed-literal::
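<dependency>
<groupId>org.apache.kyuubi</groupId>
<artifactId>kyuubi-hive-jdbc-shaded</artifactId>
-<version>1.5.2-incubating</version>
+<version>\ |release|\</version>
</dependency>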
-Sbt
+sbt
^^^
-.. code-block:: sbt
+.. parsed-literal::
- libraryDependencies += "org.apache.kyuubi" % "kyuubi-hive-jdbc-shaded" % "1.5.2-incubating"
+ libraryDependencies += "org.apache.kyuubi" % "kyuubi-hive-jdbc-shaded" % "\ |release|\"
Gradle
^^^^^^
-.. code-block:: gradle
+.. parsed-literal::
- implementation group: 'org.apache.kyuubi', name: 'kyuubi-hive-jdbc-shaded', version: '1.5.2-incubating'
+ implementation group: 'org.apache.kyuubi', name: 'kyuubi-hive-jdbc-shaded', version: '\ |release|\'
Using the Driver in a JDBC Application
**************************************
@@ -92,11 +92,9 @@ connection for JDBC:
.. code-block:: java
- private static Connection connectViaDM() throws Exception
- {
- Connection connection = null;
- connection = DriverManager.getConnection(CONNECTION_URL);
- return connection;
+ private static Connection newKyuubiConnection() throws Exception {
+ Connection connection = DriverManager.getConnection(CONNECTION_URL);
+ return connection;
}
.. _building_url:
@@ -112,12 +110,13 @@ accessing. The following is the format of the connection URL for the Kyuubi Hive
.. code-block:: jdbc
- jdbc:subprotocol://host:port/schema;<[#|?]sessionProperties>
+ jdbc:subprotocol://host:port[/catalog]/[schema];<[#|?]sessionProperties>
- subprotocol: kyuubi or hive2
- host: DNS or IP address of the kyuubi server
- port: The number of the TCP port that the server uses to listen for client requests
-- dbName: Optional database name to set the current database to run the query against, use `default` if absent.
+- catalog: Optional catalog name to set the current catalog to run the query against.
+- schema: Optional database name to set the current database to run the query against, use `default` if absent.
- clientProperties: Optional `semicolon(;)` separated `key=value` parameters identified and affect the client behavior locally. e.g., user=foo;password=bar.
- sessionProperties: Optional `semicolon(;)` separated `key=value` parameters used to configure the session, operation or background engines.
For instance, `kyuubi.engine.share.level=CONNECTION` determines the background engine instance is used only by the current connection. `spark.ui.enabled=false` disables the Spark UI of the engine.
@@ -127,7 +126,7 @@ accessing. The following is the format of the connection URL for the Kyuubi Hive
- Properties are case-sensitive
- Do not duplicate properties in the connection URL
-Connection URL over Http
+Connection URL over HTTP
************************
.. versionadded:: 1.6.0
@@ -145,16 +144,101 @@ Connection URL over Service Discovery
jdbc:subprotocol:///;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=kyuubi
-- zookeeper quorum is the corresponding zookeeper cluster configured by `kyuubi.ha.zookeeper.quorum` at the server side.
-- zooKeeperNamespace is the corresponding namespace configured by `kyuubi.ha.zookeeper.namespace` at the server side.
+- zookeeper quorum is the corresponding zookeeper cluster configured by `kyuubi.ha.addresses` at the server side.
+- zooKeeperNamespace is the corresponding namespace configured by `kyuubi.ha.namespace` at the server side.
-Authentication
---------------
+HiveServer2 Compatibility
+*************************
+.. versionadded:: 1.8.0
-DataTypes
----------
+JDBC Drivers need to negotiate a protocol version with Kyuubi Server/HiveServer2 when connecting.
+
+Kyuubi Hive JDBC Driver offers protocol version v10 (`clientProtocolVersion=9`, supported since Hive 2.3.0)
+to the server by default.
+
+If you need to connect to HiveServer2 before 2.3.0,
+please set client property `clientProtocolVersion` to a lower number.
+
+.. code-block:: jdbc
+
+ jdbc:subprotocol://host:port[/catalog]/[schema];clientProtocolVersion=9;
+
+
+.. tip::
+ All supported protocol versions and corresponding Hive versions can be found in `TProtocolVersion.java`_
+ and its git commits.
+
+Kerberos Authentication
+-----------------------
+Since 1.6.0, the Kyuubi JDBC driver implements Kerberos authentication based on the JAAS framework instead of `Hadoop UserGroupInformation`_,
+which means it does not require Hadoop dependencies to connect to a kerberized Kyuubi Server.
+
+Kyuubi JDBC driver supports different approaches to connect to a kerberized Kyuubi Server. First of all, please follow
+the `krb5.conf instruction`_ to set up ``krb5.conf`` properly.
+
+Authentication by Principal and Keytab
+**************************************
+
+.. versionadded:: 1.6.0
+
+.. tip::
+
+ It's the simplest way, with minimal setup requirements, for Kerberos authentication.
+
+It's straightforward to use principal and keytab for Kerberos authentication; simply configure them in the JDBC URL.
+
+.. code-block::
+
+ jdbc:kyuubi://host:port/schema;kyuubiClientPrincipal=;kyuubiClientKeytab=;kyuubiServerPrincipal=
+
+- kyuubiClientPrincipal: Kerberos ``principal`` for client authentication
+- kyuubiClientKeytab: path of Kerberos ``keytab`` file for client authentication
+- kyuubiServerPrincipal: Kerberos ``principal`` configured by `kyuubi.kinit.principal` at the server side. ``kyuubiServerPrincipal`` is available
+ as an alias of ``principal`` since 1.7.0; use ``principal`` for earlier versions.
+
+Authentication by Principal and TGT Cache
+*****************************************
+
+Another typical usage of Kerberos authentication is using `kinit` to generate the TGT cache first, then the application
+does Kerberos authentication through the TGT cache.
+
+.. code-block::
+
+ jdbc:kyuubi://host:port/schema;kyuubiServerPrincipal=
+
+Authentication by `Hadoop UserGroupInformation`_ ``doAs`` (programming only)
+****************************************************************************
+
+.. tip::
+
+ This approach allows projects that already use `Hadoop UserGroupInformation`_ for Kerberos authentication to easily
+ connect to a kerberized Kyuubi Server. This approach does not work in versions [1.6.0, 1.7.0]; it was fixed in 1.7.1.
+
+.. code-block::
+
+ String jdbcUrl = "jdbc:kyuubi://host:port/schema;kyuubiServerPrincipal=";
+ UserGroupInformation ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(clientPrincipal, clientKeytab);
+ ugi.doAs((PrivilegedExceptionAction) () -> {
+ Connection conn = DriverManager.getConnection(jdbcUrl);
+ ...
+ });
+
+Authentication by Subject (programming only)
+********************************************
+
+.. code-block:: java
+
+ String jdbcUrl = "jdbc:kyuubi://host:port/schema;kyuubiServerPrincipal=;kerberosAuthType=fromSubject";
+ Subject kerberizedSubject = ...;
+ Subject.doAs(kerberizedSubject, (PrivilegedExceptionAction) () -> {
+ Connection conn = DriverManager.getConnection(jdbcUrl);
+ ...
+ });
.. _Maven Central: https://mvnrepository.com/artifact/org.apache.kyuubi/kyuubi-hive-jdbc-shaded
.. _JDBC Applications: ../bi_tools/index.html
.. _java.sql.DriverManager: https://docs.oracle.com/javase/8/docs/api/java/sql/DriverManager.html
+.. _Hadoop UserGroupInformation: https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/security/UserGroupInformation.html
+.. _krb5.conf instruction: https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/tutorials/KerberosReq.html
+.. _TProtocolVersion.java: https://github.com/apache/hive/blob/master/service-rpc/src/gen/thrift/gen-javabean/org/apache/hive/service/rpc/thrift/TProtocolVersion.java
\ No newline at end of file
diff --git a/docs/client/jdbc/trino_jdbc.md b/docs/client/jdbc/trino_jdbc.md
new file mode 100644
index 000000000..0f91c4337
--- /dev/null
+++ b/docs/client/jdbc/trino_jdbc.md
@@ -0,0 +1,92 @@
+
+
+# Trino JDBC Driver
+
+## Instructions
+
+Kyuubi currently supports the Trino connection protocol, so we can use Trino-JDBC to connect to the Kyuubi server
+and submit SQL to Spark, Trino and other engines for execution.
+
+## Start Kyuubi Trino Server
+
+First, configure the Trino protocol and the service port in `kyuubi-defaults.conf`:
+
+```
+kyuubi.frontend.protocols TRINO
+# 10999 is the default port
+kyuubi.frontend.trino.bind.port 10999
+```
+
+## Install Trino JDBC
+
+Download [trino-jdbc-363.jar](https://repo1.maven.org/maven2/io/trino/trino-jdbc/363/trino-jdbc-363.jar) and add it to the classpath of your Java application.
+
+The driver is also available from Maven Central:
+
+```xml
+<dependency>
+    <groupId>io.trino</groupId>
+    <artifactId>trino-jdbc</artifactId>
+    <version>363</version>
+</dependency>
+```
+
+## JDBC URL
+
+When your driver is loaded, registered and configured, you are ready to connect to Trino from your application. The following JDBC URL formats are supported:
+
+```
+jdbc:trino://host:port
+```
+
+Trino JDBC example
+
+```java
+String trinoHost = "localhost";
+String trinoPort = "10999";
+String trinoUser = "default";
+String trinoPassword = null;
+Connection connection = null;
+ResultSet rs = null;
+
+try {
+ // Create the connection using the JDBC URL
+ connection = DriverManager.getConnection("jdbc:trino://" + trinoHost + ":" + trinoPort, trinoUser, trinoPassword);
+
+ // Do whatever you need to do with the connection
+ Statement stmt = connection.createStatement();
+ rs = stmt.executeQuery("SELECT 1");
+
+ while (rs.next()) {
+ // retrieve data from the ResultSet
+ }
+
+} catch (Exception e) {
+ e.printStackTrace();
+} finally {
+ try {
+ // Close the connection when you're done with it
+ if (rs != null) rs.close();
+ if (connection != null) connection.close();
+ } catch (Exception e) {
+ e.printStackTrace();
+ }
+}
+```
+
+The connection parameters are described in the official Trino documentation at: https://trino.io/docs/current/client/jdbc.html#connection-parameters
+
diff --git a/docs/client/python/index.rst b/docs/client/python/index.rst
index 70d2bc9e3..5e8ae4228 100644
--- a/docs/client/python/index.rst
+++ b/docs/client/python/index.rst
@@ -22,4 +22,4 @@ Python
pyhive
pyspark
-
+ jaydebeapi
diff --git a/docs/client/python/jaydebeapi.md b/docs/client/python/jaydebeapi.md
new file mode 100644
index 000000000..3d89fd722
--- /dev/null
+++ b/docs/client/python/jaydebeapi.md
@@ -0,0 +1,87 @@
+
+
+# Python-JayDeBeApi
+
+The [JayDeBeApi](https://pypi.org/project/JayDeBeApi/) module allows you to connect from Python code to databases using Java JDBC.
+It provides a Python DB-API v2.0 to that database.
+
+## Requirements
+
+To install Python-JayDeBeApi, you can use pip, the Python package manager. Open your command-line interface or terminal and run the following command:
+
+```shell
+pip install jaydebeapi
+```
+
+If you want to install JayDeBeApi in Jython, you'll need to ensure that you have either pip or EasyInstall available for Jython. These tools are used to install Python packages, including JayDeBeApi.
+Or you can get a copy of the source by cloning from the [JayDeBeApi GitHub project](https://github.com/baztian/jaydebeapi) and install it.
+
+```shell
+python setup.py install
+```
+
+or if you are using Jython use
+
+```shell
+jython setup.py install
+```
+
+## Preparation
+
+To connect to Kyuubi with the Python-JayDeBeApi package, you need to install the library and configure the relevant JDBC driver. You can download the JDBC driver from the Maven repository and specify its path in Python. Choose the `kyuubi-hive-jdbc-*.jar` package that matches the Kyuubi server version.
+The driver class name is `org.apache.kyuubi.jdbc.KyuubiHiveDriver`.
+
+| Package | Repo |
+|--------------------|-----------------------------------------------------------------------------------------------------|
+| kyuubi jdbc driver | [kyuubi-hive-jdbc-*.jar](https://repo1.maven.org/maven2/org/apache/kyuubi/kyuubi-hive-jdbc-shaded/) |
+
+## Usage
+
+Below is a simple example demonstrating how to use Python-JayDeBeApi to connect to Kyuubi and execute a query:
+
+```python
+import jaydebeapi
+
+# Set JDBC driver path and connection URL
+driver = "org.apache.kyuubi.jdbc.KyuubiHiveDriver"
+url = "jdbc:kyuubi://host:port/default"
+jdbc_driver_path = ["/path/to/kyuubi-hive-jdbc-*.jar"]
+
+# Connect to the database using JayDeBeApi
+conn = jaydebeapi.connect(driver, url, ["user", "password"], jdbc_driver_path)
+
+# Create a cursor object
+cursor = conn.cursor()
+
+# Execute the SQL query
+cursor.execute("SELECT * FROM example_table LIMIT 10")
+
+# Retrieve query results
+result_set = cursor.fetchall()
+
+# Process the results
+for row in result_set:
+ print(row)
+
+# Close the cursor and the connection
+cursor.close()
+conn.close()
+```
+
+Make sure to replace the placeholders (host, port, user, password) with your actual Kyuubi configuration.
+With the above code, you can connect to Kyuubi and execute SQL queries in Python. Please handle exceptions and errors appropriately in real-world applications.
diff --git a/docs/client/python/pyhive.md b/docs/client/python/pyhive.md
index dbebf684f..b5e57ea2e 100644
--- a/docs/client/python/pyhive.md
+++ b/docs/client/python/pyhive.md
@@ -64,7 +64,47 @@ If password is provided for connection, make sure the `auth` param set to either
```python
# open connection
-conn = hive.Connection(host=kyuubi_host,port=10009,
-user='user', password='password', auth='CUSTOM')
+conn = hive.Connection(host=kyuubi_host, port=10009,
+ username='user', password='password', auth='CUSTOM')
+```
+
+Use Kerberos to connect to Kyuubi.
+
+`kerberos_service_name` must be the name of the service that started the Kyuubi server, usually the part of `kyuubi.kinit.principal` before the first slash.
+
+Note that PyHive does not support passing in `principal` directly; it assembles the server principal from `kerberos_service_name` and `kyuubi_host`.
+
+```python
+# open connection
+conn = hive.Connection(host=kyuubi_host, port=10009, auth="KERBEROS", kerberos_service_name="kyuubi")
+```
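
As an illustration of the rule above, the service name can be derived from a principal string with plain string handling. This is a minimal sketch: the helper `kerberos_service_name` and the sample principal are hypothetical, and it assumes the common `primary/instance@REALM` layout.

```python
def kerberos_service_name(principal: str) -> str:
    """Return the primary component of a Kerberos principal.

    Assumes the usual primary/instance@REALM layout; the primary is the
    text before the first slash (or the whole name if there is no slash).
    """
    without_realm = principal.split("@", 1)[0]   # drop "@REALM" if present
    return without_realm.split("/", 1)[0]        # keep text before first "/"


# Hypothetical value of kyuubi.kinit.principal, for illustration only.
service_name = kerberos_service_name("kyuubi/kyuubi-host.example.com@EXAMPLE.COM")
print(service_name)
```

The result is the value to pass as `kerberos_service_name` in `hive.Connection`.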
+
+If you encounter the following errors, you need to install related packages.
+
+```
+thrift.transport.TTransport.TTransportException: Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found'
+```
+
+```bash
+yum install -y cyrus-sasl-plain cyrus-sasl-devel cyrus-sasl-gssapi cyrus-sasl-md5
+```
+
+Note that PyHive does not support connecting through ZooKeeper-based HA. Instead, you can connect to ZooKeeper via [Kazoo](https://pypi.org/project/kazoo/) to look up the service address.
+
+Code reference [https://stackoverflow.com/a/73326589](https://stackoverflow.com/a/73326589)
+
+```python
+from pyhive import hive
+import random
+from kazoo.client import KazooClient
+zk = KazooClient(hosts='kyuubi1.xx.com:2181,kyuubi2.xx.com:2181,kyuubi3.xx.com:2181', read_only=True)
+zk.start()
+servers = [kyuubi_server.split(';')[0].split('=')[1].split(':')
+ for kyuubi_server
+ in zk.get_children(path='kyuubi')]
+kyuubi_host, kyuubi_port = random.choice(servers)
+zk.stop()
+print(kyuubi_host, kyuubi_port)
+conn = hive.Connection(host=kyuubi_host, port=kyuubi_port, auth="KERBEROS", kerberos_service_name="kyuubi")
```
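
The chained `split` calls in the example above can be isolated into a small, testable function. The sketch below is an illustration only: the helper name `parse_service_uri` is hypothetical, and the sample znode name is an assumed value shaped to match what that example parses (a leading `serviceUri=host:port` field followed by `;`-separated fields).

```python
from typing import Tuple


def parse_service_uri(znode_name: str) -> Tuple[str, int]:
    """Extract (host, port) from a znode name whose first ';'-separated
    field looks like 'serviceUri=host:port'."""
    first_field = znode_name.split(";")[0]        # "serviceUri=host:port"
    _, _, address = first_field.partition("=")    # "host:port"
    host, _, port = address.rpartition(":")       # split on the last colon
    return host, int(port)                        # return the port as an int


# Assumed sample name, for illustration only.
sample = "serviceUri=kyuubi1.xx.com:10009;version=1.7.3;sequence=0000000001"
kyuubi_host, kyuubi_port = parse_service_uri(sample)
print(kyuubi_host, kyuubi_port)
```

Returning the port as an integer avoids handing a string port to the connection call, and the function can be checked without a running ZooKeeper.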
diff --git a/docs/client/rest/rest_api.md b/docs/client/rest/rest_api.md
index f863404a6..fc04857d0 100644
--- a/docs/client/rest/rest_api.md
+++ b/docs/client/rest/rest_api.md
@@ -89,19 +89,16 @@ Create a session
#### Request Parameters
-| Name | Description | Type |
-|:----------------|:-----------------------------------------|:-------|
-| protocolVersion | The protocol version of Hive CLI service | Int |
-| user | The user name | String |
-| password | The user password | String |
-| ipAddr | The user client IP address | String |
-| configs | The configuration of the session | Map |
+| Name | Description | Type |
+|:--------|:---------------------------------|:-----|
+| configs | The configuration of the session | Map |
#### Response Body
-| Name | Description | Type |
-|:-----------|:------------------------------|:-------|
-| identifier | The session handle identifier | String |
+| Name | Description | Type |
+|:---------------|:---------------------------------------------------------------------------------------------------|:-------|
+| identifier | The session handle identifier | String |
+| kyuubiInstance | The Kyuubi instance that holds the session and to call for the following operations in the session | String |
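
As a rough illustration of the request shape, the sketch below builds (but does not send) a session-creation request with only `configs` in the body. The helper name `build_open_session_request`, the `POST /sessions` path, the base URL (borrowed from the batch `curl` example), and the sample config entry are assumptions for illustration; authentication headers are omitted.

```python
import json
import urllib.request


def build_open_session_request(base_url, configs):
    """Build (but do not send) a POST request whose JSON body carries `configs`."""
    body = json.dumps({"configs": configs}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Base URL and config entry are assumptions for illustration.
req = build_open_session_request(
    "http://localhost:10099/api/v1",
    {"kyuubi.engine.share.level": "CONNECTION"},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Per the response table above, a successful call would return `identifier` and `kyuubiInstance`; later calls for the same session would be sent to that instance.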
### DELETE /sessions/${sessionHandle}
@@ -113,11 +110,12 @@ Create an operation with EXECUTE_STATEMENT type
#### Request Body
-| Name | Description | Type |
-|:-------------|:---------------------------------------------------------------|:--------|
-| statement | The SQL statement that you execute | String |
-| runAsync | The flag indicates whether the query runs synchronously or not | Boolean |
-| queryTimeout | The interval of query time out | Long |
+| Name | Description | Type |
+|:-------------|:---------------------------------------------------------------|:---------------|
+| statement | The SQL statement that you execute | String |
+| runAsync | The flag indicates whether the query runs synchronously or not | Boolean |
+| queryTimeout | The interval of query time out | Long |
+| confOverlay | The conf to overlay only for current operation | Map of key=val |
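
A request body using the fields above might be assembled as follows. This is a sketch under assumptions: the helper name `build_statement_body` is hypothetical, `confOverlay` is assumed to serialize as a plain JSON object of string keys and values, and the unit of `queryTimeout` is not stated in the table, so the value is passed through unchanged.

```python
import json


def build_statement_body(statement, run_async=True, query_timeout=0, conf_overlay=None):
    """Return a JSON body for an EXECUTE_STATEMENT operation request."""
    return json.dumps({
        "statement": statement,
        "runAsync": run_async,
        "queryTimeout": query_timeout,
        # Assumed to apply to this operation only, per the table above.
        "confOverlay": conf_overlay or {},
    })


body = build_statement_body(
    "SELECT 1",
    conf_overlay={"spark.sql.shuffle.partitions": "10"},  # illustrative key
)
print(body)
```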
#### Response Body
@@ -400,7 +398,7 @@ curl --location --request POST 'http://localhost:10099/api/v1/batches' \
The created [Batch](#batch) object.
-### GET /batches/{batchId}
+### GET /batches/${batchId}
Returns the batch information.
@@ -451,7 +449,13 @@ Refresh the Hadoop configurations of the Kyuubi server.
### POST /admin/refresh/user_defaults_conf
-Refresh the [user defaults configs](../../deployment/settings.html#user-defaults) with key in format in the form of `___{username}___.{config key}` from default property file.
+Refresh the [user defaults configs](../../configuration/settings.html#user-defaults) whose keys take the form `___{username}___.{config key}` from the default property file.
+
+### POST /admin/refresh/kubernetes_conf
+
+Refresh the Kubernetes configs whose keys are prefixed with `kyuubi.kubernetes` from the default property file.
+
+It is helpful if you need to support multiple Kubernetes contexts and namespaces; see [KYUUBI #4843](https://github.com/apache/kyuubi/issues/4843).
### DELETE /admin/engine
@@ -493,6 +497,7 @@ The [Engine](#engine) List.
| user | The user created the batch | String |
| batchType | The batch type | String |
| name | The batch name | String |
+| appStartTime | The batch application start time | Long |
| appId | The batch application Id | String |
| appUrl | The batch application tracking url | String |
| appState | The batch application state | String |
diff --git a/docs/community/release.md b/docs/community/release.md
index 5d3a00b03..f2c8541b1 100644
--- a/docs/community/release.md
+++ b/docs/community/release.md
@@ -43,12 +43,14 @@ The release process consists of several steps:
1. Decide to release
2. Prepare for the release
-3. Cut branch off for __major__ release
+3. Cut branch off for __feature__ release
4. Build a release candidate
5. Vote on the release candidate
6. If necessary, fix any issues and go back to step 3.
7. Finalize the release
8. Promote the release
+9. Remove the dist repo directories for deprecated release candidates
+10. Publish docker image
## Decide to release
@@ -151,12 +153,12 @@ gpg --keyserver hkp://keyserver.ubuntu.com --send-keys ${PUBLIC_KEY} # send publ
gpg --keyserver hkp://keyserver.ubuntu.com --recv-keys ${PUBLIC_KEY} # verify
```
-## Cut branch if for major release
+## Cut branch off for feature release
Kyuubi uses the version pattern `{MAJOR_VERSION}.{MINOR_VERSION}.{PATCH_VERSION}[-{OPTIONAL_SUFFIX}]`, e.g. `1.7.0`.
-__Major Release__ means `MAJOR_VERSION` or `MINOR_VERSION` changed, and __Patch Release__ means `PATCH_VERSION` changed.
+__Feature Release__ means `MAJOR_VERSION` or `MINOR_VERSION` changed, and __Patch Release__ means `PATCH_VERSION` changed.
-The main step towards preparing a major release is to create a release branch. This is done via standard Git branching
+The main step towards preparing a feature release is to create a release branch. This is done via standard Git branching
mechanism and should be announced to the community once the branch is created.
> Note: If you are releasing a patch version, you can ignore this step.
@@ -169,29 +171,49 @@ After cutting release branch, don't forget bump version in `master` branch.
> Don't forget to switch to the release branch!
-1. Set environment variables.
+- Set environment variables.
```shell
export RELEASE_VERSION=
export RELEASE_RC_NO=
+export NEXT_VERSION=
```
-2. Bump version.
+- Bump version, and create a git tag for the release candidate.
+
+Because other committers may merge PRs during your release period, complete the version change
+first, and then come back to the release candidate tag to continue the rest of the release process.
+
+The tag pattern is `v${RELEASE_VERSION}-rc${RELEASE_RC_NO}`, e.g. `v1.7.0-rc0`
+
+> NOTE: After the vote passes, be sure to create a final tag with the pattern: `v${RELEASE_VERSION}`
```shell
+# Bump to the release version
build/mvn versions:set -DgenerateBackupPoms=false -DnewVersion="${RELEASE_VERSION}"
-
+(cd kyuubi-server/web-ui && npm version "${RELEASE_VERSION}")
git commit -am "[RELEASE] Bump ${RELEASE_VERSION}"
-```
-3. Create a git tag for the release candidate.
+# Create tag
+git tag v${RELEASE_VERSION}-rc${RELEASE_RC_NO}
-The tag pattern is `v${RELEASE_VERSION}-rc${RELEASE_RC_NO}`, e.g. `v1.7.0-rc0`
+# Prepare for the next development version
+build/mvn versions:set -DgenerateBackupPoms=false -DnewVersion="${NEXT_VERSION}-SNAPSHOT"
+(cd kyuubi-server/web-ui && npm version "${NEXT_VERSION}-SNAPSHOT")
+git commit -am "[RELEASE] Bump ${NEXT_VERSION}-SNAPSHOT"
-> NOTE: After all the voting passed, be sure to create a final tag with the pattern: `v${RELEASE_VERSION}`
+# Push branch to apache remote repo
+git push apache
-4. Package the release binaries & sources, and upload them to the Apache staging SVN repo. Publish jars to the Apache
- staging Maven repo.
+# Push tag to apache remote repo
+git push apache v${RELEASE_VERSION}-rc${RELEASE_RC_NO}
+
+# Go back to release candidate tag
+git checkout v${RELEASE_VERSION}-rc${RELEASE_RC_NO}
+```
+
+- Package source and binary artifacts, and upload them to the Apache staging SVN repo. Publish jars to the Apache
+ staging Maven repo.
```shell
build/release/release.sh publish
@@ -199,7 +221,7 @@ build/release/release.sh publish
To make your release available in the staging repository, you must close the staging repo in the [Apache Nexus](https://repository.apache.org/#stagingRepositories). Until you close, you can re-run deploying to staging multiple times. But once closed, it will create a new staging repo. So ensure you close this, so that the next RC (if need be) is on a new repo. Once everything is good, close the staging repository on Apache Nexus.
-5. Generate a pre-release note from GitHub for the subsequent voting.
+- Generate a pre-release note from GitHub for the subsequent voting.
Go to the [release page](https://github.com/apache/kyuubi/releases) and click the "Draft a new release" button, which opens a new page to prepare the release.
@@ -255,8 +277,7 @@ Fork and clone [Apache Kyuubi website](https://github.com/apache/kyuubi-website)
1. Add a new markdown file in `src/zh/news/`, `src/en/news/`
2. Add a new markdown file in `src/zh/release/`, `src/en/release/`
-3. Follow [Build Document](../develop_tools/build_document.md) to build documents, then copy `apache/kyuubi`'s
- folder `docs/_build/html` to `apache/kyuubi-website`'s folder `content/docs/r{RELEASE_VERSION}`
+3. Update `releases` defined in `hugo.toml`'s `[params]` part.
### Create an Announcement
@@ -280,3 +301,9 @@ svn delete https://dist.apache.org/repos/dist/dev/kyuubi/{RELEASE_TAG} \
--message "Remove deprecated Apache Kyuubi ${RELEASE_TAG}"
```
+## Keep other artifacts up-to-date
+
+- Docker Image: https://github.com/apache/kyuubi-docker/blob/master/release/release_guide.md
+- Helm Charts: https://github.com/apache/kyuubi/blob/master/charts/kyuubi/Chart.yaml
+- Playground: https://github.com/apache/kyuubi/blob/master/docker/playground/.env
+
diff --git a/docs/conf.py b/docs/conf.py
index 3df98c6e3..eaac1aced 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -64,7 +64,7 @@
author = 'Apache Kyuubi Community'
# The full version, including alpha/beta/rc tags
-release = subprocess.getoutput("cd .. && build/mvn help:evaluate -Dexpression=project.version|grep -v Using|grep -v INFO|grep -v WARNING|tail -n 1").split('\n')[-1]
+release = subprocess.getoutput("grep 'kyuubi-parent' -C1 ../pom.xml | grep '<version>' | awk -F '[<>]' '{print $3}'")
# -- General configuration ---------------------------------------------------
@@ -77,9 +77,11 @@
'sphinx.ext.napoleon',
'sphinx.ext.mathjax',
'recommonmark',
+ 'sphinx_copybutton',
'sphinx_markdown_tables',
'sphinx_togglebutton',
'notfound.extension',
+ 'sphinxemoji.sphinxemoji',
]
master_doc = 'index'
diff --git a/docs/deployment/settings.md b/docs/configuration/settings.md
similarity index 62%
rename from docs/deployment/settings.md
rename to docs/configuration/settings.md
index f8beaa83b..5e00d0b75 100644
--- a/docs/deployment/settings.md
+++ b/docs/configuration/settings.md
@@ -16,151 +16,62 @@
-->
-# Introduction to the Kyuubi Configurations System
+# Configurations
Kyuubi provides several ways to configure the system and corresponding engines.
## Environments
-You can configure the environment variables in `$KYUUBI_HOME/conf/kyuubi-env.sh`, e.g, `JAVA_HOME`, then this java runtime will be used both for Kyuubi server instance and the applications it launches. You can also change the variable in the subprocess's env configuration file, e.g.`$SPARK_HOME/conf/spark-env.sh` to use more specific ENV for SQL engine applications.
-
-```bash
-#!/usr/bin/env bash
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-#
-# - JAVA_HOME Java runtime to use. By default use "java" from PATH.
-#
-#
-# - KYUUBI_CONF_DIR Directory containing the Kyuubi configurations to use.
-# (Default: $KYUUBI_HOME/conf)
-# - KYUUBI_LOG_DIR Directory for Kyuubi server-side logs.
-# (Default: $KYUUBI_HOME/logs)
-# - KYUUBI_PID_DIR Directory stores the Kyuubi instance pid file.
-# (Default: $KYUUBI_HOME/pid)
-# - KYUUBI_MAX_LOG_FILES Maximum number of Kyuubi server logs can rotate to.
-# (Default: 5)
-# - KYUUBI_JAVA_OPTS JVM options for the Kyuubi server itself in the form "-Dx=y".
-# (Default: none).
-# - KYUUBI_CTL_JAVA_OPTS JVM options for the Kyuubi ctl itself in the form "-Dx=y".
-# (Default: none).
-# - KYUUBI_BEELINE_OPTS JVM options for the Kyuubi BeeLine in the form "-Dx=Y".
-# (Default: none)
-# - KYUUBI_NICENESS The scheduling priority for Kyuubi server.
-# (Default: 0)
-# - KYUUBI_WORK_DIR_ROOT Root directory for launching sql engine applications.
-# (Default: $KYUUBI_HOME/work)
-# - HADOOP_CONF_DIR Directory containing the Hadoop / YARN configuration to use.
-# - YARN_CONF_DIR Directory containing the YARN configuration to use.
-#
-# - SPARK_HOME Spark distribution which you would like to use in Kyuubi.
-# - SPARK_CONF_DIR Optional directory where the Spark configuration lives.
-# (Default: $SPARK_HOME/conf)
-# - FLINK_HOME Flink distribution which you would like to use in Kyuubi.
-# - FLINK_CONF_DIR Optional directory where the Flink configuration lives.
-# (Default: $FLINK_HOME/conf)
-# - FLINK_HADOOP_CLASSPATH Required Hadoop jars when you use the Kyuubi Flink engine.
-# - HIVE_HOME Hive distribution which you would like to use in Kyuubi.
-# - HIVE_CONF_DIR Optional directory where the Hive configuration lives.
-# (Default: $HIVE_HOME/conf)
-# - HIVE_HADOOP_CLASSPATH Required Hadoop jars when you use the Kyuubi Hive engine.
-#
-
-
-## Examples ##
-
-# export JAVA_HOME=/usr/jdk64/jdk1.8.0_152
-# export SPARK_HOME=/opt/spark
-# export FLINK_HOME=/opt/flink
-# export HIVE_HOME=/opt/hive
-# export FLINK_HADOOP_CLASSPATH=/path/to/hadoop-client-runtime-3.3.2.jar:/path/to/hadoop-client-api-3.3.2.jar
-# export HIVE_HADOOP_CLASSPATH=${HADOOP_HOME}/share/hadoop/common/lib/commons-collections-3.2.2.jar:${HADOOP_HOME}/share/hadoop/client/hadoop-client-runtime-3.1.0.jar:${HADOOP_HOME}/share/hadoop/client/hadoop-client-api-3.1.0.jar:${HADOOP_HOME}/share/hadoop/common/lib/htrace-core4-4.1.0-incubating.jar
-# export HADOOP_CONF_DIR=/usr/ndp/current/mapreduce_client/conf
-# export YARN_CONF_DIR=/usr/ndp/current/yarn/conf
-# export KYUUBI_JAVA_OPTS="-Xmx10g -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseCondCardMark -XX:MaxDirectMemorySize=1024m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./logs -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -Xloggc:./logs/kyuubi-server-gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=5M -XX:NewRatio=3 -XX:MetaspaceSize=512m"
-# export KYUUBI_BEELINE_OPTS="-Xmx2g -XX:+UnlockDiagnosticVMOptions -XX:ParGCCardsPerStrideChunk=4096 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSClassUnloadingEnabled -XX:+CMSParallelRemarkEnabled -XX:+UseCondCardMark"
-```
-
+You can configure the environment variables in `$KYUUBI_HOME/conf/kyuubi-env.sh`, e.g., `JAVA_HOME`; this Java runtime will then be used both for the Kyuubi server instance and the applications it launches. You can also change the variable in the subprocess's env configuration file, e.g., `$SPARK_HOME/conf/spark-env.sh`, to use a more specific ENV for SQL engine applications. See `$KYUUBI_HOME/conf/kyuubi-env.sh.template` as an example.
For environment variables that only need to be passed to the engine side, you can set them with a Kyuubi configuration item of the form `kyuubi.engineEnv.VAR_NAME`. For example, with `kyuubi.engineEnv.SPARK_DRIVER_MEMORY=4g`, the environment variable `SPARK_DRIVER_MEMORY` with value `4g` would be passed to the engine side. With `kyuubi.engineEnv.SPARK_CONF_DIR=/apache/confs/spark/conf`, the value of `SPARK_CONF_DIR` on the engine side is set to `/apache/confs/spark/conf`.
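
The mapping from `kyuubi.engineEnv.VAR_NAME` entries to engine-side environment variables can be pictured with the sketch below. It is an illustration only, not Kyuubi's implementation; the helper name `engine_env_from_conf` and the use of a plain dict for the configuration are assumptions.

```python
ENGINE_ENV_PREFIX = "kyuubi.engineEnv."


def engine_env_from_conf(conf):
    """Collect `kyuubi.engineEnv.VAR_NAME` entries as a VAR_NAME -> value dict."""
    return {
        key[len(ENGINE_ENV_PREFIX):]: value
        for key, value in conf.items()
        if key.startswith(ENGINE_ENV_PREFIX)
    }


conf = {
    "kyuubi.engineEnv.SPARK_DRIVER_MEMORY": "4g",
    "kyuubi.engineEnv.SPARK_CONF_DIR": "/apache/confs/spark/conf",
    "kyuubi.frontend.bind.port": "10009",  # not an engineEnv entry, so it is skipped
}
print(engine_env_from_conf(conf))
```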
## Kyuubi Configurations
-You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.conf`. For example:
-
-```bash
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements. See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License. You may obtain a copy of the License at
-#
-# http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-## Kyuubi Configurations
-
-#
-# kyuubi.authentication NONE
-# kyuubi.frontend.bind.host localhost
-# kyuubi.frontend.bind.port 10009
-#
-
-# Details in https://kyuubi.readthedocs.io/en/master/deployment/settings.html
-```
+You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.conf`; see `$KYUUBI_HOME/conf/kyuubi-defaults.conf.template` as an example.
### Authentication
-| Key | Default | Meaning | Type | Since |
-|-----------------------------------------|-------------------|---------|--------|-------|
-| kyuubi.authentication | NONE | A comma-separated list of client authentication types.<ul><li>The following tree describes the catalog of each option.</li><ul><li>NOSASL</li><li>SASL<ul><li>SASL/PLAIN<ul><li>NONE</li><li>LDAP</li><li>JDBC</li><li>CUSTOM</li></ul></li><li>SASL/GSSAPI<ul><li>KERBEROS</li></ul></li></ul></li></ul></ul> Note that: for SASL authentication, KERBEROS and PLAIN auth types are supported at the same time, and only the first specified PLAIN auth type is valid. | seq | 1.0.0 |
-| kyuubi.authentication.custom.class | <undefined> | User-defined authentication implementation of org.apache.kyuubi.service.authentication.PasswdAuthenticationProvider | string | 1.3.0 |
-| kyuubi.authentication.jdbc.driver.class | <undefined> | Driver class name for JDBC Authentication Provider. | string | 1.6.0 |
-| kyuubi.authentication.jdbc.password | <undefined> | Database password for JDBC Authentication Provider. | string | 1.6.0 |
-| kyuubi.authentication.jdbc.query | <undefined> | Query SQL template with placeholders for JDBC Authentication Provider to execute. Authentication passes if the result set is not empty.The SQL statement must start with the `SELECT` clause. Available placeholders are `${user}` and `${password}`. | string | 1.6.0 |
-| kyuubi.authentication.jdbc.url | <undefined> | JDBC URL for JDBC Authentication Provider. | string | 1.6.0 |
-| kyuubi.authentication.jdbc.user | <undefined> | Database user for JDBC Authentication Provider. | string | 1.6.0 |
-| kyuubi.authentication.ldap.base.dn | <undefined> | LDAP base DN. | string | 1.0.0 |
-| kyuubi.authentication.ldap.domain | <undefined> | LDAP domain. | string | 1.0.0 |
-| kyuubi.authentication.ldap.guidKey | uid | LDAP attribute name whose values are unique in this LDAP server.For example:uid or cn. | string | 1.2.0 |
-| kyuubi.authentication.ldap.url | <undefined> | SPACE character separated LDAP connection URL(s). | string | 1.0.0 |
-| kyuubi.authentication.sasl.qop | auth | Sasl QOP enable higher levels of protection for Kyuubi communication with clients.<ul><li>auth - authentication only (default)</li><li>auth-int - authentication plus integrity protection</li><li>auth-conf - authentication plus integrity and confidentiality protection. This is applicable only if Kyuubi is configured to use Kerberos authentication.</li></ul> | string | 1.0.0 |
+| Key | Default | Meaning | Type | Since |
+|-----------------------------------------------|-------------------|---------|--------|-------|
+| kyuubi.authentication | NONE | A comma-separated list of client authentication types.<ul><li>The following tree describes the catalog of each option.</li><ul><li>NOSASL</li><li>SASL<ul><li>SASL/PLAIN<ul><li>NONE</li><li>LDAP</li><li>JDBC</li><li>CUSTOM</li></ul></li><li>SASL/GSSAPI<ul><li>KERBEROS</li></ul></li></ul></li></ul></ul> Note that: for SASL authentication, KERBEROS and PLAIN auth types are supported at the same time, and only the first specified PLAIN auth type is valid. | set | 1.0.0 |
+| kyuubi.authentication.custom.class | <undefined> | User-defined authentication implementation of org.apache.kyuubi.service.authentication.PasswdAuthenticationProvider | string | 1.3.0 |
+| kyuubi.authentication.jdbc.driver.class | <undefined> | Driver class name for JDBC Authentication Provider. | string | 1.6.0 |
+| kyuubi.authentication.jdbc.password | <undefined> | Database password for JDBC Authentication Provider. | string | 1.6.0 |
+| kyuubi.authentication.jdbc.query | <undefined> | Query SQL template with placeholders for JDBC Authentication Provider to execute. Authentication passes if the result set is not empty.The SQL statement must start with the `SELECT` clause. Available placeholders are `${user}` and `${password}`. | string | 1.6.0 |
+| kyuubi.authentication.jdbc.url | <undefined> | JDBC URL for JDBC Authentication Provider. | string | 1.6.0 |
+| kyuubi.authentication.jdbc.user | <undefined> | Database user for JDBC Authentication Provider. | string | 1.6.0 |
+| kyuubi.authentication.ldap.baseDN | <undefined> | LDAP base DN. | string | 1.7.0 |
+| kyuubi.authentication.ldap.binddn | <undefined> | The user with which to bind to the LDAP server, and search for the full domain name of the user being authenticated. This should be the full domain name of the user, and should have search access across all users in the LDAP tree. If not specified, then the user being authenticated will be used as the bind user. For example: CN=bindUser,CN=Users,DC=subdomain,DC=domain,DC=com | string | 1.7.0 |
+| kyuubi.authentication.ldap.bindpw | <undefined> | The password for the bind user, to be used to search for the full name of the user being authenticated. If the username is specified, this parameter must also be specified. | string | 1.7.0 |
+| kyuubi.authentication.ldap.customLDAPQuery | <undefined> | A full LDAP query that LDAP Atn provider uses to execute against LDAP Server. If this query returns a null resultset, the LDAP Provider fails the Authentication request, succeeds if the user is part of the resultset.For example: `(&(objectClass=group)(objectClass=top)(instanceType=4)(cn=Domain*))`, `(&(objectClass=person)(|(sAMAccountName=admin)(|(memberOf=CN=Domain Admins,CN=Users,DC=domain,DC=com)(memberOf=CN=Administrators,CN=Builtin,DC=domain,DC=com))))` | string | 1.7.0 |
+| kyuubi.authentication.ldap.domain | <undefined> | LDAP domain. | string | 1.0.0 |
+| kyuubi.authentication.ldap.groupClassKey | groupOfNames | LDAP attribute name on the group entry that is to be used in LDAP group searches. For example: group, groupOfNames or groupOfUniqueNames. | string | 1.7.0 |
+| kyuubi.authentication.ldap.groupDNPattern | <undefined> | COLON-separated list of patterns to use to find DNs for group entities in this directory. Use %s where the actual group name is to be substituted for. For example: CN=%s,CN=Groups,DC=subdomain,DC=domain,DC=com. | string | 1.7.0 |
+| kyuubi.authentication.ldap.groupFilter || COMMA-separated list of LDAP Group names (short name not full DNs). For example: HiveAdmins,HadoopAdmins,Administrators | set | 1.7.0 |
+| kyuubi.authentication.ldap.groupMembershipKey | member | LDAP attribute name on the group object that contains the list of distinguished names for the user, group, and contact objects that are members of the group. For example: member, uniqueMember or memberUid | string | 1.7.0 |
+| kyuubi.authentication.ldap.guidKey | uid | LDAP attribute name whose values are unique in this LDAP server. For example: uid or CN. | string | 1.2.0 |
+| kyuubi.authentication.ldap.url | <undefined> | SPACE character separated LDAP connection URL(s). | string | 1.0.0 |
+| kyuubi.authentication.ldap.userDNPattern | <undefined> | COLON-separated list of patterns to use to find DNs for users in this directory. Use %s where the actual group name is to be substituted for. For example: CN=%s,CN=Users,DC=subdomain,DC=domain,DC=com. | string | 1.7.0 |
+| kyuubi.authentication.ldap.userFilter || COMMA-separated list of LDAP usernames (just short names, not full DNs). For example: hiveuser,impalauser,hiveadmin,hadoopadmin | set | 1.7.0 |
+| kyuubi.authentication.ldap.userMembershipKey | <undefined> | LDAP attribute name on the user object that contains groups of which the user is a direct member, except for the primary group, which is represented by the primaryGroupId. For example: memberOf | string | 1.7.0 |
+| kyuubi.authentication.sasl.qop | auth | Sasl QOP enable higher levels of protection for Kyuubi communication with clients.<ul><li>auth - authentication only (default)</li><li>auth-int - authentication plus integrity protection</li><li>auth-conf - authentication plus integrity and confidentiality protection. This is applicable only if Kyuubi is configured to use Kerberos authentication.</li></ul> | string | 1.0.0 |
### Backend
-| Key | Default | Meaning | Type | Since |
-|--------------------------------------------------|---------------------------|---------|----------|-------|
-| kyuubi.backend.engine.exec.pool.keepalive.time | PT1M | Time(ms) that an idle async thread of the operation execution thread pool will wait for a new task to arrive before terminating in SQL engine applications | duration | 1.0.0 |
-| kyuubi.backend.engine.exec.pool.shutdown.timeout | PT10S | Timeout(ms) for the operation execution thread pool to terminate in SQL engine applications | duration | 1.0.0 |
-| kyuubi.backend.engine.exec.pool.size | 100 | Number of threads in the operation execution thread pool of SQL engine applications | int | 1.0.0 |
-| kyuubi.backend.engine.exec.pool.wait.queue.size | 100 | Size of the wait queue for the operation execution thread pool in SQL engine applications | int | 1.0.0 |
-| kyuubi.backend.server.event.json.log.path | file:///tmp/kyuubi/events | The location of server events go for the built-in JSON logger | string | 1.4.0 |
-| kyuubi.backend.server.event.loggers || A comma-separated list of server history loggers, where session/operation etc events go.<ul><li>JSON: the events will be written to the location of kyuubi.backend.server.event.json.log.path</li><li>JDBC: to be done</li><li>CUSTOM: User-defined event handlers.</li></ul> Note that: Kyuubi supports custom event handlers with the Java SPI. To register a custom event handler, the user needs to implement a class which is a child of org.apache.kyuubi.events.handler.CustomEventHandlerProvider which has a zero-arg constructor. | seq | 1.4.0 |
-| kyuubi.backend.server.exec.pool.keepalive.time | PT1M | Time(ms) that an idle async thread of the operation execution thread pool will wait for a new task to arrive before terminating in Kyuubi server | duration | 1.0.0 |
-| kyuubi.backend.server.exec.pool.shutdown.timeout | PT10S | Timeout(ms) for the operation execution thread pool to terminate in Kyuubi server | duration | 1.0.0 |
-| kyuubi.backend.server.exec.pool.size | 100 | Number of threads in the operation execution thread pool of Kyuubi server | int | 1.0.0 |
-| kyuubi.backend.server.exec.pool.wait.queue.size | 100 | Size of the wait queue for the operation execution thread pool of Kyuubi server | int | 1.0.0 |
+| Key | Default | Meaning | Type | Since |
+|--------------------------------------------------|---------------------------|---------|----------|-------|
+| kyuubi.backend.engine.exec.pool.keepalive.time | PT1M | Time(ms) that an idle async thread of the operation execution thread pool will wait for a new task to arrive before terminating in SQL engine applications | duration | 1.0.0 |
+| kyuubi.backend.engine.exec.pool.shutdown.timeout | PT10S | Timeout(ms) for the operation execution thread pool to terminate in SQL engine applications | duration | 1.0.0 |
+| kyuubi.backend.engine.exec.pool.size | 100 | Number of threads in the operation execution thread pool of SQL engine applications | int | 1.0.0 |
+| kyuubi.backend.engine.exec.pool.wait.queue.size | 100 | Size of the wait queue for the operation execution thread pool in SQL engine applications | int | 1.0.0 |
+| kyuubi.backend.server.event.json.log.path | file:///tmp/kyuubi/events | The location of server events go for the built-in JSON logger | string | 1.4.0 |
+| kyuubi.backend.server.event.kafka.close.timeout | PT5S | Period to wait for Kafka producer of server event handlers to close. | duration | 1.8.0 |
+| kyuubi.backend.server.event.kafka.topic | <undefined> | The topic of server events go for the built-in Kafka logger | string | 1.8.0 |
+| kyuubi.backend.server.event.loggers || A comma-separated list of server history loggers, where session/operation etc events go.<ul><li>JSON: the events will be written to the location of kyuubi.backend.server.event.json.log.path</li><li>KAFKA: the events will be serialized in JSON format and sent to topic of `kyuubi.backend.server.event.kafka.topic`. Note: For the configs of Kafka producer, please specify them with the prefix: `kyuubi.backend.server.event.kafka.`. For example, `kyuubi.backend.server.event.kafka.bootstrap.servers=127.0.0.1:9092`</li><li>JDBC: to be done</li><li>CUSTOM: User-defined event handlers.</li></ul> Note that: Kyuubi supports custom event handlers with the Java SPI. To register a custom event handler, the user needs to implement a class which is a child of org.apache.kyuubi.events.handler.CustomEventHandlerProvider which has a zero-arg constructor. | seq | 1.4.0 |
+| kyuubi.backend.server.exec.pool.keepalive.time | PT1M | Time(ms) that an idle async thread of the operation execution thread pool will wait for a new task to arrive before terminating in Kyuubi server | duration | 1.0.0 |
+| kyuubi.backend.server.exec.pool.shutdown.timeout | PT10S | Timeout(ms) for the operation execution thread pool to terminate in Kyuubi server | duration | 1.0.0 |
+| kyuubi.backend.server.exec.pool.size | 100 | Number of threads in the operation execution thread pool of Kyuubi server | int | 1.0.0 |
+| kyuubi.backend.server.exec.pool.wait.queue.size | 100 | Size of the wait queue for the operation execution thread pool of Kyuubi server | int | 1.0.0 |
### Batch
@@ -168,7 +79,7 @@ You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.co
|---------------------------------------------|---------|---------|----------|-------|
| kyuubi.batch.application.check.interval | PT5S | The interval to check batch job application information. | duration | 1.6.0 |
| kyuubi.batch.application.starvation.timeout | PT3M | Threshold above which to warn batch application may be starved. | duration | 1.7.0 |
-| kyuubi.batch.conf.ignore.list || A comma-separated list of ignored keys for batch conf. If the batch conf contains any of them, the key and the corresponding value will be removed silently during batch job submission. Note that this rule is for server-side protection defined via administrators to prevent some essential configs from tampering. You can also pre-define some config for batch job submission with the prefix: kyuubi.batchConf.[batchType]. For example, you can pre-define `spark.master` for the Spark batch job with key `kyuubi.batchConf.spark.spark.master`. | seq | 1.6.0 |
+| kyuubi.batch.conf.ignore.list || A comma-separated list of ignored keys for batch conf. If the batch conf contains any of them, the key and the corresponding value will be removed silently during batch job submission. Note that this rule is for server-side protection defined via administrators to prevent some essential configs from tampering. You can also pre-define some config for batch job submission with the prefix: kyuubi.batchConf.[batchType]. For example, you can pre-define `spark.master` for the Spark batch job with key `kyuubi.batchConf.spark.spark.master`. | set | 1.6.0 |
| kyuubi.batch.session.idle.timeout | PT6H | Batch session idle timeout, it will be closed when it's not accessed for this duration | duration | 1.6.2 |
### Credentials
@@ -209,59 +120,82 @@ You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.co
### Engine
-| Key | Default | Meaning | Type | Since |
-|----------------------------------------------------------|---------------------------|---------|----------|-------|
-| kyuubi.engine.connection.url.use.hostname | true | (deprecated) When true, the engine registers with hostname to zookeeper. When Spark runs on K8s with cluster mode, set to false to ensure that server can connect to engine | boolean | 1.3.0 |
-| kyuubi.engine.deregister.exception.classes || A comma-separated list of exception classes. If there is any exception thrown, whose class matches the specified classes, the engine would deregister itself. | seq | 1.2.0 |
-| kyuubi.engine.deregister.exception.messages || A comma-separated list of exception messages. If there is any exception thrown, whose message or stacktrace matches the specified message list, the engine would deregister itself. | seq | 1.2.0 |
-| kyuubi.engine.deregister.exception.ttl | PT30M | Time to live(TTL) for exceptions pattern specified in kyuubi.engine.deregister.exception.classes and kyuubi.engine.deregister.exception.messages to deregister engines. Once the total error count hits the kyuubi.engine.deregister.job.max.failures within the TTL, an engine will deregister itself and wait for self-terminated. Otherwise, we suppose that the engine has recovered from temporary failures. | duration | 1.2.0 |
-| kyuubi.engine.deregister.job.max.failures | 4 | Number of failures of job before deregistering the engine. | int | 1.2.0 |
-| kyuubi.engine.event.json.log.path | file:///tmp/kyuubi/events | The location where all the engine events go for the built-in JSON logger.<ul><li>Local Path: start with 'file://'</li><li>HDFS Path: start with 'hdfs://'</li></ul> | string | 1.3.0 |
-| kyuubi.engine.event.loggers | SPARK | A comma-separated list of engine history loggers, where engine/session/operation etc events go.<ul><li>SPARK: the events will be written to the Spark listener bus.</li><li>JSON: the events will be written to the location of kyuubi.engine.event.json.log.path</li><li>JDBC: to be done</li><li>CUSTOM: User-defined event handlers.</li></ul> Note that: Kyuubi supports custom event handlers with the Java SPI. To register a custom event handler, the user needs to implement a subclass of `org.apache.kyuubi.events.handler.CustomEventHandlerProvider` which has a zero-arg constructor. | seq | 1.3.0 |
-| kyuubi.engine.flink.extra.classpath | <undefined> | The extra classpath for the Flink SQL engine, for configuring the location of hadoop client jars, etc | string | 1.6.0 |
-| kyuubi.engine.flink.java.options | <undefined> | The extra Java options for the Flink SQL engine | string | 1.6.0 |
-| kyuubi.engine.flink.memory | 1g | The heap memory for the Flink SQL engine | string | 1.6.0 |
-| kyuubi.engine.hive.event.loggers | JSON | A comma-separated list of engine history loggers, where engine/session/operation etc events go.<ul><li>JSON: the events will be written to the location of kyuubi.engine.event.json.log.path</li><li>JDBC: to be done</li><li>CUSTOM: to be done.</li></ul> | seq | 1.7.0 |
-| kyuubi.engine.hive.extra.classpath | <undefined> | The extra classpath for the Hive query engine, for configuring location of the hadoop client jars and etc. | string | 1.6.0 |
-| kyuubi.engine.hive.java.options | <undefined> | The extra Java options for the Hive query engine | string | 1.6.0 |
-| kyuubi.engine.hive.memory | 1g | The heap memory for the Hive query engine | string | 1.6.0 |
-| kyuubi.engine.initialize.sql | SHOW DATABASES | SemiColon-separated list of SQL statements to be initialized in the newly created engine before queries. i.e. use `SHOW DATABASES` to eagerly active HiveClient. This configuration can not be used in JDBC url due to the limitation of Beeline/JDBC driver. | seq | 1.2.0 |
-| kyuubi.engine.jdbc.connection.password | <undefined> | The password is used for connecting to server | string | 1.6.0 |
-| kyuubi.engine.jdbc.connection.properties || The additional properties are used for connecting to server | seq | 1.6.0 |
-| kyuubi.engine.jdbc.connection.provider | <undefined> | The connection provider is used for getting a connection from the server | string | 1.6.0 |
-| kyuubi.engine.jdbc.connection.url | <undefined> | The server url that engine will connect to | string | 1.6.0 |
-| kyuubi.engine.jdbc.connection.user | <undefined> | The user is used for connecting to server | string | 1.6.0 |
-| kyuubi.engine.jdbc.driver.class | <undefined> | The driver class for JDBC engine connection | string | 1.6.0 |
-| kyuubi.engine.jdbc.extra.classpath | <undefined> | The extra classpath for the JDBC query engine, for configuring the location of the JDBC driver and etc. | string | 1.6.0 |
-| kyuubi.engine.jdbc.java.options | <undefined> | The extra Java options for the JDBC query engine | string | 1.6.0 |
-| kyuubi.engine.jdbc.memory | 1g | The heap memory for the JDBC query engine | string | 1.6.0 |
-| kyuubi.engine.jdbc.type | <undefined> | The short name of JDBC type | string | 1.6.0 |
-| kyuubi.engine.operation.convert.catalog.database.enabled | true | When set to true, The engine converts the JDBC methods of set/get Catalog and set/get Schema to the implementation of different engines | boolean | 1.6.0 |
-| kyuubi.engine.operation.log.dir.root | engine_operation_logs | Root directory for query operation log at engine-side. | string | 1.4.0 |
-| kyuubi.engine.pool.name | engine-pool | The name of the engine pool. | string | 1.5.0 |
-| kyuubi.engine.pool.selectPolicy | RANDOM | The select policy of an engine from the corresponding engine pool engine for a session.<ul><li>RANDOM - Randomly use the engine in the pool</li><li>POLLING - Polling use the engine in the pool</li></ul> | string | 1.7.0 |
-| kyuubi.engine.pool.size | -1 | The size of the engine pool. Note that, if the size is less than 1, the engine pool will not be enabled; otherwise, the size of the engine pool will be min(this, kyuubi.engine.pool.size.threshold). | int | 1.4.0 |
-| kyuubi.engine.pool.size.threshold | 9 | This parameter is introduced as a server-side parameter controlling the upper limit of the engine pool. | int | 1.4.0 |
-| kyuubi.engine.session.initialize.sql || SemiColon-separated list of SQL statements to be initialized in the newly created engine session before queries. This configuration can not be used in JDBC url due to the limitation of Beeline/JDBC driver. | seq | 1.3.0 |
-| kyuubi.engine.share.level | USER | Engines will be shared in different levels, available configs are:<ul><li>CONNECTION: engine will not be shared but only used by the current client connection</li><li>USER: engine will be shared by all sessions created by a unique username, see also kyuubi.engine.share.level.subdomain</li><li>GROUP: the engine will be shared by all sessions created by all users belong to the same primary group name. The engine will be launched by the group name as the effective username, so here the group name is in value of special user who is able to visit the computing resources/data of the team. It follows the [Hadoop GroupsMapping](https://reurl.cc/xE61Y5) to map user to a primary group. If the primary group is not found, it fallback to the USER level.</li><li>SERVER: the App will be shared by Kyuubi servers</li></ul> | string | 1.2.0 |
-| kyuubi.engine.share.level.sub.domain | <undefined> | (deprecated) - Using kyuubi.engine.share.level.subdomain instead | string | 1.2.0 |
-| kyuubi.engine.share.level.subdomain | <undefined> | Allow end-users to create a subdomain for the share level of an engine. A subdomain is a case-insensitive string values that must be a valid zookeeper subpath. For example, for the `USER` share level, an end-user can share a certain engine within a subdomain, not for all of its clients. End-users are free to create multiple engines in the `USER` share level. When disable engine pool, use 'default' if absent. | string | 1.4.0 |
-| kyuubi.engine.single.spark.session | false | When set to true, this engine is running in a single session mode. All the JDBC/ODBC connections share the temporary views, function registries, SQL configuration and the current database. | boolean | 1.3.0 |
-| kyuubi.engine.spark.event.loggers | SPARK | A comma-separated list of engine loggers, where engine/session/operation etc events go.<ul><li>SPARK: the events will be written to the Spark listener bus.</li><li>JSON: the events will be written to the location of kyuubi.engine.event.json.log.path</li><li>JDBC: to be done</li><li>CUSTOM: to be done.</li></ul> | seq | 1.7.0 |
-| kyuubi.engine.spark.python.env.archive | <undefined> | Portable Python env archive used for Spark engine Python language mode. | string | 1.7.0 |
-| kyuubi.engine.spark.python.env.archive.exec.path | bin/python | The Python exec path under the Python env archive. | string | 1.7.0 |
-| kyuubi.engine.spark.python.home.archive | <undefined> | Spark archive containing $SPARK_HOME/python directory, which is used to init session Python worker for Python language mode. | string | 1.7.0 |
-| kyuubi.engine.trino.event.loggers | JSON | A comma-separated list of engine history loggers, where engine/session/operation etc events go.<ul><li>JSON: the events will be written to the location of kyuubi.engine.event.json.log.path</li><li>JDBC: to be done</li><li>CUSTOM: to be done.</li></ul> | seq | 1.7.0 |
-| kyuubi.engine.trino.extra.classpath | <undefined> | The extra classpath for the Trino query engine, for configuring other libs which may need by the Trino engine | string | 1.6.0 |
-| kyuubi.engine.trino.java.options | <undefined> | The extra Java options for the Trino query engine | string | 1.6.0 |
-| kyuubi.engine.trino.memory | 1g | The heap memory for the Trino query engine | string | 1.6.0 |
-| kyuubi.engine.type | SPARK_SQL | Specify the detailed engine supported by Kyuubi. The engine type bindings to SESSION scope. This configuration is experimental. Currently, available configs are:<ul><li>SPARK_SQL: specify this engine type will launch a Spark engine which can provide all the capacity of the Apache Spark. Note, it's a default engine type.</li><li>FLINK_SQL: specify this engine type will launch a Flink engine which can provide all the capacity of the Apache Flink.</li><li>TRINO: specify this engine type will launch a Trino engine which can provide all the capacity of the Trino.</li><li>HIVE_SQL: specify this engine type will launch a Hive engine which can provide all the capacity of the Hive Server2.</li><li>JDBC: specify this engine type will launch a JDBC engine which can provide a MySQL protocol connector, for now we only support Doris dialect.</li></ul> | string | 1.4.0 |
-| kyuubi.engine.ui.retainedSessions | 200 | The number of SQL client sessions kept in the Kyuubi Query Engine web UI. | int | 1.4.0 |
-| kyuubi.engine.ui.retainedStatements | 200 | The number of statements kept in the Kyuubi Query Engine web UI. | int | 1.4.0 |
-| kyuubi.engine.ui.stop.enabled | true | When true, allows Kyuubi engine to be killed from the Spark Web UI. | boolean | 1.3.0 |
-| kyuubi.engine.user.isolated.spark.session | true | When set to false, if the engine is running in a group or server share level, all the JDBC/ODBC connections will be isolated against the user. Including the temporary views, function registries, SQL configuration, and the current database. Note that, it does not affect if the share level is connection or user. | boolean | 1.6.0 |
-| kyuubi.engine.user.isolated.spark.session.idle.interval | PT1M | The interval to check if the user-isolated Spark session is timeout. | duration | 1.6.0 |
-| kyuubi.engine.user.isolated.spark.session.idle.timeout | PT6H | If kyuubi.engine.user.isolated.spark.session is false, we will release the Spark session if its corresponding user is inactive after this configured timeout. | duration | 1.6.0 |
+| Key | Default | Meaning | Type | Since |
+|----------------------------------------------------------|---------------------------|---------|----------|-------|
+| kyuubi.engine.chat.extra.classpath | <undefined> | The extra classpath for the Chat engine, for configuring the location of the SDK, etc. | string | 1.8.0 |
+| kyuubi.engine.chat.gpt.apiKey | <undefined> | The key to access the OpenAI open API, which can be obtained at https://platform.openai.com/account/api-keys | string | 1.8.0 |
+| kyuubi.engine.chat.gpt.http.connect.timeout | PT2M | The timeout[ms] for establishing the connection with the Chat GPT server. A timeout value of zero is interpreted as an infinite timeout. | duration | 1.8.0 |
+| kyuubi.engine.chat.gpt.http.proxy | <undefined> | HTTP proxy url for API calling in Chat GPT engine. e.g. http://127.0.0.1:1087 | string | 1.8.0 |
+| kyuubi.engine.chat.gpt.http.socket.timeout | PT2M | The timeout[ms] for waiting for data packets after Chat GPT server connection is established. A timeout value of zero is interpreted as an infinite timeout. | duration | 1.8.0 |
+| kyuubi.engine.chat.gpt.model | gpt-3.5-turbo | ID of the model used in ChatGPT. For available models, refer to OpenAI's [Model overview](https://platform.openai.com/docs/models/overview). | string | 1.8.0 |
+| kyuubi.engine.chat.java.options | <undefined> | The extra Java options for the Chat engine | string | 1.8.0 |
+| kyuubi.engine.chat.memory | 1g | The heap memory for the Chat engine | string | 1.8.0 |
+| kyuubi.engine.chat.provider | ECHO | The provider for the Chat engine. Candidates:<ul><li>ECHO: simply replies with a welcome message.</li><li>GPT: a.k.a. ChatGPT, powered by OpenAI.</li></ul> | string | 1.8.0 |
+| kyuubi.engine.connection.url.use.hostname | true | (deprecated) When true, the engine registers with hostname to zookeeper. When Spark runs on K8s with cluster mode, set to false to ensure that server can connect to engine | boolean | 1.3.0 |
+| kyuubi.engine.deregister.exception.classes || A comma-separated list of exception classes. If there is any exception thrown, whose class matches the specified classes, the engine would deregister itself. | set | 1.2.0 |
+| kyuubi.engine.deregister.exception.messages || A comma-separated list of exception messages. If there is any exception thrown, whose message or stacktrace matches the specified message list, the engine would deregister itself. | set | 1.2.0 |
+| kyuubi.engine.deregister.exception.ttl | PT30M | Time to live (TTL) for the exception patterns specified in kyuubi.engine.deregister.exception.classes and kyuubi.engine.deregister.exception.messages to deregister engines. Once the total error count hits kyuubi.engine.deregister.job.max.failures within the TTL, an engine will deregister itself and wait for self-termination. Otherwise, we suppose that the engine has recovered from temporary failures. | duration | 1.2.0 |
+| kyuubi.engine.deregister.job.max.failures | 4 | Number of failures of job before deregistering the engine. | int | 1.2.0 |
+| kyuubi.engine.event.json.log.path | file:///tmp/kyuubi/events | The location where all the engine events go for the built-in JSON logger.<ul><li>Local Path: start with 'file://'</li><li>HDFS Path: start with 'hdfs://'</li></ul> | string | 1.3.0 |
+| kyuubi.engine.event.loggers | SPARK | A comma-separated list of engine history loggers, where engine/session/operation etc events go.<ul><li>SPARK: the events will be written to the Spark listener bus.</li><li>JSON: the events will be written to the location of kyuubi.engine.event.json.log.path</li><li>JDBC: to be done</li><li>CUSTOM: User-defined event handlers.</li></ul> Note that: Kyuubi supports custom event handlers with the Java SPI. To register a custom event handler, the user needs to implement a subclass of `org.apache.kyuubi.events.handler.CustomEventHandlerProvider` which has a zero-arg constructor. | seq | 1.3.0 |
+| kyuubi.engine.flink.application.jars | <undefined> | A comma-separated list of the local jars to be shipped with the job to the cluster. For example, SQL UDF jars. Only effective in yarn application mode. | string | 1.8.0 |
+| kyuubi.engine.flink.extra.classpath | <undefined> | The extra classpath for the Flink SQL engine, for configuring the location of hadoop client jars, etc. Only effective in yarn session mode. | string | 1.6.0 |
+| kyuubi.engine.flink.java.options | <undefined> | The extra Java options for the Flink SQL engine. Only effective in yarn session mode. | string | 1.6.0 |
+| kyuubi.engine.flink.memory | 1g | The heap memory for the Flink SQL engine. Only effective in yarn session mode. | string | 1.6.0 |
+| kyuubi.engine.hive.event.loggers | JSON | A comma-separated list of engine history loggers, where engine/session/operation etc events go.<ul><li>JSON: the events will be written to the location of kyuubi.engine.event.json.log.path</li><li>JDBC: to be done</li><li>CUSTOM: to be done.</li></ul> | seq | 1.7.0 |
+| kyuubi.engine.hive.extra.classpath | <undefined> | The extra classpath for the Hive query engine, for configuring the location of the hadoop client jars, etc. | string | 1.6.0 |
+| kyuubi.engine.hive.java.options | <undefined> | The extra Java options for the Hive query engine | string | 1.6.0 |
+| kyuubi.engine.hive.memory | 1g | The heap memory for the Hive query engine | string | 1.6.0 |
+| kyuubi.engine.initialize.sql | SHOW DATABASES | Semicolon-separated list of SQL statements to be initialized in the newly created engine before queries, e.g. use `SHOW DATABASES` to eagerly activate HiveClient. This configuration cannot be used in the JDBC URL due to the limitation of the Beeline/JDBC driver. | seq | 1.2.0 |
+| kyuubi.engine.jdbc.connection.password | <undefined> | The password used for connecting to the server | string | 1.6.0 |
+| kyuubi.engine.jdbc.connection.propagateCredential | false | Whether to use the session's user and password to connect to the database | boolean | 1.8.0 |
+| kyuubi.engine.jdbc.connection.properties || The additional properties used for connecting to the server | seq | 1.6.0 |
+| kyuubi.engine.jdbc.connection.provider | <undefined> | The connection provider used for getting a connection from the server | string | 1.6.0 |
+| kyuubi.engine.jdbc.connection.url | <undefined> | The server URL that the engine will connect to | string | 1.6.0 |
+| kyuubi.engine.jdbc.connection.user | <undefined> | The user used for connecting to the server | string | 1.6.0 |
+| kyuubi.engine.jdbc.driver.class | <undefined> | The driver class for JDBC engine connection | string | 1.6.0 |
+| kyuubi.engine.jdbc.extra.classpath | <undefined> | The extra classpath for the JDBC query engine, for configuring the location of the JDBC driver, etc. | string | 1.6.0 |
+| kyuubi.engine.jdbc.initialize.sql | SELECT 1 | Semicolon-separated list of SQL statements to be initialized in the newly created engine before queries, e.g. use `SELECT 1` to eagerly activate JDBCClient. | seq | 1.8.0 |
+| kyuubi.engine.jdbc.java.options | <undefined> | The extra Java options for the JDBC query engine | string | 1.6.0 |
+| kyuubi.engine.jdbc.memory | 1g | The heap memory for the JDBC query engine | string | 1.6.0 |
+| kyuubi.engine.jdbc.session.initialize.sql || Semicolon-separated list of SQL statements to be initialized in the newly created engine session before queries. | seq | 1.8.0 |
+| kyuubi.engine.jdbc.type | <undefined> | The short name of JDBC type | string | 1.6.0 |
+| kyuubi.engine.kubernetes.submit.timeout | PT30S | The engine submit timeout for Kubernetes application. | duration | 1.7.2 |
+| kyuubi.engine.operation.convert.catalog.database.enabled | true | When set to true, the engine converts the JDBC methods of set/get Catalog and set/get Schema to the implementation of different engines | boolean | 1.6.0 |
+| kyuubi.engine.operation.log.dir.root | engine_operation_logs | Root directory for query operation log at engine-side. | string | 1.4.0 |
+| kyuubi.engine.pool.name | engine-pool | The name of the engine pool. | string | 1.5.0 |
+| kyuubi.engine.pool.selectPolicy | RANDOM | The policy for selecting an engine from the corresponding engine pool for a session.<ul><li>RANDOM - Randomly use an engine in the pool</li><li>POLLING - Use the engines in the pool by polling</li></ul> | string | 1.7.0 |
+| kyuubi.engine.pool.size | -1 | The size of the engine pool. Note that, if the size is less than 1, the engine pool will not be enabled; otherwise, the size of the engine pool will be min(this, kyuubi.engine.pool.size.threshold). | int | 1.4.0 |
+| kyuubi.engine.pool.size.threshold | 9 | This parameter is introduced as a server-side parameter controlling the upper limit of the engine pool. | int | 1.4.0 |
+| kyuubi.engine.session.initialize.sql || Semicolon-separated list of SQL statements to be initialized in the newly created engine session before queries. This configuration cannot be used in the JDBC URL due to the limitation of the Beeline/JDBC driver. | seq | 1.3.0 |
+| kyuubi.engine.share.level | USER | Engines will be shared at different levels, available configs are:<ul><li>CONNECTION: the engine will not be shared but only used by the current client connection</li><li>USER: the engine will be shared by all sessions created by a unique username, see also kyuubi.engine.share.level.subdomain</li><li>GROUP: the engine will be shared by all sessions created by all users belonging to the same primary group name. The engine will be launched by the group name as the effective username, so here the group name acts as a special user who is able to visit the computing resources/data of the team. It follows the [Hadoop GroupsMapping](https://reurl.cc/xE61Y5) to map a user to a primary group. If the primary group is not found, it falls back to the USER level.</li><li>SERVER: the App will be shared by Kyuubi servers</li></ul> | string | 1.2.0 |
+| kyuubi.engine.share.level.sub.domain | <undefined> | (deprecated) - Use kyuubi.engine.share.level.subdomain instead | string | 1.2.0 |
+| kyuubi.engine.share.level.subdomain | <undefined> | Allow end-users to create a subdomain for the share level of an engine. A subdomain is a case-insensitive string value that must be a valid zookeeper subpath. For example, for the `USER` share level, an end-user can share a certain engine within a subdomain, not for all of its clients. End-users are free to create multiple engines in the `USER` share level. When the engine pool is disabled, 'default' is used if absent. | string | 1.4.0 |
+| kyuubi.engine.single.spark.session | false | When set to true, this engine is running in a single session mode. All the JDBC/ODBC connections share the temporary views, function registries, SQL configuration and the current database. | boolean | 1.3.0 |
+| kyuubi.engine.spark.event.loggers | SPARK | A comma-separated list of engine loggers, where engine/session/operation etc events go.<ul><li>SPARK: the events will be written to the Spark listener bus.</li><li>JSON: the events will be written to the location of kyuubi.engine.event.json.log.path</li><li>JDBC: to be done</li><li>CUSTOM: to be done.</li></ul> | seq | 1.7.0 |
+| kyuubi.engine.spark.python.env.archive | <undefined> | Portable Python env archive used for Spark engine Python language mode. | string | 1.7.0 |
+| kyuubi.engine.spark.python.env.archive.exec.path | bin/python | The Python exec path under the Python env archive. | string | 1.7.0 |
+| kyuubi.engine.spark.python.home.archive | <undefined> | Spark archive containing $SPARK_HOME/python directory, which is used to init session Python worker for Python language mode. | string | 1.7.0 |
+| kyuubi.engine.submit.timeout | PT30S | Period to tolerate the Driver Pod being ephemerally invisible after submitting. In some Resource Managers, e.g. K8s, the Driver Pod is not visible immediately after `spark-submit` returns. | duration | 1.7.1 |
+| kyuubi.engine.trino.connection.keystore.password | <undefined> | The keystore password used for connecting to trino cluster | string | 1.8.0 |
+| kyuubi.engine.trino.connection.keystore.path | <undefined> | The keystore path used for connecting to trino cluster | string | 1.8.0 |
+| kyuubi.engine.trino.connection.keystore.type | <undefined> | The keystore type used for connecting to trino cluster | string | 1.8.0 |
+| kyuubi.engine.trino.connection.password | <undefined> | The password used for connecting to trino cluster | string | 1.8.0 |
+| kyuubi.engine.trino.connection.truststore.password | <undefined> | The truststore password used for connecting to trino cluster | string | 1.8.0 |
+| kyuubi.engine.trino.connection.truststore.path | <undefined> | The truststore path used for connecting to trino cluster | string | 1.8.0 |
+| kyuubi.engine.trino.connection.truststore.type | <undefined> | The truststore type used for connecting to trino cluster | string | 1.8.0 |
+| kyuubi.engine.trino.event.loggers | JSON | A comma-separated list of engine history loggers, where engine/session/operation etc events go.<ul><li>JSON: the events will be written to the location of kyuubi.engine.event.json.log.path</li><li>JDBC: to be done</li><li>CUSTOM: to be done.</li></ul> | seq | 1.7.0 |
+| kyuubi.engine.trino.extra.classpath | <undefined> | The extra classpath for the Trino query engine, for configuring other libs which may be needed by the Trino engine | string | 1.6.0 |
+| kyuubi.engine.trino.java.options | <undefined> | The extra Java options for the Trino query engine | string | 1.6.0 |
+| kyuubi.engine.trino.memory | 1g | The heap memory for the Trino query engine | string | 1.6.0 |
+| kyuubi.engine.type | SPARK_SQL | Specify the detailed engine supported by Kyuubi. The engine type binds to the SESSION scope. This configuration is experimental. Currently, available configs are:<ul><li>SPARK_SQL: specifying this engine type will launch a Spark engine which can provide all the capabilities of Apache Spark. Note that it is the default engine type.</li><li>FLINK_SQL: specifying this engine type will launch a Flink engine which can provide all the capabilities of Apache Flink.</li><li>TRINO: specifying this engine type will launch a Trino engine which can provide all the capabilities of Trino.</li><li>HIVE_SQL: specifying this engine type will launch a Hive engine which can provide all the capabilities of Hive Server2.</li><li>JDBC: specifying this engine type will launch a JDBC engine which can forward queries to the database system through a certain JDBC driver; for now, it supports Doris and Phoenix.</li><li>CHAT: specifying this engine type will launch a Chat engine.</li></ul> | string | 1.4.0 |
+| kyuubi.engine.ui.retainedSessions | 200 | The number of SQL client sessions kept in the Kyuubi Query Engine web UI. | int | 1.4.0 |
+| kyuubi.engine.ui.retainedStatements | 200 | The number of statements kept in the Kyuubi Query Engine web UI. | int | 1.4.0 |
+| kyuubi.engine.ui.stop.enabled | true | When true, allows Kyuubi engine to be killed from the Spark Web UI. | boolean | 1.3.0 |
+| kyuubi.engine.user.isolated.spark.session | true | When set to false, if the engine is running in a group or server share level, all the JDBC/ODBC connections will be isolated against the user. Including the temporary views, function registries, SQL configuration, and the current database. Note that it has no effect if the share level is connection or user. | boolean | 1.6.0 |
+| kyuubi.engine.user.isolated.spark.session.idle.interval | PT1M | The interval to check if the user-isolated Spark session has timed out. | duration | 1.6.0 |
+| kyuubi.engine.user.isolated.spark.session.idle.timeout | PT6H | If kyuubi.engine.user.isolated.spark.session is false, we will release the Spark session if its corresponding user is inactive after this configured timeout. | duration | 1.6.0 |
+| kyuubi.engine.yarn.submit.timeout | PT30S | The engine submit timeout for YARN application. | duration | 1.7.2 |
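
The interplay between `kyuubi.engine.pool.size` and `kyuubi.engine.pool.size.threshold` described in the table above can be sketched as follows. This is a minimal illustration of the documented rule only; the function name `effective_pool_size` is hypothetical and is not part of Kyuubi.

```python
def effective_pool_size(pool_size, threshold=9):
    """Return the engine pool size actually used, or None when the pool is disabled.

    Mirrors the documented rule: a size below 1 disables the pool; otherwise
    the size is capped at the server-side threshold (documented default: 9).
    """
    if pool_size < 1:
        return None  # engine pool not enabled
    return min(pool_size, threshold)


if __name__ == "__main__":
    for requested in (-1, 5, 20):
        print(requested, effective_pool_size(requested))
```

Under the documented defaults, the default size of `-1` leaves the pool disabled, and a requested size of 20 is capped at the threshold of 9.
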
### Event
@@ -273,94 +207,96 @@ You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.co
### Frontend
-| Key | Default | Meaning | Type | Since |
-|--------------------------------------------------------|-------------------|---------|----------|-------|
-| kyuubi.frontend.backoff.slot.length | PT0.1S | (deprecated) Time to back off during login to the thrift frontend service. | duration | 1.0.0 |
-| kyuubi.frontend.bind.host | <undefined> | Hostname or IP of the machine on which to run the frontend services. | string | 1.0.0 |
-| kyuubi.frontend.bind.port | 10009 | (deprecated) Port of the machine on which to run the thrift frontend service via the binary protocol. | int | 1.0.0 |
-| kyuubi.frontend.connection.url.use.hostname | true | When true, frontend services prefer hostname, otherwise, ip address. Note that, the default value is set to `false` when engine running on Kubernetes to prevent potential network issues. | boolean | 1.5.0 |
-| kyuubi.frontend.login.timeout | PT20S | (deprecated) Timeout for Thrift clients during login to the thrift frontend service. | duration | 1.0.0 |
-| kyuubi.frontend.max.message.size | 104857600 | (deprecated) Maximum message size in bytes a Kyuubi server will accept. | int | 1.0.0 |
-| kyuubi.frontend.max.worker.threads | 999 | (deprecated) Maximum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.0.0 |
-| kyuubi.frontend.min.worker.threads | 9 | (deprecated) Minimum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.0.0 |
-| kyuubi.frontend.mysql.bind.host | <undefined> | Hostname or IP of the machine on which to run the MySQL frontend service. | string | 1.4.0 |
-| kyuubi.frontend.mysql.bind.port | 3309 | Port of the machine on which to run the MySQL frontend service. | int | 1.4.0 |
-| kyuubi.frontend.mysql.max.worker.threads | 999 | Maximum number of threads in the command execution thread pool for the MySQL frontend service | int | 1.4.0 |
-| kyuubi.frontend.mysql.min.worker.threads | 9 | Minimum number of threads in the command execution thread pool for the MySQL frontend service | int | 1.4.0 |
-| kyuubi.frontend.mysql.netty.worker.threads | <undefined> | Number of thread in the netty worker event loop of MySQL frontend service. Use min(cpu_cores, 8) in default. | int | 1.4.0 |
-| kyuubi.frontend.mysql.worker.keepalive.time | PT1M | Time(ms) that an idle async thread of the command execution thread pool will wait for a new task to arrive before terminating in MySQL frontend service | duration | 1.4.0 |
-| kyuubi.frontend.protocols | THRIFT_BINARY | A comma-separated list for all frontend protocols | seq | 1.4.0 |
-| kyuubi.frontend.proxy.http.client.ip.header | X-Real-IP | The HTTP header to record the real client IP address. If your server is behind a load balancer or other proxy, the server will see this load balancer or proxy IP address as the client IP address, to get around this common issue, most load balancers or proxies offer the ability to record the real remote IP address in an HTTP header that will be added to the request for other devices to use. Note that, because the header value can be specified to any IP address, so it will not be used for authentication. | string | 1.6.0 |
-| kyuubi.frontend.rest.bind.host | <undefined> | Hostname or IP of the machine on which to run the REST frontend service. | string | 1.4.0 |
-| kyuubi.frontend.rest.bind.port | 10099 | Port of the machine on which to run the REST frontend service. | int | 1.4.0 |
-| kyuubi.frontend.rest.max.worker.threads | 999 | Maximum number of threads in the frontend worker thread pool for the rest frontend service | int | 1.6.2 |
-| kyuubi.frontend.ssl.keystore.algorithm | <undefined> | SSL certificate keystore algorithm. | string | 1.7.0 |
-| kyuubi.frontend.ssl.keystore.password | <undefined> | SSL certificate keystore password. | string | 1.7.0 |
-| kyuubi.frontend.ssl.keystore.path | <undefined> | SSL certificate keystore location. | string | 1.7.0 |
-| kyuubi.frontend.ssl.keystore.type | <undefined> | SSL certificate keystore type. | string | 1.7.0 |
-| kyuubi.frontend.thrift.backoff.slot.length | PT0.1S | Time to back off during login to the thrift frontend service. | duration | 1.4.0 |
-| kyuubi.frontend.thrift.binary.bind.host | <undefined> | Hostname or IP of the machine on which to run the thrift frontend service via the binary protocol. | string | 1.4.0 |
-| kyuubi.frontend.thrift.binary.bind.port | 10009 | Port of the machine on which to run the thrift frontend service via the binary protocol. | int | 1.4.0 |
-| kyuubi.frontend.thrift.binary.ssl.disallowed.protocols | SSLv2,SSLv3 | SSL versions to disallow for Kyuubi thrift binary frontend. | seq | 1.7.0 |
-| kyuubi.frontend.thrift.binary.ssl.enabled | false | Set this to true for using SSL encryption in thrift binary frontend server. | boolean | 1.7.0 |
-| kyuubi.frontend.thrift.binary.ssl.include.ciphersuites || A comma-separated list of include SSL cipher suite names for thrift binary frontend. | seq | 1.7.0 |
-| kyuubi.frontend.thrift.http.allow.user.substitution | true | Allow alternate user to be specified as part of open connection request when using HTTP transport mode. | boolean | 1.6.0 |
-| kyuubi.frontend.thrift.http.bind.host | <undefined> | Hostname or IP of the machine on which to run the thrift frontend service via http protocol. | string | 1.6.0 |
-| kyuubi.frontend.thrift.http.bind.port | 10010 | Port of the machine on which to run the thrift frontend service via http protocol. | int | 1.6.0 |
-| kyuubi.frontend.thrift.http.compression.enabled | true | Enable thrift http compression via Jetty compression support | boolean | 1.6.0 |
-| kyuubi.frontend.thrift.http.cookie.auth.enabled | true | When true, Kyuubi in HTTP transport mode, will use cookie-based authentication mechanism | boolean | 1.6.0 |
-| kyuubi.frontend.thrift.http.cookie.domain | <undefined> | Domain for the Kyuubi generated cookies | string | 1.6.0 |
-| kyuubi.frontend.thrift.http.cookie.is.httponly | true | HttpOnly attribute of the Kyuubi generated cookie. | boolean | 1.6.0 |
-| kyuubi.frontend.thrift.http.cookie.max.age | 86400 | Maximum age in seconds for server side cookie used by Kyuubi in HTTP mode. | int | 1.6.0 |
-| kyuubi.frontend.thrift.http.cookie.path | <undefined> | Path for the Kyuubi generated cookies | string | 1.6.0 |
-| kyuubi.frontend.thrift.http.max.idle.time | PT30M | Maximum idle time for a connection on the server when in HTTP mode. | duration | 1.6.0 |
-| kyuubi.frontend.thrift.http.path | cliservice | Path component of URL endpoint when in HTTP mode. | string | 1.6.0 |
-| kyuubi.frontend.thrift.http.request.header.size | 6144 | Request header size in bytes, when using HTTP transport mode. Jetty defaults used. | int | 1.6.0 |
-| kyuubi.frontend.thrift.http.response.header.size | 6144 | Response header size in bytes, when using HTTP transport mode. Jetty defaults used. | int | 1.6.0 |
-| kyuubi.frontend.thrift.http.ssl.exclude.ciphersuites || A comma-separated list of exclude SSL cipher suite names for thrift http frontend. | seq | 1.7.0 |
-| kyuubi.frontend.thrift.http.ssl.keystore.password | <undefined> | SSL certificate keystore password. | string | 1.6.0 |
-| kyuubi.frontend.thrift.http.ssl.keystore.path | <undefined> | SSL certificate keystore location. | string | 1.6.0 |
-| kyuubi.frontend.thrift.http.ssl.protocol.blacklist | SSLv2,SSLv3 | SSL Versions to disable when using HTTP transport mode. | seq | 1.6.0 |
-| kyuubi.frontend.thrift.http.use.SSL | false | Set this to true for using SSL encryption in http mode. | boolean | 1.6.0 |
-| kyuubi.frontend.thrift.http.xsrf.filter.enabled | false | If enabled, Kyuubi will block any requests made to it over HTTP if an X-XSRF-HEADER header is not present | boolean | 1.6.0 |
-| kyuubi.frontend.thrift.login.timeout | PT20S | Timeout for Thrift clients during login to the thrift frontend service. | duration | 1.4.0 |
-| kyuubi.frontend.thrift.max.message.size | 104857600 | Maximum message size in bytes a Kyuubi server will accept. | int | 1.4.0 |
-| kyuubi.frontend.thrift.max.worker.threads | 999 | Maximum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.4.0 |
-| kyuubi.frontend.thrift.min.worker.threads | 9 | Minimum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.4.0 |
-| kyuubi.frontend.thrift.worker.keepalive.time | PT1M | Keep-alive time (in milliseconds) for an idle worker thread | duration | 1.4.0 |
-| kyuubi.frontend.trino.bind.host | <undefined> | Hostname or IP of the machine on which to run the TRINO frontend service. | string | 1.7.0 |
-| kyuubi.frontend.trino.bind.port | 10999 | Port of the machine on which to run the TRINO frontend service. | int | 1.7.0 |
-| kyuubi.frontend.trino.max.worker.threads | 999 | Maximum number of threads in the frontend worker thread pool for the Trino frontend service | int | 1.7.0 |
-| kyuubi.frontend.worker.keepalive.time | PT1M | (deprecated) Keep-alive time (in milliseconds) for an idle worker thread | duration | 1.0.0 |
+| Key | Default | Meaning | Type | Since |
+|--------------------------------------------------------|--------------------|---------|----------|-------|
+| kyuubi.frontend.advertised.host | <undefined> | Hostname or IP of the Kyuubi server's frontend services to publish to external systems such as the service discovery ensemble and metadata store. Use it when you want to advertise a different hostname or IP than the bind host. | string | 1.8.0 |
+| kyuubi.frontend.backoff.slot.length | PT0.1S | (deprecated) Time to back off during login to the thrift frontend service. | duration | 1.0.0 |
+| kyuubi.frontend.bind.host | <undefined> | Hostname or IP of the machine on which to run the frontend services. | string | 1.0.0 |
+| kyuubi.frontend.bind.port | 10009 | (deprecated) Port of the machine on which to run the thrift frontend service via the binary protocol. | int | 1.0.0 |
+| kyuubi.frontend.connection.url.use.hostname | true | When true, frontend services prefer the hostname; otherwise, the IP address. Note that the default value is set to `false` when the engine is running on Kubernetes to prevent potential network issues. | boolean | 1.5.0 |
+| kyuubi.frontend.login.timeout | PT20S | (deprecated) Timeout for Thrift clients during login to the thrift frontend service. | duration | 1.0.0 |
+| kyuubi.frontend.max.message.size | 104857600 | (deprecated) Maximum message size in bytes a Kyuubi server will accept. | int | 1.0.0 |
+| kyuubi.frontend.max.worker.threads | 999 | (deprecated) Maximum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.0.0 |
+| kyuubi.frontend.min.worker.threads | 9 | (deprecated) Minimum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.0.0 |
+| kyuubi.frontend.mysql.bind.host | <undefined> | Hostname or IP of the machine on which to run the MySQL frontend service. | string | 1.4.0 |
+| kyuubi.frontend.mysql.bind.port | 3309 | Port of the machine on which to run the MySQL frontend service. | int | 1.4.0 |
+| kyuubi.frontend.mysql.max.worker.threads | 999 | Maximum number of threads in the command execution thread pool for the MySQL frontend service | int | 1.4.0 |
+| kyuubi.frontend.mysql.min.worker.threads | 9 | Minimum number of threads in the command execution thread pool for the MySQL frontend service | int | 1.4.0 |
+| kyuubi.frontend.mysql.netty.worker.threads | <undefined> | Number of threads in the netty worker event loop of the MySQL frontend service. Uses min(cpu_cores, 8) by default. | int | 1.4.0 |
+| kyuubi.frontend.mysql.worker.keepalive.time | PT1M | Time(ms) that an idle async thread of the command execution thread pool will wait for a new task to arrive before terminating in MySQL frontend service | duration | 1.4.0 |
+| kyuubi.frontend.protocols | THRIFT_BINARY,REST | A comma-separated list for all frontend protocols | seq | 1.4.0 |
+| kyuubi.frontend.proxy.http.client.ip.header | X-Real-IP | The HTTP header to record the real client IP address. If your server is behind a load balancer or other proxy, the server will see the load balancer or proxy IP address as the client IP address. To get around this common issue, most load balancers or proxies can record the real remote IP address in an HTTP header that is added to the request for other devices to use. Note that because the header value can be set to any IP address, it will not be used for authentication. | string | 1.6.0 |
+| kyuubi.frontend.rest.bind.host | <undefined> | Hostname or IP of the machine on which to run the REST frontend service. | string | 1.4.0 |
+| kyuubi.frontend.rest.bind.port | 10099 | Port of the machine on which to run the REST frontend service. | int | 1.4.0 |
+| kyuubi.frontend.rest.max.worker.threads | 999 | Maximum number of threads in the frontend worker thread pool for the rest frontend service | int | 1.6.2 |
+| kyuubi.frontend.ssl.keystore.algorithm | <undefined> | SSL certificate keystore algorithm. | string | 1.7.0 |
+| kyuubi.frontend.ssl.keystore.password | <undefined> | SSL certificate keystore password. | string | 1.7.0 |
+| kyuubi.frontend.ssl.keystore.path | <undefined> | SSL certificate keystore location. | string | 1.7.0 |
+| kyuubi.frontend.ssl.keystore.type | <undefined> | SSL certificate keystore type. | string | 1.7.0 |
+| kyuubi.frontend.thrift.backoff.slot.length | PT0.1S | Time to back off during login to the thrift frontend service. | duration | 1.4.0 |
+| kyuubi.frontend.thrift.binary.bind.host | <undefined> | Hostname or IP of the machine on which to run the thrift frontend service via the binary protocol. | string | 1.4.0 |
+| kyuubi.frontend.thrift.binary.bind.port | 10009 | Port of the machine on which to run the thrift frontend service via the binary protocol. | int | 1.4.0 |
+| kyuubi.frontend.thrift.binary.ssl.disallowed.protocols | SSLv2,SSLv3 | SSL versions to disallow for Kyuubi thrift binary frontend. | set | 1.7.0 |
+| kyuubi.frontend.thrift.binary.ssl.enabled | false | Set this to true for using SSL encryption in thrift binary frontend server. | boolean | 1.7.0 |
+| kyuubi.frontend.thrift.binary.ssl.include.ciphersuites || A comma-separated list of include SSL cipher suite names for thrift binary frontend. | seq | 1.7.0 |
+| kyuubi.frontend.thrift.http.allow.user.substitution | true | Allow alternate user to be specified as part of open connection request when using HTTP transport mode. | boolean | 1.6.0 |
+| kyuubi.frontend.thrift.http.bind.host | <undefined> | Hostname or IP of the machine on which to run the thrift frontend service via http protocol. | string | 1.6.0 |
+| kyuubi.frontend.thrift.http.bind.port | 10010 | Port of the machine on which to run the thrift frontend service via http protocol. | int | 1.6.0 |
+| kyuubi.frontend.thrift.http.compression.enabled | true | Enable thrift http compression via Jetty compression support | boolean | 1.6.0 |
+| kyuubi.frontend.thrift.http.cookie.auth.enabled | true | When true, Kyuubi in HTTP transport mode, will use cookie-based authentication mechanism | boolean | 1.6.0 |
+| kyuubi.frontend.thrift.http.cookie.domain | <undefined> | Domain for the Kyuubi generated cookies | string | 1.6.0 |
+| kyuubi.frontend.thrift.http.cookie.is.httponly | true | HttpOnly attribute of the Kyuubi generated cookie. | boolean | 1.6.0 |
+| kyuubi.frontend.thrift.http.cookie.max.age | 86400 | Maximum age in seconds for server side cookie used by Kyuubi in HTTP mode. | int | 1.6.0 |
+| kyuubi.frontend.thrift.http.cookie.path | <undefined> | Path for the Kyuubi generated cookies | string | 1.6.0 |
+| kyuubi.frontend.thrift.http.max.idle.time | PT30M | Maximum idle time for a connection on the server when in HTTP mode. | duration | 1.6.0 |
+| kyuubi.frontend.thrift.http.path | cliservice | Path component of URL endpoint when in HTTP mode. | string | 1.6.0 |
+| kyuubi.frontend.thrift.http.request.header.size | 6144 | Request header size in bytes, when using HTTP transport mode. Jetty defaults used. | int | 1.6.0 |
+| kyuubi.frontend.thrift.http.response.header.size | 6144 | Response header size in bytes, when using HTTP transport mode. Jetty defaults used. | int | 1.6.0 |
+| kyuubi.frontend.thrift.http.ssl.exclude.ciphersuites || A comma-separated list of exclude SSL cipher suite names for thrift http frontend. | seq | 1.7.0 |
+| kyuubi.frontend.thrift.http.ssl.keystore.password | <undefined> | SSL certificate keystore password. | string | 1.6.0 |
+| kyuubi.frontend.thrift.http.ssl.keystore.path | <undefined> | SSL certificate keystore location. | string | 1.6.0 |
+| kyuubi.frontend.thrift.http.ssl.protocol.blacklist | SSLv2,SSLv3 | SSL Versions to disable when using HTTP transport mode. | seq | 1.6.0 |
+| kyuubi.frontend.thrift.http.use.SSL | false | Set this to true for using SSL encryption in http mode. | boolean | 1.6.0 |
+| kyuubi.frontend.thrift.http.xsrf.filter.enabled | false | If enabled, Kyuubi will block any requests made to it over HTTP if an X-XSRF-HEADER header is not present | boolean | 1.6.0 |
+| kyuubi.frontend.thrift.login.timeout | PT20S | Timeout for Thrift clients during login to the thrift frontend service. | duration | 1.4.0 |
+| kyuubi.frontend.thrift.max.message.size | 104857600 | Maximum message size in bytes a Kyuubi server will accept. | int | 1.4.0 |
+| kyuubi.frontend.thrift.max.worker.threads | 999 | Maximum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.4.0 |
+| kyuubi.frontend.thrift.min.worker.threads | 9 | Minimum number of threads in the frontend worker thread pool for the thrift frontend service | int | 1.4.0 |
+| kyuubi.frontend.thrift.worker.keepalive.time | PT1M | Keep-alive time (in milliseconds) for an idle worker thread | duration | 1.4.0 |
+| kyuubi.frontend.trino.bind.host | <undefined> | Hostname or IP of the machine on which to run the TRINO frontend service. | string | 1.7.0 |
+| kyuubi.frontend.trino.bind.port | 10999 | Port of the machine on which to run the TRINO frontend service. | int | 1.7.0 |
+| kyuubi.frontend.trino.max.worker.threads | 999 | Maximum number of threads in the frontend worker thread pool for the Trino frontend service | int | 1.7.0 |
+| kyuubi.frontend.worker.keepalive.time | PT1M | (deprecated) Keep-alive time (in milliseconds) for an idle worker thread | duration | 1.0.0 |
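As a rough illustration of how the frontend keys above combine, a `kyuubi-defaults.conf` fragment might look like the following. This is a sketch only: the hostname, keystore path, and password are placeholders, and it assumes the shared `kyuubi.frontend.ssl.keystore.*` keys are the ones read by the Thrift binary frontend when SSL is enabled.

```properties
# Expose the Thrift binary and REST frontends (the documented default set).
kyuubi.frontend.protocols=THRIFT_BINARY,REST
# Bind address for all frontends; kyuubi.example.internal is a placeholder.
kyuubi.frontend.bind.host=kyuubi.example.internal
# Ports shown here are the documented defaults.
kyuubi.frontend.thrift.binary.bind.port=10009
kyuubi.frontend.rest.bind.port=10099
# Optional TLS for the Thrift binary frontend (placeholder path and password).
kyuubi.frontend.thrift.binary.ssl.enabled=true
kyuubi.frontend.ssl.keystore.path=/path/to/keystore.jks
kyuubi.frontend.ssl.keystore.password=changeit
```

Keys left out of the file keep the defaults listed in the table.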
### Ha
-| Key | Default | Meaning | Type | Since |
-|------------------------------------------------|----------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------|
-| kyuubi.ha.addresses || The connection string for the discovery ensemble | string | 1.6.0 |
-| kyuubi.ha.client.class | org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient | Class name for service discovery client. | string | 1.6.0 |
-| kyuubi.ha.etcd.lease.timeout | PT10S | Timeout for etcd keep alive lease. The kyuubi server will know the unexpected loss of engine after up to this seconds. | duration | 1.6.0 |
-| kyuubi.ha.etcd.ssl.ca.path | <undefined> | Where the etcd CA certificate file is stored. | string | 1.6.0 |
-| kyuubi.ha.etcd.ssl.client.certificate.path | <undefined> | Where the etcd SSL certificate file is stored. | string | 1.6.0 |
-| kyuubi.ha.etcd.ssl.client.key.path | <undefined> | Where the etcd SSL key file is stored. | string | 1.6.0 |
-| kyuubi.ha.etcd.ssl.enabled | false | When set to true, will build an SSL secured etcd client. | boolean | 1.6.0 |
-| kyuubi.ha.namespace | kyuubi | The root directory for the service to deploy its instance uri | string | 1.6.0 |
-| kyuubi.ha.zookeeper.acl.enabled | false | Set to true if the ZooKeeper ensemble is kerberized | boolean | 1.0.0 |
-| kyuubi.ha.zookeeper.auth.digest | <undefined> | The digest auth string is used for ZooKeeper authentication, like: username:password. | string | 1.3.2 |
-| kyuubi.ha.zookeeper.auth.keytab | <undefined> | Location of the Kyuubi server's keytab is used for ZooKeeper authentication. | string | 1.3.2 |
-| kyuubi.ha.zookeeper.auth.principal | <undefined> | Name of the Kerberos principal is used for ZooKeeper authentication. | string | 1.3.2 |
-| kyuubi.ha.zookeeper.auth.type | NONE | The type of ZooKeeper authentication, all candidates are: NONE, KERBEROS, DIGEST | string | 1.3.2 |
-| kyuubi.ha.zookeeper.connection.base.retry.wait | 1000 | Initial amount of time to wait between retries to the ZooKeeper ensemble | int | 1.0.0 |
-| kyuubi.ha.zookeeper.connection.max.retries | 3 | Max retry times for connecting to the ZooKeeper ensemble | int | 1.0.0 |
-| kyuubi.ha.zookeeper.connection.max.retry.wait | 30000 | Max amount of time to wait between retries for BOUNDED_EXPONENTIAL_BACKOFF policy can reach, or max time until elapsed for UNTIL_ELAPSED policy to connect the zookeeper ensemble | int | 1.0.0 |
-| kyuubi.ha.zookeeper.connection.retry.policy | EXPONENTIAL_BACKOFF | The retry policy for connecting to the ZooKeeper ensemble, all candidates are: ONE_TIME, N_TIME, EXPONENTIAL_BACKOFF, BOUNDED_EXPONENTIAL_BACKOFF, UNTIL_ELAPSED | string | 1.0.0 |
-| kyuubi.ha.zookeeper.connection.timeout | 15000 | The timeout(ms) of creating the connection to the ZooKeeper ensemble | int | 1.0.0 |
-| kyuubi.ha.zookeeper.engine.auth.type | NONE | The type of ZooKeeper authentication for the engine, all candidates are: NONE, KERBEROS, DIGEST | string | 1.3.2 |
-| kyuubi.ha.zookeeper.namespace | kyuubi | (deprecated) The root directory for the service to deploy its instance uri | string | 1.0.0 |
-| kyuubi.ha.zookeeper.node.creation.timeout | PT2M | Timeout for creating ZooKeeper node | duration | 1.2.0 |
-| kyuubi.ha.zookeeper.publish.configs | false | When set to true, publish Kerberos configs to Zookeeper. Note that the Hive driver needs to be greater than 1.3 or 2.0 or apply HIVE-11581 patch. | boolean | 1.4.0 |
-| kyuubi.ha.zookeeper.quorum || (deprecated) The connection string for the ZooKeeper ensemble | string | 1.0.0 |
-| kyuubi.ha.zookeeper.session.timeout | 60000 | The timeout(ms) of a connected session to be idled | int | 1.0.0 |
+| Key | Default | Meaning | Type | Since |
+|------------------------------------------------|----------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------|
+| kyuubi.ha.addresses || The connection string for the discovery ensemble | string | 1.6.0 |
+| kyuubi.ha.client.class | org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient | Class name for service discovery client. | string | 1.6.0 |
+| kyuubi.ha.etcd.lease.timeout | PT10S | Timeout for the etcd keep-alive lease. The Kyuubi server will detect an unexpected loss of an engine within this period at most. | duration | 1.6.0 |
+| kyuubi.ha.etcd.ssl.ca.path | <undefined> | Where the etcd CA certificate file is stored. | string | 1.6.0 |
+| kyuubi.ha.etcd.ssl.client.certificate.path | <undefined> | Where the etcd SSL certificate file is stored. | string | 1.6.0 |
+| kyuubi.ha.etcd.ssl.client.key.path | <undefined> | Where the etcd SSL key file is stored. | string | 1.6.0 |
+| kyuubi.ha.etcd.ssl.enabled | false | When set to true, will build an SSL secured etcd client. | boolean | 1.6.0 |
+| kyuubi.ha.namespace | kyuubi | The root directory for the service to deploy its instance uri | string | 1.6.0 |
+| kyuubi.ha.zookeeper.acl.enabled | false | Set to true if the ZooKeeper ensemble is kerberized | boolean | 1.0.0 |
+| kyuubi.ha.zookeeper.auth.digest | <undefined> | The digest auth string is used for ZooKeeper authentication, like: username:password. | string | 1.3.2 |
+| kyuubi.ha.zookeeper.auth.keytab | <undefined> | Location of the Kyuubi server's keytab that is used for ZooKeeper authentication. | string | 1.3.2 |
+| kyuubi.ha.zookeeper.auth.principal | <undefined> | Kerberos principal name that is used for ZooKeeper authentication. | string | 1.3.2 |
+| kyuubi.ha.zookeeper.auth.serverPrincipal | <undefined> | Kerberos principal name of the ZooKeeper server. It only takes effect when the ZooKeeper client's version is at least 3.5.7 or 3.6.0, or when ZOOKEEPER-1467 is applied. To use the ZooKeeper 3.6 client, compile Kyuubi with `-Pzookeeper-3.6`. | string | 1.8.0 |
+| kyuubi.ha.zookeeper.auth.type | NONE | The type of ZooKeeper authentication, all candidates are: NONE, KERBEROS, DIGEST | string | 1.3.2 |
+| kyuubi.ha.zookeeper.connection.base.retry.wait | 1000 | Initial amount of time to wait between retries to the ZooKeeper ensemble | int | 1.0.0 |
+| kyuubi.ha.zookeeper.connection.max.retries | 3 | Max retry times for connecting to the ZooKeeper ensemble | int | 1.0.0 |
+| kyuubi.ha.zookeeper.connection.max.retry.wait | 30000 | The maximum wait time between retries that the BOUNDED_EXPONENTIAL_BACKOFF policy can reach, or the maximum elapsed time for the UNTIL_ELAPSED policy, when connecting to the ZooKeeper ensemble | int | 1.0.0 |
+| kyuubi.ha.zookeeper.connection.retry.policy | EXPONENTIAL_BACKOFF | The retry policy for connecting to the ZooKeeper ensemble, all candidates are: ONE_TIME, N_TIME, EXPONENTIAL_BACKOFF, BOUNDED_EXPONENTIAL_BACKOFF, UNTIL_ELAPSED | string | 1.0.0 |
+| kyuubi.ha.zookeeper.connection.timeout | 15000 | The timeout(ms) of creating the connection to the ZooKeeper ensemble | int | 1.0.0 |
+| kyuubi.ha.zookeeper.engine.auth.type | NONE | The type of ZooKeeper authentication for the engine, all candidates are: NONE, KERBEROS, DIGEST | string | 1.3.2 |
+| kyuubi.ha.zookeeper.namespace | kyuubi | (deprecated) The root directory for the service to deploy its instance uri | string | 1.0.0 |
+| kyuubi.ha.zookeeper.node.creation.timeout | PT2M | Timeout for creating ZooKeeper node | duration | 1.2.0 |
+| kyuubi.ha.zookeeper.publish.configs | false | When set to true, publish Kerberos configs to Zookeeper. Note that the Hive driver needs to be greater than 1.3 or 2.0 or apply HIVE-11581 patch. | boolean | 1.4.0 |
+| kyuubi.ha.zookeeper.quorum || (deprecated) The connection string for the ZooKeeper ensemble | string | 1.0.0 |
+| kyuubi.ha.zookeeper.session.timeout | 60000 | The timeout(ms) of a connected session to be idled | int | 1.0.0 |
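A minimal sketch of a ZooKeeper-backed HA setup using the keys above. The ensemble addresses, principal, and keytab path are hypothetical placeholders; the retry values are the documented defaults, written out only to show how they pair with the chosen policy.

```properties
# Discovery ensemble (placeholder hosts) and the root namespace for instances.
kyuubi.ha.addresses=zk1.example.internal:2181,zk2.example.internal:2181,zk3.example.internal:2181
kyuubi.ha.namespace=kyuubi
# Kerberos authentication to ZooKeeper (placeholder principal and keytab).
kyuubi.ha.zookeeper.auth.type=KERBEROS
kyuubi.ha.zookeeper.auth.principal=kyuubi@EXAMPLE.COM
kyuubi.ha.zookeeper.auth.keytab=/path/to/kyuubi.keytab
# Bounded exponential backoff: start at 1000, cap the wait at 30000.
kyuubi.ha.zookeeper.connection.retry.policy=BOUNDED_EXPONENTIAL_BACKOFF
kyuubi.ha.zookeeper.connection.base.retry.wait=1000
kyuubi.ha.zookeeper.connection.max.retry.wait=30000
kyuubi.ha.zookeeper.connection.max.retries=3
```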
### Kinit
@@ -373,98 +309,118 @@ You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.co
### Kubernetes
-| Key | Default | Meaning | Type | Since |
-|-----------------------------------------------|-------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|-------|
-| kyuubi.kubernetes.authenticate.caCertFile | <undefined> | Path to the CA cert file for connecting to the Kubernetes API server over TLS from the kyuubi. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) | string | 1.7.0 |
-| kyuubi.kubernetes.authenticate.clientCertFile | <undefined> | Path to the client cert file for connecting to the Kubernetes API server over TLS from the kyuubi. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) | string | 1.7.0 |
-| kyuubi.kubernetes.authenticate.clientKeyFile | <undefined> | Path to the client key file for connecting to the Kubernetes API server over TLS from the kyuubi. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) | string | 1.7.0 |
-| kyuubi.kubernetes.authenticate.oauthToken | <undefined> | The OAuth token to use when authenticating against the Kubernetes API server. Note that unlike, the other authentication options, this must be the exact string value of the token to use for the authentication. | string | 1.7.0 |
-| kyuubi.kubernetes.authenticate.oauthTokenFile | <undefined> | Path to the file containing the OAuth token to use when authenticating against the Kubernetes API server. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) | string | 1.7.0 |
-| kyuubi.kubernetes.context | <undefined> | The desired context from your kubernetes config file used to configure the K8s client for interacting with the cluster. | string | 1.6.0 |
-| kyuubi.kubernetes.master.address | <undefined> | The internal Kubernetes master (API server) address to be used for kyuubi. | string | 1.7.0 |
-| kyuubi.kubernetes.namespace | default | The namespace that will be used for running the kyuubi pods and find engines. | string | 1.7.0 |
-| kyuubi.kubernetes.trust.certificates | false | If set to true then client can submit to kubernetes cluster only with token | boolean | 1.7.0 |
+| Key | Default | Meaning | Type | Since |
+|-----------------------------------------------------|-------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------|
+| kyuubi.kubernetes.authenticate.caCertFile | <undefined> | Path to the CA cert file for connecting to the Kubernetes API server over TLS from the kyuubi. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) | string | 1.7.0 |
+| kyuubi.kubernetes.authenticate.clientCertFile | <undefined> | Path to the client cert file for connecting to the Kubernetes API server over TLS from the kyuubi. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) | string | 1.7.0 |
+| kyuubi.kubernetes.authenticate.clientKeyFile | <undefined> | Path to the client key file for connecting to the Kubernetes API server over TLS from the kyuubi. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) | string | 1.7.0 |
+| kyuubi.kubernetes.authenticate.oauthToken | <undefined> | The OAuth token to use when authenticating against the Kubernetes API server. Note that, unlike the other authentication options, this must be the exact string value of the token to use for the authentication. | string | 1.7.0 |
+| kyuubi.kubernetes.authenticate.oauthTokenFile | <undefined> | Path to the file containing the OAuth token to use when authenticating against the Kubernetes API server. Specify this as a path as opposed to a URI (i.e. do not provide a scheme) | string | 1.7.0 |
+| kyuubi.kubernetes.context | <undefined> | The desired context from your kubernetes config file used to configure the K8s client for interacting with the cluster. | string | 1.6.0 |
+| kyuubi.kubernetes.context.allow.list || The allowed kubernetes context list, if it is empty, there is no kubernetes context limitation. | set | 1.8.0 |
+| kyuubi.kubernetes.master.address | <undefined> | The internal Kubernetes master (API server) address to be used for kyuubi. | string | 1.7.0 |
+| kyuubi.kubernetes.namespace | default | The namespace that will be used for running the kyuubi pods and find engines. | string | 1.7.0 |
+| kyuubi.kubernetes.namespace.allow.list || The allowed kubernetes namespace list, if it is empty, there is no kubernetes namespace limitation. | set | 1.8.0 |
+| kyuubi.kubernetes.terminatedApplicationRetainPeriod | PT5M | The period for which the Kyuubi server retains application information after the application terminates. | duration | 1.7.1 |
+| kyuubi.kubernetes.trust.certificates | false | If set to true then client can submit to kubernetes cluster only with token | boolean | 1.7.0 |
+
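For illustration, the Kubernetes keys above could be combined as follows. The context and namespace names are invented placeholders, and the retain period is an arbitrary example value rather than a recommendation.

```properties
# Default context and namespace used by the Kyuubi server (placeholder names).
kyuubi.kubernetes.context=prod-cluster
kyuubi.kubernetes.namespace=kyuubi
# Restrict which contexts and namespaces may be used; empty lists mean no limit.
kyuubi.kubernetes.context.allow.list=prod-cluster,staging-cluster
kyuubi.kubernetes.namespace.allow.list=kyuubi,analytics
# Keep information about terminated applications for 10 minutes (default is PT5M).
kyuubi.kubernetes.terminatedApplicationRetainPeriod=PT10M
```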
+### Lineage
+
+| Key | Default | Meaning | Type | Since |
+|---------------------------------------|--------------------------------------------------------|---------------------------------------------------|--------|-------|
+| kyuubi.lineage.parser.plugin.provider | org.apache.kyuubi.plugin.lineage.LineageParserProvider | The provider for the Spark lineage parser plugin. | string | 1.8.0 |
### Metadata
-| Key | Default | Meaning | Type | Since |
-|-------------------------------------------------|----------------------------------------------------------|---------|----------|-------|
-| kyuubi.metadata.cleaner.enabled | true | Whether to clean the metadata periodically. If it is enabled, Kyuubi will clean the metadata that is in the terminate state with max age limitation. | boolean | 1.6.0 |
-| kyuubi.metadata.cleaner.interval | PT30M | The interval to check and clean expired metadata. | duration | 1.6.0 |
-| kyuubi.metadata.max.age | PT72H | The maximum age of metadata, the metadata exceeding the age will be cleaned. | duration | 1.6.0 |
-| kyuubi.metadata.recovery.threads | 10 | The number of threads for recovery from the metadata store when the Kyuubi server restarts. | int | 1.6.0 |
-| kyuubi.metadata.request.retry.interval | PT5S | The interval to check and trigger the metadata request retry tasks. | duration | 1.6.0 |
-| kyuubi.metadata.request.retry.queue.size | 65536 | The maximum queue size for buffering metadata requests in memory when the external metadata storage is down. Requests will be dropped if the queue exceeds. | int | 1.6.0 |
-| kyuubi.metadata.request.retry.threads | 10 | Number of threads in the metadata request retry manager thread pool. The metadata store might be unavailable sometimes and the requests will fail, tolerant for this case and unblock the main thread, we support retrying the failed requests in an async way. | int | 1.6.0 |
-| kyuubi.metadata.store.class | org.apache.kyuubi.server.metadata.jdbc.JDBCMetadataStore | Fully qualified class name for server metadata store. | string | 1.6.0 |
-| kyuubi.metadata.store.jdbc.database.schema.init | true | Whether to init the JDBC metadata store database schema. | boolean | 1.6.0 |
-| kyuubi.metadata.store.jdbc.database.type | DERBY | The database type for server jdbc metadata store. CUSTOM: User-defined database type, need to specify corresponding JDBC driver. Note that: The JDBC datasource is powered by HiKariCP, for datasource properties, please specify them with the prefix: kyuubi.metadata.store.jdbc.datasource. For example, kyuubi.metadata.store.jdbc.datasource.connectionTimeout=10000. | string | 1.6.0 |
-| kyuubi.metadata.store.jdbc.driver | <undefined> | JDBC driver class name for server jdbc metadata store. | string | 1.6.0 |
-| kyuubi.metadata.store.jdbc.password || The password for server JDBC metadata store. | string | 1.6.0 |
-| kyuubi.metadata.store.jdbc.url | jdbc:derby:memory:kyuubi_state_store_db;create=true | The JDBC url for server JDBC metadata store. By default, it is a DERBY in-memory database url, and the state information is not shared across kyuubi instances. To enable high availability for multiple kyuubi instances, please specify a production JDBC url. | string | 1.6.0 |
-| kyuubi.metadata.store.jdbc.user || The username for server JDBC metadata store. | string | 1.6.0 |
+| Key | Default | Meaning | Type | Since |
+|-------------------------------------------------|----------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------|
+| kyuubi.metadata.cleaner.enabled | true | Whether to clean the metadata periodically. If it is enabled, Kyuubi will clean the metadata that is in the terminate state with max age limitation. | boolean | 1.6.0 |
+| kyuubi.metadata.cleaner.interval | PT30M | The interval to check and clean expired metadata. | duration | 1.6.0 |
+| kyuubi.metadata.max.age | PT72H | The maximum age of metadata, the metadata exceeding the age will be cleaned. | duration | 1.6.0 |
+| kyuubi.metadata.recovery.threads | 10 | The number of threads for recovery from the metadata store when the Kyuubi server restarts. | int | 1.6.0 |
+| kyuubi.metadata.request.async.retry.enabled | true | Whether to retry asynchronously when a metadata request fails. When true, a success response is returned immediately even if the metadata request failed, and the request is rescheduled in the background until it succeeds, to tolerate long metadata store outages without blocking the submission request. | boolean | 1.7.0 |
+| kyuubi.metadata.request.async.retry.queue.size | 65536 | The maximum queue size for buffering metadata requests in memory when the external metadata storage is down. Requests will be dropped if the queue size is exceeded. Only takes effect when kyuubi.metadata.request.async.retry.enabled is `true`. | int | 1.6.0 |
+| kyuubi.metadata.request.async.retry.threads | 10 | Number of threads in the metadata request async retry manager thread pool. Only takes effect when kyuubi.metadata.request.async.retry.enabled is `true`. | int | 1.6.0 |
+| kyuubi.metadata.request.retry.interval | PT5S | The interval to check and trigger the metadata request retry tasks. | duration | 1.6.0 |
+| kyuubi.metadata.store.class | org.apache.kyuubi.server.metadata.jdbc.JDBCMetadataStore | Fully qualified class name for server metadata store. | string | 1.6.0 |
+| kyuubi.metadata.store.jdbc.database.schema.init | true | Whether to init the JDBC metadata store database schema. | boolean | 1.6.0 |
+| kyuubi.metadata.store.jdbc.database.type | SQLITE | The database type for server jdbc metadata store. CUSTOM: User-defined database type; the corresponding JDBC driver must be specified. Note: the JDBC datasource is powered by HikariCP. To set datasource properties, specify them with the prefix kyuubi.metadata.store.jdbc.datasource. For example, kyuubi.metadata.store.jdbc.datasource.connectionTimeout=10000. | string | 1.6.0 |
+| kyuubi.metadata.store.jdbc.driver | <undefined> | JDBC driver class name for server jdbc metadata store. | string | 1.6.0 |
+| kyuubi.metadata.store.jdbc.password || The password for server JDBC metadata store. | string | 1.6.0 |
+| kyuubi.metadata.store.jdbc.priority.enabled | false | Whether to enable the priority scheduling for batch impl v2. When false, ignore kyuubi.batch.priority and use the FIFO ordering strategy for batch job scheduling. Note: this feature may cause significant performance issues when using MySQL 5.7 as the metastore backend due to the lack of support for mixed order index. See more details at KYUUBI #5329. | boolean | 1.8.0 |
+| kyuubi.metadata.store.jdbc.url | jdbc:sqlite:kyuubi_state_store.db | The JDBC url for server JDBC metadata store. By default, it is a SQLite database url, and the state information is not shared across kyuubi instances. To enable high availability for multiple kyuubi instances, please specify a production JDBC url. | string | 1.6.0 |
+| kyuubi.metadata.store.jdbc.user || The username for server JDBC metadata store. | string | 1.6.0 |
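
As a rough illustration of the metadata store settings above, the following `kyuubi-defaults.conf` fragment sketches a JDBC store shared by several Kyuubi instances. The `MYSQL` database type, the driver class, the host, the database name, and the credentials are assumptions made for illustration and do not come from the table; replace them with values for your environment.

```properties
# Hypothetical shared metadata store; host, database and credentials are placeholders.
kyuubi.metadata.store.jdbc.database.type=MYSQL
kyuubi.metadata.store.jdbc.driver=com.mysql.cj.jdbc.Driver
kyuubi.metadata.store.jdbc.url=jdbc:mysql://metadata-db.example.com:3306/kyuubi
kyuubi.metadata.store.jdbc.user=kyuubi
kyuubi.metadata.store.jdbc.password=change-me

# Datasource properties are passed through with the documented prefix.
kyuubi.metadata.store.jdbc.datasource.connectionTimeout=10000

# Keep terminated metadata for 72 hours and check for expired entries every 30 minutes.
kyuubi.metadata.cleaner.enabled=true
kyuubi.metadata.max.age=PT72H
kyuubi.metadata.cleaner.interval=PT30M
```

Duration values such as `PT72H` and `PT30M` use the same ISO-8601 style as the defaults listed in the table.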
### Metrics
-| Key | Default | Meaning | Type | Since |
-|---------------------------------|----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------|
-| kyuubi.metrics.console.interval | PT5S | How often should report metrics to console | duration | 1.2.0 |
-| kyuubi.metrics.enabled | true | Set to true to enable kyuubi metrics system | boolean | 1.2.0 |
-| kyuubi.metrics.json.interval | PT5S | How often should report metrics to JSON file | duration | 1.2.0 |
-| kyuubi.metrics.json.location | metrics | Where the JSON metrics file located | string | 1.2.0 |
-| kyuubi.metrics.prometheus.path | /metrics | URI context path of prometheus metrics HTTP server | string | 1.2.0 |
-| kyuubi.metrics.prometheus.port | 10019 | Prometheus metrics HTTP server port | int | 1.2.0 |
-| kyuubi.metrics.reporters | JSON | A comma-separated list for all metrics reporters. CONSOLE - ConsoleReporter which outputs measurements to CONSOLE periodically. JMX - JmxReporter which listens for new metrics and exposes them as MBeans. JSON - JsonReporter which outputs measurements to json file periodically. PROMETHEUS - PrometheusReporter which exposes metrics in Prometheus format. SLF4J - Slf4jReporter which outputs measurements to system log periodically. | seq | 1.2.0 |
-| kyuubi.metrics.slf4j.interval | PT5S | How often should report metrics to SLF4J logger | duration | 1.2.0 |
+| Key | Default | Meaning | Type | Since |
+|---------------------------------|------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------|
+| kyuubi.metrics.console.interval | PT5S | How often should report metrics to console | duration | 1.2.0 |
+| kyuubi.metrics.enabled | true | Set to true to enable kyuubi metrics system | boolean | 1.2.0 |
+| kyuubi.metrics.json.interval | PT5S | How often should report metrics to JSON file | duration | 1.2.0 |
+| kyuubi.metrics.json.location | metrics | Where the JSON metrics file located | string | 1.2.0 |
+| kyuubi.metrics.prometheus.path | /metrics | URI context path of prometheus metrics HTTP server | string | 1.2.0 |
+| kyuubi.metrics.prometheus.port | 10019 | Prometheus metrics HTTP server port | int | 1.2.0 |
+| kyuubi.metrics.reporters | PROMETHEUS | A comma-separated list for all metrics reporters. CONSOLE - ConsoleReporter which outputs measurements to CONSOLE periodically. JMX - JmxReporter which listens for new metrics and exposes them as MBeans. JSON - JsonReporter which outputs measurements to json file periodically. PROMETHEUS - PrometheusReporter which exposes metrics in Prometheus format. SLF4J - Slf4jReporter which outputs measurements to system log periodically. | set | 1.2.0 |
+| kyuubi.metrics.slf4j.interval | PT5S | How often should report metrics to SLF4J logger | duration | 1.2.0 |
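
A minimal sketch of how the metrics keys above might be combined in `kyuubi-defaults.conf`. The choice of two reporters and the 30-second SLF4J interval are illustrative assumptions; the port and path are the defaults from the table.

```properties
# Enable the metrics system and expose it to Prometheus, with a periodic log summary.
kyuubi.metrics.enabled=true
kyuubi.metrics.reporters=PROMETHEUS,SLF4J

# Prometheus HTTP endpoint (defaults from the table, shown explicitly).
kyuubi.metrics.prometheus.port=10019
kyuubi.metrics.prometheus.path=/metrics

# Assumed interval for illustration; the default is PT5S.
kyuubi.metrics.slf4j.interval=PT30S
```

With these values, a Prometheus scraper would be pointed at port `10019` and path `/metrics` on the Kyuubi server host.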
### Operation
-| Key | Default | Meaning | Type | Since |
-|-----------------------------------------|---------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------|
-| kyuubi.operation.idle.timeout | PT3H | Operation will be closed when it's not accessed for this duration of time | duration | 1.0.0 |
-| kyuubi.operation.interrupt.on.cancel | true | When true, all running tasks will be interrupted if one cancels a query. When false, all running tasks will remain until finished. | boolean | 1.2.0 |
-| kyuubi.operation.language | SQL | Choose a programing language for the following inputs. SQL: (Default) Run all following statements as SQL queries. SCALA: Run all following input a scala codes | string | 1.5.0 |
-| kyuubi.operation.log.dir.root | server_operation_logs | Root directory for query operation log at server-side. | string | 1.4.0 |
-| kyuubi.operation.plan.only.excludes | ResetCommand,SetCommand,SetNamespaceCommand,UseStatement,SetCatalogAndNamespace | Comma-separated list of query plan names, in the form of simple class names, i.e, for `SET abc=xyz`, the value will be `SetCommand`. For those auxiliary plans, such as `switch databases`, `set properties`, or `create temporary view` etc., which are used for setup evaluating environments for analyzing actual queries, we can use this config to exclude them and let them take effect. See also kyuubi.operation.plan.only.mode. | seq | 1.5.0 |
-| kyuubi.operation.plan.only.mode | none | Configures the statement performed mode, The value can be 'parse', 'analyze', 'optimize', 'optimize_with_stats', 'physical', 'execution', or 'none', when it is 'none', indicate to the statement will be fully executed, otherwise only way without executing the query. different engines currently support different modes, the Spark engine supports all modes, and the Flink engine supports 'parse', 'physical', and 'execution', other engines do not support planOnly currently. | string | 1.4.0 |
-| kyuubi.operation.plan.only.output.style | plain | Configures the planOnly output style. The value can be 'plain' or 'json', and the default value is 'plain'. This configuration supports only the output styles of the Spark engine | string | 1.7.0 |
-| kyuubi.operation.progress.enabled | false | Whether to enable the operation progress. When true, the operation progress will be returned in `GetOperationStatus`. | boolean | 1.6.0 |
-| kyuubi.operation.query.timeout | <undefined> | Timeout for query executions at server-side, take effect with client-side timeout(`java.sql.Statement.setQueryTimeout`) together, a running query will be cancelled automatically if timeout. It's off by default, which means only client-side take full control of whether the query should timeout or not. If set, client-side timeout is capped at this point. To cancel the queries right away without waiting for task to finish, consider enabling kyuubi.operation.interrupt.on.cancel together. | duration | 1.2.0 |
-| kyuubi.operation.result.format | thrift | Specify the result format, available configs are: THRIFT: the result will convert to TRow at the engine driver side. ARROW: the result will be encoded as Arrow at the executor side before collecting by the driver, and deserialized at the client side. note that it only takes effect for kyuubi-hive-jdbc clients now. | string | 1.7.0 |
-| kyuubi.operation.result.max.rows | 0 | Max rows of Spark query results. Rows exceeding the limit would be ignored. By setting this value to 0 to disable the max rows limit. | int | 1.6.0 |
-| kyuubi.operation.scheduler.pool | <undefined> | The scheduler pool of job. Note that, this config should be used after changing Spark config spark.scheduler.mode=FAIR. | string | 1.1.1 |
-| kyuubi.operation.spark.listener.enabled | true | When set to true, Spark engine registers an SQLOperationListener before executing the statement, logging a few summary statistics when each stage completes. | boolean | 1.6.0 |
-| kyuubi.operation.status.polling.timeout | PT5S | Timeout(ms) for long polling asynchronous running sql query's status | duration | 1.0.0 |
+| Key | Default | Meaning | Type | Since |
+|--------------------------------------------------|---------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------|
+| kyuubi.operation.getTables.ignoreTableProperties | false | Speed up the `GetTables` operation by returning table identities only. | boolean | 1.8.0 |
+| kyuubi.operation.idle.timeout | PT3H | Operation will be closed when it's not accessed for this duration of time | duration | 1.0.0 |
+| kyuubi.operation.interrupt.on.cancel | true | When true, all running tasks will be interrupted if one cancels a query. When false, all running tasks will remain until finished. | boolean | 1.2.0 |
+| kyuubi.operation.language | SQL | Choose a programming language for the following inputs. SQL: (Default) Run all following statements as SQL queries. SCALA: Run all following input as Scala code. PYTHON: (Experimental) Run all following input as Python code with Spark engine. | string | 1.5.0 |
+| kyuubi.operation.log.dir.root | server_operation_logs | Root directory for query operation log at server-side. | string | 1.4.0 |
+| kyuubi.operation.plan.only.excludes | SetCatalogAndNamespace,UseStatement,SetNamespaceCommand,SetCommand,ResetCommand | Comma-separated list of query plan names, in the form of simple class names, i.e, for `SET abc=xyz`, the value will be `SetCommand`. For those auxiliary plans, such as `switch databases`, `set properties`, or `create temporary view` etc., which are used for setup evaluating environments for analyzing actual queries, we can use this config to exclude them and let them take effect. See also kyuubi.operation.plan.only.mode. | set | 1.5.0 |
+| kyuubi.operation.plan.only.mode | none | Configures the statement execution mode. The value can be 'parse', 'analyze', 'optimize', 'optimize_with_stats', 'physical', 'execution', 'lineage' or 'none'. When it is 'none', the statement is fully executed; otherwise, only the plan for the given mode is produced, without executing the query. Different engines currently support different modes: the Spark engine supports all modes, the Flink engine supports 'parse', 'physical', and 'execution', and other engines do not support planOnly currently. | string | 1.4.0 |
+| kyuubi.operation.plan.only.output.style | plain | Configures the planOnly output style. The value can be 'plain' or 'json', and the default value is 'plain'. This configuration supports only the output styles of the Spark engine | string | 1.7.0 |
+| kyuubi.operation.progress.enabled | false | Whether to enable the operation progress. When true, the operation progress will be returned in `GetOperationStatus`. | boolean | 1.6.0 |
+| kyuubi.operation.query.timeout | <undefined> | Timeout for query executions at server-side. It takes effect together with the client-side timeout (`java.sql.Statement.setQueryTimeout`); a running query will be cancelled automatically when it times out. It's off by default, which means only the client side takes full control of whether the query should time out or not. If set, the client-side timeout is capped at this value. To cancel queries right away without waiting for tasks to finish, consider enabling kyuubi.operation.interrupt.on.cancel as well. | duration | 1.2.0 |
+| kyuubi.operation.result.arrow.timestampAsString | false | When true, arrow-based rowsets will convert columns of type timestamp to strings for transmission. | boolean | 1.7.0 |
+| kyuubi.operation.result.format | thrift | Specify the result format, available configs are: THRIFT: the result will convert to TRow at the engine driver side. ARROW: the result will be encoded as Arrow at the executor side before collecting by the driver, and deserialized at the client side. Note that it only takes effect for kyuubi-hive-jdbc clients now. | string | 1.7.0 |
+| kyuubi.operation.result.max.rows | 0 | Max rows of Spark query results. Rows exceeding the limit would be ignored. Set this value to 0 to disable the max rows limit. | int | 1.6.0 |
+| kyuubi.operation.scheduler.pool | <undefined> | The scheduler pool of job. Note that, this config should be used after changing Spark config spark.scheduler.mode=FAIR. | string | 1.1.1 |
+| kyuubi.operation.spark.listener.enabled | true | When set to true, Spark engine registers an SQLOperationListener before executing the statement, logging a few summary statistics when each stage completes. | boolean | 1.6.0 |
+| kyuubi.operation.status.polling.timeout | PT5S | Timeout(ms) for long polling asynchronous running sql query's status | duration | 1.0.0 |
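
The operation keys above could be combined as in the following sketch of a `kyuubi-defaults.conf` fragment. The specific values (a 10-minute server-side timeout, a one-million-row cap) are assumptions chosen for illustration, not recommendations from the source.

```properties
# Cap query runtime on the server side; client-side timeouts cannot exceed this.
kyuubi.operation.query.timeout=PT10M
# Interrupt running tasks as soon as a query is cancelled or times out.
kyuubi.operation.interrupt.on.cancel=true

# Use Arrow-encoded results (only effective for kyuubi-hive-jdbc clients)
# and transmit timestamp columns as strings.
kyuubi.operation.result.format=arrow
kyuubi.operation.result.arrow.timestampAsString=true

# Illustrative cap on Spark query results; 0 disables the limit.
kyuubi.operation.result.max.rows=1000000

# Return table identities only from GetTables to speed up metadata browsing.
kyuubi.operation.getTables.ignoreTableProperties=true
```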
### Server
-| Key | Default | Meaning | Type | Since |
-|----------------------------------------------------------|-------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------|-------|
-| kyuubi.server.batch.limit.connections.per.ipaddress | <undefined> | Maximum kyuubi server batch connections per ipaddress. Any user exceeding this limit will not be allowed to connect. | int | 1.7.0 |
-| kyuubi.server.batch.limit.connections.per.user | <undefined> | Maximum kyuubi server batch connections per user. Any user exceeding this limit will not be allowed to connect. | int | 1.7.0 |
-| kyuubi.server.batch.limit.connections.per.user.ipaddress | <undefined> | Maximum kyuubi server batch connections per user:ipaddress combination. Any user-ipaddress exceeding this limit will not be allowed to connect. | int | 1.7.0 |
-| kyuubi.server.info.provider | ENGINE | The server information provider name, some clients may rely on this information to check the server compatibilities and functionalities. SERVER: Return Kyuubi server information. ENGINE: Return Kyuubi engine information. | string | 1.6.1 |
-| kyuubi.server.limit.connections.per.ipaddress | <undefined> | Maximum kyuubi server connections per ipaddress. Any user exceeding this limit will not be allowed to connect. | int | 1.6.0 |
-| kyuubi.server.limit.connections.per.user | <undefined> | Maximum kyuubi server connections per user. Any user exceeding this limit will not be allowed to connect. | int | 1.6.0 |
-| kyuubi.server.limit.connections.per.user.ipaddress | <undefined> | Maximum kyuubi server connections per user:ipaddress combination. Any user-ipaddress exceeding this limit will not be allowed to connect. | int | 1.6.0 |
-| kyuubi.server.limit.connections.user.unlimited.list || The maximin connections of the user in the white list will not be limited. | seq | 1.7.0 |
-| kyuubi.server.name | <undefined> | The name of Kyuubi Server. | string | 1.5.0 |
-| kyuubi.server.redaction.regex | <undefined> | Regex to decide which Kyuubi contain sensitive information. When this regex matches a property key or value, the value is redacted from the various logs. || 1.6.0 |
+| Key | Default | Meaning | Type | Since |
+|----------------------------------------------------------|-------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------|
+| kyuubi.server.administrators || Comma-separated list of Kyuubi service administrators. We use this config to grant admin permission to any service accounts. | set | 1.8.0 |
+| kyuubi.server.info.provider | ENGINE | The server information provider name, some clients may rely on this information to check the server compatibilities and functionalities. SERVER: Return Kyuubi server information. ENGINE: Return Kyuubi engine information. | string | 1.6.1 |
+| kyuubi.server.limit.batch.connections.per.ipaddress | <undefined> | Maximum kyuubi server batch connections per ipaddress. Any user exceeding this limit will not be allowed to connect. | int | 1.7.0 |
+| kyuubi.server.limit.batch.connections.per.user | <undefined> | Maximum kyuubi server batch connections per user. Any user exceeding this limit will not be allowed to connect. | int | 1.7.0 |
+| kyuubi.server.limit.batch.connections.per.user.ipaddress | <undefined> | Maximum kyuubi server batch connections per user:ipaddress combination. Any user-ipaddress exceeding this limit will not be allowed to connect. | int | 1.7.0 |
+| kyuubi.server.limit.client.fetch.max.rows | <undefined> | Max rows limit for getting result row set operation. If the max rows specified by client-side is larger than the limit, request will fail directly. | int | 1.8.0 |
+| kyuubi.server.limit.connections.per.ipaddress | <undefined> | Maximum kyuubi server connections per ipaddress. Any user exceeding this limit will not be allowed to connect. | int | 1.6.0 |
+| kyuubi.server.limit.connections.per.user | <undefined> | Maximum kyuubi server connections per user. Any user exceeding this limit will not be allowed to connect. | int | 1.6.0 |
+| kyuubi.server.limit.connections.per.user.ipaddress | <undefined> | Maximum kyuubi server connections per user:ipaddress combination. Any user-ipaddress exceeding this limit will not be allowed to connect. | int | 1.6.0 |
+| kyuubi.server.limit.connections.user.deny.list || Users in the deny list will be denied connection to the kyuubi server. If a user is configured in both user.unlimited.list and user.deny.list, the deny list takes priority. | set | 1.8.0 |
+| kyuubi.server.limit.connections.user.unlimited.list || The maximum connections of the user in the white list will not be limited. | set | 1.7.0 |
+| kyuubi.server.name | <undefined> | The name of Kyuubi Server. | string | 1.5.0 |
+| kyuubi.server.periodicGC.interval | PT30M | How often to trigger a garbage collection. | duration | 1.7.0 |
+| kyuubi.server.redaction.regex | <undefined> | Regex to decide which Kyuubi properties contain sensitive information. When this regex matches a property key or value, the value is redacted from the various logs. || 1.6.0 |
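
To show how the connection-limit keys above interact, here is a hedged `kyuubi-defaults.conf` sketch. All user names and numeric limits are hypothetical placeholders.

```properties
# Hypothetical administrator accounts.
kyuubi.server.administrators=admin_a,admin_b

# Illustrative per-user and per-address connection limits.
kyuubi.server.limit.connections.per.user=50
kyuubi.server.limit.connections.per.ipaddress=200

# etl_service is exempt from the limits; blocked_user can never connect.
# If a user appears in both lists, the deny list wins.
kyuubi.server.limit.connections.user.unlimited.list=etl_service
kyuubi.server.limit.connections.user.deny.list=blocked_user

# Fail fetch requests that ask for more rows than this in one call.
kyuubi.server.limit.client.fetch.max.rows=100000
```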
### Session
| Key | Default | Meaning | Type | Since |
|------------------------------------------------------|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|-------|
| kyuubi.session.check.interval | PT5M | The check interval for session timeout. | duration | 1.0.0 |
-| kyuubi.session.conf.advisor | <undefined> | A config advisor plugin for Kyuubi Server. This plugin can provide some custom configs for different users or session configs and overwrite the session configs before opening a new session. This config value should be a subclass of `org.apache.kyuubi.plugin.SessionConfAdvisor` which has a zero-arg constructor. | string | 1.5.0 |
+| kyuubi.session.close.on.disconnect | true | Session will be closed when client disconnects from kyuubi gateway. Set this to false to have session outlive its parent connection. | boolean | 1.8.0 |
+| kyuubi.session.conf.advisor | <undefined> | A config advisor plugin for Kyuubi Server. This plugin can provide a list of custom configs for different users or session configs and overwrite the session configs before opening a new session. This config value should be a subclass of `org.apache.kyuubi.plugin.SessionConfAdvisor` which has a zero-arg constructor. | seq | 1.5.0 |
| kyuubi.session.conf.file.reload.interval | PT10M | When `FileSessionConfAdvisor` is used, this configuration defines the expired time of `$KYUUBI_CONF_DIR/kyuubi-session-.conf` in the cache. After exceeding this value, the file will be reloaded. | duration | 1.7.0 |
-| kyuubi.session.conf.ignore.list || A comma-separated list of ignored keys. If the client connection contains any of them, the key and the corresponding value will be removed silently during engine bootstrap and connection setup. Note that this rule is for server-side protection defined via administrators to prevent some essential configs from tampering but will not forbid users to set dynamic configurations via SET syntax. | seq | 1.2.0 |
+| kyuubi.session.conf.ignore.list || A comma-separated list of ignored keys. If the client connection contains any of them, the key and the corresponding value will be removed silently during engine bootstrap and connection setup. Note that this rule is for server-side protection defined via administrators to prevent some essential configs from tampering but will not forbid users to set dynamic configurations via SET syntax. | set | 1.2.0 |
| kyuubi.session.conf.profile | <undefined> | Specify a profile to load session-level configurations from `$KYUUBI_CONF_DIR/kyuubi-session-.conf`. This configuration will be ignored if the file does not exist. This configuration only takes effect when `kyuubi.session.conf.advisor` is set as `org.apache.kyuubi.session.FileSessionConfAdvisor`. | string | 1.7.0 |
-| kyuubi.session.conf.restrict.list || A comma-separated list of restricted keys. If the client connection contains any of them, the connection will be rejected explicitly during engine bootstrap and connection setup. Note that this rule is for server-side protection defined via administrators to prevent some essential configs from tampering but will not forbid users to set dynamic configurations via SET syntax. | seq | 1.2.0 |
+| kyuubi.session.conf.restrict.list || A comma-separated list of restricted keys. If the client connection contains any of them, the connection will be rejected explicitly during engine bootstrap and connection setup. Note that this rule is for server-side protection defined via administrators to prevent some essential configs from tampering but will not forbid users to set dynamic configurations via SET syntax. | set | 1.2.0 |
+| kyuubi.session.engine.alive.max.failures | 3 | The maximum number of failures allowed for the engine. | int | 1.8.0 |
| kyuubi.session.engine.alive.probe.enabled | false | Whether to enable the engine alive probe. If true, we will create a companion thrift client that keeps sending simple requests to check whether the engine is alive. | boolean | 1.6.0 |
| kyuubi.session.engine.alive.probe.interval | PT10S | The interval for engine alive probe. | duration | 1.6.0 |
| kyuubi.session.engine.alive.timeout | PT2M | The timeout for engine alive. If there is no alive probe success in the last timeout window, the engine will be marked as no-alive. | duration | 1.6.0 |
| kyuubi.session.engine.check.interval | PT1M | The check interval for engine timeout | duration | 1.0.0 |
+| kyuubi.session.engine.flink.fetch.timeout | <undefined> | Result fetch timeout for Flink engine. If the timeout is reached, the result fetch would be stopped and the rows fetched so far would be returned. If no data are fetched, a TimeoutException would be thrown. | duration | 1.8.0 |
| kyuubi.session.engine.flink.main.resource | <undefined> | The package used to create Flink SQL engine remote job. If it is undefined, Kyuubi will use the default | string | 1.4.0 |
| kyuubi.session.engine.flink.max.rows | 1000000 | Max rows of Flink query results. For batch queries, rows exceeding the limit would be ignored. For streaming queries, the query would be canceled if the limit is reached. | int | 1.5.0 |
| kyuubi.session.engine.hive.main.resource | <undefined> | The package used to create Hive engine remote job. If it is undefined, Kyuubi will use the default | string | 1.6.0 |
@@ -477,10 +433,12 @@ You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.co
| kyuubi.session.engine.open.retry.wait | PT10S | How long to wait before retrying to open the engine after failure. | duration | 1.7.0 |
| kyuubi.session.engine.share.level | USER | (deprecated) - Using kyuubi.engine.share.level instead | string | 1.0.0 |
| kyuubi.session.engine.spark.main.resource | <undefined> | The package used to create Spark SQL engine remote application. If it is undefined, Kyuubi will use the default | string | 1.0.0 |
+| kyuubi.session.engine.spark.max.initial.wait | PT1M | Max wait time for the initial connection to Spark engine. The engine will self-terminate if no new incoming connection is established within this time. This setting only applies at the CONNECTION share level. 0 or negative means not to self-terminate. | duration | 1.8.0 |
| kyuubi.session.engine.spark.max.lifetime | PT0S | Max lifetime for Spark engine, the engine will self-terminate when it reaches the end of life. 0 or negative means not to self-terminate. | duration | 1.6.0 |
| kyuubi.session.engine.spark.progress.timeFormat | yyyy-MM-dd HH:mm:ss.SSS | The time format of the progress bar | string | 1.6.0 |
| kyuubi.session.engine.spark.progress.update.interval | PT1S | Update period of progress bar. | duration | 1.6.0 |
| kyuubi.session.engine.spark.showProgress | false | When true, show the progress bar in the Spark's engine log. | boolean | 1.6.0 |
+| kyuubi.session.engine.startup.destroy.timeout | PT5S | Engine startup process destroy wait time, if the process does not stop after this time, force destroy instead. This configuration only takes effect when `kyuubi.session.engine.startup.waitCompletion=false`. | duration | 1.8.0 |
| kyuubi.session.engine.startup.error.max.size | 8192 | During engine bootstrapping, if an error occurs, use this config to limit the length of the error message (in characters). | int | 1.1.0 |
| kyuubi.session.engine.startup.maxLogLines | 10 | The maximum number of engine log lines when errors occur during the engine startup phase. Note that this config takes effect on the client side to help track engine startup issues. | int | 1.4.0 |
| kyuubi.session.engine.startup.waitCompletion | true | Whether to wait for completion after the engine starts. If false, the startup process will be destroyed after the engine is started. Note that only use it when the driver is not running locally, such as in yarn-cluster mode; Otherwise, the engine will be killed. | boolean | 1.5.0 |
@@ -491,7 +449,7 @@ You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.co
| kyuubi.session.engine.trino.showProgress.debug | false | When true, show the progress debug info in the Trino engine log. | boolean | 1.6.0 |
| kyuubi.session.group.provider | hadoop | A group provider plugin for Kyuubi Server. This plugin can provide primary group and groups information for different users or session configs. This config value should be a subclass of `org.apache.kyuubi.plugin.GroupProvider` which has a zero-arg constructor. Kyuubi provides the following built-in implementations: hadoop: delegate the user group mapping to hadoop UserGroupInformation. | string | 1.7.0 |
| kyuubi.session.idle.timeout | PT6H | session idle timeout, it will be closed when it's not accessed for this duration | duration | 1.2.0 |
-| kyuubi.session.local.dir.allow.list || The local dir list that are allowed to access by the kyuubi session application. End-users might set some parameters such as `spark.files` and it will upload some local files when launching the kyuubi engine, if the local dir allow list is defined, kyuubi will check whether the path to upload is in the allow list. Note that, if it is empty, there is no limitation for that. And please use absolute paths. | seq | 1.6.0 |
+| kyuubi.session.local.dir.allow.list || The local dir list that are allowed to access by the kyuubi session application. End-users might set some parameters such as `spark.files` and it will upload some local files when launching the kyuubi engine, if the local dir allow list is defined, kyuubi will check whether the path to upload is in the allow list. Note that, if it is empty, there is no limitation for that. And please use absolute paths. | set | 1.6.0 |
| kyuubi.session.name | <undefined> | A human readable name of the session and we use empty string by default. This name will be recorded in the event. Note that, we only apply this value from session conf. | string | 1.4.0 |
| kyuubi.session.timeout | PT6H | (deprecated)session timeout, it will be closed when it's not accessed for this duration | duration | 1.0.0 |
| kyuubi.session.user.sign.enabled | false | Whether to verify the integrity of session user name on the engine side, e.g. Authz plugin in Spark. | boolean | 1.7.0 |
@@ -503,26 +461,34 @@ You can configure the Kyuubi properties in `$KYUUBI_HOME/conf/kyuubi-defaults.co
| kyuubi.spnego.keytab | <undefined> | Keytab file for SPNego principal | string | 1.6.0 |
| kyuubi.spnego.principal | <undefined> | SPNego service principal, typical value would look like HTTP/_HOST@EXAMPLE.COM. SPNego service principal would be used when restful Kerberos security is enabled. This needs to be set only if SPNEGO is to be used in authentication. | string | 1.6.0 |
+### Yarn
+
+| Key | Default | Meaning | Type | Since |
+|---------------------------|---------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------|-------|
+| kyuubi.yarn.user.admin | yarn | When kyuubi.yarn.user.strategy is set to ADMIN, use this admin user to construct YARN client for application management, e.g. kill application. | string | 1.8.0 |
+| kyuubi.yarn.user.strategy | NONE | Determine which user to use to construct YARN client for application management, e.g. kill application. Options: `NONE`: use the Kyuubi server user; `ADMIN`: use the admin user configured in `kyuubi.yarn.user.admin`; `OWNER`: use the session user, typically the application owner. | string | 1.8.0 |
+
### Zookeeper
-| Key | Default | Meaning | Type | Since |
-|--------------------------------------------------|--------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------|-------|
-| kyuubi.zookeeper.embedded.client.port | 2181 | clientPort for the embedded ZooKeeper server to listen for client connections, a client here could be Kyuubi server, engine, and JDBC client | int | 1.2.0 |
-| kyuubi.zookeeper.embedded.client.port.address | <undefined> | clientPortAddress for the embedded ZooKeeper server to | string | 1.2.0 |
-| kyuubi.zookeeper.embedded.data.dir | embedded_zookeeper | dataDir for the embedded zookeeper server where stores the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database. | string | 1.2.0 |
-| kyuubi.zookeeper.embedded.data.log.dir | embedded_zookeeper | dataLogDir for the embedded ZooKeeper server where writes the transaction log . | string | 1.2.0 |
-| kyuubi.zookeeper.embedded.directory | embedded_zookeeper | The temporary directory for the embedded ZooKeeper server | string | 1.0.0 |
-| kyuubi.zookeeper.embedded.max.client.connections | 120 | maxClientCnxns for the embedded ZooKeeper server to limit the number of concurrent connections of a single client identified by IP address | int | 1.2.0 |
-| kyuubi.zookeeper.embedded.max.session.timeout | 60000 | maxSessionTimeout in milliseconds for the embedded ZooKeeper server will allow the client to negotiate. Defaults to 20 times the tickTime | int | 1.2.0 |
-| kyuubi.zookeeper.embedded.min.session.timeout | 6000 | minSessionTimeout in milliseconds for the embedded ZooKeeper server will allow the client to negotiate. Defaults to 2 times the tickTime | int | 1.2.0 |
-| kyuubi.zookeeper.embedded.port | 2181 | The port of the embedded ZooKeeper server | int | 1.0.0 |
-| kyuubi.zookeeper.embedded.tick.time | 3000 | tickTime in milliseconds for the embedded ZooKeeper server | int | 1.2.0 |
+| Key | Default | Meaning | Type | Since |
+|--------------------------------------------------|--------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------|-------|
+| kyuubi.zookeeper.embedded.client.port | 2181 | clientPort for the embedded ZooKeeper server to listen for client connections, a client here could be Kyuubi server, engine, and JDBC client | int | 1.2.0 |
+| kyuubi.zookeeper.embedded.client.port.address | <undefined> | clientPortAddress for the embedded ZooKeeper server to | string | 1.2.0 |
+| kyuubi.zookeeper.embedded.client.use.hostname | false | When true, the embedded ZooKeeper prefers to bind to the hostname; otherwise, to the IP address. | boolean | 1.7.2 |
+| kyuubi.zookeeper.embedded.data.dir | embedded_zookeeper | dataDir for the embedded ZooKeeper server, where it stores the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database. If it is a relative path, it is resolved relative to KYUUBI_HOME. | string | 1.2.0 |
+| kyuubi.zookeeper.embedded.data.log.dir | embedded_zookeeper | dataLogDir for the embedded ZooKeeper server, where it writes the transaction log. If it is a relative path, it is resolved relative to KYUUBI_HOME. | string | 1.2.0 |
+| kyuubi.zookeeper.embedded.directory | embedded_zookeeper | The temporary directory for the embedded ZooKeeper server. If it is a relative path, it is resolved relative to KYUUBI_HOME. | string | 1.0.0 |
+| kyuubi.zookeeper.embedded.max.client.connections | 120 | maxClientCnxns for the embedded ZooKeeper server to limit the number of concurrent connections of a single client identified by IP address | int | 1.2.0 |
+| kyuubi.zookeeper.embedded.max.session.timeout | 60000 | maxSessionTimeout in milliseconds that the embedded ZooKeeper server will allow the client to negotiate. Defaults to 20 times the tickTime | int | 1.2.0 |
+| kyuubi.zookeeper.embedded.min.session.timeout | 6000 | minSessionTimeout in milliseconds that the embedded ZooKeeper server will allow the client to negotiate. Defaults to 2 times the tickTime | int | 1.2.0 |
+| kyuubi.zookeeper.embedded.port | 2181 | The port of the embedded ZooKeeper server | int | 1.0.0 |
+| kyuubi.zookeeper.embedded.tick.time | 3000 | tickTime in milliseconds for the embedded ZooKeeper server | int | 1.2.0 |
## Spark Configurations
### Via spark-defaults.conf
-Setting them in `$SPARK_HOME/conf/spark-defaults.conf` supplies with default values for SQL engine application. Available properties can be found at Spark official online documentation for [Spark Configurations](http://spark.apache.org/docs/latest/configuration.html)
+Setting them in `$SPARK_HOME/conf/spark-defaults.conf` supplies default values for the SQL engine application. Available properties can be found in the official Spark online documentation for [Spark Configurations](https://spark.apache.org/docs/latest/configuration.html)
### Via kyuubi-defaults.conf
@@ -533,13 +499,13 @@ Setting them in `$KYUUBI_HOME/conf/kyuubi-defaults.conf` supplies with default v
Setting them in the JDBC Connection URL supplies session-specific for each SQL engine. For example: ```jdbc:hive2://localhost:10009/default;#spark.sql.shuffle.partitions=2;spark.executor.memory=5g```
- **Runtime SQL Configuration**
- - For [Runtime SQL Configurations](http://spark.apache.org/docs/latest/configuration.html#runtime-sql-configuration), they will take affect every time
+ - For [Runtime SQL Configurations](https://spark.apache.org/docs/latest/configuration.html#runtime-sql-configuration), they will take effect every time
- **Static SQL and Spark Core Configuration**
- - For [Static SQL Configurations](http://spark.apache.org/docs/latest/configuration.html#static-sql-configuration) and other spark core configs, e.g. `spark.executor.memory`, they will take effect if there is no existing SQL engine application. Otherwise, they will just be ignored
+ - For [Static SQL Configurations](https://spark.apache.org/docs/latest/configuration.html#static-sql-configuration) and other spark core configs, e.g. `spark.executor.memory`, they will take effect if there is no existing SQL engine application. Otherwise, they will just be ignored
### Via SET Syntax
-Please refer to the Spark official online documentation for [SET Command](http://spark.apache.org/docs/latest/sql-ref-syntax-aux-conf-mgmt-set.html)
+Please refer to the Spark official online documentation for [SET Command](https://spark.apache.org/docs/latest/sql-ref-syntax-aux-conf-mgmt-set.html)
## Flink Configurations
@@ -568,80 +534,42 @@ Setting them in the JDBC Connection URL supplies session-specific for each SQL e
Please refer to the Flink official online documentation for [SET Statements](https://nightlies.apache.org/flink/flink-docs-stable/docs/dev/table/sql/set/)
-## Logging
+## Trino Configurations
-Kyuubi uses [log4j](https://logging.apache.org/log4j/2.x/) for logging. You can configure it using `$KYUUBI_HOME/conf/log4j2.xml`.
+### Via config.properties
-```bash
-
-
-
-
-
-
-
- rest-audit.log
- rest-audit-%d{yyyy-MM-dd}-%i.log
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+Setting them in `$TRINO_HOME/etc/config.properties` supplies default values for the SQL engine application. Available properties can be found in the official Trino online documentation for [Trino Configurations](https://trino.io/docs/current/admin/properties.html)
+
+### Via kyuubi-defaults.conf
+
+Setting them in `$KYUUBI_HOME/conf/kyuubi-defaults.conf` also supplies default values for the SQL engine application. You can use properties with the additional prefix `trino.` to override settings in `$TRINO_HOME/etc/config.properties`.
+
+For example:
+
+```
+trino.query_max_stage_count 500
+trino.parse_decimal_literals_as_double true
```
+The options above in `kyuubi-defaults.conf` set `query_max_stage_count: 500` and `parse_decimal_literals_as_double: true` as Trino session properties.
+
+### Via JDBC Connection URL
+
+Setting them in the JDBC Connection URL supplies session-specific settings for each SQL engine. For example: ```jdbc:hive2://localhost:10009/default;#trino.query_max_stage_count=500;trino.parse_decimal_literals_as_double=true```
+
+### Via SET Statements
+
+Please refer to the Trino official online documentation for [SET Statements](https://trino.io/docs/current/sql/set-session.html)
+
+## Logging
+
+Kyuubi uses [log4j](https://logging.apache.org/log4j/2.x/) for logging. You can configure it using `$KYUUBI_HOME/conf/log4j2.xml`; see `$KYUUBI_HOME/conf/log4j2.xml.template` for an example.
+
## Other Configurations
### Hadoop Configurations
-Specifying `HADOOP_CONF_DIR` to the directory containing Hadoop configuration files or treating them as Spark properties with a `spark.hadoop.` prefix. Please refer to the Spark official online documentation for [Inheriting Hadoop Cluster Configuration](http://spark.apache.org/docs/latest/configuration.html#inheriting-hadoop-cluster-configuration). Also, please refer to the [Apache Hadoop](http://hadoop.apache.org)'s online documentation for an overview on how to configure Hadoop.
+Specify `HADOOP_CONF_DIR` as the directory containing Hadoop configuration files, or treat them as Spark properties with a `spark.hadoop.` prefix. Please refer to the official Spark online documentation for [Inheriting Hadoop Cluster Configuration](https://spark.apache.org/docs/latest/configuration.html#inheriting-hadoop-cluster-configuration). Also, please refer to the [Apache Hadoop](https://hadoop.apache.org) online documentation for an overview of how to configure Hadoop.
### Hive Configurations
diff --git a/docs/connector/flink/index.rst b/docs/connector/flink/index.rst
index c9d91091f..e7d40fd43 100644
--- a/docs/connector/flink/index.rst
+++ b/docs/connector/flink/index.rst
@@ -19,6 +19,6 @@ Connectors For Flink SQL Query Engine
.. toctree::
:maxdepth: 2
- flink_table_store
+ paimon
hudi
iceberg
diff --git a/docs/connector/flink/flink_table_store.rst b/docs/connector/flink/paimon.rst
similarity index 51%
rename from docs/connector/flink/flink_table_store.rst
rename to docs/connector/flink/paimon.rst
index 14c576bf3..b67101488 100644
--- a/docs/connector/flink/flink_table_store.rst
+++ b/docs/connector/flink/paimon.rst
@@ -13,57 +13,56 @@
See the License for the specific language governing permissions and
limitations under the License.
-`Flink Table Store`_
-==========
+`Apache Paimon (Incubating)`_
+=============================
-Flink Table Store is a unified storage to build dynamic tables for both streaming and batch processing in Flink,
-supporting high-speed data ingestion and timely data query.
+Apache Paimon (Incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking, and efficient real-time analytics.
.. tip::
- This article assumes that you have mastered the basic knowledge and operation of `Flink Table Store`_.
- For the knowledge about Flink Table Store not mentioned in this article,
+ This article assumes that you have mastered the basic knowledge and operation of `Apache Paimon (Incubating)`_.
+ For the knowledge not mentioned in this article,
you can obtain it from its `Official Documentation`_.
-By using kyuubi, we can run SQL queries towards Flink Table Store which is more
-convenient, easy to understand, and easy to expand than directly using
-flink to manipulate Flink Table Store.
+By using Kyuubi, we can run SQL queries against Apache Paimon (Incubating), which is more
+convenient, easier to understand, and easier to extend than using Flink directly.
-Flink Table Store Integration
--------------------
+Apache Paimon (Incubating) Integration
+--------------------------------------
-To enable the integration of kyuubi flink sql engine and Flink Table Store, you need to:
+To enable the integration of kyuubi flink sql engine and Apache Paimon (Incubating), you need to:
-- Referencing the Flink Table Store :ref:`dependencies`
+- Reference the Apache Paimon (Incubating) :ref:`dependencies<flink-paimon-deps>`
-.. _flink-table-store-deps:
+.. _flink-paimon-deps:
Dependencies
************
-The **classpath** of kyuubi flink sql engine with Flink Table Store supported consists of
+The **classpath** of kyuubi flink sql engine with Apache Paimon (Incubating) supported consists of
1. kyuubi-flink-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
2. a copy of flink distribution
-3. flink-table-store-dist-.jar (example: flink-table-store-dist-0.2.jar), which can be found in the `Maven Central`_
+3. paimon-flink-.jar (example: paimon-flink-1.16-0.4-SNAPSHOT.jar), which can be found in the `Apache Paimon (Incubating) Supported Engines Flink`_
+4. flink-shaded-hadoop-2-uber-.jar, which can be found in the `Pre-bundled Hadoop Jar`_
-In order to make the Flink Table Store packages visible for the runtime classpath of engines, we can use these methods:
+In order to make the Apache Paimon (Incubating) packages visible for the runtime classpath of engines, you need to:
-1. Put the Flink Table Store packages into ``$FLINK_HOME/lib`` directly
+1. Put the Apache Paimon (Incubating) packages into ``$FLINK_HOME/lib`` directly
2. Setting the HADOOP_CLASSPATH environment variable or copy the `Pre-bundled Hadoop Jar`_ to flink/lib.
.. warning::
- Please mind the compatibility of different Flink Table Store and Flink versions, which can be confirmed on the page of `Flink Table Store multi engine support`_.
+ Please mind the compatibility of different Apache Paimon (Incubating) and Flink versions, which can be confirmed on the page of `Apache Paimon (Incubating) multi engine support`_.
-Flink Table Store Operations
-------------------
+Apache Paimon (Incubating) Operations
+-------------------------------------
Taking ``CREATE CATALOG`` as a example,
.. code-block:: sql
CREATE CATALOG my_catalog WITH (
- 'type'='table-store',
- 'warehouse'='hdfs://nn:8020/warehouse/path' -- or 'file:///tmp/foo/bar'
+ 'type'='paimon',
+ 'warehouse'='file:/tmp/paimon'
);
USE CATALOG my_catalog;
@@ -104,8 +103,8 @@ Taking ``Rescale Bucket`` as a example,
INSERT OVERWRITE my_table PARTITION (dt = '2022-01-01');
-.. _Flink Table Store: https://nightlies.apache.org/flink/flink-table-store-docs-stable/
-.. _Official Documentation: https://nightlies.apache.org/flink/flink-table-store-docs-stable/
-.. _Maven Central: https://mvnrepository.com/artifact/org.apache.flink/flink-table-store-dist
-.. _Pre-bundled Hadoop Jar: https://flink.apache.org/downloads.html
-.. _Flink Table Store multi engine support: https://nightlies.apache.org/flink/flink-table-store-docs-stable/docs/engines/overview/
+.. _Apache Paimon (Incubating): https://paimon.apache.org/
+.. _Official Documentation: https://paimon.apache.org/docs/master/
+.. _Apache Paimon (Incubating) Supported Engines Flink: https://paimon.apache.org/docs/master/engines/flink/#preparing-paimon-jar-file
+.. _Pre-bundled Hadoop Jar: https://flink.apache.org/downloads/#additional-components
+.. _Apache Paimon (Incubating) multi engine support: https://paimon.apache.org/docs/master/engines/overview/
diff --git a/docs/connector/hive/index.rst b/docs/connector/hive/index.rst
index 2b2b863a6..d96f8b041 100644
--- a/docs/connector/hive/index.rst
+++ b/docs/connector/hive/index.rst
@@ -19,4 +19,5 @@ Connectors for Hive SQL Query Engine
.. toctree::
:maxdepth: 2
+ paimon
iceberg
diff --git a/docs/connector/hive/paimon.rst b/docs/connector/hive/paimon.rst
new file mode 100644
index 000000000..000d2d7e8
--- /dev/null
+++ b/docs/connector/hive/paimon.rst
@@ -0,0 +1,100 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+`Apache Paimon (Incubating)`_
+=============================
+
+Apache Paimon (Incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking, and efficient real-time analytics.
+
+.. tip::
+ This article assumes that you have mastered the basic knowledge and operation of `Apache Paimon (Incubating)`_.
+ For the knowledge about Apache Paimon (Incubating) not mentioned in this article,
+ you can obtain it from its `Official Documentation`_.
+
+By using Kyuubi, we can run SQL queries against Apache Paimon (Incubating), which is more
+convenient, easier to understand, and easier to extend than directly using
+Hive to manipulate Apache Paimon (Incubating).
+
+Apache Paimon (Incubating) Integration
+--------------------------------------
+
+To enable the integration of kyuubi hive sql engine and Apache Paimon (Incubating), you need to:
+
+- Reference the Apache Paimon (Incubating) :ref:`dependencies<hive-paimon-deps>`
+- Set the environment variables described in :ref:`configurations<hive-paimon-conf>`
+
+.. _hive-paimon-deps:
+
+Dependencies
+************
+
+The **classpath** of kyuubi hive sql engine with Apache Paimon (Incubating) supported consists of
+
+1. kyuubi-hive-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
+2. a copy of hive distribution
+3. paimon-hive-connector--.jar (example: paimon-hive-connector-3.1-0.4-SNAPSHOT.jar), which can be found in the `Apache Paimon (Incubating) Supported Engines Hive`_
+
+In order to make the Apache Paimon (Incubating) packages visible for the runtime classpath of engines, we can use one of these methods:
+
+1. You can create an auxlib folder under the root directory of Hive, and copy paimon-hive-connector-3.1-.jar into auxlib.
+2. Execute an ``ADD JAR`` statement in Kyuubi to add dependencies to Hive's auxiliary classpath. For example:
+
+.. code-block:: sql
+
+ ADD JAR /path/to/paimon-hive-connector-3.1-.jar;
+
+.. warning::
+ The second method is not recommended. If you are using the MR execution engine and running a join statement, you may encounter the exception
+ ``org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class``.
+
+.. warning::
+ Please mind the compatibility of different Apache Paimon (Incubating) and Hive versions, which can be confirmed on the page of `Apache Paimon (Incubating) multi engine support`_.
+
+.. _hive-paimon-conf:
+
+Configurations
+**************
+
+If you are using HDFS, make sure that the environment variable HADOOP_HOME or HADOOP_CONF_DIR is set.
+
+Apache Paimon (Incubating) Operations
+-------------------------------------
+
+Apache Paimon (Incubating) supports only reading Paimon tables through Hive.
+A common scenario is to write data with Spark or Flink and read data with Hive.
+You can follow this document `Apache Paimon (Incubating) Quick Start with Paimon Hive Catalog`_ to write data to a table that can also be accessed directly from Hive,
+and then use the Kyuubi Hive SQL engine to query the table with the following SQL ``SELECT`` statement.
+
+Taking ``Query Data`` as an example,
+
+.. code-block:: sql
+
+ SELECT a, b FROM test_table ORDER BY a;
+
+Taking ``Query External Table`` as an example,
+
+.. code-block:: sql
+
+ CREATE EXTERNAL TABLE external_test_table
+ STORED BY 'org.apache.paimon.hive.PaimonStorageHandler'
+ LOCATION '/path/to/table/store/warehouse/default.db/test_table';
+
+ SELECT a, b FROM external_test_table ORDER BY a;
+
+.. _Apache Paimon (Incubating): https://paimon.apache.org/
+.. _Official Documentation: https://paimon.apache.org/docs/master/
+.. _Apache Paimon (Incubating) Quick Start with Paimon Hive Catalog: https://paimon.apache.org/docs/master/engines/hive/#quick-start-with-paimon-hive-catalog
+.. _Apache Paimon (Incubating) Supported Engines Hive: https://paimon.apache.org/docs/master/engines/hive/
+.. _Apache Paimon (Incubating) multi engine support: https://paimon.apache.org/docs/master/engines/overview/
diff --git a/docs/connector/spark/flink_table_store.rst b/docs/connector/spark/flink_table_store.rst
deleted file mode 100644
index ee4c2b352..000000000
--- a/docs/connector/spark/flink_table_store.rst
+++ /dev/null
@@ -1,90 +0,0 @@
-.. Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
-.. http://www.apache.org/licenses/LICENSE-2.0
-
-.. Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
-
-`Flink Table Store`_
-==========
-
-Flink Table Store is a unified storage to build dynamic tables for both streaming and batch processing in Flink,
-supporting high-speed data ingestion and timely data query.
-
-.. tip::
- This article assumes that you have mastered the basic knowledge and operation of `Flink Table Store`_.
- For the knowledge about Flink Table Store not mentioned in this article,
- you can obtain it from its `Official Documentation`_.
-
-By using kyuubi, we can run SQL queries towards Flink Table Store which is more
-convenient, easy to understand, and easy to expand than directly using
-spark to manipulate Flink Table Store.
-
-Flink Table Store Integration
--------------------
-
-To enable the integration of kyuubi spark sql engine and Flink Table Store through
-Apache Spark Datasource V2 and Catalog APIs, you need to:
-
-- Referencing the Flink Table Store :ref:`dependencies`
-- Setting the spark extension and catalog :ref:`configurations`
-
-.. _spark-flink-table-store-deps:
-
-Dependencies
-************
-
-The **classpath** of kyuubi spark sql engine with Flink Table Store supported consists of
-
-1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
-2. a copy of spark distribution
-3. flink-table-store-spark-.jar (example: flink-table-store-spark-0.2.jar), which can be found in the `Maven Central`_
-
-In order to make the Flink Table Store packages visible for the runtime classpath of engines, we can use one of these methods:
-
-1. Put the Flink Table Store packages into ``$SPARK_HOME/jars`` directly
-2. Set ``spark.jars=/path/to/flink-table-store-spark``
-
-.. warning::
- Please mind the compatibility of different Flink Table Store and Spark versions, which can be confirmed on the page of `Flink Table Store multi engine support`_.
-
-.. _spark-flink-table-store-conf:
-
-Configurations
-**************
-
-To activate functionality of Flink Table Store, we can set the following configurations:
-
-.. code-block:: properties
-
- spark.sql.catalog.tablestore=org.apache.flink.table.store.spark.SparkCatalog
- spark.sql.catalog.tablestore.warehouse=file:/tmp/warehouse
-
-Flink Table Store Operations
-------------------
-
-Flink Table Store supports reading table store tables through Spark.
-A common scenario is to write data with Flink and read data with Spark.
-You can follow this document `Flink Table Store Quick Start`_ to write data to a table store table
-and then use kyuubi spark sql engine to query the table with the following SQL ``SELECT`` statement.
-
-
-.. code-block:: sql
-
- select * from table_store.default.word_count;
-
-
-
-.. _Flink Table Store: https://nightlies.apache.org/flink/flink-table-store-docs-stable/
-.. _Flink Table Store Quick Start: https://nightlies.apache.org/flink/flink-table-store-docs-stable/docs/try-table-store/quick-start/
-.. _Official Documentation: https://nightlies.apache.org/flink/flink-table-store-docs-stable/
-.. _Maven Central: https://mvnrepository.com/artifact/org.apache.flink
-.. _Flink Table Store multi engine support: https://nightlies.apache.org/flink/flink-table-store-docs-stable/docs/engines/overview/
diff --git a/docs/connector/spark/index.rst b/docs/connector/spark/index.rst
index 790e804f2..d1503443c 100644
--- a/docs/connector/spark/index.rst
+++ b/docs/connector/spark/index.rst
@@ -23,7 +23,7 @@ By default, it provides accessibility to hive warehouses with various file forma
supported, such as parquet, orc, json, etc.
Also,it can easily integrate with other third-party libraries, such as Hudi,
-Iceberg, Delta Lake, Kudu, Flink Table Store, HBase,Cassandra, etc.
+Iceberg, Delta Lake, Kudu, Apache Paimon (Incubating), HBase, Cassandra, etc.
We also provide sample data sources like TDC-DS, TPC-H for testing and benchmarking
purpose.
@@ -37,7 +37,7 @@ purpose.
iceberg
kudu
hive
- flink_table_store
+ paimon
tidb
tpcds
tpch
diff --git a/docs/connector/spark/paimon.rst b/docs/connector/spark/paimon.rst
new file mode 100644
index 000000000..14e741955
--- /dev/null
+++ b/docs/connector/spark/paimon.rst
@@ -0,0 +1,110 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+`Apache Paimon (Incubating)`_
+=============================
+
+Apache Paimon (Incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking, and efficient real-time analytics.
+
+.. tip::
+ This article assumes that you have mastered the basic knowledge and operation of `Apache Paimon (Incubating)`_.
+ For the knowledge about Apache Paimon (Incubating) not mentioned in this article,
+ you can obtain it from its `Official Documentation`_.
+
+By using Kyuubi, we can run SQL queries against Apache Paimon (Incubating), which is more
+convenient, easier to understand, and easier to extend than directly using
+Spark to manipulate Apache Paimon (Incubating).
+
+Apache Paimon (Incubating) Integration
+--------------------------------------
+
+To enable the integration of kyuubi spark sql engine and Apache Paimon (Incubating), you need to:
+
+- Reference the Apache Paimon (Incubating) :ref:`dependencies<spark-paimon-deps>`
+- Set the spark extension and catalog :ref:`configurations<spark-paimon-conf>`
+
+.. _spark-paimon-deps:
+
+Dependencies
+************
+
+The **classpath** of kyuubi spark sql engine with Apache Paimon (Incubating) consists of
+
+1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
+2. a copy of spark distribution
+3. paimon-spark-.jar (example: paimon-spark-3.3-0.4-20230323.002035-5.jar), which can be found in the `Apache Paimon (Incubating) Supported Engines Spark3`_
+
+In order to make the Apache Paimon (Incubating) packages visible for the runtime classpath of engines, we can use one of these methods:
+
+1. Put the Apache Paimon (Incubating) packages into ``$SPARK_HOME/jars`` directly
+2. Set ``spark.jars=/path/to/paimon-spark-.jar``
+
+.. warning::
+ Please mind the compatibility of different Apache Paimon (Incubating) and Spark versions, which can be confirmed on the page of `Apache Paimon (Incubating) multi engine support`_.
+
+.. _spark-paimon-conf:
+
+Configurations
+**************
+
+To activate functionality of Apache Paimon (Incubating), we can set the following configurations:
+
+.. code-block:: properties
+
+ spark.sql.catalog.paimon=org.apache.paimon.spark.SparkCatalog
+ spark.sql.catalog.paimon.warehouse=file:/tmp/paimon
+
+Apache Paimon (Incubating) Operations
+-------------------------------------
+
+
+Taking ``CREATE NAMESPACE`` as an example,
+
+.. code-block:: sql
+
+ CREATE DATABASE paimon.default;
+ USE paimon.default;
+
+Taking ``CREATE TABLE`` as an example,
+
+.. code-block:: sql
+
+ create table my_table (
+ k int,
+ v string
+ ) tblproperties (
+ 'primary-key' = 'k'
+ );
+
+Taking ``SELECT`` as an example,
+
+.. code-block:: sql
+
+ SELECT * FROM my_table;
+
+
+Taking ``INSERT`` as an example,
+
+.. code-block:: sql
+
+ INSERT INTO my_table VALUES (1, 'Hi Again'), (3, 'Test');
+
+
+
+
+.. _Apache Paimon (Incubating): https://paimon.apache.org/
+.. _Official Documentation: https://paimon.apache.org/docs/master/
+.. _Apache Paimon (Incubating) Supported Engines Spark3: https://paimon.apache.org/docs/master/engines/spark3/
+.. _Apache Paimon (Incubating) multi engine support: https://paimon.apache.org/docs/master/engines/overview/
diff --git a/docs/connector/trino/flink_table_store.rst b/docs/connector/trino/flink_table_store.rst
deleted file mode 100644
index 8dd0c4061..000000000
--- a/docs/connector/trino/flink_table_store.rst
+++ /dev/null
@@ -1,94 +0,0 @@
-.. Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
-
-.. http://www.apache.org/licenses/LICENSE-2.0
-
-.. Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
-
-`Flink Table Store`_
-==========
-
-Flink Table Store is a unified storage to build dynamic tables for both streaming and batch processing in Flink,
-supporting high-speed data ingestion and timely data query.
-
-.. tip::
- This article assumes that you have mastered the basic knowledge and operation of `Flink Table Store`_.
- For the knowledge about Flink Table Store not mentioned in this article,
- you can obtain it from its `Official Documentation`_.
-
-By using kyuubi, we can run SQL queries towards Flink Table Store which is more
-convenient, easy to understand, and easy to expand than directly using
-trino to manipulate Flink Table Store.
-
-Flink Table Store Integration
--------------------
-
-To enable the integration of kyuubi trino sql engine and Flink Table Store, you need to:
-
-- Referencing the Flink Table Store :ref:`dependencies`
-- Setting the trino extension and catalog :ref:`configurations`
-
-.. _trino-flink-table-store-deps:
-
-Dependencies
-************
-
-The **classpath** of kyuubi trino sql engine with Flink Table Store supported consists of
-
-1. kyuubi-trino-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
-2. a copy of trino distribution
-3. flink-table-store-trino-.jar (example: flink-table-store-trino-0.2.jar), which code can be found in the `Source Code`_
-4. flink-shaded-hadoop-2-uber-2.8.3-10.0.jar, which code can be found in the `Pre-bundled Hadoop 2.8.3`_
-
-In order to make the Flink Table Store packages visible for the runtime classpath of engines, we can use these methods:
-
-1. Build the flink-table-store-trino-.jar by reference to `Flink Table Store Trino README`_
-2. Put the flink-table-store-trino-.jar and flink-shaded-hadoop-2-uber-2.8.3-10.0.jar packages into ``$TRINO_SERVER_HOME/plugin/tablestore`` directly
-
-.. warning::
- Please mind the compatibility of different Flink Table Store and Trino versions, which can be confirmed on the page of `Flink Table Store multi engine support`_.
-
-.. _trino-flink-table-store-conf:
-
-Configurations
-**************
-
-To activate functionality of Flink Table Store, we can set the following configurations:
-
-Catalogs are registered by creating a catalog properties file in the $TRINO_SERVER_HOME/etc/catalog directory.
-For example, create $TRINO_SERVER_HOME/etc/catalog/tablestore.properties with the following contents to mount the tablestore connector as the tablestore catalog:
-
-.. code-block:: properties
-
- connector.name=tablestore
- warehouse=file:///tmp/warehouse
-
-Flink Table Store Operations
-------------------
-
-Flink Table Store supports reading table store tables through Trino.
-A common scenario is to write data with Flink and read data with Trino.
-You can follow this document `Flink Table Store Quick Start`_ to write data to a table store table
-and then use kyuubi trino sql engine to query the table with the following SQL ``SELECT`` statement.
-
-
-.. code-block:: sql
-
- SELECT * FROM tablestore.default.t1
-
-
-.. _Flink Table Store: https://nightlies.apache.org/flink/flink-table-store-docs-stable/
-.. _Flink Table Store Quick Start: https://nightlies.apache.org/flink/flink-table-store-docs-stable/docs/try-table-store/quick-start/
-.. _Official Documentation: https://nightlies.apache.org/flink/flink-table-store-docs-stable/
-.. _Source Code: https://github.com/JingsongLi/flink-table-store-trino
-.. _Flink Table Store multi engine support: https://nightlies.apache.org/flink/flink-table-store-docs-stable/docs/engines/overview/
-.. _Pre-bundled Hadoop 2.8.3: https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.8.3-10.0/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
-.. _Flink Table Store Trino README: https://github.com/JingsongLi/flink-table-store-trino#readme
diff --git a/docs/connector/trino/hudi.rst b/docs/connector/trino/hudi.rst
new file mode 100644
index 000000000..5c965a0b6
--- /dev/null
+++ b/docs/connector/trino/hudi.rst
@@ -0,0 +1,80 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+`Hudi`_
+========
+
+Apache Hudi (pronounced “hoodie”) is the next generation streaming data lake platform.
+Apache Hudi brings core warehouse and database functionality directly to a data lake.
+
+.. tip::
+ This article assumes that you have mastered the basic knowledge and operation of `Hudi`_.
+ For the knowledge about Hudi not mentioned in this article,
+ you can obtain it from its `Official Documentation`_.
+
+By using Kyuubi, we can run SQL queries against Hudi, which is more convenient, easier to understand,
+and easier to extend than using Trino directly to manipulate Hudi.
+
+Hudi Integration
+----------------
+
+To enable the integration of the Kyuubi Trino SQL engine and Hudi, you need to:
+
+- Set the Trino extension and catalog :ref:`configurations<trino-hudi-conf>`
+
+.. _trino-hudi-conf:
+
+Configurations
+**************
+
+Catalogs are registered by creating a catalog properties file in the ``$TRINO_SERVER_HOME/etc/catalog`` directory.
+For example, we can create ``$TRINO_SERVER_HOME/etc/catalog/hudi.properties`` with the following contents to mount the Hudi connector as a Hudi catalog:
+
+.. code-block:: properties
+
+ connector.name=hudi
+ hive.metastore.uri=thrift://example.net:9083
+
+Note: You need to replace ``$TRINO_SERVER_HOME`` above with your Trino server home path, such as ``/opt/trino-server-406``.
+
+More configuration properties can be found in the `Hudi connector in Trino document`_.
+
+.. tip::
+ With Trino version 398 or higher, it is recommended to use the Hudi connector.
+ No additional dependencies need to be installed in these versions.
+
+Hudi Operations
+---------------
+Trino supports the globally available statements and the read operation statements for Hudi tables.
+These statements can be found in `Trino SQL Support`_.
+Currently, Trino cannot write data to a Hudi table.
+A common scenario is to write data with Spark/Flink and read data with Trino.
+You can use the Kyuubi Trino SQL engine to query the table with the following SQL ``SELECT`` statement.
+
+Taking ``Query Data`` as an example:
+
+.. code-block:: sql
+
+ USE example.example_schema;
+
+ SELECT symbol, max(ts)
+ FROM stock_ticks_cow
+ GROUP BY symbol
+ HAVING symbol = 'GOOG';
+
+.. _Hudi: https://hudi.apache.org/
+.. _Official Documentation: https://hudi.apache.org/docs/overview
+.. _Hudi connector in Trino document: https://trino.io/docs/current/connector/hudi.html
+.. _Trino SQL Support: https://trino.io/docs/current/language/sql-support.html#
diff --git a/docs/connector/trino/index.rst b/docs/connector/trino/index.rst
index a5c5675ce..290966a5c 100644
--- a/docs/connector/trino/index.rst
+++ b/docs/connector/trino/index.rst
@@ -19,5 +19,6 @@ Connectors For Trino SQL Engine
.. toctree::
:maxdepth: 2
- flink_table_store
+ paimon
+ hudi
iceberg
\ No newline at end of file
diff --git a/docs/connector/trino/paimon.rst b/docs/connector/trino/paimon.rst
new file mode 100644
index 000000000..5ac892234
--- /dev/null
+++ b/docs/connector/trino/paimon.rst
@@ -0,0 +1,92 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+`Apache Paimon (Incubating)`_
+=============================
+
+Apache Paimon (Incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking, and efficient real-time analytics.
+
+.. tip::
+ This article assumes that you have mastered the basic knowledge and operation of `Apache Paimon (Incubating)`_.
+ For the knowledge about Apache Paimon (Incubating) not mentioned in this article,
+ you can obtain it from its `Official Documentation`_.
+
+By using Kyuubi, we can run SQL queries against Apache Paimon (Incubating), which is more
+convenient, easier to understand, and easier to extend than using
+Trino directly to manipulate Apache Paimon (Incubating).
+
+Apache Paimon (Incubating) Integration
+--------------------------------------
+
+To enable the integration of the Kyuubi Trino SQL engine and Apache Paimon (Incubating), you need to:
+
+- Reference the Apache Paimon (Incubating) :ref:`dependencies<trino-paimon-deps>`
+- Set the Trino extension and catalog :ref:`configurations<trino-paimon-conf>`
+
+.. _trino-paimon-deps:
+
+Dependencies
+************
+
+The **classpath** of the Kyuubi Trino SQL engine with Apache Paimon (Incubating) support consists of:
+
+1. kyuubi-trino-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with Kyuubi distributions
+2. a copy of the Trino distribution
+3. paimon-trino-<version>.jar (example: paimon-trino-0.2.jar), whose code can be found in the `Source Code`_
+4. flink-shaded-hadoop-2-uber-<version>.jar, which can be obtained from the `Pre-bundled Hadoop`_ page
+
+To make the Apache Paimon (Incubating) packages visible to the runtime classpath of engines, you need to:
+
+1. Build the paimon-trino-<version>.jar by following the `Apache Paimon (Incubating) Trino README`_
+2. Put the paimon-trino-<version>.jar and flink-shaded-hadoop-2-uber-<version>.jar packages into ``$TRINO_SERVER_HOME/plugin/tablestore`` directly
+
+.. warning::
+ Mind the compatibility between Apache Paimon (Incubating) and Trino versions, which can be checked on the `Apache Paimon (Incubating) multi engine support`_ page.
+
+.. _trino-paimon-conf:
+
+Configurations
+**************
+
+To activate the functionality of Apache Paimon (Incubating), we can set the following configurations:
+
+Catalogs are registered by creating a catalog properties file in the ``$TRINO_SERVER_HOME/etc/catalog`` directory.
+For example, create ``$TRINO_SERVER_HOME/etc/catalog/tablestore.properties`` with the following contents to mount the tablestore connector as the tablestore catalog:
+
+.. code-block:: properties
+
+ connector.name=tablestore
+ warehouse=file:///tmp/warehouse
+
+Apache Paimon (Incubating) Operations
+-------------------------------------
+
+Apache Paimon (Incubating) supports reading table store tables through Trino.
+A common scenario is to write data with Spark or Flink and read data with Trino.
+You can follow the `Apache Paimon (Incubating) Engines Flink Quick Start`_ document to write data to a table store table
+and then use the Kyuubi Trino SQL engine to query the table with the following SQL ``SELECT`` statement.
+
+
+.. code-block:: sql
+
+ SELECT * FROM tablestore.default.t1
+
+.. _Apache Paimon (Incubating): https://paimon.apache.org/
+.. _Apache Paimon (Incubating) multi engine support: https://paimon.apache.org/docs/master/engines/overview/
+.. _Apache Paimon (Incubating) Engines Flink Quick Start: https://paimon.apache.org/docs/master/engines/flink/#quick-start
+.. _Official Documentation: https://paimon.apache.org/docs/master/
+.. _Source Code: https://github.com/JingsongLi/paimon-trino
+.. _Pre-bundled Hadoop: https://flink.apache.org/downloads/#additional-components
+.. _Apache Paimon (Incubating) Trino README: https://github.com/JingsongLi/paimon-trino#readme
diff --git a/docs/develop_tools/building.md b/docs/contributing/code/building.md
similarity index 90%
rename from docs/develop_tools/building.md
rename to docs/contributing/code/building.md
index 9dfc01f42..8c5c5aeec 100644
--- a/docs/develop_tools/building.md
+++ b/docs/contributing/code/building.md
@@ -15,11 +15,11 @@
- limitations under the License.
-->
-# Building Kyuubi
+# Building From Source
-## Building Kyuubi with Apache Maven
+## Building With Maven
-**Kyuubi** is built based on [Apache Maven](http://maven.apache.org),
+**Kyuubi** is built based on [Apache Maven](https://maven.apache.org),
```bash
./build/mvn clean package -DskipTests
@@ -33,7 +33,7 @@ If you want to test it manually, you can start Kyuubi directly from the Kyuubi p
bin/kyuubi start
```
-## Building a Submodule Individually
+## Building A Submodule Individually
For instance, you can build the Kyuubi Common module using:
@@ -49,7 +49,7 @@ For instance, you can build the Kyuubi Common module using:
build/mvn clean package -pl kyuubi-common,kyuubi-ha -DskipTests
```
-## Skipping Some modules
+## Skipping Some Modules
For instance, you can build the Kyuubi modules without Kyuubi Codecov and Assembly modules using:
@@ -57,7 +57,7 @@ For instance, you can build the Kyuubi modules without Kyuubi Codecov and Assemb
mvn clean install -pl '!dev/kyuubi-codecov,!kyuubi-assembly' -DskipTests
```
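In interactive shells such as bash, an unquoted `!` can trigger history expansion, which is why the exclusion list above is wrapped in single quotes. As a minimal sketch, the helper below assembles such a `-pl` argument from plain module names. `exclude_list` is a hypothetical name and not part of Kyuubi; the snippet only prints the resulting command instead of running it.

```shell
# Hypothetical helper, not shipped with Kyuubi: turn module names into
# a Maven exclusion list such as '!a,!b' for the -pl option.
exclude_list() {
  list=""
  for module in "$@"; do
    if [ -z "$list" ]; then
      list='!'"$module"
    else
      list="$list"',!'"$module"
    fi
  done
  printf '%s\n' "$list"
}

# Print (not run) the command that skips the two modules named above.
echo "mvn clean install -pl '$(exclude_list dev/kyuubi-codecov kyuubi-assembly)' -DskipTests"
```

Printing the command first makes it easy to review the module list before starting a long build.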
-## Building Kyuubi against Different Apache Spark versions
+## Building Kyuubi Against Different Apache Spark Versions
Since v1.1.0, Kyuubi supports building with different Spark profiles,
@@ -67,7 +67,7 @@ Since v1.1.0, Kyuubi support building with different Spark profiles,
| -Pspark-3.2 | No | 1.4.0 |
| -Pspark-3.3 | Yes | 1.6.0 |
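To select one of these profiles, pass its flag to the build command. The sketch below is illustrative only: `spark_profile_flag` is a hypothetical helper, it covers just the two profiles visible in the rows above, and it prints the command rather than running it.

```shell
# Hypothetical helper, not part of Kyuubi: map a Spark version to the
# matching Maven profile flag. Only the profiles shown in the rows
# above (3.2 and 3.3) are covered; anything else is rejected.
spark_profile_flag() {
  case "$1" in
    3.2|3.3) printf '%s\n' "-Pspark-$1" ;;
    *) echo "no profile known for Spark $1" >&2; return 1 ;;
  esac
}

# Print (not run) a build command for the Spark 3.2 profile.
echo "build/mvn clean package $(spark_profile_flag 3.2) -DskipTests"
```

Rejecting unknown versions up front avoids starting a build with a profile name that Maven would not recognize.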
-## Building with Apache dlcdn site
+## Building With Apache dlcdn Site
By default, we use `https://archive.apache.org/dist/` to download the built-in release packages of engines,
such as Spark or Flink.
diff --git a/docs/develop_tools/debugging.md b/docs/contributing/code/debugging.md
similarity index 98%
rename from docs/develop_tools/debugging.md
rename to docs/contributing/code/debugging.md
index faf7173e4..d3fb6d16f 100644
--- a/docs/develop_tools/debugging.md
+++ b/docs/contributing/code/debugging.md
@@ -35,7 +35,7 @@ In the IDE, you set the corresponding parameters(host&port) in debug configurati
diff --git a/docs/develop_tools/developer.md b/docs/contributing/code/developer.md
similarity index 70%
rename from docs/develop_tools/developer.md
rename to docs/contributing/code/developer.md
index 329e219de..518d71871 100644
--- a/docs/develop_tools/developer.md
+++ b/docs/contributing/code/developer.md
@@ -24,16 +24,6 @@
build/mvn versions:set -DgenerateBackupPoms=false
```
-## Update Document Version
-
-Whenever project version updates, please also update the document version at `docs/conf.py` to target the upcoming release.
-
-For example,
-
-```python
-release = '1.2.0'
-```
-
## Update Dependency List
Kyuubi uses the `dev/dependencyList` file to indicate what upstream dependencies will actually go to the server-side classpath.
@@ -56,5 +46,13 @@ You can run `dev/reformat` to format all Java and Scala code.
Kyuubi uses settings.md to explain available configurations.
-You can run `KYUUBI_UPDATE=1 build/mvn clean test -pl kyuubi-server -am -Pflink-provided,spark-provided,hive-provided -DwildcardSuites=org.apache.kyuubi.config.AllKyuubiConfiguration`
-to append descriptions of new configurations to settings.md.
+You can run `dev/gen/gen_all_config_docs.sh` to add descriptions of new configurations to `settings.md` and update existing ones.
+
+## Generative Tooling Usage
+
+In general, the ASF allows contributions co-authored using generative AI tools. However, there are several considerations when you submit a patch containing generated content.
+
+Foremost, you are required to disclose the usage of such tools. Furthermore, you are responsible for ensuring that the terms and conditions of the tool in question are
+compatible with usage in an Open Source project and that the inclusion of the generated content doesn't pose a risk of copyright violation.
+
+Please refer to [The ASF Generative Tooling Guidance](https://www.apache.org/legal/generative-tooling.html) for more detailed information.
diff --git a/docs/develop_tools/distribution.md b/docs/contributing/code/distribution.md
similarity index 86%
rename from docs/develop_tools/distribution.md
rename to docs/contributing/code/distribution.md
index abc2ac91b..23c9c6542 100644
--- a/docs/develop_tools/distribution.md
+++ b/docs/contributing/code/distribution.md
@@ -15,7 +15,7 @@
- limitations under the License.
-->
-# Building a Runnable Distribution
+# Building A Runnable Distribution
To create a Kyuubi distribution like those distributed by [Kyuubi Release Page](https://kyuubi.apache.org/releases.html),
and that is laid out to be runnable, use `./build/dist` in the project root directory.
@@ -26,15 +26,16 @@ For more information on usage, run `./build/dist --help`
./build/dist - Tool for making binary distributions of Kyuubi
Usage:
-+------------------------------------------------------------------------------------------------------+
-| ./build/dist [--name ] [--tgz] [--flink-provided] [--spark-provided] [--hive-provided] |
-| [--mvn ] |
-+------------------------------------------------------------------------------------------------------+
++----------------------------------------------------------------------------------------------+
+| ./build/dist [--name ] [--tgz] [--web-ui] [--flink-provided] [--hive-provided] |
+| [--spark-provided] [--mvn ] |
++----------------------------------------------------------------------------------------------+
name: - custom binary name, using project version if undefined
tgz: - whether to make a whole bundled package
+web-ui: - whether to include web ui
flink-provided: - whether to make a package without Flink binary
-spark-provided: - whether to make a package without Spark binary
hive-provided: - whether to make a package without Hive binary
+spark-provided: - whether to make a package without Spark binary
mvn: - external maven executable location
```
diff --git a/docs/contributing/code/get_started.rst b/docs/contributing/code/get_started.rst
new file mode 100644
index 000000000..0dcd90304
--- /dev/null
+++ b/docs/contributing/code/get_started.rst
@@ -0,0 +1,95 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+Get Started
+===========
+
+Good First Issues
+-----------------
+
+.. image:: https://img.shields.io/github/issues/apache/kyuubi/good%20first%20issue?color=green&label=Good%20first%20issue&logo=gfi&logoColor=red&style=for-the-badge
+ :alt: GitHub issues by-label
+ :target: `Good First Issues`_
+
+**Good First Issue** is an initiative to curate easy pickings for first-time
+contributors. It helps you locate suitable development tasks that require only
+beginner-level skills, and finally make your first contribution to Kyuubi.
+
+After solving one or more good first issues, you should be able to
+
+- Find efficient ways to communicate with the community and get help
+- Set up a `develop environment`_ on your machine
+- `Build`_ Kyuubi from source
+- `Run tests`_ locally
+- `Submit a pull request`_ through GitHub
+- Be listed in `Apache Kyuubi contributors`_
+- And most importantly, you can move to the next level and try some tricky issues
+
+.. note:: Don't linger too long at this stage.
+ :class: dropdown, toggle
+
+Help Wanted Issues
+------------------
+
+.. image:: https://img.shields.io/github/issues/apache/kyuubi/help%20wanted?color=brightgreen&label=HELP%20WANTED&style=for-the-badge
+ :alt: GitHub issues by-label
+ :target: `Help Wanted Issues`_
+
+Issues that maintainers labeled as help wanted are mostly
+
+- sub-tasks of an ongoing shorthanded umbrella
+- non-urgent improvements
+- bug fixes for corner cases
+- feature requests not covered by the current technology stack of the Kyuubi community
+
+Since these problems are not urgent, you can take your time when fixing them.
+
+.. note:: Help wanted issues may contain easy pickings and tricky ones.
+ :class: dropdown, toggle
+
+
+Code Contribution Programs
+--------------------------
+
+The Kyuubi Code Program is a **semi-annual** and **annual** coding program. Each
+round lasts 2 months, and the first round will start in October 2023.
+
+The program is open to all contributors and newbie-friendly as it will provide
+a mentor to help you get through the sub-tasks.
+
+You will be rewarded with Kyuubi swag, such as a Kyuubi Contributor T-shirt,
+after you complete the program.
+
+.. image:: https://img.shields.io/badge/Kyuubi%20Code%20Program-2024H1-blue?style=for-the-badge
+
+- Status: Planning
+- Duration: 2024.02.01 - 2024.04.01
+- Sponsors: (keeping seats vacant in anticipation)
+
+.. image:: https://img.shields.io/badge/Kyuubi%20Code%20Program-2023-blue?style=for-the-badge
+ :target: https://github.com/apache/kyuubi/issues/5357
+
+- Status: In Progress
+- Duration: 2023.10.01 - 2023.12.01
+- Sponsors: NetEase
+
+.. _Good First Issues: https://github.com/apache/kyuubi/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22
+.. _develop environment: idea_setup.html
+.. _Build: building.html
+.. _Run tests: testing.html
+.. _Submit a pull request: https://kyuubi.apache.org/pull_request.html
+.. _Apache Kyuubi contributors: https://github.com/apache/kyuubi/graphs/contributors
+.. _Help Wanted Issues: https://github.com/apache/kyuubi/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22
+
diff --git a/docs/develop_tools/idea_setup.md b/docs/contributing/code/idea_setup.md
similarity index 100%
rename from docs/develop_tools/idea_setup.md
rename to docs/contributing/code/idea_setup.md
diff --git a/docs/develop_tools/index.rst b/docs/contributing/code/index.rst
similarity index 84%
rename from docs/develop_tools/index.rst
rename to docs/contributing/code/index.rst
index c56321cb3..25a6e421b 100644
--- a/docs/develop_tools/index.rst
+++ b/docs/contributing/code/index.rst
@@ -13,15 +13,19 @@
See the License for the specific language governing permissions and
limitations under the License.
-Develop Tools
-=============
+Contributing Code
+=================
+
+These sections explain the process, guidelines, and tools for contributing
+code to the Kyuubi project.
.. toctree::
:maxdepth: 2
+ get_started
+ style
building
distribution
- build_document
testing
debugging
developer
diff --git a/docs/contributing/code/style.rst b/docs/contributing/code/style.rst
new file mode 100644
index 000000000..d967e8959
--- /dev/null
+++ b/docs/contributing/code/style.rst
@@ -0,0 +1,39 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+Code Style Guide
+================
+
+Code is written once by its author, but read and modified multiple times by
+lots of other engineers. As most bugs actually come from future modification
+of the code, we need to optimize our codebase for long-term, global
+readability and maintainability. The best way to achieve this is to write
+simple code.
+
+Kyuubi's source code is written in multiple languages, and a specific code
+style applies to each language.
+
+Scala Coding Style Guide
+------------------------
+
+Kyuubi adopts the `Databricks Scala Coding Style Guide`_ for Scala code.
+
+Java Coding Style Guide
+-----------------------
+
+Kyuubi adopts the `Google Java style`_ for Java code.
+
+.. _Databricks Scala Coding Style Guide: https://github.com/databricks/scala-style-guide
+.. _Google Java style: https://google.github.io/styleguide/javaguide.html
\ No newline at end of file
diff --git a/docs/develop_tools/testing.md b/docs/contributing/code/testing.md
similarity index 87%
rename from docs/develop_tools/testing.md
rename to docs/contributing/code/testing.md
index 48a2e9787..3e63aa1a2 100644
--- a/docs/develop_tools/testing.md
+++ b/docs/contributing/code/testing.md
@@ -17,8 +17,8 @@
# Running Tests
-**Kyuubi** can be tested based on [Apache Maven](http://maven.apache.org) and the ScalaTest Maven Plugin,
-please refer to the [ScalaTest documentation](http://www.scalatest.org/user_guide/using_the_scalatest_maven_plugin),
+**Kyuubi** can be tested based on [Apache Maven](https://maven.apache.org) and the ScalaTest Maven Plugin,
+please refer to the [ScalaTest documentation](https://www.scalatest.org/user_guide/using_the_scalatest_maven_plugin),
## Running Tests Fully
diff --git a/docs/contributing/doc/build.rst b/docs/contributing/doc/build.rst
new file mode 100644
index 000000000..4ec2362f3
--- /dev/null
+++ b/docs/contributing/doc/build.rst
@@ -0,0 +1,96 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+Building Documentation
+======================
+
+Follow the steps below to build the Kyuubi documentation, like the
+one you are reading now.
+
+Setup Environment
+-----------------
+
+- First, install ``virtualenv``. This is optional but recommended, as it creates
+ an independent environment that avoids dependency issues when building
+ the documentation.
+
+.. code-block:: sh
+ :caption: Install virtualenv
+
+ $ pip install virtualenv
+
+- Switch to the ``docs`` root directory.
+
+.. code-block:: sh
+ :caption: Switch to docs
+
+ $ cd $KYUUBI_SOURCE_PATH/docs
+
+- Create a virtual environment named ``kyuubi`` (or any name you like) using ``virtualenv``
+ if it does not exist yet.
+
+.. code-block:: sh
+ :caption: New virtual environment
+
+ $ virtualenv kyuubi
+
+- Activate the virtual environment.
+
+.. code-block:: sh
+ :caption: Activate virtual environment
+
+ $ source ./kyuubi/bin/activate
+
+Install All Dependencies
+------------------------
+
+Install all dependencies enumerated in the ``requirements.txt``.
+
+.. code-block:: sh
+ :caption: Install dependencies
+
+ $ pip install -r requirements.txt
+
+
+Create Documentation
+--------------------
+
+Make sure you are in the ``$KYUUBI_SOURCE_PATH/docs`` directory.
+
+Linux & macOS
+~~~~~~~~~~~~~
+
+.. code-block:: sh
+ :caption: Sphinx build on Unix-like OS
+
+ $ make html
+
+Windows
+~~~~~~~
+
+.. code-block:: sh
+ :caption: Sphinx build on Windows
+
+ $ make.bat html
+
+
+If the build process succeeds, the HTML pages are in
+``$KYUUBI_SOURCE_PATH/docs/_build/html``.
+
+View Locally
+------------
+
+Open the ``$KYUUBI_SOURCE_PATH/docs/_build/html/index.html`` file in your
+favorite web browser.
diff --git a/docs/contributing/doc/get_started.rst b/docs/contributing/doc/get_started.rst
new file mode 100644
index 000000000..f262695b7
--- /dev/null
+++ b/docs/contributing/doc/get_started.rst
@@ -0,0 +1,117 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+Get Started
+===========
+
+.. image:: https://img.shields.io/github/issues/apache/kyuubi/kind:documentation?color=green&logo=gfi&logoColor=red&style=for-the-badge
+ :alt: GitHub issues by-label
+
+
+Trivial Fixes
+-------------
+
+For typos, layout, grammar, spelling, punctuation errors and other similar issues
+or changes that occur within a single file, it is acceptable to make edits directly
+on the page being viewed. When viewing a source file on Kyuubi's
+`Github repository`_, a simple click on the ``edit icon`` or keyboard shortcut
+``e`` will activate the editor. Similarly, when viewing files on `Read The Docs`_
+platform, clicking on the ``suggest edit`` button will lead you to the editor.
+These methods do not require any local development environment setup and
+are convenient for making quick fixes.
+
+Upon completion of the editing process, select the ``commit changes`` option,
+adhere to the provided instructions to submit a pull request,
+and await feedback from the designated reviewer.
+
+Major Fixes
+-----------
+
+For significant modifications that affect multiple files, it is advisable to
+clone the repository to a local development environment, implement the necessary
+changes, and conduct thorough testing prior to submitting a pull request.
+
+
+`Fork`_ The Repository
+~~~~~~~~~~~~~~~~~~~~~~
+
+Clone The Forked Repository
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. code-block::
+ :caption: Clone the repository
+
+ $ git clone https://github.com/your_username/kyuubi.git
+
+Replace "your_username" with your GitHub username. This will create a local
+copy of your forked repository on your machine. You will see the ``master``
+branch if you run ``git branch`` in the ``kyuubi`` folder.
+
+Create A New Branch
+~~~~~~~~~~~~~~~~~~~
+
+.. code-block::
+ :caption: Create a new branch
+
+ $ git checkout -b guide
+ Switched to a new branch 'guide'
+
+Editing And Testing
+~~~~~~~~~~~~~~~~~~~
+
+Make the necessary changes to the documentation files using a text editor.
+`Build and verify`_ the changes you have made to see if they look fine.
+
+.. note::
+ :class: dropdown, toggle
+
+Create A Pull Request
+~~~~~~~~~~~~~~~~~~~~~
+
+Once you have made the changes,
+
+- Commit them with a descriptive commit message using the command:
+
+.. code-block::
+ :caption: commit the changes
+
+ $ git commit -m "Description of changes made"
+
+- Push the changes to your forked repository using the command:
+
+.. code-block::
+ :caption: push the changes
+
+ $ git push origin guide
+
+- `Create A Pull Request`_ with a descriptive PR title and description.
+
+- Polish the PR by addressing the review comments.
+
+Report Only
+-----------
+
+If you don't have time to fix the doc issue and submit a pull request on your own,
+`reporting a document issue`_ also helps. Please follow some basic rules:
+
+- Use the title field to clearly describe the issue
+- Choose the documentation report template
+- Fill out the required fields in the documentation report
+
+.. _Home Page: https://kyuubi.apache.org
+.. _Fork: https://github.com/apache/kyuubi/fork
+.. _Build and verify: build.html
+.. _Create A Pull Request: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request
+.. _reporting a document issue: https://github.com/apache/kyuubi/issues/new/choose
\ No newline at end of file
diff --git a/docs/contributing/doc/index.rst b/docs/contributing/doc/index.rst
new file mode 100644
index 000000000..bf6ae41bd
--- /dev/null
+++ b/docs/contributing/doc/index.rst
@@ -0,0 +1,44 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+Contributing Documentation
+==========================
+
+The project documentation is crucial for users and contributors. This guide
+outlines the contribution guidelines for Apache Kyuubi documentation.
+
+Kyuubi's documentation source files are maintained in the same `GitHub repository`_
+as the code base, which keeps code and documentation in sync.
+All documentation source files can be found in the sub-folder named ``docs``.
+
+Kyuubi's documentation is published and hosted on the `Read The Docs`_ platform,
+with each version having its own dedicated page. To access a specific version
+of the documentation, navigate to the "Docs" tab on our `Home Page`_.
+
+We welcome any contributions to the documentation, including but not limited to
+writing, translating, reporting documentation issues on GitHub, and reposting.
+
+
+.. toctree::
+ :maxdepth: 2
+
+ get_started
+ style
+ build
+
+.. _Github repository: https://github.com/apache/kyuubi
+.. _Restructured Text: https://en.wikipedia.org/wiki/ReStructuredText
+.. _Read The Docs: https://kyuubi.rtfd.io
+.. _Home Page: https://kyuubi.apache.org
\ No newline at end of file
diff --git a/docs/contributing/doc/style.rst b/docs/contributing/doc/style.rst
new file mode 100644
index 000000000..14cc2b8ac
--- /dev/null
+++ b/docs/contributing/doc/style.rst
@@ -0,0 +1,135 @@
+.. Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+
+Documentation Style Guide
+=========================
+
+This guide contains guidelines, not rules. While guidelines are important
+to follow, they are not hard and fast rules. It's important to use your
+own judgement and discretion when creating content, and to depart from the
+guidelines when necessary to improve the quality and effectiveness of your
+content. Ultimately, the goal is to create content that is clear, concise,
+and useful to your audience, and sometimes deviating from the guidelines
+may be necessary to achieve that goal.
+
+Goals
+-----
+
+- Source text files are readable and portable
+- Source diagram files are editable
+- Source files are maintainable over time and across the community
+
+License Header
+--------------
+
+All original documents should include the ASF license header. All reproduced
+or quoted content should be authorized and attributed to the source.
+
+If you intend to quote commercial materials, please refer to
+`ASF 3RD PARTY LICENSE POLICY`_, or consult the Apache Kyuubi PMC to avoid
+legal issues.
+
+General Style
+-------------
+
+- Use `ReStructuredText`_ or `Markdown`_ format for text, avoid HTML hacks
+- Use `draw.io`_ to draw or edit an image, and export it as PNG for
+ referencing in documents. A pull request should commit both files
+- Use Kyuubi for short instead of Apache Kyuubi after the first time in the
+ same page
+- Character line limit: 78, except unbreakable ones
+- Prefer lists to tables
+- Prefer unordered lists to ordered lists
+
+ReStructuredText
+----------------
+
+Headings
+~~~~~~~~
+
+- Use **Pascal Case**, every word starts with an uppercase letter,
+ e.g., 'Documentation Style Guide'
+- Use a max of **three levels**
+ - Split into multiple files when an H4 would be needed
+ - Prefer `directive rubric`_ to H4
+- Use underline-only adornment styles, **DO NOT** use overline
+ - The length of underline characters **SHOULD** match the title
+ - H1 should be underlined with '='
+ - H2 should be underlined with '-'
+ - H3 should be underlined with '~'
+ - H4 should be underlined with '^', but it's better to avoid using H4
+- **DO NOT** use numbering for sections
+- **DO NOT** use "Kyuubi" in titles if possible
+
+Links
+~~~~~
+
+- Define links with short descriptive phrases, group them at the bottom of the file
+
+.. note::
+ :class: dropdown, toggle
+
+ .. code-block::
+ :caption: Recommended
+
+ Please refer to `Apache Kyuubi Home Page`_.
+
+ .. _Apache Kyuubi Home Page: https://kyuubi.apache.org/
+
+ .. code-block::
+ :caption: Not recommended
+
+ Please refer to `Apache Kyuubi Home Page <https://kyuubi.apache.org/>`_.
+
+
+Markdown
+--------
+
+Headings
+~~~~~~~~
+
+- Use **Pascal Case**, every word starts with an uppercase letter,
+ e.g., 'Documentation Style Guide'
+- Use a max of **three levels**
+ - Split into multiple files when an H4 would be needed
+- **DO NOT** use numbering for sections
+- **DO NOT** use "Kyuubi" in titles if possible
+
+Images
+------
+
+Use images only when they provide helpful visual explanations of information
+otherwise difficult to express with words.
+
+Third-Party References
+----------------------
+
+If the preceding references don't provide explicit guidance, then see these
+third-party references, depending on the nature of your question:
+
+- `Google developer documentation style`_
+- `Apple Style Guide`_
+- `Red Hat supplementary style guide for product documentation`_
+
+.. References
+
+.. _ASF 3RD PARTY LICENSE POLICY: https://www.apache.org/legal/resolved.html#asf-3rd-party-license-policy
+.. _directive rubric: https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html#directive-rubric
+.. _ReStructuredText: https://docutils.sourceforge.io/rst.html
+.. _Markdown: https://en.wikipedia.org/wiki/Markdown
+.. _draw.io: https://www.diagrams.net/
+.. _Google developer documentation style: https://developers.google.com/style
+.. _Apple Style Guide: https://help.apple.com/applestyleguide/
+.. _Red Hat supplementary style guide for product documentation: https://redhat-documentation.github.io/supplementary-style-guide/
diff --git a/docs/deployment/engine_on_kubernetes.md b/docs/deployment/engine_on_kubernetes.md
index ae8edcb75..a8f7c6ca0 100644
--- a/docs/deployment/engine_on_kubernetes.md
+++ b/docs/deployment/engine_on_kubernetes.md
@@ -21,7 +21,7 @@
When you want to run Kyuubi's Spark SQL engines on Kubernetes, you'd better have cognition upon the following things.
-* Read about [Running Spark On Kubernetes](http://spark.apache.org/docs/latest/running-on-kubernetes.html)
+* Read about [Running Spark On Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html)
* An active Kubernetes cluster
* [Kubectl](https://kubernetes.io/docs/reference/kubectl/overview/)
* KubeConfig of the target cluster
@@ -36,6 +36,17 @@ Spark on Kubernetes config master by using a special format.
You can use cmd `kubectl cluster-info` to get api-server host and port.
+### Deploy Mode
+
+One of the main advantages of the Kyuubi server compared to other interactive Spark clients is that it supports cluster deploy mode.
+It is highly recommended to run Spark on Kubernetes in cluster mode.
+
+The minimum required configurations are:
+
+* spark.submit.deployMode (cluster)
+* spark.kubernetes.file.upload.path (path on S3 or HDFS)
+* spark.kubernetes.authenticate.driver.serviceAccountName (see [ServiceAccount](#serviceaccount))
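+
+As a minimal sketch, the three settings above could look like this in the engine's Spark configuration. The bucket path and the service account name are placeholders (assumptions), not values from this guide; replace them with your own.
+
+```properties
+# Run the Spark driver inside the Kubernetes cluster
+spark.submit.deployMode=cluster
+# Placeholder upload location; use any S3 or HDFS path the submitter can write to
+spark.kubernetes.file.upload.path=s3a://<your-bucket>/<upload-dir>
+# Placeholder service account name; see the ServiceAccount section
+spark.kubernetes.authenticate.driver.serviceAccountName=<your-service-account>
+```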
+
### Docker Image
Spark ships a `./bin/docker-image-tool.sh` script to build and publish the Docker images for running Spark applications on Kubernetes.
@@ -97,7 +108,7 @@ As it known to us all, Kubernetes can use configurations to mount volumes into d
* persistentVolumeClaim: mounts a PersistentVolume into a pod.
Note: Please
-see [the Security section of this document](http://spark.apache.org/docs/latest/running-on-kubernetes.html#security) for security issues related to volume mounts.
+see [the Security section of this document](https://spark.apache.org/docs/latest/running-on-kubernetes.html#security) for security issues related to volume mounts.
```
spark.kubernetes.driver.volumes...options.path=
@@ -107,7 +118,7 @@ spark.kubernetes.executor.volumes...options.path=
spark.kubernetes.executor.volumes...mount.path=
```
-Read [Using Kubernetes Volumes](http://spark.apache.org/docs/latest/running-on-kubernetes.html#using-kubernetes-volumes) for more about volumes.
+Read [Using Kubernetes Volumes](https://spark.apache.org/docs/latest/running-on-kubernetes.html#using-kubernetes-volumes) for more about volumes.
### PodTemplateFile
@@ -117,4 +128,4 @@ To do so, specify the spark properties `spark.kubernetes.driver.podTemplateFile`
### Other
-You can read Spark's official documentation for [Running on Kubernetes](http://spark.apache.org/docs/latest/running-on-kubernetes.html) for more information.
+You can read Spark's official documentation for [Running on Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html) for more information.
diff --git a/docs/deployment/engine_on_yarn.md b/docs/deployment/engine_on_yarn.md
index cb5bdd9e0..1025418d9 100644
--- a/docs/deployment/engine_on_yarn.md
+++ b/docs/deployment/engine_on_yarn.md
@@ -15,19 +15,19 @@
- limitations under the License.
-->
-# Deploy Kyuubi engines on Yarn
+# Deploy Kyuubi engines on YARN
-## Deploy Kyuubi Spark Engine on Yarn
+## Deploy Kyuubi Spark Engine on YARN
### Requirements
-When you want to deploy Kyuubi's Spark SQL engines on YARN, you'd better have cognition upon the following things.
+To deploy Kyuubi's Spark SQL engines on YARN, you should be familiar with the following.
-- Knowing the basics about [Running Spark on YARN](http://spark.apache.org/docs/latest/running-on-yarn.html)
+- Knowing the basics about [Running Spark on YARN](https://spark.apache.org/docs/latest/running-on-yarn.html)
- A binary distribution of Spark which is built with YARN support
- You can use the built-in Spark distribution
- You can get it from [Spark official website](https://spark.apache.org/downloads.html) directly
- - You can [Build Spark](http://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn) with `-Pyarn` maven option
+ - You can [Build Spark](https://spark.apache.org/docs/latest/building-spark.html#specifying-the-hadoop-version-and-enabling-yarn) with `-Pyarn` maven option
- An active [Apache Hadoop YARN](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html) cluster
- An active Apache Hadoop HDFS cluster
- Setup Hadoop client configurations at the machine the Kyuubi server locates
@@ -92,7 +92,7 @@ and how many cpus and memory will Spark driver, ApplicationMaster and each execu
| spark.executor.memory | 1g | Amount of memory to use for the executor process |
| spark.executor.memoryOverhead | executorMemory * 0.10, with minimum of 384 | Amount of additional memory to be allocated per executor process. This is memory that accounts for things like VM overheads, interned strings other native overheads, etc |
-It is recommended to use [Dynamic Allocation](http://spark.apache.org/docs/3.0.1/configuration.html#dynamic-allocation) with Kyuubi,
+It is recommended to use [Dynamic Allocation](https://spark.apache.org/docs/3.0.1/configuration.html#dynamic-allocation) with Kyuubi,
since the SQL engine will be long-running for a period, execute user's queries from clients periodically,
and the demand for computing resources is not the same for those queries.
It is better for Spark to release some executors when either the query is lightweight, or the SQL engine is being idled.
@@ -104,20 +104,20 @@ which allows YARN to cache it on nodes so that it doesn't need to be distributed
##### Others
-Please refer to [Spark properties](http://spark.apache.org/docs/latest/running-on-yarn.html#spark-properties) to check other acceptable configs.
+Please refer to [Spark properties](https://spark.apache.org/docs/latest/running-on-yarn.html#spark-properties) to check other acceptable configs.
### Kerberos
-Kyuubi currently does not support Spark's [YARN-specific Kerberos Configuration](http://spark.apache.org/docs/3.0.1/running-on-yarn.html#kerberos),
+Kyuubi currently does not support Spark's [YARN-specific Kerberos Configuration](https://spark.apache.org/docs/3.0.1/running-on-yarn.html#kerberos),
so `spark.kerberos.keytab` and `spark.kerberos.principal` should not use now.
Instead, you can schedule a periodically `kinit` process via `crontab` task on the local machine that hosts Kyuubi server or simply use [Kyuubi Kinit](settings.html#kinit).
-## Deploy Kyuubi Flink Engine on Yarn
+## Deploy Kyuubi Flink Engine on YARN
### Requirements
-When you want to deploy Kyuubi's Flink SQL engines on YARN, you'd better have cognition upon the following things.
+To deploy Kyuubi's Flink SQL engines on YARN, you should be familiar with the following.
- Knowing the basics about [Running Flink on YARN](https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/resource-providers/yarn)
- A binary distribution of Flink which is built with YARN support
@@ -127,13 +127,59 @@ When you want to deploy Kyuubi's Flink SQL engines on YARN, you'd better have co
- An active Object Storage cluster, e.g. [HDFS](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html), S3 and [Minio](https://min.io/) etc.
- Setup Hadoop client configurations at the machine the Kyuubi server locates
-### Yarn Session Mode
+### Flink Deployment Modes
+
+Currently, Kyuubi's Flink engine supports two deployment modes on YARN: [YARN Application Mode](https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/resource-providers/yarn/#application-mode) and [YARN Session Mode](https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/deployment/resource-providers/yarn/#session-mode).
+
+- YARN Application Mode: In this mode, Kyuubi starts a dedicated Flink application cluster and runs the SQL engine on it.
+- YARN Session Mode: In this mode, Kyuubi starts the Flink SQL engine locally and connects to a running Flink YARN session cluster.
+
+As Kyuubi has to know the deployment mode before starting the SQL engine, it's required to specify the deployment mode in Kyuubi configuration.
+
+```properties
+# candidates: yarn-application, yarn-session
+flink.execution.target=yarn-application
+```
+
+### YARN Application Mode
+
+#### Flink Configurations
+
+Since the Flink SQL engine runs inside the JobManager, it's recommended to tune the resource configurations of the JobManager based on your workload.
+
+The related Flink configurations are listed below (see more details at [Flink Configuration](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#yarn)):
+
+| Name | Default | Meaning |
+|--------------------------------|---------|----------------------------------------------------------------------------------------|
+| yarn.appmaster.vcores | 1 | The number of virtual cores (vcores) used by the JobManager (YARN application master). |
+| jobmanager.memory.process.size | (none) | Total size of the memory of the JobManager process. |
+
+Note that Flink application mode does not currently support HA for multiple jobs, and this also applies to Kyuubi's Flink SQL engine. If the JobManager fails and restarts, the submitted jobs are not recovered and must be re-submitted.
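+
+A minimal sketch of such tuning in the Kyuubi configuration, assuming Flink options are passed with the same `flink.` prefix as `flink.execution.target` above. The values are illustrative assumptions, not recommendations.
+
+```properties
+flink.execution.target=yarn-application
+# Illustrative values only; size them for your workload
+flink.yarn.appmaster.vcores=2
+flink.jobmanager.memory.process.size=2g
+```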
+
+#### Environment
+
+Either `HADOOP_CONF_DIR` or `YARN_CONF_DIR` must be configured and point to the Hadoop client configuration directory, usually `$HADOOP_HOME/etc/hadoop`.
+
+You can verify your setup with the following commands:
+
+```bash
+# we assume to be in the root directory of
+# the unzipped Flink distribution
+
+# (0) export HADOOP_CLASSPATH
+export HADOOP_CLASSPATH=`hadoop classpath`
+
+# (1) submit a Flink job and ensure it runs successfully
+./bin/flink run -m yarn-cluster ./examples/streaming/WordCount.jar
+```
+
+### YARN Session Mode
#### Flink Configurations
```bash
execution.target: yarn-session
-# Yarn Session Cluster application id.
+# YARN Session Cluster application id.
yarn.application.id: application_00000000XX_00XX
```
@@ -194,23 +240,19 @@ To use Hadoop vanilla jars, please configure $KYUUBI_HOME/conf/kyuubi-env.sh as
$ echo "export FLINK_HADOOP_CLASSPATH=`hadoop classpath`" >> $KYUUBI_HOME/conf/kyuubi-env.sh
```
-### Deployment Modes Supported by Flink on YARN
-
-For experiment use, we recommend deploying Kyuubi Flink SQL engine in [Session Mode](https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/resource-providers/yarn/#session-mode).
-At present, [Application Mode](https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/resource-providers/yarn/#application-mode) and [Per-Job Mode (deprecated)](https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/resource-providers/yarn/#per-job-mode-deprecated) are not supported for Flink engine.
-
### Kerberos
-As Kyuubi Flink SQL engine wraps the Flink SQL client that currently does not support [Flink Kerberos Configuration](https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/config/#security-kerberos-login-keytab),
-so `security.kerberos.login.keytab` and `security.kerberos.login.principal` should not use now.
+In YARN application mode, Kerberos is supported natively by Flink; see [Flink Kerberos Configuration](https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/config/#security-kerberos-login-keytab) for details.
-Instead, you can schedule a periodically `kinit` process via `crontab` task on the local machine that hosts Kyuubi server or simply use [Kyuubi Kinit](settings.html#kinit).
+In YARN session mode, `security.kerberos.login.keytab` and `security.kerberos.login.principal` have no effect, as the Kyuubi Flink SQL engine mainly relies on the Flink SQL client, which currently does not support [Flink Kerberos Configuration](https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/config/#security-kerberos-login-keytab).
+
+As a workaround, you can schedule a periodic `kinit` process via a `crontab` task on the local machine that hosts the Kyuubi server, or simply use [Kyuubi Kinit](settings.html#kinit).
-## Deploy Kyuubi Hive Engine on Yarn
+## Deploy Kyuubi Hive Engine on YARN
### Requirements
-When you want to deploy Kyuubi's Hive SQL engines on YARN, you'd better have cognition upon the following things.
+To deploy Kyuubi's Hive SQL engines on YARN, you should be familiar with the following.
- Knowing the basics about [Running Hive on YARN](https://cwiki.apache.org/confluence/display/Hive/GettingStarted)
- A binary distribution of Hive
@@ -239,7 +281,7 @@ $ $HIVE_HOME/bin/beeline -u 'jdbc:hive2://localhost:10000/default'
0: jdbc:hive2://localhost:10000/default> INSERT INTO TABLE pokes VALUES (1, 'hello');
```
-If the `Hive SQL` passes and there is a job in Yarn Web UI, It indicates the hive environment is normal.
+If the `Hive SQL` statements succeed and a job appears in the YARN Web UI, the Hive environment is working.
#### Required Environment Variable
diff --git a/docs/deployment/high_availability_guide.md b/docs/deployment/high_availability_guide.md
index 353e549eb..51c878157 100644
--- a/docs/deployment/high_availability_guide.md
+++ b/docs/deployment/high_availability_guide.md
@@ -39,7 +39,7 @@ Using multiple Kyuubi service units with load balancing instead of a single unit
- High concurrency
- By adding or removing Kyuubi server instances can easily scale up or down to meet the need of client requests.
- Upgrade smoothly
- - Kyuubi server supports stop gracefully. We could delete a `k.i.` but not stop it immediately.
+ - Kyuubi server supports stopping gracefully. We could delete a `k.i.` but not stop it immediately.
In this case, the `k.i.` will not take any new connection request but only operation requests from existing connections.
After all connection are released, it stops then.
- The dependencies of Kyuubi engines are free to change, such as bump up versions, modify configurations, add external jars, relocate to another engine home. Everything will be reloaded during start and stop.
diff --git a/docs/deployment/hive_metastore.md b/docs/deployment/hive_metastore.md
index f3a24d897..f60465a1a 100644
--- a/docs/deployment/hive_metastore.md
+++ b/docs/deployment/hive_metastore.md
@@ -30,7 +30,7 @@ In this section, you will learn how to configure Kyuubi to interact with Hive Me
- A Spark binary distribution built with `-Phive` support
- Use the built-in one in the Kyuubi distribution
- Download from [Spark official website](https://spark.apache.org/downloads.html)
- - Build from Spark source, [Building With Hive and JDBC Support](http://spark.apache.org/docs/latest/building-spark.html#building-with-hive-and-jdbc-support)
+ - Build from Spark source, [Building With Hive and JDBC Support](https://spark.apache.org/docs/latest/building-spark.html#building-with-hive-and-jdbc-support)
- A copy of Hive client configuration
So the whole thing here is to let Spark applications use this copy of Hive configuration to start a Hive metastore client for their own to talk to the Hive metastore server.
@@ -199,13 +199,13 @@ Caused by: org.apache.thrift.TApplicationException: Invalid method name: 'get_ta
... 93 more
```
-To prevent this problem, we can use Spark's [Interacting with Different Versions of Hive Metastore](http://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore).
+To prevent this problem, we can use Spark's [Interacting with Different Versions of Hive Metastore](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore).
## Further Readings
- Hive Wiki
- [Hive Metastore Administration](https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+Administration)
- Spark Online Documentation
- - [Custom Hadoop/Hive Configuration](http://spark.apache.org/docs/latest/configuration.html#custom-hadoophive-configuration)
- - [Hive Tables](http://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html)
+ - [Custom Hadoop/Hive Configuration](https://spark.apache.org/docs/latest/configuration.html#custom-hadoophive-configuration)
+ - [Hive Tables](https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html)
diff --git a/docs/deployment/index.rst b/docs/deployment/index.rst
index ec3ece951..1b6bf8766 100644
--- a/docs/deployment/index.rst
+++ b/docs/deployment/index.rst
@@ -31,15 +31,6 @@ Basics
high_availability_guide
migration-guide
-Configurations
---------------
-
-.. toctree::
- :maxdepth: 2
- :glob:
-
- settings
-
Engines
-------
diff --git a/docs/deployment/kyuubi_on_kubernetes.md b/docs/deployment/kyuubi_on_kubernetes.md
index 8bb1d88c3..11ffe8e48 100644
--- a/docs/deployment/kyuubi_on_kubernetes.md
+++ b/docs/deployment/kyuubi_on_kubernetes.md
@@ -90,7 +90,7 @@ See more related details in [Using RBAC Authorization](https://kubernetes.io/doc
## Config
-You can configure Kyuubi the old-fashioned way by placing kyuubi-default.conf inside the image. Kyuubi do not recommend using this way on Kubernetes.
+You can configure Kyuubi the old-fashioned way by placing `kyuubi-defaults.conf` inside the image. However, this approach is not recommended on Kubernetes.
Kyuubi provide `${KYUUBI_HOME}/docker/kyuubi-configmap.yaml` to build Configmap for Kyuubi.
diff --git a/docs/deployment/migration-guide.md b/docs/deployment/migration-guide.md
index 42905340e..bf5b184cd 100644
--- a/docs/deployment/migration-guide.md
+++ b/docs/deployment/migration-guide.md
@@ -17,6 +17,31 @@
# Kyuubi Migration Guide
+## Upgrading from Kyuubi 1.8 to 1.9
+
+* Since Kyuubi 1.9.0, `kyuubi.session.conf.advisor` can be set to a sequence, as Kyuubi supports chaining SessionConfAdvisors.
+
+## Upgrading from Kyuubi 1.7 to 1.8
+
+* Since Kyuubi 1.8, SQLite is added and becomes the default database type of Kyuubi metastore, as Derby has been deprecated.
+ Both Derby and SQLite are mainly for testing purposes, and they're not supposed to be used in production.
+ To restore previous behavior, set `kyuubi.metadata.store.jdbc.database.type=DERBY` and
+ `kyuubi.metadata.store.jdbc.url=jdbc:derby:memory:kyuubi_state_store_db;create=true`.
+* Since Kyuubi 1.8, if the directory of the embedded zookeeper configuration (`kyuubi.zookeeper.embedded.directory`
+ & `kyuubi.zookeeper.embedded.data.dir` & `kyuubi.zookeeper.embedded.data.log.dir`) is a relative path, it is resolved
+ relative to `$KYUUBI_HOME` instead of `$PWD`.
+* Since Kyuubi 1.8, PROMETHEUS is the default metrics reporter. To restore previous behavior,
+ set `kyuubi.metrics.reporters=JSON`.
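+
+For reference, the pre-1.8 defaults mentioned above could be restored together in `kyuubi-defaults.conf`, roughly as follows. This is a sketch that only combines the settings named in the list; keep just the lines for the behaviors you actually need.
+
+```properties
+# Use Derby again as the metadata store (testing only, not for production)
+kyuubi.metadata.store.jdbc.database.type=DERBY
+kyuubi.metadata.store.jdbc.url=jdbc:derby:memory:kyuubi_state_store_db;create=true
+# Switch the metrics reporter back to JSON
+kyuubi.metrics.reporters=JSON
+```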
+
+## Upgrading from Kyuubi 1.7.1 to 1.7.2
+
+* Since Kyuubi 1.7.2, for Kyuubi BeeLine, please use the `--python-mode` option to run Python code or scripts.
+
+## Upgrading from Kyuubi 1.7.0 to 1.7.1
+
+* Since Kyuubi 1.7.1, `protocolVersion` is removed from the request parameters of the REST API `Open(create) a session`. All removed or unknown parameters are silently ignored and have no effect.
+* Since Kyuubi 1.7.1, `confOverlay` is supported in the request parameters of the REST API `Create an operation with EXECUTE_STATEMENT type`.
+
## Upgrading from Kyuubi 1.6 to 1.7
* In Kyuubi 1.7, `kyuubi.ha.zookeeper.engine.auth.type` does not fallback to `kyuubi.ha.zookeeper.auth.type`.
@@ -24,7 +49,7 @@
* Since Kyuubi 1.7, Kyuubi returns engine's information for `GetInfo` request instead of server. To restore the previous behavior, set `kyuubi.server.info.provider` to `SERVER`.
* Since Kyuubi 1.7, Kyuubi session type `SQL` is refactored to `INTERACTIVE`, because Kyuubi supports not only `SQL` session, but also `SCALA` and `PYTHON` sessions.
User need to use `INTERACTIVE` sessionType to look up the session event.
-* Since Kyuubi 1.7, the REST API of `Open(create) a session` will not contains parameters `user` `password` and `IpAddr`. User and password should be set in `Authorization` of http request if needed.
+* Since Kyuubi 1.7, the REST API of `Open(create) a session` no longer contains the parameters `user`, `password`, and `IpAddr`. The user and password should be set in the `Authorization` header of the HTTP request if needed.
## Upgrading from Kyuubi 1.6.0 to 1.6.1
diff --git a/docs/deployment/spark/aqe.md b/docs/deployment/spark/aqe.md
index 90cc5aff8..3682c7f9e 100644
--- a/docs/deployment/spark/aqe.md
+++ b/docs/deployment/spark/aqe.md
@@ -210,7 +210,7 @@ Kyuubi is a long-running service to make it easier for end-users to use Spark SQ
### Setting Default Configurations
-[Configuring by `spark-defaults.conf`](settings.html#via-spark-defaults-conf) at the engine side is the best way to set up Kyuubi with AQE. All engines will be instantiated with AQE enabled.
+[Configuring by `spark-defaults.conf`](../settings.html#via-spark-defaults-conf) at the engine side is the best way to set up Kyuubi with AQE. All engines will be instantiated with AQE enabled.
Here is a config setting that we use in our platform when deploying Kyuubi.
diff --git a/docs/deployment/spark/dynamic_allocation.md b/docs/deployment/spark/dynamic_allocation.md
index b177b63c3..1a5057e73 100644
--- a/docs/deployment/spark/dynamic_allocation.md
+++ b/docs/deployment/spark/dynamic_allocation.md
@@ -170,7 +170,7 @@ Kyuubi is a long-running service to make it easier for end-users to use Spark SQ
### Setting Default Configurations
-[Configuring by `spark-defaults.conf`](settings.html#via-spark-defaults-conf) at the engine side is the best way to set up Kyuubi with DRA. All engines will be instantiated with DRA enabled.
+[Configuring by `spark-defaults.conf`](../settings.html#via-spark-defaults-conf) at the engine side is the best way to set up Kyuubi with DRA. All engines will be instantiated with DRA enabled.
Here is a config setting that we use in our platform when deploying Kyuubi.
diff --git a/docs/develop_tools/build_document.md b/docs/develop_tools/build_document.md
deleted file mode 100644
index 0be5a1807..000000000
--- a/docs/develop_tools/build_document.md
+++ /dev/null
@@ -1,76 +0,0 @@
-
-
-# Building Kyuubi Documentation
-
-Follow the steps below and learn how to build the Kyuubi documentation as the one you are watching now.
-
-## Install & Activate `virtualenv`
-
-Firstly, install `virtualenv`, this is optional but recommended as it is useful to create an independent environment to resolve dependency issues for building the documentation.
-
-```bash
-pip install virtualenv
-```
-
-Switch to the `docs` root directory.
-
-```bash
-cd $KYUUBI_SOURCE_PATH/docs
-```
-
-Create a virtual environment named 'kyuubi' or anything you like using `virtualenv` if it's not existing.
-
-```bash
-virtualenv kyuubi
-```
-
-Activate it,
-
-```bash
-source ./kyuubi/bin/activate
-```
-
-## Install all dependencies
-
-Install all dependencies enumerated in the `requirements.txt`.
-
-```bash
-pip install -r requirements.txt
-```
-
-## Create Documentation
-
-Make sure you are in the `$KYUUBI_SOURCE_PATH/docs` directory.
-
-linux & macos
-
-```bash
-make html
-```
-
-windows
-
-```bash
-make.bat html
-```
-
-If the build process succeed, the HTML pages are in `$KYUUBI_SOURCE_PATH/docs/_build/html`.
-
-## View Locally
-
-Open the `$KYUUBI_SOURCE_PATH/docs/_build/html/index.html` file in your favorite web browser.
diff --git a/docs/extensions/engines/flink/functions.md b/docs/extensions/engines/flink/functions.md
new file mode 100644
index 000000000..1d047d078
--- /dev/null
+++ b/docs/extensions/engines/flink/functions.md
@@ -0,0 +1,30 @@
+
+
+# Auxiliary SQL Functions
+
+Kyuubi provides several auxiliary SQL functions as supplement to
+Flink's [Built-in Functions](https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/table/functions/systemfunctions/).
+
+| Name | Description | Return Type | Since |
+|---------------------|-------------------------------------------------------------|-------------|-------|
+| kyuubi_version | Return the version of Kyuubi Server | string | 1.8.0 |
+| kyuubi_engine_name | Return the application name for the associated query engine | string | 1.8.0 |
+| kyuubi_engine_id | Return the application id for the associated query engine | string | 1.8.0 |
+| kyuubi_system_user | Return the system user name for the associated query engine | string | 1.8.0 |
+| kyuubi_session_user | Return the session username for the associated query engine | string | 1.8.0 |
+
diff --git a/docs/extensions/engines/flink/index.rst b/docs/extensions/engines/flink/index.rst
index 01bbecf92..58105b0fa 100644
--- a/docs/extensions/engines/flink/index.rst
+++ b/docs/extensions/engines/flink/index.rst
@@ -20,6 +20,7 @@ Extensions for Flink
:maxdepth: 1
../../../connector/flink/index
+ functions
.. warning::
This page is still in-progress.
diff --git a/docs/extensions/engines/hive/functions.md b/docs/extensions/engines/hive/functions.md
new file mode 100644
index 000000000..24094ecce
--- /dev/null
+++ b/docs/extensions/engines/hive/functions.md
@@ -0,0 +1,30 @@
+
+
+
+# Auxiliary SQL Functions
+
+Kyuubi provides several auxiliary SQL functions as a supplement to Hive's [Built-in Functions](https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-Built-inFunctions).
+
+| Name | Description | Return Type | Since |
+|----------------|-------------------------------------|-------------|-------|
+| kyuubi_version | Return the version of Kyuubi Server | string | 1.8.0 |
+| engine_name | Return the name of engine | string | 1.8.0 |
+| engine_id | Return the id of engine | string | 1.8.0 |
+| system_user | Return the system user | string | 1.8.0 |
+| session_user | Return the session user | string | 1.8.0 |
+
diff --git a/docs/extensions/engines/hive/index.rst b/docs/extensions/engines/hive/index.rst
index 8aeebf1bc..f43ec11e0 100644
--- a/docs/extensions/engines/hive/index.rst
+++ b/docs/extensions/engines/hive/index.rst
@@ -20,6 +20,7 @@ Extensions for Hive
:maxdepth: 2
../../../connector/hive/index
+ functions
.. warning::
This page is still in-progress.
diff --git a/docs/extensions/engines/spark/functions.md b/docs/extensions/engines/spark/functions.md
index 66f22aea8..78c269243 100644
--- a/docs/extensions/engines/spark/functions.md
+++ b/docs/extensions/engines/spark/functions.md
@@ -27,4 +27,5 @@ Kyuubi provides several auxiliary SQL functions as supplement to Spark's [Built-
| engine_id | Return the spark application id for the associated query engine | string | 1.4.0 |
| system_user | Return the system user name for the associated query engine | string | 1.3.0 |
| session_user | Return the session username for the associated query engine | string | 1.4.0 |
+| engine_url | Return the engine url for the associated query engine | string | 1.8.0 |
diff --git a/docs/extensions/engines/spark/lineage.md b/docs/extensions/engines/spark/lineage.md
index 1ef28c173..2dbb2a026 100644
--- a/docs/extensions/engines/spark/lineage.md
+++ b/docs/extensions/engines/spark/lineage.md
@@ -45,14 +45,14 @@ The lineage of this SQL:
```json
{
- "inputTables": ["default.test_table0"],
+ "inputTables": ["spark_catalog.default.test_table0"],
"outputTables": [],
"columnLineage": [{
"column": "col0",
- "originalColumns": ["default.test_table0.a"]
+ "originalColumns": ["spark_catalog.default.test_table0.a"]
}, {
"column": "col1",
- "originalColumns": ["default.test_table0.b"]
+ "originalColumns": ["spark_catalog.default.test_table0.b"]
}]
}
```
@@ -97,17 +97,16 @@ Currently supported column lineage for spark's `Command` and `Query` type:
### Build with Apache Maven
-Kyuubi Spark Lineage Listener Extension is built using [Apache Maven](http://maven.apache.org).
+Kyuubi Spark Lineage Listener Extension is built using [Apache Maven](https://maven.apache.org).
To build it, `cd` to the root direct of kyuubi project and run:
```shell
-build/mvn clean package -pl :kyuubi-spark-lineage_2.12 -DskipTests
+build/mvn clean package -pl :kyuubi-spark-lineage_2.12 -am -DskipTests
```
After a while, if everything goes well, you will get the plugin finally in two parts:
- The main plugin jar, which is under `./extensions/spark/kyuubi-spark-lineage/target/kyuubi-spark-lineage_${scala.binary.version}-${project.version}.jar`
-- The least transitive dependencies needed, which are under `./extensions/spark/kyuubi-spark-lineage/target/scala-${scala.binary.version}/jars`
### Build against Different Apache Spark Versions
@@ -118,7 +117,7 @@ Sometimes, it may be incompatible with other Spark distributions, then you may n
For example,
```shell
-build/mvn clean package -pl :kyuubi-spark-lineage_2.12 -DskipTests -Dspark.version=3.1.2
+build/mvn clean package -pl :kyuubi-spark-lineage_2.12 -am -DskipTests -Dspark.version=3.1.2
```
The available `spark.version`s are shown in the following table.
@@ -126,6 +125,7 @@ The available `spark.version`s are shown in the following table.
| Spark Version | Supported | Remark |
|:-------------:|:---------:|:------:|
| master | √ | - |
+| 3.4.x | √ | - |
| 3.3.x | √ | - |
| 3.2.x | √ | - |
| 3.1.x | √ | - |
@@ -168,13 +168,65 @@ Add `org.apache.kyuubi.plugin.lineage.SparkOperationLineageQueryExecutionListene
spark.sql.queryExecutionListeners=org.apache.kyuubi.plugin.lineage.SparkOperationLineageQueryExecutionListener
```
-### Settings for Lineage Logger and Path
+### Optional configuration
-#### Lineage Logger Path
+#### Whether to Skip Permanent View Resolution
-The location of all the engine operation lineage events go for the builtin JSON logger.
-We first need set `kyuubi.engine.event.loggers` to `JSON`.
-All operation lineage events will be written in the unified event json logger path, which be setting with
-`kyuubi.engine.event.json.log.path`. We can get the lineage logger from the `operation_lineage` dir in the
-`kyuubi.engine.event.json.log.path`.
+If enabled, lineage resolution stops at permanent views and treats them as physical tables. To enable it,
+add the following configuration.
+
+```properties
+spark.kyuubi.plugin.lineage.skip.parsing.permanent.view.enabled=true
+```
+
+### Get Lineage Events
+
+The lineage dispatchers are used to dispatch lineage events, configured via `spark.kyuubi.plugin.lineage.dispatchers`.
+
+- `SPARK_EVENT` (default): sends lineage events to the Spark event bus
+- `KYUUBI_EVENT`: sends lineage events to the Kyuubi event bus
+- `ATLAS`: sends lineage to Apache Atlas
+
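+A minimal sketch of enabling more than one dispatcher, assuming the property accepts a
+comma-separated list of the dispatcher names above (an assumption; verify it against your Kyuubi version):
+
+```properties
+# Assumption: multiple dispatchers are given as a comma-separated list
+spark.kyuubi.plugin.lineage.dispatchers=SPARK_EVENT,KYUUBI_EVENT
+```
+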
+#### Get Lineage Events from SparkListener
+
+When using the `SPARK_EVENT` dispatcher, the lineage events will be sent to the `SparkListenerBus`. To handle lineage events, a new `SparkListener` needs to be added.
+Example of adding a `SparkListener`:
+
+```scala
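+import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent}
+
+// OperationLineageEvent is provided by the Kyuubi Spark lineage plugin
+spark.sparkContext.addSparkListener(new SparkListener {
+  override def onOtherEvent(event: SparkListenerEvent): Unit = {
+    event match {
+      case lineageEvent: OperationLineageEvent =>
+        // Your processing logic
+      case _ => // ignore all other events
+    }
+  }
+})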
+```
+
+#### Get Lineage Events from Kyuubi EventHandler
+
+When using the `KYUUBI_EVENT` dispatcher, the lineage events will be sent to the Kyuubi `EventBus`. Refer to [Kyuubi Event Handler](../../server/events) to handle Kyuubi events.
+
+#### Ingest Lineage Entities to Apache Atlas
+
+The lineage entities can be ingested into [Apache Atlas](https://atlas.apache.org/) using the `ATLAS` dispatcher.
+
+Additional setup:
+
++ The minimal transitive dependencies needed, which are under `./extensions/spark/kyuubi-spark-lineage/target/scala-${scala.binary.version}/jars`
++ Use `spark.files` to specify the `atlas-application.properties` configuration file for Atlas
+
+Atlas client configurations (configured in `atlas-application.properties` or passed with the `spark.atlas.` prefix):
+
+| Name | Default Value | Description | Since |
+|-----------------------------------------|------------------------|-------------------------------------------------------|-------|
+| atlas.rest.address | http://localhost:21000 | The rest endpoint url for the Atlas server | 1.8.0 |
+| atlas.client.type | rest | The client type (currently only supports rest) | 1.8.0 |
+| atlas.client.username | none | The client username | 1.8.0 |
+| atlas.client.password | none | The client password | 1.8.0 |
+| atlas.cluster.name | primary | The cluster name to use in qualifiedName of entities. | 1.8.0 |
+| atlas.hook.spark.column.lineage.enabled | true | Whether to ingest column lineages to Atlas. | 1.8.0 |
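+
+As an illustration, an `atlas-application.properties` that restates the defaults from the table above
+might look like the following sketch; the username, password, and file path are placeholders, not defaults:
+
+```properties
+# Values below mirror the defaults listed in the table
+atlas.rest.address=http://localhost:21000
+atlas.client.type=rest
+# Placeholder credentials (hypothetical), replace with your own
+atlas.client.username=<username>
+atlas.client.password=<password>
+atlas.cluster.name=primary
+atlas.hook.spark.column.lineage.enabled=true
+# In the Spark configuration (not in this file), ship the file with, e.g.:
+# spark.files=/path/to/atlas-application.properties   (hypothetical path)
+```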
diff --git a/docs/extensions/engines/spark/rules.md b/docs/extensions/engines/spark/rules.md
index 5c8c04869..4614f5244 100644
--- a/docs/extensions/engines/spark/rules.md
+++ b/docs/extensions/engines/spark/rules.md
@@ -63,24 +63,33 @@ Now, you can enjoy the Kyuubi SQL Extension.
Kyuubi provides some configs to make these feature easy to use.
-| Name | Default Value | Description | Since |
-|---------------------------------------------------------------------|---------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|
-| spark.sql.optimizer.insertRepartitionBeforeWrite.enabled | true | Add repartition node at the top of query plan. An approach of merging small files. | 1.2.0 |
-| spark.sql.optimizer.insertRepartitionNum | none | The partition number if `spark.sql.optimizer.insertRepartitionBeforeWrite.enabled` is enabled. If AQE is disabled, the default value is `spark.sql.shuffle.partitions`. If AQE is enabled, the default value is none that means depend on AQE. | 1.2.0 |
-| spark.sql.optimizer.dynamicPartitionInsertionRepartitionNum | 100 | The partition number of each dynamic partition if `spark.sql.optimizer.insertRepartitionBeforeWrite.enabled` is enabled. We will repartition by dynamic partition columns to reduce the small file but that can cause data skew. This config is to extend the partition of dynamic partition column to avoid skew but may generate some small files. | 1.2.0 |
-| spark.sql.optimizer.forceShuffleBeforeJoin.enabled | false | Ensure shuffle node exists before shuffled join (shj and smj) to make AQE `OptimizeSkewedJoin` works (complex scenario join, multi table join). | 1.2.0 |
-| spark.sql.optimizer.finalStageConfigIsolation.enabled | false | If true, the final stage support use different config with previous stage. The prefix of final stage config key should be `spark.sql.finalStage.`. For example, the raw spark config: `spark.sql.adaptive.advisoryPartitionSizeInBytes`, then the final stage config should be: `spark.sql.finalStage.adaptive.advisoryPartitionSizeInBytes`. | 1.2.0 |
-| spark.sql.analyzer.classification.enabled | false | When true, allows Kyuubi engine to judge this SQL's classification and set `spark.sql.analyzer.classification` back into sessionConf. Through this configuration item, Spark can optimizing configuration dynamic. | 1.4.0 |
-| spark.sql.optimizer.insertZorderBeforeWriting.enabled | true | When true, we will follow target table properties to insert zorder or not. The key properties are: 1) `kyuubi.zorder.enabled`: if this property is true, we will insert zorder before writing data. 2) `kyuubi.zorder.cols`: string split by comma, we will zorder by these cols. | 1.4.0 |
-| spark.sql.optimizer.zorderGlobalSort.enabled | true | When true, we do a global sort using zorder. Note that, it can cause data skew issue if the zorder columns have less cardinality. When false, we only do local sort using zorder. | 1.4.0 |
-| spark.sql.watchdog.maxPartitions | none | Set the max partition number when spark scans a data source. Enable MaxPartitionStrategy by specifying this configuration. Add maxPartitions Strategy to avoid scan excessive partitions on partitioned table, it's optional that works with defined | 1.4.0 |
-| spark.sql.optimizer.dropIgnoreNonExistent | false | When true, do not report an error if DROP DATABASE/TABLE/VIEW/FUNCTION/PARTITION specifies a non-existent database/table/view/function/partition | 1.5.0 |
-| spark.sql.optimizer.rebalanceBeforeZorder.enabled | false | When true, we do a rebalance before zorder in case data skew. Note that, if the insertion is dynamic partition we will use the partition columns to rebalance. Note that, this config only affects with Spark 3.3.x. | 1.6.0 |
-| spark.sql.optimizer.rebalanceZorderColumns.enabled | false | When true and `spark.sql.optimizer.rebalanceBeforeZorder.enabled` is true, we do rebalance before Z-Order. If it's dynamic partition insert, the rebalance expression will include both partition columns and Z-Order columns. Note that, this config only affects with Spark 3.3.x. | 1.6.0 |
-| spark.sql.optimizer.twoPhaseRebalanceBeforeZorder.enabled | false | When true and `spark.sql.optimizer.rebalanceBeforeZorder.enabled` is true, we do two phase rebalance before Z-Order for the dynamic partition write. The first phase rebalance using dynamic partition column; The second phase rebalance using dynamic partition column Z-Order columns. Note that, this config only affects with Spark 3.3.x. | 1.6.0 |
-| spark.sql.optimizer.zorderUsingOriginalOrdering.enabled | false | When true and `spark.sql.optimizer.rebalanceBeforeZorder.enabled` is true, we do sort by the original ordering i.e. lexicographical order. Note that, this config only affects with Spark 3.3.x. | 1.6.0 |
-| spark.sql.optimizer.inferRebalanceAndSortOrders.enabled | false | When ture, infer columns for rebalance and sort orders from original query, e.g. the join keys from join. It can avoid compression ratio regression. | 1.7.0 |
-| spark.sql.optimizer.inferRebalanceAndSortOrdersMaxColumns | 3 | The max columns of inferred columns. | 1.7.0 |
-| spark.sql.optimizer.insertRepartitionBeforeWriteIfNoShuffle.enabled | false | When true, add repartition even if the original plan does not have shuffle. | 1.7.0 |
-| spark.sql.optimizer.finalStageConfigIsolationWriteOnly.enabled | true | When true, only enable final stage isolation for writing. | 1.7.0 |
+| Name | Default Value | Description | Since |
+|---------------------------------------------------------------------|----------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|
+| spark.sql.optimizer.insertRepartitionBeforeWrite.enabled | true | Add repartition node at the top of query plan. An approach of merging small files. | 1.2.0 |
+| spark.sql.optimizer.insertRepartitionNum | none | The partition number if `spark.sql.optimizer.insertRepartitionBeforeWrite.enabled` is enabled. If AQE is disabled, the default value is `spark.sql.shuffle.partitions`. If AQE is enabled, the default value is none that means depend on AQE. This config is used for Spark 3.1 only. | 1.2.0 |
+| spark.sql.optimizer.dynamicPartitionInsertionRepartitionNum | 100 | The partition number of each dynamic partition if `spark.sql.optimizer.insertRepartitionBeforeWrite.enabled` is enabled. We will repartition by dynamic partition columns to reduce the small file but that can cause data skew. This config is to extend the partition of dynamic partition column to avoid skew but may generate some small files. | 1.2.0 |
+| spark.sql.optimizer.forceShuffleBeforeJoin.enabled | false | Ensure shuffle node exists before shuffled join (shj and smj) to make AQE `OptimizeSkewedJoin` works (complex scenario join, multi table join). | 1.2.0 |
+| spark.sql.optimizer.finalStageConfigIsolation.enabled | false | If true, the final stage support use different config with previous stage. The prefix of final stage config key should be `spark.sql.finalStage.`. For example, the raw spark config: `spark.sql.adaptive.advisoryPartitionSizeInBytes`, then the final stage config should be: `spark.sql.finalStage.adaptive.advisoryPartitionSizeInBytes`. | 1.2.0 |
+| spark.sql.analyzer.classification.enabled | false | When true, allows Kyuubi engine to judge this SQL's classification and set `spark.sql.analyzer.classification` back into sessionConf. Through this configuration item, Spark can optimize configurations dynamically. | 1.4.0 |
+| spark.sql.optimizer.insertZorderBeforeWriting.enabled | true | When true, we will follow target table properties to insert zorder or not. The key properties are: 1) `kyuubi.zorder.enabled`: if this property is true, we will insert zorder before writing data. 2) `kyuubi.zorder.cols`: string split by comma, we will zorder by these cols. | 1.4.0 |
+| spark.sql.optimizer.zorderGlobalSort.enabled | true | When true, we do a global sort using zorder. Note that, it can cause data skew issue if the zorder columns have less cardinality. When false, we only do local sort using zorder. | 1.4.0 |
+| spark.sql.watchdog.maxPartitions | none | Set the max partition number when spark scans a data source. Enable maxPartition Strategy by specifying this configuration. Add maxPartitions Strategy to avoid scan excessive partitions on partitioned table, it's optional that works with defined | 1.4.0 |
+| spark.sql.watchdog.maxFileSize | none | Set the maximum size in bytes of files when spark scans a data source. Enable maxFileSize Strategy by specifying this configuration. Add maxFileSize Strategy to avoid scan excessive size of files, it's optional that works with defined | 1.8.0 |
+| spark.sql.optimizer.dropIgnoreNonExistent | false | When true, do not report an error if DROP DATABASE/TABLE/VIEW/FUNCTION/PARTITION specifies a non-existent database/table/view/function/partition | 1.5.0 |
+| spark.sql.optimizer.rebalanceBeforeZorder.enabled | false | When true, we do a rebalance before zorder in case data skew. Note that, if the insertion is dynamic partition we will use the partition columns to rebalance. Note that, this config only affects with Spark 3.3.x. | 1.6.0 |
+| spark.sql.optimizer.rebalanceZorderColumns.enabled | false | When true and `spark.sql.optimizer.rebalanceBeforeZorder.enabled` is true, we do rebalance before Z-Order. If it's dynamic partition insert, the rebalance expression will include both partition columns and Z-Order columns. Note that, this config only affects with Spark 3.3.x. | 1.6.0 |
+| spark.sql.optimizer.twoPhaseRebalanceBeforeZorder.enabled | false | When true and `spark.sql.optimizer.rebalanceBeforeZorder.enabled` is true, we do two phase rebalance before Z-Order for the dynamic partition write. The first phase rebalance using dynamic partition column; The second phase rebalance using dynamic partition column Z-Order columns. Note that, this config only affects with Spark 3.3.x. | 1.6.0 |
+| spark.sql.optimizer.zorderUsingOriginalOrdering.enabled | false | When true and `spark.sql.optimizer.rebalanceBeforeZorder.enabled` is true, we do sort by the original ordering i.e. lexicographical order. Note that, this config only affects with Spark 3.3.x. | 1.6.0 |
+| spark.sql.optimizer.inferRebalanceAndSortOrders.enabled | false | When true, infer columns for rebalance and sort orders from original query, e.g. the join keys from join. It can avoid compression ratio regression. | 1.7.0 |
+| spark.sql.optimizer.inferRebalanceAndSortOrdersMaxColumns | 3 | The max columns of inferred columns. | 1.7.0 |
+| spark.sql.optimizer.insertRepartitionBeforeWriteIfNoShuffle.enabled | false | When true, add repartition even if the original plan does not have shuffle. | 1.7.0 |
+| spark.sql.optimizer.finalStageConfigIsolationWriteOnly.enabled | true | When true, only enable final stage isolation for writing. | 1.7.0 |
+| spark.sql.finalWriteStage.eagerlyKillExecutors.enabled | false | When true, eagerly kill redundant executors before running final write stage. | 1.8.0 |
+| spark.sql.finalWriteStage.skipKillingExecutorsForTableCache | true | When true, skip killing executors if the plan has table caches. | 1.8.0 |
+| spark.sql.finalWriteStage.retainExecutorsFactor | 1.2 | If the target executors * factor < active executors, and target executors * factor > min executors, then inject kill executors or inject custom resource profile. | 1.8.0 |
+| spark.sql.finalWriteStage.resourceIsolation.enabled | false | When true, make final write stage resource isolation using custom RDD resource profile. | 1.8.0 |
+| spark.sql.finalWriteStageExecutorCores | fallback spark.executor.cores | Specify the executor core request for final write stage. It would be passed to the RDD resource profile. | 1.8.0 |
+| spark.sql.finalWriteStageExecutorMemory | fallback spark.executor.memory | Specify the executor on heap memory request for final write stage. It would be passed to the RDD resource profile. | 1.8.0 |
+| spark.sql.finalWriteStageExecutorMemoryOverhead | fallback spark.executor.memoryOverhead | Specify the executor memory overhead request for final write stage. It would be passed to the RDD resource profile. | 1.8.0 |
+| spark.sql.finalWriteStageExecutorOffHeapMemory | NONE | Specify the executor off heap memory request for final write stage. It would be passed to the RDD resource profile. | 1.8.0 |
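+
+For example, a sketch of enabling eager executor killing for the final write stage; the executor counts in
+the comments are hypothetical and only illustrate the `retainExecutorsFactor` condition from the table:
+
+```properties
+spark.sql.finalWriteStage.eagerlyKillExecutors.enabled=true
+spark.sql.finalWriteStage.retainExecutorsFactor=1.2
+# Hypothetical numbers: target executors = 10, active executors = 20, min executors = 5
+# 10 * 1.2 = 12 < 20 (active) and 12 > 5 (min), so the condition in the table holds
+```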
diff --git a/docs/extensions/server/authentication.rst b/docs/extensions/server/authentication.rst
index ab238040c..7a83b07c2 100644
--- a/docs/extensions/server/authentication.rst
+++ b/docs/extensions/server/authentication.rst
@@ -49,12 +49,12 @@ To create custom Authenticator class derived from the above interface, we need t
- Referencing the library
-.. code-block:: xml
+.. parsed-literal::
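    <dependency>
       <groupId>org.apache.kyuubi</groupId>
       <artifactId>kyuubi-common_2.12</artifactId>
-      <version>1.5.2-incubating</version>
+      <version>\ |release|\</version>
       <scope>provided</scope>
    </dependency>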
diff --git a/docs/extensions/server/events.rst b/docs/extensions/server/events.rst
index 832c1e5df..aee7d4899 100644
--- a/docs/extensions/server/events.rst
+++ b/docs/extensions/server/events.rst
@@ -51,12 +51,12 @@ To create custom EventHandlerProvider class derived from the above interface, we
- Referencing the library
-.. code-block:: xml
+.. parsed-literal::
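    <dependency>
       <groupId>org.apache.kyuubi</groupId>
-      <artifactId>kyuubi-event_2.12</artifactId>
-      <version>1.7.0-incubating</version>
+      <artifactId>kyuubi-events_2.12</artifactId>
+      <version>\ |release|\</version>
       <scope>provided</scope>
    </dependency>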
diff --git a/docs/imgs/kyuubi_ecosystem.drawio b/docs/imgs/kyuubi_ecosystem.drawio
index 723b306e8..7171491ef 100644
--- a/docs/imgs/kyuubi_ecosystem.drawio
+++ b/docs/imgs/kyuubi_ecosystem.drawio
@@ -1 +1 @@

\ No newline at end of file

\ No newline at end of file
diff --git a/docs/imgs/kyuubi_ecosystem.drawio.png b/docs/imgs/kyuubi_ecosystem.drawio.png
index 19de7adb5..72d221d10 100644
Binary files a/docs/imgs/kyuubi_ecosystem.drawio.png and b/docs/imgs/kyuubi_ecosystem.drawio.png differ
diff --git a/docs/index.rst b/docs/index.rst
index fbd299e7b..e86041ffc 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -179,6 +179,7 @@ What's Next
:glob:
quick_start/index
+ configuration/settings
deployment/index
Security
monitor/index
@@ -216,7 +217,13 @@ What's Next
:caption: Contributing
:maxdepth: 2
- develop_tools/index
+ contributing/code/index
+ contributing/doc/index
+
+.. toctree::
+ :caption: Community
+ :maxdepth: 2
+
community/index
.. toctree::
diff --git a/docs/make.bat b/docs/make.bat
index 1f441aefc..b8c48a2db 100644
--- a/docs/make.bat
+++ b/docs/make.bat
@@ -38,7 +38,7 @@ if errorlevel 9009 (
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
- echo.http://sphinx-doc.org/
+ echo.https://www.sphinx-doc.org/
exit /b 1
)
diff --git a/docs/monitor/logging.md b/docs/monitor/logging.md
index 8d373f5a9..9dce6e22a 100644
--- a/docs/monitor/logging.md
+++ b/docs/monitor/logging.md
@@ -114,7 +114,7 @@ For example, we can disable the console appender and enable the file appender li
-
+
@@ -265,5 +265,5 @@ You will both get the final results and the corresponding operation logs telling
- [Monitoring Kyuubi - Server Metrics](metrics.md)
- [Trouble Shooting](trouble_shooting.md)
- Spark Online Documentation
- - [Monitoring and Instrumentation](http://spark.apache.org/docs/latest/monitoring.html)
+ - [Monitoring and Instrumentation](https://spark.apache.org/docs/latest/monitoring.html)
diff --git a/docs/monitor/metrics.md b/docs/monitor/metrics.md
index 1d1fa326a..561014c37 100644
--- a/docs/monitor/metrics.md
+++ b/docs/monitor/metrics.md
@@ -44,10 +44,12 @@ These metrics include:
|--------------------------------------------------|----------------------------------------|-----------|-------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | `kyuubi.exec.pool.threads.alive` | | gauge | 1.2.0 | threads keepAlive in the backend executive thread pool |
diff --git a/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineSessionPage.scala b/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineSessionPage.scala
index 1f34ae64f..cdfc6d313 100644
--- a/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineSessionPage.scala
+++ b/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineSessionPage.scala
@@ -42,7 +42,7 @@ case class EngineSessionPage(parent: EngineTab)
require(parameterId != null && parameterId.nonEmpty, "Missing id parameter")
val content = store.synchronized { // make sure all parts in this page are consistent
- val sessionStat = store.getSession(parameterId).getOrElse(null)
+ val sessionStat = store.getSession(parameterId).orNull
require(sessionStat != null, "Invalid sessionID[" + parameterId + "]")
val redactionPattern = parent.sparkUI match {
@@ -51,7 +51,7 @@ case class EngineSessionPage(parent: EngineTab)
}
val sessionPropertiesTable =
- if (sessionStat.conf != null && !sessionStat.conf.isEmpty) {
+ if (sessionStat.conf != null && sessionStat.conf.nonEmpty) {
val table = UIUtils.listingTable(
propertyHeader,
propertyRow,
@@ -78,8 +78,18 @@ case class EngineSessionPage(parent: EngineTab)
User {sessionStat.username},
IP {sessionStat.ip},
- Server {sessionStat.serverIp},
+ Server {sessionStat.serverIp}
+
++
+
Session created at {formatDate(sessionStat.startTime)},
+ {
+ if (sessionStat.endTime > 0) {
+ s"""
+ | ended at ${formatDate(sessionStat.endTime)},
+ | after ${formatDuration(sessionStat.duration)}.
+ |""".stripMargin
+ }
+ }
Total run {sessionStat.totalOperations} SQL
++
sessionPropertiesTable ++
diff --git a/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineTab.scala b/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineTab.scala
index b7cebbd97..52edcf220 100644
--- a/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineTab.scala
+++ b/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/spark/ui/EngineTab.scala
@@ -26,7 +26,7 @@ import org.apache.kyuubi.config.KyuubiConf
import org.apache.kyuubi.engine.spark.SparkSQLEngine
import org.apache.kyuubi.engine.spark.events.EngineEventsStore
import org.apache.kyuubi.service.ServiceState
-import org.apache.kyuubi.util.ClassUtils
+import org.apache.kyuubi.util.reflect.{DynClasses, DynMethods}
/**
* Note that [[SparkUITab]] is private for Spark
@@ -62,31 +62,35 @@ case class EngineTab(
sparkUI.foreach { ui =>
try {
- // Spark shade the jetty package so here we use reflection
- val sparkServletContextHandlerClz = loadSparkServletContextHandler
- val attachHandlerMethod = Class.forName("org.apache.spark.ui.SparkUI")
- .getMethod("attachHandler", sparkServletContextHandlerClz)
- val createRedirectHandlerMethod = Class.forName("org.apache.spark.ui.JettyUtils")
- .getMethod(
- "createRedirectHandler",
+ // [KYUUBI #3627]: the official spark release uses the shaded and relocated jetty classes,
+ // but if we use sbt to build for testing, e.g. docker image, it still uses the vanilla
+ // jetty classes.
+ val sparkServletContextHandlerClz = DynClasses.builder()
+ .impl("org.sparkproject.jetty.servlet.ServletContextHandler")
+ .impl("org.eclipse.jetty.servlet.ServletContextHandler")
+ .buildChecked()
+ val attachHandlerMethod = DynMethods.builder("attachHandler")
+ .impl("org.apache.spark.ui.SparkUI", sparkServletContextHandlerClz)
+ .buildChecked(ui)
+ val createRedirectHandlerMethod = DynMethods.builder("createRedirectHandler")
+ .impl(
+ "org.apache.spark.ui.JettyUtils",
classOf[String],
classOf[String],
- classOf[(HttpServletRequest) => Unit],
+ classOf[HttpServletRequest => Unit],
classOf[String],
classOf[Set[String]])
+ .buildStaticChecked()
attachHandlerMethod
.invoke(
- ui,
createRedirectHandlerMethod
- .invoke(null, "/kyuubi/stop", "/kyuubi", handleKillRequest _, "", Set("GET", "POST")))
+ .invoke("/kyuubi/stop", "/kyuubi", handleKillRequest _, "", Set("GET", "POST")))
attachHandlerMethod
.invoke(
- ui,
createRedirectHandlerMethod
.invoke(
- null,
"/kyuubi/gracefulstop",
"/kyuubi",
handleGracefulKillRequest _,
@@ -105,18 +109,6 @@ case class EngineTab(
cause)
}
- private def loadSparkServletContextHandler: Class[_] = {
- // [KYUUBI #3627]: the official spark release uses the shaded and relocated jetty classes,
- // but if use sbt to build for testing, e.g. docker image, it still uses vanilla jetty classes.
- val shaded = "org.sparkproject.jetty.servlet.ServletContextHandler"
- val vanilla = "org.eclipse.jetty.servlet.ServletContextHandler"
- if (ClassUtils.classIsLoadable(shaded)) {
- Class.forName(shaded)
- } else {
- Class.forName(vanilla)
- }
- }
-
def handleKillRequest(request: HttpServletRequest): Unit = {
if (killEnabled && engine.isDefined && engine.get.getServiceState != ServiceState.STOPPED) {
engine.get.stop()
diff --git a/externals/kyuubi-spark-sql-engine/src/test/resources/log4j2-test.xml b/externals/kyuubi-spark-sql-engine/src/test/resources/log4j2-test.xml
index bfc40dd6d..3110216c1 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/resources/log4j2-test.xml
+++ b/externals/kyuubi-spark-sql-engine/src/test/resources/log4j2-test.xml
@@ -21,14 +21,14 @@
-
+
-
+
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/EtcdShareLevelSparkEngineSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/EtcdShareLevelSparkEngineSuite.scala
index 46dc3b54c..727b232e3 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/EtcdShareLevelSparkEngineSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/EtcdShareLevelSparkEngineSuite.scala
@@ -17,9 +17,7 @@
package org.apache.kyuubi.engine.spark
-import org.apache.kyuubi.config.KyuubiConf.ENGINE_CHECK_INTERVAL
-import org.apache.kyuubi.config.KyuubiConf.ENGINE_SHARE_LEVEL
-import org.apache.kyuubi.config.KyuubiConf.ENGINE_SPARK_MAX_LIFETIME
+import org.apache.kyuubi.config.KyuubiConf.{ENGINE_CHECK_INTERVAL, ENGINE_SHARE_LEVEL, ENGINE_SPARK_MAX_INITIAL_WAIT, ENGINE_SPARK_MAX_LIFETIME}
import org.apache.kyuubi.engine.ShareLevel
import org.apache.kyuubi.engine.ShareLevel.ShareLevel
@@ -30,6 +28,7 @@ trait EtcdShareLevelSparkEngineSuite
etcdConf ++ Map(
ENGINE_SHARE_LEVEL.key -> shareLevel.toString,
ENGINE_SPARK_MAX_LIFETIME.key -> "PT20s",
+ ENGINE_SPARK_MAX_INITIAL_WAIT.key -> "0",
ENGINE_CHECK_INTERVAL.key -> "PT5s")
}
}
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/IndividualSparkSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/IndividualSparkSuite.scala
index c6789d14d..8fca1d0ca 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/IndividualSparkSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/IndividualSparkSuite.scala
@@ -114,7 +114,7 @@ class SparkEngineSuites extends KyuubiFunSuite {
}
assert(SparkSQLEngine.currentEngine.isEmpty)
val errorMsg = s"The Engine main thread was interrupted, possibly due to `createSpark`" +
- s" timeout. The `kyuubi.session.engine.initialize.timeout` is ($timeout ms) " +
+ s" timeout. The `${ENGINE_INIT_TIMEOUT.key}` is ($timeout ms) " +
s" and submitted at $submitTime."
assert(logAppender.loggingEvents.exists(
_.getMessage.getFormattedMessage.equals(errorMsg)))
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/SchedulerPoolSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/SchedulerPoolSuite.scala
index af8c90cf2..a07f7d783 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/SchedulerPoolSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/SchedulerPoolSuite.scala
@@ -19,6 +19,9 @@ package org.apache.kyuubi.engine.spark
import java.util.concurrent.Executors
+import scala.concurrent.duration.SECONDS
+
+import org.apache.spark.KyuubiSparkContextHelper
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd, SparkListenerJobStart}
import org.scalatest.concurrent.PatienceConfiguration.Timeout
import org.scalatest.time.SpanSugar.convertIntToGrainOfTime
@@ -76,33 +79,36 @@ class SchedulerPoolSuite extends WithSparkSQLEngine with HiveJDBCTestHelper {
eventually(Timeout(3.seconds)) {
assert(job0Started)
}
- Seq(1, 0).foreach { priority =>
- threads.execute(() => {
- priority match {
- case 0 =>
- withJdbcStatement() { statement =>
- statement.execute("SET kyuubi.operation.scheduler.pool=p0")
- statement.execute("SELECT java_method('java.lang.Thread', 'sleep', 1500l)" +
- "FROM range(1, 3, 1, 2)")
- }
-
- case 1 =>
- withJdbcStatement() { statement =>
- statement.execute("SET kyuubi.operation.scheduler.pool=p1")
- statement.execute("SELECT java_method('java.lang.Thread', 'sleep', 1500l)" +
- " FROM range(1, 3, 1, 2)")
- }
- }
- })
+ threads.execute(() => {
+ // job name job1
+ withJdbcStatement() { statement =>
+ statement.execute("SET kyuubi.operation.scheduler.pool=p1")
+ statement.execute("SELECT java_method('java.lang.Thread', 'sleep', 1500l)" +
+ " FROM range(1, 3, 1, 2)")
+ }
+ })
+ // make sure job1 started before job2
+ eventually(Timeout(2.seconds)) {
+ assert(job1StartTime > 0)
}
+
+ threads.execute(() => {
+ // job name job2
+ withJdbcStatement() { statement =>
+ statement.execute("SET kyuubi.operation.scheduler.pool=p0")
+ statement.execute("SELECT java_method('java.lang.Thread', 'sleep', 1500l)" +
+ "FROM range(1, 3, 1, 2)")
+ }
+ })
threads.shutdown()
- eventually(Timeout(20.seconds)) {
- // We can not ensure that job1 is started before job2 so here using abs.
- assert(Math.abs(job1StartTime - job2StartTime) < 1000)
- // Job1 minShare is 2(total resource) so that job2 should be allocated tasks after
- // job1 finished.
- assert(job2FinishTime - job1FinishTime >= 1000)
- }
+ threads.awaitTermination(20, SECONDS)
+ // make sure the SparkListener has received the finished events for job1 and job2.
+ KyuubiSparkContextHelper.waitListenerBus(spark)
+ // job1 should be started before job2
+ assert(job1StartTime < job2StartTime)
+ // job2 minShare is 2(total resource) so that job1 should be allocated tasks after
+ // job2 finished.
+ assert(job2FinishTime < job1FinishTime)
} finally {
spark.sparkContext.removeSparkListener(listener)
}
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/SparkEngineRegisterSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/SparkEngineRegisterSuite.scala
new file mode 100644
index 000000000..8c636af76
--- /dev/null
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/SparkEngineRegisterSuite.scala
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.engine.spark
+
+import java.util.UUID
+
+import org.apache.kyuubi.config.KyuubiReservedKeys.{KYUUBI_ENGINE_ID, KYUUBI_ENGINE_URL}
+
+trait SparkEngineRegisterSuite extends WithDiscoverySparkSQLEngine {
+
+ override def withKyuubiConf: Map[String, String] =
+ super.withKyuubiConf ++ Map("spark.ui.enabled" -> "true")
+
+ override val namespace: String = s"/kyuubi/deregister_test/${UUID.randomUUID.toString}"
+
+ test("Spark Engine Register Zookeeper with spark ui info") {
+ withDiscoveryClient(client => {
+ val info = client.getChildren(namespace).head.split(";")
+ assert(info.exists(_.startsWith(KYUUBI_ENGINE_ID)))
+ assert(info.exists(_.startsWith(KYUUBI_ENGINE_URL)))
+ })
+ }
+}
+
+class ZookeeperSparkEngineRegisterSuite extends SparkEngineRegisterSuite
+ with WithEmbeddedZookeeper {
+
+ override def withKyuubiConf: Map[String, String] =
+ super.withKyuubiConf ++ zookeeperConf
+}
+
+class EtcdSparkEngineRegisterSuite extends SparkEngineRegisterSuite
+ with WithEtcdCluster {
+ override def withKyuubiConf: Map[String, String] = super.withKyuubiConf ++ etcdConf
+}
diff --git a/extensions/spark/kyuubi-spark-connector-kudu/src/test/scala/org/apache/kyuubi/spark/connector/kudu/KuduClientSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/SparkTBinaryFrontendServiceSuite.scala
similarity index 70%
rename from extensions/spark/kyuubi-spark-connector-kudu/src/test/scala/org/apache/kyuubi/spark/connector/kudu/KuduClientSuite.scala
rename to externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/SparkTBinaryFrontendServiceSuite.scala
index eebb4719c..5f81e51f8 100644
--- a/extensions/spark/kyuubi-spark-connector-kudu/src/test/scala/org/apache/kyuubi/spark/connector/kudu/KuduClientSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/SparkTBinaryFrontendServiceSuite.scala
@@ -15,18 +15,15 @@
* limitations under the License.
*/
-package org.apache.kyuubi.spark.connector.kudu
+package org.apache.kyuubi.engine.spark
-import org.apache.kudu.client.KuduClient
+import org.apache.hadoop.conf.Configuration
import org.apache.kyuubi.KyuubiFunSuite
-class KuduClientSuite extends KyuubiFunSuite with KuduMixin {
-
- test("kudu client") {
- val builder = new KuduClient.KuduClientBuilder(kuduMasterUrl)
- val kuduClient = builder.build()
-
- assert(kuduClient.findLeaderMasterServer().getPort === kuduMasterPort)
+class SparkTBinaryFrontendServiceSuite extends KyuubiFunSuite {
+ test("new hive conf") {
+ val hiveConf = SparkTBinaryFrontendService.hiveConf(new Configuration())
+ assert(hiveConf.getClass().getName == SparkTBinaryFrontendService.HIVE_CONF_CLASSNAME)
}
}
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/WithSparkSQLEngine.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/WithSparkSQLEngine.scala
index 629a8374b..3b98c2efb 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/WithSparkSQLEngine.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/WithSparkSQLEngine.scala
@@ -21,7 +21,7 @@ import org.apache.spark.sql.SparkSession
import org.apache.kyuubi.{KyuubiFunSuite, Utils}
import org.apache.kyuubi.config.KyuubiConf
-import org.apache.kyuubi.engine.spark.KyuubiSparkUtil.sparkMajorMinorVersion
+import org.apache.kyuubi.engine.spark.KyuubiSparkUtil.SPARK_ENGINE_RUNTIME_VERSION
trait WithSparkSQLEngine extends KyuubiFunSuite {
protected var spark: SparkSession = _
@@ -34,14 +34,8 @@ trait WithSparkSQLEngine extends KyuubiFunSuite {
// Affected by such configuration' default value
// engine.initialize.sql='SHOW DATABASES'
- protected var initJobId: Int = {
- sparkMajorMinorVersion match {
- case (3, minor) if minor >= 2 => 1 // SPARK-35378
- case (3, _) => 0
- case _ =>
- throw new IllegalArgumentException(s"Not Support spark version $sparkMajorMinorVersion")
- }
- }
+ // SPARK-35378
+ protected lazy val initJobId: Int = if (SPARK_ENGINE_RUNTIME_VERSION >= "3.2") 1 else 0
override def beforeAll(): Unit = {
startSparkEngine()
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/ZookeeperShareLevelSparkEngineSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/ZookeeperShareLevelSparkEngineSuite.scala
index 4ef96e61a..f24abb36c 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/ZookeeperShareLevelSparkEngineSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/ZookeeperShareLevelSparkEngineSuite.scala
@@ -19,6 +19,7 @@ package org.apache.kyuubi.engine.spark
import org.apache.kyuubi.config.KyuubiConf.ENGINE_CHECK_INTERVAL
import org.apache.kyuubi.config.KyuubiConf.ENGINE_SHARE_LEVEL
+import org.apache.kyuubi.config.KyuubiConf.ENGINE_SPARK_MAX_INITIAL_WAIT
import org.apache.kyuubi.config.KyuubiConf.ENGINE_SPARK_MAX_LIFETIME
import org.apache.kyuubi.engine.ShareLevel
import org.apache.kyuubi.engine.ShareLevel.ShareLevel
@@ -30,6 +31,7 @@ trait ZookeeperShareLevelSparkEngineSuite
zookeeperConf ++ Map(
ENGINE_SHARE_LEVEL.key -> shareLevel.toString,
ENGINE_SPARK_MAX_LIFETIME.key -> "PT20s",
+ ENGINE_SPARK_MAX_INITIAL_WAIT.key -> "0",
ENGINE_CHECK_INTERVAL.key -> "PT5s")
}
}
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkArrowbasedOperationSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkArrowbasedOperationSuite.scala
index e46456914..d3d4a56d7 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkArrowbasedOperationSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkArrowbasedOperationSuite.scala
@@ -17,13 +17,36 @@
package org.apache.kyuubi.engine.spark.operation
+import java.lang.{Boolean => JBoolean}
import java.sql.Statement
+import java.util.{Locale, Set => JSet}
+import org.apache.spark.{KyuubiSparkContextHelper, TaskContext}
+import org.apache.spark.scheduler.{SparkListener, SparkListenerJobStart}
+import org.apache.spark.sql.{QueryTest, Row, SparkSession}
+import org.apache.spark.sql.catalyst.InternalRow
+import org.apache.spark.sql.catalyst.plans.logical.Project
+import org.apache.spark.sql.execution.{CollectLimitExec, LocalTableScanExec, QueryExecution, SparkPlan}
+import org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec
+import org.apache.spark.sql.execution.exchange.Exchange
+import org.apache.spark.sql.execution.joins.{BroadcastHashJoinExec, SortMergeJoinExec}
+import org.apache.spark.sql.execution.metric.SparkMetricsTestUtils
+import org.apache.spark.sql.functions.col
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.kyuubi.SparkDatasetHelper
+import org.apache.spark.sql.types.StructType
+import org.apache.spark.sql.util.QueryExecutionListener
+
+import org.apache.kyuubi.KyuubiException
import org.apache.kyuubi.config.KyuubiConf
-import org.apache.kyuubi.engine.spark.WithSparkSQLEngine
+import org.apache.kyuubi.engine.spark.{SparkSQLEngine, WithSparkSQLEngine}
+import org.apache.kyuubi.engine.spark.session.SparkSessionImpl
import org.apache.kyuubi.operation.SparkDataTypeTests
+import org.apache.kyuubi.util.reflect.{DynFields, DynMethods}
+import org.apache.kyuubi.util.reflect.ReflectUtils._
-class SparkArrowbasedOperationSuite extends WithSparkSQLEngine with SparkDataTypeTests {
+class SparkArrowbasedOperationSuite extends WithSparkSQLEngine with SparkDataTypeTests
+ with SparkMetricsTestUtils {
override protected def jdbcUrl: String = getJdbcUrl
@@ -35,6 +58,23 @@ class SparkArrowbasedOperationSuite extends WithSparkSQLEngine with SparkDataTyp
override def resultFormat: String = "arrow"
+ override def beforeEach(): Unit = {
+ super.beforeEach()
+ withJdbcStatement() { statement =>
+ checkResultSetFormat(statement, "arrow")
+ }
+ spark.catalog.listTables()
+ .collect()
+ .foreach { table =>
+ if (table.isTemporary) {
+ spark.catalog.dropTempView(table.name)
+ } else {
+ spark.sql(s"DROP TABLE IF EXISTS ${table.name}")
+ }
+ ()
+ }
+ }
+
test("detect resultSet format") {
withJdbcStatement() { statement =>
checkResultSetFormat(statement, "arrow")
@@ -43,7 +83,314 @@ class SparkArrowbasedOperationSuite extends WithSparkSQLEngine with SparkDataTyp
}
}
- def checkResultSetFormat(statement: Statement, expectFormat: String): Unit = {
+ test("Spark session timezone format") {
+ withJdbcStatement() { statement =>
+ def check(expect: String): Unit = {
+ val query =
+ """
+ |SELECT
+ | from_utc_timestamp(
+ | from_unixtime(
+ | 1670404535000 / 1000, 'yyyy-MM-dd HH:mm:ss'
+ | ),
+ | 'GMT+08:00'
+ | )
+ |""".stripMargin
+ val resultSet = statement.executeQuery(query)
+ assert(resultSet.next())
+ assert(resultSet.getString(1) == expect)
+ }
+
+ def setTimeZone(timeZone: String): Unit = {
+ val rs = statement.executeQuery(s"set spark.sql.session.timeZone=$timeZone")
+ assert(rs.next())
+ }
+
+ Seq("true", "false").foreach { timestampAsString =>
+ statement.executeQuery(
+ s"set ${KyuubiConf.ARROW_BASED_ROWSET_TIMESTAMP_AS_STRING.key}=$timestampAsString")
+ checkArrowBasedRowSetTimestampAsString(statement, timestampAsString)
+ setTimeZone("UTC")
+ check("2022-12-07 17:15:35.0")
+ setTimeZone("GMT+8")
+ check("2022-12-08 01:15:35.0")
+ }
+ }
+ }
+
+ test("assign a new execution id for arrow-based result") {
+ val listener = new SQLMetricsListener
+ withJdbcStatement() { statement =>
+ withSparkListener(listener) {
+ val result = statement.executeQuery("select 1 as c1")
+ assert(result.next())
+ assert(result.getInt("c1") == 1)
+ }
+ }
+
+ assert(listener.queryExecution.analyzed.isInstanceOf[Project])
+ }
+
+ test("arrow-based query metrics") {
+ val listener = new SQLMetricsListener
+ withJdbcStatement() { statement =>
+ withSparkListener(listener) {
+ val result = statement.executeQuery("select 1 as c1")
+ assert(result.next())
+ assert(result.getInt("c1") == 1)
+ }
+ }
+
+ val metrics = listener.queryExecution.executedPlan.collectLeaves().head.metrics
+ assert(metrics.contains("numOutputRows"))
+ assert(metrics("numOutputRows").value === 1)
+ }
+
+ test("SparkDatasetHelper.executeArrowBatchCollect should return expect row count") {
+ val returnSize = Seq(
+ 0, // the Spark optimizer guarantees `limit != 0`; this case is just a sanity check
+ 7, // less than one partition
+ 10, // equal to one partition
+ 13, // between one and two partitions, run two jobs
+ 20, // equal to two partitions
+ 29, // between two and three partitions
+ 1000, // all partitions
+ 1001) // more than total row count
+
+ def runAndCheck(sparkPlan: SparkPlan, expectSize: Int): Unit = {
+ val arrowBinary = SparkDatasetHelper.executeArrowBatchCollect(sparkPlan)
+ val rows = fromBatchIterator(
+ arrowBinary.iterator,
+ sparkPlan.schema,
+ "",
+ true,
+ KyuubiSparkContextHelper.dummyTaskContext())
+ assert(rows.size == expectSize)
+ }
+
+ val excludedRules = Seq(
+ "org.apache.spark.sql.catalyst.optimizer.EliminateLimits",
+ "org.apache.spark.sql.catalyst.optimizer.OptimizeLimitZero",
+ "org.apache.spark.sql.execution.adaptive.AQEPropagateEmptyRelation").mkString(",")
+ withSQLConf(
+ SQLConf.OPTIMIZER_EXCLUDED_RULES.key -> excludedRules,
+ SQLConf.ADAPTIVE_OPTIMIZER_EXCLUDED_RULES.key -> excludedRules) {
+ // aqe
+ // outermost AdaptiveSparkPlanExec
+ spark.range(1000)
+ .repartitionByRange(100, col("id"))
+ .createOrReplaceTempView("t_1")
+ spark.sql("select * from t_1")
+ .foreachPartition { p: Iterator[Row] =>
+ assert(p.length == 10)
+ ()
+ }
+ returnSize.foreach { size =>
+ val df = spark.sql(s"select * from t_1 limit $size")
+ val headPlan = df.queryExecution.executedPlan.collectLeaves().head
+ if (SPARK_ENGINE_RUNTIME_VERSION >= "3.2") {
+ assert(headPlan.isInstanceOf[AdaptiveSparkPlanExec])
+ val finalPhysicalPlan =
+ SparkDatasetHelper.finalPhysicalPlan(headPlan.asInstanceOf[AdaptiveSparkPlanExec])
+ assert(finalPhysicalPlan.isInstanceOf[CollectLimitExec])
+ }
+ if (size > 1000) {
+ runAndCheck(df.queryExecution.executedPlan, 1000)
+ } else {
+ runAndCheck(df.queryExecution.executedPlan, size)
+ }
+ }
+
+ // outermost CollectLimitExec
+ spark.range(0, 1000, 1, numPartitions = 100)
+ .createOrReplaceTempView("t_2")
+ spark.sql("select * from t_2")
+ .foreachPartition { p: Iterator[Row] =>
+ assert(p.length == 10)
+ ()
+ }
+ returnSize.foreach { size =>
+ val df = spark.sql(s"select * from t_2 limit $size")
+ val plan = df.queryExecution.executedPlan
+ assert(plan.isInstanceOf[CollectLimitExec])
+ if (size > 1000) {
+ runAndCheck(df.queryExecution.executedPlan, 1000)
+ } else {
+ runAndCheck(df.queryExecution.executedPlan, size)
+ }
+ }
+ }
+ }
+
+ test("aqe should work properly") {
+
+ val s = spark
+ import s.implicits._
+
+ spark.sparkContext.parallelize(
+ (1 to 100).map(i => TestData(i, i.toString))).toDF()
+ .createOrReplaceTempView("testData")
+ spark.sparkContext.parallelize(
+ TestData2(1, 1) ::
+ TestData2(1, 2) ::
+ TestData2(2, 1) ::
+ TestData2(2, 2) ::
+ TestData2(3, 1) ::
+ TestData2(3, 2) :: Nil,
+ 2).toDF()
+ .createOrReplaceTempView("testData2")
+
+ withSQLConf(
+ SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
+ SQLConf.SHUFFLE_PARTITIONS.key -> "5",
+ SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "80") {
+ val (plan, adaptivePlan) = runAdaptiveAndVerifyResult(
+ """
+ |SELECT * FROM(
+ | SELECT * FROM testData join testData2 ON key = a where value = '1'
+ |) LIMIT 1
+ |""".stripMargin)
+ val smj = plan.collect { case smj: SortMergeJoinExec => smj }
+ val bhj = adaptivePlan.collect { case bhj: BroadcastHashJoinExec => bhj }
+ assert(smj.size == 1)
+ assert(bhj.size == 1)
+ }
+ }
+
+ test("result offset support") {
+ assume(SPARK_ENGINE_RUNTIME_VERSION >= "3.4")
+ var numStages = 0
+ val listener = new SparkListener {
+ override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
+ numStages = jobStart.stageInfos.length
+ }
+ }
+ withJdbcStatement() { statement =>
+ withSparkListener(listener) {
+ withPartitionedTable("t_3") {
+ statement.executeQuery("select * from t_3 limit 10 offset 10")
+ }
+ }
+ }
+ // an extra shuffle is introduced if `offset` > 0
+ assert(numStages == 2)
+ }
+
+ test("arrow serialization should not introduce extra shuffle for outermost limit") {
+ var numStages = 0
+ val listener = new SparkListener {
+ override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
+ numStages = jobStart.stageInfos.length
+ }
+ }
+ withJdbcStatement() { statement =>
+ withSparkListener(listener) {
+ withPartitionedTable("t_3") {
+ statement.executeQuery("select * from t_3 limit 1000")
+ }
+ }
+ }
+ // Should be only one stage since there is no shuffle.
+ assert(numStages == 1)
+ }
+
+ test("CommandResultExec should not trigger job") {
+ val listener = new JobCountListener
+ val l2 = new SQLMetricsListener
+ val nodeName = spark.sql("SHOW TABLES").queryExecution.executedPlan.getClass.getName
+ if (SPARK_ENGINE_RUNTIME_VERSION < "3.2") {
+ assert(nodeName == "org.apache.spark.sql.execution.command.ExecutedCommandExec")
+ } else {
+ assert(nodeName == "org.apache.spark.sql.execution.CommandResultExec")
+ }
+ withJdbcStatement("table_1") { statement =>
+ statement.executeQuery("CREATE TABLE table_1 (id bigint) USING parquet")
+ withSparkListener(listener) {
+ withSparkListener(l2) {
+ val resultSet = statement.executeQuery("SHOW TABLES")
+ assert(resultSet.next())
+ assert(resultSet.getString("tableName") == "table_1")
+ }
+ }
+ }
+
+ if (SPARK_ENGINE_RUNTIME_VERSION < "3.2") {
+ // Note that before Spark 3.2, a LocalTableScan SparkPlan will be submitted, and the issue of
+ // preventing LocalTableScan from triggering a job submission was addressed in [KYUUBI #4710].
+ assert(l2.queryExecution.executedPlan.getClass.getName ==
+ "org.apache.spark.sql.execution.LocalTableScanExec")
+ } else {
+ assert(l2.queryExecution.executedPlan.getClass.getName ==
+ "org.apache.spark.sql.execution.CommandResultExec")
+ }
+ assert(listener.numJobs == 0)
+ }
+
+ test("LocalTableScanExec should not trigger job") {
+ val listener = new JobCountListener
+ withJdbcStatement("view_1") { statement =>
+ withSparkListener(listener) {
+ withAllSessions { s =>
+ import s.implicits._
+ Seq((1, "a")).toDF("c1", "c2").createOrReplaceTempView("view_1")
+ val plan = s.sql("select * from view_1").queryExecution.executedPlan
+ assert(plan.isInstanceOf[LocalTableScanExec])
+ }
+ val resultSet = statement.executeQuery("select * from view_1")
+ assert(resultSet.next())
+ assert(!resultSet.next())
+ }
+ }
+ assert(listener.numJobs == 0)
+ }
+
+ test("LocalTableScanExec metrics") {
+ val listener = new SQLMetricsListener
+ withJdbcStatement("view_1") { statement =>
+ withSparkListener(listener) {
+ withAllSessions { s =>
+ import s.implicits._
+ Seq((1, "a")).toDF("c1", "c2").createOrReplaceTempView("view_1")
+ }
+ val result = statement.executeQuery("select * from view_1")
+ assert(result.next())
+ assert(!result.next())
+ }
+ }
+
+ val metrics = listener.queryExecution.executedPlan.collectLeaves().head.metrics
+ assert(metrics.contains("numOutputRows"))
+ assert(metrics("numOutputRows").value === 1)
+ }
+
+ test("post LocalTableScanExec driver-side metrics") {
+ val expectedMetrics = Map(
+ 0L -> (("LocalTableScan", Map("number of output rows" -> "2"))))
+ withTables("view_1") {
+ val s = spark
+ import s.implicits._
+ Seq((1, "a"), (2, "b")).toDF("c1", "c2").createOrReplaceTempView("view_1")
+ val df = spark.sql("SELECT * FROM view_1")
+ val metrics = getSparkPlanMetrics(df)
+ assert(metrics == expectedMetrics)
+ }
+ }
+
+ test("post CommandResultExec driver-side metrics") {
+ spark.sql("show tables").show(truncate = false)
+ assume(SPARK_ENGINE_RUNTIME_VERSION >= "3.2")
+ val expectedMetrics = Map(
+ 0L -> (("CommandResult", Map("number of output rows" -> "2"))))
+ withTables("table_1", "table_2") {
+ spark.sql("CREATE TABLE table_1 (id bigint) USING parquet")
+ spark.sql("CREATE TABLE table_2 (id bigint) USING parquet")
+ val df = spark.sql("SHOW TABLES")
+ val metrics = getSparkPlanMetrics(df)
+ assert(metrics == expectedMetrics)
+ }
+ }
+
+ private def checkResultSetFormat(statement: Statement, expectFormat: String): Unit = {
val query =
s"""
|SELECT '$${hivevar:${KyuubiConf.OPERATION_RESULT_FORMAT.key}}' AS col
@@ -52,4 +399,197 @@ class SparkArrowbasedOperationSuite extends WithSparkSQLEngine with SparkDataTyp
assert(resultSet.next())
assert(resultSet.getString("col") === expectFormat)
}
+
+ private def checkArrowBasedRowSetTimestampAsString(
+ statement: Statement,
+ expect: String): Unit = {
+ val query =
+ s"""
+ |SELECT '$${hivevar:${KyuubiConf.ARROW_BASED_ROWSET_TIMESTAMP_AS_STRING.key}}' AS col
+ |""".stripMargin
+ val resultSet = statement.executeQuery(query)
+ assert(resultSet.next())
+ assert(resultSet.getString("col") === expect)
+ }
+
+ // since each new session has its own listener bus, we should register the listener
+ // in the current session.
+ private def withSparkListener[T](listener: QueryExecutionListener)(body: => T): T = {
+ withAllSessions(s => s.listenerManager.register(listener))
+ try {
+ val result = body
+ KyuubiSparkContextHelper.waitListenerBus(spark)
+ result
+ } finally {
+ withAllSessions(s => s.listenerManager.unregister(listener))
+ }
+ }
+
+ // since each new session has its own listener bus, we should register the listener
+ // in the current session.
+ private def withSparkListener[T](listener: SparkListener)(body: => T): T = {
+ withAllSessions(s => s.sparkContext.addSparkListener(listener))
+ try {
+ val result = body
+ KyuubiSparkContextHelper.waitListenerBus(spark)
+ result
+ } finally {
+ withAllSessions(s => s.sparkContext.removeSparkListener(listener))
+ }
+ }
+
+ private def withPartitionedTable[T](viewName: String)(body: => T): T = {
+ withAllSessions { spark =>
+ spark.range(0, 1000, 1, numPartitions = 100)
+ .createOrReplaceTempView(viewName)
+ }
+ try {
+ body
+ } finally {
+ withAllSessions { spark =>
+ spark.sql(s"DROP VIEW IF EXISTS $viewName")
+ }
+ }
+ }
+
+ private def withAllSessions(op: SparkSession => Unit): Unit = {
+ SparkSQLEngine.currentEngine.get
+ .backendService
+ .sessionManager
+ .allSessions()
+ .map(_.asInstanceOf[SparkSessionImpl].spark)
+ .foreach(op(_))
+ }
+
+ private def runAdaptiveAndVerifyResult(query: String): (SparkPlan, SparkPlan) = {
+ val dfAdaptive = spark.sql(query)
+ val planBefore = dfAdaptive.queryExecution.executedPlan
+ val result = dfAdaptive.collect()
+ withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "false") {
+ val df = spark.sql(query)
+ QueryTest.checkAnswer(df, df.collect().toSeq)
+ }
+ val planAfter = dfAdaptive.queryExecution.executedPlan
+ val adaptivePlan = planAfter.asInstanceOf[AdaptiveSparkPlanExec].executedPlan
+ val exchanges = adaptivePlan.collect {
+ case e: Exchange => e
+ }
+ assert(exchanges.isEmpty, "The final plan should not contain any Exchange node.")
+ (dfAdaptive.queryExecution.sparkPlan, adaptivePlan)
+ }
+
+ /**
+ * Sets all SQL configurations specified in `pairs`, calls `f`, and then restores all SQL
+ * configurations.
+ */
+ protected def withSQLConf(pairs: (String, String)*)(f: => Unit): Unit = {
+ val conf = SQLConf.get
+ val (keys, values) = pairs.unzip
+ val currentValues = keys.map { key =>
+ if (conf.contains(key)) {
+ Some(conf.getConfString(key))
+ } else {
+ None
+ }
+ }
+ (keys, values).zipped.foreach { (k, v) =>
+ if (isStaticConfigKey(k)) {
+ throw new KyuubiException(s"Cannot modify the value of a static config: $k")
+ }
+ conf.setConfString(k, v)
+ }
+ try f
+ finally {
+ keys.zip(currentValues).foreach {
+ case (key, Some(value)) => conf.setConfString(key, value)
+ case (key, None) => conf.unsetConf(key)
+ }
+ }
+ }
+
+ private def withTables[T](tableNames: String*)(f: => T): T = {
+ try {
+ f
+ } finally {
+ tableNames.foreach { name =>
+ if (name.toUpperCase(Locale.ROOT).startsWith("VIEW")) {
+ spark.sql(s"DROP VIEW IF EXISTS $name")
+ } else {
+ spark.sql(s"DROP TABLE IF EXISTS $name")
+ }
+ }
+ }
+ }
+
+ /**
+ * This method provides a reflection-based implementation of [[SQLConf.isStaticConfigKey]] to
+ * support Spark 3.1.x
+ *
+ * TODO: Once we drop support for Spark 3.1.x, we can directly call
+ * [[SQLConf.isStaticConfigKey()]].
+ */
+ private def isStaticConfigKey(key: String): Boolean =
+ getField[JSet[String]]((SQLConf.getClass, SQLConf), "staticConfKeys").contains(key)
+
+ // the signature of function [[ArrowConverters.fromBatchIterator]] is changed in SPARK-43528
+ // (since Spark 3.5)
+ private lazy val fromBatchIteratorMethod = DynMethods.builder("fromBatchIterator")
+ .hiddenImpl( // for Spark 3.4 or earlier
+ "org.apache.spark.sql.execution.arrow.ArrowConverters$",
+ classOf[Iterator[Array[Byte]]],
+ classOf[StructType],
+ classOf[String],
+ classOf[TaskContext])
+ .hiddenImpl( // for Spark 3.5 or later
+ "org.apache.spark.sql.execution.arrow.ArrowConverters$",
+ classOf[Iterator[Array[Byte]]],
+ classOf[StructType],
+ classOf[String],
+ classOf[Boolean],
+ classOf[TaskContext])
+ .build()
+
+ def fromBatchIterator(
+ arrowBatchIter: Iterator[Array[Byte]],
+ schema: StructType,
+ timeZoneId: String,
+ errorOnDuplicatedFieldNames: JBoolean,
+ context: TaskContext): Iterator[InternalRow] = {
+ val className = "org.apache.spark.sql.execution.arrow.ArrowConverters$"
+ val instance = DynFields.builder().impl(className, "MODULE$").build[Object]().get(null)
+ if (SPARK_ENGINE_RUNTIME_VERSION >= "3.5") {
+ fromBatchIteratorMethod.invoke[Iterator[InternalRow]](
+ instance,
+ arrowBatchIter,
+ schema,
+ timeZoneId,
+ errorOnDuplicatedFieldNames,
+ context)
+ } else {
+ fromBatchIteratorMethod.invoke[Iterator[InternalRow]](
+ instance,
+ arrowBatchIter,
+ schema,
+ timeZoneId,
+ context)
+ }
+ }
+
+ class JobCountListener extends SparkListener {
+ var numJobs = 0
+ override def onJobStart(jobStart: SparkListenerJobStart): Unit = {
+ numJobs += 1
+ }
+ }
+
+ class SQLMetricsListener extends QueryExecutionListener {
+ var queryExecution: QueryExecution = _
+ override def onSuccess(funcName: String, qe: QueryExecution, durationNs: Long): Unit = {
+ queryExecution = qe
+ }
+ override def onFailure(funcName: String, qe: QueryExecution, exception: Exception): Unit = {}
+ }
}
+
+case class TestData(key: Int, value: String)
+case class TestData2(a: Int, b: Int)
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkCatalogDatabaseOperationSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkCatalogDatabaseOperationSuite.scala
index 46208bff1..5ee01bda1 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkCatalogDatabaseOperationSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkCatalogDatabaseOperationSuite.scala
@@ -22,7 +22,7 @@ import org.apache.spark.sql.util.CaseInsensitiveStringMap
import org.apache.kyuubi.config.KyuubiConf.ENGINE_OPERATION_CONVERT_CATALOG_DATABASE_ENABLED
import org.apache.kyuubi.engine.spark.WithSparkSQLEngine
-import org.apache.kyuubi.engine.spark.shim.SparkCatalogShim
+import org.apache.kyuubi.engine.spark.util.SparkCatalogUtils
import org.apache.kyuubi.operation.HiveJDBCTestHelper
class SparkCatalogDatabaseOperationSuite extends WithSparkSQLEngine with HiveJDBCTestHelper {
@@ -37,7 +37,7 @@ class SparkCatalogDatabaseOperationSuite extends WithSparkSQLEngine with HiveJDB
test("set/get current catalog") {
withJdbcStatement() { statement =>
val catalog = statement.getConnection.getCatalog
- assert(catalog == SparkCatalogShim.SESSION_CATALOG)
+ assert(catalog == SparkCatalogUtils.SESSION_CATALOG)
statement.getConnection.setCatalog("dummy")
val changedCatalog = statement.getConnection.getCatalog
assert(changedCatalog == "dummy")
@@ -61,7 +61,7 @@ class DummyCatalog extends CatalogPlugin {
_name = name
}
- private var _name: String = null
+ private var _name: String = _
override def name(): String = _name
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkOperationSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkOperationSuite.scala
index 30bbf8b77..adab0231d 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkOperationSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/operation/SparkOperationSuite.scala
@@ -32,14 +32,14 @@ import org.apache.spark.sql.catalyst.analysis.FunctionRegistry
import org.apache.spark.sql.types._
import org.apache.kyuubi.config.KyuubiConf
-import org.apache.kyuubi.engine.SemanticVersion
import org.apache.kyuubi.engine.spark.WithSparkSQLEngine
import org.apache.kyuubi.engine.spark.schema.SchemaHelper.TIMESTAMP_NTZ
-import org.apache.kyuubi.engine.spark.shim.SparkCatalogShim
+import org.apache.kyuubi.engine.spark.util.SparkCatalogUtils
+import org.apache.kyuubi.jdbc.hive.KyuubiStatement
import org.apache.kyuubi.operation.{HiveMetadataTests, SparkQueryTests}
import org.apache.kyuubi.operation.meta.ResultSetSchemaConstant._
import org.apache.kyuubi.util.KyuubiHadoopUtils
-import org.apache.kyuubi.util.SparkVersionUtil.isSparkVersionAtLeast
+import org.apache.kyuubi.util.SemanticVersion
class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with SparkQueryTests {
@@ -50,7 +50,7 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
withJdbcStatement() { statement =>
val meta = statement.getConnection.getMetaData
val types = meta.getTableTypes
- val expected = SparkCatalogShim.sparkTableTypes.toIterator
+ val expected = SparkCatalogUtils.sparkTableTypes.toIterator
while (types.next()) {
assert(types.getString(TABLE_TYPE) === expected.next())
}
@@ -93,12 +93,12 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
.add("c17", "struct", nullable = true, "17")
// since spark3.3.0
- if (SPARK_ENGINE_VERSION >= "3.3") {
+ if (SPARK_ENGINE_RUNTIME_VERSION >= "3.3") {
schema = schema.add("c18", "interval day", nullable = true, "18")
.add("c19", "interval year", nullable = true, "19")
}
// since spark3.4.0
- if (SPARK_ENGINE_VERSION >= "3.4") {
+ if (SPARK_ENGINE_RUNTIME_VERSION >= "3.4") {
schema = schema.add("c20", "timestamp_ntz", nullable = true, "20")
}
@@ -144,7 +144,7 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
var pos = 0
while (rowSet.next()) {
- assert(rowSet.getString(TABLE_CAT) === SparkCatalogShim.SESSION_CATALOG)
+ assert(rowSet.getString(TABLE_CAT) === SparkCatalogUtils.SESSION_CATALOG)
assert(rowSet.getString(TABLE_SCHEM) === defaultSchema)
assert(rowSet.getString(TABLE_NAME) === tableName)
assert(rowSet.getString(COLUMN_NAME) === schema(pos).name)
@@ -202,7 +202,7 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
val data = statement.getConnection.getMetaData
val rowSet = data.getColumns("", "global_temp", viewName, null)
while (rowSet.next()) {
- assert(rowSet.getString(TABLE_CAT) === SparkCatalogShim.SESSION_CATALOG)
+ assert(rowSet.getString(TABLE_CAT) === SparkCatalogUtils.SESSION_CATALOG)
assert(rowSet.getString(TABLE_SCHEM) === "global_temp")
assert(rowSet.getString(TABLE_NAME) === viewName)
assert(rowSet.getString(COLUMN_NAME) === "i")
@@ -229,7 +229,7 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
val data = statement.getConnection.getMetaData
val rowSet = data.getColumns("", "global_temp", viewName, "n")
while (rowSet.next()) {
- assert(rowSet.getString(TABLE_CAT) === SparkCatalogShim.SESSION_CATALOG)
+ assert(rowSet.getString(TABLE_CAT) === SparkCatalogUtils.SESSION_CATALOG)
assert(rowSet.getString(TABLE_SCHEM) === "global_temp")
assert(rowSet.getString(TABLE_NAME) === viewName)
assert(rowSet.getString(COLUMN_NAME) === "n")
@@ -307,28 +307,28 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
val tFetchResultsReq1 = new TFetchResultsReq(opHandle, TFetchOrientation.FETCH_NEXT, 1)
val tFetchResultsResp1 = client.FetchResults(tFetchResultsReq1)
assert(tFetchResultsResp1.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
- val idSeq1 = tFetchResultsResp1.getResults.getColumns.get(0).getI64Val.getValues.asScala.toSeq
+ val idSeq1 = tFetchResultsResp1.getResults.getColumns.get(0).getI64Val.getValues.asScala
assertResult(Seq(0L))(idSeq1)
// fetch next from first row
val tFetchResultsReq2 = new TFetchResultsReq(opHandle, TFetchOrientation.FETCH_NEXT, 1)
val tFetchResultsResp2 = client.FetchResults(tFetchResultsReq2)
assert(tFetchResultsResp2.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
- val idSeq2 = tFetchResultsResp2.getResults.getColumns.get(0).getI64Val.getValues.asScala.toSeq
+ val idSeq2 = tFetchResultsResp2.getResults.getColumns.get(0).getI64Val.getValues.asScala
assertResult(Seq(1L))(idSeq2)
// fetch prior from second row, expected got first row
val tFetchResultsReq3 = new TFetchResultsReq(opHandle, TFetchOrientation.FETCH_PRIOR, 1)
val tFetchResultsResp3 = client.FetchResults(tFetchResultsReq3)
assert(tFetchResultsResp3.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
- val idSeq3 = tFetchResultsResp3.getResults.getColumns.get(0).getI64Val.getValues.asScala.toSeq
+ val idSeq3 = tFetchResultsResp3.getResults.getColumns.get(0).getI64Val.getValues.asScala
assertResult(Seq(0L))(idSeq3)
// fetch first
val tFetchResultsReq4 = new TFetchResultsReq(opHandle, TFetchOrientation.FETCH_FIRST, 3)
val tFetchResultsResp4 = client.FetchResults(tFetchResultsReq4)
assert(tFetchResultsResp4.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
- val idSeq4 = tFetchResultsResp4.getResults.getColumns.get(0).getI64Val.getValues.asScala.toSeq
+ val idSeq4 = tFetchResultsResp4.getResults.getColumns.get(0).getI64Val.getValues.asScala
assertResult(Seq(0L, 1L))(idSeq4)
}
}
@@ -350,7 +350,7 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
val tFetchResultsResp1 = client.FetchResults(tFetchResultsReq1)
assert(tFetchResultsResp1.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
val idSeq1 = tFetchResultsResp1.getResults.getColumns.get(0)
- .getI64Val.getValues.asScala.toSeq
+ .getI64Val.getValues.asScala
assertResult(Seq(0L))(idSeq1)
// fetch next from first row
@@ -358,7 +358,7 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
val tFetchResultsResp2 = client.FetchResults(tFetchResultsReq2)
assert(tFetchResultsResp2.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
val idSeq2 = tFetchResultsResp2.getResults.getColumns.get(0)
- .getI64Val.getValues.asScala.toSeq
+ .getI64Val.getValues.asScala
assertResult(Seq(1L))(idSeq2)
// fetch prior from second row, expected got first row
@@ -366,7 +366,7 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
val tFetchResultsResp3 = client.FetchResults(tFetchResultsReq3)
assert(tFetchResultsResp3.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
val idSeq3 = tFetchResultsResp3.getResults.getColumns.get(0)
- .getI64Val.getValues.asScala.toSeq
+ .getI64Val.getValues.asScala
assertResult(Seq(0L))(idSeq3)
// fetch first
@@ -374,7 +374,7 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
val tFetchResultsResp4 = client.FetchResults(tFetchResultsReq4)
assert(tFetchResultsResp4.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
val idSeq4 = tFetchResultsResp4.getResults.getColumns.get(0)
- .getI64Val.getValues.asScala.toSeq
+ .getI64Val.getValues.asScala
assertResult(Seq(0L, 1L))(idSeq4)
}
}
@@ -511,7 +511,7 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
val status = tOpenSessionResp.getStatus
val errorMessage = status.getErrorMessage
assert(status.getStatusCode === TStatusCode.ERROR_STATUS)
- if (isSparkVersionAtLeast("3.4")) {
+ if (SPARK_ENGINE_RUNTIME_VERSION >= "3.4") {
assert(errorMessage.contains("[SCHEMA_NOT_FOUND]"))
assert(errorMessage.contains(s"The schema `$dbName` cannot be found."))
} else {
@@ -729,6 +729,14 @@ class SparkOperationSuite extends WithSparkSQLEngine with HiveMetadataTests with
}
}
+ test("KYUUBI #5030: Support get query id in Spark engine") {
+ withJdbcStatement() { stmt =>
+ stmt.executeQuery("SELECT 1")
+ val queryId = stmt.asInstanceOf[KyuubiStatement].getQueryId
+ assert(queryId != null && queryId.nonEmpty)
+ }
+ }
+
private def whenMetaStoreURIsSetTo(uris: String)(func: String => Unit): Unit = {
val conf = spark.sparkContext.hadoopConfiguration
val origin = conf.get("hive.metastore.uris", "")
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/schema/RowSetSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/schema/RowSetSuite.scala
index 803eea3e6..5d2ba4a0d 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/schema/RowSetSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/schema/RowSetSuite.scala
@@ -20,7 +20,7 @@ package org.apache.kyuubi.engine.spark.schema
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
import java.sql.{Date, Timestamp}
-import java.time.{Instant, LocalDate, ZoneId}
+import java.time.{Instant, LocalDate}
import scala.collection.JavaConverters._
@@ -30,7 +30,6 @@ import org.apache.spark.sql.types._
import org.apache.spark.unsafe.types.CalendarInterval
import org.apache.kyuubi.KyuubiFunSuite
-import org.apache.kyuubi.engine.spark.schema.RowSet.toHiveString
class RowSetSuite extends KyuubiFunSuite {
@@ -97,10 +96,9 @@ class RowSetSuite extends KyuubiFunSuite {
.add("q", "timestamp")
private val rows: Seq[Row] = (0 to 10).map(genRow) ++ Seq(Row.fromSeq(Seq.fill(17)(null)))
- private val zoneId: ZoneId = ZoneId.systemDefault()
test("column based set") {
- val tRowSet = RowSet.toColumnBasedSet(rows, schema, zoneId)
+ val tRowSet = RowSet.toColumnBasedSet(rows, schema)
assert(tRowSet.getColumns.size() === schema.size)
assert(tRowSet.getRowsSize === 0)
@@ -159,22 +157,22 @@ class RowSetSuite extends KyuubiFunSuite {
val decCol = cols.next().getStringVal
decCol.getValues.asScala.zipWithIndex.foreach {
- case (b, 11) => assert(b.isEmpty)
+ case (b, 11) => assert(b === "NULL")
case (b, i) => assert(b === s"$i.$i")
}
val dateCol = cols.next().getStringVal
dateCol.getValues.asScala.zipWithIndex.foreach {
- case (b, 11) => assert(b.isEmpty)
+ case (b, 11) => assert(b === "NULL")
case (b, i) =>
- assert(b === toHiveString((Date.valueOf(s"2018-11-${i + 1}"), DateType), zoneId))
+ assert(b === RowSet.toHiveString(Date.valueOf(s"2018-11-${i + 1}") -> DateType))
}
val tsCol = cols.next().getStringVal
tsCol.getValues.asScala.zipWithIndex.foreach {
- case (b, 11) => assert(b.isEmpty)
+ case (b, 11) => assert(b === "NULL")
case (b, i) => assert(b ===
- toHiveString((Timestamp.valueOf(s"2018-11-17 13:33:33.$i"), TimestampType), zoneId))
+ RowSet.toHiveString(Timestamp.valueOf(s"2018-11-17 13:33:33.$i") -> TimestampType))
}
val binCol = cols.next().getBinaryVal
@@ -185,29 +183,27 @@ class RowSetSuite extends KyuubiFunSuite {
val arrCol = cols.next().getStringVal
arrCol.getValues.asScala.zipWithIndex.foreach {
- case (b, 11) => assert(b === "")
- case (b, i) => assert(b === toHiveString(
- (Array.fill(i)(java.lang.Double.valueOf(s"$i.$i")).toSeq, ArrayType(DoubleType)),
- zoneId))
+ case (b, 11) => assert(b === "NULL")
+ case (b, i) => assert(b === RowSet.toHiveString(
+ Array.fill(i)(java.lang.Double.valueOf(s"$i.$i")).toSeq -> ArrayType(DoubleType)))
}
val mapCol = cols.next().getStringVal
mapCol.getValues.asScala.zipWithIndex.foreach {
- case (b, 11) => assert(b === "")
- case (b, i) => assert(b === toHiveString(
- (Map(i -> java.lang.Double.valueOf(s"$i.$i")), MapType(IntegerType, DoubleType)),
- zoneId))
+ case (b, 11) => assert(b === "NULL")
+ case (b, i) => assert(b === RowSet.toHiveString(
+ Map(i -> java.lang.Double.valueOf(s"$i.$i")) -> MapType(IntegerType, DoubleType)))
}
val intervalCol = cols.next().getStringVal
intervalCol.getValues.asScala.zipWithIndex.foreach {
- case (b, 11) => assert(b === "")
+ case (b, 11) => assert(b === "NULL")
case (b, i) => assert(b === new CalendarInterval(i, i, i).toString)
}
}
test("row based set") {
- val tRowSet = RowSet.toRowBasedSet(rows, schema, zoneId)
+ val tRowSet = RowSet.toRowBasedSet(rows, schema)
assert(tRowSet.getColumnCount === 0)
assert(tRowSet.getRowsSize === rows.size)
val iter = tRowSet.getRowsIterator
@@ -237,7 +233,7 @@ class RowSetSuite extends KyuubiFunSuite {
assert(r6.get(9).getStringVal.getValue === "2018-11-06")
val r7 = iter.next().getColVals
- assert(r7.get(10).getStringVal.getValue === "2018-11-17 13:33:33.600")
+ assert(r7.get(10).getStringVal.getValue === "2018-11-17 13:33:33.6")
assert(r7.get(11).getStringVal.getValue === new String(
Array.fill[Byte](6)(6.toByte),
StandardCharsets.UTF_8))
@@ -245,7 +241,7 @@ class RowSetSuite extends KyuubiFunSuite {
val r8 = iter.next().getColVals
assert(r8.get(12).getStringVal.getValue === Array.fill(7)(7.7d).mkString("[", ",", "]"))
assert(r8.get(13).getStringVal.getValue ===
- toHiveString((Map(7 -> 7.7d), MapType(IntegerType, DoubleType)), zoneId))
+ RowSet.toHiveString(Map(7 -> 7.7d) -> MapType(IntegerType, DoubleType)))
val r9 = iter.next().getColVals
assert(r9.get(14).getStringVal.getValue === new CalendarInterval(8, 8, 8).toString)
@@ -253,7 +249,7 @@ class RowSetSuite extends KyuubiFunSuite {
test("to row set") {
TProtocolVersion.values().foreach { proto =>
- val set = RowSet.toTRowSet(rows, schema, proto, zoneId)
+ val set = RowSet.toTRowSet(rows, schema, proto)
if (proto.getValue < TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V6.getValue) {
assert(!set.isSetColumns, proto.toString)
assert(set.isSetRows, proto.toString)
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/session/SessionSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/session/SessionSuite.scala
index 5e0b6c28e..b89c560b3 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/session/SessionSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/session/SessionSuite.scala
@@ -27,7 +27,9 @@ import org.apache.kyuubi.service.ServiceState._
class SessionSuite extends WithSparkSQLEngine with HiveJDBCTestHelper {
override def withKyuubiConf: Map[String, String] = {
- Map(ENGINE_SHARE_LEVEL.key -> "CONNECTION")
+ Map(
+ ENGINE_SHARE_LEVEL.key -> "CONNECTION",
+ ENGINE_SPARK_MAX_INITIAL_WAIT.key -> "0")
}
override protected def beforeEach(): Unit = {
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/udf/KyuubiDefinedFunctionSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/udf/KyuubiDefinedFunctionSuite.scala
index dc0513ed3..7a3f8c940 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/udf/KyuubiDefinedFunctionSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/engine/spark/udf/KyuubiDefinedFunctionSuite.scala
@@ -19,26 +19,23 @@ package org.apache.kyuubi.engine.spark.udf
import java.nio.file.Paths
-import scala.collection.mutable.ArrayBuffer
+import org.apache.kyuubi.{KyuubiFunSuite, MarkdownBuilder, Utils}
+import org.apache.kyuubi.util.GoldenFileUtils._
-import org.apache.kyuubi.{KyuubiFunSuite, TestUtils, Utils}
-
-// scalastyle:off line.size.limit
/**
* End-to-end test cases for configuration doc file
- * The golden result file is "docs/sql/functions.md".
+ * The golden result file is "docs/extensions/engines/spark/functions.md".
*
* To run the entire test suite:
* {{{
- * build/mvn clean test -pl externals/kyuubi-spark-sql-engine -am -Pflink-provided,spark-provided,hive-provided -DwildcardSuites=org.apache.kyuubi.engine.spark.udf.KyuubiDefinedFunctionSuite
+ * KYUUBI_UPDATE=0 dev/gen/gen_spark_kdf_docs.sh
* }}}
*
* To re-generate golden files for entire suite, run:
* {{{
- * KYUUBI_UPDATE=1 build/mvn clean test -pl externals/kyuubi-spark-sql-engine -am -Pflink-provided,spark-provided,hive-provided -DwildcardSuites=org.apache.kyuubi.engine.spark.udf.KyuubiDefinedFunctionSuite
+ * dev/gen/gen_spark_kdf_docs.sh
* }}}
*/
-// scalastyle:on line.size.limit
class KyuubiDefinedFunctionSuite extends KyuubiFunSuite {
private val kyuubiHome: String = Utils.getCodeSourceLocation(getClass)
@@ -48,45 +45,20 @@ class KyuubiDefinedFunctionSuite extends KyuubiFunSuite {
.toAbsolutePath
test("verify or update kyuubi spark sql functions") {
- val newOutput = new ArrayBuffer[String]()
- newOutput += ""
- newOutput += ""
- newOutput += ""
- newOutput += ""
- newOutput += ""
- newOutput += "# Auxiliary SQL Functions"
- newOutput += ""
- newOutput += "Kyuubi provides several auxiliary SQL functions as supplement to Spark's " +
- "[Built-in Functions](https://spark.apache.org/docs/latest/api/sql/index.html#" +
- "built-in-functions)"
- newOutput += ""
- newOutput += "Name | Description | Return Type | Since"
- newOutput += "--- | --- | --- | ---"
- KDFRegistry
+ val builder = MarkdownBuilder(licenced = true, getClass.getName)
+
+ builder += "# Auxiliary SQL Functions" +=
+ """Kyuubi provides several auxiliary SQL functions as supplement to Spark's
+ | [Built-in Functions](https://spark.apache.org/docs/latest/api/sql/index.html#
+ |built-in-functions)""" ++=
+ """
+ | Name | Description | Return Type | Since
+ | --- | --- | --- | ---
+ |"""
KDFRegistry.registeredFunctions.foreach { func =>
- newOutput += s"${func.name} | ${func.description} | ${func.returnType} | ${func.since}"
+ builder += s"${func.name} | ${func.description} | ${func.returnType} | ${func.since}"
}
- newOutput += ""
- TestUtils.verifyOutput(
- markdown,
- newOutput,
- getClass.getCanonicalName,
- "externals/kyuubi-spark-sql-engine")
+
+ verifyOrRegenerateGoldenFile(markdown, builder.toMarkdown, "dev/gen/gen_spark_kdf_docs.sh")
}
}
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/jdbc/KyuubiHiveDriverSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/jdbc/KyuubiHiveDriverSuite.scala
index 4d3c75498..ae68440df 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/jdbc/KyuubiHiveDriverSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/kyuubi/jdbc/KyuubiHiveDriverSuite.scala
@@ -22,7 +22,7 @@ import java.util.Properties
import org.apache.kyuubi.IcebergSuiteMixin
import org.apache.kyuubi.engine.spark.WithSparkSQLEngine
-import org.apache.kyuubi.engine.spark.shim.SparkCatalogShim
+import org.apache.kyuubi.engine.spark.util.SparkCatalogUtils
import org.apache.kyuubi.jdbc.hive.{KyuubiConnection, KyuubiStatement}
import org.apache.kyuubi.tags.IcebergTest
@@ -47,15 +47,15 @@ class KyuubiHiveDriverSuite extends WithSparkSQLEngine with IcebergSuiteMixin {
val metaData = connection.getMetaData
assert(metaData.getClass.getName === "org.apache.kyuubi.jdbc.hive.KyuubiDatabaseMetaData")
val statement = connection.createStatement()
- val table1 = s"${SparkCatalogShim.SESSION_CATALOG}.default.kyuubi_hive_jdbc"
+ val table1 = s"${SparkCatalogUtils.SESSION_CATALOG}.default.kyuubi_hive_jdbc"
val table2 = s"$catalog.default.hdp_cat_tbl"
try {
statement.execute(s"CREATE TABLE $table1(key int) USING parquet")
statement.execute(s"CREATE TABLE $table2(key int) USING $format")
- val resultSet1 = metaData.getTables(SparkCatalogShim.SESSION_CATALOG, "default", "%", null)
+ val resultSet1 = metaData.getTables(SparkCatalogUtils.SESSION_CATALOG, "default", "%", null)
assert(resultSet1.next())
- assert(resultSet1.getString(1) === SparkCatalogShim.SESSION_CATALOG)
+ assert(resultSet1.getString(1) === SparkCatalogUtils.SESSION_CATALOG)
assert(resultSet1.getString(2) === "default")
assert(resultSet1.getString(3) === "kyuubi_hive_jdbc")
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/KyuubiSparkContextHelper.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/KyuubiSparkContextHelper.scala
new file mode 100644
index 000000000..1b662eadf
--- /dev/null
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/KyuubiSparkContextHelper.scala
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+import org.apache.spark.sql.SparkSession
+
+/**
+ * A place to invoke non-public APIs of [[SparkContext]], for test only.
+ */
+object KyuubiSparkContextHelper {
+
+ def waitListenerBus(spark: SparkSession): Unit = {
+ spark.sparkContext.listenerBus.waitUntilEmpty()
+ }
+
+ def dummyTaskContext(): TaskContextImpl = TaskContext.empty()
+}
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/kyuubi/SQLOperationListenerSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/kyuubi/SQLOperationListenerSuite.scala
index 04277fca4..f732f7c38 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/kyuubi/SQLOperationListenerSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/kyuubi/SQLOperationListenerSuite.scala
@@ -22,13 +22,16 @@ import scala.collection.JavaConverters.asScalaBufferConverter
import org.apache.hive.service.rpc.thrift.{TExecuteStatementReq, TFetchOrientation, TFetchResultsReq, TOperationHandle}
import org.scalatest.time.SpanSugar._
+import org.apache.kyuubi.config.KyuubiConf
import org.apache.kyuubi.config.KyuubiConf.OPERATION_SPARK_LISTENER_ENABLED
import org.apache.kyuubi.engine.spark.WithSparkSQLEngine
import org.apache.kyuubi.operation.HiveJDBCTestHelper
class SQLOperationListenerSuite extends WithSparkSQLEngine with HiveJDBCTestHelper {
- override def withKyuubiConf: Map[String, String] = Map.empty
+ override def withKyuubiConf: Map[String, String] = Map(
+ KyuubiConf.ENGINE_SPARK_SHOW_PROGRESS.key -> "true",
+ KyuubiConf.ENGINE_SPARK_SHOW_PROGRESS_UPDATE_INTERVAL.key -> "200")
override protected def jdbcUrl: String = getJdbcUrl
@@ -54,6 +57,24 @@ class SQLOperationListenerSuite extends WithSparkSQLEngine with HiveJDBCTestHelp
}
}
+ test("operation listener with progress job info") {
+ val sql = "SELECT java_method('java.lang.Thread', 'sleep', 10000l) FROM range(1, 3, 1, 2);"
+ withSessionHandle { (client, handle) =>
+ val req = new TExecuteStatementReq()
+ req.setSessionHandle(handle)
+ req.setStatement(sql)
+ val tExecuteStatementResp = client.ExecuteStatement(req)
+ val opHandle = tExecuteStatementResp.getOperationHandle
+ val fetchResultsReq = new TFetchResultsReq(opHandle, TFetchOrientation.FETCH_NEXT, 1000)
+ fetchResultsReq.setFetchType(1.toShort)
+ eventually(timeout(90.seconds), interval(500.milliseconds)) {
+ val resultsResp = client.FetchResults(fetchResultsReq)
+ val logs = resultsResp.getResults.getColumns.get(0).getStringVal.getValues.asScala
+ assert(logs.exists(_.matches(".*\\[Job .* Stages\\] \\[Stage .*\\]")))
+ }
+ }
+ }
+
test("SQLOperationListener configurable") {
val sql = "select /*+ REPARTITION(3, a) */ a from values(1) t(a);"
withSessionHandle { (client, handle) =>
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/kyuubi/SparkSQLEngineDeregisterSuite.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/kyuubi/SparkSQLEngineDeregisterSuite.scala
index 8dc93759b..4dddcd4ee 100644
--- a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/kyuubi/SparkSQLEngineDeregisterSuite.scala
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/kyuubi/SparkSQLEngineDeregisterSuite.scala
@@ -24,9 +24,8 @@ import org.apache.spark.sql.internal.SQLConf.ANSI_ENABLED
import org.scalatest.time.SpanSugar.convertIntToGrainOfTime
import org.apache.kyuubi.config.KyuubiConf._
-import org.apache.kyuubi.engine.spark.KyuubiSparkUtil.sparkMajorMinorVersion
-import org.apache.kyuubi.engine.spark.WithDiscoverySparkSQLEngine
-import org.apache.kyuubi.engine.spark.WithEmbeddedZookeeper
+import org.apache.kyuubi.engine.spark.{WithDiscoverySparkSQLEngine, WithEmbeddedZookeeper}
+import org.apache.kyuubi.engine.spark.KyuubiSparkUtil.SPARK_ENGINE_RUNTIME_VERSION
import org.apache.kyuubi.service.ServiceState
abstract class SparkSQLEngineDeregisterSuite
@@ -61,10 +60,11 @@ abstract class SparkSQLEngineDeregisterSuite
class SparkSQLEngineDeregisterExceptionSuite extends SparkSQLEngineDeregisterSuite {
override def withKyuubiConf: Map[String, String] = {
super.withKyuubiConf ++ Map(ENGINE_DEREGISTER_EXCEPTION_CLASSES.key -> {
- sparkMajorMinorVersion match {
+ if (SPARK_ENGINE_RUNTIME_VERSION >= "3.3") {
// see https://issues.apache.org/jira/browse/SPARK-35958
- case (3, minor) if minor > 2 => "org.apache.spark.SparkArithmeticException"
- case _ => classOf[ArithmeticException].getCanonicalName
+ "org.apache.spark.SparkArithmeticException"
+ } else {
+ classOf[ArithmeticException].getCanonicalName
}
})
@@ -94,10 +94,11 @@ class SparkSQLEngineDeregisterExceptionTTLSuite
zookeeperConf ++ Map(
ANSI_ENABLED.key -> "true",
ENGINE_DEREGISTER_EXCEPTION_CLASSES.key -> {
- sparkMajorMinorVersion match {
+ if (SPARK_ENGINE_RUNTIME_VERSION >= "3.3") {
// see https://issues.apache.org/jira/browse/SPARK-35958
- case (3, minor) if minor > 2 => "org.apache.spark.SparkArithmeticException"
- case _ => classOf[ArithmeticException].getCanonicalName
+ "org.apache.spark.SparkArithmeticException"
+ } else {
+ classOf[ArithmeticException].getCanonicalName
}
},
ENGINE_DEREGISTER_JOB_MAX_FAILURES.key -> maxJobFailures.toString,
diff --git a/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/sql/execution/metric/SparkMetricsTestUtils.scala b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/sql/execution/metric/SparkMetricsTestUtils.scala
new file mode 100644
index 000000000..7ab06f0ef
--- /dev/null
+++ b/externals/kyuubi-spark-sql-engine/src/test/scala/org/apache/spark/sql/execution/metric/SparkMetricsTestUtils.scala
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.metric
+
+import org.apache.spark.sql.DataFrame
+import org.apache.spark.sql.execution.SparkPlanInfo
+import org.apache.spark.sql.execution.ui.SparkPlanGraph
+import org.apache.spark.sql.kyuubi.SparkDatasetHelper
+
+import org.apache.kyuubi.engine.spark.WithSparkSQLEngine
+
+trait SparkMetricsTestUtils {
+ this: WithSparkSQLEngine =>
+
+ private lazy val statusStore = spark.sharedState.statusStore
+ private def currentExecutionIds(): Set[Long] = {
+ spark.sparkContext.listenerBus.waitUntilEmpty(10000)
+ statusStore.executionsList.map(_.executionId).toSet
+ }
+
+ protected def getSparkPlanMetrics(df: DataFrame): Map[Long, (String, Map[String, Any])] = {
+ val previousExecutionIds = currentExecutionIds()
+ SparkDatasetHelper.executeCollect(df)
+ spark.sparkContext.listenerBus.waitUntilEmpty(10000)
+ val executionIds = currentExecutionIds().diff(previousExecutionIds)
+ assert(executionIds.size === 1)
+ val executionId = executionIds.head
+ val metricValues = statusStore.executionMetrics(executionId)
+ SparkPlanGraph(SparkPlanInfo.fromSparkPlan(df.queryExecution.executedPlan)).allNodes
+ .map { node =>
+ val nodeMetrics = node.metrics.map { metric =>
+ val metricValue = metricValues(metric.accumulatorId)
+ (metric.name, metricValue)
+ }.toMap
+ (node.id, node.name -> nodeMetrics)
+ }.toMap
+ }
+}
diff --git a/externals/kyuubi-trino-engine/pom.xml b/externals/kyuubi-trino-engine/pom.xml
index 7e2f67370..7d91e4a86 100644
--- a/externals/kyuubi-trino-engine/pom.xml
+++ b/externals/kyuubi-trino-engine/pom.xml
@@ -21,11 +21,11 @@
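<parent>
<groupId>org.apache.kyuubi</groupId>
<artifactId>kyuubi-parent</artifactId>
- <version>1.7.0-SNAPSHOT</version>
+ <version>1.9.0-SNAPSHOT</version>
<relativePath>../../pom.xml</relativePath>
</parent>
- <artifactId>kyuubi-trino-engine_2.12</artifactId>
+ <artifactId>kyuubi-trino-engine_${scala.binary.version}</artifactId>
<packaging>jar</packaging>
<name>Kyuubi Project Engine Trino</name>
<url>https://kyuubi.apache.org/</url>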
diff --git a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/ExecuteStatement.scala b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/ExecuteStatement.scala
index eb1b27300..3e7cce80c 100644
--- a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/ExecuteStatement.scala
+++ b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/ExecuteStatement.scala
@@ -19,7 +19,7 @@ package org.apache.kyuubi.engine.trino.operation
import java.util.concurrent.RejectedExecutionException
-import org.apache.hive.service.rpc.thrift.TRowSet
+import org.apache.hive.service.rpc.thrift.TFetchResultsResp
import org.apache.kyuubi.{KyuubiSQLException, Logging}
import org.apache.kyuubi.engine.trino.TrinoStatement
@@ -82,7 +82,9 @@ class ExecuteStatement(
}
}
- override def getNextRowSet(order: FetchOrientation, rowSetSize: Int): TRowSet = {
+ override def getNextRowSetInternal(
+ order: FetchOrientation,
+ rowSetSize: Int): TFetchResultsResp = {
validateDefaultFetchOrientation(order)
assertState(OperationState.FINISHED)
setHasResultSet(true)
@@ -97,7 +99,10 @@ class ExecuteStatement(
val taken = iter.take(rowSetSize)
val resultRowSet = RowSet.toTRowSet(taken.toList, schema, getProtocolVersion)
resultRowSet.setStartRowOffset(iter.getPosition)
- resultRowSet
+ val fetchResultsResp = new TFetchResultsResp(OK_STATUS)
+ fetchResultsResp.setResults(resultRowSet)
+ fetchResultsResp.setHasMoreRows(false)
+ fetchResultsResp
}
private def executeStatement(trinoStatement: TrinoStatement): Unit = {
diff --git a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/GetCurrentCatalog.scala b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/GetCurrentCatalog.scala
index 3d8c7fd6c..504a53a41 100644
--- a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/GetCurrentCatalog.scala
+++ b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/GetCurrentCatalog.scala
@@ -23,11 +23,16 @@ import io.trino.client.ClientStandardTypes.VARCHAR
import io.trino.client.ClientTypeSignature.VARCHAR_UNBOUNDED_LENGTH
import org.apache.kyuubi.operation.IterableFetchIterator
+import org.apache.kyuubi.operation.log.OperationLog
import org.apache.kyuubi.session.Session
class GetCurrentCatalog(session: Session)
extends TrinoOperation(session) {
+ private val operationLog: OperationLog = OperationLog.createOperationLog(session, getHandle)
+
+ override def getOperationLog: Option[OperationLog] = Option(operationLog)
+
override protected def runInternal(): Unit = {
try {
val session = trinoContext.clientSession.get
diff --git a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/GetCurrentDatabase.scala b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/GetCurrentDatabase.scala
index 3bf2987b4..3ab598ef0 100644
--- a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/GetCurrentDatabase.scala
+++ b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/GetCurrentDatabase.scala
@@ -23,11 +23,16 @@ import io.trino.client.ClientStandardTypes.VARCHAR
import io.trino.client.ClientTypeSignature.VARCHAR_UNBOUNDED_LENGTH
import org.apache.kyuubi.operation.IterableFetchIterator
+import org.apache.kyuubi.operation.log.OperationLog
import org.apache.kyuubi.session.Session
class GetCurrentDatabase(session: Session)
extends TrinoOperation(session) {
+ private val operationLog: OperationLog = OperationLog.createOperationLog(session, getHandle)
+
+ override def getOperationLog: Option[OperationLog] = Option(operationLog)
+
override protected def runInternal(): Unit = {
try {
val session = trinoContext.clientSession.get
diff --git a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/SetCurrentCatalog.scala b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/SetCurrentCatalog.scala
index 09ba4262f..16836b0a9 100644
--- a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/SetCurrentCatalog.scala
+++ b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/SetCurrentCatalog.scala
@@ -19,11 +19,16 @@ package org.apache.kyuubi.engine.trino.operation
import io.trino.client.ClientSession
+import org.apache.kyuubi.operation.log.OperationLog
import org.apache.kyuubi.session.Session
class SetCurrentCatalog(session: Session, catalog: String)
extends TrinoOperation(session) {
+ private val operationLog: OperationLog = OperationLog.createOperationLog(session, getHandle)
+
+ override def getOperationLog: Option[OperationLog] = Option(operationLog)
+
override protected def runInternal(): Unit = {
try {
val session = trinoContext.clientSession.get
diff --git a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/SetCurrentDatabase.scala b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/SetCurrentDatabase.scala
index f25cc9e0c..aa4697f5f 100644
--- a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/SetCurrentDatabase.scala
+++ b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/SetCurrentDatabase.scala
@@ -19,11 +19,16 @@ package org.apache.kyuubi.engine.trino.operation
import io.trino.client.ClientSession
+import org.apache.kyuubi.operation.log.OperationLog
import org.apache.kyuubi.session.Session
class SetCurrentDatabase(session: Session, database: String)
extends TrinoOperation(session) {
+ private val operationLog: OperationLog = OperationLog.createOperationLog(session, getHandle)
+
+ override def getOperationLog: Option[OperationLog] = Option(operationLog)
+
override protected def runInternal(): Unit = {
try {
val session = trinoContext.clientSession.get
diff --git a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/TrinoOperation.scala b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/TrinoOperation.scala
index 6e40f65f2..11eaa1bc1 100644
--- a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/TrinoOperation.scala
+++ b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/operation/TrinoOperation.scala
@@ -21,7 +21,7 @@ import java.io.IOException
import io.trino.client.Column
import io.trino.client.StatementClient
-import org.apache.hive.service.rpc.thrift.{TGetResultSetMetadataResp, TRowSet}
+import org.apache.hive.service.rpc.thrift.{TFetchResultsResp, TGetResultSetMetadataResp}
import org.apache.kyuubi.KyuubiSQLException
import org.apache.kyuubi.Utils
@@ -54,7 +54,9 @@ abstract class TrinoOperation(session: Session) extends AbstractOperation(sessio
resp
}
- override def getNextRowSet(order: FetchOrientation, rowSetSize: Int): TRowSet = {
+ override def getNextRowSetInternal(
+ order: FetchOrientation,
+ rowSetSize: Int): TFetchResultsResp = {
validateDefaultFetchOrientation(order)
assertState(OperationState.FINISHED)
setHasResultSet(true)
@@ -66,7 +68,10 @@ abstract class TrinoOperation(session: Session) extends AbstractOperation(sessio
val taken = iter.take(rowSetSize)
val resultRowSet = RowSet.toTRowSet(taken.toList, schema, getProtocolVersion)
resultRowSet.setStartRowOffset(iter.getPosition)
- resultRowSet
+ val resp = new TFetchResultsResp(OK_STATUS)
+ resp.setResults(resultRowSet)
+ resp.setHasMoreRows(false)
+ resp
}
override protected def beforeRun(): Unit = {
@@ -75,7 +80,7 @@ abstract class TrinoOperation(session: Session) extends AbstractOperation(sessio
}
override protected def afterRun(): Unit = {
- state.synchronized {
+ withLockRequired {
if (!isTerminalState(state)) {
setState(OperationState.FINISHED)
}
@@ -108,7 +113,7 @@ abstract class TrinoOperation(session: Session) extends AbstractOperation(sessio
// could be thrown.
case e: Throwable =>
if (cancel && trino.isRunning) trino.cancelLeafStage()
- state.synchronized {
+ withLockRequired {
val errMsg = Utils.stringifyException(e)
if (state == OperationState.TIMEOUT) {
val ke = KyuubiSQLException(s"Timeout operating $opType: $errMsg")
diff --git a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/session/TrinoSessionImpl.scala b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/session/TrinoSessionImpl.scala
index a19d74d58..0b3ac01a9 100644
--- a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/session/TrinoSessionImpl.scala
+++ b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/session/TrinoSessionImpl.scala
@@ -22,19 +22,23 @@ import java.time.ZoneId
import java.util.{Collections, Locale, Optional}
import java.util.concurrent.TimeUnit
+import scala.collection.JavaConverters._
+
import io.airlift.units.Duration
import io.trino.client.ClientSession
+import io.trino.client.OkHttpUtil
import okhttp3.OkHttpClient
import org.apache.hive.service.rpc.thrift.{TGetInfoType, TGetInfoValue, TProtocolVersion}
import org.apache.kyuubi.KyuubiSQLException
import org.apache.kyuubi.Utils.currentUser
import org.apache.kyuubi.config.{KyuubiConf, KyuubiReservedKeys}
+import org.apache.kyuubi.config.KyuubiReservedKeys.KYUUBI_SESSION_HANDLE_KEY
import org.apache.kyuubi.engine.trino.{TrinoConf, TrinoContext, TrinoStatement}
import org.apache.kyuubi.engine.trino.event.TrinoSessionEvent
import org.apache.kyuubi.events.EventBus
import org.apache.kyuubi.operation.{Operation, OperationHandle}
-import org.apache.kyuubi.session.{AbstractSession, SessionManager}
+import org.apache.kyuubi.session.{AbstractSession, SessionHandle, SessionManager, USE_CATALOG, USE_DATABASE}
class TrinoSessionImpl(
protocol: TProtocolVersion,
@@ -45,47 +49,53 @@ class TrinoSessionImpl(
sessionManager: SessionManager)
extends AbstractSession(protocol, user, password, ipAddress, conf, sessionManager) {
+ val sessionConf: KyuubiConf = sessionManager.getConf
+
+ override val handle: SessionHandle =
+ conf.get(KYUUBI_SESSION_HANDLE_KEY).map(SessionHandle.fromUUID).getOrElse(SessionHandle())
+
+ private val username: String = sessionConf
+ .getOption(KyuubiReservedKeys.KYUUBI_SESSION_USER_KEY).getOrElse(currentUser)
+
var trinoContext: TrinoContext = _
private var clientSession: ClientSession = _
- private var catalogName: String = null
- private var databaseName: String = null
-
+ private var catalogName: String = _
+ private var databaseName: String = _
private val sessionEvent = TrinoSessionEvent(this)
override def open(): Unit = {
- normalizedConf.foreach {
- case ("use:catalog", catalog) => catalogName = catalog
- case ("use:database", database) => databaseName = database
- case _ => // do nothing
+
+ val (useCatalogAndDatabaseConf, _) = normalizedConf.partition { case (k, _) =>
+ Array(USE_CATALOG, USE_DATABASE).contains(k)
}
- val httpClient = new OkHttpClient.Builder().build()
+ useCatalogAndDatabaseConf.foreach {
+ case (USE_CATALOG, catalog) => catalogName = catalog
+ case (USE_DATABASE, database) => databaseName = database
+ }
+ if (catalogName == null) {
+ catalogName = sessionConf.get(KyuubiConf.ENGINE_TRINO_CONNECTION_CATALOG)
+ .getOrElse(throw KyuubiSQLException("Trino default catalog can not be null!"))
+ }
clientSession = createClientSession()
- trinoContext = TrinoContext(httpClient, clientSession)
+ trinoContext = TrinoContext(createHttpClient(), clientSession)
super.open()
EventBus.post(sessionEvent)
}
private def createClientSession(): ClientSession = {
- val sessionConf = sessionManager.getConf
val connectionUrl = sessionConf.get(KyuubiConf.ENGINE_TRINO_CONNECTION_URL).getOrElse(
throw KyuubiSQLException("Trino server url can not be null!"))
- if (catalogName == null) {
- catalogName = sessionConf.get(
- KyuubiConf.ENGINE_TRINO_CONNECTION_CATALOG).getOrElse(
- throw KyuubiSQLException("Trino default catalog can not be null!"))
- }
-
- val user = sessionConf
- .getOption(KyuubiReservedKeys.KYUUBI_SESSION_USER_KEY).getOrElse(currentUser)
val clientRequestTimeout = sessionConf.get(TrinoConf.CLIENT_REQUEST_TIMEOUT)
+ val properties = getTrinoSessionConf(sessionConf).asJava
+
new ClientSession(
URI.create(connectionUrl),
- user,
+ username,
Optional.empty(),
"kyuubi",
Optional.empty(),
@@ -98,7 +108,7 @@ class TrinoSessionImpl(
Locale.getDefault,
Collections.emptyMap(),
Collections.emptyMap(),
- Collections.emptyMap(),
+ properties,
Collections.emptyMap(),
Collections.emptyMap(),
null,
@@ -106,6 +116,37 @@ class TrinoSessionImpl(
true)
}
+ private def createHttpClient(): OkHttpClient = {
+ val keystorePath = sessionConf.get(KyuubiConf.ENGINE_TRINO_CONNECTION_KEYSTORE_PATH)
+ val keystorePassword = sessionConf.get(KyuubiConf.ENGINE_TRINO_CONNECTION_KEYSTORE_PASSWORD)
+ val keystoreType = sessionConf.get(KyuubiConf.ENGINE_TRINO_CONNECTION_KEYSTORE_TYPE)
+ val truststorePath = sessionConf.get(KyuubiConf.ENGINE_TRINO_CONNECTION_TRUSTSTORE_PATH)
+ val truststorePassword = sessionConf.get(KyuubiConf.ENGINE_TRINO_CONNECTION_TRUSTSTORE_PASSWORD)
+ val truststoreType = sessionConf.get(KyuubiConf.ENGINE_TRINO_CONNECTION_TRUSTSTORE_TYPE)
+
+ val serverScheme = clientSession.getServer.getScheme
+
+ val builder = new OkHttpClient.Builder()
+
+ OkHttpUtil.setupSsl(
+ builder,
+ Optional.ofNullable(keystorePath.orNull),
+ Optional.ofNullable(keystorePassword.orNull),
+ Optional.ofNullable(keystoreType.orNull),
+ Optional.ofNullable(truststorePath.orNull),
+ Optional.ofNullable(truststorePassword.orNull),
+ Optional.ofNullable(truststoreType.orNull))
+
+ sessionConf.get(KyuubiConf.ENGINE_TRINO_CONNECTION_PASSWORD).foreach { password =>
+ require(
+ serverScheme.equalsIgnoreCase("https"),
+ "Trino engine using username/password requires HTTPS to be enabled")
+ builder.addInterceptor(OkHttpUtil.basicAuth(username, password))
+ }
+
+ builder.build()
+ }
+
override protected def runOperation(operation: Operation): OperationHandle = {
sessionEvent.totalOperations += 1
super.runOperation(operation)
@@ -133,6 +174,12 @@ class TrinoSessionImpl(
resultSet.next().head.toString
}
+ private def getTrinoSessionConf(sessionConf: KyuubiConf): Map[String, String] = {
+ val trinoSessionConf = sessionConf.getAll.filterKeys(_.startsWith("trino."))
+ .map { case (k, v) => (k.stripPrefix("trino."), v) }
+ trinoSessionConf.toMap
+ }
+
override def close(): Unit = {
sessionEvent.endTime = System.currentTimeMillis()
EventBus.post(sessionEvent)
diff --git a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/session/TrinoSessionManager.scala b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/session/TrinoSessionManager.scala
index 6d56d5c05..e18b8f758 100644
--- a/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/session/TrinoSessionManager.scala
+++ b/externals/kyuubi-trino-engine/src/main/scala/org/apache/kyuubi/engine/trino/session/TrinoSessionManager.scala
@@ -20,6 +20,7 @@ package org.apache.kyuubi.engine.trino.session
import org.apache.hive.service.rpc.thrift.TProtocolVersion
import org.apache.kyuubi.config.KyuubiConf.ENGINE_SHARE_LEVEL
+import org.apache.kyuubi.config.KyuubiReservedKeys.KYUUBI_SESSION_HANDLE_KEY
import org.apache.kyuubi.engine.ShareLevel
import org.apache.kyuubi.engine.trino.TrinoSqlEngine
import org.apache.kyuubi.engine.trino.operation.TrinoOperationManager
@@ -36,7 +37,10 @@ class TrinoSessionManager
password: String,
ipAddress: String,
conf: Map[String, String]): Session = {
- new TrinoSessionImpl(protocol, user, password, ipAddress, conf, this)
+ conf.get(KYUUBI_SESSION_HANDLE_KEY).map(SessionHandle.fromUUID).flatMap(
+ getSessionOption).getOrElse {
+ new TrinoSessionImpl(protocol, user, password, ipAddress, conf, this)
+ }
}
override def closeSession(sessionHandle: SessionHandle): Unit = {
diff --git a/externals/kyuubi-trino-engine/src/test/resources/log4j2-test.xml b/externals/kyuubi-trino-engine/src/test/resources/log4j2-test.xml
index bfc40dd6d..3110216c1 100644
--- a/externals/kyuubi-trino-engine/src/test/resources/log4j2-test.xml
+++ b/externals/kyuubi-trino-engine/src/test/resources/log4j2-test.xml
@@ -21,14 +21,14 @@
-
+
-
+
diff --git a/externals/kyuubi-trino-engine/src/test/scala/org/apache/kyuubi/engine/trino/TrinoStatementSuite.scala b/externals/kyuubi-trino-engine/src/test/scala/org/apache/kyuubi/engine/trino/TrinoStatementSuite.scala
index fc9f1af5f..dec753ad4 100644
--- a/externals/kyuubi-trino-engine/src/test/scala/org/apache/kyuubi/engine/trino/TrinoStatementSuite.scala
+++ b/externals/kyuubi-trino-engine/src/test/scala/org/apache/kyuubi/engine/trino/TrinoStatementSuite.scala
@@ -30,15 +30,15 @@ class TrinoStatementSuite extends WithTrinoContainerServer {
assert(schema.size === 1)
assert(schema(0).getName === "_col0")
- assert(resultSet.toIterator.hasNext)
- assert(resultSet.toIterator.next() === List(1))
+ assert(resultSet.hasNext)
+ assert(resultSet.next() === List(1))
val trinoStatement2 = TrinoStatement(trinoContext, kyuubiConf, "show schemas")
val schema2 = trinoStatement2.getColumns
val resultSet2 = trinoStatement2.execute()
assert(schema2.size === 1)
- assert(resultSet2.toIterator.hasNext)
+ assert(resultSet2.hasNext)
}
}
diff --git a/externals/kyuubi-trino-engine/src/test/scala/org/apache/kyuubi/engine/trino/operation/TrinoOperationSuite.scala b/externals/kyuubi-trino-engine/src/test/scala/org/apache/kyuubi/engine/trino/operation/TrinoOperationSuite.scala
index a6f125af5..90939a3e4 100644
--- a/externals/kyuubi-trino-engine/src/test/scala/org/apache/kyuubi/engine/trino/operation/TrinoOperationSuite.scala
+++ b/externals/kyuubi-trino-engine/src/test/scala/org/apache/kyuubi/engine/trino/operation/TrinoOperationSuite.scala
@@ -590,14 +590,14 @@ class TrinoOperationSuite extends WithTrinoEngine with TrinoQueryTests {
val tFetchResultsReq1 = new TFetchResultsReq(opHandle, TFetchOrientation.FETCH_NEXT, 1)
val tFetchResultsResp1 = client.FetchResults(tFetchResultsReq1)
assert(tFetchResultsResp1.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
- val idSeq1 = tFetchResultsResp1.getResults.getColumns.get(0).getI32Val.getValues.asScala.toSeq
+ val idSeq1 = tFetchResultsResp1.getResults.getColumns.get(0).getI32Val.getValues.asScala
assertResult(Seq(0L))(idSeq1)
// fetch next from first row
val tFetchResultsReq2 = new TFetchResultsReq(opHandle, TFetchOrientation.FETCH_NEXT, 1)
val tFetchResultsResp2 = client.FetchResults(tFetchResultsReq2)
assert(tFetchResultsResp2.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
- val idSeq2 = tFetchResultsResp2.getResults.getColumns.get(0).getI32Val.getValues.asScala.toSeq
+ val idSeq2 = tFetchResultsResp2.getResults.getColumns.get(0).getI32Val.getValues.asScala
assertResult(Seq(1L))(idSeq2)
val tFetchResultsReq3 = new TFetchResultsReq(opHandle, TFetchOrientation.FETCH_PRIOR, 1)
@@ -607,7 +607,7 @@ class TrinoOperationSuite extends WithTrinoEngine with TrinoQueryTests {
} else {
assert(tFetchResultsResp3.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
val idSeq3 =
- tFetchResultsResp3.getResults.getColumns.get(0).getI32Val.getValues.asScala.toSeq
+ tFetchResultsResp3.getResults.getColumns.get(0).getI32Val.getValues.asScala
assertResult(Seq(0L))(idSeq3)
}
@@ -618,7 +618,7 @@ class TrinoOperationSuite extends WithTrinoEngine with TrinoQueryTests {
} else {
assert(tFetchResultsResp4.getStatus.getStatusCode === TStatusCode.SUCCESS_STATUS)
val idSeq4 =
- tFetchResultsResp4.getResults.getColumns.get(0).getI32Val.getValues.asScala.toSeq
+ tFetchResultsResp4.getResults.getColumns.get(0).getI32Val.getValues.asScala
assertResult(Seq(0L, 1L))(idSeq4)
}
}
@@ -771,8 +771,8 @@ class TrinoOperationSuite extends WithTrinoEngine with TrinoQueryTests {
assert(schema.size === 1)
assert(schema(0).getName === "_col0")
- assert(resultSet.toIterator.hasNext)
- version = resultSet.toIterator.next().head.toString
+ assert(resultSet.hasNext)
+ version = resultSet.next().head.toString
}
version
}
diff --git a/integration-tests/kyuubi-flink-it/pom.xml b/integration-tests/kyuubi-flink-it/pom.xml
index 7f9a84a85..15699be1d 100644
--- a/integration-tests/kyuubi-flink-it/pom.xml
+++ b/integration-tests/kyuubi-flink-it/pom.xml
@@ -21,11 +21,11 @@
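<groupId>org.apache.kyuubi</groupId>
<artifactId>integration-tests</artifactId>
- <version>1.7.0-SNAPSHOT</version>
+ <version>1.9.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
- <artifactId>kyuubi-flink-it_2.12</artifactId>
+ <artifactId>kyuubi-flink-it_${scala.binary.version}</artifactId>
<name>Kyuubi Test Flink SQL IT</name>
<url>https://kyuubi.apache.org/</url>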
@@ -75,10 +75,45 @@
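<groupId>org.apache.flink</groupId>
- <artifactId>flink-table-runtime${flink.module.scala.suffix}</artifactId>
+ <artifactId>flink-table-runtime</artifactId>
+ <scope>test</scope>
+
+
+
+
+ <groupId>org.apache.hadoop</groupId>
+ <artifactId>hadoop-client-minicluster</artifactId>
+ <scope>test</scope>
+
+
+
+ <groupId>org.bouncycastle</groupId>
+ <artifactId>bcprov-jdk15on</artifactId>
+ <scope>test</scope>
+
+
+
+ <groupId>org.bouncycastle</groupId>
+ <artifactId>bcpkix-jdk15on</artifactId>
+ <scope>test</scope>
+
+
+
+ <groupId>jakarta.activation</groupId>
+ <artifactId>jakarta.activation-api</artifactId>
+ <scope>test</scope>
+
+
+
+ <groupId>jakarta.xml.bind</groupId>
+ <artifactId>jakarta.xml.bind-api</artifactId>
<scope>test</scope>
+
+ <outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
+ <testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>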
+
diff --git a/integration-tests/kyuubi-flink-it/src/test/resources/log4j2-test.xml b/integration-tests/kyuubi-flink-it/src/test/resources/log4j2-test.xml
index bfc40dd6d..3110216c1 100644
--- a/integration-tests/kyuubi-flink-it/src/test/resources/log4j2-test.xml
+++ b/integration-tests/kyuubi-flink-it/src/test/resources/log4j2-test.xml
@@ -21,14 +21,14 @@
-
+
-
+
diff --git a/integration-tests/kyuubi-flink-it/src/test/scala/org/apache/kyuubi/it/flink/WithKyuubiServerAndYarnMiniCluster.scala b/integration-tests/kyuubi-flink-it/src/test/scala/org/apache/kyuubi/it/flink/WithKyuubiServerAndYarnMiniCluster.scala
new file mode 100644
index 000000000..de9a8ae2d
--- /dev/null
+++ b/integration-tests/kyuubi-flink-it/src/test/scala/org/apache/kyuubi/it/flink/WithKyuubiServerAndYarnMiniCluster.scala
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.it.flink
+
+import java.io.{File, FileWriter}
+import java.nio.file.Paths
+
+import org.apache.hadoop.yarn.conf.YarnConfiguration
+
+import org.apache.kyuubi.{KyuubiFunSuite, Utils, WithKyuubiServer}
+import org.apache.kyuubi.config.KyuubiConf
+import org.apache.kyuubi.config.KyuubiConf.KYUUBI_ENGINE_ENV_PREFIX
+import org.apache.kyuubi.server.{MiniDFSService, MiniYarnService}
+
+trait WithKyuubiServerAndYarnMiniCluster extends KyuubiFunSuite with WithKyuubiServer {
+
+ val kyuubiHome: String = Utils.getCodeSourceLocation(getClass).split("integration-tests").head
+
+ override protected val conf: KyuubiConf = new KyuubiConf(false)
+
+ protected var miniHdfsService: MiniDFSService = _
+
+ protected var miniYarnService: MiniYarnService = _
+
+ private val yarnConf: YarnConfiguration = {
+ val yarnConfig = new YarnConfiguration()
+
+ // configurations copied from org.apache.flink.yarn.YarnTestBase
+ yarnConfig.setInt(YarnConfiguration.RM_SCHEDULER_MINIMUM_ALLOCATION_MB, 32)
+ yarnConfig.setInt(YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB, 4096)
+
+ yarnConfig.setBoolean(YarnConfiguration.RM_SCHEDULER_INCLUDE_PORT_IN_NODE_NAME, true)
+ yarnConfig.setInt(YarnConfiguration.RM_AM_MAX_ATTEMPTS, 2)
+ yarnConfig.setInt(YarnConfiguration.RM_MAX_COMPLETED_APPLICATIONS, 2)
+ yarnConfig.setInt(YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES, 4)
+ yarnConfig.setInt(YarnConfiguration.DEBUG_NM_DELETE_DELAY_SEC, 3600)
+ yarnConfig.setBoolean(YarnConfiguration.LOG_AGGREGATION_ENABLED, false)
+ // memory is overwritten in the MiniYARNCluster.
+ // so we have to change the number of cores for testing.
+ yarnConfig.setInt(YarnConfiguration.NM_VCORES, 666)
+ yarnConfig.setFloat(YarnConfiguration.NM_MAX_PER_DISK_UTILIZATION_PERCENTAGE, 99.0f)
+ yarnConfig.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_RETRY_INTERVAL_MS, 1000)
+ yarnConfig.setInt(YarnConfiguration.RESOURCEMANAGER_CONNECT_MAX_WAIT_MS, 5000)
+
+ // capacity-scheduler.xml is missing in hadoop-client-minicluster so this is a workaround
+ yarnConfig.set("yarn.scheduler.capacity.root.queues", "default,four_cores_queue")
+
+ yarnConfig.setInt("yarn.scheduler.capacity.root.default.capacity", 100)
+ yarnConfig.setFloat("yarn.scheduler.capacity.root.default.user-limit-factor", 1)
+ yarnConfig.setInt("yarn.scheduler.capacity.root.default.maximum-capacity", 100)
+ yarnConfig.set("yarn.scheduler.capacity.root.default.state", "RUNNING")
+ yarnConfig.set("yarn.scheduler.capacity.root.default.acl_submit_applications", "*")
+ yarnConfig.set("yarn.scheduler.capacity.root.default.acl_administer_queue", "*")
+
+ yarnConfig.setInt("yarn.scheduler.capacity.root.four_cores_queue.maximum-capacity", 100)
+ yarnConfig.setInt("yarn.scheduler.capacity.root.four_cores_queue.maximum-applications", 10)
+ yarnConfig.setInt("yarn.scheduler.capacity.root.four_cores_queue.maximum-allocation-vcores", 4)
+ yarnConfig.setFloat("yarn.scheduler.capacity.root.four_cores_queue.user-limit-factor", 1)
+ yarnConfig.set("yarn.scheduler.capacity.root.four_cores_queue.acl_submit_applications", "*")
+ yarnConfig.set("yarn.scheduler.capacity.root.four_cores_queue.acl_administer_queue", "*")
+
+ yarnConfig.setInt("yarn.scheduler.capacity.node-locality-delay", -1)
+ // Set bind host to localhost to avoid java.net.BindException
+ yarnConfig.set(YarnConfiguration.RM_BIND_HOST, "localhost")
+ yarnConfig.set(YarnConfiguration.NM_BIND_HOST, "localhost")
+
+ yarnConfig
+ }
+
+ override def beforeAll(): Unit = {
+ miniHdfsService = new MiniDFSService()
+ miniHdfsService.initialize(conf)
+ miniHdfsService.start()
+
+ val hdfsServiceUrl = s"hdfs://localhost:${miniHdfsService.getDFSPort}"
+ yarnConf.set("fs.defaultFS", hdfsServiceUrl)
+ yarnConf.addResource(miniHdfsService.getHadoopConf)
+
+ val cp = System.getProperty("java.class.path")
+ // exclude kyuubi flink engine jar that has SPI for EmbeddedExecutorFactory
+ // which can't be initialized on the client side
+ val hadoopJars = cp.split(":").filter(s => !s.contains("flink"))
+ val hadoopClasspath = hadoopJars.mkString(":")
+ yarnConf.set("yarn.application.classpath", hadoopClasspath)
+
+ miniYarnService = new MiniYarnService()
+ miniYarnService.setYarnConf(yarnConf)
+ miniYarnService.initialize(conf)
+ miniYarnService.start()
+
+ val hadoopConfDir = Utils.createTempDir().toFile
+ val writer = new FileWriter(new File(hadoopConfDir, "core-site.xml"))
+ yarnConf.writeXml(writer)
+ writer.close()
+
+ val flinkHome = {
+ val candidates = Paths.get(kyuubiHome, "externals", "kyuubi-download", "target")
+ .toFile.listFiles(f => f.getName.contains("flink"))
+ if (candidates == null) None else candidates.map(_.toPath).headOption
+ }
+ if (flinkHome.isEmpty) {
+ throw new IllegalStateException(s"Flink home not found in $kyuubiHome/externals")
+ }
+
+ conf.set(s"$KYUUBI_ENGINE_ENV_PREFIX.KYUUBI_HOME", kyuubiHome)
+ conf.set(s"$KYUUBI_ENGINE_ENV_PREFIX.FLINK_HOME", flinkHome.get.toString)
+ conf.set(
+ s"$KYUUBI_ENGINE_ENV_PREFIX.FLINK_CONF_DIR",
+ s"${flinkHome.get.toString}${File.separator}conf")
+ conf.set(s"$KYUUBI_ENGINE_ENV_PREFIX.HADOOP_CLASSPATH", hadoopClasspath)
+ conf.set(s"$KYUUBI_ENGINE_ENV_PREFIX.HADOOP_CONF_DIR", hadoopConfDir.getAbsolutePath)
+ conf.set(s"flink.containerized.master.env.HADOOP_CLASSPATH", hadoopClasspath)
+ conf.set(s"flink.containerized.master.env.HADOOP_CONF_DIR", hadoopConfDir.getAbsolutePath)
+ conf.set(s"flink.containerized.taskmanager.env.HADOOP_CONF_DIR", hadoopConfDir.getAbsolutePath)
+
+ super.beforeAll()
+ }
+
+ override def afterAll(): Unit = {
+ super.afterAll()
+ if (miniYarnService != null) {
+ miniYarnService.stop()
+ miniYarnService = null
+ }
+ if (miniHdfsService != null) {
+ miniHdfsService.stop()
+ miniHdfsService = null
+ }
+ }
+}
diff --git a/integration-tests/kyuubi-flink-it/src/test/scala/org/apache/kyuubi/it/flink/operation/FlinkOperationSuite.scala b/integration-tests/kyuubi-flink-it/src/test/scala/org/apache/kyuubi/it/flink/operation/FlinkOperationSuite.scala
index 893e0020a..55476bfd0 100644
--- a/integration-tests/kyuubi-flink-it/src/test/scala/org/apache/kyuubi/it/flink/operation/FlinkOperationSuite.scala
+++ b/integration-tests/kyuubi-flink-it/src/test/scala/org/apache/kyuubi/it/flink/operation/FlinkOperationSuite.scala
@@ -31,7 +31,7 @@ class FlinkOperationSuite extends WithKyuubiServerAndFlinkMiniCluster
override val conf: KyuubiConf = KyuubiConf()
.set(s"$KYUUBI_ENGINE_ENV_PREFIX.$KYUUBI_HOME", kyuubiHome)
.set(ENGINE_TYPE, "FLINK_SQL")
- .set("flink.parallelism.default", "6")
+ .set("flink.parallelism.default", "2")
override protected def jdbcUrl: String = getJdbcUrl
@@ -72,7 +72,7 @@ class FlinkOperationSuite extends WithKyuubiServerAndFlinkMiniCluster
var success = false
while (resultSet.next() && !success) {
if (resultSet.getString(1) == "parallelism.default" &&
- resultSet.getString(2) == "6") {
+ resultSet.getString(2) == "2") {
success = true
}
}
diff --git a/integration-tests/kyuubi-flink-it/src/test/scala/org/apache/kyuubi/it/flink/operation/FlinkOperationSuiteOnYarn.scala b/integration-tests/kyuubi-flink-it/src/test/scala/org/apache/kyuubi/it/flink/operation/FlinkOperationSuiteOnYarn.scala
new file mode 100644
index 000000000..ee6b9bb98
--- /dev/null
+++ b/integration-tests/kyuubi-flink-it/src/test/scala/org/apache/kyuubi/it/flink/operation/FlinkOperationSuiteOnYarn.scala
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.it.flink.operation
+
+import org.apache.hive.service.rpc.thrift.{TGetInfoReq, TGetInfoType}
+
+import org.apache.kyuubi.config.KyuubiConf
+import org.apache.kyuubi.config.KyuubiConf._
+import org.apache.kyuubi.it.flink.WithKyuubiServerAndYarnMiniCluster
+import org.apache.kyuubi.operation.HiveJDBCTestHelper
+import org.apache.kyuubi.operation.meta.ResultSetSchemaConstant.TABLE_CAT
+
+class FlinkOperationSuiteOnYarn extends WithKyuubiServerAndYarnMiniCluster
+ with HiveJDBCTestHelper {
+
+ override protected def jdbcUrl: String = {
+ // delay the access to thrift service because the thrift service
+ // may not be ready although it's registered
+ Thread.sleep(3000L)
+ getJdbcUrl
+ }
+
+ override def beforeAll(): Unit = {
+ conf
+ .set(s"$KYUUBI_ENGINE_ENV_PREFIX.$KYUUBI_HOME", kyuubiHome)
+ .set(ENGINE_TYPE, "FLINK_SQL")
+ .set("flink.execution.target", "yarn-application")
+ .set("flink.parallelism.default", "2")
+ super.beforeAll()
+ }
+
+ test("get catalogs for flink sql") {
+ withJdbcStatement() { statement =>
+ val meta = statement.getConnection.getMetaData
+ val catalogs = meta.getCatalogs
+ val expected = Set("default_catalog").toIterator
+ while (catalogs.next()) {
+ assert(catalogs.getString(TABLE_CAT) === expected.next())
+ }
+ assert(!expected.hasNext)
+ assert(!catalogs.next())
+ }
+ }
+
+ test("execute statement - create/alter/drop table") {
+ withJdbcStatement() { statement =>
+ statement.executeQuery("create table tbl_a (a string) with ('connector' = 'blackhole')")
+ assert(statement.execute("alter table tbl_a rename to tbl_b"))
+ assert(statement.execute("drop table tbl_b"))
+ }
+ }
+
+ test("execute statement - select column name with dots") {
+ withJdbcStatement() { statement =>
+ val resultSet = statement.executeQuery("select 'tmp.hello'")
+ assert(resultSet.next())
+ assert(resultSet.getString(1) === "tmp.hello")
+ }
+ }
+
+ test("set kyuubi conf into flink conf") {
+ withJdbcStatement() { statement =>
+ val resultSet = statement.executeQuery("SET")
+ // Flink does not support set key without value currently,
+ // thus read all rows to find the desired one
+ var success = false
+ while (resultSet.next() && !success) {
+ if (resultSet.getString(1) == "parallelism.default" &&
+ resultSet.getString(2) == "2") {
+ success = true
+ }
+ }
+ assert(success)
+ }
+ }
+
+ test("server info provider - server") {
+ withSessionConf(Map(KyuubiConf.SERVER_INFO_PROVIDER.key -> "SERVER"))()() {
+ withSessionHandle { (client, handle) =>
+ val req = new TGetInfoReq()
+ req.setSessionHandle(handle)
+ req.setInfoType(TGetInfoType.CLI_DBMS_NAME)
+ assert(client.GetInfo(req).getInfoValue.getStringValue === "Apache Kyuubi")
+ }
+ }
+ }
+
+ test("server info provider - engine") {
+ withSessionConf(Map(KyuubiConf.SERVER_INFO_PROVIDER.key -> "ENGINE"))()() {
+ withSessionHandle { (client, handle) =>
+ val req = new TGetInfoReq()
+ req.setSessionHandle(handle)
+ req.setInfoType(TGetInfoType.CLI_DBMS_NAME)
+ assert(client.GetInfo(req).getInfoValue.getStringValue === "Apache Flink")
+ }
+ }
+ }
+}
diff --git a/integration-tests/kyuubi-hive-it/pom.xml b/integration-tests/kyuubi-hive-it/pom.xml
index 8b9813a2b..c4e9f320c 100644
--- a/integration-tests/kyuubi-hive-it/pom.xml
+++ b/integration-tests/kyuubi-hive-it/pom.xml
@@ -21,11 +21,11 @@
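<groupId>org.apache.kyuubi</groupId>
<artifactId>integration-tests</artifactId>
- <version>1.7.0-SNAPSHOT</version>
+ <version>1.9.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
- <artifactId>kyuubi-hive-it_2.12</artifactId>
+ <artifactId>kyuubi-hive-it_${scala.binary.version}</artifactId>
<name>Kyuubi Test Hive IT</name>
<url>https://kyuubi.apache.org/</url>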
@@ -69,4 +69,9 @@
<scope>test</scope>
+
+
+ <outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
+ <testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
+
diff --git a/integration-tests/kyuubi-hive-it/src/test/resources/log4j2-test.xml b/integration-tests/kyuubi-hive-it/src/test/resources/log4j2-test.xml
index bfc40dd6d..3110216c1 100644
--- a/integration-tests/kyuubi-hive-it/src/test/resources/log4j2-test.xml
+++ b/integration-tests/kyuubi-hive-it/src/test/resources/log4j2-test.xml
@@ -21,14 +21,14 @@
-
+
-
+
diff --git a/integration-tests/kyuubi-hive-it/src/test/scala/org/apache/kyuubi/it/hive/operation/KyuubiOperationHiveEnginePerUserSuite.scala b/integration-tests/kyuubi-hive-it/src/test/scala/org/apache/kyuubi/it/hive/operation/KyuubiOperationHiveEnginePerUserSuite.scala
index a4e6bb150..07e2bc0f2 100644
--- a/integration-tests/kyuubi-hive-it/src/test/scala/org/apache/kyuubi/it/hive/operation/KyuubiOperationHiveEnginePerUserSuite.scala
+++ b/integration-tests/kyuubi-hive-it/src/test/scala/org/apache/kyuubi/it/hive/operation/KyuubiOperationHiveEnginePerUserSuite.scala
@@ -61,4 +61,21 @@ class KyuubiOperationHiveEnginePerUserSuite extends WithKyuubiServer with HiveEn
}
}
}
+
+ test("kyuubi defined function - system_user, session_user") {
+ withJdbcStatement("hive_engine_test") { statement =>
+ val rs = statement.executeQuery("SELECT system_user(), session_user()")
+ assert(rs.next())
+ assert(rs.getString(1) === Utils.currentUser)
+ assert(rs.getString(2) === Utils.currentUser)
+ }
+ }
+
+ test("kyuubi defined function - engine_id") {
+ withJdbcStatement("hive_engine_test") { statement =>
+ val rs = statement.executeQuery("SELECT engine_id()")
+ assert(rs.next())
+ assert(rs.getString(1).nonEmpty)
+ }
+ }
}
diff --git a/integration-tests/kyuubi-jdbc-it/pom.xml b/integration-tests/kyuubi-jdbc-it/pom.xml
index 0aef12fb3..95ffd2038 100644
--- a/integration-tests/kyuubi-jdbc-it/pom.xml
+++ b/integration-tests/kyuubi-jdbc-it/pom.xml
@@ -21,11 +21,11 @@
org.apache.kyuubiintegration-tests
- 1.7.0-SNAPSHOT
+ 1.9.0-SNAPSHOT../pom.xml
- kyuubi-jdbc-it_2.12
+ kyuubi-jdbc-it_${scala.binary.version}Kyuubi Test Jdbc IThttps://kyuubi.apache.org/
@@ -114,5 +114,7 @@
+ target/scala-${scala.binary.version}/classes
+ target/scala-${scala.binary.version}/test-classes
diff --git a/integration-tests/kyuubi-jdbc-it/src/test/resources/log4j2-test.xml b/integration-tests/kyuubi-jdbc-it/src/test/resources/log4j2-test.xml
index bfc40dd6d..3110216c1 100644
--- a/integration-tests/kyuubi-jdbc-it/src/test/resources/log4j2-test.xml
+++ b/integration-tests/kyuubi-jdbc-it/src/test/resources/log4j2-test.xml
@@ -21,14 +21,14 @@
-
+
-
+
diff --git a/integration-tests/kyuubi-kubernetes-it/pom.xml b/integration-tests/kyuubi-kubernetes-it/pom.xml
index cb04e73c1..a4334e497 100644
--- a/integration-tests/kyuubi-kubernetes-it/pom.xml
+++ b/integration-tests/kyuubi-kubernetes-it/pom.xml
@@ -15,17 +15,15 @@
~ See the License for the specific language governing permissions and
~ limitations under the License.
-->
-
-
+ 4.0.0org.apache.kyuubiintegration-tests
- 1.7.0-SNAPSHOT
+ 1.9.0-SNAPSHOT../pom.xml
- 4.0.0kubernetes-integration-tests_2.12Kyuubi Test Kubernetes IT
@@ -62,12 +60,6 @@
test
-
- io.fabric8
- kubernetes-client
- test
-
-
org.apache.hadoophadoop-client-minicluster
diff --git a/integration-tests/kyuubi-kubernetes-it/src/test/resources/log4j2-test.xml b/integration-tests/kyuubi-kubernetes-it/src/test/resources/log4j2-test.xml
index bfc40dd6d..3110216c1 100644
--- a/integration-tests/kyuubi-kubernetes-it/src/test/resources/log4j2-test.xml
+++ b/integration-tests/kyuubi-kubernetes-it/src/test/resources/log4j2-test.xml
@@ -21,14 +21,14 @@
-
+
-
+
diff --git a/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/MiniKube.scala b/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/MiniKube.scala
index cd373873a..f4cd557bb 100644
--- a/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/MiniKube.scala
+++ b/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/MiniKube.scala
@@ -17,7 +17,11 @@
package org.apache.kyuubi.kubernetes.test
-import io.fabric8.kubernetes.client.{Config, DefaultKubernetesClient}
+import io.fabric8.kubernetes.client.{Config, KubernetesClient, KubernetesClientBuilder}
+import io.fabric8.kubernetes.client.okhttp.OkHttpClientFactory
+import okhttp3.{Dispatcher, OkHttpClient}
+
+import org.apache.kyuubi.util.ThreadUtils
/**
* This code is copied from Apache Spark
@@ -44,7 +48,7 @@ object MiniKube {
executeMinikube(true, "ip").head
}
- def getKubernetesClient: DefaultKubernetesClient = {
+ def getKubernetesClient: KubernetesClient = {
// only the three-part version number is matched (the optional suffix like "-beta.0" is dropped)
val versionArrayOpt = "\\d+\\.\\d+\\.\\d+".r
.findFirstIn(minikubeVersionString.split(VERSION_PREFIX)(1))
@@ -65,7 +69,18 @@ object MiniKube {
"For minikube version a three-part version number is expected (the optional " +
"non-numeric suffix is intentionally dropped)")
}
+ // https://github.com/fabric8io/kubernetes-client/issues/3547
+ val dispatcher = new Dispatcher(
+ ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
+ val factoryWithCustomDispatcher = new OkHttpClientFactory() {
+ override protected def additionalConfig(builder: OkHttpClient.Builder): Unit = {
+ builder.dispatcher(dispatcher)
+ }
+ }
- new DefaultKubernetesClient(Config.autoConfigure("minikube"))
+ new KubernetesClientBuilder()
+ .withConfig(Config.autoConfigure("minikube"))
+ .withHttpClientFactory(factoryWithCustomDispatcher)
+ .build()
}
}
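Reviewer note on the MiniKube change above: the custom `Dispatcher` is backed by a daemon cached thread pool, presumably so that idle OkHttp dispatcher threads do not keep the JVM alive (see the linked fabric8 issue). `ThreadUtils.newDaemonCachedThreadPool` is Kyuubi-internal and its body is not part of this diff, so the following is only a minimal sketch of an assumed equivalent built on `java.util.concurrent`. The names `DaemonPoolSketch` and `newDaemonCachedPool` are hypothetical.

```scala
import java.util.concurrent.{ExecutorService, Executors, ThreadFactory}
import java.util.concurrent.atomic.AtomicInteger

object DaemonPoolSketch {
  // Hypothetical stand-in for ThreadUtils.newDaemonCachedThreadPool (assumption,
  // not the Kyuubi implementation).
  def newDaemonCachedPool(prefix: String): ExecutorService = {
    val counter = new AtomicInteger(0)
    val factory = new ThreadFactory {
      override def newThread(r: Runnable): Thread = {
        val t = new Thread(r, s"$prefix-${counter.incrementAndGet()}")
        // Daemon threads do not block JVM shutdown.
        t.setDaemon(true)
        t
      }
    }
    Executors.newCachedThreadPool(factory)
  }
}
```

A pool created this way could be passed to `new Dispatcher(...)` in the same way the patch passes the Kyuubi helper's result.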
diff --git a/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/WithKyuubiServerOnKubernetes.scala b/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/WithKyuubiServerOnKubernetes.scala
index ed9cbce09..595fdd431 100644
--- a/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/WithKyuubiServerOnKubernetes.scala
+++ b/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/WithKyuubiServerOnKubernetes.scala
@@ -18,14 +18,14 @@
package org.apache.kyuubi.kubernetes.test
import io.fabric8.kubernetes.api.model.Pod
-import io.fabric8.kubernetes.client.DefaultKubernetesClient
+import io.fabric8.kubernetes.client.KubernetesClient
import org.apache.kyuubi.KyuubiFunSuite
trait WithKyuubiServerOnKubernetes extends KyuubiFunSuite {
protected def connectionConf: Map[String, String] = Map.empty
- lazy val miniKubernetesClient: DefaultKubernetesClient = MiniKube.getKubernetesClient
+ lazy val miniKubernetesClient: KubernetesClient = MiniKube.getKubernetesClient
lazy val kyuubiPod: Pod = miniKubernetesClient.pods().withName("kyuubi-test").get()
lazy val kyuubiServerIp: String = kyuubiPod.getStatus.getPodIP
lazy val miniKubeIp: String = MiniKube.getIp
diff --git a/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/deployment/KyuubiOnKubernetesTestsSuite.scala b/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/deployment/KyuubiOnKubernetesTestsSuite.scala
index c8894679d..95e15e6eb 100644
--- a/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/deployment/KyuubiOnKubernetesTestsSuite.scala
+++ b/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/deployment/KyuubiOnKubernetesTestsSuite.scala
@@ -54,7 +54,9 @@ class KyuubiOnKubernetesWithSparkTestsBase extends WithKyuubiServerOnKubernetes
super.connectionConf ++
Map(
"spark.master" -> s"k8s://$miniKubeApiMaster",
- "spark.kubernetes.container.image" -> "apache/spark:3.3.1",
+ // We should update spark docker image in ./github/workflows/master.yml at the same time
+ "spark.kubernetes.container.image" -> "apache/spark:3.4.1",
+ "spark.kubernetes.container.image.pullPolicy" -> "IfNotPresent",
"spark.executor.memory" -> "512M",
"spark.driver.memory" -> "1024M",
"spark.kubernetes.driver.request.cores" -> "250m",
diff --git a/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/spark/SparkOnKubernetesTestsSuite.scala b/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/spark/SparkOnKubernetesTestsSuite.scala
index 798618e4c..09532efe3 100644
--- a/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/spark/SparkOnKubernetesTestsSuite.scala
+++ b/integration-tests/kyuubi-kubernetes-it/src/test/scala/org/apache/kyuubi/kubernetes/test/spark/SparkOnKubernetesTestsSuite.scala
@@ -17,21 +17,23 @@
package org.apache.kyuubi.kubernetes.test.spark
-import scala.collection.JavaConverters._
+import java.util.UUID
+
import scala.concurrent.duration._
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.net.NetUtils
-import org.apache.kyuubi.{BatchTestHelper, KyuubiException, Logging, Utils, WithKyuubiServer, WithSimpleDFSService}
+import org.apache.kyuubi._
+import org.apache.kyuubi.client.util.BatchUtils._
import org.apache.kyuubi.config.KyuubiConf
-import org.apache.kyuubi.config.KyuubiConf.FRONTEND_THRIFT_BINARY_BIND_HOST
-import org.apache.kyuubi.engine.{ApplicationInfo, ApplicationOperation, KubernetesApplicationOperation}
+import org.apache.kyuubi.config.KyuubiConf._
+import org.apache.kyuubi.engine.{ApplicationInfo, ApplicationManagerInfo, ApplicationOperation, KubernetesApplicationOperation}
import org.apache.kyuubi.engine.ApplicationState.{FAILED, NOT_FOUND, RUNNING}
import org.apache.kyuubi.engine.spark.SparkProcessBuilder
import org.apache.kyuubi.kubernetes.test.MiniKube
import org.apache.kyuubi.operation.SparkQueryTests
-import org.apache.kyuubi.session.{KyuubiBatchSessionImpl, KyuubiSessionManager}
+import org.apache.kyuubi.session.KyuubiSessionManager
import org.apache.kyuubi.util.Validator.KUBERNETES_EXECUTOR_POD_NAME_PREFIX
import org.apache.kyuubi.zookeeper.ZookeeperConf.ZK_CLIENT_PORT_ADDRESS
@@ -41,19 +43,23 @@ abstract class SparkOnKubernetesSuiteBase
MiniKube.getKubernetesClient.getMasterUrl.toString
}
+ protected val appMgrInfo =
+ ApplicationManagerInfo(Some(s"k8s://$apiServerAddress"), Some("minikube"), None)
+
protected def sparkOnK8sConf: KyuubiConf = {
// TODO Support more Spark version
// Spark official docker image: https://hub.docker.com/r/apache/spark/tags
KyuubiConf().set("spark.master", s"k8s://$apiServerAddress")
- .set("spark.kubernetes.container.image", "apache/spark:v3.2.1")
+ .set("spark.kubernetes.container.image", "apache/spark:3.4.1")
.set("spark.kubernetes.container.image.pullPolicy", "IfNotPresent")
.set("spark.executor.instances", "1")
.set("spark.executor.memory", "512M")
.set("spark.driver.memory", "512M")
.set("spark.kubernetes.driver.request.cores", "250m")
.set("spark.kubernetes.executor.request.cores", "250m")
- .set("kyuubi.kubernetes.context", "minikube")
- .set("kyuubi.frontend.protocols", "THRIFT_BINARY,REST")
+ .set(KUBERNETES_CONTEXT.key, "minikube")
+ .set(FRONTEND_PROTOCOLS.key, "THRIFT_BINARY,REST")
+ .set(ENGINE_INIT_TIMEOUT.key, "PT10M")
}
}
@@ -122,6 +128,7 @@ class SparkClusterModeOnKubernetesSuite
override protected def jdbcUrl: String = getJdbcUrl
}
+// [KYUUBI #4467] KubernetesApplicationOperator doesn't support client mode
class KyuubiOperationKubernetesClusterClientModeSuite
extends SparkClientModeOnKubernetesSuiteBase {
private lazy val k8sOperation: KubernetesApplicationOperation = {
@@ -133,31 +140,39 @@ class KyuubiOperationKubernetesClusterClientModeSuite
private def sessionManager: KyuubiSessionManager =
server.backendService.sessionManager.asInstanceOf[KyuubiSessionManager]
- test("Spark Client Mode On Kubernetes Kyuubi KubernetesApplicationOperation Suite") {
- val batchRequest = newSparkBatchRequest(conf.getAll)
+ ignore("Spark Client Mode On Kubernetes Kyuubi KubernetesApplicationOperation Suite") {
+ val batchRequest = newSparkBatchRequest(conf.getAll ++ Map(
+ KYUUBI_BATCH_ID_KEY -> UUID.randomUUID().toString))
val sessionHandle = sessionManager.openBatchSession(
"kyuubi",
"passwd",
"localhost",
- batchRequest.getConf.asScala.toMap,
batchRequest)
eventually(timeout(3.minutes), interval(50.milliseconds)) {
- val state = k8sOperation.getApplicationInfoByTag(sessionHandle.identifier.toString)
+ val state = k8sOperation.getApplicationInfoByTag(
+ appMgrInfo,
+ sessionHandle.identifier.toString)
assert(state.id != null)
assert(state.name != null)
assert(state.state == RUNNING)
}
- val killResponse = k8sOperation.killApplicationByTag(sessionHandle.identifier.toString)
+ val killResponse = k8sOperation.killApplicationByTag(
+ appMgrInfo,
+ sessionHandle.identifier.toString)
assert(killResponse._1)
assert(killResponse._2 startsWith "Succeeded to terminate:")
- val appInfo = k8sOperation.getApplicationInfoByTag(sessionHandle.identifier.toString)
+ val appInfo = k8sOperation.getApplicationInfoByTag(
+ appMgrInfo,
+ sessionHandle.identifier.toString)
assert(appInfo == ApplicationInfo(null, null, NOT_FOUND))
- val failKillResponse = k8sOperation.killApplicationByTag(sessionHandle.identifier.toString)
+ val failKillResponse = k8sOperation.killApplicationByTag(
+ appMgrInfo,
+ sessionHandle.identifier.toString)
assert(!failKillResponse._1)
assert(failKillResponse._2 === ApplicationOperation.NOT_FOUND)
}
@@ -193,37 +208,44 @@ class KyuubiOperationKubernetesClusterClusterModeSuite
"spark.kubernetes.driver.pod.name",
driverPodNamePrefix + "-" + System.currentTimeMillis())
- val batchRequest = newSparkBatchRequest(conf.getAll)
+ val batchRequest = newSparkBatchRequest(conf.getAll ++ Map(
+ KYUUBI_BATCH_ID_KEY -> UUID.randomUUID().toString))
val sessionHandle = sessionManager.openBatchSession(
"runner",
"passwd",
"localhost",
- batchRequest.getConf.asScala.toMap,
batchRequest)
- val session = sessionManager.getSession(sessionHandle).asInstanceOf[KyuubiBatchSessionImpl]
- val batchJobSubmissionOp = session.batchJobSubmissionOp
-
- eventually(timeout(3.minutes), interval(50.milliseconds)) {
- val appInfo = batchJobSubmissionOp.getOrFetchCurrentApplicationInfo
- assert(appInfo.nonEmpty)
- assert(appInfo.exists(_.state == RUNNING))
- assert(appInfo.exists(_.name.startsWith(driverPodNamePrefix)))
+ // wait for driver pod start
+ eventually(timeout(3.minutes), interval(5.second)) {
+ // trigger k8sOperation init here
+ val appInfo = k8sOperation.getApplicationInfoByTag(
+ appMgrInfo,
+ sessionHandle.identifier.toString)
+ assert(appInfo.state == RUNNING)
+ assert(appInfo.name.startsWith(driverPodNamePrefix))
}
- val killResponse = k8sOperation.killApplicationByTag(sessionHandle.identifier.toString)
+ val killResponse = k8sOperation.killApplicationByTag(
+ appMgrInfo,
+ sessionHandle.identifier.toString)
assert(killResponse._1)
- assert(killResponse._2 startsWith "Operation of deleted appId:")
+ assert(killResponse._2 endsWith "is completed")
+ assert(killResponse._2 contains sessionHandle.identifier.toString)
eventually(timeout(3.minutes), interval(50.milliseconds)) {
- val appInfo = k8sOperation.getApplicationInfoByTag(sessionHandle.identifier.toString)
+ val appInfo = k8sOperation.getApplicationInfoByTag(
+ appMgrInfo,
+ sessionHandle.identifier.toString)
// We may kill engine start but not ready
// An EOF Error occurred when the driver was starting
assert(appInfo.state == FAILED || appInfo.state == NOT_FOUND)
}
- val failKillResponse = k8sOperation.killApplicationByTag(sessionHandle.identifier.toString)
+ val failKillResponse = k8sOperation.killApplicationByTag(
+ appMgrInfo,
+ sessionHandle.identifier.toString)
assert(!failKillResponse._1)
}
}
diff --git a/integration-tests/kyuubi-trino-it/pom.xml b/integration-tests/kyuubi-trino-it/pom.xml
index e62e58d1d..c93d43c00 100644
--- a/integration-tests/kyuubi-trino-it/pom.xml
+++ b/integration-tests/kyuubi-trino-it/pom.xml
@@ -21,11 +21,11 @@
org.apache.kyuubiintegration-tests
- 1.7.0-SNAPSHOT
+ 1.9.0-SNAPSHOT../pom.xml
- kyuubi-trino-it_2.12
+ kyuubi-trino-it_${scala.binary.version}Kyuubi Test Trino IThttps://kyuubi.apache.org/
@@ -88,4 +88,9 @@
+
+
+ target/scala-${scala.binary.version}/classes
+ target/scala-${scala.binary.version}/test-classes
+
diff --git a/integration-tests/kyuubi-trino-it/src/test/resources/log4j2-test.xml b/integration-tests/kyuubi-trino-it/src/test/resources/log4j2-test.xml
index bfc40dd6d..3110216c1 100644
--- a/integration-tests/kyuubi-trino-it/src/test/resources/log4j2-test.xml
+++ b/integration-tests/kyuubi-trino-it/src/test/resources/log4j2-test.xml
@@ -21,14 +21,14 @@
-
+
-
+
diff --git a/integration-tests/kyuubi-trino-it/src/test/scala/org/apache/kyuubi/it/trino/server/TrinoFrontendSuite.scala b/integration-tests/kyuubi-trino-it/src/test/scala/org/apache/kyuubi/it/trino/server/TrinoFrontendSuite.scala
new file mode 100644
index 000000000..7575bf8a9
--- /dev/null
+++ b/integration-tests/kyuubi-trino-it/src/test/scala/org/apache/kyuubi/it/trino/server/TrinoFrontendSuite.scala
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.it.trino.server
+
+import scala.util.control.NonFatal
+
+import org.apache.kyuubi.WithKyuubiServer
+import org.apache.kyuubi.config.KyuubiConf
+import org.apache.kyuubi.operation.SparkMetadataTests
+
+/**
+ * This test is for Trino jdbc driver with Kyuubi Server and Spark engine:
+ *
+ * -------------------------------------------------------------
+ * | JDBC |
+ * | Trino-driver ----> Kyuubi Server --> Spark Engine |
+ * | |
+ * -------------------------------------------------------------
+ */
+class TrinoFrontendSuite extends WithKyuubiServer with SparkMetadataTests {
+
+ test("execute statement - select 11 where 1=1") {
+ withJdbcStatement() { statement =>
+ val resultSet = statement.executeQuery("SELECT 11 where 1<1")
+ while (resultSet.next()) {
+ assert(resultSet.getInt(1) === 11)
+ }
+ }
+ }
+
+ test("execute preparedStatement - select 11 where 1 = 1") {
+ withJdbcPrepareStatement("select 11 where 1 = ? ") { statement =>
+ statement.setInt(1, 1)
+ val rs = statement.executeQuery()
+ while (rs.next()) {
+ assert(rs.getInt(1) == 11)
+ }
+ }
+ }
+
+ override protected val conf: KyuubiConf = {
+ KyuubiConf().set(KyuubiConf.FRONTEND_PROTOCOLS, Seq("TRINO"))
+ }
+
+ override protected def jdbcUrl: String = {
+ s"jdbc:trino://${server.frontendServices.head.connectionUrl}/;"
+ }
+
+ // the Trino JDBC driver requires SSL to be enabled if a password is specified
+ override protected val password: String = ""
+
+ override def beforeAll(): Unit = {
+ super.beforeAll()
+ // eagerly start spark engine before running test, it's a workaround for trino jdbc driver
+ // since it does not support changing http connect timeout
+ try {
+ withJdbcStatement() { statement =>
+ statement.execute("SELECT 1")
+ }
+ } catch {
+ case NonFatal(_) =>
+ }
+ }
+}
diff --git a/integration-tests/kyuubi-zookeeper-it/pom.xml b/integration-tests/kyuubi-zookeeper-it/pom.xml
index eaeff5898..869fd40b2 100644
--- a/integration-tests/kyuubi-zookeeper-it/pom.xml
+++ b/integration-tests/kyuubi-zookeeper-it/pom.xml
@@ -21,11 +21,11 @@
org.apache.kyuubiintegration-tests
- 1.7.0-SNAPSHOT
+ 1.9.0-SNAPSHOT../pom.xml
- kyuubi-zookeeper-it_2.12
+ kyuubi-zookeeper-it_${scala.binary.version}Kyuubi Test Zookeeper IThttps://kyuubi.apache.org/
diff --git a/integration-tests/kyuubi-zookeeper-it/src/test/resources/log4j2-test.xml b/integration-tests/kyuubi-zookeeper-it/src/test/resources/log4j2-test.xml
index bfc40dd6d..3110216c1 100644
--- a/integration-tests/kyuubi-zookeeper-it/src/test/resources/log4j2-test.xml
+++ b/integration-tests/kyuubi-zookeeper-it/src/test/resources/log4j2-test.xml
@@ -21,14 +21,14 @@
-
+
-
+
diff --git a/integration-tests/pom.xml b/integration-tests/pom.xml
index 4e3431afb..35d0b4f9e 100644
--- a/integration-tests/pom.xml
+++ b/integration-tests/pom.xml
@@ -21,7 +21,7 @@
org.apache.kyuubikyuubi-parent
- 1.7.0-SNAPSHOT
+ 1.9.0-SNAPSHOTintegration-tests
diff --git a/kyuubi-assembly/pom.xml b/kyuubi-assembly/pom.xml
index 725126f84..4fa0d9a0f 100644
--- a/kyuubi-assembly/pom.xml
+++ b/kyuubi-assembly/pom.xml
@@ -22,11 +22,11 @@
org.apache.kyuubikyuubi-parent
- 1.7.0-SNAPSHOT
+ 1.9.0-SNAPSHOT../pom.xml
- kyuubi-assembly_2.12
+ kyuubi-assembly_${scala.binary.version}pomKyuubi Project Assemblyhttps://kyuubi.apache.org/
@@ -69,28 +69,18 @@
- org.apache.hadoop
- hadoop-client-api
+ org.apache.kyuubi
+ ${kyuubi-shaded-zookeeper.artifacts}org.apache.hadoop
- hadoop-client-runtime
-
-
-
- org.apache.curator
- curator-framework
-
-
-
- org.apache.curator
- curator-client
+ hadoop-client-api
- org.apache.curator
- curator-recipes
+ org.apache.hadoop
+ hadoop-client-runtime
diff --git a/kyuubi-common/pom.xml b/kyuubi-common/pom.xml
index 26cdc271d..0d5c491b5 100644
--- a/kyuubi-common/pom.xml
+++ b/kyuubi-common/pom.xml
@@ -21,20 +21,20 @@
org.apache.kyuubikyuubi-parent
- 1.7.0-SNAPSHOT
+ 1.9.0-SNAPSHOT../pom.xml
- kyuubi-common_2.12
+ kyuubi-common_${scala.binary.version}jarKyuubi Project Commonhttps://kyuubi.apache.org/
- com.vladsch.flexmark
- flexmark-all
- test
+ org.apache.kyuubi
+ kyuubi-util-scala_${scala.binary.version}
+ ${project.version}
@@ -88,6 +88,11 @@
runtime
+
+ org.antlr
+ ST4
+
+
org.apache.commonscommons-lang3
@@ -123,6 +128,13 @@
HikariCP
+
+ org.apache.kyuubi
+ kyuubi-util-scala_${scala.binary.version}
+ ${project.version}
+ test-jar
+
+
org.apache.hadoophadoop-minikdc
@@ -141,6 +153,12 @@
test
+
+ org.scalatestplus
+ mockito-4-11_${scala.binary.version}
+ test
+
+
com.google.guavafailureaccess
@@ -153,11 +171,23 @@
test
+
+ org.xerial
+ sqlite-jdbc
+ test
+
+
com.jakewharton.fliptablesfliptablestest
+
+
+ com.vladsch.flexmark
+ flexmark-all
+ test
+
diff --git a/kyuubi-common/src/main/resources/log4j2-defaults.xml b/kyuubi-common/src/main/resources/log4j2-defaults.xml
index 63841959a..7a1a33235 100644
--- a/kyuubi-common/src/main/resources/log4j2-defaults.xml
+++ b/kyuubi-common/src/main/resources/log4j2-defaults.xml
@@ -21,7 +21,7 @@
-
+
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/KyuubiSQLException.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/KyuubiSQLException.scala
index a9e486fb2..570ee6d38 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/KyuubiSQLException.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/KyuubiSQLException.scala
@@ -26,6 +26,7 @@ import scala.collection.JavaConverters._
import org.apache.hive.service.rpc.thrift.{TStatus, TStatusCode}
import org.apache.kyuubi.Utils.stringifyException
+import org.apache.kyuubi.util.reflect.DynConstructors
/**
* @param reason a description of the exception
@@ -139,9 +140,10 @@ object KyuubiSQLException {
}
private def newInstance(className: String, message: String, cause: Throwable): Throwable = {
try {
- Class.forName(className)
- .getConstructor(classOf[String], classOf[Throwable])
- .newInstance(message, cause).asInstanceOf[Throwable]
+ DynConstructors.builder()
+ .impl(className, classOf[String], classOf[Throwable])
+ .buildChecked[Throwable]()
+ .newInstance(message, cause)
} catch {
case _: Exception => new RuntimeException(className + ":" + message, cause)
}
@@ -154,7 +156,7 @@ object KyuubiSQLException {
(i1, i2, i3)
}
- def toCause(details: Seq[String]): Throwable = {
+ def toCause(details: Iterable[String]): Throwable = {
var ex: Throwable = null
if (details != null && details.nonEmpty) {
val head = details.head
@@ -170,7 +172,7 @@ object KyuubiSQLException {
val lineNum = line.substring(i3 + 1).toInt
new StackTraceElement(clzName, methodName, fileName, lineNum)
}
- ex = newInstance(exClz, msg, toCause(details.slice(length + 2, details.length)))
+ ex = newInstance(exClz, msg, toCause(details.slice(length + 2, details.size)))
ex.setStackTrace(stackTraceElements.toArray)
}
ex
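Reviewer note on `newInstance` above: the patch swaps direct reflection for Kyuubi's `DynConstructors` builder but keeps the same contract, which is to instantiate the named exception class through its `(String, Throwable)` constructor and fall back to a `RuntimeException` when that fails. `DynConstructors` is not shown in this diff, so the sketch below illustrates the contract with plain `java.lang.reflect`, mirroring the removed lines. The object name `ReflectiveExceptionSketch` is hypothetical.

```scala
object ReflectiveExceptionSketch {
  // Build a Throwable by class name; fall back to RuntimeException if the class
  // is missing or lacks a (String, Throwable) constructor.
  def newInstance(className: String, message: String, cause: Throwable): Throwable = {
    try {
      Class.forName(className)
        .getConstructor(classOf[String], classOf[Throwable])
        .newInstance(message, cause)
        .asInstanceOf[Throwable]
    } catch {
      case _: Exception => new RuntimeException(className + ":" + message, cause)
    }
  }
}
```

The fallback keeps the original class name in the message, so the remote exception type is still visible to the caller even when it cannot be reconstructed locally.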
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/Logging.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/Logging.scala
index 4944b9fcc..d6dcc8d34 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/Logging.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/Logging.scala
@@ -22,9 +22,8 @@ import org.apache.logging.log4j.core.{Logger => Log4jLogger, LoggerContext}
import org.apache.logging.log4j.core.config.DefaultConfiguration
import org.slf4j.{Logger, LoggerFactory}
import org.slf4j.bridge.SLF4JBridgeHandler
-import org.slf4j.impl.StaticLoggerBinder
-import org.apache.kyuubi.util.ClassUtils
+import org.apache.kyuubi.util.reflect.ReflectUtils
/**
* Simple version of logging adopted from Apache Spark.
@@ -54,12 +53,24 @@ trait Logging {
}
}
+ def debug(message: => Any, t: Throwable): Unit = {
+ if (logger.isDebugEnabled) {
+ logger.debug(message.toString, t)
+ }
+ }
+
def info(message: => Any): Unit = {
if (logger.isInfoEnabled) {
logger.info(message.toString)
}
}
+ def info(message: => Any, t: Throwable): Unit = {
+ if (logger.isInfoEnabled) {
+ logger.info(message.toString, t)
+ }
+ }
+
def warn(message: => Any): Unit = {
if (logger.isWarnEnabled) {
logger.warn(message.toString)
@@ -105,16 +116,17 @@ object Logging {
// This distinguishes the log4j 1.2 binding, currently
// org.slf4j.impl.Log4jLoggerFactory, from the log4j 2.0 binding, currently
// org.apache.logging.slf4j.Log4jLoggerFactory
- val binderClass = StaticLoggerBinder.getSingleton.getLoggerFactoryClassStr
- "org.slf4j.impl.Log4jLoggerFactory".equals(binderClass)
+ val binderClass = LoggerFactory.getILoggerFactory.getClass.getName
+ "org.slf4j.impl.Log4jLoggerFactory".equals(
+ binderClass) || "org.slf4j.impl.Reload4jLoggerFactory".equals(binderClass)
}
private[kyuubi] def isLog4j2: Boolean = {
// This distinguishes the log4j 1.2 binding, currently
// org.slf4j.impl.Log4jLoggerFactory, from the log4j 2.0 binding, currently
// org.apache.logging.slf4j.Log4jLoggerFactory
- val binderClass = StaticLoggerBinder.getSingleton.getLoggerFactoryClassStr
- "org.apache.logging.slf4j.Log4jLoggerFactory".equals(binderClass)
+ "org.apache.logging.slf4j.Log4jLoggerFactory"
+ .equals(LoggerFactory.getILoggerFactory.getClass.getName)
}
/**
@@ -137,7 +149,7 @@ object Logging {
isInterpreter: Boolean,
loggerName: String,
logger: => Logger): Unit = {
- if (ClassUtils.classIsLoadable("org.slf4j.bridge.SLF4JBridgeHandler")) {
+ if (ReflectUtils.isClassLoadable("org.slf4j.bridge.SLF4JBridgeHandler")) {
// Handles configuring the JUL -> SLF4J bridge
SLF4JBridgeHandler.removeHandlersForRootLogger()
SLF4JBridgeHandler.install()
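Reviewer note on the new `debug(message, t)` and `info(message, t)` overloads above: like the existing methods, they take `message` by name, so the message expression is only evaluated when the level is enabled. The sketch below demonstrates that behaviour with a hypothetical `FakeLogger` instead of an slf4j `Logger`. Everything in it is an illustrative assumption, not Kyuubi code.

```scala
object LazyLogSketch {
  // Hypothetical minimal logger; the real trait delegates to an slf4j Logger.
  final class FakeLogger(val debugEnabled: Boolean) {
    val lines = scala.collection.mutable.ArrayBuffer.empty[String]
    def debug(msg: String, t: Throwable): Unit = {
      lines += s"$msg (${t.getMessage})"
      ()
    }
  }

  // The by-name `message` is evaluated only if debug logging is enabled.
  def debug(logger: FakeLogger)(message: => Any, t: Throwable): Unit = {
    if (logger.debugEnabled) {
      logger.debug(message.toString, t)
    }
  }
}
```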
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/Utils.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/Utils.scala
index 7283ea040..accfca4c9 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/Utils.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/Utils.scala
@@ -21,9 +21,12 @@ import java.io._
import java.net.{Inet4Address, InetAddress, NetworkInterface}
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths, StandardCopyOption}
+import java.security.PrivilegedAction
import java.text.SimpleDateFormat
import java.util.{Date, Properties, TimeZone, UUID}
+import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicLong
+import java.util.concurrent.locks.Lock
import scala.collection.JavaConverters._
import scala.sys.process._
@@ -143,20 +146,6 @@ object Utils extends Logging {
f.delete()
}
- /**
- * delete file in path with logging
- * @param filePath path to file for deletion
- * @param errorMessage message as prefix logging with error exception
- */
- def deleteFile(filePath: String, errorMessage: String): Unit = {
- try {
- Files.delete(Paths.get(filePath))
- } catch {
- case e: Exception =>
- error(s"$errorMessage: $filePath ", e)
- }
- }
-
/**
* Create a temporary directory inside the given parent directory. The directory will be
* automatically deleted when the VM shuts down.
@@ -215,6 +204,14 @@ object Utils extends Logging {
def currentUser: String = UserGroupInformation.getCurrentUser.getShortUserName
+ def doAs[T](
+ proxyUser: String,
+ realUser: UserGroupInformation = UserGroupInformation.getCurrentUser)(f: () => T): T = {
+ UserGroupInformation.createProxyUser(proxyUser, realUser).doAs(new PrivilegedAction[T] {
+ override def run(): T = f()
+ })
+ }
+
private val shortVersionRegex = """^(\d+\.\d+\.\d+)(.*)?$""".r
/**
@@ -235,6 +232,11 @@ object Utils extends Logging {
*/
val isWindows: Boolean = SystemUtils.IS_OS_WINDOWS
+ /**
+ * Whether the underlying operating system is MacOS.
+ */
+ val isMac: Boolean = SystemUtils.IS_OS_MAC
+
/**
* Indicates whether Kyuubi is currently running unit tests.
*/
@@ -401,4 +403,50 @@ object Utils extends Logging {
Option(Thread.currentThread().getContextClassLoader).getOrElse(getKyuubiClassLoader)
def isOnK8s: Boolean = Files.exists(Paths.get("/var/run/secrets/kubernetes.io"))
+
+ /**
+ * Return a nice string representation of the exception. It will call "printStackTrace" to
+ * recursively generate the stack trace including the exception and its causes.
+ */
+ def prettyPrint(e: Throwable): String = {
+ if (e == null) {
+ ""
+ } else {
+ // Use e.printStackTrace here because e.getStackTrace doesn't include the cause
+ val stringWriter = new StringWriter()
+ e.printStackTrace(new PrintWriter(stringWriter))
+ stringWriter.toString
+ }
+ }
+
+ def withLockRequired[T](lock: Lock)(block: => T): T = {
+ try {
+ lock.lock()
+ block
+ } finally {
+ lock.unlock()
+ }
+ }
+
+ /**
+ * Try to kill the process gracefully first, then forcibly if it does not exit within the
+ * graceful period.
+ *
+ * @param process the process to be killed
+ * @param gracefulPeriod the graceful killing period, in milliseconds
+ * @return the exit code if the process exits within the graceful period, None if it had to
+ * be killed forcibly
+ */
+ def terminateProcess(process: java.lang.Process, gracefulPeriod: Long): Option[Int] = {
+ process.destroy()
+ if (process.waitFor(gracefulPeriod, TimeUnit.MILLISECONDS)) {
+ Some(process.exitValue())
+ } else {
+ warn(s"Process does not exit after $gracefulPeriod ms, try to forcibly kill. " +
+ "Staging files generated by the process may be retained!")
+ process.destroyForcibly()
+ None
+ }
+ }
+
}
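Reviewer note on the new `Utils` helpers above: a standalone sketch of `prettyPrint` and `withLockRequired` may help when reviewing their behaviour in isolation. It follows the patch closely, with one deliberate variation labeled in the comments: the lock is acquired before entering `try`, so `unlock()` is never called for a lock that was not obtained. The object name `UtilsSketch` is hypothetical.

```scala
import java.io.{PrintWriter, StringWriter}
import java.util.concurrent.locks.Lock

object UtilsSketch {
  // Render the exception together with its causes; getStackTrace alone would
  // not include the "Caused by" chain.
  def prettyPrint(e: Throwable): String = {
    if (e == null) {
      ""
    } else {
      val stringWriter = new StringWriter()
      e.printStackTrace(new PrintWriter(stringWriter))
      stringWriter.toString
    }
  }

  // Variation on the patch (assumption): lock() is called before `try`.
  def withLockRequired[T](lock: Lock)(block: => T): T = {
    lock.lock()
    try {
      block
    } finally {
      lock.unlock()
    }
  }
}
```

For example, `UtilsSketch.withLockRequired(new java.util.concurrent.locks.ReentrantLock())(compute())` runs a hypothetical `compute()` under the lock and releases it even if the block throws.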
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/ConfigBuilder.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/ConfigBuilder.scala
index 62f060a05..d6de40241 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/ConfigBuilder.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/ConfigBuilder.scala
@@ -18,11 +18,14 @@
package org.apache.kyuubi.config
import java.time.Duration
+import java.util.Locale
import java.util.regex.PatternSyntaxException
import scala.util.{Failure, Success, Try}
import scala.util.matching.Regex
+import org.apache.kyuubi.util.EnumUtils._
+
private[kyuubi] case class ConfigBuilder(key: String) {
private[config] var _doc = ""
@@ -150,7 +153,7 @@ private[kyuubi] case class ConfigBuilder(key: String) {
}
}
- new TypedConfigBuilder(this, regexFromString(_, this.key), _.toString)
+ TypedConfigBuilder(this, regexFromString(_, this.key), _.toString)
}
}
@@ -166,6 +169,21 @@ private[kyuubi] case class TypedConfigBuilder[T](
def transform(fn: T => T): TypedConfigBuilder[T] = this.copy(fromStr = s => fn(fromStr(s)))
+ def transformToUpperCase: TypedConfigBuilder[T] = {
+ transformString(_.toUpperCase(Locale.ROOT))
+ }
+
+ def transformToLowerCase: TypedConfigBuilder[T] = {
+ transformString(_.toLowerCase(Locale.ROOT))
+ }
+
+ private def transformString(fn: String => String): TypedConfigBuilder[T] = {
+ require(parent._type == "string")
+ this.asInstanceOf[TypedConfigBuilder[String]]
+ .transform(fn)
+ .asInstanceOf[TypedConfigBuilder[T]]
+ }
+
/** Checks if the user-provided value for the config matches the validator. */
def checkValue(validator: T => Boolean, errMsg: String): TypedConfigBuilder[T] = {
transform { v =>
@@ -187,10 +205,35 @@ private[kyuubi] case class TypedConfigBuilder[T](
}
}
+ /** Checks if the user-provided value for the config matches the value set of the enumeration. */
+ def checkValues(enumeration: Enumeration): TypedConfigBuilder[T] = {
+ transform { v =>
+ val isValid = v match {
+ case iter: Iterable[Any] => isValidEnums(enumeration, iter)
+ case name => isValidEnum(enumeration, name)
+ }
+ if (!isValid) {
+ val actualValueStr = v match {
+ case iter: Iterable[Any] => iter.mkString(",")
+ case value => value.toString
+ }
+ throw new IllegalArgumentException(
+ s"The value of ${parent.key} should be one of ${enumeration.values.mkString(", ")}," +
+ s" but was $actualValueStr")
+ }
+ v
+ }
+ }
+
/** Turns the config entry into a sequence of values of the underlying type. */
def toSequence(sp: String = ","): TypedConfigBuilder[Seq[T]] = {
parent._type = "seq"
- TypedConfigBuilder(parent, strToSeq(_, fromStr, sp), seqToStr(_, toStr))
+ TypedConfigBuilder(parent, strToSeq(_, fromStr, sp), iterableToStr(_, toStr))
+ }
+
+ def toSet(sp: String = ",", skipBlank: Boolean = true): TypedConfigBuilder[Set[T]] = {
+ parent._type = "set"
+ TypedConfigBuilder(parent, strToSet(_, fromStr, sp, skipBlank), iterableToStr(_, toStr))
}
def createOptional: OptionalConfigEntry[T] = {
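Reviewer note on `transformToUpperCase` and `checkValues` above: taken together they normalise a string config value and then reject anything outside an `Enumeration`'s value set. `EnumUtils.isValidEnum` is imported but not shown in this diff, so the sketch below uses an assumed stand-in that compares by name. `DeployMode`, `isValidEnum` as written here, and `checkedUpperCase` are all hypothetical names for illustration.

```scala
import java.util.Locale

object EnumCheckSketch {
  // Hypothetical enumeration used only for illustration.
  object DeployMode extends Enumeration {
    val CLIENT, CLUSTER = Value
  }

  // Assumed stand-in for EnumUtils.isValidEnum, whose body is not in this diff.
  def isValidEnum(enumeration: Enumeration, name: Any): Boolean =
    enumeration.values.exists(_.toString == name.toString)

  // Upper-case the raw value with Locale.ROOT, then validate it.
  def checkedUpperCase(key: String, enumeration: Enumeration, raw: String): String = {
    val v = raw.toUpperCase(Locale.ROOT)
    if (!isValidEnum(enumeration, v)) {
      throw new IllegalArgumentException(
        s"The value of $key should be one of ${enumeration.values.mkString(", ")}," +
          s" but was $v")
    }
    v
  }
}
```

Using `Locale.ROOT` keeps the case conversion independent of the JVM's default locale, which matters for locales such as Turkish where `i` does not upper-case to `I`.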
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/ConfigHelpers.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/ConfigHelpers.scala
index 225f1b537..525ea2ff4 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/ConfigHelpers.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/ConfigHelpers.scala
@@ -17,6 +17,8 @@
package org.apache.kyuubi.config
+import org.apache.commons.lang3.StringUtils
+
import org.apache.kyuubi.Utils
object ConfigHelpers {
@@ -25,7 +27,11 @@ object ConfigHelpers {
Utils.strToSeq(str, sp).map(converter)
}
- def seqToStr[T](v: Seq[T], stringConverter: T => String): String = {
- v.map(stringConverter).mkString(",")
+ def strToSet[T](str: String, converter: String => T, sp: String, skipBlank: Boolean): Set[T] = {
+ Utils.strToSeq(str, sp).filter(!skipBlank || StringUtils.isNotBlank(_)).map(converter).toSet
+ }
+
+ def iterableToStr[T](v: Iterable[T], stringConverter: T => String, sp: String = ","): String = {
+ v.map(stringConverter).mkString(sp)
}
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala
index 6ce84a70d..e52c39865 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiConf.scala
@@ -42,7 +42,7 @@ case class KyuubiConf(loadSysDefault: Boolean = true) extends Logging {
}
if (loadSysDefault) {
- val fromSysDefaults = Utils.getSystemProperties.filterKeys(_.startsWith("kyuubi."))
+ val fromSysDefaults = Utils.getSystemProperties.filterKeys(_.startsWith("kyuubi.")).toMap
loadFromMap(fromSysDefaults)
}
@@ -103,7 +103,6 @@ case class KyuubiConf(loadSysDefault: Boolean = true) extends Logging {
/** unset a parameter from the configuration */
def unset(key: String): KyuubiConf = {
- logDeprecationWarning(key)
settings.remove(key)
this
}
@@ -135,6 +134,31 @@ case class KyuubiConf(loadSysDefault: Boolean = true) extends Logging {
getAllWithPrefix(s"$KYUUBI_BATCH_CONF_PREFIX.$normalizedBatchType", "")
}
+ /** Get the kubernetes conf for specified kubernetes context and namespace. */
+ def getKubernetesConf(context: Option[String], namespace: Option[String]): KyuubiConf = {
+ val conf = this.clone
+ context.foreach { c =>
+ val contextConf =
+ getAllWithPrefix(s"$KYUUBI_KUBERNETES_CONF_PREFIX.$c", "").map { case (suffix, value) =>
+ s"$KYUUBI_KUBERNETES_CONF_PREFIX.$suffix" -> value
+ }
+ val contextNamespaceConf = namespace.map { ns =>
+ getAllWithPrefix(s"$KYUUBI_KUBERNETES_CONF_PREFIX.$c.$ns", "").map {
+ case (suffix, value) =>
+ s"$KYUUBI_KUBERNETES_CONF_PREFIX.$suffix" -> value
+ }
+ }.getOrElse(Map.empty)
+
+ (contextConf ++ contextNamespaceConf).foreach { case (key, value) =>
+ conf.set(key, value)
+ }
+ conf.set(KUBERNETES_CONTEXT, c)
+ namespace.foreach(ns => conf.set(KUBERNETES_NAMESPACE, ns))
+ conf
+ }
+ conf
+ }
+
/**
* Retrieve key-value pairs from [[KyuubiConf]] starting with `dropped.remainder`, and put them to
* the result map with the `dropped` of key being dropped.
@@ -189,6 +213,8 @@ case class KyuubiConf(loadSysDefault: Boolean = true) extends Logging {
s"and may be removed in the future. $comment")
}
}
+
+ def isRESTEnabled: Boolean = get(FRONTEND_PROTOCOLS).contains(FrontendProtocols.REST.toString)
}
/**
@@ -206,6 +232,7 @@ object KyuubiConf {
final val KYUUBI_HOME = "KYUUBI_HOME"
final val KYUUBI_ENGINE_ENV_PREFIX = "kyuubi.engineEnv"
final val KYUUBI_BATCH_CONF_PREFIX = "kyuubi.batchConf"
+ final val KYUUBI_KUBERNETES_CONF_PREFIX = "kyuubi.kubernetes"
final val USER_DEFAULTS_CONF_QUOTE = "___"
private[this] val kyuubiConfEntriesUpdateLock = new Object
@@ -386,12 +413,12 @@ object KyuubiConf {
"")
.version("1.4.0")
.stringConf
+ .transformToUpperCase
.toSequence()
- .transform(_.map(_.toUpperCase(Locale.ROOT)))
- .checkValue(
- _.forall(FrontendProtocols.values.map(_.toString).contains),
- s"the frontend protocol should be one or more of ${FrontendProtocols.values.mkString(",")}")
- .createWithDefault(Seq(FrontendProtocols.THRIFT_BINARY.toString))
+ .checkValues(FrontendProtocols)
+ .createWithDefault(Seq(
+ FrontendProtocols.THRIFT_BINARY.toString,
+ FrontendProtocols.REST.toString))
val FRONTEND_BIND_HOST: OptionalConfigEntry[String] = buildConf("kyuubi.frontend.bind.host")
.doc("Hostname or IP of the machine on which to run the frontend services.")
@@ -400,6 +427,16 @@ object KyuubiConf {
.stringConf
.createOptional
+ val FRONTEND_ADVERTISED_HOST: OptionalConfigEntry[String] =
+ buildConf("kyuubi.frontend.advertised.host")
+ .doc("Hostname or IP of the Kyuubi server's frontend services to publish to " +
+ "external systems such as the service discovery ensemble and metadata store. " +
+ "Use it when you want to advertise a different hostname or IP than the bind host.")
+ .version("1.8.0")
+ .serverOnly
+ .stringConf
+ .createOptional
+
val FRONTEND_THRIFT_BINARY_BIND_HOST: ConfigEntry[Option[String]] =
buildConf("kyuubi.frontend.thrift.binary.bind.host")
.doc("Hostname or IP of the machine on which to run the thrift frontend service " +
@@ -444,13 +481,13 @@ object KyuubiConf {
.stringConf
.createOptional
- val FRONTEND_THRIFT_BINARY_SSL_DISALLOWED_PROTOCOLS: ConfigEntry[Seq[String]] =
+ val FRONTEND_THRIFT_BINARY_SSL_DISALLOWED_PROTOCOLS: ConfigEntry[Set[String]] =
buildConf("kyuubi.frontend.thrift.binary.ssl.disallowed.protocols")
.doc("SSL versions to disallow for Kyuubi thrift binary frontend.")
.version("1.7.0")
.stringConf
- .toSequence()
- .createWithDefault(Seq("SSLv2", "SSLv3"))
+ .toSet()
+ .createWithDefault(Set("SSLv2", "SSLv3"))
val FRONTEND_THRIFT_BINARY_SSL_INCLUDE_CIPHER_SUITES: ConfigEntry[Seq[String]] =
buildConf("kyuubi.frontend.thrift.binary.ssl.include.ciphersuites")
@@ -726,7 +763,7 @@ object KyuubiConf {
.stringConf
.createWithDefault("X-Real-IP")
- val AUTHENTICATION_METHOD: ConfigEntry[Seq[String]] = buildConf("kyuubi.authentication")
+ val AUTHENTICATION_METHOD: ConfigEntry[Set[String]] = buildConf("kyuubi.authentication")
.doc("A comma-separated list of client authentication types." +
"<ul>" +
"<li>NOSASL: raw transport.</li>" +
@@ -761,18 +798,17 @@ object KyuubiConf {
.version("1.0.0")
.serverOnly
.stringConf
- .toSequence()
- .transform(_.map(_.toUpperCase(Locale.ROOT)))
- .checkValue(
- _.forall(AuthTypes.values.map(_.toString).contains),
- s"the authentication type should be one or more of ${AuthTypes.values.mkString(",")}")
- .createWithDefault(Seq(AuthTypes.NONE.toString))
+ .transformToUpperCase
+ .toSet()
+ .checkValues(AuthTypes)
+ .createWithDefault(Set(AuthTypes.NONE.toString))
val AUTHENTICATION_CUSTOM_CLASS: OptionalConfigEntry[String] =
buildConf("kyuubi.authentication.custom.class")
.doc("User-defined authentication implementation of " +
"org.apache.kyuubi.service.authentication.PasswdAuthenticationProvider")
.version("1.3.0")
+ .serverOnly
.stringConf
.createOptional
@@ -788,13 +824,16 @@ object KyuubiConf {
buildConf("kyuubi.authentication.ldap.url")
.doc("SPACE character separated LDAP connection URL(s).")
.version("1.0.0")
+ .serverOnly
.stringConf
.createOptional
- val AUTHENTICATION_LDAP_BASEDN: OptionalConfigEntry[String] =
- buildConf("kyuubi.authentication.ldap.base.dn")
+ val AUTHENTICATION_LDAP_BASE_DN: OptionalConfigEntry[String] =
+ buildConf("kyuubi.authentication.ldap.baseDN")
+ .withAlternative("kyuubi.authentication.ldap.base.dn")
.doc("LDAP base DN.")
- .version("1.0.0")
+ .version("1.7.0")
+ .serverOnly
.stringConf
.createOptional
@@ -802,21 +841,129 @@ object KyuubiConf {
buildConf("kyuubi.authentication.ldap.domain")
.doc("LDAP domain.")
.version("1.0.0")
+ .serverOnly
+ .stringConf
+ .createOptional
+
+ val AUTHENTICATION_LDAP_GROUP_DN_PATTERN: OptionalConfigEntry[String] =
+ buildConf("kyuubi.authentication.ldap.groupDNPattern")
+ .doc("COLON-separated list of patterns to use to find DNs for group entities in " +
+ "this directory. Use %s where the actual group name is to be substituted for. " +
+ "For example: CN=%s,CN=Groups,DC=subdomain,DC=domain,DC=com.")
+ .version("1.7.0")
+ .serverOnly
+ .stringConf
+ .createOptional
+
+ val AUTHENTICATION_LDAP_USER_DN_PATTERN: OptionalConfigEntry[String] =
+ buildConf("kyuubi.authentication.ldap.userDNPattern")
+ .doc("COLON-separated list of patterns to use to find DNs for users in this directory. " +
+ "Use %s where the actual username is to be substituted for. " +
+ "For example: CN=%s,CN=Users,DC=subdomain,DC=domain,DC=com.")
+ .version("1.7.0")
+ .serverOnly
.stringConf
.createOptional
- val AUTHENTICATION_LDAP_GUIDKEY: ConfigEntry[String] =
+ val AUTHENTICATION_LDAP_GROUP_FILTER: ConfigEntry[Set[String]] =
+ buildConf("kyuubi.authentication.ldap.groupFilter")
+ .doc("COMMA-separated list of LDAP Group names (short name not full DNs). " +
+ "For example: HiveAdmins,HadoopAdmins,Administrators")
+ .version("1.7.0")
+ .serverOnly
+ .stringConf
+ .toSet()
+ .createWithDefault(Set.empty)
+
+ val AUTHENTICATION_LDAP_USER_FILTER: ConfigEntry[Set[String]] =
+ buildConf("kyuubi.authentication.ldap.userFilter")
+ .doc("COMMA-separated list of LDAP usernames (just short names, not full DNs). " +
+ "For example: hiveuser,impalauser,hiveadmin,hadoopadmin")
+ .version("1.7.0")
+ .serverOnly
+ .stringConf
+ .toSet()
+ .createWithDefault(Set.empty)
+
+ val AUTHENTICATION_LDAP_GUID_KEY: ConfigEntry[String] =
buildConf("kyuubi.authentication.ldap.guidKey")
- .doc("LDAP attribute name whose values are unique in this LDAP server." +
- "For example:uid or cn.")
+ .doc("LDAP attribute name whose values are unique in this LDAP server. " +
+ "For example: uid or CN.")
.version("1.2.0")
+ .serverOnly
.stringConf
.createWithDefault("uid")
+ val AUTHENTICATION_LDAP_GROUP_MEMBERSHIP_KEY: ConfigEntry[String] =
+ buildConf("kyuubi.authentication.ldap.groupMembershipKey")
+ .doc("LDAP attribute name on the group object that contains the list of distinguished " +
+ "names for the user, group, and contact objects that are members of the group. " +
+ "For example: member, uniqueMember or memberUid")
+ .version("1.7.0")
+ .serverOnly
+ .stringConf
+ .createWithDefault("member")
+
+ val AUTHENTICATION_LDAP_USER_MEMBERSHIP_KEY: OptionalConfigEntry[String] =
+ buildConf("kyuubi.authentication.ldap.userMembershipKey")
+ .doc("LDAP attribute name on the user object that contains groups of which the user is " +
+ "a direct member, except for the primary group, which is represented by the " +
+ "primaryGroupId. For example: memberOf")
+ .version("1.7.0")
+ .serverOnly
+ .stringConf
+ .createOptional
+
+ val AUTHENTICATION_LDAP_GROUP_CLASS_KEY: ConfigEntry[String] =
+ buildConf("kyuubi.authentication.ldap.groupClassKey")
+ .doc("LDAP attribute name on the group entry that is to be used in LDAP group searches. " +
+ "For example: group, groupOfNames or groupOfUniqueNames.")
+ .version("1.7.0")
+ .serverOnly
+ .stringConf
+ .createWithDefault("groupOfNames")
+
+ val AUTHENTICATION_LDAP_CUSTOM_LDAP_QUERY: OptionalConfigEntry[String] =
+ buildConf("kyuubi.authentication.ldap.customLDAPQuery")
+ .doc("A full LDAP query that the LDAP authentication provider executes against the LDAP " +
+ "server. If this query returns a null result set, the LDAP provider fails the " +
+ "authentication request; it succeeds if the user is part of the result set. " +
+ "For example: `(&(objectClass=group)(objectClass=top)(instanceType=4)(cn=Domain*))`, " +
+ "`(&(objectClass=person)(|(sAMAccountName=admin)" +
+ "(|(memberOf=CN=Domain Admins,CN=Users,DC=domain,DC=com)" +
+ "(memberOf=CN=Administrators,CN=Builtin,DC=domain,DC=com))))`")
+ .version("1.7.0")
+ .serverOnly
+ .stringConf
+ .createOptional
+
+ val AUTHENTICATION_LDAP_BIND_USER: OptionalConfigEntry[String] =
+ buildConf("kyuubi.authentication.ldap.binddn")
+ .doc("The user with which to bind to the LDAP server, and search for the full domain name " +
+ "of the user being authenticated. This should be the full domain name of the user, and " +
+ "should have search access across all users in the LDAP tree. If not specified, then " +
+ "the user being authenticated will be used as the bind user. " +
+ "For example: CN=bindUser,CN=Users,DC=subdomain,DC=domain,DC=com")
+ .version("1.7.0")
+ .serverOnly
+ .stringConf
+ .createOptional
+
+ val AUTHENTICATION_LDAP_BIND_PASSWORD: OptionalConfigEntry[String] =
+ buildConf("kyuubi.authentication.ldap.bindpw")
+ .doc("The password for the bind user, to be used to search for the full name of the " +
+ "user being authenticated. If the username is specified, this parameter must also be " +
+ "specified.")
+ .version("1.7.0")
+ .serverOnly
+ .stringConf
+ .createOptional
+
val AUTHENTICATION_JDBC_DRIVER: OptionalConfigEntry[String] =
buildConf("kyuubi.authentication.jdbc.driver.class")
.doc("Driver class name for JDBC Authentication Provider.")
.version("1.6.0")
+ .serverOnly
.stringConf
.createOptional
@@ -824,6 +971,7 @@ object KyuubiConf {
buildConf("kyuubi.authentication.jdbc.url")
.doc("JDBC URL for JDBC Authentication Provider.")
.version("1.6.0")
+ .serverOnly
.stringConf
.createOptional
@@ -831,6 +979,7 @@ object KyuubiConf {
buildConf("kyuubi.authentication.jdbc.user")
.doc("Database user for JDBC Authentication Provider.")
.version("1.6.0")
+ .serverOnly
.stringConf
.createOptional
@@ -838,6 +987,7 @@ object KyuubiConf {
buildConf("kyuubi.authentication.jdbc.password")
.doc("Database password for JDBC Authentication Provider.")
.version("1.6.0")
+ .serverOnly
.stringConf
.createOptional
@@ -849,6 +999,7 @@ object KyuubiConf {
"The SQL statement must start with the `SELECT` clause. " +
"Available placeholders are `${user}` and `${password}`.")
.version("1.6.0")
+ .serverOnly
.stringConf
.createOptional
@@ -887,9 +1038,10 @@ object KyuubiConf {
"<li>auth-conf - authentication plus integrity and confidentiality protection. This is" +
" applicable only if Kyuubi is configured to use Kerberos authentication.</li></ul>")
.version("1.0.0")
+ .serverOnly
.stringConf
- .checkValues(SaslQOP.values.map(_.toString))
- .transform(_.toLowerCase(Locale.ROOT))
+ .checkValues(SaslQOP)
+ .transformToLowerCase
.createWithDefault(SaslQOP.AUTH.toString)
val FRONTEND_REST_BIND_HOST: ConfigEntry[Option[String]] =
@@ -994,6 +1146,15 @@ object KyuubiConf {
.stringConf
.createOptional
+ val KUBERNETES_CONTEXT_ALLOW_LIST: ConfigEntry[Set[String]] =
+ buildConf("kyuubi.kubernetes.context.allow.list")
+ .doc("The allowed Kubernetes context list. If it is empty," +
+ " there is no Kubernetes context limitation.")
+ .version("1.8.0")
+ .stringConf
+ .toSet()
+ .createWithDefault(Set.empty)
+
val KUBERNETES_NAMESPACE: ConfigEntry[String] =
buildConf("kyuubi.kubernetes.namespace")
.doc("The namespace that will be used for running the kyuubi pods and find engines.")
@@ -1001,6 +1162,15 @@ object KyuubiConf {
.stringConf
.createWithDefault("default")
+ val KUBERNETES_NAMESPACE_ALLOW_LIST: ConfigEntry[Set[String]] =
+ buildConf("kyuubi.kubernetes.namespace.allow.list")
+ .doc("The allowed Kubernetes namespace list. If it is empty," +
+ " there is no Kubernetes namespace limitation.")
+ .version("1.8.0")
+ .stringConf
+ .toSet()
+ .createWithDefault(Set.empty)
+
val KUBERNETES_MASTER: OptionalConfigEntry[String] =
buildConf("kyuubi.kubernetes.master.address")
.doc("The internal Kubernetes master (API server) address to be used for kyuubi.")
@@ -1060,6 +1230,15 @@ object KyuubiConf {
.booleanConf
.createWithDefault(false)
+ val KUBERNETES_TERMINATED_APPLICATION_RETAIN_PERIOD: ConfigEntry[Long] =
+ buildConf("kyuubi.kubernetes.terminatedApplicationRetainPeriod")
+ .doc("The period for which the Kyuubi server retains application information after " +
+ "the application terminates.")
+ .version("1.7.1")
+ .timeConf
+ .checkValue(_ > 0, "must be positive number")
+ .createWithDefault(Duration.ofMinutes(5).toMillis)
+
// ///////////////////////////////////////////////////////////////////////////////////////////////
// SQL Engine Configuration //
// ///////////////////////////////////////////////////////////////////////////////////////////////
@@ -1117,6 +1296,16 @@ object KyuubiConf {
.timeConf
.createWithDefault(0)
+ val ENGINE_SPARK_MAX_INITIAL_WAIT: ConfigEntry[Long] =
+ buildConf("kyuubi.session.engine.spark.max.initial.wait")
+ .doc("Max wait time for the initial connection to Spark engine. The engine will" +
+ " self-terminate if no new incoming connection is established within this time." +
+ " This setting only applies at the CONNECTION share level." +
+ " 0 or negative means not to self-terminate.")
+ .version("1.8.0")
+ .timeConf
+ .createWithDefault(Duration.ofSeconds(60).toMillis)
+
val ENGINE_FLINK_MAIN_RESOURCE: OptionalConfigEntry[String] =
buildConf("kyuubi.session.engine.flink.main.resource")
.doc("The package used to create Flink SQL engine remote job. If it is undefined," +
@@ -1134,6 +1323,15 @@ object KyuubiConf {
.intConf
.createWithDefault(1000000)
+ val ENGINE_FLINK_FETCH_TIMEOUT: OptionalConfigEntry[Long] =
+ buildConf("kyuubi.session.engine.flink.fetch.timeout")
+ .doc("Result fetch timeout for Flink engine. If the timeout is reached, the result " +
+ "fetch would be stopped and the rows fetched so far would be returned. If no data are " +
+ "fetched, a TimeoutException would be thrown.")
+ .version("1.8.0")
+ .timeConf
+ .createOptional
+
val ENGINE_TRINO_MAIN_RESOURCE: OptionalConfigEntry[String] =
buildConf("kyuubi.session.engine.trino.main.resource")
.doc("The package used to create Trino engine remote job. If it is undefined," +
@@ -1156,6 +1354,55 @@ object KyuubiConf {
.stringConf
.createOptional
+ val ENGINE_TRINO_CONNECTION_PASSWORD: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.trino.connection.password")
+ .doc("The password used for connecting to trino cluster")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
+ val ENGINE_TRINO_CONNECTION_KEYSTORE_PATH: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.trino.connection.keystore.path")
+ .doc("The keystore path used for connecting to trino cluster")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
+ val ENGINE_TRINO_CONNECTION_KEYSTORE_PASSWORD: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.trino.connection.keystore.password")
+ .doc("The keystore password used for connecting to trino cluster")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
+ val ENGINE_TRINO_CONNECTION_KEYSTORE_TYPE: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.trino.connection.keystore.type")
+ .doc("The keystore type used for connecting to trino cluster")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
+ val ENGINE_TRINO_CONNECTION_TRUSTSTORE_PATH: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.trino.connection.truststore.path")
+ .doc("The truststore path used for connecting to trino cluster")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
+ val ENGINE_TRINO_CONNECTION_TRUSTSTORE_PASSWORD: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.trino.connection.truststore.password")
+ .doc("The truststore password used for connecting to trino cluster")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
+ val ENGINE_TRINO_CONNECTION_TRUSTSTORE_TYPE: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.trino.connection.truststore.type")
+ .doc("The truststore type used for connecting to trino cluster")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
val ENGINE_TRINO_SHOW_PROGRESS: ConfigEntry[Boolean] =
buildConf("kyuubi.session.engine.trino.showProgress")
.doc("When true, show the progress bar and final info in the Trino engine log.")
@@ -1184,6 +1431,14 @@ object KyuubiConf {
.timeConf
.createWithDefault(Duration.ofSeconds(15).toMillis)
+ val ENGINE_ALIVE_MAX_FAILURES: ConfigEntry[Int] =
+ buildConf("kyuubi.session.engine.alive.max.failures")
+ .doc("The maximum number of failures allowed for the engine.")
+ .version("1.8.0")
+ .intConf
+ .checkValue(_ > 0, "Must be positive")
+ .createWithDefault(3)
+
val ENGINE_ALIVE_PROBE_ENABLED: ConfigEntry[Boolean] =
buildConf("kyuubi.session.engine.alive.probe.enabled")
.doc("Whether to enable the engine alive probe, it true, we will create a companion thrift" +
@@ -1247,6 +1502,14 @@ object KyuubiConf {
.version("1.2.0")
.fallbackConf(SESSION_TIMEOUT)
+ val SESSION_CLOSE_ON_DISCONNECT: ConfigEntry[Boolean] =
+ buildConf("kyuubi.session.close.on.disconnect")
+ .doc("Session will be closed when client disconnects from kyuubi gateway. " +
+ "Set this to false to have session outlive its parent connection.")
+ .version("1.8.0")
+ .booleanConf
+ .createWithDefault(true)
+
val BATCH_SESSION_IDLE_TIMEOUT: ConfigEntry[Long] = buildConf("kyuubi.batch.session.idle.timeout")
.doc("Batch session idle timeout, it will be closed when it's not accessed for this duration")
.version("1.6.2")
@@ -1266,7 +1529,7 @@ object KyuubiConf {
.timeConf
.createWithDefault(Duration.ofMinutes(30L).toMillis)
- val SESSION_CONF_IGNORE_LIST: ConfigEntry[Seq[String]] =
+ val SESSION_CONF_IGNORE_LIST: ConfigEntry[Set[String]] =
buildConf("kyuubi.session.conf.ignore.list")
.doc("A comma-separated list of ignored keys. If the client connection contains any of" +
" them, the key and the corresponding value will be removed silently during engine" +
@@ -1276,10 +1539,10 @@ object KyuubiConf {
" configurations via SET syntax.")
.version("1.2.0")
.stringConf
- .toSequence()
- .createWithDefault(Nil)
+ .toSet()
+ .createWithDefault(Set.empty)
- val SESSION_CONF_RESTRICT_LIST: ConfigEntry[Seq[String]] =
+ val SESSION_CONF_RESTRICT_LIST: ConfigEntry[Set[String]] =
buildConf("kyuubi.session.conf.restrict.list")
.doc("A comma-separated list of restricted keys. If the client connection contains any of" +
" them, the connection will be rejected explicitly during engine bootstrap and connection" +
@@ -1289,8 +1552,8 @@ object KyuubiConf {
" configurations via SET syntax.")
.version("1.2.0")
.stringConf
- .toSequence()
- .createWithDefault(Nil)
+ .toSet()
+ .createWithDefault(Set.empty)
val SESSION_USER_SIGN_ENABLED: ConfigEntry[Boolean] =
buildConf("kyuubi.session.user.sign.enabled")
@@ -1320,6 +1583,15 @@ object KyuubiConf {
.booleanConf
.createWithDefault(true)
+ val SESSION_ENGINE_STARTUP_DESTROY_TIMEOUT: ConfigEntry[Long] =
+ buildConf("kyuubi.session.engine.startup.destroy.timeout")
+ .doc("Engine startup process destroy wait time, if the process does not " +
+ "stop after this time, force destroy instead. This configuration only " +
+ s"takes effect when `${SESSION_ENGINE_STARTUP_WAIT_COMPLETION.key}=false`.")
+ .version("1.8.0")
+ .timeConf
+ .createWithDefault(Duration.ofSeconds(5).toMillis)
+
val SESSION_ENGINE_LAUNCH_ASYNC: ConfigEntry[Boolean] =
buildConf("kyuubi.session.engine.launch.async")
.doc("When opening kyuubi session, whether to launch the backend engine asynchronously." +
@@ -1329,7 +1601,7 @@ object KyuubiConf {
.booleanConf
.createWithDefault(true)
- val SESSION_LOCAL_DIR_ALLOW_LIST: ConfigEntry[Seq[String]] =
+ val SESSION_LOCAL_DIR_ALLOW_LIST: ConfigEntry[Set[String]] =
buildConf("kyuubi.session.local.dir.allow.list")
.doc("The local dir list that are allowed to access by the kyuubi session application. " +
" End-users might set some parameters such as `spark.files` and it will " +
@@ -1342,8 +1614,8 @@ object KyuubiConf {
.stringConf
.checkValue(dir => dir.startsWith(File.separator), "the dir should be absolute path")
.transform(dir => dir.stripSuffix(File.separator) + File.separator)
- .toSequence()
- .createWithDefault(Nil)
+ .toSet()
+ .createWithDefault(Set.empty)
val BATCH_APPLICATION_CHECK_INTERVAL: ConfigEntry[Long] =
buildConf("kyuubi.batch.application.check.interval")
@@ -1359,7 +1631,7 @@ object KyuubiConf {
.timeConf
.createWithDefault(Duration.ofMinutes(3).toMillis)
- val BATCH_CONF_IGNORE_LIST: ConfigEntry[Seq[String]] =
+ val BATCH_CONF_IGNORE_LIST: ConfigEntry[Set[String]] =
buildConf("kyuubi.batch.conf.ignore.list")
.doc("A comma-separated list of ignored keys for batch conf. If the batch conf contains" +
" any of them, the key and the corresponding value will be removed silently during batch" +
@@ -1371,8 +1643,8 @@ object KyuubiConf {
" for the Spark batch job with key `kyuubi.batchConf.spark.spark.master`.")
.version("1.6.0")
.stringConf
- .toSequence()
- .createWithDefault(Nil)
+ .toSet()
+ .createWithDefault(Set.empty)
val BATCH_INTERNAL_REST_CLIENT_SOCKET_TIMEOUT: ConfigEntry[Long] =
buildConf("kyuubi.batch.internal.rest.client.socket.timeout")
@@ -1402,6 +1674,50 @@ object KyuubiConf {
.timeConf
.createWithDefault(Duration.ofSeconds(5).toMillis)
+ val BATCH_RESOURCE_UPLOAD_ENABLED: ConfigEntry[Boolean] =
+ buildConf("kyuubi.batch.resource.upload.enabled")
+ .internal
+ .doc("Whether to enable Kyuubi batch resource upload function.")
+ .version("1.7.1")
+ .booleanConf
+ .createWithDefault(true)
+
+ val BATCH_SUBMITTER_ENABLED: ConfigEntry[Boolean] =
+ buildConf("kyuubi.batch.submitter.enabled")
+ .internal
+ .serverOnly
+ .doc("Batch API v2 requires the batch submitter to pick INITIALIZED batch jobs " +
+ "from the metastore and submit them to the Resource Manager. " +
+ "Note: Batch API v2 is experimental and under rapid development. This configuration " +
+ "is added so that early adopters can conveniently test the in-development Batch v2 API; " +
+ "it is not intended to be exposed to end users and may be removed at any time.")
+ .version("1.8.0")
+ .booleanConf
+ .createWithDefault(false)
+
+ val BATCH_SUBMITTER_THREADS: ConfigEntry[Int] =
+ buildConf("kyuubi.batch.submitter.threads")
+ .internal
+ .serverOnly
+ .doc("Number of threads in the batch job submitter. This configuration only takes effect " +
+ s"when ${BATCH_SUBMITTER_ENABLED.key} is enabled")
+ .version("1.8.0")
+ .intConf
+ .createWithDefault(16)
+
+ val BATCH_IMPL_VERSION: ConfigEntry[String] =
+ buildConf("kyuubi.batch.impl.version")
+ .internal
+ .serverOnly
+ .doc("Batch API version, candidates: 1, 2. Only takes effect when " +
+ s"${BATCH_SUBMITTER_ENABLED.key} is true, otherwise the v1 implementation is always used. " +
+ "Note: Batch API v2 is experimental and under rapid development. This configuration " +
+ "is added so that early adopters can conveniently test the in-development Batch v2 API; " +
+ "it is not intended to be exposed to end users and may be removed at any time.")
+ .version("1.8.0")
+ .stringConf
+ .createWithDefault("1")
+
val SERVER_EXEC_POOL_SIZE: ConfigEntry[Int] =
buildConf("kyuubi.backend.server.exec.pool.size")
.doc("Number of threads in the operation execution thread pool of Kyuubi server")
@@ -1459,16 +1775,6 @@ object KyuubiConf {
.intConf
.createWithDefault(10)
- val METADATA_REQUEST_RETRY_THREADS: ConfigEntry[Int] =
- buildConf("kyuubi.metadata.request.retry.threads")
- .doc("Number of threads in the metadata request retry manager thread pool. The metadata" +
- " store might be unavailable sometimes and the requests will fail, tolerant for this" +
- " case and unblock the main thread, we support retrying the failed requests" +
- " in an async way.")
- .version("1.6.0")
- .intConf
- .createWithDefault(10)
-
val METADATA_REQUEST_RETRY_INTERVAL: ConfigEntry[Long] =
buildConf("kyuubi.metadata.request.retry.interval")
.doc("The interval to check and trigger the metadata request retry tasks.")
@@ -1476,10 +1782,31 @@ object KyuubiConf {
.timeConf
.createWithDefault(Duration.ofSeconds(5).toMillis)
- val METADATA_REQUEST_RETRY_QUEUE_SIZE: ConfigEntry[Int] =
- buildConf("kyuubi.metadata.request.retry.queue.size")
+ val METADATA_REQUEST_ASYNC_RETRY_ENABLED: ConfigEntry[Boolean] =
+ buildConf("kyuubi.metadata.request.async.retry.enabled")
+ .doc("Whether to retry asynchronously when a metadata request fails. When true, return " +
+ "a success response immediately even if the metadata request failed, and retry " +
+ "it in the background until it succeeds, to tolerate long metadata store outages " +
+ "without blocking the submission request.")
+ .version("1.7.0")
+ .booleanConf
+ .createWithDefault(true)
+
+ val METADATA_REQUEST_ASYNC_RETRY_THREADS: ConfigEntry[Int] =
+ buildConf("kyuubi.metadata.request.async.retry.threads")
+ .withAlternative("kyuubi.metadata.request.retry.threads")
+ .doc("Number of threads in the metadata request async retry manager thread pool. Only " +
+ s"takes effect when ${METADATA_REQUEST_ASYNC_RETRY_ENABLED.key} is `true`.")
+ .version("1.6.0")
+ .intConf
+ .createWithDefault(10)
+
+ val METADATA_REQUEST_ASYNC_RETRY_QUEUE_SIZE: ConfigEntry[Int] =
+ buildConf("kyuubi.metadata.request.async.retry.queue.size")
+ .withAlternative("kyuubi.metadata.request.retry.queue.size")
.doc("The maximum queue size for buffering metadata requests in memory when the external" +
- " metadata storage is down. Requests will be dropped if the queue exceeds.")
+ " metadata storage is down. Requests will be dropped if the queue exceeds. Only" +
+ s" take affect when ${METADATA_REQUEST_ASYNC_RETRY_ENABLED.key} is `true`.")
.version("1.6.0")
.intConf
.createWithDefault(65536)
@@ -1557,11 +1884,29 @@ object KyuubiConf {
.checkValue(_ >= 1000, "must >= 1s if set")
.createOptional
+ val OPERATION_QUERY_TIMEOUT_MONITOR_ENABLED: ConfigEntry[Boolean] =
+ buildConf("kyuubi.operation.query.timeout.monitor.enabled")
+ .doc("Whether to monitor query timeouts on the server side.")
+ .version("1.8.0")
+ .serverOnly
+ .internal
+ .booleanConf
+ .createWithDefault(true)
+
+ val OPERATION_RESULT_MAX_ROWS: ConfigEntry[Int] =
+ buildConf("kyuubi.operation.result.max.rows")
+ .doc("Max rows of Spark query results. Rows exceeding the limit would be ignored. " +
+ "Set this value to 0 to disable the max rows limit.")
+ .version("1.6.0")
+ .intConf
+ .createWithDefault(0)
+
val OPERATION_INCREMENTAL_COLLECT: ConfigEntry[Boolean] =
buildConf("kyuubi.operation.incremental.collect")
.internal
.doc("When true, the executor side result will be sequentially calculated and returned to" +
- " the Spark driver side.")
+ s" the Spark driver side. Note that ${OPERATION_RESULT_MAX_ROWS.key} will be ignored" +
+ " in incremental collect mode.")
.version("1.4.0")
.booleanConf
.createWithDefault(false)
@@ -1576,16 +1921,16 @@ object KyuubiConf {
.version("1.7.0")
.stringConf
.checkValues(Set("arrow", "thrift"))
- .transform(_.toLowerCase(Locale.ROOT))
+ .transformToLowerCase
.createWithDefault("thrift")
- val OPERATION_RESULT_MAX_ROWS: ConfigEntry[Int] =
- buildConf("kyuubi.operation.result.max.rows")
- .doc("Max rows of Spark query results. Rows exceeding the limit would be ignored. " +
- "By setting this value to 0 to disable the max rows limit.")
- .version("1.6.0")
- .intConf
- .createWithDefault(0)
+ val ARROW_BASED_ROWSET_TIMESTAMP_AS_STRING: ConfigEntry[Boolean] =
+ buildConf("kyuubi.operation.result.arrow.timestampAsString")
+ .doc("When true, arrow-based rowsets will convert columns of type timestamp to strings for" +
+ " transmission.")
+ .version("1.7.0")
+ .booleanConf
+ .createWithDefault(false)
val SERVER_OPERATION_LOG_DIR_ROOT: ConfigEntry[String] =
buildConf("kyuubi.operation.log.dir.root")
@@ -1601,8 +1946,8 @@ object KyuubiConf {
.doc(s"(deprecated) - Using kyuubi.engine.share.level instead")
.version("1.0.0")
.stringConf
- .transform(_.toUpperCase(Locale.ROOT))
- .checkValues(ShareLevel.values.map(_.toString))
+ .transformToUpperCase
+ .checkValues(ShareLevel)
.createWithDefault(ShareLevel.USER.toString)
// [ZooKeeper Data Model]
@@ -1616,7 +1961,7 @@ object KyuubiConf {
.doc("(deprecated) - Using kyuubi.engine.share.level.subdomain instead")
.version("1.2.0")
.stringConf
- .transform(_.toLowerCase(Locale.ROOT))
+ .transformToLowerCase
.checkValue(validZookeeperSubPath.matcher(_).matches(), "must be valid zookeeper sub path.")
.createOptional
@@ -1676,13 +2021,15 @@ object KyuubiConf {
" all the capacity of the Trino.</li>" +
"<li>HIVE_SQL: specify this engine type will launch a Hive engine which can provide" +
" all the capacity of the Hive Server2.</li>" +
- "<li>JDBC: specify this engine type will launch a JDBC engine which can provide" +
- " a MySQL protocol connector, for now we only support Doris dialect.</li>" +
+ "<li>JDBC: specify this engine type will launch a JDBC engine which can forward " +
+ " queries to the database system through the certain JDBC driver, " +
+ " for now, it supports Doris and Phoenix.</li>" +
+ "<li>CHAT: specify this engine type will launch a Chat engine.</li>" +
"</ul>")
.version("1.4.0")
.stringConf
- .transform(_.toUpperCase(Locale.ROOT))
- .checkValues(EngineType.values.map(_.toString))
+ .transformToUpperCase
+ .checkValues(EngineType)
.createWithDefault(EngineType.SPARK_SQL.toString)
val ENGINE_POOL_IGNORE_SUBDOMAIN: ConfigEntry[Boolean] =
@@ -1705,6 +2052,7 @@ object KyuubiConf {
.doc("This parameter is introduced as a server-side parameter " +
"controlling the upper limit of the engine pool.")
.version("1.4.0")
+ .serverOnly
.intConf
.checkValue(s => s > 0 && s < 33, "Invalid engine pool threshold, it should be in [1, 32]")
.createWithDefault(9)
@@ -1718,7 +2066,7 @@ object KyuubiConf {
.intConf
.createWithDefault(-1)
- val ENGINE_POOL_BALANCE_POLICY: ConfigEntry[String] =
+ val ENGINE_POOL_SELECT_POLICY: ConfigEntry[String] =
buildConf("kyuubi.engine.pool.selectPolicy")
.doc("The select policy of an engine from the corresponding engine pool engine for " +
"a session.
" +
@@ -1727,7 +2075,7 @@ object KyuubiConf {
"
")
.version("1.7.0")
.stringConf
- .transform(_.toUpperCase(Locale.ROOT))
+ .transformToUpperCase
.checkValues(Set("RANDOM", "POLLING"))
.createWithDefault("RANDOM")
@@ -1751,24 +2099,24 @@ object KyuubiConf {
.toSequence(";")
.createWithDefault(Nil)
- val ENGINE_DEREGISTER_EXCEPTION_CLASSES: ConfigEntry[Seq[String]] =
+ val ENGINE_DEREGISTER_EXCEPTION_CLASSES: ConfigEntry[Set[String]] =
buildConf("kyuubi.engine.deregister.exception.classes")
.doc("A comma-separated list of exception classes. If there is any exception thrown," +
" whose class matches the specified classes, the engine would deregister itself.")
.version("1.2.0")
.stringConf
- .toSequence()
- .createWithDefault(Nil)
+ .toSet()
+ .createWithDefault(Set.empty)
- val ENGINE_DEREGISTER_EXCEPTION_MESSAGES: ConfigEntry[Seq[String]] =
+ val ENGINE_DEREGISTER_EXCEPTION_MESSAGES: ConfigEntry[Set[String]] =
buildConf("kyuubi.engine.deregister.exception.messages")
.doc("A comma-separated list of exception messages. If there is any exception thrown," +
" whose message or stacktrace matches the specified message list, the engine would" +
" deregister itself.")
.version("1.2.0")
.stringConf
- .toSequence()
- .createWithDefault(Nil)
+ .toSet()
+ .createWithDefault(Set.empty)
val ENGINE_DEREGISTER_JOB_MAX_FAILURES: ConfigEntry[Int] =
buildConf("kyuubi.engine.deregister.job.max.failures")
@@ -1850,12 +2198,34 @@ object KyuubiConf {
.stringConf
.createWithDefault("file:///tmp/kyuubi/events")
+ val SERVER_EVENT_KAFKA_TOPIC: OptionalConfigEntry[String] =
+ buildConf("kyuubi.backend.server.event.kafka.topic")
+ .doc("The topic that server events are sent to by the built-in Kafka logger")
+ .version("1.8.0")
+ .serverOnly
+ .stringConf
+ .createOptional
+
+ val SERVER_EVENT_KAFKA_CLOSE_TIMEOUT: ConfigEntry[Long] =
+ buildConf("kyuubi.backend.server.event.kafka.close.timeout")
+ .doc("Period to wait for Kafka producer of server event handlers to close.")
+ .version("1.8.0")
+ .serverOnly
+ .timeConf
+ .createWithDefault(Duration.ofMillis(5000).toMillis)
+
val SERVER_EVENT_LOGGERS: ConfigEntry[Seq[String]] =
buildConf("kyuubi.backend.server.event.loggers")
.doc("A comma-separated list of server history loggers, where session/operation etc" +
" events go.
" +
s"
JSON: the events will be written to the location of" +
s" ${SERVER_EVENT_JSON_LOG_PATH.key}
" +
+ s"<li>KAFKA: the events will be serialized in JSON format" +
+ s" and sent to topic of `${SERVER_EVENT_KAFKA_TOPIC.key}`." +
+ s" Note: For the configs of Kafka producer," +
+ s" please specify them with the prefix: `kyuubi.backend.server.event.kafka.`." +
+ s" For example, `kyuubi.backend.server.event.kafka.bootstrap.servers=127.0.0.1:9092`" +
+ s"</li>" +
s"
JDBC: to be done
" +
s"
CUSTOM: User-defined event handlers.
" +
" Note that: Kyuubi supports custom event handlers with the Java SPI." +
@@ -1866,9 +2236,11 @@ object KyuubiConf {
.version("1.4.0")
.serverOnly
.stringConf
- .transform(_.toUpperCase(Locale.ROOT))
+ .transformToUpperCase
.toSequence()
- .checkValue(_.toSet.subsetOf(Set("JSON", "JDBC", "CUSTOM")), "Unsupported event loggers")
+ .checkValue(
+ _.toSet.subsetOf(Set("JSON", "JDBC", "CUSTOM", "KAFKA")),
+ "Unsupported event loggers")
.createWithDefault(Nil)
@deprecated("using kyuubi.engine.spark.event.loggers instead", "1.6.0")
@@ -1888,7 +2260,7 @@ object KyuubiConf {
" which has a zero-arg constructor.")
.version("1.3.0")
.stringConf
- .transform(_.toUpperCase(Locale.ROOT))
+ .transformToUpperCase
.toSequence()
.checkValue(
_.toSet.subsetOf(Set("SPARK", "JSON", "JDBC", "CUSTOM")),
@@ -1950,8 +2322,23 @@ object KyuubiConf {
"subclass of `EngineSecuritySecretProvider`.")
.version("1.5.0")
.stringConf
- .createWithDefault(
- "org.apache.kyuubi.service.authentication.ZooKeeperEngineSecuritySecretProviderImpl")
+ .transform {
+ case "simple" =>
+ "org.apache.kyuubi.service.authentication.SimpleEngineSecuritySecretProviderImpl"
+ case "zookeeper" =>
+ "org.apache.kyuubi.service.authentication.ZooKeeperEngineSecuritySecretProviderImpl"
+ case other => other
+ }
+ .createWithDefault("zookeeper")
+
+ val SIMPLE_SECURITY_SECRET_PROVIDER_PROVIDER_SECRET: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.security.secret.provider.simple.secret")
+ .internal
+ .doc("The secret key used for internal security access. Only takes effect when " +
+ s"${ENGINE_SECURITY_SECRET_PROVIDER.key} is 'simple'")
+ .version("1.7.0")
+ .stringConf
+ .createOptional
val ENGINE_SECURITY_CRYPTO_KEY_LENGTH: ConfigEntry[Int] =
buildConf("kyuubi.engine.security.crypto.keyLength")
@@ -1999,14 +2386,14 @@ object KyuubiConf {
val OPERATION_PLAN_ONLY_MODE: ConfigEntry[String] =
buildConf("kyuubi.operation.plan.only.mode")
.doc("Configures the statement performed mode, The value can be 'parse', 'analyze', " +
- "'optimize', 'optimize_with_stats', 'physical', 'execution', or 'none', " +
+ "'optimize', 'optimize_with_stats', 'physical', 'execution', 'lineage' or 'none', " +
"when it is 'none', indicate to the statement will be fully executed, otherwise " +
"only way without executing the query. different engines currently support different " +
"modes, the Spark engine supports all modes, and the Flink engine supports 'parse', " +
"'physical', and 'execution', other engines do not support planOnly currently.")
.version("1.4.0")
.stringConf
- .transform(_.toUpperCase(Locale.ROOT))
+ .transformToUpperCase
.checkValue(
mode =>
Set(
@@ -2016,10 +2403,11 @@ object KyuubiConf {
"OPTIMIZE_WITH_STATS",
"PHYSICAL",
"EXECUTION",
+ "LINEAGE",
"NONE").contains(mode),
"Invalid value for 'kyuubi.operation.plan.only.mode'. Valid values are" +
"'parse', 'analyze', 'optimize', 'optimize_with_stats', 'physical', 'execution' and " +
- "'none'.")
+ "'lineage', 'none'.")
.createWithDefault(NoneMode.name)
val OPERATION_PLAN_ONLY_OUT_STYLE: ConfigEntry[String] =
@@ -2029,14 +2417,11 @@ object KyuubiConf {
"of the Spark engine")
.version("1.7.0")
.stringConf
- .transform(_.toUpperCase(Locale.ROOT))
- .checkValue(
- mode => Set("PLAIN", "JSON").contains(mode),
- "Invalid value for 'kyuubi.operation.plan.only.output.style'. Valid values are " +
- "'plain', 'json'.")
+ .transformToUpperCase
+ .checkValues(Set("PLAIN", "JSON"))
.createWithDefault(PlainStyle.name)
- val OPERATION_PLAN_ONLY_EXCLUDES: ConfigEntry[Seq[String]] =
+ val OPERATION_PLAN_ONLY_EXCLUDES: ConfigEntry[Set[String]] =
buildConf("kyuubi.operation.plan.only.excludes")
.doc("Comma-separated list of query plan names, in the form of simple class names, i.e, " +
"for `SET abc=xyz`, the value will be `SetCommand`. For those auxiliary plans, such as " +
@@ -2046,14 +2431,21 @@ object KyuubiConf {
s"See also ${OPERATION_PLAN_ONLY_MODE.key}.")
.version("1.5.0")
.stringConf
- .toSequence()
- .createWithDefault(Seq(
+ .toSet()
+ .createWithDefault(Set(
"ResetCommand",
"SetCommand",
"SetNamespaceCommand",
"UseStatement",
"SetCatalogAndNamespace"))
+ val LINEAGE_PARSER_PLUGIN_PROVIDER: ConfigEntry[String] =
+ buildConf("kyuubi.lineage.parser.plugin.provider")
+ .doc("The provider for the Spark lineage parser plugin.")
+ .version("1.8.0")
+ .stringConf
+ .createWithDefault("org.apache.kyuubi.plugin.lineage.LineageParserProvider")
+
object OperationLanguages extends Enumeration with Logging {
type OperationLanguage = Value
val PYTHON, SQL, SCALA, UNKNOWN = Value
@@ -2072,22 +2464,27 @@ object KyuubiConf {
val OPERATION_LANGUAGE: ConfigEntry[String] =
buildConf("kyuubi.operation.language")
.doc("Choose a programing language for the following inputs" +
- "
SQL: (Default) Run all following statements as SQL queries.
" +
- "
SCALA: Run all following input a scala codes
")
+ "<ul>" +
+ "<li>SQL: (Default) Run all following statements as SQL queries.</li>" +
+ "<li>SCALA: Run all following input as scala codes</li>" +
+ "<li>PYTHON: (Experimental) Run all following input as Python codes with Spark engine" +
+ "</li>" +
+ "</ul>")
.version("1.5.0")
.stringConf
- .transform(_.toUpperCase(Locale.ROOT))
- .checkValues(OperationLanguages.values.map(_.toString))
+ .transformToUpperCase
+ .checkValues(OperationLanguages)
.createWithDefault(OperationLanguages.SQL.toString)
- val SESSION_CONF_ADVISOR: OptionalConfigEntry[String] =
+ val SESSION_CONF_ADVISOR: OptionalConfigEntry[Seq[String]] =
buildConf("kyuubi.session.conf.advisor")
- .doc("A config advisor plugin for Kyuubi Server. This plugin can provide some custom " +
+ .doc("A config advisor plugin for Kyuubi Server. This plugin can provide a list of custom " +
"configs for different users or session configs and overwrite the session configs before " +
"opening a new session. This config value should be a subclass of " +
"`org.apache.kyuubi.plugin.SessionConfAdvisor` which has a zero-arg constructor.")
.version("1.5.0")
.stringConf
+ .toSequence()
.createOptional
val GROUP_PROVIDER: ConfigEntry[String] =
@@ -2191,14 +2588,14 @@ object KyuubiConf {
val ENGINE_FLINK_MEMORY: ConfigEntry[String] =
buildConf("kyuubi.engine.flink.memory")
- .doc("The heap memory for the Flink SQL engine")
+ .doc("The heap memory for the Flink SQL engine. Only effective in yarn session mode.")
.version("1.6.0")
.stringConf
.createWithDefault("1g")
val ENGINE_FLINK_JAVA_OPTIONS: OptionalConfigEntry[String] =
buildConf("kyuubi.engine.flink.java.options")
- .doc("The extra Java options for the Flink SQL engine")
+ .doc("The extra Java options for the Flink SQL engine. Only effective in yarn session mode.")
.version("1.6.0")
.stringConf
.createOptional
@@ -2206,11 +2603,19 @@ object KyuubiConf {
val ENGINE_FLINK_EXTRA_CLASSPATH: OptionalConfigEntry[String] =
buildConf("kyuubi.engine.flink.extra.classpath")
.doc("The extra classpath for the Flink SQL engine, for configuring the location" +
- " of hadoop client jars, etc")
+ " of hadoop client jars, etc. Only effective in yarn session mode.")
.version("1.6.0")
.stringConf
.createOptional
+ val ENGINE_FLINK_APPLICATION_JARS: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.flink.application.jars")
+ .doc("A comma-separated list of the local jars to be shipped with the job to the cluster. " +
+ "For example, SQL UDF jars. Only effective in yarn application mode.")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
val SERVER_LIMIT_CONNECTIONS_PER_USER: OptionalConfigEntry[Int] =
buildConf("kyuubi.server.limit.connections.per.user")
.doc("Maximum kyuubi server connections per user." +
@@ -2238,17 +2643,28 @@ object KyuubiConf {
.intConf
.createOptional
- val SERVER_LIMIT_CONNECTIONS_USER_UNLIMITED_LIST: ConfigEntry[Seq[String]] =
+ val SERVER_LIMIT_CONNECTIONS_USER_UNLIMITED_LIST: ConfigEntry[Set[String]] =
buildConf("kyuubi.server.limit.connections.user.unlimited.list")
- .doc("The maximin connections of the user in the white list will not be limited.")
+ .doc("The maximum connections of the user in the white list will not be limited.")
.version("1.7.0")
.serverOnly
.stringConf
- .toSequence()
- .createWithDefault(Nil)
+ .toSet()
+ .createWithDefault(Set.empty)
+
+ val SERVER_LIMIT_CONNECTIONS_USER_DENY_LIST: ConfigEntry[Set[String]] =
+ buildConf("kyuubi.server.limit.connections.user.deny.list")
+ .doc("Users in the deny list will be denied connection to the Kyuubi server. " +
+ "If a user appears in both user.unlimited.list and user.deny.list, " +
+ "the deny list takes priority.")
+ .version("1.8.0")
+ .serverOnly
+ .stringConf
+ .toSet()
+ .createWithDefault(Set.empty)
val SERVER_LIMIT_BATCH_CONNECTIONS_PER_USER: OptionalConfigEntry[Int] =
- buildConf("kyuubi.server.batch.limit.connections.per.user")
+ buildConf("kyuubi.server.limit.batch.connections.per.user")
.doc("Maximum kyuubi server batch connections per user." +
" Any user exceeding this limit will not be allowed to connect.")
.version("1.7.0")
@@ -2257,7 +2673,7 @@ object KyuubiConf {
.createOptional
val SERVER_LIMIT_BATCH_CONNECTIONS_PER_IPADDRESS: OptionalConfigEntry[Int] =
- buildConf("kyuubi.server.batch.limit.connections.per.ipaddress")
+ buildConf("kyuubi.server.limit.batch.connections.per.ipaddress")
.doc("Maximum kyuubi server batch connections per ipaddress." +
" Any user exceeding this limit will not be allowed to connect.")
.version("1.7.0")
@@ -2266,7 +2682,7 @@ object KyuubiConf {
.createOptional
val SERVER_LIMIT_BATCH_CONNECTIONS_PER_USER_IPADDRESS: OptionalConfigEntry[Int] =
- buildConf("kyuubi.server.batch.limit.connections.per.user.ipaddress")
+ buildConf("kyuubi.server.limit.batch.connections.per.user.ipaddress")
.doc("Maximum kyuubi server batch connections per user:ipaddress combination." +
" Any user-ipaddress exceeding this limit will not be allowed to connect.")
.version("1.7.0")
@@ -2274,6 +2690,15 @@ object KyuubiConf {
.intConf
.createOptional
+ val SERVER_LIMIT_CLIENT_FETCH_MAX_ROWS: OptionalConfigEntry[Int] =
+ buildConf("kyuubi.server.limit.client.fetch.max.rows")
+ .doc("Maximum number of rows a client may request in one fetch-results operation. " +
+ "If the max rows specified by the client exceeds this limit, the request fails directly.")
+ .version("1.8.0")
+ .serverOnly
+ .intConf
+ .createOptional
+
val SESSION_PROGRESS_ENABLE: ConfigEntry[Boolean] =
buildConf("kyuubi.operation.progress.enabled")
.doc("Whether to enable the operation progress. When true," +
@@ -2290,6 +2715,24 @@ object KyuubiConf {
.regexConf
.createOptional
+ val SERVER_PERIODIC_GC_INTERVAL: ConfigEntry[Long] =
+ buildConf("kyuubi.server.periodicGC.interval")
+ .doc("How often to trigger a garbage collection.")
+ .version("1.7.0")
+ .serverOnly
+ .timeConf
+ .createWithDefaultString("PT30M")
+
+ val SERVER_ADMINISTRATORS: ConfigEntry[Set[String]] =
+ buildConf("kyuubi.server.administrators")
+ .doc("Comma-separated list of Kyuubi service administrators. " +
+ "We use this config to grant admin permission to any service accounts.")
+ .version("1.8.0")
+ .serverOnly
+ .stringConf
+ .toSet()
+ .createWithDefault(Set.empty)
+
val OPERATION_SPARK_LISTENER_ENABLED: ConfigEntry[Boolean] =
buildConf("kyuubi.operation.spark.listener.enabled")
.doc("When set to true, Spark engine registers an SQLOperationListener before executing " +
@@ -2312,6 +2755,13 @@ object KyuubiConf {
.stringConf
.createOptional
+ val ENGINE_JDBC_CONNECTION_PROPAGATECREDENTIAL: ConfigEntry[Boolean] =
+ buildConf("kyuubi.engine.jdbc.connection.propagateCredential")
+ .doc("Whether to use the session's user and password to connect to the database")
+ .version("1.8.0")
+ .booleanConf
+ .createWithDefault(false)
+
val ENGINE_JDBC_CONNECTION_USER: OptionalConfigEntry[String] =
buildConf("kyuubi.engine.jdbc.connection.user")
.doc("The user is used for connecting to server")
@@ -2348,6 +2798,24 @@ object KyuubiConf {
.stringConf
.createOptional
+ val ENGINE_JDBC_INITIALIZE_SQL: ConfigEntry[Seq[String]] =
+ buildConf("kyuubi.engine.jdbc.initialize.sql")
+ .doc("Semicolon-separated list of SQL statements to run in the newly created " +
+ "engine before queries, e.g. use `SELECT 1` to eagerly activate the JDBCClient.")
+ .version("1.8.0")
+ .stringConf
+ .toSequence(";")
+ .createWithDefaultString("SELECT 1")
+
+ val ENGINE_JDBC_SESSION_INITIALIZE_SQL: ConfigEntry[Seq[String]] =
+ buildConf("kyuubi.engine.jdbc.session.initialize.sql")
+ .doc("Semicolon-separated list of SQL statements to run in the newly created " +
+ "engine session before queries.")
+ .version("1.8.0")
+ .stringConf
+ .toSequence(";")
+ .createWithDefault(Nil)
+
val ENGINE_OPERATION_CONVERT_CATALOG_DATABASE_ENABLED: ConfigEntry[Boolean] =
buildConf("kyuubi.engine.operation.convert.catalog.database.enabled")
.doc("When set to true, The engine converts the JDBC methods of set/get Catalog " +
@@ -2356,6 +2824,53 @@ object KyuubiConf {
.booleanConf
.createWithDefault(true)
+ val ENGINE_SUBMIT_TIMEOUT: ConfigEntry[Long] =
+ buildConf("kyuubi.engine.submit.timeout")
+ .doc("Period to tolerate the Driver Pod being temporarily invisible after submitting. " +
+ "In some Resource Managers, e.g. K8s, the Driver Pod is not visible immediately " +
+ "after `spark-submit` returns.")
+ .version("1.7.1")
+ .timeConf
+ .createWithDefaultString("PT30S")
+
+ val ENGINE_KUBERNETES_SUBMIT_TIMEOUT: ConfigEntry[Long] =
+ buildConf("kyuubi.engine.kubernetes.submit.timeout")
+ .doc("The engine submit timeout for Kubernetes application.")
+ .version("1.7.2")
+ .fallbackConf(ENGINE_SUBMIT_TIMEOUT)
+
+ val ENGINE_YARN_SUBMIT_TIMEOUT: ConfigEntry[Long] =
+ buildConf("kyuubi.engine.yarn.submit.timeout")
+ .doc("The engine submit timeout for YARN application.")
+ .version("1.7.2")
+ .fallbackConf(ENGINE_SUBMIT_TIMEOUT)
+
+ object YarnUserStrategy extends Enumeration {
+ type YarnUserStrategy = Value
+ val NONE, ADMIN, OWNER = Value
+ }
+
+ val YARN_USER_STRATEGY: ConfigEntry[String] =
+ buildConf("kyuubi.yarn.user.strategy")
+ .doc("Determine which user to use to construct YARN client for application management, " +
+ "e.g. kill application. Options:<ul>" +
+ "<li>NONE: use Kyuubi server user.</li>" +
+ "<li>ADMIN: use admin user configured in `kyuubi.yarn.user.admin`.</li>" +
+ "<li>OWNER: use session user, typically is application owner.</li>" +
+ "</ul>")
+ .version("1.8.0")
+ .stringConf
+ .checkValues(YarnUserStrategy)
+ .createWithDefault("NONE")
+
+ val YARN_USER_ADMIN: ConfigEntry[String] =
+ buildConf("kyuubi.yarn.user.admin")
+ .doc(s"When ${YARN_USER_STRATEGY.key} is set to ADMIN, use this admin user to " +
+ "construct YARN client for application management, e.g. kill application.")
+ .version("1.8.0")
+ .stringConf
+ .createWithDefault("yarn")
+
/**
* Holds information about keys that have been deprecated.
*
@@ -2427,6 +2942,84 @@ object KyuubiConf {
Map(configs.map { cfg => cfg.key -> cfg }: _*)
}
+ val ENGINE_CHAT_MEMORY: ConfigEntry[String] =
+ buildConf("kyuubi.engine.chat.memory")
+ .doc("The heap memory for the Chat engine")
+ .version("1.8.0")
+ .stringConf
+ .createWithDefault("1g")
+
+ val ENGINE_CHAT_JAVA_OPTIONS: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.chat.java.options")
+ .doc("The extra Java options for the Chat engine")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
+ val ENGINE_CHAT_PROVIDER: ConfigEntry[String] =
+ buildConf("kyuubi.engine.chat.provider")
+ .doc("The provider for the Chat engine. Candidates:<ul>" +
+ "<li>ECHO: simply replies a welcome message.</li>" +
+ "<li>GPT: a.k.a ChatGPT, powered by OpenAI.</li>" +
+ "</ul>")
+ .version("1.8.0")
+ .stringConf
+ .transform {
+ case "ECHO" | "echo" => "org.apache.kyuubi.engine.chat.provider.EchoProvider"
+ case "GPT" | "gpt" | "ChatGPT" => "org.apache.kyuubi.engine.chat.provider.ChatGPTProvider"
+ case other => other
+ }
+ .createWithDefault("ECHO")
+
+ val ENGINE_CHAT_GPT_API_KEY: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.chat.gpt.apiKey")
+ .doc("The key to access OpenAI open API, which can be obtained at " +
+ "https://platform.openai.com/account/api-keys")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
+ val ENGINE_CHAT_GPT_MODEL: ConfigEntry[String] =
+ buildConf("kyuubi.engine.chat.gpt.model")
+ .doc("ID of the model used in ChatGPT. For available models, refer to OpenAI's " +
+ "[Model overview](https://platform.openai.com/docs/models/overview).")
+ .version("1.8.0")
+ .stringConf
+ .createWithDefault("gpt-3.5-turbo")
+
+ val ENGINE_CHAT_EXTRA_CLASSPATH: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.chat.extra.classpath")
+ .doc("The extra classpath for the Chat engine, for configuring the location " +
+ "of the SDK, etc.")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
+ val ENGINE_CHAT_GPT_HTTP_PROXY: OptionalConfigEntry[String] =
+ buildConf("kyuubi.engine.chat.gpt.http.proxy")
+ .doc("HTTP proxy URL for API calls in the Chat GPT engine, e.g. http://127.0.0.1:1087")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
+ val ENGINE_CHAT_GPT_HTTP_CONNECT_TIMEOUT: ConfigEntry[Long] =
+ buildConf("kyuubi.engine.chat.gpt.http.connect.timeout")
+ .doc("The timeout[ms] for establishing the connection with the Chat GPT server. " +
+ "A timeout value of zero is interpreted as an infinite timeout.")
+ .version("1.8.0")
+ .timeConf
+ .checkValue(_ >= 0, "must be 0 or positive number")
+ .createWithDefault(Duration.ofSeconds(120).toMillis)
+
+ val ENGINE_CHAT_GPT_HTTP_SOCKET_TIMEOUT: ConfigEntry[Long] =
+ buildConf("kyuubi.engine.chat.gpt.http.socket.timeout")
+ .doc("The timeout[ms] for waiting for data packets after Chat GPT server " +
+ "connection is established. A timeout value of zero is interpreted as an infinite timeout.")
+ .version("1.8.0")
+ .timeConf
+ .checkValue(_ >= 0, "must be 0 or positive number")
+ .createWithDefault(Duration.ofSeconds(120).toMillis)
+
val ENGINE_JDBC_MEMORY: ConfigEntry[String] =
buildConf("kyuubi.engine.jdbc.memory")
.doc("The heap memory for the JDBC query engine")
@@ -2483,6 +3076,15 @@ object KyuubiConf {
.stringConf
.createWithDefault("bin/python")
+ val ENGINE_SPARK_REGISTER_ATTRIBUTES: ConfigEntry[Seq[String]] =
+ buildConf("kyuubi.engine.spark.register.attributes")
+ .internal
+ .doc("The extra attributes to expose when registering for Spark engine.")
+ .version("1.8.0")
+ .stringConf
+ .toSequence()
+ .createWithDefault(Seq("spark.driver.memory", "spark.executor.memory"))
+
val ENGINE_HIVE_EVENT_LOGGERS: ConfigEntry[Seq[String]] =
buildConf("kyuubi.engine.hive.event.loggers")
.doc("A comma-separated list of engine history loggers, where engine/session/operation etc" +
@@ -2493,7 +3095,7 @@ object KyuubiConf {
"
")
.version("1.7.0")
.stringConf
- .transform(_.toUpperCase(Locale.ROOT))
+ .transformToUpperCase
.toSequence()
.checkValue(
_.toSet.subsetOf(Set("JSON", "JDBC", "CUSTOM")),
@@ -2538,4 +3140,23 @@ object KyuubiConf {
.version("1.7.0")
.timeConf
.createWithDefault(Duration.ofSeconds(60).toMillis)
+
+ val OPERATION_GET_TABLES_IGNORE_TABLE_PROPERTIES: ConfigEntry[Boolean] =
+ buildConf("kyuubi.operation.getTables.ignoreTableProperties")
+ .doc("Speed up the `GetTables` operation by returning table identities only.")
+ .version("1.8.0")
+ .booleanConf
+ .createWithDefault(false)
+
+ val SERVER_LIMIT_ENGINE_CREATION: OptionalConfigEntry[Int] =
+ buildConf("kyuubi.server.limit.engine.startup")
+ .internal
+ .doc("The maximum engine startup concurrency of kyuubi server. Highly concurrent engine" +
+ " startup processes may lead to high load on the kyuubi server machine," +
+ " this configuration is used to limit the number of engine startup processes" +
+ " running at the same time to avoid it.")
+ .version("1.8.0")
+ .serverOnly
+ .intConf
+ .createOptional
}
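The recurring switch from `.transform(_.toUpperCase(Locale.ROOT))` plus `.checkValues(X.values.map(_.toString))` to `.transformToUpperCase` plus `.checkValues(X)` can be illustrated with a small stand-alone sketch. Everything below (`StringConfSketch`, `EngineTypeSketch`, `ConfSketchDemo`, the error message text) is a hypothetical stand-in written for illustration and is not Kyuubi's actual `ConfigBuilder`; it only assumes that the new helpers normalise the raw value first and then validate it against the enumeration's names.

```scala
import java.util.Locale

// Hypothetical stand-in for a typed config builder; NOT Kyuubi's real ConfigBuilder.
final case class StringConfSketch(
    key: String,
    transforms: List[String => String] = Nil,
    allowed: Option[Set[String]] = None) {

  // Append a normalisation step that runs before validation.
  def transform(f: String => String): StringConfSketch =
    copy(transforms = transforms :+ f)

  // Shorthand for the repeated `.transform(_.toUpperCase(Locale.ROOT))` call.
  def transformToUpperCase: StringConfSketch =
    transform(_.toUpperCase(Locale.ROOT))

  // Restrict the value to an explicit set of strings.
  def checkValues(values: Set[String]): StringConfSketch =
    copy(allowed = Some(values))

  // Restrict the value to the names of an Enumeration.
  def checkValues(enumeration: Enumeration): StringConfSketch = {
    val names: Set[String] = enumeration.values.toList.map(_.toString).toSet
    checkValues(names)
  }

  // Apply the transforms in order, then validate the result.
  def parse(raw: String): String = {
    val value = transforms.foldLeft(raw)((v, f) => f(v))
    allowed.foreach { set =>
      val choices = set.mkString(", ")
      require(set.contains(value), s"$key: '$raw' is not one of [$choices]")
    }
    value
  }
}

// Demo enumeration mirroring the engine types listed in the diff.
object EngineTypeSketch extends Enumeration {
  val SPARK_SQL, FLINK_SQL, CHAT, TRINO, HIVE_SQL, JDBC = Value
}

object ConfSketchDemo {
  val engineType: StringConfSketch = StringConfSketch("kyuubi.engine.type")
    .transformToUpperCase
    .checkValues(EngineTypeSketch)
}
```

With this sketch, `ConfSketchDemo.engineType.parse("chat")` normalises the input to upper case before validation, while an unknown value such as `"presto"` is rejected by `require` with an `IllegalArgumentException`.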
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiReservedKeys.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiReservedKeys.scala
index 6036af855..592425a4b 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiReservedKeys.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/config/KyuubiReservedKeys.scala
@@ -19,25 +19,33 @@ package org.apache.kyuubi.config
object KyuubiReservedKeys {
final val KYUUBI_CLIENT_IP_KEY = "kyuubi.client.ipAddress"
+ final val KYUUBI_CLIENT_VERSION_KEY = "kyuubi.client.version"
final val KYUUBI_SERVER_IP_KEY = "kyuubi.server.ipAddress"
final val KYUUBI_SESSION_USER_KEY = "kyuubi.session.user"
final val KYUUBI_SESSION_SIGN_PUBLICKEY = "kyuubi.session.sign.publickey"
final val KYUUBI_SESSION_USER_SIGN = "kyuubi.session.user.sign"
final val KYUUBI_SESSION_REAL_USER_KEY = "kyuubi.session.real.user"
- final val KYUUBI_SESSION_BATCH_RESOURCE_UPLOADED_KEY = "kyuubi.session.batch.resource.uploaded"
final val KYUUBI_SESSION_CONNECTION_URL_KEY = "kyuubi.session.connection.url"
+ // default priority is 10, higher priority will be scheduled first
+ // when the metadata store priority feature is enabled
+ final val KYUUBI_BATCH_PRIORITY = "kyuubi.batch.priority"
+ final val KYUUBI_BATCH_RESOURCE_UPLOADED_KEY = "kyuubi.batch.resource.uploaded"
final val KYUUBI_STATEMENT_ID_KEY = "kyuubi.statement.id"
final val KYUUBI_ENGINE_ID = "kyuubi.engine.id"
final val KYUUBI_ENGINE_NAME = "kyuubi.engine.name"
final val KYUUBI_ENGINE_URL = "kyuubi.engine.url"
final val KYUUBI_ENGINE_SUBMIT_TIME_KEY = "kyuubi.engine.submit.time"
final val KYUUBI_ENGINE_CREDENTIALS_KEY = "kyuubi.engine.credentials"
+ final val KYUUBI_SESSION_HANDLE_KEY = "kyuubi.session.handle"
final val KYUUBI_SESSION_ENGINE_LAUNCH_HANDLE_GUID =
"kyuubi.session.engine.launch.handle.guid"
final val KYUUBI_SESSION_ENGINE_LAUNCH_HANDLE_SECRET =
"kyuubi.session.engine.launch.handle.secret"
+ final val KYUUBI_SESSION_ENGINE_LAUNCH_SUPPORT_RESULT =
+ "kyuubi.session.engine.launch.support.result"
final val KYUUBI_OPERATION_SET_CURRENT_CATALOG = "kyuubi.operation.set.current.catalog"
final val KYUUBI_OPERATION_GET_CURRENT_CATALOG = "kyuubi.operation.get.current.catalog"
final val KYUUBI_OPERATION_SET_CURRENT_DATABASE = "kyuubi.operation.set.current.database"
final val KYUUBI_OPERATION_GET_CURRENT_DATABASE = "kyuubi.operation.get.current.database"
+ final val KYUUBI_OPERATION_HANDLE_KEY = "kyuubi.operation.handle"
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/engine/EngineType.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/engine/EngineType.scala
index 88680a8c7..3d850ba14 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/engine/EngineType.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/engine/EngineType.scala
@@ -23,5 +23,5 @@ package org.apache.kyuubi.engine
object EngineType extends Enumeration {
type EngineType = Value
- val SPARK_SQL, FLINK_SQL, TRINO, HIVE_SQL, JDBC = Value
+ val SPARK_SQL, FLINK_SQL, CHAT, TRINO, HIVE_SQL, JDBC = Value
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/AbstractOperation.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/AbstractOperation.scala
index 9cdd6a8f0..0a185b942 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/AbstractOperation.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/AbstractOperation.scala
@@ -18,13 +18,14 @@
package org.apache.kyuubi.operation
import java.util.concurrent.{Future, ScheduledExecutorService, TimeUnit}
+import java.util.concurrent.locks.ReentrantLock
import scala.collection.JavaConverters._
import org.apache.commons.lang3.StringUtils
-import org.apache.hive.service.rpc.thrift.{TGetResultSetMetadataResp, TProgressUpdateResp, TProtocolVersion, TRowSet, TStatus, TStatusCode}
+import org.apache.hive.service.rpc.thrift.{TFetchResultsResp, TGetResultSetMetadataResp, TProgressUpdateResp, TProtocolVersion, TStatus, TStatusCode}
-import org.apache.kyuubi.{KyuubiSQLException, Logging}
+import org.apache.kyuubi.{KyuubiSQLException, Logging, Utils}
import org.apache.kyuubi.config.KyuubiConf.OPERATION_IDLE_TIMEOUT
import org.apache.kyuubi.operation.FetchOrientation.FetchOrientation
import org.apache.kyuubi.operation.OperationState._
@@ -36,7 +37,7 @@ abstract class AbstractOperation(session: Session) extends Operation with Loggin
final protected val opType: String = getClass.getSimpleName
final protected val createTime = System.currentTimeMillis()
- final private val handle = OperationHandle()
+ protected val handle = OperationHandle()
final private val operationTimeout: Long = {
session.sessionManager.getConf.get(OPERATION_IDLE_TIMEOUT)
}
@@ -45,7 +46,11 @@ abstract class AbstractOperation(session: Session) extends Operation with Loggin
private var statementTimeoutCleaner: Option[ScheduledExecutorService] = None
- protected def cleanup(targetState: OperationState): Unit = state.synchronized {
+ private val lock: ReentrantLock = new ReentrantLock()
+
+ protected def withLockRequired[T](block: => T): T = Utils.withLockRequired(lock)(block)
+
+ protected def cleanup(targetState: OperationState): Unit = withLockRequired {
if (!isTerminalState(state)) {
setState(targetState)
Option(getBackgroundHandle).foreach(_.cancel(true))
@@ -110,7 +115,7 @@ abstract class AbstractOperation(session: Session) extends Operation with Loggin
info(s"Processing ${session.user}'s query[$statementId]: " +
s"${state.name} -> ${newState.name}, statement:\n$redactedStatement")
startTime = System.currentTimeMillis()
- case ERROR | FINISHED | CANCELED | TIMEOUT =>
+ case ERROR | FINISHED | CANCELED | TIMEOUT | CLOSED =>
completedTime = System.currentTimeMillis()
val timeCost = s", time taken: ${(completedTime - startTime) / 1000.0} seconds"
info(s"Processing ${session.user}'s query[$statementId]: " +
@@ -177,7 +182,12 @@ abstract class AbstractOperation(session: Session) extends Operation with Loggin
override def getResultSetMetadata: TGetResultSetMetadataResp
- override def getNextRowSet(order: FetchOrientation, rowSetSize: Int): TRowSet
+ def getNextRowSetInternal(order: FetchOrientation, rowSetSize: Int): TFetchResultsResp
+
+ override def getNextRowSet(order: FetchOrientation, rowSetSize: Int): TFetchResultsResp =
+ withLockRequired {
+ getNextRowSetInternal(order, rowSetSize)
+ }
/**
* convert SQL 'like' pattern to a Java regular expression.
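The change above replaces `state.synchronized` with a `ReentrantLock` and routes both `cleanup` and `getNextRowSet` through `withLockRequired`, so fetching and cleanup cannot interleave. A minimal sketch of that pattern follows. The body of `LockSketch.withLockRequired` is an assumption about what such a helper typically looks like (the diff does not show `Utils.withLockRequired`), and `CursorSketch` is a hypothetical class standing in for an operation.

```scala
import java.util.concurrent.locks.{Lock, ReentrantLock}

object LockSketch {
  // Assumed shape of a lock helper: acquire, run the block, always release.
  def withLockRequired[T](lock: Lock)(block: => T): T = {
    lock.lock()
    try block
    finally lock.unlock()
  }
}

// Hypothetical cursor standing in for an operation whose fetch and close
// must not run at the same time.
final class CursorSketch(rows: Vector[Int]) {
  private val lock = new ReentrantLock()
  private var position = 0
  private var closed = false

  // Return up to n rows from the current position and advance the cursor.
  def fetch(n: Int): Vector[Int] = LockSketch.withLockRequired(lock) {
    require(!closed, "cursor is closed")
    val batch = rows.slice(position, position + n)
    position += batch.size
    batch
  }

  // Mark the cursor closed; waits for any in-flight fetch to finish first.
  def close(): Unit = LockSketch.withLockRequired(lock) {
    closed = true
  }
}
```

Because the lock is reentrant, a method that already holds it can call another locked method on the same thread without deadlocking.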
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/FetchIterator.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/FetchIterator.scala
index fdada1174..ada155887 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/FetchIterator.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/FetchIterator.scala
@@ -20,7 +20,7 @@ package org.apache.kyuubi.operation
/**
* Borrowed from Apache Spark, see SPARK-33655
*/
-sealed trait FetchIterator[A] extends Iterator[A] {
+trait FetchIterator[A] extends Iterator[A] {
/**
* Begin a fetch block, forward from the current position.
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/Operation.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/Operation.scala
index 6f496c9b8..c20a16f61 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/Operation.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/Operation.scala
@@ -19,7 +19,7 @@ package org.apache.kyuubi.operation
import java.util.concurrent.Future
-import org.apache.hive.service.rpc.thrift.{TGetResultSetMetadataResp, TRowSet}
+import org.apache.hive.service.rpc.thrift.{TFetchResultsResp, TGetResultSetMetadataResp}
import org.apache.kyuubi.operation.FetchOrientation.FetchOrientation
import org.apache.kyuubi.operation.log.OperationLog
@@ -32,7 +32,7 @@ trait Operation {
def close(): Unit
def getResultSetMetadata: TGetResultSetMetadataResp
- def getNextRowSet(order: FetchOrientation, rowSetSize: Int): TRowSet
+ def getNextRowSet(order: FetchOrientation, rowSetSize: Int): TFetchResultsResp
def getSession: Session
def getHandle: OperationHandle
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/OperationManager.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/OperationManager.scala
index fe38263db..38dabcc1a 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/OperationManager.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/OperationManager.scala
@@ -17,6 +17,8 @@
package org.apache.kyuubi.operation
+import scala.collection.JavaConverters._
+
import org.apache.hive.service.rpc.thrift._
import org.apache.kyuubi.KyuubiSQLException
@@ -41,6 +43,8 @@ abstract class OperationManager(name: String) extends AbstractService(name) {
def getOperationCount: Int = handleToOperation.size()
+ def allOperations(): Iterable[Operation] = handleToOperation.values().asScala
+
override def initialize(conf: KyuubiConf): Unit = {
LogDivertAppender.initialize(skipOperationLog)
super.initialize(conf)
@@ -133,18 +137,22 @@ abstract class OperationManager(name: String) extends AbstractService(name) {
final def getOperationNextRowSet(
opHandle: OperationHandle,
order: FetchOrientation,
- maxRows: Int): TRowSet = {
+ maxRows: Int): TFetchResultsResp = {
getOperation(opHandle).getNextRowSet(order, maxRows)
}
def getOperationLogRowSet(
opHandle: OperationHandle,
order: FetchOrientation,
- maxRows: Int): TRowSet = {
+ maxRows: Int): TFetchResultsResp = {
val operationLog = getOperation(opHandle).getOperationLog
- operationLog.map(_.read(maxRows)).getOrElse {
+ val rowSet = operationLog.map(_.read(order, maxRows)).getOrElse {
throw KyuubiSQLException(s"$opHandle failed to generate operation log")
}
+ val resp = new TFetchResultsResp(new TStatus(TStatusCode.SUCCESS_STATUS))
+ resp.setResults(rowSet)
+ resp.setHasMoreRows(false)
+ resp
}
final def removeExpiredOperations(handles: Seq[OperationHandle]): Seq[Operation] = synchronized {
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/PlanOnlyMode.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/PlanOnlyMode.scala
index 3e170f05f..0407dab62 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/PlanOnlyMode.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/PlanOnlyMode.scala
@@ -41,6 +41,8 @@ case object PhysicalMode extends PlanOnlyMode { val name = "physical" }
case object ExecutionMode extends PlanOnlyMode { val name = "execution" }
+case object LineageMode extends PlanOnlyMode { val name = "lineage" }
+
case object NoneMode extends PlanOnlyMode { val name = "none" }
case object UnknownMode extends PlanOnlyMode {
@@ -64,6 +66,7 @@ object PlanOnlyMode {
case OptimizeWithStatsMode.name => OptimizeWithStatsMode
case PhysicalMode.name => PhysicalMode
case ExecutionMode.name => ExecutionMode
+ case LineageMode.name => LineageMode
case NoneMode.name => NoneMode
case other => UnknownMode.mode(other)
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/Log4j12DivertAppender.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/Log4j12DivertAppender.scala
index 1191e94ae..6ea853485 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/Log4j12DivertAppender.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/Log4j12DivertAppender.scala
@@ -30,7 +30,7 @@ class Log4j12DivertAppender extends WriterAppender {
final private val lo = Logger.getRootLogger
.getAllAppenders.asScala
- .find(_.isInstanceOf[ConsoleAppender])
+ .find(ap => ap.isInstanceOf[ConsoleAppender] || ap.isInstanceOf[RollingFileAppender])
.map(_.asInstanceOf[Appender].getLayout)
.getOrElse(new PatternLayout("%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n"))
@@ -39,7 +39,7 @@ class Log4j12DivertAppender extends WriterAppender {
setLayout(lo)
addFilter { _: LoggingEvent =>
- if (OperationLog.getCurrentOperationLog == null) Filter.DENY else Filter.NEUTRAL
+ if (OperationLog.getCurrentOperationLog.isDefined) Filter.NEUTRAL else Filter.DENY
}
/**
@@ -51,8 +51,7 @@ class Log4j12DivertAppender extends WriterAppender {
// That should've gone into our writer. Notify the LogContext.
val logOutput = writer.toString
writer.reset()
- val log = OperationLog.getCurrentOperationLog
- if (log != null) log.write(logOutput)
+ OperationLog.getCurrentOperationLog.foreach(_.write(logOutput))
}
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/Log4j2DivertAppender.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/Log4j2DivertAppender.scala
index 68753cf98..d8e37a019 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/Log4j2DivertAppender.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/Log4j2DivertAppender.scala
@@ -18,15 +18,18 @@
package org.apache.kyuubi.operation.log
import java.io.CharArrayWriter
+import java.util.concurrent.locks.ReadWriteLock
import scala.collection.JavaConverters._
import org.apache.logging.log4j.LogManager
import org.apache.logging.log4j.core.{Filter, LogEvent, StringLayout}
-import org.apache.logging.log4j.core.appender.{AbstractWriterAppender, ConsoleAppender, WriterManager}
+import org.apache.logging.log4j.core.appender.{AbstractWriterAppender, ConsoleAppender, RollingFileAppender, WriterManager}
import org.apache.logging.log4j.core.filter.AbstractFilter
import org.apache.logging.log4j.core.layout.PatternLayout
+import org.apache.kyuubi.util.reflect.ReflectUtils._
+
class Log4j2DivertAppender(
name: String,
layout: StringLayout,
@@ -52,22 +55,16 @@ class Log4j2DivertAppender(
addFilter(new AbstractFilter() {
override def filter(event: LogEvent): Filter.Result = {
- if (OperationLog.getCurrentOperationLog == null) {
- Filter.Result.DENY
- } else {
+ if (OperationLog.getCurrentOperationLog.isDefined) {
Filter.Result.NEUTRAL
+ } else {
+ Filter.Result.DENY
}
}
})
- def initLayout(): StringLayout = {
- LogManager.getRootLogger.asInstanceOf[org.apache.logging.log4j.core.Logger]
- .getAppenders.values().asScala
- .find(ap => ap.isInstanceOf[ConsoleAppender] && ap.getLayout.isInstanceOf[StringLayout])
- .map(_.getLayout.asInstanceOf[StringLayout])
- .getOrElse(PatternLayout.newBuilder().withPattern(
- "%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n").build())
- }
+ private val writeLock =
+ getField[ReadWriteLock]((classOf[AbstractWriterAppender[_]], this), "readWriteLock").writeLock
/**
* Overrides AbstractWriterAppender.append(), which does the real logging. No need
@@ -75,11 +72,15 @@ class Log4j2DivertAppender(
*/
override def append(event: LogEvent): Unit = {
super.append(event)
- // That should've gone into our writer. Notify the LogContext.
- val logOutput = writer.toString
- writer.reset()
- val log = OperationLog.getCurrentOperationLog
- if (log != null) log.write(logOutput)
+ writeLock.lock()
+ try {
+ // That should've gone into our writer. Notify the LogContext.
+ val logOutput = writer.toString
+ writer.reset()
+ OperationLog.getCurrentOperationLog.foreach(_.write(logOutput))
+ } finally {
+ writeLock.unlock()
+ }
}
}
@@ -87,15 +88,17 @@ object Log4j2DivertAppender {
def initLayout(): StringLayout = {
LogManager.getRootLogger.asInstanceOf[org.apache.logging.log4j.core.Logger]
.getAppenders.values().asScala
- .find(ap => ap.isInstanceOf[ConsoleAppender] && ap.getLayout.isInstanceOf[StringLayout])
+ .find(ap =>
+ (ap.isInstanceOf[ConsoleAppender] || ap.isInstanceOf[RollingFileAppender]) &&
+ ap.getLayout.isInstanceOf[StringLayout])
.map(_.getLayout.asInstanceOf[StringLayout])
.getOrElse(PatternLayout.newBuilder().withPattern(
- "%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n").build())
+ "%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n%ex").build())
}
def initialize(): Unit = {
val ap = new Log4j2DivertAppender()
- org.apache.logging.log4j.LogManager.getRootLogger()
+ org.apache.logging.log4j.LogManager.getRootLogger
.asInstanceOf[org.apache.logging.log4j.core.Logger].addAppender(ap)
ap.start()
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/LogDivertAppender.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/LogDivertAppender.scala
index 7d2989303..58bca992c 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/LogDivertAppender.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/LogDivertAppender.scala
@@ -17,7 +17,7 @@
package org.apache.kyuubi.operation.log
-import org.slf4j.impl.StaticLoggerBinder
+import org.slf4j.LoggerFactory
import org.apache.kyuubi.Logging
@@ -30,9 +30,8 @@ object LogDivertAppender extends Logging {
Log4j12DivertAppender.initialize()
} else {
warn(s"Unsupported SLF4J binding" +
- s" ${StaticLoggerBinder.getSingleton.getLoggerFactoryClassStr}")
+ s" ${LoggerFactory.getILoggerFactory.getClass.getName}")
}
}
-
}
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/OperationLog.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/OperationLog.scala
index 84c4ed55c..2e133df28 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/OperationLog.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/operation/log/OperationLog.scala
@@ -20,7 +20,7 @@ package org.apache.kyuubi.operation.log
import java.io.{BufferedReader, IOException}
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
-import java.nio.file.{Files, Path, Paths}
+import java.nio.file.{Files, NoSuchFileException, Path, Paths}
import java.util.{ArrayList => JArrayList, List => JList}
import scala.collection.JavaConverters._
@@ -29,6 +29,7 @@ import scala.collection.mutable.ListBuffer
import org.apache.hive.service.rpc.thrift.{TColumn, TRow, TRowSet, TStringColumn}
import org.apache.kyuubi.{KyuubiSQLException, Logging}
+import org.apache.kyuubi.operation.FetchOrientation.{FETCH_FIRST, FETCH_NEXT, FetchOrientation}
import org.apache.kyuubi.operation.OperationHandle
import org.apache.kyuubi.session.Session
import org.apache.kyuubi.util.ThriftUtils
@@ -44,7 +45,7 @@ object OperationLog extends Logging {
OPERATION_LOG.set(operationLog)
}
- def getCurrentOperationLog: OperationLog = OPERATION_LOG.get()
+ def getCurrentOperationLog: Option[OperationLog] = Option(OPERATION_LOG.get)
def removeCurrentOperationLog(): Unit = OPERATION_LOG.remove()
@@ -86,7 +87,7 @@ object OperationLog extends Logging {
class OperationLog(path: Path) {
private lazy val writer = Files.newBufferedWriter(path, StandardCharsets.UTF_8)
- private lazy val reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)
+ private var reader: BufferedReader = _
@volatile private var initialized: Boolean = false
@@ -95,6 +96,15 @@ class OperationLog(path: Path) {
private var lastSeekReadPos = 0
private var seekableReader: SeekableBufferedReader = _
+ def getReader(): BufferedReader = {
+ if (reader == null) {
+ try {
+ reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)
+ } catch handleFileNotFound
+ }
+ reader
+ }
+
def addExtraLog(path: Path): Unit = synchronized {
try {
extraReaders += Files.newBufferedReader(path, StandardCharsets.UTF_8)
@@ -130,19 +140,23 @@ class OperationLog(path: Path) {
val logs = new JArrayList[String]
var i = 0
try {
- var line: String = reader.readLine()
- while ((i < lastRows || maxRows <= 0) && line != null) {
- logs.add(line)
+ var line: String = null
+ do {
line = reader.readLine()
- i += 1
- }
- (logs, i)
- } catch {
- case e: IOException =>
- val absPath = path.toAbsolutePath
- val opHandle = absPath.getFileName
- throw KyuubiSQLException(s"Operation[$opHandle] log file $absPath is not found", e)
- }
+ if (line != null) {
+ logs.add(line)
+ i += 1
+ }
+ } while ((i < lastRows || maxRows <= 0) && line != null)
+ } catch handleFileNotFound
+ (logs, i)
+ }
+
+ private def handleFileNotFound: PartialFunction[Throwable, Unit] = {
+ case e: IOException =>
+ val absPath = path.toAbsolutePath
+ val opHandle = absPath.getFileName
+ throw KyuubiSQLException(s"Operation[$opHandle] log file $absPath is not found", e)
}
private def toRowSet(logs: JList[String]): TRowSet = {
@@ -152,14 +166,25 @@ class OperationLog(path: Path) {
tRow
}
+ def read(maxRows: Int): TRowSet = synchronized {
+ read(FETCH_NEXT, maxRows)
+ }
+
/**
* Read to log file line by line
*
* @param maxRows maximum result number can reach
+ * @param order the fetch orientation of the result, either FETCH_NEXT or FETCH_FIRST
*/
- def read(maxRows: Int): TRowSet = synchronized {
+ def read(order: FetchOrientation = FETCH_NEXT, maxRows: Int): TRowSet = synchronized {
if (!initialized) return ThriftUtils.newEmptyRowSet
- val (logs, lines) = readLogs(reader, maxRows, maxRows)
+ if (order != FETCH_NEXT && order != FETCH_FIRST) {
+ throw KyuubiSQLException(s"$order in operation log is not supported")
+ }
+ if (order == FETCH_FIRST) {
+ resetReader()
+ }
+ val (logs, lines) = readLogs(getReader(), maxRows, maxRows)
var lastRows = maxRows - lines
for (extraReader <- extraReaders if lastRows > 0 || maxRows <= 0) {
val (extraLogs, extraRows) = readLogs(extraReader, lastRows, maxRows)
@@ -170,6 +195,19 @@ class OperationLog(path: Path) {
toRowSet(logs)
}
+ private def resetReader(): Unit = {
+ trySafely {
+ if (reader != null) {
+ reader.close()
+ }
+ }
+ reader = null
+ closeExtraReaders()
+ extraReaders.clear()
+ extraPaths.foreach(path =>
+ extraReaders += Files.newBufferedReader(path, StandardCharsets.UTF_8))
+ }
+
def read(from: Int, size: Int): TRowSet = synchronized {
if (!initialized) return ThriftUtils.newEmptyRowSet
var pos = from
@@ -195,10 +233,14 @@ class OperationLog(path: Path) {
}
def close(): Unit = synchronized {
+ if (!initialized) return
+
closeExtraReaders()
trySafely {
- reader.close()
+ if (reader != null) {
+ reader.close()
+ }
}
trySafely {
writer.close()
@@ -212,7 +254,7 @@ class OperationLog(path: Path) {
}
trySafely {
- Files.delete(path)
+ Files.deleteIfExists(path)
}
}
@@ -220,6 +262,7 @@ class OperationLog(path: Path) {
try {
f
} catch {
+ case _: NoSuchFileException =>
case e: IOException =>
// Printing log here may cause a deadlock. The lock order of OperationLog.write
// is RootLogger -> LogDivertAppender -> OperationLog. If printing log here, the
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/AbstractBackendService.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/AbstractBackendService.scala
index e7c2d8365..443b35354 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/AbstractBackendService.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/AbstractBackendService.scala
@@ -21,7 +21,7 @@ import java.util.concurrent.{ExecutionException, TimeoutException, TimeUnit}
import scala.concurrent.CancellationException
-import org.apache.hive.service.rpc.thrift.{TGetInfoType, TGetInfoValue, TGetResultSetMetadataResp, TProtocolVersion, TRowSet}
+import org.apache.hive.service.rpc.thrift._
import org.apache.kyuubi.config.KyuubiConf
import org.apache.kyuubi.operation.{OperationHandle, OperationStatus}
@@ -35,6 +35,7 @@ abstract class AbstractBackendService(name: String)
extends CompositeService(name) with BackendService {
private lazy val timeout = conf.get(KyuubiConf.OPERATION_STATUS_POLLING_TIMEOUT)
+ private lazy val maxRowsLimit = conf.get(KyuubiConf.SERVER_LIMIT_CLIENT_FETCH_MAX_ROWS)
override def openSession(
protocol: TProtocolVersion,
@@ -156,11 +157,14 @@ abstract class AbstractBackendService(name: String)
queryId
}
- override def getOperationStatus(operationHandle: OperationHandle): OperationStatus = {
+ override def getOperationStatus(
+ operationHandle: OperationHandle,
+ maxWait: Option[Long]): OperationStatus = {
val operation = sessionManager.operationManager.getOperation(operationHandle)
if (operation.shouldRunAsync) {
try {
- operation.getBackgroundHandle.get(timeout, TimeUnit.MILLISECONDS)
+ val waitTime = maxWait.getOrElse(timeout)
+ operation.getBackgroundHandle.get(waitTime, TimeUnit.MILLISECONDS)
} catch {
case e: TimeoutException =>
debug(s"$operationHandle: Long polling timed out, ${e.getMessage}")
@@ -197,7 +201,13 @@ abstract class AbstractBackendService(name: String)
operationHandle: OperationHandle,
orientation: FetchOrientation,
maxRows: Int,
- fetchLog: Boolean): TRowSet = {
+ fetchLog: Boolean): TFetchResultsResp = {
+ maxRowsLimit.foreach(limit =>
+ if (maxRows > limit) {
+ throw new IllegalArgumentException(s"Max rows for fetching results " +
+ s"operation should not exceed the limit: $limit")
+ })
+
sessionManager.operationManager
.getOperation(operationHandle)
.getSession
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/BackendService.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/BackendService.scala
index e18411566..85df9024c 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/BackendService.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/BackendService.scala
@@ -91,7 +91,9 @@ trait BackendService {
foreignTable: String): OperationHandle
def getQueryId(operationHandle: OperationHandle): String
- def getOperationStatus(operationHandle: OperationHandle): OperationStatus
+ def getOperationStatus(
+ operationHandle: OperationHandle,
+ maxWait: Option[Long] = None): OperationStatus
def cancelOperation(operationHandle: OperationHandle): Unit
def closeOperation(operationHandle: OperationHandle): Unit
def getResultSetMetadata(operationHandle: OperationHandle): TGetResultSetMetadataResp
@@ -99,7 +101,7 @@ trait BackendService {
operationHandle: OperationHandle,
orientation: FetchOrientation,
maxRows: Int,
- fetchLog: Boolean): TRowSet
+ fetchLog: Boolean): TFetchResultsResp
def sessionManager: SessionManager
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/ServiceUtils.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/ServiceUtils.scala
index d481aea77..955144af8 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/ServiceUtils.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/ServiceUtils.scala
@@ -17,6 +17,10 @@
package org.apache.kyuubi.service
+import java.io.{Closeable, IOException}
+
+import org.slf4j.Logger
+
object ServiceUtils {
/**
@@ -49,4 +53,24 @@ object ServiceUtils {
userName.substring(0, indexOfDomainMatch)
}
}
+
+ /**
+ * Close the Closeable objects and ignore any [[IOException]] or
+ * null pointers. Must only be used for cleanup in exception handlers.
+ *
+ * @param log the log to record problems to at debug level. Can be null.
+ * @param closeables the objects to close
+ */
+ def cleanup(log: Logger, closeables: Closeable*): Unit = {
+ closeables.filter(_ != null).foreach { c =>
+ try {
+ c.close()
+ } catch {
+ case e: IOException =>
+ if (log != null && log.isDebugEnabled) {
+ log.debug(s"Exception in closing $c", e)
+ }
+ }
+ }
+ }
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/TBinaryFrontendService.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/TBinaryFrontendService.scala
index 74cf4e2e6..2f4419374 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/TBinaryFrontendService.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/TBinaryFrontendService.scala
@@ -134,7 +134,7 @@ abstract class TBinaryFrontendService(name: String)
keyStorePassword: String,
keyStoreType: Option[String],
keyStoreAlgorithm: Option[String],
- disallowedSslProtocols: Seq[String],
+ disallowedSslProtocols: Set[String],
includeCipherSuites: Seq[String]): TServerSocket = {
val params =
if (includeCipherSuites.nonEmpty) {
@@ -163,7 +163,7 @@ abstract class TBinaryFrontendService(name: String)
}
}
sslServerSocket.setEnabledProtocols(enabledProtocols)
- info(s"SSL Server Socket enabled protocols: $enabledProtocols")
+ info(s"SSL Server Socket enabled protocols: ${enabledProtocols.mkString(",")}")
case _ =>
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/TFrontendService.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/TFrontendService.scala
index c3354cc25..1492a6af5 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/TFrontendService.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/TFrontendService.scala
@@ -31,8 +31,7 @@ import org.apache.thrift.transport.TTransport
import org.apache.kyuubi.{KyuubiSQLException, Logging, Utils}
import org.apache.kyuubi.Utils.stringifyException
-import org.apache.kyuubi.config.KyuubiConf.AUTHENTICATION_LONG_USERNAME
-import org.apache.kyuubi.config.KyuubiConf.FRONTEND_CONNECTION_URL_USE_HOSTNAME
+import org.apache.kyuubi.config.KyuubiConf.{AUTHENTICATION_LONG_USERNAME, FRONTEND_ADVERTISED_HOST, FRONTEND_CONNECTION_URL_USE_HOSTNAME, SESSION_CLOSE_ON_DISCONNECT}
import org.apache.kyuubi.config.KyuubiReservedKeys._
import org.apache.kyuubi.operation.{FetchOrientation, OperationHandle}
import org.apache.kyuubi.service.authentication.KyuubiAuthenticationFactory
@@ -113,12 +112,12 @@ abstract class TFrontendService(name: String)
override def connectionUrl: String = {
checkInitialized()
- val host = serverHost match {
- case Some(h) => h // respect user's setting ahead
- case None if conf.get(FRONTEND_CONNECTION_URL_USE_HOSTNAME) =>
+ val host = (conf.get(FRONTEND_ADVERTISED_HOST), serverHost) match {
+ case (Some(advertisedHost), _) => advertisedHost
+ case (None, Some(h)) => h
+ case (None, None) if conf.get(FRONTEND_CONNECTION_URL_USE_HOSTNAME) =>
serverAddr.getCanonicalHostName
- case None =>
- serverAddr.getHostAddress
+ case (None, None) => serverAddr.getHostAddress
}
host + ":" + actualPort
@@ -526,23 +525,20 @@ abstract class TFrontendService(name: String)
override def FetchResults(req: TFetchResultsReq): TFetchResultsResp = {
debug(req.toString)
- val resp = new TFetchResultsResp
try {
val operationHandle = OperationHandle(req.getOperationHandle)
val orientation = FetchOrientation.getFetchOrientation(req.getOrientation)
// 1 means fetching log
val fetchLog = req.getFetchType == 1
val maxRows = req.getMaxRows.toInt
- val rowSet = be.fetchResults(operationHandle, orientation, maxRows, fetchLog)
- resp.setResults(rowSet)
- resp.setHasMoreRows(false)
- resp.setStatus(OK_STATUS)
+ be.fetchResults(operationHandle, orientation, maxRows, fetchLog)
} catch {
case e: Exception =>
error("Error fetching results: ", e)
+ val resp = new TFetchResultsResp
resp.setStatus(KyuubiSQLException.toTStatus(e))
+ resp
}
- resp
}
protected def notSupportTokenErrorStatus = {
@@ -614,7 +610,14 @@ abstract class TFrontendService(name: String)
if (handle != null) {
info(s"Session [$handle] disconnected without closing properly, close it now")
try {
- be.closeSession(handle)
+ val needToClose = be.sessionManager.getSession(handle).conf
+ .getOrElse(SESSION_CLOSE_ON_DISCONNECT.key, "true").toBoolean
+ if (needToClose) {
+ be.closeSession(handle)
+ } else {
+ warn(s"Session not actually closed because configuration " +
+ s"${SESSION_CLOSE_ON_DISCONNECT.key} is set to false")
+ }
} catch {
case e: KyuubiSQLException =>
error("Failed closing session", e)
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/EngineSecuritySecretProvider.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/EngineSecuritySecretProvider.scala
index 5bd9e4092..3216a43be 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/EngineSecuritySecretProvider.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/EngineSecuritySecretProvider.scala
@@ -18,7 +18,8 @@
package org.apache.kyuubi.service.authentication
import org.apache.kyuubi.config.KyuubiConf
-import org.apache.kyuubi.config.KyuubiConf.ENGINE_SECURITY_SECRET_PROVIDER
+import org.apache.kyuubi.config.KyuubiConf._
+import org.apache.kyuubi.util.reflect.DynConstructors
trait EngineSecuritySecretProvider {
@@ -33,11 +34,27 @@ trait EngineSecuritySecretProvider {
def getSecret(): String
}
+class SimpleEngineSecuritySecretProviderImpl extends EngineSecuritySecretProvider {
+
+ private var _conf: KyuubiConf = _
+
+ override def initialize(conf: KyuubiConf): Unit = _conf = conf
+
+ override def getSecret(): String = {
+ _conf.get(SIMPLE_SECURITY_SECRET_PROVIDER_PROVIDER_SECRET).getOrElse {
+ throw new IllegalArgumentException(
+ s"${SIMPLE_SECURITY_SECRET_PROVIDER_PROVIDER_SECRET.key} must be configured " +
+ s"when ${ENGINE_SECURITY_SECRET_PROVIDER.key} is `simple`.")
+ }
+ }
+}
+
object EngineSecuritySecretProvider {
def create(conf: KyuubiConf): EngineSecuritySecretProvider = {
- val providerClass = Class.forName(conf.get(ENGINE_SECURITY_SECRET_PROVIDER))
- val provider = providerClass.getConstructor().newInstance()
- .asInstanceOf[EngineSecuritySecretProvider]
+ val provider = DynConstructors.builder()
+ .impl(conf.get(ENGINE_SECURITY_SECRET_PROVIDER))
+ .buildChecked[EngineSecuritySecretProvider]()
+ .newInstance(conf)
provider.initialize(conf)
provider
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/InternalSecurityAccessor.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/InternalSecurityAccessor.scala
index 62680e6a6..afc1dde1f 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/InternalSecurityAccessor.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/InternalSecurityAccessor.scala
@@ -20,6 +20,8 @@ package org.apache.kyuubi.service.authentication
import javax.crypto.Cipher
import javax.crypto.spec.{IvParameterSpec, SecretKeySpec}
+import org.apache.hadoop.classification.VisibleForTesting
+
import org.apache.kyuubi.{KyuubiSQLException, Logging}
import org.apache.kyuubi.config.KyuubiConf
import org.apache.kyuubi.config.KyuubiConf._
@@ -121,4 +123,9 @@ object InternalSecurityAccessor extends Logging {
def get(): InternalSecurityAccessor = {
_engineSecurityAccessor
}
+
+ @VisibleForTesting
+ def reset(): Unit = {
+ _engineSecurityAccessor = null
+ }
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/KyuubiAuthenticationFactory.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/KyuubiAuthenticationFactory.scala
index 5f429fa4e..1b62f6030 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/KyuubiAuthenticationFactory.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/KyuubiAuthenticationFactory.scala
@@ -39,7 +39,7 @@ class KyuubiAuthenticationFactory(conf: KyuubiConf, isServer: Boolean = true) ex
private val authTypes = conf.get(AUTHENTICATION_METHOD).map(AuthTypes.withName)
private val none = authTypes.contains(NONE)
- private val noSasl = authTypes == Seq(NOSASL)
+ private val noSasl = authTypes == Set(NOSASL)
private val kerberosEnabled = authTypes.contains(KERBEROS)
private val plainAuthTypeOpt = authTypes.filterNot(_.equals(KERBEROS))
.filterNot(_.equals(NOSASL)).headOption
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/LdapAuthenticationProviderImpl.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/LdapAuthenticationProviderImpl.scala
index b5e08def5..d885da55b 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/LdapAuthenticationProviderImpl.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/LdapAuthenticationProviderImpl.scala
@@ -17,17 +17,26 @@
package org.apache.kyuubi.service.authentication
-import javax.naming.{Context, NamingException}
-import javax.naming.directory.InitialDirContext
+import javax.naming.NamingException
import javax.security.sasl.AuthenticationException
import org.apache.commons.lang3.StringUtils
+import org.apache.kyuubi.Logging
import org.apache.kyuubi.config.KyuubiConf
-import org.apache.kyuubi.config.KyuubiConf._
import org.apache.kyuubi.service.ServiceUtils
+import org.apache.kyuubi.service.authentication.LdapAuthenticationProviderImpl.FILTER_FACTORIES
+import org.apache.kyuubi.service.authentication.ldap._
+import org.apache.kyuubi.service.authentication.ldap.LdapUtils.getUserName
-class LdapAuthenticationProviderImpl(conf: KyuubiConf) extends PasswdAuthenticationProvider {
+class LdapAuthenticationProviderImpl(
+ conf: KyuubiConf,
+ searchFactory: DirSearchFactory = new LdapSearchFactory)
+ extends PasswdAuthenticationProvider with Logging {
+
+ private val filterOpt: Option[Filter] = FILTER_FACTORIES
+ .map { f => f.getInstance(conf) }
+ .collectFirst { case Some(f: Filter) => f }
/**
* The authenticate method is called by the Kyuubi Server authentication layer
@@ -41,47 +50,72 @@ class LdapAuthenticationProviderImpl(conf: KyuubiConf) extends PasswdAuthenticat
* @throws AuthenticationException When a user is found to be invalid by the implementation
*/
override def authenticate(user: String, password: String): Unit = {
+
+ val (usedBind, bindUser, bindPassword) = (
+ conf.get(KyuubiConf.AUTHENTICATION_LDAP_BIND_USER),
+ conf.get(KyuubiConf.AUTHENTICATION_LDAP_BIND_PASSWORD)) match {
+ case (Some(_bindUser), Some(_bindPw)) => (true, _bindUser, _bindPw)
+ case _ =>
+ // If no bind user or bind password was specified,
+ // we assume the user we are authenticating has the ability to search
+ // the LDAP tree, so we use it as the "binding" account.
+ // This is the way it worked before bind users were allowed in the LDAP authenticator,
+ // so we keep existing systems working.
+ (false, user, password)
+ }
+
+ var search: DirSearch = null
+ try {
+ search = createDirSearch(bindUser, bindPassword)
+ applyFilter(search, user)
+ if (usedBind) {
+ // If we used the bind user, then we need to authenticate again,
+ // this time using the full user name we got during the bind process.
+ val username = getUserName(user)
+ createDirSearch(search.findUserDn(username), password)
+ }
+ } catch {
+ case e: NamingException =>
+ throw new AuthenticationException(
+ s"Unable to find the user in the LDAP tree. ${e.getMessage}")
+ } finally {
+ ServiceUtils.cleanup(logger, search)
+ }
+ }
+
+ @throws[AuthenticationException]
+ private def createDirSearch(user: String, password: String): DirSearch = {
if (StringUtils.isBlank(user)) {
throw new AuthenticationException(s"Error validating LDAP user, user is null" +
s" or contains blank space")
}
- if (StringUtils.isBlank(password)) {
+ if (StringUtils.isBlank(password) || password.getBytes()(0) == 0) {
throw new AuthenticationException(s"Error validating LDAP user, password is null" +
s" or contains blank space")
}
- val env = new java.util.Hashtable[String, Any]()
- env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory")
- env.put(Context.SECURITY_AUTHENTICATION, "simple")
-
- conf.get(AUTHENTICATION_LDAP_URL).foreach(env.put(Context.PROVIDER_URL, _))
-
- val domain = conf.get(AUTHENTICATION_LDAP_DOMAIN)
- val u =
- if (!hasDomain(user) && domain.nonEmpty) {
- user + "@" + domain.get
- } else {
- user
+ val principals = LdapUtils.createCandidatePrincipals(conf, user)
+ val iterator = principals.iterator
+ while (iterator.hasNext) {
+ val principal = iterator.next
+ try {
+ return searchFactory.getInstance(conf, principal, password)
+ } catch {
+ case ex: AuthenticationException => if (iterator.isEmpty) throw ex
}
-
- val guidKey = conf.get(AUTHENTICATION_LDAP_GUIDKEY)
- val bindDn = conf.get(AUTHENTICATION_LDAP_BASEDN) match {
- case Some(dn) => guidKey + "=" + u + "," + dn
- case _ => u
}
+ throw new AuthenticationException(s"No candidate principals for $user were found.")
+ }
- env.put(Context.SECURITY_PRINCIPAL, bindDn)
- env.put(Context.SECURITY_CREDENTIALS, password)
-
- try {
- val ctx = new InitialDirContext(env)
- ctx.close()
- } catch {
- case e: NamingException =>
- throw new AuthenticationException(s"Error validating LDAP user: $bindDn", e)
- }
+ @throws[AuthenticationException]
+ private def applyFilter(client: DirSearch, user: String): Unit = filterOpt.foreach { filter =>
+ filter.apply(client, getUserName(user))
}
+}
- private def hasDomain(userName: String): Boolean = ServiceUtils.indexOfDomainMatch(userName) > 0
+object LdapAuthenticationProviderImpl {
+ val FILTER_FACTORIES: Array[FilterFactory] = Array[FilterFactory](
+ CustomQueryFilterFactory,
+ new ChainFilterFactory(UserSearchFilterFactory, UserFilterFactory, GroupFilterFactory))
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/PlainSASLServer.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/PlainSASLServer.scala
index 8e84c9f81..737a6d8cd 100644
--- a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/PlainSASLServer.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/PlainSASLServer.scala
@@ -23,7 +23,7 @@ import javax.security.auth.callback.{Callback, CallbackHandler, NameCallback, Pa
import javax.security.sasl.{AuthorizeCallback, SaslException, SaslServer, SaslServerFactory}
import org.apache.kyuubi.KYUUBI_VERSION
-import org.apache.kyuubi.engine.SemanticVersion
+import org.apache.kyuubi.util.SemanticVersion
class PlainSASLServer(
handler: CallbackHandler,
@@ -126,10 +126,7 @@ object PlainSASLServer {
}
}
- final private val version: Double = {
- val runtimeVersion = SemanticVersion(KYUUBI_VERSION)
- runtimeVersion.majorVersion + runtimeVersion.minorVersion.toDouble / 10
- }
+ final private val version = SemanticVersion(KYUUBI_VERSION).toDouble
class SaslPlainProvider
extends Provider("KyuubiSaslPlain", version, "Kyuubi Plain SASL provider") {
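
Review note (not part of the patch): the removed lines computed the provider version as `major + minor / 10`. The sketch below reproduces only that removed arithmetic with a hypothetical `MiniVersion` class; whether `SemanticVersion.toDouble` uses the same formula is not visible in this hunk, so treat this as an illustration of the old behavior.

```scala
// Hypothetical stand-in for illustration; not Kyuubi's SemanticVersion.
final case class MiniVersion(major: Int, minor: Int) {
  // Same arithmetic as the removed block: major + minor / 10.
  def asDouble: Double = major + minor.toDouble / 10
}

object MiniVersion {
  // Parses "major.minor[.rest]", ignoring any non-digit suffix on the minor part.
  def parse(version: String): MiniVersion = {
    val parts = version.split("\\.")
    MiniVersion(parts(0).toInt, parts(1).takeWhile(_.isDigit).toInt)
  }
}
```

One property of this formula worth keeping in mind: a two-digit minor version carries into the major part, so `1.10` yields `2.0`.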
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/ChainFilterFactory.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/ChainFilterFactory.scala
new file mode 100644
index 000000000..a5badb15d
--- /dev/null
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/ChainFilterFactory.scala
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.service.authentication.ldap
+
+import javax.security.sasl.AuthenticationException
+
+import org.apache.kyuubi.config.KyuubiConf
+
+/**
+ * A factory that produces a [[Filter]] that is implemented as a chain of other filters.
+ * The chain of filters is created as a result of [[ChainFilterFactory#getInstance]] method call.
+ * The resulting object filters out all users that don't pass all chained filters.
+ * The filters will be applied in the order they are mentioned in the factory constructor.
+ */
+
+class ChainFilterFactory(chainedFactories: FilterFactory*) extends FilterFactory {
+ override def getInstance(conf: KyuubiConf): Option[Filter] = {
+ val maybeFilters = chainedFactories.map(_.getInstance(conf))
+ val filters = maybeFilters.flatten
+ if (filters.isEmpty) None else Some(new ChainFilter(filters))
+ }
+}
+
+class ChainFilter(chainedFilters: Seq[Filter]) extends Filter {
+ @throws[AuthenticationException]
+ override def apply(client: DirSearch, user: String): Unit = {
+ chainedFilters.foreach(_.apply(client, user))
+ }
+}
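
Review note (not part of the patch): the chaining semantics described in the Scaladoc above can be exercised without an LDAP server. The sketch below uses simplified stand-in types (`SimpleFilter`, `SimpleFactory`, `SimpleChainFactory`, `AllowListFactory`, and a plain `Map` in place of `KyuubiConf`); all of these names are hypothetical and only mirror the shape of `FilterFactory` and `ChainFilter`.

```scala
// Stand-ins for illustration only; the real traits take a KyuubiConf and a DirSearch.
trait SimpleFilter { def apply(user: String): Unit }
trait SimpleFactory { def getInstance(conf: Map[String, String]): Option[SimpleFilter] }

// Mirrors ChainFilterFactory: drop factories that return None, chain the rest in order.
class SimpleChainFactory(factories: SimpleFactory*) extends SimpleFactory {
  override def getInstance(conf: Map[String, String]): Option[SimpleFilter] = {
    val filters = factories.flatMap(_.getInstance(conf))
    if (filters.isEmpty) None
    else Some(new SimpleFilter {
      // The first filter that throws aborts the chain, so a user must pass all of them.
      override def apply(user: String): Unit = filters.foreach(_.apply(user))
    })
  }
}

// A factory that is only active when its (hypothetical) key is configured.
class AllowListFactory(key: String) extends SimpleFactory {
  override def getInstance(conf: Map[String, String]): Option[SimpleFilter] =
    conf.get(key).map { allowed =>
      new SimpleFilter {
        override def apply(user: String): Unit =
          if (!allowed.split(",").contains(user)) {
            throw new IllegalStateException(s"$user rejected by $key")
          }
      }
    }
}

object ChainDemo {
  val chain = new SimpleChainFactory(new AllowListFactory("users"), new AllowListFactory("admins"))

  // No key configured: no filter is produced at all.
  val none: Option[SimpleFilter] = chain.getInstance(Map.empty)

  // Only "users" configured: the chain contains a single filter.
  val onlyUsers: SimpleFilter = chain.getInstance(Map("users" -> "alice")).get
  val alicePasses: Boolean = scala.util.Try(onlyUsers.apply("alice")).isSuccess
  val bobPasses: Boolean = scala.util.Try(onlyUsers.apply("bob")).isSuccess
}
```

Returning `None` when no chained factory is configured lets the caller skip filtering entirely, which matches how `filterOpt.foreach` is used in `applyFilter`.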
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/CustomQueryFilterFactory.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/CustomQueryFilterFactory.scala
new file mode 100644
index 000000000..d10e6523b
--- /dev/null
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/CustomQueryFilterFactory.scala
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.service.authentication.ldap
+
+import javax.naming.NamingException
+import javax.security.sasl.AuthenticationException
+
+import org.apache.kyuubi.Logging
+import org.apache.kyuubi.config.KyuubiConf
+
+/**
+ * A factory for a [[Filter]] based on a custom query.
+ *
+ * The produced filter object filters out all users that are not found in the search result
+ * of the query provided in Kyuubi configuration.
+ *
+ * @see [[KyuubiConf.AUTHENTICATION_LDAP_CUSTOM_LDAP_QUERY]]
+ */
+object CustomQueryFilterFactory extends FilterFactory {
+ override def getInstance(conf: KyuubiConf): Option[Filter] =
+ conf.get(KyuubiConf.AUTHENTICATION_LDAP_CUSTOM_LDAP_QUERY)
+ .map { customQuery => new CustomQueryFilter(customQuery) }
+}
+class CustomQueryFilter(query: String) extends Filter with Logging {
+ @throws[AuthenticationException]
+ override def apply(client: DirSearch, user: String): Unit = {
+ var resultList: Array[String] = null
+ try {
+ resultList = client.executeCustomQuery(query)
+ } catch {
+ case e: NamingException =>
+ throw new AuthenticationException(s"LDAP Authentication failed for $user", e)
+ }
+ if (resultList != null) {
+ resultList.foreach { matchedDn =>
+ val shortUserName = LdapUtils.getShortName(matchedDn)
+ info(s"<queried user=$shortUserName,user=$user>")
+ if (shortUserName.equalsIgnoreCase(user) || matchedDn.equalsIgnoreCase(user)) {
+ info("Authentication succeeded based on result set from LDAP query")
+ return
+ }
+ }
+ // try a generic user search
+ if (query.contains("%s")) {
+ val userSearchQuery = query.replace("%s", user)
+ info("Trying with generic user search in ldap:" + userSearchQuery)
+ try resultList = client.executeCustomQuery(userSearchQuery)
+ catch {
+ case e: NamingException =>
+ throw new AuthenticationException(s"LDAP Authentication failed for $user", e)
+ }
+ if (resultList != null && resultList.length == 1) {
+ info("Authentication succeeded based on result from custom user search query")
+ return
+ }
+ }
+ }
+ info("Authentication failed based on result set from custom LDAP query")
+ throw new AuthenticationException(
+ "Authentication failed: LDAP query from property returned no data")
+ }
+}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/DirSearch.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/DirSearch.scala
new file mode 100644
index 000000000..c1c4d5060
--- /dev/null
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/DirSearch.scala
@@ -0,0 +1,73 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.service.authentication.ldap
+
+import java.io.Closeable
+import javax.naming.NamingException
+
+/**
+ * The object used for executing queries on the Directory Service.
+ */
+trait DirSearch extends Closeable {
+
+ /**
+ * Finds user's distinguished name.
+ *
+ * @param user username
+ * @return DN for the specified username
+ */
+ @throws[NamingException]
+ def findUserDn(user: String): String
+
+ /**
+ * Finds group's distinguished name.
+ *
+ * @param group group name or unique identifier
+ * @return DN for the specified group name
+ */
+ @throws[NamingException]
+ def findGroupDn(group: String): String
+
+ /**
+ * Verifies that specified user is a member of specified group.
+ *
+ * @param user user id or distinguished name
+ * @param groupDn group's DN
+ * @return true if the user is a member of the group, false - otherwise.
+ */
+ @throws[NamingException]
+ def isUserMemberOfGroup(user: String, groupDn: String): Boolean
+
+ /**
+ * Finds groups that contain the specified user.
+ *
+ * @param userDn user's distinguished name
+ * @return list of groups
+ */
+ @throws[NamingException]
+ def findGroupsForUser(userDn: String): Array[String]
+
+ /**
+ * Executes an arbitrary query.
+ *
+ * @param query any query
+ * @return list of names in the namespace
+ */
+ @throws[NamingException]
+ def executeCustomQuery(query: String): Array[String]
+}
diff --git a/kyuubi-server/src/test/scala/org/apache/kyuubi/operation/datalake/HudiOperationSuite.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/DirSearchFactory.scala
similarity index 62%
rename from kyuubi-server/src/test/scala/org/apache/kyuubi/operation/datalake/HudiOperationSuite.scala
rename to kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/DirSearchFactory.scala
index 0c507504d..2046632d8 100644
--- a/kyuubi-server/src/test/scala/org/apache/kyuubi/operation/datalake/HudiOperationSuite.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/DirSearchFactory.scala
@@ -15,20 +15,25 @@
* limitations under the License.
*/
-package org.apache.kyuubi.operation.datalake
+package org.apache.kyuubi.service.authentication.ldap
+
+import javax.security.sasl.AuthenticationException
-import org.apache.kyuubi.WithKyuubiServer
import org.apache.kyuubi.config.KyuubiConf
-import org.apache.kyuubi.operation.HudiMetadataTests
-import org.apache.kyuubi.tags.HudiTest
-@HudiTest
-class HudiOperationSuite extends WithKyuubiServer with HudiMetadataTests {
- override protected val conf: KyuubiConf = {
- val kyuubiConf = KyuubiConf().set(KyuubiConf.ENGINE_IDLE_TIMEOUT, 20000L)
- extraConfigs.foreach { case (k, v) => kyuubiConf.set(k, v) }
- kyuubiConf
- }
+/**
+ * A factory for [[DirSearch]].
+ */
+trait DirSearchFactory {
- override def jdbcUrl: String = getJdbcUrl
+ /**
+ * Returns an instance of [[DirSearch]].
+ *
+ * @param conf Kyuubi configuration
+ * @param user username
+ * @param password user password
+ * @return instance of [[DirSearch]]
+ */
+ @throws[AuthenticationException]
+ def getInstance(conf: KyuubiConf, user: String, password: String): DirSearch
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/Filter.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/Filter.scala
new file mode 100644
index 000000000..e57eddb0d
--- /dev/null
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/Filter.scala
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.service.authentication.ldap
+
+import javax.security.sasl.AuthenticationException
+
+/**
+ * The object that filters LDAP users.
+ *
+ * The assumption is that this user was already authenticated by a previous bind operation.
+ */
+trait Filter {
+
+ /**
+ * Applies this filter to the authenticated user.
+ *
+ * @param client LDAP client that will be used for execution of LDAP queries.
+ * @param user username
+ */
+ @throws[AuthenticationException]
+ def apply(client: DirSearch, user: String): Unit
+}
diff --git a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/trino/api/KyuubiScalaObjectMapper.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/FilterFactory.scala
similarity index 64%
rename from kyuubi-server/src/main/scala/org/apache/kyuubi/server/trino/api/KyuubiScalaObjectMapper.scala
rename to kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/FilterFactory.scala
index 915b109b7..d85104684 100644
--- a/kyuubi-server/src/main/scala/org/apache/kyuubi/server/trino/api/KyuubiScalaObjectMapper.scala
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/FilterFactory.scala
@@ -15,15 +15,20 @@
* limitations under the License.
*/
-package org.apache.kyuubi.server.trino.api
+package org.apache.kyuubi.service.authentication.ldap
-import javax.ws.rs.ext.ContextResolver
+import org.apache.kyuubi.config.KyuubiConf
-import com.fasterxml.jackson.databind.ObjectMapper
-import com.fasterxml.jackson.module.scala.DefaultScalaModule
-
-class KyuubiScalaObjectMapper extends ContextResolver[ObjectMapper] {
- private val mapper = new ObjectMapper().registerModule(DefaultScalaModule)
+/**
+ * Factory for the filter.
+ */
+trait FilterFactory {
- override def getContext(aClass: Class[_]): ObjectMapper = mapper
+ /**
+ * Returns an instance of the corresponding filter.
+ *
+ * @param conf Kyuubi configurations used to configure the filter.
+ * @return Some(filter) or None if this filter doesn't support provided set of properties
+ */
+ def getInstance(conf: KyuubiConf): Option[Filter]
}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/GroupFilterFactory.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/GroupFilterFactory.scala
new file mode 100644
index 000000000..f3048ea6f
--- /dev/null
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/GroupFilterFactory.scala
@@ -0,0 +1,108 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.service.authentication.ldap
+
+import javax.naming.NamingException
+import javax.security.sasl.AuthenticationException
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.kyuubi.Logging
+import org.apache.kyuubi.config.KyuubiConf
+
+object GroupFilterFactory extends FilterFactory {
+ override def getInstance(conf: KyuubiConf): Option[Filter] = {
+ val groupFilter = conf.get(KyuubiConf.AUTHENTICATION_LDAP_GROUP_FILTER)
+ if (groupFilter.isEmpty) {
+ None
+ } else if (conf.get(KyuubiConf.AUTHENTICATION_LDAP_USER_MEMBERSHIP_KEY).isDefined) {
+ Some(new UserMembershipKeyFilter(groupFilter))
+ } else {
+ Some(new GroupMembershipKeyFilter(groupFilter))
+ }
+ }
+}
+
+class GroupMembershipKeyFilter(groupFilter: Set[String]) extends Filter with Logging {
+
+ @throws[AuthenticationException]
+ override def apply(ldap: DirSearch, user: String): Unit = {
+ info(s"Authenticating user '$user' using ${classOf[GroupMembershipKeyFilter].getSimpleName}")
+
+ var memberOf: Array[String] = null
+ try {
+ val userDn = ldap.findUserDn(user)
+ // Workaround for magic things on Mockito:
+ // unmatched invocation returns an empty list if the method return type is JList,
+ // but null if the method return type is Array
+ memberOf = Option(ldap.findGroupsForUser(userDn)).getOrElse(Array.empty)
+ debug(s"User $userDn member of: ${memberOf.mkString(",")}")
+ } catch {
+ case e: NamingException =>
+ throw new AuthenticationException(s"LDAP Authentication failed for $user", e)
+ }
+ memberOf.foreach { groupDn =>
+ val shortName = LdapUtils.getShortName(groupDn)
+ if (groupFilter.exists(shortName.equalsIgnoreCase)) {
+ debug(s"GroupMembershipKeyFilter passes: user '$user' is a member of '$groupDn' group")
+ info("Authentication succeeded based on group membership")
+ return
+ }
+ }
+ info("Authentication failed based on user membership")
+ throw new AuthenticationException(
+ "Authentication failed: User not a member of specified list")
+ }
+}
+
+class UserMembershipKeyFilter(groupFilter: Set[String]) extends Filter with Logging {
+ @throws[AuthenticationException]
+ override def apply(ldap: DirSearch, user: String): Unit = {
+ info(s"Authenticating user '$user' using ${classOf[UserMembershipKeyFilter].getSimpleName}")
+ val groupDns = new ArrayBuffer[String]
+ groupFilter.foreach { groupId =>
+ try {
+ val groupDn = ldap.findGroupDn(groupId)
+ groupDns += groupDn
+ } catch {
+ case e: NamingException =>
+ warn("Cannot find DN for group", e)
+ debug(s"Cannot find DN for group $groupId", e)
+ }
+ }
+ if (groupDns.isEmpty) {
+ debug(s"No DN(s) has been found for any of group(s): ${groupFilter.mkString(",")}")
+ throw new AuthenticationException("No DN(s) has been found for any of specified group(s)")
+ }
+ groupDns.foreach { groupDn =>
+ try {
+ if (ldap.isUserMemberOfGroup(user, groupDn)) {
+ debug(s"UserMembershipKeyFilter passes: user '$user' is a member of '$groupDn' group")
+ info("Authentication succeeded based on user membership")
+ return
+ }
+ } catch {
+ case e: NamingException =>
+ warn("Cannot match user and group", e)
+ debug(s"Cannot match user '$user' and group '$groupDn'", e)
+ }
+ }
+ throw new AuthenticationException(
+ s"Authentication failed: User '$user' is not a member of listed groups")
+ }
+}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/LdapSearch.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/LdapSearch.scala
new file mode 100644
index 000000000..09dca1d5c
--- /dev/null
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/LdapSearch.scala
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.service.authentication.ldap
+
+import javax.naming.{NamingEnumeration, NamingException}
+import javax.naming.directory.{DirContext, SearchResult}
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.kyuubi.Logging
+import org.apache.kyuubi.config.KyuubiConf
+
+/**
+ * Implements search for LDAP.
+ * @param conf Kyuubi configuration
+ * @param ctx Directory service that will be used for the queries.
+ */
+class LdapSearch(conf: KyuubiConf, ctx: DirContext) extends DirSearch with Logging {
+
+ final private val baseDn = conf.get(KyuubiConf.AUTHENTICATION_LDAP_BASE_DN).orNull
+ final private val groupBases: Array[String] =
+ LdapUtils.patternsToBaseDns(
+ LdapUtils.parseDnPatterns(conf, KyuubiConf.AUTHENTICATION_LDAP_GROUP_DN_PATTERN))
+ final private val userPatterns: Array[String] =
+ LdapUtils.parseDnPatterns(conf, KyuubiConf.AUTHENTICATION_LDAP_USER_DN_PATTERN)
+ final private val userBases: Array[String] = LdapUtils.patternsToBaseDns(userPatterns)
+ final private val queries: QueryFactory = new QueryFactory(conf)
+
+ /**
+ * Closes this search object and releases any system resources associated
+ * with it. If the search object is already closed then invoking this
+ * method has no effect.
+ */
+ override def close(): Unit = {
+ try ctx.close()
+ catch {
+ case e: NamingException =>
+ warn("Exception when closing LDAP context:", e)
+ }
+ }
+
+ @throws[NamingException]
+ override def findUserDn(user: String): String = {
+ var allLdapNames: Array[String] = null
+ if (LdapUtils.isDn(user)) {
+ val userBaseDn: String = LdapUtils.extractBaseDn(user)
+ val userRdn: String = LdapUtils.extractFirstRdn(user)
+ allLdapNames = execute(Array(userBaseDn), queries.findUserDnByRdn(userRdn)).getAllLdapNames
+ } else {
+ allLdapNames = findDnByPattern(userPatterns, user)
+ if (allLdapNames.isEmpty) {
+ allLdapNames = execute(userBases, queries.findUserDnByName(user)).getAllLdapNames
+ }
+ }
+ if (allLdapNames.length == 1) allLdapNames.head
+ else {
+ info(s"Expected exactly one user result for the user: $user, " +
+ s"but got ${allLdapNames.length}. Returning null")
+ debug(s"Matched users: ${allLdapNames.mkString(",")}")
+ null
+ }
+ }
+
+ @throws[NamingException]
+ private def findDnByPattern(patterns: Seq[String], name: String): Array[String] = {
+ for (pattern <- patterns) {
+ val baseDnFromPattern: String = LdapUtils.extractBaseDn(pattern)
+ val rdn = LdapUtils.extractFirstRdn(pattern).replace("%s", name)
+ val names = execute(Array(baseDnFromPattern), queries.findDnByPattern(rdn)).getAllLdapNames
+ if (!names.isEmpty) return names
+ }
+ Array.empty
+ }
+
+ @throws[NamingException]
+ override def findGroupDn(group: String): String =
+ execute(groupBases, queries.findGroupDnById(group)).getSingleLdapName
+
+ @throws[NamingException]
+ override def isUserMemberOfGroup(user: String, groupDn: String): Boolean = {
+ val userId = LdapUtils.extractUserName(user)
+ execute(userBases, queries.isUserMemberOfGroup(userId, groupDn)).hasSingleResult
+ }
+
+ @throws[NamingException]
+ override def findGroupsForUser(userDn: String): Array[String] = {
+ val userName = LdapUtils.extractUserName(userDn)
+ execute(groupBases, queries.findGroupsForUser(userName, userDn)).getAllLdapNames
+ }
+
+ @throws[NamingException]
+ override def executeCustomQuery(query: String): Array[String] =
+ execute(Array(baseDn), queries.customQuery(query)).getAllLdapNamesAndAttributes
+
+ private def execute(baseDns: Array[String], query: Query): SearchResultHandler = {
+ val searchResults = new ArrayBuffer[NamingEnumeration[SearchResult]]
+ debug(s"Executing a query: '${query.filter}' with base DNs ${baseDns.mkString(",")}")
+ baseDns.foreach { baseDn =>
+ try {
+ val searchResult = ctx.search(baseDn, query.filter, query.controls)
+ if (searchResult != null) searchResults += searchResult
+ } catch {
+ case ex: NamingException =>
+ debug(
+ s"Exception happened for query '${query.filter}' with base DN '$baseDn'",
+ ex)
+ }
+ }
+ new SearchResultHandler(searchResults.toArray)
+ }
+}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/LdapSearchFactory.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/LdapSearchFactory.scala
new file mode 100644
index 000000000..e3649d359
--- /dev/null
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/LdapSearchFactory.scala
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.service.authentication.ldap
+
+import java.util
+import javax.naming.{Context, NamingException}
+import javax.naming.directory.{DirContext, InitialDirContext}
+import javax.security.sasl.AuthenticationException
+
+import org.apache.kyuubi.Logging
+import org.apache.kyuubi.config.KyuubiConf
+
+class LdapSearchFactory extends DirSearchFactory with Logging {
+ @throws[AuthenticationException]
+ override def getInstance(conf: KyuubiConf, principal: String, password: String): DirSearch = {
+ try {
+ val ctx = createDirContext(conf, principal, password)
+ new LdapSearch(conf, ctx)
+ } catch {
+ case e: NamingException =>
+ debug(s"Could not connect to the LDAP Server: Authentication failed for $principal")
+ throw new AuthenticationException(s"Error validating LDAP user: $principal", e)
+ }
+ }
+
+ @throws[NamingException]
+ private def createDirContext(
+ conf: KyuubiConf,
+ principal: String,
+ password: String): DirContext = {
+ val ldapUrl = conf.get(KyuubiConf.AUTHENTICATION_LDAP_URL)
+ val env = new util.Hashtable[String, AnyRef]
+ ldapUrl.foreach(env.put(Context.PROVIDER_URL, _))
+ env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory")
+ env.put(Context.SECURITY_AUTHENTICATION, "simple")
+ env.put(Context.SECURITY_PRINCIPAL, principal)
+ env.put(Context.SECURITY_CREDENTIALS, password)
+ debug(s"Connecting using principal $principal to ldap server: ${ldapUrl.orNull}")
+ new InitialDirContext(env)
+ }
+}
diff --git a/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/LdapUtils.scala b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/LdapUtils.scala
new file mode 100644
index 000000000..e304e96f7
--- /dev/null
+++ b/kyuubi-common/src/main/scala/org/apache/kyuubi/service/authentication/ldap/LdapUtils.scala
@@ -0,0 +1,212 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.service.authentication.ldap
+
+import scala.collection.mutable.ArrayBuffer
+
+import org.apache.kyuubi.Logging
+import org.apache.kyuubi.config.{KyuubiConf, OptionalConfigEntry}
+import org.apache.kyuubi.service.ServiceUtils
+
+/**
+ * Static utility methods related to LDAP authentication module.
+ */
+object LdapUtils extends Logging {
+
+ /**
+ * Extracts a base DN from the provided distinguished name.
+ *
+ * Example:
+ *
+ * "ou=CORP,dc=mycompany,dc=com" is the base DN for "cn=user1,ou=CORP,dc=mycompany,dc=com"
+ *
+ * @param dn distinguished name
+ * @return base DN
+ */
+ def extractBaseDn(dn: String): String = {
+ val indexOfFirstDelimiter = dn.indexOf(",")
+ if (indexOfFirstDelimiter > -1) {
+ return dn.substring(indexOfFirstDelimiter + 1)
+ }
+ null
+ }
+
+ /**
+ * Extracts the first Relative Distinguished Name (RDN).
+ *
+ * Example:
+ *
+ * For DN "cn=user1,ou=CORP,dc=mycompany,dc=com" this method will return "cn=user1"
+ *
+ * @param dn distinguished name
+ * @return first RDN
+ */
+ def extractFirstRdn(dn: String): String = dn.substring(0, dn.indexOf(","))
+
+ /**
+ * Extracts username from user DN.
+ *
+ * Examples:
+ *
")}")
.version("1.3.2")
.stringConf
- .checkValues(AuthTypes.values.map(_.toString))
+ .checkValues(AuthTypes)
.createWithDefault(AuthTypes.NONE.toString)
+ val HA_ZK_AUTH_SERVER_PRINCIPAL: OptionalConfigEntry[String] =
+ buildConf("kyuubi.ha.zookeeper.auth.serverPrincipal")
+ .doc("Kerberos principal name of ZooKeeper Server. It only takes effect when " +
+ "the ZooKeeper client's version is at least 3.5.7 or 3.6.0, or it applies ZOOKEEPER-1467. " +
+ "To use the ZooKeeper 3.6 client, compile Kyuubi with `-Pzookeeper-3.6`.")
+ .version("1.8.0")
+ .stringConf
+ .createOptional
+
val HA_ZK_AUTH_PRINCIPAL: ConfigEntry[Option[String]] =
buildConf("kyuubi.ha.zookeeper.auth.principal")
- .doc("Name of the Kerberos principal is used for ZooKeeper authentication.")
+ .doc("Kerberos principal name that is used for ZooKeeper authentication.")
.version("1.3.2")
.fallbackConf(KyuubiConf.SERVER_PRINCIPAL)
- val HA_ZK_AUTH_KEYTAB: ConfigEntry[Option[String]] = buildConf("kyuubi.ha.zookeeper.auth.keytab")
- .doc("Location of the Kyuubi server's keytab is used for ZooKeeper authentication.")
- .version("1.3.2")
- .fallbackConf(KyuubiConf.SERVER_KEYTAB)
+ val HA_ZK_AUTH_KEYTAB: ConfigEntry[Option[String]] =
+ buildConf("kyuubi.ha.zookeeper.auth.keytab")
+ .doc("Location of the Kyuubi server's keytab that is used for ZooKeeper authentication.")
+ .version("1.3.2")
+ .fallbackConf(KyuubiConf.SERVER_KEYTAB)
- val HA_ZK_AUTH_DIGEST: OptionalConfigEntry[String] = buildConf("kyuubi.ha.zookeeper.auth.digest")
- .doc("The digest auth string is used for ZooKeeper authentication, like: username:password.")
- .version("1.3.2")
- .stringConf
- .createOptional
+ val HA_ZK_AUTH_DIGEST: OptionalConfigEntry[String] =
+ buildConf("kyuubi.ha.zookeeper.auth.digest")
+ .doc("The digest auth string is used for ZooKeeper authentication, like: username:password.")
+ .version("1.3.2")
+ .stringConf
+ .createOptional
val HA_ZK_CONN_MAX_RETRIES: ConfigEntry[Int] =
buildConf("kyuubi.ha.zookeeper.connection.max.retries")
@@ -150,7 +160,7 @@ object HighAvailabilityConf {
s" ${RetryPolicies.values.mkString("<ul><li>", "</li><li> ", "</li></ul>")}")
.version("1.0.0")
.stringConf
- .checkValues(RetryPolicies.values.map(_.toString))
+ .checkValues(RetryPolicies)
.createWithDefault(RetryPolicies.EXPONENTIAL_BACKOFF.toString)
val HA_ZK_NODE_TIMEOUT: ConfigEntry[Long] =
@@ -210,14 +220,14 @@ object HighAvailabilityConf {
.stringConf
.createOptional
- val HA_ETCD_SSL_CLINET_CRT_PATH: OptionalConfigEntry[String] =
+ val HA_ETCD_SSL_CLIENT_CRT_PATH: OptionalConfigEntry[String] =
buildConf("kyuubi.ha.etcd.ssl.client.certificate.path")
.doc("Where the etcd SSL certificate file is stored.")
.version("1.6.0")
.stringConf
.createOptional
- val HA_ETCD_SSL_CLINET_KEY_PATH: OptionalConfigEntry[String] =
+ val HA_ETCD_SSL_CLIENT_KEY_PATH: OptionalConfigEntry[String] =
buildConf("kyuubi.ha.etcd.ssl.client.key.path")
.doc("Where the etcd SSL key file is stored.")
.version("1.6.0")
diff --git a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/DiscoveryPaths.scala b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/DiscoveryPaths.scala
index 987a88dda..fe7ebe2ab 100644
--- a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/DiscoveryPaths.scala
+++ b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/DiscoveryPaths.scala
@@ -17,7 +17,7 @@
package org.apache.kyuubi.ha.client
-import org.apache.curator.utils.ZKPaths
+import org.apache.kyuubi.shaded.curator.utils.ZKPaths
object DiscoveryPaths {
def makePath(parent: String, firstChild: String, restChildren: String*): String = {
diff --git a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/ServiceDiscovery.scala b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/ServiceDiscovery.scala
index bdb9b12fe..a1b1466d1 100644
--- a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/ServiceDiscovery.scala
+++ b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/ServiceDiscovery.scala
@@ -60,6 +60,7 @@ abstract class ServiceDiscovery(
override def start(): Unit = {
discoveryClient.registerService(conf, namespace, this)
+ info(s"Registered $name in namespace ${_namespace}.")
super.start()
}
diff --git a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/etcd/EtcdDiscoveryClient.scala b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/etcd/EtcdDiscoveryClient.scala
index ad3a0550c..d979804f4 100644
--- a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/etcd/EtcdDiscoveryClient.scala
+++ b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/etcd/EtcdDiscoveryClient.scala
@@ -74,10 +74,10 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
} else {
val caPath = conf.getOption(HA_ETCD_SSL_CA_PATH.key).getOrElse(
throw new IllegalArgumentException(s"${HA_ETCD_SSL_CA_PATH.key} is not defined"))
- val crtPath = conf.getOption(HA_ETCD_SSL_CLINET_CRT_PATH.key).getOrElse(
- throw new IllegalArgumentException(s"${HA_ETCD_SSL_CLINET_CRT_PATH.key} is not defined"))
- val keyPath = conf.getOption(HA_ETCD_SSL_CLINET_KEY_PATH.key).getOrElse(
- throw new IllegalArgumentException(s"${HA_ETCD_SSL_CLINET_KEY_PATH.key} is not defined"))
+ val crtPath = conf.getOption(HA_ETCD_SSL_CLIENT_CRT_PATH.key).getOrElse(
+ throw new IllegalArgumentException(s"${HA_ETCD_SSL_CLIENT_CRT_PATH.key} is not defined"))
+ val keyPath = conf.getOption(HA_ETCD_SSL_CLIENT_KEY_PATH.key).getOrElse(
+ throw new IllegalArgumentException(s"${HA_ETCD_SSL_CLIENT_KEY_PATH.key} is not defined"))
val context = GrpcSslContexts.forClient()
.trustManager(new File(caPath))
@@ -90,7 +90,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def createClient(): Unit = {
+ override def createClient(): Unit = {
client = buildClient()
kvClient = client.getKVClient()
lockClient = client.getLockClient()
@@ -99,13 +99,13 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
leaseTTL = conf.get(HighAvailabilityConf.HA_ETCD_LEASE_TIMEOUT) / 1000
}
- def closeClient(): Unit = {
+ override def closeClient(): Unit = {
if (client != null) {
client.close()
}
}
- def create(path: String, mode: String, createParent: Boolean = true): String = {
+ override def create(path: String, mode: String, createParent: Boolean = true): String = {
// createParent can not effect here
mode match {
case "PERSISTENT" => kvClient.put(
@@ -116,7 +116,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
path
}
- def getData(path: String): Array[Byte] = {
+ override def getData(path: String): Array[Byte] = {
val response = kvClient.get(ByteSequence.from(path.getBytes())).get()
if (response.getKvs.isEmpty) {
throw new KyuubiException(s"Key[$path] not exists in ETCD, please check it.")
@@ -125,12 +125,12 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def setData(path: String, data: Array[Byte]): Boolean = {
+ override def setData(path: String, data: Array[Byte]): Boolean = {
val response = kvClient.put(ByteSequence.from(path.getBytes), ByteSequence.from(data)).get()
response != null
}
- def getChildren(path: String): List[String] = {
+ override def getChildren(path: String): List[String] = {
val kvs = kvClient.get(
ByteSequence.from(path.getBytes()),
GetOption.newBuilder().isPrefix(true).build()).get().getKvs
@@ -142,25 +142,25 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def pathExists(path: String): Boolean = {
+ override def pathExists(path: String): Boolean = {
!pathNonExists(path)
}
- def pathNonExists(path: String): Boolean = {
+ override def pathNonExists(path: String): Boolean = {
kvClient.get(ByteSequence.from(path.getBytes())).get().getKvs.isEmpty
}
- def delete(path: String, deleteChildren: Boolean = false): Unit = {
+ override def delete(path: String, deleteChildren: Boolean = false): Unit = {
kvClient.delete(
ByteSequence.from(path.getBytes()),
DeleteOption.newBuilder().isPrefix(deleteChildren).build()).get()
}
- def monitorState(serviceDiscovery: ServiceDiscovery): Unit = {
+ override def monitorState(serviceDiscovery: ServiceDiscovery): Unit = {
// not need with etcd
}
- def tryWithLock[T](
+ override def tryWithLock[T](
lockPath: String,
timeout: Long)(f: => T): T = {
// the default unit is millis, covert to seconds.
@@ -195,7 +195,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def getServerHost(namespace: String): Option[(String, Int)] = {
+ override def getServerHost(namespace: String): Option[(String, Int)] = {
// TODO: use last one because to avoid touching some maybe-crashed engines
// We need a big improvement here.
getServiceNodesInfo(namespace, Some(1), silent = true) match {
@@ -204,7 +204,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def getEngineByRefId(
+ override def getEngineByRefId(
namespace: String,
engineRefId: String): Option[(String, Int)] = {
getServiceNodesInfo(namespace, silent = true)
@@ -212,7 +212,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
.map(data => (data.host, data.port))
}
- def getServiceNodesInfo(
+ override def getServiceNodesInfo(
namespace: String,
sizeOpt: Option[Int] = None,
silent: Boolean = false): Seq[ServiceNodeInfo] = {
@@ -241,7 +241,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def registerService(
+ override def registerService(
conf: KyuubiConf,
namespace: String,
serviceDiscovery: ServiceDiscovery,
@@ -267,7 +267,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def deregisterService(): Unit = {
+ override def deregisterService(): Unit = {
// close the EPHEMERAL_SEQUENTIAL node in etcd
if (serviceNode != null) {
if (serviceNode.lease != LEASE_NULL_VALUE) {
@@ -278,7 +278,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def postDeregisterService(namespace: String): Boolean = {
+ override def postDeregisterService(namespace: String): Boolean = {
if (namespace != null) {
delete(DiscoveryPaths.makePath(null, namespace), true)
true
@@ -287,7 +287,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def createAndGetServiceNode(
+ override def createAndGetServiceNode(
conf: KyuubiConf,
namespace: String,
instance: String,
@@ -297,7 +297,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
@VisibleForTesting
- def startSecretNode(
+ override def startSecretNode(
createMode: String,
basePath: String,
initData: String,
@@ -307,7 +307,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
ByteSequence.from(initData.getBytes())).get()
}
- def getAndIncrement(path: String, delta: Int = 1): Int = {
+ override def getAndIncrement(path: String, delta: Int = 1): Int = {
val lockPath = s"${path}_tmp_for_lock"
tryWithLock(lockPath, 60 * 1000) {
if (pathNonExists(path)) {
@@ -358,11 +358,11 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
client.getLeaseClient.keepAlive(
leaseId,
new StreamObserver[LeaseKeepAliveResponse] {
- override def onNext(v: LeaseKeepAliveResponse): Unit = Unit // do nothing
+ override def onNext(v: LeaseKeepAliveResponse): Unit = () // do nothing
- override def onError(throwable: Throwable): Unit = Unit // do nothing
+ override def onError(throwable: Throwable): Unit = () // do nothing
- override def onCompleted(): Unit = Unit // do nothing
+ override def onCompleted(): Unit = () // do nothing
})
client.getKVClient.put(
ByteSequence.from(realPath.getBytes()),
@@ -388,7 +388,7 @@ class EtcdDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
override def onError(throwable: Throwable): Unit =
throw new KyuubiException(throwable.getMessage, throwable.getCause)
- override def onCompleted(): Unit = Unit
+ override def onCompleted(): Unit = ()
}
}
diff --git a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperACLProvider.scala b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperACLProvider.scala
index 467c323b7..87ea65c17 100644
--- a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperACLProvider.scala
+++ b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperACLProvider.scala
@@ -17,13 +17,12 @@
package org.apache.kyuubi.ha.client.zookeeper
-import org.apache.curator.framework.api.ACLProvider
-import org.apache.zookeeper.ZooDefs
-import org.apache.zookeeper.data.ACL
-
import org.apache.kyuubi.config.KyuubiConf
import org.apache.kyuubi.ha.HighAvailabilityConf
import org.apache.kyuubi.ha.client.AuthTypes
+import org.apache.kyuubi.shaded.curator.framework.api.ACLProvider
+import org.apache.kyuubi.shaded.zookeeper.ZooDefs
+import org.apache.kyuubi.shaded.zookeeper.data.ACL
class ZookeeperACLProvider(conf: KyuubiConf) extends ACLProvider {
diff --git a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperClientProvider.scala b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperClientProvider.scala
index 8dd32d6b6..d0749c8d9 100644
--- a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperClientProvider.scala
+++ b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperClientProvider.scala
@@ -18,22 +18,23 @@
package org.apache.kyuubi.ha.client.zookeeper
import java.io.{File, IOException}
+import java.nio.charset.StandardCharsets
import javax.security.auth.login.Configuration
import scala.util.Random
import com.google.common.annotations.VisibleForTesting
-import org.apache.curator.framework.{CuratorFramework, CuratorFrameworkFactory}
-import org.apache.curator.retry._
import org.apache.hadoop.security.UserGroupInformation
-import org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.JaasConfiguration
import org.apache.kyuubi.Logging
import org.apache.kyuubi.config.KyuubiConf
import org.apache.kyuubi.ha.HighAvailabilityConf._
import org.apache.kyuubi.ha.client.{AuthTypes, RetryPolicies}
import org.apache.kyuubi.ha.client.RetryPolicies._
+import org.apache.kyuubi.shaded.curator.framework.{CuratorFramework, CuratorFrameworkFactory}
+import org.apache.kyuubi.shaded.curator.retry._
import org.apache.kyuubi.util.KyuubiHadoopUtils
+import org.apache.kyuubi.util.reflect.DynConstructors
object ZookeeperClientProvider extends Logging {
@@ -65,10 +66,8 @@ object ZookeeperClientProvider extends Logging {
.aclProvider(new ZookeeperACLProvider(conf))
.retryPolicy(retryPolicy)
- conf.get(HA_ZK_AUTH_DIGEST) match {
- case Some(anthString) =>
- builder.authorization("digest", anthString.getBytes("UTF-8"))
- case _ =>
+ conf.get(HA_ZK_AUTH_DIGEST).foreach { authString =>
+ builder.authorization("digest", authString.getBytes(StandardCharsets.UTF_8))
}
builder.build()
@@ -103,46 +102,51 @@ object ZookeeperClientProvider extends Logging {
*/
@throws[Exception]
def setUpZooKeeperAuth(conf: KyuubiConf): Unit = {
- def setupZkAuth(): Unit = {
- val keyTabFile = getKeyTabFile(conf)
- val maybePrincipal = conf.get(HA_ZK_AUTH_PRINCIPAL)
- val kerberized = maybePrincipal.isDefined && keyTabFile.isDefined
- if (UserGroupInformation.isSecurityEnabled && kerberized) {
- if (!new File(keyTabFile.get).exists()) {
- throw new IOException(s"${HA_ZK_AUTH_KEYTAB.key}: $keyTabFile does not exists")
+ def setupZkAuth(): Unit = (conf.get(HA_ZK_AUTH_PRINCIPAL), getKeyTabFile(conf)) match {
+ case (Some(principal), Some(keytab)) if UserGroupInformation.isSecurityEnabled =>
+ if (!new File(keytab).exists()) {
+ throw new IOException(s"${HA_ZK_AUTH_KEYTAB.key}: $keytab does not exists")
}
System.setProperty("zookeeper.sasl.clientconfig", "KyuubiZooKeeperClient")
- var principal = maybePrincipal.get
- principal = KyuubiHadoopUtils.getServerPrincipal(principal)
- val jaasConf = new JaasConfiguration("KyuubiZooKeeperClient", principal, keyTabFile.get)
+ conf.get(HA_ZK_AUTH_SERVER_PRINCIPAL).foreach { zkServerPrincipal =>
+ // ZOOKEEPER-1467 allows configuring SPN in client
+ System.setProperty("zookeeper.server.principal", zkServerPrincipal)
+ }
+ val zkClientPrincipal = KyuubiHadoopUtils.getServerPrincipal(principal)
+ // HDFS-16591 makes a breaking change to JaasConfiguration
+ val jaasConf = DynConstructors.builder()
+ .impl( // Hadoop 3.3.5 and above
+ "org.apache.hadoop.security.authentication.util.JaasConfiguration",
+ classOf[String],
+ classOf[String],
+ classOf[String])
+ .impl( // Hadoop 3.3.4 and previous
+ // scalastyle:off
+ "org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager$JaasConfiguration",
+ // scalastyle:on
+ classOf[String],
+ classOf[String],
+ classOf[String])
+ .build[Configuration]()
+ .newInstance("KyuubiZooKeeperClient", zkClientPrincipal, keytab)
Configuration.setConfiguration(jaasConf)
- }
+ case _ =>
}
- if (conf.get(HA_ENGINE_REF_ID).isEmpty
- && AuthTypes.withName(conf.get(HA_ZK_AUTH_TYPE)) == AuthTypes.KERBEROS) {
+ if (conf.get(HA_ENGINE_REF_ID).isEmpty &&
+ AuthTypes.withName(conf.get(HA_ZK_AUTH_TYPE)) == AuthTypes.KERBEROS) {
setupZkAuth()
- } else if (conf.get(HA_ENGINE_REF_ID).nonEmpty && AuthTypes
- .withName(conf.get(HA_ZK_ENGINE_AUTH_TYPE)) == AuthTypes.KERBEROS) {
+ } else if (conf.get(HA_ENGINE_REF_ID).nonEmpty &&
+ AuthTypes.withName(conf.get(HA_ZK_ENGINE_AUTH_TYPE)) == AuthTypes.KERBEROS) {
setupZkAuth()
}
-
}
@VisibleForTesting
def getKeyTabFile(conf: KyuubiConf): Option[String] = {
- val zkAuthKeytab = conf.get(HA_ZK_AUTH_KEYTAB)
- if (zkAuthKeytab.isDefined) {
- val zkAuthKeytabPath = zkAuthKeytab.get
- val relativeFileName = new File(zkAuthKeytabPath).getName
- if (new File(relativeFileName).exists()) {
- Some(relativeFileName)
- } else {
- Some(zkAuthKeytabPath)
- }
- } else {
- None
+ conf.get(HA_ZK_AUTH_KEYTAB).map { fullPath =>
+ val filename = new File(fullPath).getName
+ if (new File(filename).exists()) filename else fullPath
}
}
-
}
diff --git a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperDiscoveryClient.scala b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperDiscoveryClient.scala
index 1315cf029..2db7d89d6 100644
--- a/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperDiscoveryClient.scala
+++ b/kyuubi-ha/src/main/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperDiscoveryClient.scala
@@ -25,39 +25,25 @@ import java.util.concurrent.atomic.AtomicBoolean
import scala.collection.JavaConverters._
import com.google.common.annotations.VisibleForTesting
-import org.apache.curator.framework.CuratorFramework
-import org.apache.curator.framework.recipes.atomic.{AtomicValue, DistributedAtomicInteger}
-import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreMutex
-import org.apache.curator.framework.recipes.nodes.PersistentNode
-import org.apache.curator.framework.state.ConnectionState
-import org.apache.curator.framework.state.ConnectionState.CONNECTED
-import org.apache.curator.framework.state.ConnectionState.LOST
-import org.apache.curator.framework.state.ConnectionState.RECONNECTED
-import org.apache.curator.framework.state.ConnectionStateListener
-import org.apache.curator.retry.RetryForever
-import org.apache.curator.utils.ZKPaths
-import org.apache.zookeeper.CreateMode
-import org.apache.zookeeper.CreateMode.PERSISTENT
-import org.apache.zookeeper.KeeperException
-import org.apache.zookeeper.KeeperException.NodeExistsException
-import org.apache.zookeeper.WatchedEvent
-import org.apache.zookeeper.Watcher
-
-import org.apache.kyuubi.KYUUBI_VERSION
-import org.apache.kyuubi.KyuubiException
-import org.apache.kyuubi.KyuubiSQLException
-import org.apache.kyuubi.Logging
+
+import org.apache.kyuubi.{KYUUBI_VERSION, KyuubiException, KyuubiSQLException, Logging}
import org.apache.kyuubi.config.KyuubiConf
import org.apache.kyuubi.config.KyuubiReservedKeys.KYUUBI_ENGINE_ID
-import org.apache.kyuubi.ha.HighAvailabilityConf.HA_ENGINE_REF_ID
-import org.apache.kyuubi.ha.HighAvailabilityConf.HA_ZK_NODE_TIMEOUT
-import org.apache.kyuubi.ha.HighAvailabilityConf.HA_ZK_PUBLISH_CONFIGS
-import org.apache.kyuubi.ha.client.DiscoveryClient
-import org.apache.kyuubi.ha.client.ServiceDiscovery
-import org.apache.kyuubi.ha.client.ServiceNodeInfo
-import org.apache.kyuubi.ha.client.zookeeper.ZookeeperClientProvider.buildZookeeperClient
-import org.apache.kyuubi.ha.client.zookeeper.ZookeeperClientProvider.getGracefulStopThreadDelay
+import org.apache.kyuubi.ha.HighAvailabilityConf.{HA_ENGINE_REF_ID, HA_ZK_NODE_TIMEOUT, HA_ZK_PUBLISH_CONFIGS}
+import org.apache.kyuubi.ha.client.{DiscoveryClient, ServiceDiscovery, ServiceNodeInfo}
+import org.apache.kyuubi.ha.client.zookeeper.ZookeeperClientProvider.{buildZookeeperClient, getGracefulStopThreadDelay}
import org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient.connectionChecker
+import org.apache.kyuubi.shaded.curator.framework.CuratorFramework
+import org.apache.kyuubi.shaded.curator.framework.recipes.atomic.{AtomicValue, DistributedAtomicInteger}
+import org.apache.kyuubi.shaded.curator.framework.recipes.locks.InterProcessSemaphoreMutex
+import org.apache.kyuubi.shaded.curator.framework.recipes.nodes.PersistentNode
+import org.apache.kyuubi.shaded.curator.framework.state.{ConnectionState, ConnectionStateListener}
+import org.apache.kyuubi.shaded.curator.framework.state.ConnectionState.{CONNECTED, LOST, RECONNECTED}
+import org.apache.kyuubi.shaded.curator.retry.RetryForever
+import org.apache.kyuubi.shaded.curator.utils.ZKPaths
+import org.apache.kyuubi.shaded.zookeeper.{CreateMode, KeeperException, WatchedEvent, Watcher}
+import org.apache.kyuubi.shaded.zookeeper.CreateMode.PERSISTENT
+import org.apache.kyuubi.shaded.zookeeper.KeeperException.NodeExistsException
import org.apache.kyuubi.util.ThreadUtils
class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
@@ -66,17 +52,17 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
@volatile private var serviceNode: PersistentNode = _
private var watcher: DeRegisterWatcher = _
- def createClient(): Unit = {
+ override def createClient(): Unit = {
zkClient.start()
}
- def closeClient(): Unit = {
+ override def closeClient(): Unit = {
if (zkClient != null) {
zkClient.close()
}
}
- def create(path: String, mode: String, createParent: Boolean = true): String = {
+ override def create(path: String, mode: String, createParent: Boolean = true): String = {
val builder =
if (createParent) zkClient.create().creatingParentsIfNeeded() else zkClient.create()
builder
@@ -84,27 +70,27 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
.forPath(path)
}
- def getData(path: String): Array[Byte] = {
+ override def getData(path: String): Array[Byte] = {
zkClient.getData.forPath(path)
}
- def setData(path: String, data: Array[Byte]): Boolean = {
+ override def setData(path: String, data: Array[Byte]): Boolean = {
zkClient.setData().forPath(path, data) != null
}
- def getChildren(path: String): List[String] = {
+ override def getChildren(path: String): List[String] = {
zkClient.getChildren.forPath(path).asScala.toList
}
- def pathExists(path: String): Boolean = {
+ override def pathExists(path: String): Boolean = {
zkClient.checkExists().forPath(path) != null
}
- def pathNonExists(path: String): Boolean = {
+ override def pathNonExists(path: String): Boolean = {
zkClient.checkExists().forPath(path) == null
}
- def delete(path: String, deleteChildren: Boolean = false): Unit = {
+ override def delete(path: String, deleteChildren: Boolean = false): Unit = {
if (deleteChildren) {
zkClient.delete().deletingChildrenIfNeeded().forPath(path)
} else {
@@ -112,7 +98,7 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def monitorState(serviceDiscovery: ServiceDiscovery): Unit = {
+ override def monitorState(serviceDiscovery: ServiceDiscovery): Unit = {
zkClient
.getConnectionStateListenable.addListener(new ConnectionStateListener {
private val isConnected = new AtomicBoolean(false)
@@ -141,7 +127,7 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
})
}
- def tryWithLock[T](lockPath: String, timeout: Long)(f: => T): T = {
+ override def tryWithLock[T](lockPath: String, timeout: Long)(f: => T): T = {
var lock: InterProcessSemaphoreMutex = null
try {
try {
@@ -189,7 +175,7 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def getServerHost(namespace: String): Option[(String, Int)] = {
+ override def getServerHost(namespace: String): Option[(String, Int)] = {
// TODO: use last one because to avoid touching some maybe-crashed engines
// We need a big improvement here.
getServiceNodesInfo(namespace, Some(1), silent = true) match {
@@ -198,7 +184,7 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def getEngineByRefId(
+ override def getEngineByRefId(
namespace: String,
engineRefId: String): Option[(String, Int)] = {
getServiceNodesInfo(namespace, silent = true)
@@ -206,7 +192,7 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
.map(data => (data.host, data.port))
}
- def getServiceNodesInfo(
+ override def getServiceNodesInfo(
namespace: String,
sizeOpt: Option[Int] = None,
silent: Boolean = false): Seq[ServiceNodeInfo] = {
@@ -226,7 +212,7 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
info(s"Get service instance:$instance$engineIdStr and version:${version.getOrElse("")} " +
s"under $namespace")
ServiceNodeInfo(namespace, p, host, port, version, engineRefId, attributes)
- }
+ }.toSeq
} catch {
case _: Exception if silent => Nil
case e: Exception =>
@@ -235,7 +221,7 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def registerService(
+ override def registerService(
conf: KyuubiConf,
namespace: String,
serviceDiscovery: ServiceDiscovery,
@@ -254,7 +240,7 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
watchNode()
}
- def deregisterService(): Unit = {
+ override def deregisterService(): Unit = {
// close the EPHEMERAL_SEQUENTIAL node in zk
if (serviceNode != null) {
try {
@@ -268,7 +254,7 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def postDeregisterService(namespace: String): Boolean = {
+ override def postDeregisterService(namespace: String): Boolean = {
if (namespace != null) {
try {
delete(namespace, true)
@@ -283,7 +269,7 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
}
- def createAndGetServiceNode(
+ override def createAndGetServiceNode(
conf: KyuubiConf,
namespace: String,
instance: String,
@@ -293,7 +279,7 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
}
@VisibleForTesting
- def startSecretNode(
+ override def startSecretNode(
createMode: String,
basePath: String,
initData: String,
@@ -305,9 +291,13 @@ class ZookeeperDiscoveryClient(conf: KyuubiConf) extends DiscoveryClient {
basePath,
initData.getBytes(StandardCharsets.UTF_8))
secretNode.start()
+ val znodeTimeout = conf.get(HA_ZK_NODE_TIMEOUT)
+ if (!secretNode.waitForInitialCreate(znodeTimeout, TimeUnit.MILLISECONDS)) {
+ throw new KyuubiException(s"Max znode creation wait time $znodeTimeout ms exhausted")
+ }
}
- def getAndIncrement(path: String, delta: Int = 1): Int = {
+ override def getAndIncrement(path: String, delta: Int = 1): Int = {
val dai = new DistributedAtomicInteger(zkClient, path, new RetryForever(1000))
var atomicVal: AtomicValue[Integer] = null
do {
diff --git a/kyuubi-ha/src/test/resources/log4j2-test.xml b/kyuubi-ha/src/test/resources/log4j2-test.xml
index bfc40dd6d..3110216c1 100644
--- a/kyuubi-ha/src/test/resources/log4j2-test.xml
+++ b/kyuubi-ha/src/test/resources/log4j2-test.xml
@@ -21,14 +21,14 @@
-
+
-
+
diff --git a/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/DiscoveryClientTests.scala b/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/DiscoveryClientTests.scala
index 87db340b5..9caf38646 100644
--- a/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/DiscoveryClientTests.scala
+++ b/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/DiscoveryClientTests.scala
@@ -135,17 +135,17 @@ trait DiscoveryClientTests extends KyuubiFunSuite {
new Thread(() => {
withDiscoveryClient(conf) { discoveryClient =>
- discoveryClient.tryWithLock(lockPath, 3000) {
+ discoveryClient.tryWithLock(lockPath, 10000) {
lockLatch.countDown()
- Thread.sleep(5000)
+ Thread.sleep(15000)
}
}
}).start()
withDiscoveryClient(conf) { discoveryClient =>
- assert(lockLatch.await(5000, TimeUnit.MILLISECONDS))
+ assert(lockLatch.await(20000, TimeUnit.MILLISECONDS))
val e = intercept[KyuubiSQLException] {
- discoveryClient.tryWithLock(lockPath, 2000) {}
+ discoveryClient.tryWithLock(lockPath, 5000) {}
}
assert(e.getMessage contains s"Timeout to lock on path [$lockPath]")
}
@@ -162,7 +162,7 @@ trait DiscoveryClientTests extends KyuubiFunSuite {
test("setData method test") {
withDiscoveryClient(conf) { discoveryClient =>
- val data = "abc";
+ val data = "abc"
val path = "/setData_test"
discoveryClient.create(path, "PERSISTENT")
discoveryClient.setData(path, data.getBytes)
diff --git a/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/etcd/EtcdDiscoveryClientSuite.scala b/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/etcd/EtcdDiscoveryClientSuite.scala
index 5b8855c1e..de48a3495 100644
--- a/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/etcd/EtcdDiscoveryClientSuite.scala
+++ b/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/etcd/EtcdDiscoveryClientSuite.scala
@@ -22,6 +22,9 @@ import java.nio.charset.StandardCharsets
import scala.collection.JavaConverters._
import io.etcd.jetcd.launcher.{Etcd, EtcdCluster}
+import org.scalactic.source.Position
+import org.scalatest.Tag
+import org.testcontainers.DockerClientFactory
import org.apache.kyuubi.config.KyuubiConf
import org.apache.kyuubi.ha.HighAvailabilityConf.{HA_ADDRESSES, HA_CLIENT_CLASS}
@@ -41,25 +44,38 @@ class EtcdDiscoveryClientSuite extends DiscoveryClientTests {
var conf: KyuubiConf = KyuubiConf()
.set(HA_CLIENT_CLASS, classOf[EtcdDiscoveryClient].getName)
+ private val hasDockerEnv = DockerClientFactory.instance().isDockerAvailable
+
override def beforeAll(): Unit = {
- etcdCluster = new Etcd.Builder()
- .withNodes(2)
- .build()
- etcdCluster.start()
- conf = new KyuubiConf()
- .set(HA_CLIENT_CLASS, classOf[EtcdDiscoveryClient].getName)
- .set(HA_ADDRESSES, getConnectString)
+ if (hasDockerEnv) {
+ etcdCluster = new Etcd.Builder()
+ .withNodes(2)
+ .build()
+ etcdCluster.start()
+ conf = new KyuubiConf()
+ .set(HA_CLIENT_CLASS, classOf[EtcdDiscoveryClient].getName)
+ .set(HA_ADDRESSES, getConnectString)
+ }
super.beforeAll()
}
override def afterAll(): Unit = {
super.afterAll()
- if (etcdCluster != null) {
+ if (hasDockerEnv && etcdCluster != null) {
etcdCluster.close()
etcdCluster = null
}
}
+ override protected def test(
+ testName: String,
+ testTags: Tag*)(testFun: => Any)(implicit pos: Position): Unit = {
+ if (hasDockerEnv) {
+ super.test(testName, testTags: _*)(testFun)
+ }
+ // skip test
+ }
+
test("etcd test: set, get and delete") {
withDiscoveryClient(conf) { discoveryClient =>
val path = "/kyuubi"
diff --git a/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperDiscoveryClientSuite.scala b/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperDiscoveryClientSuite.scala
index bbd8b94ac..dd78e1fb8 100644
--- a/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperDiscoveryClientSuite.scala
+++ b/kyuubi-ha/src/test/scala/org/apache/kyuubi/ha/client/zookeeper/ZookeeperDiscoveryClientSuite.scala
@@ -25,11 +25,7 @@ import javax.security.auth.login.Configuration
import scala.collection.JavaConverters._
-import org.apache.curator.framework.CuratorFrameworkFactory
-import org.apache.curator.retry.ExponentialBackoffRetry
import org.apache.hadoop.util.StringUtils
-import org.apache.zookeeper.ZooDefs
-import org.apache.zookeeper.data.ACL
import org.scalatest.time.SpanSugar._
import org.apache.kyuubi.{KerberizedTestHelper, KYUUBI_VERSION}
@@ -37,7 +33,13 @@ import org.apache.kyuubi.config.KyuubiConf
import org.apache.kyuubi.ha.HighAvailabilityConf._
import org.apache.kyuubi.ha.client._
import org.apache.kyuubi.ha.client.DiscoveryClientProvider.withDiscoveryClient
+import org.apache.kyuubi.ha.client.zookeeper.ZookeeperClientProvider._
import org.apache.kyuubi.service._
+import org.apache.kyuubi.shaded.curator.framework.CuratorFrameworkFactory
+import org.apache.kyuubi.shaded.curator.retry.ExponentialBackoffRetry
+import org.apache.kyuubi.shaded.zookeeper.ZooDefs
+import org.apache.kyuubi.shaded.zookeeper.data.ACL
+import org.apache.kyuubi.util.reflect.ReflectUtils._
import org.apache.kyuubi.zookeeper.EmbeddedZookeeper
import org.apache.kyuubi.zookeeper.ZookeeperConf.ZK_CLIENT_PORT
@@ -117,7 +119,7 @@ abstract class ZookeeperDiscoveryClientSuite extends DiscoveryClientTests
conf.set(HA_ZK_AUTH_PRINCIPAL.key, principal)
conf.set(HA_ZK_AUTH_TYPE.key, AuthTypes.KERBEROS.toString)
- ZookeeperClientProvider.setUpZooKeeperAuth(conf)
+ setUpZooKeeperAuth(conf)
val configuration = Configuration.getConfiguration
val entries = configuration.getAppConfigurationEntry("KyuubiZooKeeperClient")
@@ -129,9 +131,9 @@ abstract class ZookeeperDiscoveryClientSuite extends DiscoveryClientTests
assert(options("useKeyTab").toString.toBoolean)
conf.set(HA_ZK_AUTH_KEYTAB.key, s"${keytab.getName}")
- val e = intercept[IOException](ZookeeperClientProvider.setUpZooKeeperAuth(conf))
- assert(e.getMessage ===
- s"${HA_ZK_AUTH_KEYTAB.key}: ${ZookeeperClientProvider.getKeyTabFile(conf)} does not exists")
+ val e = intercept[IOException](setUpZooKeeperAuth(conf))
+ assert(
+ e.getMessage === s"${HA_ZK_AUTH_KEYTAB.key}: ${getKeyTabFile(conf).get} does not exists")
}
}
@@ -155,12 +157,11 @@ abstract class ZookeeperDiscoveryClientSuite extends DiscoveryClientTests
assert(service.getServiceState === ServiceState.STARTED)
stopZk()
- val isServerLostM = discovery.getClass.getSuperclass.getDeclaredField("isServerLost")
- isServerLostM.setAccessible(true)
- val isServerLost = isServerLostM.get(discovery)
+ val isServerLost =
+ getField[AtomicBoolean]((discovery.getClass.getSuperclass, discovery), "isServerLost")
eventually(timeout(10.seconds), interval(100.millis)) {
- assert(isServerLost.asInstanceOf[AtomicBoolean].get())
+ assert(isServerLost.get())
assert(discovery.getServiceState === ServiceState.STOPPED)
assert(service.getServiceState === ServiceState.STOPPED)
}
diff --git a/kyuubi-hive-beeline/README.md b/kyuubi-hive-beeline/README.md
index ec4f86fd7..161acb99b 100644
--- a/kyuubi-hive-beeline/README.md
+++ b/kyuubi-hive-beeline/README.md
@@ -3,3 +3,4 @@
Aiming to make a better supported beeline for Kyuubi
- Support to show launch engine log when getting KyuubiConnection(Done, available since v1.4.0-incubating)
+
diff --git a/kyuubi-hive-beeline/pom.xml b/kyuubi-hive-beeline/pom.xml
index 76753b38d..1068a81ce 100644
--- a/kyuubi-hive-beeline/pom.xml
+++ b/kyuubi-hive-beeline/pom.xml
@@ -21,7 +21,7 @@
org.apache.kyuubikyuubi-parent
- 1.7.0-SNAPSHOT
+ 1.9.0-SNAPSHOTkyuubi-hive-beeline
@@ -40,6 +40,12 @@
${project.version}
+ <dependency>
+ <groupId>org.apache.kyuubi</groupId>
+ <artifactId>kyuubi-util</artifactId>
+ <version>${project.version}</version>
+ </dependency>
+
org.apache.hivehive-beeline
@@ -115,6 +121,12 @@
commons-io
+ <dependency>
+ <groupId>org.mockito</groupId>
+ <artifactId>mockito-core</artifactId>
+ <scope>test</scope>
+ </dependency>
+
commons-langcommons-lang
@@ -149,6 +161,11 @@
log4j-slf4j-impl
+ <dependency>
+ <groupId>org.slf4j</groupId>
+ <artifactId>jul-to-slf4j</artifactId>
+ </dependency>
+
org.apache.logging.log4jlog4j-api
@@ -211,6 +228,14 @@
true
+
+
+ org.apache.maven.plugins
+ maven-surefire-plugin
+
+ ${skipTests}
+
+ target/classestarget/test-classes
diff --git a/kyuubi-hive-beeline/src/main/java/org/apache/hive/beeline/KyuubiBeeLine.java b/kyuubi-hive-beeline/src/main/java/org/apache/hive/beeline/KyuubiBeeLine.java
index 7ca767148..224cbb3ce 100644
--- a/kyuubi-hive-beeline/src/main/java/org/apache/hive/beeline/KyuubiBeeLine.java
+++ b/kyuubi-hive-beeline/src/main/java/org/apache/hive/beeline/KyuubiBeeLine.java
@@ -19,22 +19,45 @@
import java.io.IOException;
import java.io.InputStream;
-import java.lang.reflect.Field;
-import java.lang.reflect.Method;
import java.sql.Driver;
-import java.util.Arrays;
-import java.util.Collections;
-import java.util.Iterator;
-import java.util.List;
+import java.util.*;
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.ParseException;
+import org.apache.hive.common.util.HiveStringUtils;
+import org.apache.kyuubi.util.reflect.DynConstructors;
+import org.apache.kyuubi.util.reflect.DynFields;
+import org.apache.kyuubi.util.reflect.DynMethods;
public class KyuubiBeeLine extends BeeLine {
+
+ static {
+ try {
+ // We use reflection here to handle the case where users remove the
+ // slf4j-to-jul bridge in order to route their logs to JUL.
+ Class<?> bridgeClass = Class.forName("org.slf4j.bridge.SLF4JBridgeHandler");
+ bridgeClass.getMethod("removeHandlersForRootLogger").invoke(null);
+ boolean installed = (boolean) bridgeClass.getMethod("isInstalled").invoke(null);
+ if (!installed) {
+ bridgeClass.getMethod("install").invoke(null);
+ }
+ } catch (ReflectiveOperationException cnf) {
+ // can't log anything yet so just fail silently
+ }
+ }
+
public static final String KYUUBI_BEELINE_DEFAULT_JDBC_DRIVER =
"org.apache.kyuubi.jdbc.KyuubiHiveDriver";
protected KyuubiCommands commands = new KyuubiCommands(this);
- private Driver defaultDriver = null;
+ private Driver defaultDriver;
+
+ // copied from org.apache.hive.beeline.BeeLine
+ private static final int ERRNO_OK = 0;
+ private static final int ERRNO_ARGS = 1;
+ private static final int ERRNO_OTHER = 2;
+
+ private static final String PYTHON_MODE_PREFIX = "--python-mode";
+ private boolean pythonMode = false;
public KyuubiBeeLine() {
this(true);
@@ -44,25 +67,37 @@ public KyuubiBeeLine() {
public KyuubiBeeLine(boolean isBeeLine) {
super(isBeeLine);
try {
- Field commandsField = BeeLine.class.getDeclaredField("commands");
- commandsField.setAccessible(true);
- commandsField.set(this, commands);
+ DynFields.builder().hiddenImpl(BeeLine.class, "commands").buildChecked(this).set(commands);
} catch (Throwable t) {
throw new ExceptionInInitializerError("Failed to inject kyuubi commands");
}
try {
defaultDriver =
- (Driver)
- Class.forName(
- KYUUBI_BEELINE_DEFAULT_JDBC_DRIVER,
- true,
- Thread.currentThread().getContextClassLoader())
- .newInstance();
+ DynConstructors.builder()
+ .impl(KYUUBI_BEELINE_DEFAULT_JDBC_DRIVER)
+ .buildChecked()
+ .newInstance();
} catch (Throwable t) {
throw new ExceptionInInitializerError(KYUUBI_BEELINE_DEFAULT_JDBC_DRIVER + "-missing");
}
}
+ @Override
+ void usage() {
+ super.usage();
+ output("Usage: java " + KyuubiBeeLine.class.getCanonicalName());
+ output(" --python-mode Execute python code/script.");
+ }
+
+ public boolean isPythonMode() {
+ return pythonMode;
+ }
+
+ // Visible for testing
+ public void setPythonMode(boolean pythonMode) {
+ this.pythonMode = pythonMode;
+ }
+
/** Starts the program. */
public static void main(String[] args) throws IOException {
mainWithInputRedirection(args, null);
@@ -115,25 +150,37 @@ int initArgs(String[] args) {
BeelineParser beelineParser;
boolean connSuccessful;
boolean exit;
- Field exitField;
+ DynFields.BoundField<Boolean> exitField;
try {
- Field optionsField = BeeLine.class.getDeclaredField("options");
- optionsField.setAccessible(true);
- Options options = (Options) optionsField.get(this);
+ Options options =
+ DynFields.builder()
+ .hiddenImpl(BeeLine.class, "options")
+ .buildStaticChecked()
+ .get();
- beelineParser = new BeelineParser();
+ beelineParser =
+ new BeelineParser() {
+ @SuppressWarnings("rawtypes")
+ @Override
+ protected void processOption(String arg, ListIterator iter) throws ParseException {
+ if (PYTHON_MODE_PREFIX.equals(arg)) {
+ pythonMode = true;
+ } else {
+ super.processOption(arg, iter);
+ }
+ }
+ };
cl = beelineParser.parse(options, args);
- Method connectUsingArgsMethod =
- BeeLine.class.getDeclaredMethod(
- "connectUsingArgs", BeelineParser.class, CommandLine.class);
- connectUsingArgsMethod.setAccessible(true);
- connSuccessful = (boolean) connectUsingArgsMethod.invoke(this, beelineParser, cl);
+ connSuccessful =
+ DynMethods.builder("connectUsingArgs")
+ .hiddenImpl(BeeLine.class, BeelineParser.class, CommandLine.class)
+ .buildChecked(this)
+ .invoke(beelineParser, cl);
- exitField = BeeLine.class.getDeclaredField("exit");
- exitField.setAccessible(true);
- exit = (boolean) exitField.get(this);
+ exitField = DynFields.builder().hiddenImpl(BeeLine.class, "exit").buildChecked(this);
+ exit = exitField.get();
} catch (ParseException e1) {
output(e1.getMessage());
@@ -149,10 +196,11 @@ int initArgs(String[] args) {
// no-op if the file is not present
if (!connSuccessful && !exit) {
try {
- Method defaultBeelineConnectMethod =
- BeeLine.class.getDeclaredMethod("defaultBeelineConnect", CommandLine.class);
- defaultBeelineConnectMethod.setAccessible(true);
- connSuccessful = (boolean) defaultBeelineConnectMethod.invoke(this, cl);
+ connSuccessful =
+ DynMethods.builder("defaultBeelineConnect")
+ .hiddenImpl(BeeLine.class, CommandLine.class)
+ .buildChecked(this)
+ .invoke(cl);
} catch (Exception t) {
error(t.getMessage());
@@ -160,6 +208,11 @@ int initArgs(String[] args) {
}
}
+ // see HIVE-19048 : InitScript errors are ignored
+ if (exit) {
+ return 1;
+ }
+
int code = 0;
if (cl.getOptionValues('e') != null) {
commands = Arrays.asList(cl.getOptionValues('e'));
@@ -175,8 +228,7 @@ int initArgs(String[] args) {
return 1;
}
if (!commands.isEmpty()) {
- for (Iterator i = commands.iterator(); i.hasNext(); ) {
- String command = i.next().toString();
+ for (String command : commands) {
debug(loc("executing-command", command));
if (!dispatch(command)) {
code++;
@@ -184,7 +236,7 @@ int initArgs(String[] args) {
}
try {
exit = true;
- exitField.set(this, exit);
+ exitField.set(exit);
} catch (Exception e) {
error(e.getMessage());
return 1;
@@ -192,4 +244,59 @@ int initArgs(String[] args) {
}
return code;
}
+
+ // see HIVE-19048 : Initscript errors are ignored
+ @Override
+ int runInit() {
+ String[] initFiles = getOpts().getInitFiles();
+
+ // executionResult will be ERRNO_OK only if all initFiles execute successfully
+ int executionResult = ERRNO_OK;
+ boolean exitOnError = !getOpts().getForce();
+ DynFields.BoundField<Boolean> exitField = null;
+
+ if (initFiles != null && initFiles.length != 0) {
+ for (String initFile : initFiles) {
+ info("Running init script " + initFile);
+ try {
+ int currentResult;
+ try {
+ currentResult =
+ DynMethods.builder("executeFile")
+ .hiddenImpl(BeeLine.class, String.class)
+ .buildChecked(this)
+ .invoke(initFile);
+ exitField = DynFields.builder().hiddenImpl(BeeLine.class, "exit").buildChecked(this);
+ } catch (Exception t) {
+ error(t.getMessage());
+ currentResult = ERRNO_OTHER;
+ }
+
+ if (currentResult != ERRNO_OK) {
+ executionResult = currentResult;
+
+ if (exitOnError) {
+ return executionResult;
+ }
+ }
+ } finally {
+ // exit beeline if there is initScript failure and --force is not set
+ boolean exit = exitOnError && executionResult != ERRNO_OK;
+ try {
+ exitField.set(exit);
+ } catch (Exception t) {
+ error(t.getMessage());
+ return ERRNO_OTHER;
+ }
+ }
+ }
+ }
+ return executionResult;
+ }
+
+ // see HIVE-15820: comment at the head of beeline -e
+ @Override
+ boolean dispatch(String line) {
+ return super.dispatch(isPythonMode() ? line : HiveStringUtils.removeComments(line));
+ }
}
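Review note: the exit-code rule that the new `runInit()` override applies (see HIVE-19048) can be sketched in isolation. This is a minimal illustration, not the Kyuubi code: the class `RunInitSketch`, the `executeFile` callback and the file names are hypothetical stand-ins, and the reflective `executeFile` call and the `exit` field update from the patch are left out. Without `--force`, the first failing init script ends the run; with `--force`, all scripts run and the last non-zero code is reported.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

public class RunInitSketch {
  // Same numeric values as the constants copied from BeeLine in the patch.
  static final int ERRNO_OK = 0;
  static final int ERRNO_OTHER = 2;

  // executeFile is a hypothetical callback standing in for BeeLine's hidden executeFile(String).
  static int runInit(List<String> initFiles, ToIntFunction<String> executeFile, boolean force) {
    int executionResult = ERRNO_OK;
    boolean exitOnError = !force;
    for (String initFile : initFiles) {
      int currentResult;
      try {
        currentResult = executeFile.applyAsInt(initFile);
      } catch (RuntimeException e) {
        // Any failure while running a script is reported as a generic error.
        currentResult = ERRNO_OTHER;
      }
      if (currentResult != ERRNO_OK) {
        executionResult = currentResult;
        if (exitOnError) {
          // Stop at the first failing script unless --force was given.
          return executionResult;
        }
      }
    }
    return executionResult;
  }

  public static void main(String[] args) {
    List<String> files = Arrays.asList("ok.sql", "bad.sql", "worse.sql");
    ToIntFunction<String> exec =
        f -> {
          if (f.equals("ok.sql")) return 0;
          if (f.equals("bad.sql")) return 1;
          throw new IllegalStateException("cannot run " + f);
        };
    System.out.println(runInit(files, exec, false)); // prints 1
    System.out.println(runInit(files, exec, true)); // prints 2
  }
}
```

With `force = false` the loop returns at `bad.sql`, so `worse.sql` is never executed; with `force = true` the later failure overwrites the earlier code.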
diff --git a/kyuubi-hive-beeline/src/main/java/org/apache/hive/beeline/KyuubiCommands.java b/kyuubi-hive-beeline/src/main/java/org/apache/hive/beeline/KyuubiCommands.java
index aaa32739a..fcfee49ed 100644
--- a/kyuubi-hive-beeline/src/main/java/org/apache/hive/beeline/KyuubiCommands.java
+++ b/kyuubi-hive-beeline/src/main/java/org/apache/hive/beeline/KyuubiCommands.java
@@ -19,10 +19,13 @@
import static org.apache.kyuubi.jdbc.hive.JdbcConnectionParams.*;
+import com.google.common.annotations.VisibleForTesting;
import java.io.*;
+import java.nio.file.Files;
import java.sql.*;
import java.util.*;
import org.apache.hive.beeline.logs.KyuubiBeelineInPlaceUpdateStream;
+import org.apache.hive.common.util.HiveStringUtils;
import org.apache.kyuubi.jdbc.hive.KyuubiStatement;
import org.apache.kyuubi.jdbc.hive.Utils;
import org.apache.kyuubi.jdbc.hive.logs.InPlaceUpdateStream;
@@ -43,9 +46,14 @@ public boolean sql(String line) {
return execute(line, false, false);
}
+ /** For python mode, keep it as it is. */
+ private String trimForNonPythonMode(String line) {
+ return beeLine.isPythonMode() ? line : line.trim();
+ }
+
/** Extract and clean up the first command in the input. */
private String getFirstCmd(String cmd, int length) {
- return cmd.substring(length).trim();
+ return trimForNonPythonMode(cmd.substring(length));
}
private String[] tokenizeCmd(String cmd) {
@@ -79,10 +87,9 @@ private boolean sourceFile(String cmd) {
}
private boolean sourceFileInternal(File sourceFile) throws IOException {
- BufferedReader reader = null;
- try {
- reader = new BufferedReader(new FileReader(sourceFile));
- String lines = null, extra;
+ try (BufferedReader reader = Files.newBufferedReader(sourceFile.toPath())) {
+ String lines = null;
+ String extra;
while ((extra = reader.readLine()) != null) {
if (beeLine.isComment(extra)) {
continue;
@@ -93,16 +100,13 @@ private boolean sourceFileInternal(File sourceFile) throws IOException {
lines += "\n" + extra;
}
}
- String[] cmds = lines.split(";");
+ String[] cmds = lines.split(beeLine.getOpts().getDelimiter());
for (String c : cmds) {
+ c = trimForNonPythonMode(c);
if (!executeInternal(c, false)) {
return false;
}
}
- } finally {
- if (reader != null) {
- reader.close();
- }
}
return true;
}
@@ -258,9 +262,10 @@ private boolean execute(String line, boolean call, boolean entireLineAsCommand)
beeLine.handleException(e);
}
+ line = trimForNonPythonMode(line);
List cmdList = getCmdList(line, entireLineAsCommand);
for (int i = 0; i < cmdList.size(); i++) {
- String sql = cmdList.get(i);
+ String sql = trimForNonPythonMode(cmdList.get(i));
if (sql.length() != 0) {
if (!executeInternal(sql, call)) {
return false;
@@ -276,7 +281,8 @@ private boolean execute(String line, boolean call, boolean entireLineAsCommand)
* quotations. It iterates through each character in the line and checks to see if it is a ;, ',
* or "
*/
- private List<String> getCmdList(String line, boolean entireLineAsCommand) {
+ @VisibleForTesting
+ public List<String> getCmdList(String line, boolean entireLineAsCommand) {
 List<String> cmdList = new ArrayList<String>();
if (entireLineAsCommand) {
cmdList.add(line);
@@ -352,7 +358,7 @@ private List getCmdList(String line, boolean entireLineAsCommand) {
*/
private void addCmdPart(List cmdList, StringBuilder command, String cmdpart) {
if (cmdpart.endsWith("\\")) {
- command.append(cmdpart.substring(0, cmdpart.length() - 1)).append(";");
+ command.append(cmdpart, 0, cmdpart.length() - 1).append(";");
return;
} else {
command.append(cmdpart);
@@ -417,6 +423,7 @@ private String getProperty(Properties props, String[] keys) {
return null;
}
+ @Override
public boolean connect(Properties props) throws IOException {
String url =
getProperty(
@@ -462,7 +469,7 @@ public boolean connect(Properties props) throws IOException {
beeLine.info("Connecting to " + url);
if (Utils.parsePropertyFromUrl(url, AUTH_PRINCIPAL) == null
- || Utils.parsePropertyFromUrl(url, AUTH_KYUUBI_SERVER_PRINCIPAL) == null) {
+ && Utils.parsePropertyFromUrl(url, AUTH_KYUUBI_SERVER_PRINCIPAL) == null) {
String urlForPrompt = url.substring(0, url.contains(";") ? url.indexOf(';') : url.length());
if (username == null) {
username = beeLine.getConsoleReader().readLine("Enter username for " + urlForPrompt + ": ");
@@ -484,7 +491,19 @@ public boolean connect(Properties props) throws IOException {
if (!beeLine.isBeeLine()) {
beeLine.updateOptsForCli();
}
- beeLine.runInit();
+
+ // see HIVE-19048 : Initscript errors are ignored
+ int initScriptExecutionResult = beeLine.runInit();
+
+ // if execution of the init script(s) returns anything other than ERRNO_OK,
+ // exit beeline with an error unless --force is set
+ if (initScriptExecutionResult != 0 && !beeLine.getOpts().getForce()) {
+ return beeLine.error("init script execution failed.");
+ }
+
+ if (beeLine.getOpts().getInitFiles() != null) {
+ beeLine.initializeConsoleReader(null);
+ }
beeLine.setCompletions();
beeLine.getOpts().setLastConnectedUrl(url);
@@ -499,12 +518,14 @@ public boolean connect(Properties props) throws IOException {
@Override
public String handleMultiLineCmd(String line) throws IOException {
- int[] startQuote = {-1};
Character mask =
(System.getProperty("jline.terminal", "").equals("jline.UnsupportedTerminal"))
? null
: jline.console.ConsoleReader.NULL_MASK;
+ if (!beeLine.isPythonMode()) {
+ line = HiveStringUtils.removeComments(line);
+ }
while (isMultiLine(line) && beeLine.getOpts().isAllowMultiLineCommand()) {
StringBuilder prompt = new StringBuilder(beeLine.getPrompt());
if (!beeLine.getOpts().isSilent()) {
@@ -530,6 +551,9 @@ public String handleMultiLineCmd(String line) throws IOException {
if (extra == null) { // it happens when using -f and the line of cmds does not end with ;
break;
}
+ if (!beeLine.isPythonMode()) {
+ extra = HiveStringUtils.removeComments(extra);
+ }
if (!extra.isEmpty()) {
line += "\n" + extra;
}
@@ -541,12 +565,13 @@ public String handleMultiLineCmd(String line) throws IOException {
// console. Used in handleMultiLineCmd method assumes line would never be null when this method is
// called
private boolean isMultiLine(String line) {
+ line = trimForNonPythonMode(line);
if (line.endsWith(beeLine.getOpts().getDelimiter()) || beeLine.isComment(line)) {
return false;
}
// handles the case like line = show tables; --test comment
List cmds = getCmdList(line, false);
- return cmds.isEmpty() || !cmds.get(cmds.size() - 1).startsWith("--");
+ return cmds.isEmpty() || !trimForNonPythonMode(cmds.get(cmds.size() - 1)).startsWith("--");
}
static class KyuubiLogRunnable implements Runnable {
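Review note: the reason `trimForNonPythonMode` exists can be shown with a small standalone sketch. It assumes only what the patch states, namely that input is kept verbatim in python mode and trimmed otherwise; the class name and the explicit `pythonMode` parameter are hypothetical (the patch reads the flag from `beeLine.isPythonMode()`).

```java
public class TrimForNonPythonModeSketch {
  // Keep the text verbatim in python mode, because leading whitespace is significant in Python.
  static String trimForNonPythonMode(String line, boolean pythonMode) {
    return pythonMode ? line : line.trim();
  }

  public static void main(String[] args) {
    String body = "  print(i)\n";
    // Python mode: the indentation and trailing newline survive.
    System.out.println("[" + trimForNonPythonMode(body, true) + "]");
    // SQL mode: surrounding whitespace is dropped, so this prints [print(i)]
    System.out.println("[" + trimForNonPythonMode(body, false) + "]");
  }
}
```

Trimming an indented block body such as `  print(i)` would detach it from the enclosing `for` statement, which is why the python-mode paths skip `trim()`.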
diff --git a/kyuubi-hive-beeline/src/test/java/org/apache/hive/beeline/KyuubiBeeLineTest.java b/kyuubi-hive-beeline/src/test/java/org/apache/hive/beeline/KyuubiBeeLineTest.java
index b144c95c6..9c7aec35a 100644
--- a/kyuubi-hive-beeline/src/test/java/org/apache/hive/beeline/KyuubiBeeLineTest.java
+++ b/kyuubi-hive-beeline/src/test/java/org/apache/hive/beeline/KyuubiBeeLineTest.java
@@ -19,7 +19,12 @@
package org.apache.hive.beeline;
import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.io.PrintStream;
+import org.apache.kyuubi.util.reflect.DynFields;
import org.junit.Test;
public class KyuubiBeeLineTest {
@@ -29,4 +34,104 @@ public void testKyuubiBeelineWithoutArgs() {
int result = kyuubiBeeLine.initArgs(new String[0]);
assertEquals(0, result);
}
+
+ @Test
+ public void testKyuubiBeelineExitCodeWithoutConnection() {
+ KyuubiBeeLine kyuubiBeeLine = new KyuubiBeeLine();
+ String scriptFile = getClass().getClassLoader().getResource("test.sql").getFile();
+
+ String[] args1 = {"-u", "badUrl", "-e", "show tables"};
+ int result1 = kyuubiBeeLine.initArgs(args1);
+ assertEquals(1, result1);
+
+ String[] args2 = {"-u", "badUrl", "-f", scriptFile};
+ int result2 = kyuubiBeeLine.initArgs(args2);
+ assertEquals(1, result2);
+
+ String[] args3 = {"-u", "badUrl", "-i", scriptFile};
+ int result3 = kyuubiBeeLine.initArgs(args3);
+ assertEquals(1, result3);
+ }
+
+ @Test
+ public void testKyuubiBeeLineCmdUsage() {
+ BufferPrintStream printStream = new BufferPrintStream();
+
+ KyuubiBeeLine kyuubiBeeLine = new KyuubiBeeLine();
+ DynFields.builder()
+ .hiddenImpl(BeeLine.class, "outputStream")
+ .build(kyuubiBeeLine)
+ .set(printStream);
+ String[] args1 = {"-h"};
+ kyuubiBeeLine.initArgs(args1);
+ String output = printStream.getOutput();
+ assert output.contains("--python-mode Execute python code/script.");
+ }
+
+ @Test
+ public void testKyuubiBeeLinePythonMode() {
+ KyuubiBeeLine kyuubiBeeLine = new KyuubiBeeLine();
+ String[] args1 = {"-u", "badUrl", "--python-mode"};
+ kyuubiBeeLine.initArgs(args1);
+ assertTrue(kyuubiBeeLine.isPythonMode());
+ kyuubiBeeLine.setPythonMode(false);
+
+ String[] args2 = {"--python-mode", "-f", "test.sql"};
+ kyuubiBeeLine.initArgs(args2);
+ assertTrue(kyuubiBeeLine.isPythonMode());
+ assert kyuubiBeeLine.getOpts().getScriptFile().equals("test.sql");
+ kyuubiBeeLine.setPythonMode(false);
+
+ String[] args3 = {"-u", "badUrl"};
+ kyuubiBeeLine.initArgs(args3);
+ assertTrue(!kyuubiBeeLine.isPythonMode());
+ kyuubiBeeLine.setPythonMode(false);
+ }
+
+ @Test
+ public void testKyuubiBeelineComment() {
+ KyuubiBeeLine kyuubiBeeLine = new KyuubiBeeLine();
+ int result = kyuubiBeeLine.initArgsFromCliVars(new String[] {"-e", "--comment show database;"});
+ assertEquals(0, result);
+ result = kyuubiBeeLine.initArgsFromCliVars(new String[] {"-e", "--comment\n show database;"});
+ assertEquals(1, result);
+ result =
+ kyuubiBeeLine.initArgsFromCliVars(
+ new String[] {"-e", "--comment line 1 \n --comment line 2 \n show database;"});
+ assertEquals(1, result);
+ }
+
+ static class BufferPrintStream extends PrintStream {
+ public StringBuilder stringBuilder = new StringBuilder();
+
+ static OutputStream noOpOutputStream =
+ new OutputStream() {
+ @Override
+ public void write(int b) throws IOException {
+ // do nothing
+ }
+ };
+
+ public BufferPrintStream() {
+ super(noOpOutputStream);
+ }
+
+ public BufferPrintStream(OutputStream outputStream) {
+ super(noOpOutputStream);
+ }
+
+ @Override
+ public void println(String x) {
+ stringBuilder.append(x).append("\n");
+ }
+
+ @Override
+ public void print(String x) {
+ stringBuilder.append(x);
+ }
+
+ public String getOutput() {
+ return stringBuilder.toString();
+ }
+ }
}
diff --git a/kyuubi-hive-beeline/src/test/java/org/apache/hive/beeline/KyuubiCommandsTest.java b/kyuubi-hive-beeline/src/test/java/org/apache/hive/beeline/KyuubiCommandsTest.java
new file mode 100644
index 000000000..653d1b08f
--- /dev/null
+++ b/kyuubi-hive-beeline/src/test/java/org/apache/hive/beeline/KyuubiCommandsTest.java
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hive.beeline;
+
+import static org.junit.Assert.assertEquals;
+
+import java.io.IOException;
+import java.util.List;
+import jline.console.ConsoleReader;
+import org.junit.Test;
+import org.mockito.Mockito;
+
+public class KyuubiCommandsTest {
+ @Test
+ public void testParsePythonSnippets() throws IOException {
+ ConsoleReader reader = Mockito.mock(ConsoleReader.class);
+ String pythonSnippets = "for i in [1, 2, 3]:\n" + " print(i)\n";
+ Mockito.when(reader.readLine()).thenReturn(pythonSnippets);
+
+ KyuubiBeeLine beeline = new KyuubiBeeLine();
+ beeline.setPythonMode(true);
+ beeline.setConsoleReader(reader);
+ KyuubiCommands commands = new KyuubiCommands(beeline);
+ String line = commands.handleMultiLineCmd(pythonSnippets);
+
+ List<String> cmdList = commands.getCmdList(line, false);
+ assertEquals(cmdList.size(), 1);
+ assertEquals(cmdList.get(0), pythonSnippets);
+ }
+
+ @Test
+ public void testHandleMultiLineCmd() throws IOException {
+ ConsoleReader reader = Mockito.mock(ConsoleReader.class);
+ String snippets = "select 1;--comments1\nselect 2;--comments2";
+ Mockito.when(reader.readLine()).thenReturn(snippets);
+
+ KyuubiBeeLine beeline = new KyuubiBeeLine();
+ beeline.setConsoleReader(reader);
+ beeline.setPythonMode(false);
+ KyuubiCommands commands = new KyuubiCommands(beeline);
+ String line = commands.handleMultiLineCmd(snippets);
+ List<String> cmdList = commands.getCmdList(line, false);
+ assertEquals(cmdList.size(), 2);
+ assertEquals(cmdList.get(0), "select 1");
+ assertEquals(cmdList.get(1), "\nselect 2");
+
+ // see HIVE-15820: comment at the head of beeline -e
+ snippets = "--comments1\nselect 2;--comments2";
+ Mockito.when(reader.readLine()).thenReturn(snippets);
+ line = commands.handleMultiLineCmd(snippets);
+ cmdList = commands.getCmdList(line, false);
+ assertEquals(cmdList.size(), 1);
+ assertEquals(cmdList.get(0), "select 2");
+ }
+}
diff --git a/kyuubi-hive-beeline/src/test/resources/test.sql b/kyuubi-hive-beeline/src/test/resources/test.sql
new file mode 100644
index 000000000..c7c3ee2f9
--- /dev/null
+++ b/kyuubi-hive-beeline/src/test/resources/test.sql
@@ -0,0 +1,17 @@
+-- Licensed to the Apache Software Foundation (ASF) under one or more
+-- contributor license agreements. See the NOTICE file distributed with
+-- this work for additional information regarding copyright ownership.
+-- The ASF licenses this file to You under the Apache License, Version 2.0
+-- (the "License"); you may not use this file except in compliance with
+-- the License. You may obtain a copy of the License at
+--
+-- http://www.apache.org/licenses/LICENSE-2.0
+--
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+--
+
+show tables;
diff --git a/kyuubi-hive-jdbc-shaded/pom.xml b/kyuubi-hive-jdbc-shaded/pom.xml
index 0bfe88922..174f199be 100644
--- a/kyuubi-hive-jdbc-shaded/pom.xml
+++ b/kyuubi-hive-jdbc-shaded/pom.xml
@@ -21,7 +21,7 @@
org.apache.kyuubikyuubi-parent
- 1.7.0-SNAPSHOT
+ 1.9.0-SNAPSHOTkyuubi-hive-jdbc-shaded
@@ -108,10 +108,6 @@
org.apache.commons${kyuubi.shade.packageName}.org.apache.commons
-
- org.apache.curator
- ${kyuubi.shade.packageName}.org.apache.curator
- org.apache.hive${kyuubi.shade.packageName}.org.apache.hive
@@ -120,18 +116,10 @@
org.apache.http${kyuubi.shade.packageName}.org.apache.http
-
- org.apache.jute
- ${kyuubi.shade.packageName}.org.apache.jute
- org.apache.thrift${kyuubi.shade.packageName}.org.apache.thrift
-
- org.apache.zookeeper
- ${kyuubi.shade.packageName}.org.apache.zookeeper
-
diff --git a/kyuubi-hive-jdbc/README.md b/kyuubi-hive-jdbc/README.md
index 3210e76ac..10a0522dc 100644
--- a/kyuubi-hive-jdbc/README.md
+++ b/kyuubi-hive-jdbc/README.md
@@ -1,9 +1,9 @@
# Kyuubi Hive JDBC Module
-
Aiming to make a better supported client for Kyuubi and Spark
- Add catalog to getTables meta function for DataLakes (DONE, broken in v1.3.0-incubating, fixed in v1.3.1-incubating)
- Deploy to maven central (DONE, available since v1.3.0-incubating)
- Create shaded jar (DONE, available since v1.4.0-incubating)
- Remove Hive dependencies (DONE, available since v1.6.0-incubating)
+
diff --git a/kyuubi-hive-jdbc/pom.xml b/kyuubi-hive-jdbc/pom.xml
index 4d9648e75..aa5e7c161 100644
--- a/kyuubi-hive-jdbc/pom.xml
+++ b/kyuubi-hive-jdbc/pom.xml
@@ -21,7 +21,7 @@
org.apache.kyuubikyuubi-parent
- 1.7.0-SNAPSHOT
+ 1.9.0-SNAPSHOTkyuubi-hive-jdbc
@@ -35,6 +35,11 @@
+
+ org.apache.kyuubi
+ kyuubi-util
+ ${project.version}
+ org.apache.arrow
@@ -102,24 +107,14 @@
provided
-
- org.apache.curator
- curator-framework
-
-
-
- org.apache.curator
- curator-client
-
-
org.apache.httpcomponentshttpclient
- org.apache.zookeeper
- zookeeper
+ org.apache.kyuubi
+ ${kyuubi-shaded-zookeeper.artifacts}
@@ -171,6 +166,14 @@
+
+
+
+ true
+ src/main/resources
+
+
+
org.apache.maven.plugins
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/KyuubiHiveDriver.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/KyuubiHiveDriver.java
index 3b874ba2e..66b797087 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/KyuubiHiveDriver.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/KyuubiHiveDriver.java
@@ -24,6 +24,7 @@
import java.util.jar.Attributes;
import java.util.jar.Manifest;
import java.util.logging.Logger;
+import org.apache.commons.lang3.StringUtils;
import org.apache.kyuubi.jdbc.hive.JdbcConnectionParams;
import org.apache.kyuubi.jdbc.hive.KyuubiConnection;
import org.apache.kyuubi.jdbc.hive.KyuubiSQLException;
@@ -137,7 +138,7 @@ private Properties parseURLForPropertyInfo(String url, Properties defaults) thro
host = "";
}
String port = Integer.toString(params.getPort());
- if (host.equals("")) {
+ if (StringUtils.isEmpty(host)) {
port = "";
} else if (port.equals("0") || port.equals("-1")) {
port = DEFAULT_PORT;
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/JdbcColumnAttributes.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/JdbcColumnAttributes.java
index 06fb39899..b0257cfff 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/JdbcColumnAttributes.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/JdbcColumnAttributes.java
@@ -20,7 +20,7 @@
public class JdbcColumnAttributes {
public int precision = 0;
public int scale = 0;
- public String timeZone = "";
+ public String timeZone = null;
public JdbcColumnAttributes() {}
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/JdbcConnectionParams.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/JdbcConnectionParams.java
index 71949b9df..bcc94e083 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/JdbcConnectionParams.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/JdbcConnectionParams.java
@@ -33,6 +33,7 @@ public class JdbcConnectionParams {
// Client param names:
+ static final String CLIENT_PROTOCOL_VERSION = "clientProtocolVersion";
// Retry setting
static final String RETRIES = "retries";
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiArrowBasedResultSet.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiArrowBasedResultSet.java
index c3e75c0ea..ef5008503 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiArrowBasedResultSet.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiArrowBasedResultSet.java
@@ -50,6 +50,7 @@ public abstract class KyuubiArrowBasedResultSet implements SQLResultSet {
protected Schema arrowSchema;
protected VectorSchemaRoot root;
protected ArrowColumnarBatchRow row;
+ protected boolean timestampAsString = true;
protected BufferAllocator allocator;
@@ -312,11 +313,18 @@ private Object getColumnValue(int columnIndex) throws SQLException {
if (wasNull) {
return null;
} else {
- return row.get(columnIndex - 1, columnType);
+ JdbcColumnAttributes attributes = columnAttributes.get(columnIndex - 1);
+ return row.get(
+ columnIndex - 1,
+ columnType,
+ attributes == null ? null : attributes.timeZone,
+ timestampAsString);
}
} catch (Exception e) {
- e.printStackTrace();
- throw new KyuubiSQLException("Unrecognized column type:", e);
+ throw new KyuubiSQLException(
+ String.format(
+ "Error getting row of type %s at column index %d", columnType, columnIndex - 1),
+ e);
}
}
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiArrowQueryResultSet.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiArrowQueryResultSet.java
index 1f2af29dc..54491b2d6 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiArrowQueryResultSet.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiArrowQueryResultSet.java
@@ -58,9 +58,6 @@ public class KyuubiArrowQueryResultSet extends KyuubiArrowBasedResultSet {
private boolean isScrollable = false;
private boolean fetchFirst = false;
- // TODO:(fchen) make this configurable
- protected boolean convertComplexTypeToString = true;
-
private final TProtocolVersion protocol;
public static class Builder {
@@ -87,6 +84,8 @@ public static class Builder {
private boolean isScrollable = false;
private ReentrantLock transportLock = null;
+ private boolean timestampAsString = true;
+
public Builder(Statement statement) throws SQLException {
this.statement = statement;
this.connection = statement.getConnection();
@@ -153,6 +152,11 @@ public Builder setScrollable(boolean setScrollable) {
return this;
}
+ public Builder setTimestampAsString(boolean timestampAsString) {
+ this.timestampAsString = timestampAsString;
+ return this;
+ }
+
public Builder setTransportLock(ReentrantLock transportLock) {
this.transportLock = transportLock;
return this;
@@ -189,10 +193,10 @@ protected KyuubiArrowQueryResultSet(Builder builder) throws SQLException {
this.maxRows = builder.maxRows;
}
this.isScrollable = builder.isScrollable;
+ this.timestampAsString = builder.timestampAsString;
this.protocol = builder.getProtocolVersion();
arrowSchema =
- ArrowUtils.toArrowSchema(
- columnNames, convertComplexTypeToStringType(columnTypes), columnAttributes);
+ ArrowUtils.toArrowSchema(columnNames, convertToStringType(columnTypes), columnAttributes);
if (allocator == null) {
initArrowSchemaAndAllocator();
}
@@ -246,9 +250,6 @@ private void retrieveSchema() throws SQLException {
metadataResp = client.GetResultSetMetadata(metadataReq);
Utils.verifySuccess(metadataResp.getStatus());
- StringBuilder namesSb = new StringBuilder();
- StringBuilder typesSb = new StringBuilder();
-
TTableSchema schema = metadataResp.getSchema();
if (schema == null || !schema.isSetColumns()) {
// TODO: should probably throw an exception here.
@@ -258,10 +259,6 @@ private void retrieveSchema() throws SQLException {
List columns = schema.getColumns();
for (int pos = 0; pos < schema.getColumnsSize(); pos++) {
- if (pos != 0) {
- namesSb.append(",");
- typesSb.append(",");
- }
String columnName = columns.get(pos).getColumnName();
columnNames.add(columnName);
normalizedColumnNames.add(columnName.toLowerCase());
@@ -271,8 +268,7 @@ private void retrieveSchema() throws SQLException {
columnAttributes.add(getColumnAttributes(primitiveTypeEntry));
}
arrowSchema =
- ArrowUtils.toArrowSchema(
- columnNames, convertComplexTypeToStringType(columnTypes), columnAttributes);
+ ArrowUtils.toArrowSchema(columnNames, convertToStringType(columnTypes), columnAttributes);
} catch (SQLException eS) {
throw eS; // rethrow the SQLException as is
} catch (Exception ex) {
@@ -480,22 +476,25 @@ public boolean isClosed() {
return isClosed;
}
- private List<TTypeId> convertComplexTypeToStringType(List<TTypeId> colTypes) {
- if (convertComplexTypeToString) {
- return colTypes.stream()
- .map(
- type -> {
- if (type == TTypeId.ARRAY_TYPE
- || type == TTypeId.MAP_TYPE
- || type == TTypeId.STRUCT_TYPE) {
- return TTypeId.STRING_TYPE;
- } else {
- return type;
- }
- })
- .collect(Collectors.toList());
- } else {
- return colTypes;
- }
+ /**
+ * 1. Complex types (map/array/struct) are always converted to string type for transport.
+ * 2. If the user sets `timestampAsString = true`, the timestamp type is converted to string
+ * type too.
+ */
+ private List<TTypeId> convertToStringType(List<TTypeId> colTypes) {
+ return colTypes.stream()
+ .map(
+ type -> {
+ if ((type == TTypeId.ARRAY_TYPE
+ || type == TTypeId.MAP_TYPE
+ || type == TTypeId.STRUCT_TYPE) // complex type (map/array/struct)
+ // timestamp type
+ || (type == TTypeId.TIMESTAMP_TYPE && timestampAsString)) {
+ return TTypeId.STRING_TYPE;
+ } else {
+ return type;
+ }
+ })
+ .collect(Collectors.toList());
}
}
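Reviewer note: the mapping performed by `convertToStringType` can be sketched in isolation as follows. `ColType` is a hypothetical stand-in for the Thrift-generated `TTypeId` enum, and `timestampAsString` is passed as a parameter here rather than read from a field; both are assumptions made only to keep the sketch self-contained.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for TTypeId; only the members needed here are listed.
enum ColType { INT_TYPE, STRING_TYPE, TIMESTAMP_TYPE, ARRAY_TYPE, MAP_TYPE, STRUCT_TYPE }

class StringTypeMapping {
  // Complex types are always transported as strings; timestamps only when requested.
  static List<ColType> convertToStringType(List<ColType> colTypes, boolean timestampAsString) {
    return colTypes.stream()
        .map(
            type -> {
              boolean complex =
                  type == ColType.ARRAY_TYPE
                      || type == ColType.MAP_TYPE
                      || type == ColType.STRUCT_TYPE;
              boolean timestampToString = type == ColType.TIMESTAMP_TYPE && timestampAsString;
              return (complex || timestampToString) ? ColType.STRING_TYPE : type;
            })
        .collect(Collectors.toList());
  }
}
```

With `timestampAsString = false`, only the complex types change; every other type passes through untouched.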
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiConnection.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiConnection.java
index 9931dcec2..d3fbbeb6d 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiConnection.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiConnection.java
@@ -30,10 +30,7 @@
import java.net.UnknownHostException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
-import java.security.AccessControlContext;
-import java.security.AccessController;
-import java.security.KeyStore;
-import java.security.SecureRandom;
+import java.security.*;
import java.sql.*;
import java.util.*;
import java.util.Map.Entry;
@@ -43,6 +40,7 @@
import javax.net.ssl.TrustManagerFactory;
import javax.security.auth.Subject;
import javax.security.sasl.Sasl;
+import org.apache.commons.lang3.ClassUtils;
import org.apache.commons.lang3.StringUtils;
import org.apache.hive.service.rpc.thrift.*;
import org.apache.http.HttpRequestInterceptor;
@@ -106,9 +104,11 @@ public class KyuubiConnection implements SQLConnection, KyuubiLoggable {
private Thread engineLogThread;
private boolean engineLogInflight = true;
private volatile boolean launchEngineOpCompleted = false;
+ private boolean launchEngineOpSupportResult = false;
private String engineId = "";
private String engineName = "";
private String engineUrl = "";
+ private String engineRefId = "";
private boolean isBeeLineMode;
@@ -733,11 +733,24 @@ private void openSession() throws SQLException {
if (sessVars.containsKey(HS2_PROXY_USER)) {
openConf.put(HS2_PROXY_USER, sessVars.get(HS2_PROXY_USER));
}
+ String clientProtocolStr =
+ sessVars.getOrDefault(
+ CLIENT_PROTOCOL_VERSION, openReq.getClient_protocol().getValue() + "");
+ TProtocolVersion clientProtocol =
+ TProtocolVersion.findByValue(Integer.parseInt(clientProtocolStr));
+ if (clientProtocol == null) {
+ throw new IllegalArgumentException(
+ String.format(
+ "Unsupported Hive2 protocol version %s specified by session conf key %s",
+ clientProtocolStr, CLIENT_PROTOCOL_VERSION));
+ }
+ openReq.setClient_protocol(clientProtocol);
try {
openConf.put("kyuubi.client.ipAddress", InetAddress.getLocalHost().getHostAddress());
} catch (UnknownHostException e) {
LOG.debug("Error getting Kyuubi session local client ip address", e);
}
+ openConf.put(Utils.KYUUBI_CLIENT_VERSION_KEY, Utils.getVersion());
openReq.setConfiguration(openConf);
// Store the user name in the open request in case no non-sasl authentication
@@ -770,6 +783,10 @@ private void openSession() throws SQLException {
String launchEngineOpHandleSecret =
openRespConf.get("kyuubi.session.engine.launch.handle.secret");
+ launchEngineOpSupportResult =
+ Boolean.parseBoolean(
+ openRespConf.getOrDefault("kyuubi.session.engine.launch.support.result", "false"));
+
if (launchEngineOpHandleGuid != null && launchEngineOpHandleSecret != null) {
try {
byte[] guidBytes = Base64.getMimeDecoder().decode(launchEngineOpHandleGuid);
@@ -812,11 +829,16 @@ private boolean isSaslAuthMode() {
return !AUTH_SIMPLE.equalsIgnoreCase(sessConfMap.get(AUTH_TYPE));
}
- private boolean isFromSubjectAuthMode() {
- return isSaslAuthMode()
- && hasSessionValue(AUTH_PRINCIPAL)
- && AUTH_KERBEROS_AUTH_TYPE_FROM_SUBJECT.equalsIgnoreCase(
- sessConfMap.get(AUTH_KERBEROS_AUTH_TYPE));
+ private boolean isHadoopUserGroupInformationDoAs() {
+ try {
+ @SuppressWarnings("unchecked")
+ Class<? extends Principal> HadoopUserClz =
+ (Class<? extends Principal>) ClassUtils.getClass("org.apache.hadoop.security.User");
+ Subject subject = Subject.getSubject(AccessController.getContext());
+ return subject != null && !subject.getPrincipals(HadoopUserClz).isEmpty();
+ } catch (ClassNotFoundException e) {
+ return false;
+ }
}
private boolean isKeytabAuthMode() {
@@ -826,6 +848,16 @@ && hasSessionValue(AUTH_KYUUBI_CLIENT_PRINCIPAL)
&& hasSessionValue(AUTH_KYUUBI_CLIENT_KEYTAB);
}
+ private boolean isFromSubjectAuthMode() {
+ return isSaslAuthMode()
+ && hasSessionValue(AUTH_PRINCIPAL)
+ && !hasSessionValue(AUTH_KYUUBI_CLIENT_PRINCIPAL)
+ && !hasSessionValue(AUTH_KYUUBI_CLIENT_KEYTAB)
+ && (AUTH_KERBEROS_AUTH_TYPE_FROM_SUBJECT.equalsIgnoreCase(
+ sessConfMap.get(AUTH_KERBEROS_AUTH_TYPE))
+ || isHadoopUserGroupInformationDoAs());
+ }
+
private boolean isTgtCacheAuthMode() {
return isSaslAuthMode()
&& hasSessionValue(AUTH_PRINCIPAL)
@@ -842,15 +874,15 @@ private boolean isKerberosAuthMode() {
}
private Subject createSubject() {
- if (isFromSubjectAuthMode()) {
+ if (isKeytabAuthMode()) {
+ String principal = sessConfMap.get(AUTH_KYUUBI_CLIENT_PRINCIPAL);
+ String keytab = sessConfMap.get(AUTH_KYUUBI_CLIENT_KEYTAB);
+ return KerberosAuthenticationManager.getKeytabAuthentication(principal, keytab).getSubject();
+ } else if (isFromSubjectAuthMode()) {
AccessControlContext context = AccessController.getContext();
return Subject.getSubject(context);
} else if (isTgtCacheAuthMode()) {
return KerberosAuthenticationManager.getTgtCacheAuthentication().getSubject();
- } else if (isKeytabAuthMode()) {
- String principal = sessConfMap.get(AUTH_KYUUBI_CLIENT_PRINCIPAL);
- String keytab = sessConfMap.get(AUTH_KYUUBI_CLIENT_KEYTAB);
- return KerberosAuthenticationManager.getKeytabAuthentication(principal, keytab).getSubject();
} else {
// This should never happen
throw new IllegalArgumentException("Unsupported auth mode");
@@ -1338,7 +1370,7 @@ public void waitLaunchEngineToComplete() throws SQLException {
}
private void fetchLaunchEngineResult() {
- if (launchEngineOpHandle == null) return;
+ if (launchEngineOpHandle == null || !launchEngineOpSupportResult) return;
TFetchResultsReq tFetchResultsReq =
new TFetchResultsReq(
@@ -1356,6 +1388,8 @@ private void fetchLaunchEngineResult() {
engineName = value;
} else if ("url".equals(key)) {
engineUrl = value;
+ } else if ("refId".equals(key)) {
+ engineRefId = value;
}
}
} catch (Exception e) {
@@ -1374,4 +1408,8 @@ public String getEngineName() {
public String getEngineUrl() {
return engineUrl;
}
+
+ public String getEngineRefId() {
+ return engineRefId;
+ }
}
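Reviewer note: the reordering in `createSubject` means an explicit client principal and keytab now take precedence over a Subject found in the calling context. A simplified sketch of that precedence is shown below. The configuration keys (`clientPrincipal`, `clientKeytab`, `kerberosAuthType`) and the `insideHadoopDoAs` flag are hypothetical placeholders, the SASL and server-principal checks are omitted, and the unconditional fallback to the TGT cache is a simplification of this sketch rather than the driver's behavior.

```java
import java.util.Map;

class AuthModeOrder {
  enum Mode { KEYTAB, FROM_SUBJECT, TGT_CACHE }

  // conf keys are illustrative only; they are not the driver's real constant values.
  static Mode resolve(Map<String, String> conf, boolean insideHadoopDoAs) {
    boolean hasPrincipal = conf.containsKey("clientPrincipal");
    boolean hasKeytab = conf.containsKey("clientKeytab");
    // 1. An explicit keytab login wins over everything else.
    if (hasPrincipal && hasKeytab) {
      return Mode.KEYTAB;
    }
    // 2. The caller's Subject is used only when no keytab settings are present.
    boolean fromSubject = "fromSubject".equalsIgnoreCase(conf.get("kerberosAuthType"));
    if (!hasPrincipal && !hasKeytab && (fromSubject || insideHadoopDoAs)) {
      return Mode.FROM_SUBJECT;
    }
    // 3. Simplified fallback for this sketch.
    return Mode.TGT_CACHE;
  }
}
```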
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiDatabaseMetaData.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiDatabaseMetaData.java
index f5e29f8e7..c6ab3a277 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiDatabaseMetaData.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiDatabaseMetaData.java
@@ -531,7 +531,7 @@ public ResultSet getProcedureColumns(
@Override
public String getProcedureTerm() throws SQLException {
- return new String("UDF");
+ return "UDF";
}
@Override
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiPreparedStatement.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiPreparedStatement.java
index 43c2a030b..1e53f9401 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiPreparedStatement.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiPreparedStatement.java
@@ -26,9 +26,7 @@
import java.sql.Timestamp;
import java.sql.Types;
import java.text.MessageFormat;
-import java.util.ArrayList;
import java.util.HashMap;
-import java.util.List;
import java.util.Scanner;
import org.apache.hive.service.rpc.thrift.TCLIService;
import org.apache.hive.service.rpc.thrift.TSessionHandle;
@@ -81,57 +79,7 @@ public int executeUpdate() throws SQLException {
/** update the SQL string with parameters set by setXXX methods of {@link PreparedStatement} */
private String updateSql(final String sql, HashMap<Integer, String> parameters)
throws SQLException {
- List<String> parts = splitSqlStatement(sql);
-
- StringBuilder newSql = new StringBuilder(parts.get(0));
- for (int i = 1; i < parts.size(); i++) {
- if (!parameters.containsKey(i)) {
- throw new KyuubiSQLException("Parameter #" + i + " is unset");
- }
- newSql.append(parameters.get(i));
- newSql.append(parts.get(i));
- }
- return newSql.toString();
- }
-
- /**
- * Splits the parametered sql statement at parameter boundaries.
- *
- * <p>taking into account ' and \ escaping.
- *
- * <p>output for: 'select 1 from ? where a = ?' ['select 1 from ',' where a = ','']
- */
- private List<String> splitSqlStatement(String sql) {
- List<String> parts = new ArrayList<>();
- int apCount = 0;
- int off = 0;
- boolean skip = false;
-
- for (int i = 0; i < sql.length(); i++) {
- char c = sql.charAt(i);
- if (skip) {
- skip = false;
- continue;
- }
- switch (c) {
- case '\'':
- apCount++;
- break;
- case '\\':
- skip = true;
- break;
- case '?':
- if ((apCount & 1) == 0) {
- parts.add(sql.substring(off, i));
- off = i + 1;
- }
- break;
- default:
- break;
- }
- }
- parts.add(sql.substring(off, sql.length()));
- return parts;
+ return Utils.updateSql(sql, parameters);
}
@Override
@@ -220,7 +168,7 @@ public void setObject(int parameterIndex, Object x) throws SQLException {
// Can't infer a type.
throw new KyuubiSQLException(
MessageFormat.format(
- "Can't infer the SQL type to use for an instance of {0}. Use setObject() with an explicit Types value to specify the type to use.",
+ "Cannot infer the SQL type to use for an instance of {0}. Use setObject() with an explicit Types value to specify the type to use.",
x.getClass().getName()));
}
}
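Reviewer note: the behavior of the split-and-substitute logic that `updateSql` now delegates to can be illustrated with a compact sketch. The class and method names are illustrative, and `IllegalStateException` stands in for `KyuubiSQLException` so the example has no driver dependency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SqlParamSketch {
  // Split at '?' placeholders that are outside single-quoted literals;
  // a backslash escapes the following character.
  static List<String> split(String sql) {
    List<String> parts = new ArrayList<>();
    int quoteCount = 0;
    int off = 0;
    boolean skip = false;
    for (int i = 0; i < sql.length(); i++) {
      char c = sql.charAt(i);
      if (skip) {
        skip = false;
        continue;
      }
      if (c == '\'') {
        quoteCount++;
      } else if (c == '\\') {
        skip = true;
      } else if (c == '?' && (quoteCount & 1) == 0) {
        parts.add(sql.substring(off, i));
        off = i + 1;
      }
    }
    parts.add(sql.substring(off));
    return parts;
  }

  // Parameters are 1-based, as in java.sql.PreparedStatement.
  static String bind(String sql, Map<Integer, String> parameters) {
    List<String> parts = split(sql);
    StringBuilder out = new StringBuilder(parts.get(0));
    for (int i = 1; i < parts.size(); i++) {
      if (!parameters.containsKey(i)) {
        throw new IllegalStateException("Parameter #" + i + " is unset");
      }
      out.append(parameters.get(i)).append(parts.get(i));
    }
    return out.toString();
  }
}
```

A `?` inside a quoted literal is left alone, so `a = ? and b = '?'` has exactly one bindable parameter.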
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiQueryResultSet.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiQueryResultSet.java
index f06ada5d4..242ec7720 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiQueryResultSet.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiQueryResultSet.java
@@ -26,6 +26,7 @@
import org.apache.kyuubi.jdbc.hive.cli.RowSet;
import org.apache.kyuubi.jdbc.hive.cli.RowSetFactory;
import org.apache.kyuubi.jdbc.hive.common.HiveDecimal;
+import org.apache.thrift.TException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
@@ -47,6 +48,7 @@ public class KyuubiQueryResultSet extends KyuubiBaseResultSet {
private boolean emptyResultSet = false;
private boolean isScrollable = false;
private boolean fetchFirst = false;
+ private boolean hasMoreToFetch = false;
private final TProtocolVersion protocol;
@@ -223,9 +225,6 @@ private void retrieveSchema() throws SQLException {
metadataResp = client.GetResultSetMetadata(metadataReq);
Utils.verifySuccess(metadataResp.getStatus());
- StringBuilder namesSb = new StringBuilder();
- StringBuilder typesSb = new StringBuilder();
-
TTableSchema schema = metadataResp.getSchema();
if (schema == null || !schema.isSetColumns()) {
// TODO: should probably throw an exception here.
@@ -235,10 +234,6 @@ private void retrieveSchema() throws SQLException {
List<TColumnDesc> columns = schema.getColumns();
for (int pos = 0; pos < schema.getColumnsSize(); pos++) {
- if (pos != 0) {
- namesSb.append(",");
- typesSb.append(",");
- }
String columnName = columns.get(pos).getColumnName();
columnNames.add(columnName);
normalizedColumnNames.add(columnName.toLowerCase());
@@ -324,25 +319,20 @@ public boolean next() throws SQLException {
try {
TFetchOrientation orientation = TFetchOrientation.FETCH_NEXT;
if (fetchFirst) {
- // If we are asked to start from begining, clear the current fetched resultset
+ // If we are asked to start from beginning, clear the current fetched resultset
orientation = TFetchOrientation.FETCH_FIRST;
fetchedRows = null;
fetchedRowsItr = null;
fetchFirst = false;
}
if (fetchedRows == null || !fetchedRowsItr.hasNext()) {
- TFetchResultsReq fetchReq = new TFetchResultsReq(stmtHandle, orientation, fetchSize);
- TFetchResultsResp fetchResp;
- fetchResp = client.FetchResults(fetchReq);
- Utils.verifySuccessWithInfo(fetchResp.getStatus());
-
- TRowSet results = fetchResp.getResults();
- fetchedRows = RowSetFactory.create(results, protocol);
- fetchedRowsItr = fetchedRows.iterator();
+ fetchResult(orientation);
}
if (fetchedRowsItr.hasNext()) {
row = fetchedRowsItr.next();
+ } else if (hasMoreToFetch) {
+ fetchResult(orientation);
} else {
return false;
}
@@ -357,6 +347,18 @@ public boolean next() throws SQLException {
return true;
}
+ private void fetchResult(TFetchOrientation orientation) throws SQLException, TException {
+ TFetchResultsReq fetchReq = new TFetchResultsReq(stmtHandle, orientation, fetchSize);
+ TFetchResultsResp fetchResp;
+ fetchResp = client.FetchResults(fetchReq);
+ Utils.verifySuccessWithInfo(fetchResp.getStatus());
+ hasMoreToFetch = fetchResp.isSetHasMoreRows() && fetchResp.isHasMoreRows();
+
+ TRowSet results = fetchResp.getResults();
+ fetchedRows = RowSetFactory.create(results, protocol);
+ fetchedRowsItr = fetchedRows.iterator();
+ }
+
@Override
public ResultSetMetaData getMetaData() throws SQLException {
if (isClosed) {
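Reviewer note: the `hasMoreToFetch` flag lets the client distinguish "this batch is empty but more rows will follow" from "the result is exhausted". The sketch below shows one loop-based way to honor such a flag; it is not the driver's code. `Batch` and the canned response queue are hypothetical stand-ins for `TFetchResultsResp` and the Thrift client.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

class PagedCursor {
  // Hypothetical fetch response: a batch of rows plus a "has more rows" flag.
  static final class Batch {
    final List<String> rows;
    final boolean hasMore;

    Batch(List<String> rows, boolean hasMore) {
      this.rows = rows;
      this.hasMore = hasMore;
    }
  }

  private final Deque<Batch> server; // canned responses standing in for the server
  private Iterator<String> rowsItr = Collections.emptyIterator();
  private boolean hasMoreToFetch = false;
  String current;

  PagedCursor(List<Batch> responses) {
    this.server = new ArrayDeque<>(responses);
  }

  private void fetch() {
    Batch b = server.isEmpty() ? new Batch(Collections.emptyList(), false) : server.poll();
    hasMoreToFetch = b.hasMore;
    rowsItr = b.rows.iterator();
  }

  boolean next() {
    if (!rowsItr.hasNext()) {
      fetch();
      // An empty batch ends the result only if the server did not promise more rows.
      while (!rowsItr.hasNext() && hasMoreToFetch) {
        fetch();
      }
    }
    if (!rowsItr.hasNext()) {
      return false;
    }
    current = rowsItr.next();
    return true;
  }
}
```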
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiSQLException.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiSQLException.java
index 1ac0adf04..7d26f8078 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiSQLException.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiSQLException.java
@@ -21,6 +21,7 @@
import java.util.ArrayList;
import java.util.List;
import org.apache.hive.service.rpc.thrift.TStatus;
+import org.apache.kyuubi.util.reflect.DynConstructors;
public class KyuubiSQLException extends SQLException {
@@ -186,7 +187,10 @@ private static Throwable toStackTrace(
private static Throwable newInstance(String className, String message) {
try {
- return (Throwable) Class.forName(className).getConstructor(String.class).newInstance(message);
+ return DynConstructors.builder()
+ .impl(className, String.class)
+ .buildChecked()
+ .newInstance(message);
} catch (Exception e) {
return new RuntimeException(className + ":" + message);
}
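Reviewer note: the contract of `newInstance` is "build the named exception with a message, or fall back to a `RuntimeException` that still carries the class name". The sketch below shows that contract with plain `java.lang.reflect` calls instead of `DynConstructors`; the class name `ThrowableFactory` is illustrative.

```java
class ThrowableFactory {
  static Throwable newInstance(String className, String message) {
    try {
      // Look up a public (String) constructor on the named class.
      return (Throwable)
          Class.forName(className).getConstructor(String.class).newInstance(message);
    } catch (Exception e) {
      // Unknown class, no matching constructor, or not a Throwable:
      // keep the information in a generic exception instead of failing.
      return new RuntimeException(className + ":" + message);
    }
  }
}
```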
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiStatement.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiStatement.java
index ab7c06a55..cbe32eca6 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiStatement.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/KyuubiStatement.java
@@ -37,6 +37,7 @@ public class KyuubiStatement implements SQLStatement, KyuubiLoggable {
public static final Logger LOG = LoggerFactory.getLogger(KyuubiStatement.class.getName());
public static final int DEFAULT_FETCH_SIZE = 1000;
public static final String DEFAULT_RESULT_FORMAT = "thrift";
+ public static final String DEFAULT_ARROW_TIMESTAMP_AS_STRING = "false";
private final KyuubiConnection connection;
private TCLIService.Iface client;
private TOperationHandle stmtHandle = null;
@@ -45,7 +46,8 @@ public class KyuubiStatement implements SQLStatement, KyuubiLoggable {
private int fetchSize = DEFAULT_FETCH_SIZE;
private boolean isScrollableResultset = false;
private boolean isOperationComplete = false;
- private Map<String, String> properties = new HashMap<>();
+
+ private Map<String, String> properties = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
/**
* We need to keep a reference to the result set to support the following:
* statement.execute(String sql);
@@ -210,9 +212,14 @@ private boolean executeWithConfOverlay(String sql, Map confOverl
String resultFormat =
properties.getOrDefault("__kyuubi_operation_result_format__", DEFAULT_RESULT_FORMAT);
- LOG.info("kyuubi.operation.result.format: " + resultFormat);
+ LOG.debug("kyuubi.operation.result.format: {}", resultFormat);
switch (resultFormat) {
case "arrow":
+ boolean timestampAsString =
+ Boolean.parseBoolean(
+ properties.getOrDefault(
+ "__kyuubi_operation_result_arrow_timestampAsString__",
+ DEFAULT_ARROW_TIMESTAMP_AS_STRING));
resultSet =
new KyuubiArrowQueryResultSet.Builder(this)
.setClient(client)
@@ -222,6 +229,7 @@ private boolean executeWithConfOverlay(String sql, Map confOverl
.setFetchSize(fetchSize)
.setScrollable(isScrollableResultset)
.setSchema(columnNames, columnTypes, columnAttributes)
+ .setTimestampAsString(timestampAsString)
.build();
break;
default:
@@ -267,9 +275,14 @@ public boolean executeAsync(String sql) throws SQLException {
String resultFormat =
properties.getOrDefault("__kyuubi_operation_result_format__", DEFAULT_RESULT_FORMAT);
- LOG.info("kyuubi.operation.result.format: " + resultFormat);
+ LOG.debug("kyuubi.operation.result.format: {}", resultFormat);
switch (resultFormat) {
case "arrow":
+ boolean timestampAsString =
+ Boolean.parseBoolean(
+ properties.getOrDefault(
+ "__kyuubi_operation_result_arrow_timestampAsString__",
+ DEFAULT_ARROW_TIMESTAMP_AS_STRING));
resultSet =
new KyuubiArrowQueryResultSet.Builder(this)
.setClient(client)
@@ -279,6 +292,7 @@ public boolean executeAsync(String sql) throws SQLException {
.setFetchSize(fetchSize)
.setScrollable(isScrollableResultset)
.setSchema(columnNames, columnTypes, columnAttributes)
+ .setTimestampAsString(timestampAsString)
.build();
break;
default:
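Reviewer note: switching `properties` to a `TreeMap` with `String.CASE_INSENSITIVE_ORDER` matters for mixed-case keys such as `__kyuubi_operation_result_arrow_timestampAsString__`. The sketch below shows the lookup succeeding regardless of the casing under which the key was stored; the helper names are illustrative and the scenario of a lowercased key is an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

class HintProperties {
  static final String KEY = "__kyuubi_operation_result_arrow_timestampAsString__";

  // Case-insensitive map: keys that differ only in case address the same entry.
  static Map<String, String> newProperties() {
    return new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
  }

  static boolean timestampAsString(Map<String, String> properties) {
    return Boolean.parseBoolean(properties.getOrDefault(KEY, "false"));
  }
}
```

A plain `HashMap` would miss the entry if the key had been stored in lower case, and the default `"false"` would silently apply.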
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/Utils.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/Utils.java
index c5b197f13..135c38d8e 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/Utils.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/Utils.java
@@ -22,10 +22,12 @@
import java.net.InetAddress;
import java.net.URI;
import java.net.UnknownHostException;
+import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
+import org.apache.commons.lang3.StringUtils;
import org.apache.hive.service.rpc.thrift.TStatus;
import org.apache.hive.service.rpc.thrift.TStatusCode;
import org.slf4j.Logger;
@@ -88,6 +90,62 @@ static void verifySuccess(TStatus status, boolean withInfo) throws SQLException
throw new KyuubiSQLException(status);
}
+ /**
+ * Splits the parametered sql statement at parameter boundaries.
+ *
+ * <p>taking into account ' and \ escaping.
+ *
+ * <p>output for: 'select 1 from ? where a = ?' ['select 1 from ',' where a = ','']
+ */
+ static List<String> splitSqlStatement(String sql) {
+ List<String> parts = new ArrayList<>();
+ int apCount = 0;
+ int off = 0;
+ boolean skip = false;
+
+ for (int i = 0; i < sql.length(); i++) {
+ char c = sql.charAt(i);
+ if (skip) {
+ skip = false;
+ continue;
+ }
+ switch (c) {
+ case '\'':
+ apCount++;
+ break;
+ case '\\':
+ skip = true;
+ break;
+ case '?':
+ if ((apCount & 1) == 0) {
+ parts.add(sql.substring(off, i));
+ off = i + 1;
+ }
+ break;
+ default:
+ break;
+ }
+ }
+ parts.add(sql.substring(off));
+ return parts;
+ }
+
+ /** update the SQL string with parameters set by setXXX methods of {@link PreparedStatement} */
+ public static String updateSql(final String sql, HashMap<Integer, String> parameters)
+ throws SQLException {
+ List<String> parts = splitSqlStatement(sql);
+
+ StringBuilder newSql = new StringBuilder(parts.get(0));
+ for (int i = 1; i < parts.size(); i++) {
+ if (!parameters.containsKey(i)) {
+ throw new KyuubiSQLException("Parameter #" + i + " is unset");
+ }
+ newSql.append(parameters.get(i));
+ newSql.append(parts.get(i));
+ }
+ return newSql.toString();
+ }
+
public static JdbcConnectionParams parseURL(String uri)
throws JdbcUriParseException, SQLException, ZooKeeperHiveClientException {
return parseURL(uri, new Properties());
@@ -193,12 +251,20 @@ public static JdbcConnectionParams extractURLComponents(String uri, Properties i
}
}
+ Pattern confPattern = Pattern.compile("([^;]*)([^;]*);?");
+
// parse hive conf settings
String confStr = jdbcURI.getQuery();
if (confStr != null) {
- Matcher confMatcher = pattern.matcher(confStr);
+ Matcher confMatcher = confPattern.matcher(confStr);
while (confMatcher.find()) {
- connParams.getHiveConfs().put(confMatcher.group(1), confMatcher.group(2));
+ String connParam = confMatcher.group(1);
+ if (StringUtils.isNotBlank(connParam) && connParam.contains("=")) {
+ int symbolIndex = connParam.indexOf('=');
+ connParams
+ .getHiveConfs()
+ .put(connParam.substring(0, symbolIndex), connParam.substring(symbolIndex + 1));
+ }
}
}
@@ -226,6 +292,13 @@ public static JdbcConnectionParams extractURLComponents(String uri, Properties i
}
}
}
+ if (!connParams.getSessionVars().containsKey(CLIENT_PROTOCOL_VERSION)) {
+ if (info.containsKey(CLIENT_PROTOCOL_VERSION)) {
+ connParams
+ .getSessionVars()
+ .put(CLIENT_PROTOCOL_VERSION, info.getProperty(CLIENT_PROTOCOL_VERSION));
+ }
+ }
// Extract user/password from JDBC connection properties if its not supplied
// in the connection URL
if (!connParams.getSessionVars().containsKey(AUTH_USER)) {
@@ -477,4 +550,24 @@ public static String getCanonicalHostName(String hostName) {
public static boolean isKyuubiOperationHint(String hint) {
return KYUUBI_OPERATION_HINT_PATTERN.matcher(hint).matches();
}
+
+ public static final String KYUUBI_CLIENT_VERSION_KEY = "kyuubi.client.version";
+ private static String KYUUBI_CLIENT_VERSION;
+
+ public static synchronized String getVersion() {
+ if (KYUUBI_CLIENT_VERSION == null) {
+ try {
+ Properties prop = new Properties();
+ prop.load(
+ Utils.class
+ .getClassLoader()
+ .getResourceAsStream("org/apache/kyuubi/version.properties"));
+ KYUUBI_CLIENT_VERSION = prop.getProperty(KYUUBI_CLIENT_VERSION_KEY, "unknown");
+ } catch (Exception e) {
+ LOG.error("Error getting kyuubi client version", e);
+ KYUUBI_CLIENT_VERSION = "unknown";
+ }
+ }
+ return KYUUBI_CLIENT_VERSION;
+ }
}
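Reviewer note: the new conf parsing in `extractURLComponents` takes each `;`-separated segment and splits it at the first `=`, so a value may itself contain `=`. A minimal sketch of that approach follows. The pattern is a simplified form of the one in the patch, `String.trim()` replaces `StringUtils.isNotBlank`, and the class name is illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConfQueryParser {
  // One segment per match: everything up to the next ';'.
  private static final Pattern SEGMENT = Pattern.compile("([^;]*);?");

  static Map<String, String> parse(String confStr) {
    Map<String, String> confs = new LinkedHashMap<>();
    Matcher m = SEGMENT.matcher(confStr);
    while (m.find()) {
      String segment = m.group(1);
      int eq = segment.indexOf('=');
      // Skip blank segments and segments without a key/value separator.
      if (!segment.trim().isEmpty() && eq >= 0) {
        confs.put(segment.substring(0, eq), segment.substring(eq + 1));
      }
    }
    return confs;
  }
}
```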
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/ZooKeeperHiveClientHelper.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/ZooKeeperHiveClientHelper.java
index 349fc8dfb..948fd3334 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/ZooKeeperHiveClientHelper.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/ZooKeeperHiveClientHelper.java
@@ -17,27 +17,30 @@
package org.apache.kyuubi.jdbc.hive;
+import com.google.common.annotations.VisibleForTesting;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
-import java.util.Random;
+import java.util.concurrent.ThreadLocalRandom;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
-import org.apache.curator.framework.CuratorFramework;
-import org.apache.curator.framework.CuratorFrameworkFactory;
-import org.apache.curator.retry.ExponentialBackoffRetry;
+import org.apache.kyuubi.shaded.curator.framework.CuratorFramework;
+import org.apache.kyuubi.shaded.curator.framework.CuratorFrameworkFactory;
+import org.apache.kyuubi.shaded.curator.retry.ExponentialBackoffRetry;
class ZooKeeperHiveClientHelper {
// Pattern for key1=value1;key2=value2
private static final Pattern kvPattern = Pattern.compile("([^=;]*)=([^;]*);?");
- private static String getZooKeeperNamespace(JdbcConnectionParams connParams) {
+ @VisibleForTesting
+ protected static String getZooKeeperNamespace(JdbcConnectionParams connParams) {
String zooKeeperNamespace =
connParams.getSessionVars().get(JdbcConnectionParams.ZOOKEEPER_NAMESPACE);
if ((zooKeeperNamespace == null) || (zooKeeperNamespace.isEmpty())) {
zooKeeperNamespace = JdbcConnectionParams.ZOOKEEPER_DEFAULT_NAMESPACE;
}
+ zooKeeperNamespace = zooKeeperNamespace.replaceAll("^/+", "").replaceAll("/+$", "");
return zooKeeperNamespace;
}
@@ -108,7 +111,7 @@ static void configureConnParams(JdbcConnectionParams connParams)
try (CuratorFramework zooKeeperClient = getZkClient(connParams)) {
List<String> serverHosts = getServerHosts(connParams, zooKeeperClient);
// Now pick a server node randomly
- String serverNode = serverHosts.get(new Random().nextInt(serverHosts.size()));
+ String serverNode = serverHosts.get(ThreadLocalRandom.current().nextInt(serverHosts.size()));
updateParamsWithZKServerNode(connParams, zooKeeperClient, serverNode);
} catch (Exception e) {
throw new ZooKeeperHiveClientException(
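Reviewer note: the namespace normalization added to `getZooKeeperNamespace` strips any run of leading and trailing slashes while leaving interior slashes intact. A self-contained sketch is below; the default namespace is passed in as a parameter because the value of `ZOOKEEPER_DEFAULT_NAMESPACE` is not shown in this patch.

```java
class ZkNamespace {
  static String normalize(String namespace, String defaultNamespace) {
    String ns = (namespace == null || namespace.isEmpty()) ? defaultNamespace : namespace;
    // "^/+" removes leading slashes, "/+$" removes trailing ones.
    return ns.replaceAll("^/+", "").replaceAll("/+$", "");
  }
}
```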
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/arrow/ArrowColumnarBatchRow.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/arrow/ArrowColumnarBatchRow.java
index 20ed55a1d..373867069 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/arrow/ArrowColumnarBatchRow.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/arrow/ArrowColumnarBatchRow.java
@@ -19,6 +19,8 @@
import java.math.BigDecimal;
import java.sql.Timestamp;
+import java.time.LocalDateTime;
+import org.apache.arrow.vector.util.DateUtility;
import org.apache.hive.service.rpc.thrift.TTypeId;
import org.apache.kyuubi.jdbc.hive.common.DateUtils;
import org.apache.kyuubi.jdbc.hive.common.HiveIntervalDayTime;
@@ -104,7 +106,11 @@ public Object getMap(int ordinal) {
throw new UnsupportedOperationException();
}
- public Object get(int ordinal, TTypeId dataType) {
+ public Object get(int ordinal, TTypeId dataType, String timeZone, boolean timestampAsString) {
+ long seconds;
+ long milliseconds;
+ long microseconds;
+ int nanos;
switch (dataType) {
case BOOLEAN_TYPE:
return getBoolean(ordinal);
@@ -127,13 +133,19 @@ public Object get(int ordinal, TTypeId dataType) {
case STRING_TYPE:
return getString(ordinal);
case TIMESTAMP_TYPE:
- return new Timestamp(getLong(ordinal) / 1000);
+ if (timestampAsString) {
+ return Timestamp.valueOf(getString(ordinal));
+ } else {
+ LocalDateTime localDateTime =
+ DateUtility.getLocalDateTimeFromEpochMicro(getLong(ordinal), timeZone);
+ return Timestamp.valueOf(localDateTime);
+ }
case DATE_TYPE:
return DateUtils.internalToDate(getInt(ordinal));
case INTERVAL_DAY_TIME_TYPE:
- long microseconds = getLong(ordinal);
- long seconds = microseconds / 1000000;
- int nanos = (int) (microseconds % 1000000) * 1000;
+ microseconds = getLong(ordinal);
+ seconds = microseconds / 1_000_000;
+ nanos = (int) (microseconds % 1_000_000) * 1_000;
return new HiveIntervalDayTime(seconds, nanos);
case INTERVAL_YEAR_MONTH_TYPE:
return new HiveIntervalYearMonth(getInt(ordinal));
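Reviewer note: the timestamp branch now interprets the Arrow value as microseconds since the epoch and renders it in a given time zone. The conversion can be sketched with `java.time` alone, as below. This is an approximation of the idea, not Arrow's `DateUtility` implementation, and the class name is illustrative.

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;

class EpochMicros {
  static Timestamp toTimestamp(long epochMicros, String timeZone) {
    // Keep full microsecond precision instead of truncating to milliseconds.
    Instant instant = Instant.EPOCH.plus(epochMicros, ChronoUnit.MICROS);
    LocalDateTime local = LocalDateTime.ofInstant(instant, ZoneId.of(timeZone));
    return Timestamp.valueOf(local);
  }
}
```

Dividing the raw value by 1000 and passing it to `new Timestamp(long)` would drop the sub-millisecond digits and render the value in the JVM default zone, which is the behavior this branch replaces.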
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/auth/HttpKerberosRequestInterceptor.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/auth/HttpKerberosRequestInterceptor.java
index 278cef0b4..02d168c3f 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/auth/HttpKerberosRequestInterceptor.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/auth/HttpKerberosRequestInterceptor.java
@@ -65,7 +65,7 @@ protected void addHttpAuthHeader(HttpRequest httpRequest, HttpContext httpContex
httpRequest.addHeader(
HttpAuthUtils.AUTHORIZATION, HttpAuthUtils.NEGOTIATE + " " + kerberosAuthHeader);
} catch (Exception e) {
- throw new HttpException(e.getMessage(), e);
+ throw new HttpException(e.getMessage() == null ? "" : e.getMessage(), e);
} finally {
kerberosLock.unlock();
}
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/auth/HttpRequestInterceptorBase.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/auth/HttpRequestInterceptorBase.java
index 9ce5a330b..42641c219 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/auth/HttpRequestInterceptorBase.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/auth/HttpRequestInterceptorBase.java
@@ -110,7 +110,7 @@ public void process(HttpRequest httpRequest, HttpContext httpContext)
httpRequest.addHeader("Cookie", cookieHeaderKeyValues.toString());
}
} catch (Exception e) {
- throw new HttpException(e.getMessage(), e);
+ throw new HttpException(e.getMessage() == null ? "" : e.getMessage(), e);
}
}
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/cli/ColumnBuffer.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/cli/ColumnBuffer.java
index e703cb1f0..bd5124f95 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/cli/ColumnBuffer.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/cli/ColumnBuffer.java
@@ -228,8 +228,9 @@ public Object get(int index) {
return stringVars.get(index);
case BINARY_TYPE:
return binaryVars.get(index).array();
+ default:
+ return null;
}
- return null;
}
@Override
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/Date.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/Date.java
index 1b49c268a..720c7517f 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/Date.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/Date.java
@@ -65,6 +65,7 @@ public String toString() {
return localDate.format(PRINT_FORMATTER);
}
+ @Override
public int hashCode() {
return localDate.hashCode();
}
@@ -164,6 +165,7 @@ public int getDayOfWeek() {
}
/** Return a copy of this object. */
+ @Override
public Object clone() {
// LocalDateTime is immutable.
return new Date(this.localDate);
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/FastHiveDecimalImpl.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/FastHiveDecimalImpl.java
index d3dba0f7b..65f17e734 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/FastHiveDecimalImpl.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/FastHiveDecimalImpl.java
@@ -5182,7 +5182,6 @@ public static boolean fastRoundIntegerDown(
fastResult.fastIntegerDigitCount = 0;
fastResult.fastScale = 0;
} else {
- fastResult.fastSignum = 0;
fastResult.fastSignum = fastSignum;
fastResult.fastIntegerDigitCount = fastRawPrecision(fastResult);
fastResult.fastScale = 0;
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/Timestamp.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/Timestamp.java
index cdb6b10ce..7e02835b7 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/Timestamp.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/Timestamp.java
@@ -95,6 +95,7 @@ public String toString() {
return localDateTime.format(PRINT_FORMATTER);
}
+ @Override
public int hashCode() {
return localDateTime.hashCode();
}
@@ -207,6 +208,7 @@ public int getDayOfWeek() {
}
/** Return a copy of this object. */
+ @Override
public Object clone() {
// LocalDateTime is immutable.
return new Timestamp(this.localDateTime);
diff --git a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/TimestampTZUtil.java b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/TimestampTZUtil.java
index a938e1688..be16926cb 100644
--- a/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/TimestampTZUtil.java
+++ b/kyuubi-hive-jdbc/src/main/java/org/apache/kyuubi/jdbc/hive/common/TimestampTZUtil.java
@@ -98,7 +98,7 @@ private static String handleSingleDigitHourOffset(String s) {
Matcher matcher = SINGLE_DIGIT_PATTERN.matcher(s);
if (matcher.find()) {
int index = matcher.start() + 1;
- s = s.substring(0, index) + "0" + s.substring(index, s.length());
+ s = s.substring(0, index) + "0" + s.substring(index);
}
return s;
}
diff --git a/kyuubi-hive-jdbc/src/main/resources/org/apache/kyuubi/version.properties b/kyuubi-hive-jdbc/src/main/resources/org/apache/kyuubi/version.properties
new file mode 100644
index 000000000..82ae50cfb
--- /dev/null
+++ b/kyuubi-hive-jdbc/src/main/resources/org/apache/kyuubi/version.properties
@@ -0,0 +1,18 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+kyuubi.client.version = ${project.version}
diff --git a/kyuubi-hive-jdbc/src/test/java/org/apache/kyuubi/jdbc/hive/TestJdbcDriver.java b/kyuubi-hive-jdbc/src/test/java/org/apache/kyuubi/jdbc/hive/TestJdbcDriver.java
index 228ad00ee..efdf73092 100644
--- a/kyuubi-hive-jdbc/src/test/java/org/apache/kyuubi/jdbc/hive/TestJdbcDriver.java
+++ b/kyuubi-hive-jdbc/src/test/java/org/apache/kyuubi/jdbc/hive/TestJdbcDriver.java
@@ -24,6 +24,7 @@
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
+import java.nio.file.Files;
import java.util.Arrays;
import java.util.Collection;
import org.junit.AfterClass;
@@ -67,14 +68,14 @@ public static Collection