
Commit a63b8db

ueshin and haoyangeng-db authored and committed
[SPARK-52554][PS] Avoid multiple roundtrips for config check in Spark Connect
### What changes were proposed in this pull request?

Avoids multiple roundtrips for config check in Spark Connect.

### Why are the changes needed?

Some APIs for pandas API on Spark now need to check the server configs, which could cause a performance issue in Spark Connect.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually, and the existing tests should pass.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#51252 from ueshin/issues/SPARK-52554/is_ansi_mode_enabled.

Authored-by: Takuya Ueshin <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
1 parent 37c208c commit a63b8db

File tree

1 file changed

+23
-4
lines changed


python/pyspark/pandas/utils.py

Lines changed: 23 additions & 4 deletions
```diff
@@ -20,6 +20,7 @@

 import functools
 from contextlib import contextmanager
+import json
 import os
 from typing import (
     Any,
@@ -1071,10 +1072,28 @@ def xor(df1: PySparkDataFrame, df2: PySparkDataFrame) -> PySparkDataFrame:


 def is_ansi_mode_enabled(spark: SparkSession) -> bool:
-    return (
-        ps.get_option("compute.ansi_mode_support", spark_session=spark)
-        and spark.conf.get("spark.sql.ansi.enabled") == "true"
-    )
+    if is_remote():
+        from pyspark.sql.connect.session import SparkSession as ConnectSession
+        from pyspark.pandas.config import _key_format, _options_dict
+
+        client = cast(ConnectSession, spark).client
+        (ansi_mode_support, ansi_enabled) = client.get_config_with_defaults(
+            (
+                _key_format("compute.ansi_mode_support"),
+                json.dumps(_options_dict["compute.ansi_mode_support"].default),
+            ),
+            ("spark.sql.ansi.enabled", None),
+        )
+        if ansi_enabled is None:
+            ansi_enabled = spark.conf.get("spark.sql.ansi.enabled")
+            # Explicitly set the default value to reduce the roundtrip for the next time.
+            spark.conf.set("spark.sql.ansi.enabled", ansi_enabled)
+        return json.loads(ansi_mode_support) and ansi_enabled.lower() == "true"
+    else:
+        return (
+            ps.get_option("compute.ansi_mode_support", spark_session=spark)
+            and spark.conf.get("spark.sql.ansi.enabled").lower() == "true"
        )


 def _test() -> None:
```
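The core idea of the patch is that fetching each config key individually costs one client-server roundtrip, while a batched lookup with client-side defaults costs only one for all keys. Below is a minimal, self-contained sketch of that pattern using a hypothetical mock client (not PySpark's actual Connect client; the class and counter are invented for illustration):

```python
# Hypothetical mock illustrating why batching config lookups reduces RPCs.
# `MockConfigClient` and its methods are assumptions for this sketch, not
# part of PySpark's API.
class MockConfigClient:
    def __init__(self, server_confs):
        self._server = dict(server_confs)
        self.roundtrips = 0  # counts simulated RPCs to the server

    def get_config(self, key):
        # Naive path: one roundtrip per key requested.
        self.roundtrips += 1
        return self._server.get(key)

    def get_config_with_defaults(self, *pairs):
        # Batched path: one roundtrip for all (key, default) pairs;
        # the default is used when the server has no value for the key.
        self.roundtrips += 1
        return tuple(self._server.get(key, default) for key, default in pairs)


client = MockConfigClient({"spark.sql.ansi.enabled": "true"})

# Naive: two keys, two roundtrips.
client.get_config("compute.ansi_mode_support")
client.get_config("spark.sql.ansi.enabled")
naive_roundtrips = client.roundtrips

# Batched: both keys resolved in a single roundtrip.
client.roundtrips = 0
support, enabled = client.get_config_with_defaults(
    ("compute.ansi_mode_support", "true"),
    ("spark.sql.ansi.enabled", None),
)
batched_roundtrips = client.roundtrips
```

In the real patch, the fetched default is then written back via `spark.conf.set(...)` so the next check hits the local config cache instead of the server at all.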
