forked from apache/spark
# [CI] Do not upload dockerbuild #15
Open: EnricoMi wants to merge 485 commits into `master` from `ci-do-not-upload-dockerbuild`
Branch updated from `8f41120` to `c286064`
### What changes were proposed in this pull request?

Like what we've improved in apache#50674, this PR introduces TypedConfigBuilder for Java enums and leverages it for existing configurations that use enums as parameters.

Before this PR, we need to change them from Enumeration to string, string to Enumeration, back and forth... We also need to do the upper-case transformation and `.checkValues` validation one by one. After this PR, those steps are centralized.

### Why are the changes needed?

Better support for java-enum-like configurations

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

new tests

### Was this patch authored or co-authored using generative AI tooling?

no

Closes apache#50691 from yaooqinn/SPARK-51896.

Authored-by: Kent Yao <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
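As a rough illustration of the centralization idea (this is not Spark's actual `TypedConfigBuilder` API; all names below are hypothetical), an enum-valued config helper that does the upper-casing and value validation in one place might look like this sketch:

```python
from enum import Enum


class StoreKind(Enum):
    # Hypothetical example enum standing in for a real config's values.
    ROCKSDB = "ROCKSDB"
    HDFS = "HDFS"


class EnumConfigBuilder:
    """Centralizes upper-case transformation and value validation for
    enum-valued configs, instead of repeating them at every call site."""

    def __init__(self, key, enum_cls):
        self.key = key
        self.enum_cls = enum_cls

    def parse(self, raw):
        # Normalize case once, here, rather than per configuration.
        name = raw.strip().upper()
        try:
            return self.enum_cls[name]
        except KeyError:
            valid = ", ".join(e.name for e in self.enum_cls)
            raise ValueError(
                f"Invalid value '{raw}' for {self.key}; must be one of: {valid}"
            )


builder = EnumConfigBuilder("spark.example.store.kind", StoreKind)
print(builder.parse("rocksdb"))  # case-insensitive lookup
```

The point is that callers no longer convert between strings and enums manually or repeat `.checkValues`-style validation per config.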
### What changes were proposed in this pull request?

This PR aims to update `setup-minikube` to the latest version v0.0.19.

### Why are the changes needed?

Currently, we use `v0.0.18` (2024-06-18). We had better use the latest one.
- https://github.com/medyagh/setup-minikube/releases/tag/v0.0.19 (2025-01-23)

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50702 from dongjoon-hyun/SPARK-51908.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Fix python lint

Closes apache#50705 from zhengruifeng/fix_lint_x.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
### What changes were proposed in this pull request?

Add 4 missing functions to API references

### Why are the changes needed?

for docs

### Does this PR introduce _any_ user-facing change?

doc-only change

### How was this patch tested?

CI

### Was this patch authored or co-authored using generative AI tooling?

no

Closes apache#50709 from zhengruifeng/doc_missing_fcs.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
…ug in connect-only mode

### What changes were proposed in this pull request?

Enable SparkConnectDataFrameDebug in connect-only mode

### Why are the changes needed?

to improve test coverage

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

CI

### Was this patch authored or co-authored using generative AI tooling?

no

Closes apache#50710 from zhengruifeng/connect-only-df-debug.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
…roper exception instead of an internal one

### What changes were proposed in this pull request?

The following query throws `Cannot cast NullType to Arraytype`:

```
SELECT get(null, 0);
```

instead of throwing a more user-friendly error. I propose that we fix that.

### Why are the changes needed?

To correct the behavior of the `get` function.

### Does this PR introduce _any_ user-facing change?

Queries that were failing with an internal error now throw a more user-friendly one.

### How was this patch tested?

Added tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50590 from mihailoale-db/getnull.

Authored-by: mihailoale-db <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
…alyzer

### What changes were proposed in this pull request?

Properly throw datatype mismatch in single-pass Analyzer. Currently we don't have a way to pass a resolved operator to `failOnTypeCheckResult`, so we pass `None` - this simply omits the `issueFixedIfAnsiOff` functionality.

### Why are the changes needed?

This improves error message reporting in single-pass Analyzer.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50697 from vladimirg-db/vladimir-golubev_data/throw-datatype-mismatch-in-single-pass-analyzer.

Authored-by: Vladimir Golubev <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
…nd ignoring non-batch files when listing OperatorMetadata files

### What changes were proposed in this pull request?

Currently, we don't want to purge StateSchemaV3 files, so we need to remove the relevant call from MicrobatchExecution. Additionally, we want to ignore any files in the metadata or state schema directory whose names do not parse as a Long (which would otherwise cause a parse exception).

### Why are the changes needed?

We cannot purge schema files because they are necessary until a full rewrite is implemented.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit tests

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#50700 from ericm-db/remove-async-purge.

Authored-by: Eric Marnadi <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
…angelogReaderFactory for v1

### What changes were proposed in this pull request?

Catch the UTFDataFormatException thrown for v1 in the StateStoreChangelogReaderFactory and assign the version to 1.

### Why are the changes needed?

We should not throw this error.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#50721 from liviazhu-db/liviazhu-db/master.

Authored-by: Livia Zhu <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
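The fallback idea can be sketched with a toy Python model (hypothetical file format and names, not the actual `StateStoreChangelogReaderFactory` code): attempt to decode a version header, and treat a decode failure as an old v1 file rather than surfacing the exception:

```python
def detect_changelog_version(header_bytes):
    """Toy model: newer changelog files start with a UTF-8 version tag
    like 'v2'; v1 files carry no such header, so decoding the raw bytes
    can fail. Instead of propagating the decode error, fall back to 1."""
    try:
        text = header_bytes.decode("utf-8")
    except UnicodeDecodeError:
        # Analogous to catching UTFDataFormatException: assume version 1.
        return 1
    if text.startswith("v") and text[1:].isdigit():
        return int(text[1:])
    # Decodable but not a version tag: also treat as the legacy format.
    return 1


print(detect_changelog_version(b"v2"))        # tagged file
print(detect_changelog_version(b"\xff\xfe"))  # legacy binary header
```

The design choice is that the absence of a header is itself the version signal, so the exception is expected control flow, not an error to report.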
…sFinderSuite`

### What changes were proposed in this pull request?

This pr fixes an incorrect `assume` behavior in the `ClassFinderSuite` test suite, which was introduced in SPARK-51623. The issue stems from the fact that the `expectedClassFiles` list contained file paths without their parent directories. Consequently, the assertion added in SPARK-51623

https://github.com/apache/spark/blob/b634978936499f58f8cb2e8ea16339feb02ffb52/sql/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/ClassFinderSuite.scala#L40

would always evaluate to `false`, causing the test case to be permanently marked as `CANCELED`. We can observe relevant test cases in the GA testing phase, for example:

- https://github.com/apache/spark/actions/runs/14675551942/job/41191081107

Therefore, we should modify the check to `assume` whether the pre-defined class files exist within the source directory (`classResourcePath`).

### Why are the changes needed?

Fix the erroneous `assume` in the `ClassFinderSuite`.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- Pass GitHub Actions
- Locally test

```
build/sbt clean "connect-client-jvm/testOnly org.apache.spark.sql.connect.client.ClassFinderSuite"
```

**Before**

```
[info] ClassFinderSuite:
[info] - REPLClassDirMonitor functionality test !!! CANCELED !!! (202 milliseconds)
[info]   p.toFile().exists() was false (ClassFinderSuite.scala:40)
[info]   org.scalatest.exceptions.TestCanceledException:
[info]   at org.scalatest.Assertions.newTestCanceledException(Assertions.scala:475)
[info]   at org.scalatest.Assertions.newTestCanceledException$(Assertions.scala:474)
[info]   at org.scalatest.Assertions$.newTestCanceledException(Assertions.scala:1231)
[info]   at org.scalatest.Assertions$AssertionsHelper.macroAssume(Assertions.scala:1310)
[info]   at org.apache.spark.sql.connect.client.ClassFinderSuite.$anonfun$new$3(ClassFinderSuite.scala:40)
[info]   at scala.collection.immutable.List.foreach(List.scala:334)
[info]   at org.apache.spark.sql.connect.client.ClassFinderSuite.checkClasses$1(ClassFinderSuite.scala:40)
[info]   at org.apache.spark.sql.connect.client.ClassFinderSuite.$anonfun$new$1(ClassFinderSuite.scala:48)
[info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
[info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
[info]   at org.scalatest.TestSuite.withFixture(TestSuite.scala:196)
[info]   at org.scalatest.TestSuite.withFixture$(TestSuite.scala:195)
[info]   at org.scalatest.funsuite.AnyFunSuite.withFixture(AnyFunSuite.scala:1564)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:224)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:236)
[info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:236)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:218)
[info]   at org.scalatest.funsuite.AnyFunSuite.runTest(AnyFunSuite.scala:1564)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:269)
[info]   at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
[info]   at scala.collection.immutable.List.foreach(List.scala:334)
[info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
[info]   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
[info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:269)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:268)
[info]   at org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1564)
[info]   at org.scalatest.Suite.run(Suite.scala:1114)
[info]   at org.scalatest.Suite.run$(Suite.scala:1096)
[info]   at org.scalatest.funsuite.AnyFunSuite.org$scalatest$funsuite$AnyFunSuiteLike$$super$run(AnyFunSuite.scala:1564)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$run$1(AnyFunSuiteLike.scala:273)
[info]   at org.scalatest.SuperEngine.runImpl(Engine.scala:535)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.run(AnyFunSuiteLike.scala:273)
[info]   at org.scalatest.funsuite.AnyFunSuiteLike.run$(AnyFunSuiteLike.scala:272)
[info]   at org.scalatest.funsuite.AnyFunSuite.run(AnyFunSuite.scala:1564)
[info]   at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:321)
[info]   at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:517)
[info]   at sbt.ForkMain$Run.lambda$runTest$1(ForkMain.java:414)
[info]   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[info]   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[info]   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[info]   at java.base/java.lang.Thread.run(Thread.java:840)
[info] Run completed in 628 milliseconds.
[info] Total number of tests run: 0
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 0, failed 0, canceled 1, ignored 0, pending 0
[info] No tests were executed.
```

**After**

```
[info] ClassFinderSuite:
[info] - REPLClassDirMonitor functionality test (169 milliseconds)
[info] Run completed in 530 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
```

Manually delete `Hello.class`, `smallClassFile.class` and `smallClassFileDup.class`, then proceed with the test; the test will be skipped.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#50725 from LuciferYang/SPARK-51925.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
…ble error correctly

### What changes were proposed in this pull request?

As of today, when people use JDBC v2 and try to query a nonexistent table, they will get a `FAILED_JDBC.LOAD_TABLE` error. This is a bit confusing, as the real error is that the table does not exist. This PR improves the error message by adding a table existence check and throwing a no-such-table error if the table does not exist.

### Why are the changes needed?

better error messaging

### Does this PR introduce _any_ user-facing change?

yes, people will see clearer errors if the JDBC table does not exist

### How was this patch tested?

updated existing tests

### Was this patch authored or co-authored using generative AI tooling?

no

Closes apache#50706 from cloud-fan/jdbc.

Lead-authored-by: Wenchen Fan <[email protected]>
Co-authored-by: Kent Yao <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
…dPrefixes

### What changes were proposed in this pull request?

This PR adds `com.mysql.cj` to `spark.sql.hive.metastore.sharedPrefixes`

### Why are the changes needed?

Following upstream changes https://github.com/mysql/mysql-connector-j

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests

### Was this patch authored or co-authored using generative AI tooling?

no

Closes apache#50711 from yaooqinn/SPARK-51914.

Authored-by: Kent Yao <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
### What changes were proposed in this pull request?

This pr aims to upgrade Apache `commons-collections4` from 4.4 to 4.5.0

### Why are the changes needed?

The full release notes as follows:
- https://commons.apache.org/proper/commons-collections/changes.html#a4.5.0

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#50723 from LuciferYang/SPARK-51923.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
### What changes were proposed in this pull request?

This PR proposes to sync the missing python function types which are out of sync between Scala and Python.

### Why are the changes needed?

These types are supposed to be in sync between Scala and Python.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50728 from HeartSaVioR/SPARK-51814-follow-up-sync-function-type.

Authored-by: Jungtaek Lim <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
…eption

### What changes were proposed in this pull request?

apache#50693 enabled `SparkConnectErrorTests` in connect-only mode. `toJSON` and `rdd` throw `PySparkNotImplementedError` in connect mode, but `PySparkAttributeError` in connect-only mode.

### Why are the changes needed?

to fix https://github.com/apache/spark/actions/runs/14649632571/job/41112060443

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

will be tested in daily builder

### Was this patch authored or co-authored using generative AI tooling?

no

Closes apache#50708 from zhengruifeng/follow_up_connect_error.

Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
… messageParameters of CAST_INVALID_INPUT and CAST_OVERFLOW

### What changes were proposed in this pull request?

In Spark Connect, we guarantee that older clients are compatible with newer versions of the Spark Connect service. A previous change - e28c33b - broke this compatibility by removing the "ansiConfig" field in the message parameters for two error codes - "CAST_OVERFLOW" and "CAST_INVALID_INPUT".

The Spark Connect client includes GrpcExceptionConverter.scala\[1] to convert error codes from the server to produce SQL compliant error codes on the client. The SQL compliant error codes and corresponding error messages are included in the error-conditions.json file. Older clients do not include the change (e28c33b) to this file and still include the `ansiConfig` parameter. Later versions of the Spark Connect service don't return this parameter, resulting in an internal error\[2] that the correct error condition could not be formulated.

This change reverts the changes on the server to continue producing the "ansiConfig" field so older clients can still correctly reformulate the error class.

\[1]: https://github.com/apache/spark/blob/2ba156096e83adf7b0b2f5c38453d6fd37d95ded/sql/connect/common/src/main/scala/org/apache/spark/sql/connect/client/GrpcExceptionConverter.scala#L184
\[2]: https://github.com/apache/spark/blob/2ba156096e83adf7b0b2f5c38453d6fd37d95ded/common/utils/src/main/scala/org/apache/spark/ErrorClassesJSONReader.scala#L58

### Why are the changes needed?

Explained above.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50604 from nija-at/cast-invalid-input.

Authored-by: Niranjan Jayakar <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
…e in PySpark

### What changes were proposed in this pull request?

This PR proposes to support Spark Connect on transformWithState in PySpark. The code is mostly reused between the Pandas version and the Row version. We rely on PythonEvalType to determine the user-facing type of API, hence no proto change.

### Why are the changes needed?

The new API needs to be supported with Spark Connect.

### Does this PR introduce _any_ user-facing change?

Yes, we will expose a new API to be available in Spark Connect.

### How was this patch tested?

New test suites.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50704 from HeartSaVioR/WIP-transform-with-state-python-in-spark-connect.

Authored-by: Jungtaek Lim <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
…o 5.12.2

### What changes were proposed in this pull request?

This pr aims to upgrade `jupiter-interface` from 0.13.3 to 0.14.0 and Junit5 to the latest version (Platform 1.12.2 + Jupiter 5.12.2).

### Why are the changes needed?

The full release notes of `jupiter-interface` as follows:
- https://github.com/sbt/sbt-jupiter-interface/releases/tag/v0.14.0

and the full release notes between Junit 5.11.4 to 5.12.2 as follows:
- https://junit.org/junit5/docs/5.12.2/release-notes/#release-notes-5.12.2

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#50724 from LuciferYang/SPARK-51924.

Authored-by: yangjie01 <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
…WithState in PySpark"

This reverts commit 81ede34.
### What changes were proposed in this pull request?

This PR aims to upgrade AWS SDK v2 to 2.29.52.

### Why are the changes needed?

Like [Apache Iceberg v1.8.1](https://iceberg.apache.org/releases/#181-release) and Apache Hadoop 3.4.2 (HADOOP-19485), Apache Spark 4.1.0 had better use the latest one.
- apache/hadoop#7479
- apache/iceberg#12339

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50731 from dongjoon-hyun/SPARK-51929.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
…Size

### What changes were proposed in this pull request?

The current implementation of `prepare` in `OffsetWindowFunctionFrameBase`:

```scala
override def prepare(rows: ExternalAppendOnlyUnsafeRowArray): Unit = {
  if (offset > rows.length) {
    fillDefaultValue(EmptyRow)
  } else {
    ...
  }
```

The current implementation of `write` in `FrameLessOffsetWindowFunctionFrame`:

```scala
override def write(index: Int, current: InternalRow): Unit = {
  if (offset > rows.length) {
    // Already use default values in prepare.
  } else {
    ...
  }
```

These implementations caused the `LEAD` and `LAG` functions to throw a `NullPointerException` when the default value is not a Literal and the range of the default value exceeds the window group size.

This pr introduces a boolean val `onlyLiteralNulls` and modifies `prepare` and `write`. `onlyLiteralNulls` indicates whether the default values are Literal values. In `prepare`, first check `onlyLiteralNulls`: if the default value is a Literal, call `fillDefaultValue(EmptyRow)`. In `write`, if `onlyLiteralNulls` is false, the default value must be non-literal, so call `fillDefaultValue(current)`.

### Why are the changes needed?

Fix `LEAD` and `LAG` causing a NullPointerException in the window function (SPARK-51757)

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Add test method in test("lead/lag with column reference as default when offset exceeds window group size") in org.apache.spark.sql.DataFrameWindowFramesSuite

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50552 from xin-aurora/windowFuncFix.

Lead-authored-by: xin-aurora <[email protected]>
Co-authored-by: Xin Zhang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
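The logic of the fix can be sketched with a toy Python model of `LEAD` (an illustrative sketch, not Spark's implementation): a default is pre-computed only when it is a literal, and otherwise evaluated against the current row at write time, so a column-reference default never needs a row that does not exist:

```python
def lead(values, offset, default_fn, default_is_literal):
    """Toy LEAD: rows whose offset target falls past the end of the
    partition receive the default value.

    default_fn(row_index) models a possibly non-literal default (e.g. a
    column reference); row_index is None in the literal case, mirroring
    Spark evaluating a Literal against EmptyRow."""
    n = len(values)
    if default_is_literal:
        # Safe to evaluate once up front without a current row
        # (the fillDefaultValue(EmptyRow) path in prepare).
        prefilled = default_fn(None)
    out = []
    for i in range(n):
        j = i + offset
        if j < n:
            out.append(values[j])
        elif default_is_literal:
            out.append(prefilled)
        else:
            # Non-literal default must be evaluated against the current
            # row at write time, even when offset exceeds the group size
            # (the fillDefaultValue(current) path in write).
            out.append(default_fn(i))
    return out
```

For example, `lead([7, 8], 3, lambda i: i, False)` models a column-reference default and evaluates it per row instead of crashing.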
### What changes were proposed in this pull request?

This pr aims to upgrade Apache `common-text` from 1.13.0 to 1.13.1.

### Why are the changes needed?

The full release notes as follows:
- https://github.com/apache/commons-text/blob/rel/commons-text-1.13.1/RELEASE-NOTES.txt

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#50732 from LuciferYang/SPARK-51928.

Lead-authored-by: yangjie01 <[email protected]>
Co-authored-by: YangJie <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
### What changes were proposed in this pull request?

Adds a new state store config `unloadOnCommit` that unloads the state store instance from the executor at task completion. This frees up resources on the executor and prevents potentially unbounded resource usage from continually adding more state store instances to a single executor.

A task completion listener will execute a synchronous maintenance followed by a close on the state store. Since we do the maintenance synchronously, we never need to start the background maintenance thread.

### Why are the changes needed?

Stateful streams can have trouble scaling to large volumes of data without also increasing the total resources allocated to the application. By unloading state stores on task completion, stateful streams are able to complete with fewer resources, at the cost of slightly higher latency per batch in certain scenarios.

### Does this PR introduce _any_ user-facing change?

Yes, adds a new config for changing the behavior of stateful streams.

### How was this patch tested?

New UT is added to show the config takes effect. I'm not sure what all corner cases may need to be tested with this.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#50612 from Kimahriman/state-store-unload-on-commit.

Authored-by: Adam Binford <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
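The described flow, a task-completion hook that runs maintenance synchronously and then closes the store, can be sketched as a toy Python model (hypothetical names, not Spark's API):

```python
class StateStore:
    """Minimal stand-in for an executor-side state store instance."""

    def __init__(self):
        self.maintained = False
        self.closed = False

    def maintenance(self):
        # Stands in for snapshot/cleanup work normally done by the
        # background maintenance thread.
        self.maintained = True

    def close(self):
        self.closed = True


def run_task(store, body, unload_on_commit):
    """Toy unloadOnCommit flow: the 'finally' block plays the role of a
    task completion listener, running maintenance synchronously and then
    unloading (closing) the store so the executor frees its resources."""
    try:
        return body(store)
    finally:
        if unload_on_commit:
            store.maintenance()  # synchronous, so no background thread
            store.close()


store = StateStore()
run_task(store, lambda s: "batch done", unload_on_commit=True)
print(store.closed)  # store released at task completion
```

The trade-off noted in the PR shows up here too: doing maintenance inline adds latency to the task, in exchange for not keeping the store (or its maintenance thread) alive afterwards.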
…e in PySpark

### What changes were proposed in this pull request?

This PR proposes to support Spark Connect on transformWithState in PySpark. The code is mostly reused between the Pandas version and the Row version. We rely on PythonEvalType to determine the user-facing type of API, hence no proto change.

### Why are the changes needed?

The new API needs to be supported with Spark Connect.

### Does this PR introduce _any_ user-facing change?

Yes, we will expose a new API to be available in Spark Connect.

### How was this patch tested?

New test suites.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50704 from HeartSaVioR/WIP-transform-with-state-python-in-spark-connect.

Authored-by: Jungtaek Lim <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
…references

### What changes were proposed in this pull request?

Fix ML cache object python client references. When a model is copied from the client, multiple client model objects end up referring to the same server-cached model. In this case, we need a reference count; only when the reference count decreases to zero can we release the server-cached model.

### Why are the changes needed?

Bugfix.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50707 from WeichenXu123/ml-ref-id-fix.

Lead-authored-by: Weichen Xu <[email protected]>
Co-authored-by: WeichenXu <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
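The reference-counting scheme can be sketched as a minimal Python model (hypothetical names, not the actual ML cache code): each client-side copy bumps a count, and the server-cached model is freed only when the count reaches zero:

```python
class ModelCache:
    """Toy reference-counted cache: a server-side model is released only
    when the last client-side reference to it is dropped."""

    def __init__(self):
        self._models = {}
        self._refs = {}

    def put(self, model_id, model):
        self._models[model_id] = model
        self._refs[model_id] = 1  # the original client reference

    def add_ref(self, model_id):
        # Called when a client copies a model object sharing this id.
        self._refs[model_id] += 1

    def release(self, model_id):
        self._refs[model_id] -= 1
        if self._refs[model_id] == 0:
            # Last reference gone: now it is safe to free the model.
            del self._models[model_id]
            del self._refs[model_id]

    def contains(self, model_id):
        return model_id in self._models


cache = ModelCache()
cache.put("m1", object())
cache.add_ref("m1")   # client copies the model
cache.release("m1")   # copy released; original still holds a reference
print(cache.contains("m1"))
```

Without the count, releasing either the copy or the original would evict the shared server-side model while the other client object still points at it, which is the bug being fixed.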
…for PythonEvalType.toString

### What changes were proposed in this pull request?

This PR adds missing type handling for PythonEvalType.toString.

### Why are the changes needed?

Just for completeness's sake. This isn't based on actual observed failure.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing UTs.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#50736 from HeartSaVioR/SPARK-51814-followup-2.

Authored-by: Jungtaek Lim <[email protected]>
Signed-off-by: Jungtaek Lim <[email protected]>
…istTables()

### What changes were proposed in this pull request?

- Revert apache#50515
- Implement error handling rules based on error conditions for Spark errors in spark.catalog.listTables().

### Why are the changes needed?

There are risks associated with working with partial data, especially when unaware that some tables are broken. Throwing an exception instead provides a clear indication that something is wrong. Instead we can use error handling rules to determine the proper behavior on a case-by-case basis. SparkThrowable should be sufficient to capture the cases we want to handle.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests (e.g. `build/sbt "sql/testOnly *CatalogSuite"`, `build/sbt "hive/testOnly *HiveDDLSuite"`)

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#50696 from heyihong/SPARK-51899.

Authored-by: Yihong He <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
### What changes were proposed in this pull request?

This pr aims to upgrade `datasketches-java` from 6.1.1 to 6.2.0.

### Why are the changes needed?

Based on the release notes, this version fixes a bug that was discovered in the Theta compression algorithm.
- https://github.com/apache/datasketches-java/releases/tag/6.2.0

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Pass GitHub Actions

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#50733 from LuciferYang/SPARK-51930.

Lead-authored-by: yangjie01 <[email protected]>
Co-authored-by: YangJie <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Branch updated from `bde9fa9` to `7c2793e`
### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce any user-facing change?

### How was this patch tested?

### Was this patch authored or co-authored using generative AI tooling?