Fix JSON Matrix tests on Databricks 14.3 #11533
Comments
I have filed #11711 for the change in behaviour of |
Generally, what I have been doing for the JSON matrix, and what I think is the proper course, is to split up the data for failing tests so that we keep coverage for the parts that work, and to mark the ones that fail with a clear error message so we know what is happening there. In the case of #11711, I don't know what the priority will end up being, so splitting up the tests is probably the best way to get around the issue. I also don't think that #11711 is much of a blocker: there are very few cases where a top-level null is going to be treated differently from a struct with two nulls in it.
Fixes NVIDIA#11533.

This commit addresses the test failures reported in NVIDIA#11533, for the following tests:
- `json_matrix_test.py::test_from_json_long_structs()`
- `json_matrix_test.py::test_scan_json_long_structs()`

These failures are a result of NVIDIA#11711. When the JSON parser attempts to read integral struct members from a JSON file, if the parsing leads to an overflow, then the `STRUCT` column value is deemed null on Databricks 14.3 (i.e. *without* `spark-rapids` active). This behaviour differs from that exhibited by Apache Spark versions exceeding 3.4.1.

This commit breaks out the problematic JSON test rows into a separate file, whose read is tested in an `xfail` for Databricks 14.3. The remaining rows are tested on all versions.

The true fix for NVIDIA#11711 will be addressed later.

Signed-off-by: MithunR <[email protected]>
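As a rough illustration of the "break out the problematic rows" step, here is a minimal, standalone sketch that partitions JSON-lines test data into rows whose integral values fit in a signed 64-bit `long` and rows that would overflow. It is not the code from the commit: the helper names (`overflows_long`, `split_rows`), the sample field names, and the choice to keep malformed lines with the "safe" rows are all assumptions made for the example.

```python
import json

# Signed 64-bit range, i.e. the range of a Spark LongType value.
LONG_MIN, LONG_MAX = -(2**63), 2**63 - 1


def overflows_long(value):
    """Return True if value, or any nested member, is an integer outside the 64-bit range."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; JSON true/false never overflow.
        return False
    if isinstance(value, int):
        return not (LONG_MIN <= value <= LONG_MAX)
    if isinstance(value, dict):
        return any(overflows_long(v) for v in value.values())
    if isinstance(value, list):
        return any(overflows_long(v) for v in value)
    return False


def split_rows(lines):
    """Partition JSON lines into (safe, problematic) lists, preserving order.

    Hypothetical helper: rows containing an overflowing integer go to
    `problematic`; everything else, including lines that do not parse as
    JSON (an assumption made for this sketch), stays in `safe`.
    """
    safe, problematic = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            safe.append(line)
            continue
        (problematic if overflows_long(row) else safe).append(line)
    return safe, problematic


if __name__ == "__main__":
    sample = [
        '{"data": {"A": 1, "B": 2}}',
        '{"data": {"A": 9223372036854775808, "B": 2}}',  # A is LONG_MAX + 1
        '{"data": null}',
    ]
    safe_rows, overflow_rows = split_rows(sample)
    print("safe:", safe_rows)
    print("overflow:", overflow_rows)
```

The two resulting lists could then be written to separate data files, with the overflow file read by a test that is marked `xfail` on Databricks 14.3 only, so the remaining rows keep full coverage on every platform.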
Build the plugin against the Databricks 14.3 cluster using #11467. Once it builds successfully, run the JSON matrix tests with:
TESTS=json_matrix_test.py jenkins/databricks/test.sh
The following tests fail