-
Notifications
You must be signed in to change notification settings - Fork 3.9k
WIP: Dummy PR to check maint-22.0.0 status #47750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…rted image (#47730) ### Rationale for this change Old image fails due to debian update ### What changes are included in this PR? Use newer image ### Are these changes tested? Will submit crossbow run ### Are there any user-facing changes? No * GitHub Issue: #47705 Authored-by: Nic Crane <[email protected]> Signed-off-by: Nic Crane <[email protected]>
### Rationale for this change #45964 changed paths of pre-built Apache Arrow C++ binaries for R. But we forgot to update the nightly upload job. ### What changes are included in this PR? Update paths in the nightly upload job. ### Are these changes tested? No... ### Are there any user-facing changes? Yes. * GitHub Issue: #47704 Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Nic Crane <[email protected]>
…47743) ### Rationale for this change Valgrind would report memory leaks induced by protobuf initialization on library load, for example: ``` ==14628== 414 bytes in 16 blocks are possibly lost in loss record 22 of 26 ==14628== at 0x4914EFF: operator new(unsigned long) (vg_replace_malloc.c:487) ==14628== by 0x8D0B6CA: void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .isra.0] (in /opt/conda/envs/arrow/lib/libprotobuf.so.25.3.0) ==14628== by 0x8D33E62: google::protobuf::DescriptorPool::Tables::Tables() (in /opt/conda/envs/arrow/lib/libprotobuf.so.25.3.0) ==14628== by 0x8D340E2: google::protobuf::DescriptorPool::DescriptorPool(google::protobuf::DescriptorDatabase*, google::protobuf::DescriptorPool::ErrorCollector*) (in /opt/conda/envs/arrow/lib/libprotobuf.so.25.3.0) ==14628== by 0x8D341A2: google::protobuf::DescriptorPool::internal_generated_pool() (in /opt/conda/envs/arrow/lib/libprotobuf.so.25.3.0) ==14628== by 0x8D34277: google::protobuf::DescriptorPool::InternalAddGeneratedFile(void const*, int) (in /opt/conda/envs/arrow/lib/libprotobuf.so.25.3.0) ==14628== by 0x8D9C56F: google::protobuf::internal::AddDescriptorsRunner::AddDescriptorsRunner(google::protobuf::internal::DescriptorTable const*) (in /opt/conda/envs/arrow/lib/libprotobuf.so.25.3.0) ==14628== by 0x40D147D: call_init.part.0 (dl-init.c:70) ==14628== by 0x40D1567: call_init (dl-init.c:33) ==14628== by 0x40D1567: _dl_init (dl-init.c:117) ==14628== by 0x40EB2C9: ??? (in /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) ``` This was triggered by the `libprotobuf` upgrade on conda-forge from 3.21.12 to 4.25.3. ### What changes are included in this PR? Add a Valgrind suppression for these leak reports, as there is probably not much we can do about them. ### Are these changes tested? Yes, by existing CI test. ### Are there any user-facing changes? No. * GitHub Issue: #47742 Authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
…Parquet data (#47741) ### Rationale for this change Fix issues found by OSS-Fuzz when invalid Parquet data is fed to the Parquet reader: * https://issues.oss-fuzz.com/issues/447262173 * https://issues.oss-fuzz.com/issues/447480433 * https://issues.oss-fuzz.com/issues/447490896 * https://issues.oss-fuzz.com/issues/447693724 * https://issues.oss-fuzz.com/issues/447693728 * https://issues.oss-fuzz.com/issues/449498800 ### Are these changes tested? Yes, using the updated fuzz regression files from apache/arrow-testing#115 ### Are there any user-facing changes? No. **This PR contains a "Critical Fix".** (If the changes fix either (a) a security vulnerability, (b) a bug that caused incorrect or invalid data to be produced, or (c) a bug that causes a crash (even when the API contract is upheld), please provide explanation. If not, you can remove this.) * GitHub Issue: #47740 Authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
This comment was marked as off-topic.
This comment was marked as off-topic.
### Rationale for this change Mimalloc default generates LSE atomic instructions only work on armv8.1. This causes illegal instruction on armv8.0 platforms like Raspberry4. This PR sets mimalloc build flag -DMI_NO_OPT_ARCH=ON to disable LSE instruction. Please note even with flag set, compiler and libc will replace the atmoic call with an ifunc that matches hardware best at runtime. That means LSE is used only if the running platform supports it. ### What changes are included in this PR? Force mimalloc build flag -DMI_NO_OPT_ARCH=ON. ### Are these changes tested? Manually tested. ### Are there any user-facing changes? No. **This PR contains a "Critical Fix".** Fixes crashes on Armv8.0 platform. * GitHub Issue: #47229 Lead-authored-by: Yibo Cai <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
### Rationale for this change According to microsoft/mimalloc#1073 , mimalloc v3 is preferred over v2 for production usage. There are reports of higher than expected memory consumption with mimalloc 2.2.x, notably when reading Parquet data (example: GH-47266). ### What changes are included in this PR? Bump to mimalloc 3.1.5, which is the latest mimalloc 3.1.x release as of this writing. ### Are these changes tested? Yes, by existing tests and CI. ### Are there any user-facing changes? Hopefully not, besides a potential reduction in memory usage due to improvements in mimalloc v3. * GitHub Issue: #47588 Authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
### Rationale for this change There are link errors with build options for JNI on macOS. ### What changes are included in this PR? `ARROW_BUNDLED_STATIC_LIBS` has CMake target names defined in Apache Arrow not `find_package()`-ed target names. So we should use `aws-c-common` not `AWS::aws-c-common`. Recent aws-c-common or something use the Network framework. So add `Network` to `Arrow::arrow_bundled_dependencies` dependencies. Don't use `compute/kernels/temporal_internal.cc` in `libarrow.dylib` and `libarrow_compute.dylib` to avoid duplicated symbols error. ### Are these changes tested? Yes. ### Are there any user-facing changes? Yes. * GitHub Issue: #47748 Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Raúl Cumplido <[email protected]>
### Rationale for this change This is for preventing to break Apache Arrow Java JNI use case on Linux. ### What changes are included in this PR? * Add a CI job that uses build options for JNI use case * Install more packages in manylinux image that is also used by JNI build ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: #47632 Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
### Rationale for this change `archery docker push` doesn't support custom Docker registry such as ghcr.io. ### What changes are included in this PR? Parse Docker image tag and specify Docker registry name to `docker push` if it's specified in the tag. Docker image tag format: `[HOST[:PORT]/]NAMESPACE/REPOSITORY[:TAG]` See also: https://docs.docker.com/reference/cli/docker/image/tag/#description ### Are these changes tested? Yes. ### Are there any user-facing changes? No. * GitHub Issue: #47795 Authored-by: Sutou Kouhei <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
…47616) ### Rationale for this change Python 3.14 is currently in a prerelease status and is expected to have a final release in October this year (https://peps.python.org/pep-0745/). We should ensure we are fully ready to support Python 3.14 for the PyArrow 22 release. ### What changes are included in this PR? This PR updates wheels for Python 3.14. ### Are these changes tested? Tested in the CI and with extended builds. ### Are there any user-facing changes? No, but users will be able to use PyArrow with Python 3.14. * GitHub Issue: #47438 --- Todo: - Update the image revision name in `.env` - Add 3.14 conda build ([arrow/dev/tasks/tasks.yml](https://github.com/apache/arrow/blob/d803afcc43f5d132506318fd9e162d33b2c3d4cd/dev/tasks/tasks.yml#L809)) when conda-forge/pyarrow-feedstock#156 is merged Follow-ups: - #47437 Authored-by: AlenkaF <[email protected]> Signed-off-by: AlenkaF <[email protected]>
…47804) Found by OSS-Fuzz, should fix https://issues.oss-fuzz.com/issues/451150486. Ensure RLE run is within bounds before reading it. Yes, by fuzz regression test in ASAN/UBSAN build. No. **This PR contains a "Critical Fix".** (If the changes fix either (a) a security vulnerability, (b) a bug that caused incorrect or invalid data to be produced, or (c) a bug that causes a crash (even when the API contract is upheld), please provide explanation. If not, you can remove this.) * GitHub Issue: #47803 Authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
### Rationale for this change Summarise changes for release ### What changes are included in this PR? Update NEWS file ### Are these changes tested? No ### Are there any user-facing changes? No * GitHub Issue: #47738 Authored-by: Nic Crane <[email protected]> Signed-off-by: Raúl Cumplido <[email protected]>
|
@github-actions crossbow submit --group verify-rc-source |
|
@github-actions crossbow submit --group packaging |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
…l patch from conda (#47810) ### Rationale for this change Our verify-rc-source Windows job is failing due to patch not being available for Windows. ### What changes are included in this PR? Move patch requirement from `conda_env_cpp.txt` to `conda_env_unix.txt` ### Are these changes tested? Yes via CI and archery. ### Are there any user-facing changes? No * GitHub Issue: #47809 Authored-by: Raúl Cumplido <[email protected]> Signed-off-by: Raúl Cumplido <[email protected]>
|
@github-actions crossbow submit --group verify-rc-source |
|
@github-actions crossbow submit --group packaging |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
… release branch push (#47826) ### Rationale for this change We require the Linux package jobs to be triggered on RC tag creation. For example for 22.0.0, we currently push the tag `apache-arrow-22.0.0-rc0` and the release branch `release-22.0.0-rc0`. Those events are triggering builds over the same commit and the tag event gets cancelled due to a "high priority task" triggering the same jobs. This causes jobs to fail on the branch because the ARROW_VERSION is not generated. If we manually re-trigger the jobs on the tag they are successful. ### What changes are included in this PR? Remove the `release-*` branches from triggering the event to allow only the tag to run the jobs so they don't get cancelled. ### Are these changes tested? No ### Are there any user-facing changes? No * GitHub Issue: #47819 Authored-by: Raúl Cumplido <[email protected]> Signed-off-by: Raúl Cumplido <[email protected]>
…ign with the variant spec (#47835) ### Rationale for this change According to the [Variant specification](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md), the specification_version field must be set to 1 to indicate Variant encoding version 1. Currently, this field defaults to 0, which violates the specification. Parquet readers that strictly enforce specification version validation will fail to read files containing Variant types. <img width="624" height="185" alt="image" src="https://github.com/user-attachments/assets/b0f1deb9-0301-4b94-a472-17fd9cc0df5d" /> ### What changes are included in this PR? The change includes defaulting the specification version to 1. ### Are these changes tested? The change is covered by unit test. ### Are there any user-facing changes? The Parquet files produced the variant logical type annotation `VARIANT(1)`. ``` Schema: message schema { optional group V (VARIANT(1)) = 1 { required binary metadata; required binary value; } } ``` * GitHub Issue: #47838 Lead-authored-by: Aihua <[email protected]> Co-authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
|
@github-actions crossbow submit --group verify-rc-source |
|
@github-actions crossbow submit --group packaging |
|
Revision: e04fafb Submitted crossbow builds: ursacomputing/crossbow @ actions-a5e0a3ad69 |
|
Revision: e04fafb Submitted crossbow builds: ursacomputing/crossbow @ actions-e54bda83c9 |
Caution
Do not merge this PR.
This PR is being used to test the status of the 22.0.0 release branch on CI and should not be merged.