add arrow to contrib, move to sha256 #157
Walkthrough
Adds Apache Arrow as a selectable contrib build target with a new …
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- CMakeLists.txt (5 hunks)
- libraries.cmake/arrow.cmake (1 hunk)
- macros.cmake (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build (windows-latest)
- GitHub Check: build (ubuntu-22.04-arm)
- GitHub Check: build (ubuntu-latest)
🔇 Additional comments (4)
macros.cmake (1)
114-120: LGTM!
The SHA256 migration in download_contrib_archive is correctly implemented. The empty file SHA256 hash (e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855) is correct, and variable naming is consistent.
CMakeLists.txt (2)
624-631: LGTM!
The Arrow build section correctly follows the established pattern used by other libraries (COINOR, EIGEN, HDF5, etc.) with OPENMS_CLEAN_LIB, OPENMS_CONTRIB_BUILD_ARROW, and OPENMS_COPY_LIBS.
217-257: LGTM!
The SHA256 hash migration for all existing libraries is correctly implemented. The hash values are the expected 64-character hexadecimal strings consistent with SHA256 format.
libraries.cmake/arrow.cmake (1)
18-31: ARROW_EXTRA_CMAKE_FLAG is intentionally undefined and follows the established pattern.
This variable is used at line 26 but left undefined in CMakeLists.txt. This is an intentional design pattern used consistently across multiple libraries in the codebase (zlib.cmake, xercesc.cmake), allowing users to pass optional CMake flags without modifying the source files. When undefined, it safely expands to an empty string with no functional impact.
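The claim that an undefined variable expands to nothing can be checked in isolation. The following is a minimal sketch, not code from the PR: DEMO_EXTRA_CMAKE_FLAG is a hypothetical stand-in for ARROW_EXTRA_CMAKE_FLAG, and the script is assumed to be saved as demo.cmake and run with `cmake -P demo.cmake`.

```cmake
cmake_minimum_required(VERSION 3.10)

# DEMO_EXTRA_CMAKE_FLAG is a hypothetical stand-in for ARROW_EXTRA_CMAKE_FLAG.
# It is deliberately not set anywhere in this script.
set(_args
    -D ARROW_CSV=ON
    -D ARROW_PARQUET=ON
    ${DEMO_EXTRA_CMAKE_FLAG})

# An unquoted reference to an undefined variable contributes no argument,
# so the list holds only the four elements written out above.
list(LENGTH _args _count)
message(STATUS "Arguments: ${_args}")
message(STATUS "Argument count: ${_count}")
```

Defining the variable on the command line before `-P` (for example `cmake -DDEMO_EXTRA_CMAKE_FLAG=-DARROW_JSON=ON -P demo.cmake`) should add one element to the list, which is presumably how such a hook is meant to be used.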
# CFLAGS for libsvm compiler (see libsvm Makefile)
set(ARROW_CFLAGS "-Wall -O3 -fPIC")
Copy/paste error in comment.
The comment references "libsvm compiler" but this is the Arrow build macro.
- # CFLAGS for libsvm compiler (see libsvm Makefile)
+ # CFLAGS for Arrow compiler
  set(ARROW_CFLAGS "-Wall -O3 -fPIC")
🤖 Prompt for AI Agents
In libraries.cmake/arrow.cmake around lines 89 to 90, the inline comment
incorrectly references "libsvm compiler" due to a copy/paste error; update the
comment to correctly describe that these CFLAGS apply to the Arrow build (e.g.,
"CFLAGS for Arrow build") so it accurately documents the set(ARROW_CFLAGS ...)
line and keep the existing flag assignment unchanged.
  set(_target_SHA256 ${ARCHIVE_${libname}_SHA256})

  message(STATUS "Validating archive for ${libname} .. ")

- file(SHA1 ${_target_file} _downloaded_sha1 )
- if(NOT "${_downloaded_sha1}" STREQUAL "${_target_sha1}")
+ file(SHA256 ${_target_file} _downloaded_SHA256 )
+ if(NOT "${_downloaded_SHA256}" STREQUAL "${_target_SHA256}")
    file(REMOVE ${_target_file})
-   if("${_downloaded_sha1}" STREQUAL "da39a3ee5e6b4b0d3255bfef95601890afd80709")
+   if("${_downloaded_SHA256}" STREQUAL "da39a3ee5e6b4b0d3255bfef95601890afd80709")
      message(STATUS "Validating archive for ${libname} .. Found empty archive.")
      message(STATUS "Check your internet connection.")
    else()
-     message(STATUS "Validating archive for ${libname} .. sha1 mismatch (expected: ${_target_sha1} got: ${_downloaded_sha1})")
+     message(STATUS "Validating archive for ${libname} .. SHA256 mismatch (expected: ${_target_SHA256} got: ${_downloaded_SHA256})")
Inconsistent empty-archive hash: SHA1 used instead of SHA256.
Line 63 still uses the SHA1 hash of an empty file (da39a3ee5e6b4b0d3255bfef95601890afd80709), but the variable _downloaded_SHA256 now contains a SHA256 hash. This comparison will never match, so the "Found empty archive" message will never be shown in validate_archive.
Compare with line 117 which correctly uses the SHA256 hash of an empty file.
Apply this diff to fix the inconsistency:
  if(NOT "${_downloaded_SHA256}" STREQUAL "${_target_SHA256}")
    file(REMOVE ${_target_file})
-   if("${_downloaded_SHA256}" STREQUAL "da39a3ee5e6b4b0d3255bfef95601890afd80709")
+   if("${_downloaded_SHA256}" STREQUAL "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855")
      message(STATUS "Validating archive for ${libname} .. Found empty archive.")
🤖 Prompt for AI Agents
In macros.cmake around lines 56 to 67, the empty-file check compares
_downloaded_SHA256 to the SHA1 empty-file digest
(da39a3ee5e6b4b0d3255bfef95601890afd80709) so it will never match; replace that
value with the SHA256 digest of an empty file
(e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855) so the
condition can detect and report an empty archive correctly.
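The reason the old constant can never match is visible from the digest lengths alone. As a minimal sketch (a standalone script, not part of macros.cmake, assumed to be run with `cmake -P`), CMake's built-in string hashing can print both digests of empty input next to each other:

```cmake
cmake_minimum_required(VERSION 3.10)

# Hash the empty string with both algorithms. An empty file has the same
# digests, since a digest depends only on the bytes that are hashed.
string(SHA1   _empty_sha1   "")
string(SHA256 _empty_sha256 "")

string(LENGTH "${_empty_sha1}"   _sha1_len)
string(LENGTH "${_empty_sha256}" _sha256_len)

message(STATUS "SHA1   of empty input: ${_empty_sha1} (${_sha1_len} hex characters)")
message(STATUS "SHA256 of empty input: ${_empty_sha256} (${_sha256_len} hex characters)")

# A SHA256 digest has 64 hex characters and the SHA1 constant has 40,
# so this comparison is false for every possible download.
if("${_empty_sha256}" STREQUAL "da39a3ee5e6b4b0d3255bfef95601890afd80709")
  message(STATUS "Empty-archive branch reached")
else()
  message(STATUS "Empty-archive branch is unreachable with the SHA1 constant")
endif()
```

The two printed digests should be the constants quoted in the review comment above.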
Actionable comments posted: 1
📒 Files selected for processing (1)
- CMakeLists.txt (5 hunks)
🔇 Additional comments (5)
CMakeLists.txt (5)
94-95: Add Arrow to build type lists.
Arrow is correctly added to both BUILD_TYPE and VALID_BUILD_TYPES, making it selectable for compilation.
271-271: Verify INCLUDE_DIR_ARROW target location.
INCLUDE_DIR_ARROW is set to ${CONTRIB_BIN_INCLUDE_DIR}/arrow. Confirm that the Arrow build macro (OPENMS_CONTRIB_BUILD_ARROW in libraries.cmake/arrow.cmake) installs headers to this location during the build process.
485-485: Arrow CMake macros correctly included.
The Arrow build macro file is properly included alongside other library macros.
624-631: Arrow build section follows established pattern.
The Arrow build logic is consistent with other optional libraries (HDF5, KISSFFT, OPENMP): clean, build, and copy libraries. The conditional check (if (BUILD_TYPE STREQUAL "ALL" OR "ARROW" IN_LIST BUILD_TYPE)) properly integrates Arrow as a selectable build target.
199-199: Verify Arrow archive extracts to the expected directory structure.
ARROW_DIR references arrow-cpp-22.0.0/cpp, and the archive is arrow-cpp-22.0.0.tar.xz. While the version numbers now match (fixing the prior mismatch), you must confirm that extracting this .tar.xz produces a cpp/ subdirectory at the expected depth. If the archive root differs from arrow-cpp-22.0.0/, or if the C++ sources are not in a cpp/ subdirectory, the build will fail.
You can verify the archive structure by extracting it locally or by checking the Apache Arrow GitHub repository for version 22.0.0 and confirming the directory layout matches.
#!/bin/bash
# Verify Arrow archive structure (if you have access to the archive locally)
# Expected: arrow-cpp-22.0.0/cpp directory should exist after extraction
archive_file="arrow-cpp-22.0.0.tar.xz"
if [ -f "$archive_file" ]; then
  # -J selects xz decompression, matching the .tar.xz archive
  tar -tJf "$archive_file" | grep -E '^arrow-cpp-22\.0\.0/cpp/' | head -5
  echo "---"
  tar -tJf "$archive_file" | head -10
fi
Alternatively, check the upstream repository on GitHub to confirm the C++ source location within the archive.
Also applies to: 259-261
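For the 624-631 comment above, the selection condition can be exercised on its own. This is a minimal sketch under assumptions: demo_builds_arrow is a hypothetical helper written for illustration, not a macro from the contrib, and the script is meant to be run with `cmake -P`.

```cmake
cmake_minimum_required(VERSION 3.10)

# Hypothetical helper: reports whether a given BUILD_TYPE value would
# trigger the Arrow section, using the same condition as CMakeLists.txt.
function(demo_builds_arrow build_type out_var)
  if(build_type STREQUAL "ALL" OR "ARROW" IN_LIST build_type)
    set(${out_var} TRUE PARENT_SCOPE)
  else()
    set(${out_var} FALSE PARENT_SCOPE)
  endif()
endfunction()

demo_builds_arrow("ALL" _all)
demo_builds_arrow("HDF5" _hdf5_only)
demo_builds_arrow("HDF5;ARROW" _hdf5_and_arrow)

message(STATUS "ALL builds Arrow: ${_all}")
message(STATUS "HDF5 builds Arrow: ${_hdf5_only}")
message(STATUS "HDF5;ARROW builds Arrow: ${_hdf5_and_arrow}")
```

The IN_LIST test is what lets BUILD_TYPE carry a semicolon-separated list of several libraries, while the STREQUAL test covers the single value ALL.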
-D CMAKE_INSTALL_PREFIX=${PROJECT_BINARY_DIR}
-D ARROW_CSV=ON
-D ARROW_PARQUET=ON
${ARROW_EXTRA_CMAKE_FLAG}
One could try to make Arrow recognize the compression libraries that we already build in our contrib, as we do for Boost.
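One possible shape for that suggestion is sketched below. It is a fragment for the contrib CMakeLists.txt rather than a standalone script, and it rests on assumptions that the PR does not confirm: that the contrib installs zlib under ${PROJECT_BINARY_DIR} (the same prefix Arrow is installed to), and that the existing ARROW_EXTRA_CMAKE_FLAG hook is an acceptable place to pass the options. ARROW_WITH_ZLIB, ZLIB_SOURCE, and ZLIB_ROOT are dependency options described in Arrow's C++ build documentation; their behavior for the pinned Arrow version should be checked before relying on them.

```cmake
# Hypothetical sketch, not part of the PR: ask Arrow to use the zlib built
# by the contrib instead of downloading and building its own copy.
# Assumption: zlib headers and libraries were installed under
# ${PROJECT_BINARY_DIR} before the Arrow build step runs.
set(ARROW_EXTRA_CMAKE_FLAG
    -D ARROW_WITH_ZLIB=ON                # enable zlib compression support
    -D ZLIB_SOURCE=SYSTEM                # use an existing zlib, do not bundle one
    -D ZLIB_ROOT=${PROJECT_BINARY_DIR})  # search prefix for that zlib
```

Because ARROW_EXTRA_CMAKE_FLAG is expanded unquoted in the Arrow configure call, setting it to a list like this would pass each element as a separate argument.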