GH-41816: [C++] Add Minimal Meson Build of libarrow#45441
Conversation
|
@kou this is a simplified attempt at adding Meson to address your comment #41816 (comment) This is currently slower than the CMake configuration by a good deal (pending investigation) and the configuration file is not 100% complete, but this should give us an idea of what a Meson configuration may look like. To use, from the cpp directory developers can: meson setup builddir -Dtests=true -Dcompute=true
meson compile -C builddir
meson test -C builddirFor ASAN/UBSAN, users could simply: meson setup builddir -Dtests=true -Dcompute=true -Db_sanitize=address,undefinedOr if the project is already setup run: meson configure -C builddir -Db_sanitize=address,undefinedCoverage can be enabled with: meson configure -C builddir -Db_coverage=trueand tests can be run under valgrind with: meson test -C builddir --wrap='valgrind --track-origins=yes --leak-check=full' --print-errorlog |
|
Thanks. Can we start from more simplified version? For example, we don't need We want to add a nightly CI for this to detect regression. We want to update version information automatically in release process. For example: arrow/dev/release/utils-prepare.sh Lines 39 to 44 in 0556905 (I can do it in this branch later.) Anyway, I didn't know that |
|
Sure I can strip down further. So do think just something that builds libarrow is the right starting point? |
|
Yes. Only minimal |
c492403 to
61f4bbb
Compare
WillAyd
left a comment
There was a problem hiding this comment.
Thanks @kou . Hope this is more in line with your expectation.
As far as nightlies go, can you point me to an existing nightly CI setup in the repo? I was able to grep for some R nightlies, but not sure if there is existing infrastructure for C++ nightly jobs where this would be better placed
cpp/src/arrow/meson.build
Outdated
There was a problem hiding this comment.
Wasn't sure if this was intentional or not in the existing code base
There was a problem hiding this comment.
Ah, we should disable features that depend on IPC in cpp/src/arrow/compute/expression.cc like GH-45171.
Could you open an issue for this?
cpp/src/arrow/meson.build
Outdated
There was a problem hiding this comment.
Meson is pretty strict about wildcards for the reasons outlined in their FAQ. Idiomatically, Meson would want you to put the files you want to install in a separate directory and call install_subdir, but that would go beyond the scope of this initial PR I think
cpp/src/arrow/meson.build
Outdated
There was a problem hiding this comment.
I did not research this in too much detail yet; figured I'd check if it was a big deal before investing time into a resolution
There was a problem hiding this comment.
We don't need it.
arrow.pc.in has the license header because it's in this repository. Files that don't exist in this repository don't need it.
kou
left a comment
There was a problem hiding this comment.
Can I push some commits to this branch for nightly CI and auto version update?
cpp/meson.build
Outdated
There was a problem hiding this comment.
| license: 'Apache 2.0', | |
| license: 'Apache-2.0', |
cpp/meson.build
Outdated
There was a problem hiding this comment.
Can we ignore git log error? This will fail when we use source archive.
| git_id = run_command('git', 'log', '-n1', '--format=%H', check: true).stdout().strip() | |
| git_id = run_command('git', 'log', '-n1', '--format=%H').stdout().strip() |
cpp/meson.build
Outdated
There was a problem hiding this comment.
| git_description = run_command('git', 'describe', '--tags', check: true).stdout().strip() | |
| git_description = run_command('git', 'describe', '--tags').stdout().strip() |
cpp/src/arrow/meson.build
Outdated
There was a problem hiding this comment.
We don't need to emulate objlib in Meson. It's just for faster build.
We can just use library().
There was a problem hiding this comment.
I have a pretty poor understanding of how object libraries actually work, but I don't think we can just convert these to using the library call unless we want the linker to allow undefined symbols in these libraries as well (?)
Trying to do what I think you suggest generates a huge amount of errors like:
[7/96] Linking target src/arrow/libarrow_io.so
FAILED: src/arrow/libarrow_io.so
c++ -o src/arrow/libarrow_io.so src/arrow/libarrow_io.so.p/io_buffered.cc.o src/arrow/libarrow_io.so.p/io_caching.cc.o src/arrow/libarrow_io.so.p/io_compressed.cc.o src/arrow/libarrow_io.so.p/io_file.cc.o src/arrow/libarrow_io.so.p/io_hdfs.cc.o src/arrow/libarrow_io.so.p/io_hdfs_internal.cc.o src/arrow/libarrow_io.so.p/io_interfaces.cc.o src/arrow/libarrow_io.so.p/io_memory.cc.o src/arrow/libarrow_io.so.p/io_slow.cc.o src/arrow/libarrow_io.so.p/io_stdio.cc.o src/arrow/libarrow_io.so.p/io_transform.cc.o -Wl,--as-needed -Wl,--no-undefined -Wl,-O1 -shared -fPIC -Wl,--start-group -Wl,-soname,libarrow_io.so -Wl,--end-group
/usr/bin/ld: src/arrow/libarrow_io.so.p/io_buffered.cc.o: in function `arrow::io::BufferedInputStream::SetBufferSize(long)':
buffered.cc:(.text+0x13a9): undefined reference to `arrow::Buffer::CheckCPU() const'
/usr/bin/ld: buffered.cc:(.text+0x13b1): undefined reference to `arrow::Buffer::CheckMutable() const'
/usr/bin/ld: buffered.cc:(.text+0x1407): undefined reference to `arrow::util::detail::StringStreamWrapper::StringStreamWrapper()'
/usr/bin/ld: buffered.cc:(.text+0x149e): undefined reference to `arrow::util::detail::StringStreamWrapper::str[abi:cxx11]()'
/usr/bin/ld: buffered.cc:(.text+0x14a6): undefined reference to `arrow::util::detail::StringStreamWrapper::~StringStreamWrapper()'
/usr/bin/ld: buffered.cc:(.text+0x14b6): undefined reference to `arrow::Status::Status(arrow::StatusCode, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)'
There was a problem hiding this comment.
Ah, we can remove objlib_sources and static_library() for them entirely.
We can just add sources in objlib_sources to arrow_srcs and use it in one library() (that already exists in this PR).
There was a problem hiding this comment.
(We don't need objlib feature entirely with Meson.)
There was a problem hiding this comment.
Makes sense. I've still kept the top level dict for parity with CMake, and since some of the sources vary depending on compilation options (like arrow compute). Let me know if this is more in line with what you are thinking
cpp/src/arrow/meson.build
Outdated
There was a problem hiding this comment.
Ah, we should disable features that depend on IPC in cpp/src/arrow/compute/expression.cc like GH-45171.
Could you open an issue for this?
cpp/src/arrow/meson.build
Outdated
There was a problem hiding this comment.
We don't need it.
arrow.pc.in has the license header because it's in this repository. Files that don't exist in this repository don't need it.
cpp/src/arrow/util/meson.build
Outdated
There was a problem hiding this comment.
| install_dir: 'arrow', | |
| install_dir: 'arrow/util', |
cpp/src/arrow/util/meson.build
Outdated
There was a problem hiding this comment.
It seems that we don't need to use foreach here:
conf_data.set('ARROW_COMPUTE', false)
conf_data.set('ARROW_CSV', false)
...
cpp/src/arrow/util/meson.build
Outdated
There was a problem hiding this comment.
We don't need to install _internal.h:
| install: true, | |
| install_dir: 'arrow', |
|
Sure no problem |
b7a25a4 to
78d5fdb
Compare
|
@github-actions crossbow submit test-conda-cpp-meson |
|
Revision: 564df5f Submitted crossbow builds: ursacomputing/crossbow @ actions-527e5649ae
|
kou
left a comment
There was a problem hiding this comment.
I've rebased on main, added support for updating version in release process and added nightly CI job for Meson.
Could you update the PR description?
cpp/src/arrow/meson.build
Outdated
| # Meson does not natively support object libraries | ||
| # https://github.com/mesonbuild/meson/issues/13843 |
There was a problem hiding this comment.
Can we remove this comment?
We don't need this comment now, right?
cpp/src/arrow/meson.build
Outdated
| 'device.h', | ||
| 'extension_type.h', | ||
| 'memory_pool.h', | ||
| 'memory_pool_test.h', |
There was a problem hiding this comment.
| 'memory_pool_test.h', |
cpp/src/arrow/meson.build
Outdated
| 'extension_type.h', | ||
| 'memory_pool.h', | ||
| 'memory_pool_test.h', | ||
| 'pch.h', |
cpp/src/arrow/meson.build
Outdated
| 'stl_allocator.h', | ||
| 'stl.h', | ||
| 'stl_iterator.h', |
| meson test \ | ||
| --print-errorlogs \ | ||
| "$@" |
There was a problem hiding this comment.
We aren't building any tests now, are we? On the current CI:
+ meson test --print-errorlogs
No tests defined.
There was a problem hiding this comment.
For now we aren't actually building any tests - just creating an MRE for Meson with libarrow
I've addressed your feedback. Not sure I am doing what you are asking with respect to the PR description but I changed it to be more exact. Let me know if there is something else should be doing. |
We use the PR description for a commit message when we merge this. So we want to update the PR description to reflect the latest PR changes. BTW, it seems that you didn't change the PR description yet. (You might forget to save your PR description change.) |
|
@github-actions crossbow submit test-conda-cpp-meson |
|
Revision: 428c2f3 Submitted crossbow builds: ursacomputing/crossbow @ actions-3c4e1b441c
|
|
Do you mean the commit message? I can squash up and change that if so. On Github I've already change the PR title |
|
I mean the merge commit's commit message not commit messages of commits in this PRs. You don't need to squash commits in this PR. Because we use "Squash and merge" feature provided by GitHub. We use the following commit message: For example, ec3d283 's commit message uses #44375 's title and description. It doesn't use commit messages of commits in #44375 .
Could you also update the PR description? |
|
Ah got it - I've updated the OP |
|
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit c7a9100. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 6 possible false positives for unstable benchmarks that are known to sometimes produce them. |
Rationale for this change
The Meson build system may be more user friendly to some developers, and may make it easier to perform tasks like valgrind, coverage, or ASAN/UBSAN coverage. There is also a prior art for using meson in the nanoarrow and arrow-adbc projects.
What changes are included in this PR?
This PR implements a Meson configuration that can build a minimal libarrow.
Are these changes tested?
A nightly CI job is also added to detect regressions.
Are there any user-facing changes?
No