Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cmake] Using Protobuf v27 make python pybind11 test failed #4307

Open
Mizux opened this issue Jul 10, 2024 · 3 comments
Open

[cmake] Using Protobuf v27 make python pybind11 test failed #4307

Mizux opened this issue Jul 10, 2024 · 3 comments
Assignees
Labels
Bug Build: CMake CMake based build issue dependencies Pull requests that update a dependency file Lang: Python Python wrapper issue
Milestone

Comments

@Mizux
Copy link
Collaborator

Mizux commented Jul 10, 2024

Tests impacted

here the list of impacted tests...

To test:

cmake -S. -Bbuild -DBUILD_PYTHON=ON -DBUILD_CXX_SAMPLES=OFF -DBUILD_CXX_EXAMPLES=OFF
cmake --build build -j 8
(cd build && ctest -R "python_")
...
96% tests passed, 13 tests failed out of 332

Total Test time (real) = 349.67 sec

The following tests FAILED:
	11 - python_pdlp_pdlp_test (Subprocess aborted)
	14 - python_routing_model_test (Subprocess aborted)
	19 - python_scheduling_rcpsp_test (Subprocess aborted)
	21 - python_math_opt_solver_test (Subprocess aborted)
	170 - python_math_opt_advanced_linear_programming (Subprocess aborted)
	171 - python_math_opt_basic_example (Subprocess aborted)
	172 - python_math_opt_cutting_stock (Subprocess aborted)
	173 - python_math_opt_integer_programming (Subprocess aborted)
	174 - python_math_opt_linear_programming (Subprocess aborted)
	175 - python_math_opt_linear_regression (Subprocess aborted)
	176 - python_math_opt_time_indexed_scheduling (Subprocess aborted)
	178 - python_pdlp_simple_pdlp_program (Subprocess aborted)
	439 - python_python_rcpsp_sat (Subprocess aborted)
Errors while running CTest
Output from these tests are in: /usr/local/google/home/corentinl/work/main/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

note: using Protobuf 58baeb4c3b664f8918d24cef5151083d9da9767c everything pass...
Regression seem to have been introduced by http://cl/602725967

Investigation

Testing one broken test

cmake -S. -Bbuild -DBUILD_PYTHON=ON -DBUILD_CXX_SAMPLES=OFF -DBUILD_CXX_EXAMPLES=OFF
cmake --build build -j 8
source build/python/venv/bin/activate
python ortools/routing/python/model_test.py

Trace:

Running tests under Python 3.12.2: /usr/local/google/home/corentinl/work/main/build/python/venv/bin/python
[ RUN      ] ModelTest.testConstantDimensionTSP
[       OK ] ModelTest.testConstantDimensionTSP
[ RUN      ] ModelTest.testCtor
<ortools.routing.python.model.RoutingModel object at 0x7faea8f9d4b0>
[       OK ] ModelTest.testCtor
[ RUN      ] ModelTest.testDimensionTSP
[       OK ] ModelTest.testDimensionTSP
[ RUN      ] ModelTest.testDimensionWithVehicleCapacitiesTSP
[       OK ] ModelTest.testDimensionWithVehicleCapacitiesTSP
[ RUN      ] ModelTest.testDimensionWithVehicleTransitsTSP
[       OK ] ModelTest.testDimensionWithVehicleTransitsTSP
[ RUN      ] ModelTest.testDimensionWithVehicleTransitsVRP
[       OK ] ModelTest.testDimensionWithVehicleTransitsVRP
[ RUN      ] ModelTest.testDisjunctionPenaltyTSP
[       OK ] ModelTest.testDisjunctionPenaltyTSP
[ RUN      ] ModelTest.testDisjunctionTSP
[       OK ] ModelTest.testDisjunctionTSP
[ RUN      ] ModelTest.testMatrixDimensionTSP
[       OK ] ModelTest.testMatrixDimensionTSP
[ RUN      ] ModelTest.testMatrixDimensionVRP
[       OK ] ModelTest.testMatrixDimensionVRP
[ RUN      ] ModelTest.testRoutingLocalSearchFiltering
First solution statistics:
                  | Time (s)
PATH_CHEAPEST_ARC | 3.3e-05
Local search filter statistics (PATH_CHEAPEST_ARC):
                      |     Calls |   Rejects | Time (s) | Rejects/s
VehicleVariableFilter |        12 |         0 | 8.3e-06  |       0
 VariableDomainFilter |        12 |         0 | 7.8e-06  |       0
                Total |        24 |         0 | 1.6e-05  |       0

[       OK ] ModelTest.testRoutingLocalSearchFiltering
[ RUN      ] ModelTest.testRoutingModelParameters
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
F0000 00:00:1720594007.067703 3589844 generated_message_reflection.cc:3620] Check failed: file != nullptr 
*** Check failure stack trace: ***
Fatal Python error: Aborted

Current thread 0x00007faeaa0d0040 (most recent call first):
  File "/usr/local/google/home/corentinl/work/main/ortools/routing/python/model_test.py", line 650 in testRoutingModelParameters
  File "/usr/lib/python3.12/unittest/case.py", line 589 in _callTestMethod
  File "/usr/lib/python3.12/unittest/case.py", line 634 in run
  File "/usr/lib/python3.12/unittest/case.py", line 690 in __call__
  File "/usr/lib/python3.12/unittest/suite.py", line 122 in run
  File "/usr/lib/python3.12/unittest/suite.py", line 84 in __call__
  File "/usr/lib/python3.12/unittest/suite.py", line 122 in run
  File "/usr/lib/python3.12/unittest/suite.py", line 84 in __call__
  File "/usr/lib/python3.12/unittest/runner.py", line 240 in run
  File "/usr/local/google/home/corentinl/work/main/build/python/venv/lib/python3.12/site-packages/absl/testing/_pretty_print_reporter.py", line 82 in run
  File "/usr/lib/python3.12/unittest/main.py", line 281 in runTests
  File "/usr/lib/python3.12/unittest/main.py", line 105 in __init__
  File "/usr/local/google/home/corentinl/work/main/build/python/venv/lib/python3.12/site-packages/absl/testing/absltest.py", line 2653 in _run_and_get_tests_result
  File "/usr/local/google/home/corentinl/work/main/build/python/venv/lib/python3.12/site-packages/absl/testing/absltest.py", line 2689 in run_tests
  File "/usr/local/google/home/corentinl/work/main/build/python/venv/lib/python3.12/site-packages/absl/testing/absltest.py", line 2234 in main_function
  File "/usr/local/google/home/corentinl/work/main/build/python/venv/lib/python3.12/site-packages/absl/app.py", line 254 in _run_main
  File "/usr/local/google/home/corentinl/work/main/build/python/venv/lib/python3.12/site-packages/absl/app.py", line 308 in run
  File "/usr/local/google/home/corentinl/work/main/build/python/venv/lib/python3.12/site-packages/absl/testing/absltest.py", line 2236 in _run_in_app
  File "/usr/local/google/home/corentinl/work/main/build/python/venv/lib/python3.12/site-packages/absl/testing/absltest.py", line 2131 in main
  File "/usr/local/google/home/corentinl/work/main/ortools/routing/python/model_test.py", line 675 in <module>

Extension modules: google._upb._message (total: 1)
zsh: IOT instruction  python ortools/routing/python/model_test.py

parameters = model.default_routing_model_parameters()
parameters.solver_parameters.CopyFrom(
constraint_solver.Solver.default_solver_parameters()
)

Protobuf Version Python Test
v26.1 Pass
v27.0 Failed
v27.1 Failed
v27.2 Failed

continue to investigate...

ref: https://github.com/protocolbuffers/protobuf/blob/63def39e881afa496502d9c410f4ea948e59490d/src/google/protobuf/generated_message_reflection.cc#L3616-L3620

@Mizux Mizux added Bug Lang: Python Python wrapper issue Build: CMake CMake based build issue dependencies Pull requests that update a dependency file labels Jul 10, 2024
@Mizux Mizux added this to the v10.0 milestone Jul 10, 2024
@Mizux Mizux self-assigned this Jul 10, 2024
@Mizux
Copy link
Collaborator Author

Mizux commented Jul 10, 2024

Custom Protobuf

cloning Protobuf locally and playing with SHA1/patch

git clone [email protected]:protocolbuffers/protobuf.git
cd protobuf
# reset to the commit to test
git reset --hard d21425d334f965650c3c66362b0d827797f059db

# Change -dev version suffix using v27.0 commit otherwise at runtime protobuf check version suffix and refuse to start
# we use tig to select the commit to cherry pick
tig v27.0
# inside tig press `C` to cherrypick, `Y` to confirm, then quit with `q`
# fix patch using mergetool
git mergetool
# remove .orig
git clean -n
git clean -f

# Patch CMake stuff
# note: could be protobuf-v27.patch
git apply --3way ~/work/main/patches/protobuf-v26.patch

modify or-tools

diff --git a/cmake/dependencies/CMakeLists.txt b/cmake/dependencies/CMakeLists.txt
index a5cfdeee13..fbd75fe544 100644
--- a/cmake/dependencies/CMakeLists.txt
+++ b/cmake/dependencies/CMakeLists.txt
@@ -106,13 +106,21 @@ if(BUILD_Protobuf)
   #set(protobuf_BUILD_LIBUPB ON)
   FetchContent_Declare(
     Protobuf
-    GIT_REPOSITORY "https://github.com/protocolbuffers/protobuf.git"
+    SOURCE_DIR "${CMAKE_SOURCE_DIR}/pb"

-    GIT_SHALLOW TRUE
+    #GIT_SHALLOW TRUE
     GIT_SUBMODULES ""

-    PATCH_COMMAND git apply --ignore-whitespace "${CMAKE_CURRENT_LIST_DIR}/../../patches/protobuf-v27.patch")
+    #PATCH_COMMAND git apply --ignore-whitespace "${CMAKE_CURRENT_LIST_DIR}/../../patches/protobuf-v27.patch"
+    #PATCH_COMMAND git apply --ignore-whitespace "${CMAKE_CURRENT_LIST_DIR}/../../patches/protobuf-v26.patch"
+  )
   FetchContent_MakeAvailable(Protobuf)
   list(POP_BACK CMAKE_MESSAGE_INDENT)
   message(CHECK_PASS "fetched")

note: I've use a symbolic link in the root dir of my or-tools local repo.
ln -s <path_to_hacked_protobuf> pb

In or-tools:

# clean protobuf build
rm -rf build/_deps/protobuf-build build/_deps/protobuf-src build/_deps/protobuf-subbuild 
# rebuild
cmake --build build  -j 8

# test
source build/python/venv/bin/activate
python ortools/routing/python/model_test.py

Dev Note

Rename the 'includingDefaultValueWithoutPresenceFields' and 'always_print_without_presence_fields' to 'alwaysPrintFieldsWithNoPresence':

You also, may need to revert this commit e0a4dcf
or apply this diff

diff --git a/ortools/util/file_util.cc b/ortools/util/file_util.cc
index db638ca8f9..fcb661fcf1 100644
--- a/ortools/util/file_util.cc
+++ b/ortools/util/file_util.cc
@@ -165,7 +165,7 @@ absl::Status WriteProtoToFile(absl::string_view filename,
     case ProtoWriteFormat::kJson: {
       google::protobuf::util::JsonPrintOptions options;
       options.add_whitespace = true;
-      options.always_print_fields_with_no_presence = true;
+      options.always_print_primitive_fields = true;
       options.preserve_proto_field_names = true;
       if (!google::protobuf::util::MessageToJsonString(proto, &output_string,
                                                        options)

if you are before this commit:

Fun Fact: While release v26.1 contains this patch, it is NOT available in the v26-dev tag...
https://github.com/protocolbuffers/protobuf/blob/v26.1/src/google/protobuf/json/json.h#L46
https://github.com/protocolbuffers/protobuf/blob/v26-dev/src/google/protobuf/json/json.h#L46

See only commit between a and b[

To only list commit [a, b[ you can use:

tig a...b

Results

Using a bisect approach to find the first borken commit(s)...

Protobuf Version CMake Patch Test Date
v27.2 protobuf-v27.patch FAIL 2024
v27.1 protobuf-v27.patch FAIL 2024
v27.0 protobuf-v27.patch FAIL 2024/05/22
... ... ... ...
<v27-dev> protobuf-v27.patch FAIL 2024/04/17
... ... ... ...
01312f9c34a779f2a4e3327822b33d85f7a64002 protobuf-v26.patch FAIL 2024/02/06
... ... ... ...
4687ef335f29736fa0ff83d6421ea2bbdbdf1aa5 protobuf-v26.patch FAIL 2024/02/01
... ... ... ...
21ab7459ee43c0d5079e9f7708d453f68a5fd0e5 protobuf-v26.patch FAIL 2024/01/31
... ... ... ...
6f1d88107f268b8ebdad6690d116e74c403e366e protobuf-v26.patch FAIL 2024/01/30
... ... ... ...
4df6e042eca08159ee395853d8c250469a35db72 protobuf-v26.patch FAIL 2024/01/30
... ... ... ...
f4511fda5a0ad44e84848077da83f2ae9b21830e protobuf-v26.patch FAIL 2024/01/30
... ... ... ...
49c83ab799977592744f50df93957ddec1e12e70 protobuf-v26.patch FAIL 2024/01/30
abc9bae7c19d9fff4b0be0ffd3a01f947d3b487a protobuf-v26.patch NA(0) 2024/01/30
58baeb4c3b664f8918d24cef5151083d9da9767c protobuf-v26.patch PASS 2024/01/30
... ... ... ...
d21425d334f965650c3c66362b0d827797f059db protobuf-v26.patch PASS 2024/01/30
... ... ... ...
9146ce6ddba3b74ad16a0f440f9cf9d73ba282c7 protobuf-v26.patch PASS 2024/01/25
... ... ... ...
<v26-dev> protobuf-v26.patch NA(1)
  1. protocolbuffers/protobuf@49c83ab contains auto generated code after the commit protocolbuffers/protobuf@abc9bae so I guess this one is not atomic and can't be tested...
  2. compile fail didn't manage to generate a binary, but v26-dev is not in the v26.1 which contains commit after this tag

@Mizux
Copy link
Collaborator Author

Mizux commented Jul 31, 2024

TLDR: using Protobuf (cmake based build) as a "Static" library will lead to an ODR violation, until now this ODR was without impact/side effect but the Protobuf change this behaviour and make it crash the runtime.

@magneticflux-
Copy link

Also seems to break fzn-cp-sat:

F0801 20:51:14.314146 2793569 generated_message_reflection.cc:3620] Check failed: file != nullptr

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Build: CMake CMake based build issue dependencies Pull requests that update a dependency file Lang: Python Python wrapper issue
Projects
Status: In progress
Development

No branches or pull requests

2 participants