Skip to content

Conversation

willat343
Copy link

@willat343 willat343 commented May 12, 2025

  • bumped GTSAM 4.2a9 (a 4.2.0 prerelease) to 4.2.0
  • added finding of GTSAM_UNSTABLE for preinstalled GTSAM option, as this was causing downstream linker errors
  • changed the src directory of the downloaded gtsam files to be within the src tree, while the tmp and build files are in the normal build directory
  • removed INSTALL_COMMAND as this is not necessary, and by using make install instead of cmake --build . --target install prevents the jobserver from propagating

@willat343 willat343 self-assigned this May 12, 2025
@willat343 willat343 requested review from nubertj, aravindev and tutunarsl and removed request for nubertj May 12, 2025 12:04
@nubertj
Copy link
Member

nubertj commented May 12, 2025

We need 882, 879 and 874 which have been introduced in 4.2a8 (see here https://github.com/borglab/gtsam/releases/tag/4.2a8).
I checked for a few minutes that these changes are now also in 4.2, and it seems they are, but would be great if you could double check. (they are not explicitly mentioned in the release, but when doing a diff I can't find any difference)

BINARY_DIR "${CMAKE_BINARY_DIR}/gtsam/build"
INSTALL_DIR "${CATKIN_DEVEL_PREFIX}"
CMAKE_CACHE_ARGS "-DCMAKE_POSITION_INDEPENDENT_CODE:BOOL=true"
CMAKE_ARGS
-DCMAKE_BUILD_TYPE=Release
-DGTSAM_POSE3_EXPMAP=ON
-DGTSAM_ROT3_EXPMAP=ON
-DGTSAM_USE_QUATERNIONS=ON
-DGTSAM_USE_SYSTEM_EIGEN=ON
-DGTSAM_BUILD_WITH_MARCH_NATIVE=OFF
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played around with the march native some time ago, and it was always a bit of a pain, but I would expect to get quite some boost in performance, in particular as this option builds it locally from source anyway (which is exactly what march native is made for)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you remind me the subtleties of march native? I thought it optimized performance according to system hardware?

In the ANYmal deployment, this may or may not be an issue, depending on how things are set up. In https://github.com/leggedrobotics/anymal_rsl/pull/189 I have set up GTSAM to be installed from source in the Dockerfile.npc, in which case GTSAM is built in docker on the robot, and here -march=native is ok. If instead we have a remote build and sync strategy (the default before the PR) then -march=native would optimise for the wrong system architecture potentially. Definitely if EV agrees with having GTSAM installed by the dockerfile and assuming the images that get pushed to dockerhub are built on the robot, then adding this option makes sense there. I can see it also making sense to have this as the default in gtsam_catkin once we know what is happening with that PR.

Another option would be specifying the architecture. Running gcc -Q --help=target | grep march -A 1 reveals the options and the native choice for your system. On my PC it is x86-64 but i think ANYmal might be something else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly, do we have some data to see how much performance improvement march_native brings in?
I would suggest that we create a debian generation script that helps us create and upload the debian into our deb repository. This can be included in the NPC installation scripts and would be a much nicer solution than embedding into npc docker image, which can increase the image size

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@willat343 we can have a quick discussion when you are free if you want some help to navigate this problem

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well if you have a debian I guess you don't want to enable march=native due to compatibility issues.
If it is compiled from source in the workspace, you probably want.

Speedup can be significant, up to 1.5-2x.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the difficulty with march native is that every library along the line has to be installed with this, as otherwise Eigen alignment is not correct, and you get some random segfaults from time to time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if it is one big cmake project and built in an isolated way it is usually easy to enable, as all the build flags are forwarded. However, if you build libraries independently in a catkin-style, it gets tricky, as you link against a library that was e.g. not built with march native enabled, creating issues.

)

# 2 - Check whether GTSAM is installed -------------------------------------------------------
# 2 - Check whether GTSAM is installed -------------------------------------------------------
elseif (GTSAM_FOUND)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just tested it, still having an issue with this option

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for some reason it does not find limetis-gtsam.so

libtbbmalloc.so.2 => /lib/x86_64-linux-gnu/libtbbmalloc.so.2 (0x000078145950c000) libmetis-gtsam.so => not found libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x0000781459504000)

Although the library is present in /usr/local/lib.
libgtsam.so and libtsam_untsable.so point to the correct location

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok this is interesting...

If you take a look at this file https://github.com/leggedrobotics/rsl_seam/blob/main/docker/Dockerfile.ros1 I install from source to ${HOME}/.local and explicitly call in my catkin workspace: catkin config --append-args --cmake-args -DGTSAM_DIR=${HOME}/.local/lib/cmake/GTSAM -DGTSAM_UNSTABLE_DIR=${HOME}/.local/lib/cmake/GTSAM_UNSTABLE

Could you try build the rsl_seam dockerfile and compare with your system? (instructions are in the README.md, fairly straightforward)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am referring to option 2, not 1 (the one where GTSAM is installed and can be found through find(...).

@willat343
Copy link
Author

We need 882, 879 and 874 which have been introduced in 4.2a8 (see here https://github.com/borglab/gtsam/releases/tag/4.2a8). I checked for a few minutes that these changes are now also in 4.2, and it seems they are, but would be great if you could double check. (they are not explicitly mentioned in the release, but when doing a diff I can't find any difference)

4.2 was the official release after the pre-releases up to 4.2a9. They would have to have reverted the change right? Also I'm building the whole of holistic fusion with 4.2 so does this prove that 4.2 is ok, or is it a runtime thing?

@nubertj
Copy link
Member

nubertj commented May 21, 2025

We need 882, 879 and 874 which have been introduced in 4.2a8 (see here https://github.com/borglab/gtsam/releases/tag/4.2a8). I checked for a few minutes that these changes are now also in 4.2, and it seems they are, but would be great if you could double check. (they are not explicitly mentioned in the release, but when doing a diff I can't find any difference)

4.2 was the official release after the pre-releases up to 4.2a9. They would have to have reverted the change right? Also I'm building the whole of holistic fusion with 4.2 so does this prove that 4.2 is ok, or is it a runtime thing?

They fixed some internal bugs. You won't notice from just the fact that it compiles.
I guess the question is whether they took all changes from 4.2a9 to 4.2, or whether some only appear with 4.3, but I think it seems to be fine.

BINARY_DIR "${CMAKE_BINARY_DIR}/gtsam/build"
INSTALL_DIR "${CATKIN_DEVEL_PREFIX}"
CMAKE_CACHE_ARGS "-DCMAKE_POSITION_INDEPENDENT_CODE:BOOL=true"
CMAKE_ARGS
-DCMAKE_BUILD_TYPE=Release
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-DCMAKE_BUILD_TYPE=Release
-DCMAKE_BUILD_TYPE=Release
-DGTSAM_WITH_TBB=OFF

This is super helpful in terms of performance. Reduces peak cpu load by a factor of 3 for me

@nubertj
Copy link
Member

nubertj commented Jun 11, 2025

I tried today compiling it and it seems to work, but it compiles suuuuuper slowly (only with one core).
image
After 10 minutes it only compiled 12% (when I canceled).

In contrast, the old gtsam_catkin version (main) compiled with multiple cores and only took 1min 28s to compile the entire library..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants