Commit dc547f3

Merge pull request #528 from ValeevGroup/evaleev/feature/standalone-umpire-cxx-allocator
factored out Umpire's CMake harness and allocator adaptor to `ValeevGroup/umpire-cxx-allocator`
2 parents ee4c908 + 136ae09 commit dc547f3
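
For readers unfamiliar with the term, an "allocator adaptor" wraps a memory manager in the interface that standard containers expect. The sketch below is illustrative only: `CountingResource`, `ResourceAllocator`, and `live_blocks_while_filled` are hypothetical names, the resource is a plain `malloc` wrapper rather than Umpire, and nothing here is taken from `ValeevGroup/umpire-cxx-allocator`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical stand-in for a memory manager; this is NOT Umpire's API.
class CountingResource {
 public:
  void* allocate(std::size_t bytes) {
    void* p = std::malloc(bytes == 0 ? 1 : bytes);
    if (p == nullptr) throw std::bad_alloc{};
    ++live_;
    return p;
  }
  void deallocate(void* p) noexcept {
    std::free(p);
    --live_;
  }
  std::size_t live_allocations() const noexcept { return live_; }

 private:
  std::size_t live_ = 0;  // blocks handed out and not yet returned
};

// Minimal standard-style allocator that forwards every request to the resource.
template <typename T>
class ResourceAllocator {
 public:
  using value_type = T;

  explicit ResourceAllocator(CountingResource& r) noexcept : resource_(&r) {}

  // Converting constructor, needed so containers can rebind the allocator.
  template <typename U>
  ResourceAllocator(const ResourceAllocator<U>& other) noexcept
      : resource_(other.resource()) {}

  T* allocate(std::size_t n) {
    return static_cast<T*>(resource_->allocate(n * sizeof(T)));
  }
  void deallocate(T* p, std::size_t) noexcept { resource_->deallocate(p); }

  CountingResource* resource() const noexcept { return resource_; }

 private:
  CountingResource* resource_;
};

// Two allocators are interchangeable when they share a resource.
template <typename T, typename U>
bool operator==(const ResourceAllocator<T>& a,
                const ResourceAllocator<U>& b) noexcept {
  return a.resource() == b.resource();
}
template <typename T, typename U>
bool operator!=(const ResourceAllocator<T>& a,
                const ResourceAllocator<U>& b) noexcept {
  return !(a == b);
}

// Fills a vector through the adaptor and reports how many blocks are live
// while the vector still exists.
inline std::size_t live_blocks_while_filled(CountingResource& r, std::size_t n) {
  std::vector<double, ResourceAllocator<double>> v(n, 1.0,
                                                   ResourceAllocator<double>(r));
  assert(v.size() == n);
  return r.live_allocations();
}
```

Calling `live_blocks_while_filled(r, 8)` on a fresh `CountingResource r` reports at least one live block while the vector exists, and `r.live_allocations()` drops back to zero once it is destroyed. Keeping the adaptor independent of the container library is what makes it reasonable to host it in its own repository.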

18 files changed: 210 additions, 700 deletions

CMakeLists.txt

Lines changed: 3 additions & 4 deletions
@@ -320,10 +320,9 @@ if (TA_TTG)
 endif(TA_TTG)
 detect_MADNESS_configuration()
 include(external/eigen.cmake)
-# the FetchContent-based version will not work due to BLT target name conflicts
-# include(${PROJECT_SOURCE_DIR}/cmake/modules/FindOrFetchUmpire.cmake)
-# use the ExternalProject-based version
-include(external/umpire.cmake)
+
+include(${PROJECT_SOURCE_DIR}/cmake/modules/FindOrFetchUmpireCXXAllocator.cmake)
+add_dependencies(External-tiledarray vrg-build-external-projects)

 ###### discover linear algebra

INSTALL.md

Lines changed: 35 additions & 21 deletions
@@ -2,16 +2,32 @@

 ## Synopsis

-```.cpp
-$ git clone https://github.com/ValeevGroup/TiledArray.git tiledarray
-$ cd tiledarray
-$ cmake -B build \
-    -D CMAKE_INSTALL_PREFIX=/path/to/tiledarray/install \
-    -D CMAKE_TOOLCHAIN_FILE=cmake/vg/toolchains/<toolchain-file-for-your-platform>.cmake \
-    .
-$ cmake --build build
-(recommended, but optional): $ cmake --build build --target check
-$ cmake --build build --target install
+Building and installing:
+```c++
+$ git clone https://github.com/ValeevGroup/tiledarray.git
+$ cmake -S tiledarray -B tiledarray/build \
+    -D CMAKE_INSTALL_PREFIX=/path/to/tiledarray/install
+(recommended, but optional): $ cmake --build tiledarray/build --target check
+$ cmake --build tiledarray/build --target install
+```
+After this TA can be consumed from another project's CMake harness:
+```cmake
+find_package(tiledarray CONFIG REQUIRED)
+target_link_libraries(your_executable_or_library_target PUBLIC tiledarray)
+```
+
+Or simply build TiledArray from source within another project's CMake harness:
+```cmake
+find_package(tiledarray CONFIG)
+if (NOT TARGET tiledarray)
+  cmake_minimum_required(VERSION 3.14.0)  # for FetchContent_MakeAvailable
+  include(FetchContent)
+  FetchContent_Declare(tiledarray
+    GIT_REPOSITORY https://github.com/ValeevGroup/tiledarray
+  )
+  FetchContent_MakeAvailable(tiledarray)
+endif()
+target_link_libraries(your_executable_or_library_target PUBLIC tiledarray)
 ```

 ## Introduction
@@ -40,22 +56,21 @@ Both methods are supported. However, for most users we _strongly_ recommend to b
 - Boost.Test: header-only or (optionally) as a compiled library, *only used for unit testing*
 - Boost.Range: header-only, *only used for unit testing*
 - [Range-V3](https://github.com/ericniebler/range-v3.git) -- a Ranges library that served as the basis for Ranges component of C++20 and later.
-- [BTAS](http://github.com/ValeevGroup/BTAS), tag 62d57d9b1e0c733b4b547bc9cfdd07047159dbca . If usable BTAS installation is not found, TiledArray will download and compile
+- [BTAS](http://github.com/ValeevGroup/BTAS) -- a generic dense local tensor framework. If usable BTAS installation is not found, TiledArray will download and compile
   BTAS from source. *This is the recommended way to compile BTAS for all users*.
-- [MADNESS](https://github.com/m-a-d-n-e-s-s/madness), tag 8abd78b8a304a88b951449d8cb127f5a91f27721 .
-  Only the MADworld runtime and BLAS/LAPACK C API component of MADNESS is used by TiledArray.
+- [MADNESS](https://github.com/m-a-d-n-e-s-s/madness) -- a multiresolution numerical calculus framework,
+  TiledArray only uses its distributed task-based programming model ("MADworld")
   If usable MADNESS installation is not found, TiledArray will download and compile
   MADNESS from source. *This is the recommended way to compile MADNESS for all users*.
-  A detailed list of MADNESS prerequisites can be found at [MADNESS' INSTALL file](https://github.com/m-a-d-n-e-s-s/madness/blob/master/INSTALL_CMake);
-  it also also contains detailed
-  MADNESS compilation instructions.
+  A detailed list of MADNESS dependencies can be found at [MADNESS' INSTALL file](https://github.com/m-a-d-n-e-s-s/madness/blob/master/INSTALL_CMake);
+  it also also contains detailed MADNESS compilation instructions.
+- [Umpire C++ allocator](github.com/ValeevGroup/umpire-cxx-allocator) -- a C++ allocator for [LLNL/Umpire](https://github.com/LLNL/Umpire), a portable memory manager. *It is recommended to let TiledArray build the Umpire C++ allocator and Umpire itself from source.*

 Compiling MADNESS requires the following prerequisites:
 - An implementation of Message Passing Interface version 2 or 3, with support
   for `MPI_THREAD_MULTIPLE`.
-- (optional)
-  Intel Thread Building Blocks (TBB), available in a [commercial](software.intel.com/tbb) or
-  an [open-source](https://www.threadingbuildingblocks.org/) form
+- (recommended)
+  [PaRSEC](https://github.com/ICLDisco/parsec) -- a distributed programming model used for local task scheduling in MADNESS.

 Compiling BTAS requires the following prerequisites:
 - [blaspp](https://bitbucket.org/icl/blaspp.git) -- C++ API for BLAS
@@ -68,10 +83,9 @@ Optional prerequisites:
 - [CUDA compiler and runtime](https://developer.nvidia.com/cuda-zone) -- for execution on NVIDIA's CUDA-enabled accelerators. CUDA 12 or later is required.
 - [HIP/ROCm compiler and runtime](https://developer.nvidia.com/cuda-zone) -- for execution on AMD's ROCm-enabled accelerators. Note that TiledArray does not use ROCm directly but its C++ Heterogeneous-Compute Interface for Portability, `HIP`; although HIP can also be used to program CUDA-enabled devices, in TiledArray it is used only to program ROCm devices, hence ROCm and HIP will be used interchangeably.
 - [LibreTT](github.com/victor-anisimov/LibreTT) -- free tensor transpose library for CUDA, ROCm, and SYCL platforms that is based on the [original cuTT library](github.com/ap-hynninen/cutt) extended to provide thread-safety improvements (via github.com/ValeevGroup/cutt) and extended to non-CUDA platforms by [@victor-anisimov](github.com/victor-anisimov) (tag 6eed30d4dd2a5aa58840fe895dcffd80be7fbece).
-- [Umpire](github.com/LLNL/Umpire) -- portable memory manager for heterogeneous platforms (tag 8c85866107f78a58403e20a2ae8e1f24c9852287).
 - [Doxygen](http://www.doxygen.nl/) -- for building documentation (version 1.8.12 or later).
 - [ScaLAPACK](http://www.netlib.org/scalapack/) -- a distributed-memory linear algebra package. If detected, the following C++ components will also be sought and downloaded, if missing:
-  - [scalapackpp](https://github.com/wavefunction91/scalapackpp.git) -- a modern C++ wrapper for ScaLAPACK (tag 6397f52cf11c0dfd82a79698ee198a2fce515d81); pulls and builds the following additional prerequisite
+  - [scalapackpp](https://github.com/wavefunction91/scalapackpp.git) -- a modern C++ wrapper for ScaLAPACK; pulls and builds the following additional prerequisite
     - [blacspp](https://github.com/wavefunction91/blacspp.git) -- a modern C++ wrapper for BLACS
 - Python3 interpreter -- to test (optionally-built) Python bindings
 - [TTG](https://github.com/TESSEorg/ttg.git) -- C++ implementation of the Template Task Graph programming model for fine-grained flow-graph composition of distributed memory programs (tag 3fe4a06dbf4b05091269488aab38223da1f8cb8e).

README.md

Lines changed: 1 addition & 1 deletion
@@ -99,7 +99,7 @@ $ cmake --build build
 $ cmake --build build --target install
 ```
 Here `<toolchain-file-for-your-platform>` is the appropriate toolchain file from [the Valeev Group CMake kit](https://github.com/ValeevGroup/kit-cmake/tree/master/toolchains); an alternative is
-to provide your own toolchain file. On some standard platforms (e.g. MacOS) the toolchain file can be skipped.
+to provide your own toolchain file. On most standard platforms (e.g. Ubuntu, MacOS) the toolchain file can be skipped.

 The detailed instructions can be found in [INSTALL.md](https://github.com/ValeevGroup/tiledarray/blob/master/INSTALL.md).

cmake/modules/FindOrFetchUmpireCXXAllocator.cmake

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+# try find_package
+if (NOT TARGET umpire-cxx-allocator)
+  include (FindPackageRegimport)
+  find_package_regimport(umpire-cxx-allocator QUIET CONFIG)
+  if (TARGET umpire-cxx-allocator)
+    message(STATUS "Found umpire-cxx-allocator CONFIG at ${umpire-cxx-allocator_CONFIG}")
+  endif (TARGET umpire-cxx-allocator)
+endif (NOT TARGET umpire-cxx-allocator)
+
+# if not found, build via FetchContent
+if (NOT TARGET umpire-cxx-allocator)
+
+  if (TA_CUDA)
+    set(UMPIRE_ENABLE_CUDA ON CACHE BOOL "Enable CUDA support in Umpire")
+  endif()
+  if (TA_HIP)
+    set(UMPIRE_ENABLE_HIP ON CACHE BOOL "Enable HIP support in Umpire")
+  endif()
+
+  include(FetchContent)
+  FetchContent_Declare(
+      umpire-cxx-allocator
+      GIT_REPOSITORY https://github.com/ValeevGroup/umpire-cxx-allocator.git
+      GIT_TAG ${TA_TRACKED_UMPIRE-CXX-ALLOCATOR_TAG}
+  )
+  FetchContent_MakeAvailable(umpire-cxx-allocator)
+  FetchContent_GetProperties(umpire-cxx-allocator
+      SOURCE_DIR UMPIRE-CXX-ALLOCATOR_SOURCE_DIR
+      BINARY_DIR UMPIRE-CXX-ALLOCATOR_BINARY_DIR
+  )
+
+  # set umpire-cxx-allocator_CONFIG to the install location so that we know where to find it
+  set(umpire-cxx-allocator_CONFIG ${CMAKE_INSTALL_PREFIX}/${UMPIRE-CXX-ALLOCATOR_CMAKE_DIR}/umpire-cxx-allocator-config.cmake)
+
+endif(NOT TARGET umpire-cxx-allocator)
+
+# postcond check
+if (NOT TARGET umpire-cxx-allocator)
+  message(FATAL_ERROR "FindOrFetchUmpireCXXAllocator could not make umpire-cxx-allocator target available")
+endif(NOT TARGET umpire-cxx-allocator)

cmake/tiledarray-config.cmake.in

Lines changed: 11 additions & 5 deletions
@@ -38,9 +38,6 @@ if(NOT TARGET MADworld)
   include( CMakeFindDependencyMacro )
   find_dependency(MADNESS 0.10.1 CONFIG REQUIRED COMPONENTS world PATHS "${MADNESS_CONFIG_DIR}" NO_DEFAULT_PATH)
 endif()
-if(NOT TARGET tiledarray)
-  include("${CMAKE_CURRENT_LIST_DIR}/tiledarray-targets.cmake")
-endif()

 # if TA is a CUDA-dependent library it needs CUDA to link properly ... unfortunately CMake is not able to do this correctly
 # see https://gitlab.kitware.com/cmake/cmake/issues/18614
@@ -66,15 +63,24 @@ if(TILEDARRAY_HAS_CUDA)
       INTERFACE_LINK_LIBRARIES "${_ta_interface_libs}")
 endif()

-set(TILEDARRAY_HAS_SCALAPACK "@ENABLE_SCALAPACK@" )
-if(TILEDARRAY_HAS_SCALAPACK)
+set(TA_SCALAPACK "@TA_SCALAPACK@" )
+if(TA_SCALAPACK)
   include( CMakeFindDependencyMacro )
   get_filename_component(blacspp_DIR "@blacspp_CONFIG@" DIRECTORY)
   find_dependency( blacspp CONFIG REQUIRED HINTS "${blacspp_DIR}" )
   get_filename_component(scalapackpp_DIR "@scalapackpp_CONFIG@" DIRECTORY)
   find_dependency( scalapackpp CONFIG REQUIRED HINTS "${scalapackpp_DIR}" )
 endif()

+if (NOT TARGET umpire-cxx-allocator)
+  get_filename_component(umpire-cxx-allocator_DIR "@umpire-cxx-allocator_CONFIG@" DIRECTORY)
+  find_dependency(umpire-cxx-allocator 1.0.0 QUIET CONFIG REQUIRED HINTS "${umpire-cxx-allocator_DIR}")
+endif()
+
+if(NOT TARGET tiledarray)
+  include("${CMAKE_CURRENT_LIST_DIR}/tiledarray-targets.cmake")
+endif()
+
 # Set the tiledarray compiled library target
 set(TILEDARRAY_LIBRARIES tiledarray)

examples/device/device_task.cpp

Lines changed: 7 additions & 6 deletions
@@ -9,8 +9,8 @@
 #include <tiledarray.h>

 using value_type = double;
-using tensor_type = TA::btasUMTensorVarray<value_type>;
-using tile_type = TA::Tile<tensor_type>;
+using tensor_type = TiledArray::btasUMTensorVarray<value_type>;
+using tile_type = TiledArray::Tile<tensor_type>;

 /// verify the elements in tile is equal to value
 void verify(const tile_type& tile, value_type value, std::size_t index) {
@@ -34,18 +34,19 @@ tile_type scale(const tile_type& arg, value_type a,
   using Storage = typename tile_type::tensor_type::storage_type;
   Storage result_storage;
   auto result_range = arg.range();
-  make_device_storage(result_storage, arg.size(), stream);
+  TiledArray::make_device_storage(result_storage, arg.size(), stream);

   typename tile_type::tensor_type result(std::move(result_range),
                                          std::move(result_storage));

   /// copy the original Tensor
   auto& queue = TiledArray::BLASQueuePool::queue(stream);

-  blas::copy(result.size(), arg.data(), 1, device_data(result.storage()), 1,
-             queue);
+  blas::copy(result.size(), arg.data(), 1,
+             TiledArray::device_data(result.storage()), 1, queue);

-  blas::scal(result.size(), a, device_data(result.storage()), 1, queue);
+  blas::scal(result.size(), a, TiledArray::device_data(result.storage()), 1,
+             queue);

   //  std::stringstream stream_str;
   //  stream_str << stream;
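
The commit does not say why `make_device_storage` and `device_data` now need the `TiledArray::` prefix. One plausible explanation, stated here as an assumption, is argument-dependent lookup (ADL): an unqualified call is resolved only in the namespaces associated with its argument types, so once a type moves to another project's namespace the unqualified call stops compiling. The sketch below shows the effect with hypothetical namespaces `vendor` and `app`; it does not use TiledArray's types.

```cpp
#include <cassert>

// Hypothetical namespace owning the storage type after it was factored out.
namespace vendor {
struct Storage {
  double* ptr;
};
}  // namespace vendor

// Hypothetical consuming library; the helper lives here, not next to the type.
namespace app {
inline double* device_data(vendor::Storage& s) { return s.ptr; }
}  // namespace app

inline double* get_ptr(vendor::Storage& s) {
  // An unqualified `device_data(s)` would not compile here: ADL searches only
  // namespace `vendor`, which declares no such function.
  return app::device_data(s);
}
```

Given `double x = 2.0; vendor::Storage s{&x};`, the call `get_ptr(s)` returns `&x`. Qualifying the call explicitly makes the code independent of which namespaces ADL happens to search.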

external/cuda.cmake

Lines changed: 0 additions & 5 deletions
@@ -44,11 +44,6 @@ sanitize_cuda_implicit_directories()
 message(STATUS "CMAKE Implicit Include Directories: ${CMAKE_CUDA_IMPLICIT_INCLUDE_DIRECTORIES}")
 message(STATUS "CMAKE Implicit Link Directories: ${CMAKE_CUDA_IMPLICIT_LINK_DIRECTORIES}")

-##
-## Umpire
-##
-include(external/umpire.cmake)
-
 ##
 ## LibreTT
 ##

external/hip.cmake

Lines changed: 0 additions & 5 deletions
@@ -20,11 +20,6 @@ foreach (library hipblas;rocthrust)
   endif()
 endforeach()

-##
-## Umpire
-##
-include(external/umpire.cmake)
-
 ##
 ## LibreTT
 ##
