Merged
7 changes: 3 additions & 4 deletions CMakeLists.txt
@@ -320,10 +320,9 @@ if (TA_TTG)
endif(TA_TTG)
detect_MADNESS_configuration()
include(external/eigen.cmake)
# the FetchContent-based version will not work due to BLT target name conflicts
# include(${PROJECT_SOURCE_DIR}/cmake/modules/FindOrFetchUmpire.cmake)
# use the ExternalProject-based version
include(external/umpire.cmake)

include(${PROJECT_SOURCE_DIR}/cmake/modules/FindOrFetchUmpireCXXAllocator.cmake)
add_dependencies(External-tiledarray vrg-build-external-projects)

###### discover linear algebra

56 changes: 35 additions & 21 deletions INSTALL.md
@@ -2,16 +2,32 @@

## Synopsis

```.cpp
$ git clone https://github.com/ValeevGroup/TiledArray.git tiledarray
$ cd tiledarray
$ cmake -B build \
-D CMAKE_INSTALL_PREFIX=/path/to/tiledarray/install \
-D CMAKE_TOOLCHAIN_FILE=cmake/vg/toolchains/<toolchain-file-for-your-platform>.cmake \
.
$ cmake --build build
(recommended, but optional): $ cmake --build build --target check
$ cmake --build build --target install
Building and installing:
```c++
$ git clone https://github.com/ValeevGroup/tiledarray.git
$ cmake -S tiledarray -B tiledarray/build \
-D CMAKE_INSTALL_PREFIX=/path/to/tiledarray/install
(recommended, but optional): $ cmake --build tiledarray/build --target check
$ cmake --build tiledarray/build --target install
```
After this, TiledArray can be consumed from another project's CMake harness:
```cmake
find_package(tiledarray CONFIG REQUIRED)
target_link_libraries(your_executable_or_library_target PUBLIC tiledarray)
```
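
For context, a complete downstream project might look like the minimal sketch below. The project name `my_app`, the source file `main.cpp`, and the minimum CMake version are assumptions made for illustration; only `find_package(tiledarray CONFIG REQUIRED)` and the `tiledarray` target come from the snippet above.

```cmake
# CMakeLists.txt of a hypothetical downstream project
cmake_minimum_required(VERSION 3.14)  # assumed; use whatever your project requires
project(my_app)                       # hypothetical name; default languages are C and CXX

# Locate an installed TiledArray. If it lives in a non-default location,
# point CMake at the install prefix when configuring:
#   cmake -D CMAKE_PREFIX_PATH=/path/to/tiledarray/install ...
find_package(tiledarray CONFIG REQUIRED)

add_executable(my_app main.cpp)       # hypothetical target and source file
target_link_libraries(my_app PRIVATE tiledarray)
```

`PRIVATE` is used here because an executable has no consumers of its own; a library whose public headers include TiledArray headers would more likely use `PUBLIC`, as in the snippet above.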

Or simply build TiledArray from source within another project's CMake harness:
```cmake
find_package(tiledarray CONFIG)
if (NOT TARGET tiledarray)
  cmake_minimum_required(VERSION 3.14.0) # for FetchContent_MakeAvailable
  include(FetchContent)
  FetchContent_Declare(tiledarray
    GIT_REPOSITORY https://github.com/ValeevGroup/tiledarray
  )
  FetchContent_MakeAvailable(tiledarray)
endif()
target_link_libraries(your_executable_or_library_target PUBLIC tiledarray)
```
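
The declaration above names no revision, so the checkout follows whatever FetchContent selects by default. A project that wants reproducible builds could pin the revision instead; the variant below is a sketch only, and `<commit-hash-or-tag>` is a placeholder, not a revision recommended by this document.

```cmake
include(FetchContent)
FetchContent_Declare(tiledarray
  GIT_REPOSITORY https://github.com/ValeevGroup/tiledarray
  # placeholder: substitute a commit hash or tag that you have tested
  GIT_TAG        <commit-hash-or-tag>
)
FetchContent_MakeAvailable(tiledarray)
```

Pinning to a full commit hash rather than a branch name also lets CMake skip contacting the remote once that commit has been fetched.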

## Introduction
@@ -40,22 +56,21 @@ Both methods are supported. However, for most users we _strongly_ recommend to b
- Boost.Test: header-only or (optionally) as a compiled library, *only used for unit testing*
- Boost.Range: header-only, *only used for unit testing*
- [Range-V3](https://github.com/ericniebler/range-v3.git) -- a Ranges library that served as the basis for Ranges component of C++20 and later.
- [BTAS](http://github.com/ValeevGroup/BTAS), tag 62d57d9b1e0c733b4b547bc9cfdd07047159dbca . If usable BTAS installation is not found, TiledArray will download and compile
- [BTAS](http://github.com/ValeevGroup/BTAS) -- a generic dense local tensor framework. If a usable BTAS installation is not found, TiledArray will download and compile
BTAS from source. *This is the recommended way to compile BTAS for all users*.
- [MADNESS](https://github.com/m-a-d-n-e-s-s/madness), tag 8abd78b8a304a88b951449d8cb127f5a91f27721 .
Only the MADworld runtime and BLAS/LAPACK C API component of MADNESS is used by TiledArray.
- [MADNESS](https://github.com/m-a-d-n-e-s-s/madness) -- a multiresolution numerical calculus framework;
TiledArray only uses its distributed task-based programming model ("MADworld").
If a usable MADNESS installation is not found, TiledArray will download and compile
MADNESS from source. *This is the recommended way to compile MADNESS for all users*.
A detailed list of MADNESS prerequisites can be found at [MADNESS' INSTALL file](https://github.com/m-a-d-n-e-s-s/madness/blob/master/INSTALL_CMake);
it also contains detailed
MADNESS compilation instructions.
A detailed list of MADNESS dependencies can be found at [MADNESS' INSTALL file](https://github.com/m-a-d-n-e-s-s/madness/blob/master/INSTALL_CMake);
it also contains detailed MADNESS compilation instructions.
- [Umpire C++ allocator](https://github.com/ValeevGroup/umpire-cxx-allocator) -- a C++ allocator for [LLNL/Umpire](https://github.com/LLNL/Umpire), a portable memory manager. *It is recommended to let TiledArray build the Umpire C++ allocator and Umpire itself from source.*

Compiling MADNESS requires the following prerequisites:
- An implementation of Message Passing Interface version 2 or 3, with support
for `MPI_THREAD_MULTIPLE`.
- (optional)
Intel Thread Building Blocks (TBB), available in a [commercial](software.intel.com/tbb) or
an [open-source](https://www.threadingbuildingblocks.org/) form
- (recommended)
[PaRSEC](https://github.com/ICLDisco/parsec) -- a distributed programming model used for local task scheduling in MADNESS.

Compiling BTAS requires the following prerequisites:
- [blaspp](https://bitbucket.org/icl/blaspp.git) -- C++ API for BLAS
@@ -68,10 +83,9 @@ Optional prerequisites:
- [CUDA compiler and runtime](https://developer.nvidia.com/cuda-zone) -- for execution on NVIDIA's CUDA-enabled accelerators. CUDA 12 or later is required.
- [HIP/ROCm compiler and runtime](https://rocm.docs.amd.com/) -- for execution on AMD's ROCm-enabled accelerators. Note that TiledArray does not use ROCm directly but its C++ Heterogeneous-Compute Interface for Portability, `HIP`; although HIP can also be used to program CUDA-enabled devices, in TiledArray it is used only to program ROCm devices, hence ROCm and HIP will be used interchangeably.
- [LibreTT](https://github.com/victor-anisimov/LibreTT) -- free tensor transpose library for CUDA, ROCm, and SYCL platforms that is based on the [original cuTT library](https://github.com/ap-hynninen/cutt) extended to provide thread-safety improvements (via https://github.com/ValeevGroup/cutt) and extended to non-CUDA platforms by [@victor-anisimov](https://github.com/victor-anisimov) (tag 6eed30d4dd2a5aa58840fe895dcffd80be7fbece).
- [Umpire](github.com/LLNL/Umpire) -- portable memory manager for heterogeneous platforms (tag 8c85866107f78a58403e20a2ae8e1f24c9852287).
- [Doxygen](http://www.doxygen.nl/) -- for building documentation (version 1.8.12 or later).
- [ScaLAPACK](http://www.netlib.org/scalapack/) -- a distributed-memory linear algebra package. If detected, the following C++ components will also be sought and downloaded, if missing:
- [scalapackpp](https://github.com/wavefunction91/scalapackpp.git) -- a modern C++ wrapper for ScaLAPACK (tag 6397f52cf11c0dfd82a79698ee198a2fce515d81); pulls and builds the following additional prerequisite
- [scalapackpp](https://github.com/wavefunction91/scalapackpp.git) -- a modern C++ wrapper for ScaLAPACK; pulls and builds the following additional prerequisite
- [blacspp](https://github.com/wavefunction91/blacspp.git) -- a modern C++ wrapper for BLACS
- Python3 interpreter -- to test (optionally-built) Python bindings
- [TTG](https://github.com/TESSEorg/ttg.git) -- C++ implementation of the Template Task Graph programming model for fine-grained flow-graph composition of distributed memory programs (tag 3fe4a06dbf4b05091269488aab38223da1f8cb8e).
2 changes: 1 addition & 1 deletion README.md
@@ -99,7 +99,7 @@ $ cmake --build build
$ cmake --build build --target install
```
Here `<toolchain-file-for-your-platform>` is the appropriate toolchain file from [the Valeev Group CMake kit](https://github.com/ValeevGroup/kit-cmake/tree/master/toolchains); an alternative is
to provide your own toolchain file. On some standard platforms (e.g. MacOS) the toolchain file can be skipped.
to provide your own toolchain file. On most standard platforms (e.g. Ubuntu, MacOS) the toolchain file can be skipped.

The detailed instructions can be found in [INSTALL.md](https://github.com/ValeevGroup/tiledarray/blob/master/INSTALL.md).

40 changes: 40 additions & 0 deletions cmake/modules/FindOrFetchUmpireCXXAllocator.cmake
@@ -0,0 +1,40 @@
# try find_package
if (NOT TARGET umpire-cxx-allocator)
  include (FindPackageRegimport)
  find_package_regimport(umpire-cxx-allocator QUIET CONFIG)
  if (TARGET umpire-cxx-allocator)
    message(STATUS "Found umpire-cxx-allocator CONFIG at ${umpire-cxx-allocator_CONFIG}")
  endif (TARGET umpire-cxx-allocator)
endif (NOT TARGET umpire-cxx-allocator)

# if not found, build via FetchContent
if (NOT TARGET umpire-cxx-allocator)

  if (TA_CUDA)
    set(UMPIRE_ENABLE_CUDA ON CACHE BOOL "Enable CUDA support in Umpire")
  endif()
  if (TA_HIP)
    set(UMPIRE_ENABLE_HIP ON CACHE BOOL "Enable HIP support in Umpire")
  endif()

  include(FetchContent)
  FetchContent_Declare(
    umpire-cxx-allocator
    GIT_REPOSITORY https://github.com/ValeevGroup/umpire-cxx-allocator.git
    GIT_TAG ${TA_TRACKED_UMPIRE-CXX-ALLOCATOR_TAG}
  )
  FetchContent_MakeAvailable(umpire-cxx-allocator)
  FetchContent_GetProperties(umpire-cxx-allocator
    SOURCE_DIR UMPIRE-CXX-ALLOCATOR_SOURCE_DIR
    BINARY_DIR UMPIRE-CXX-ALLOCATOR_BINARY_DIR
  )

  # set umpire-cxx-allocator_CONFIG to the install location so that we know where to find it
  set(umpire-cxx-allocator_CONFIG ${CMAKE_INSTALL_PREFIX}/${UMPIRE-CXX-ALLOCATOR_CMAKE_DIR}/umpire-cxx-allocator-config.cmake)

endif(NOT TARGET umpire-cxx-allocator)

# postcond check
if (NOT TARGET umpire-cxx-allocator)
  message(FATAL_ERROR "FindOrFetchUmpireCXXAllocator could not make umpire-cxx-allocator target available")
endif(NOT TARGET umpire-cxx-allocator)
16 changes: 11 additions & 5 deletions cmake/tiledarray-config.cmake.in
@@ -38,9 +38,6 @@ if(NOT TARGET MADworld)
  include( CMakeFindDependencyMacro )
  find_dependency(MADNESS 0.10.1 CONFIG REQUIRED COMPONENTS world PATHS "${MADNESS_CONFIG_DIR}" NO_DEFAULT_PATH)
endif()
if(NOT TARGET tiledarray)
include("${CMAKE_CURRENT_LIST_DIR}/tiledarray-targets.cmake")
endif()

# if TA is a CUDA-dependent library it needs CUDA to link properly ... unfortunately CMake is not able to do this correctly
# see https://gitlab.kitware.com/cmake/cmake/issues/18614
@@ -66,15 +63,24 @@ if(TILEDARRAY_HAS_CUDA)
INTERFACE_LINK_LIBRARIES "${_ta_interface_libs}")
endif()

set(TILEDARRAY_HAS_SCALAPACK "@ENABLE_SCALAPACK@" )
if(TILEDARRAY_HAS_SCALAPACK)
set(TA_SCALAPACK "@TA_SCALAPACK@" )
if(TA_SCALAPACK)
  include( CMakeFindDependencyMacro )
  get_filename_component(blacspp_DIR "@blacspp_CONFIG@" DIRECTORY)
  find_dependency( blacspp CONFIG REQUIRED HINTS "${blacspp_DIR}" )
  get_filename_component(scalapackpp_DIR "@scalapackpp_CONFIG@" DIRECTORY)
  find_dependency( scalapackpp CONFIG REQUIRED HINTS "${scalapackpp_DIR}" )
endif()

if (NOT TARGET umpire-cxx-allocator)
  get_filename_component(umpire-cxx-allocator_DIR "@umpire-cxx-allocator_CONFIG@" DIRECTORY)
  find_dependency(umpire-cxx-allocator 1.0.0 QUIET CONFIG REQUIRED HINTS "${umpire-cxx-allocator_DIR}")
endif()

if(NOT TARGET tiledarray)
  include("${CMAKE_CURRENT_LIST_DIR}/tiledarray-targets.cmake")
endif()

# Set the tiledarray compiled library target
set(TILEDARRAY_LIBRARIES tiledarray)

13 changes: 7 additions & 6 deletions examples/device/device_task.cpp
@@ -9,8 +9,8 @@
#include <tiledarray.h>

using value_type = double;
using tensor_type = TA::btasUMTensorVarray<value_type>;
using tile_type = TA::Tile<tensor_type>;
using tensor_type = TiledArray::btasUMTensorVarray<value_type>;
using tile_type = TiledArray::Tile<tensor_type>;

/// verify that the elements in the tile are equal to value
void verify(const tile_type& tile, value_type value, std::size_t index) {
@@ -34,18 +34,19 @@ tile_type scale(const tile_type& arg, value_type a,
  using Storage = typename tile_type::tensor_type::storage_type;
  Storage result_storage;
  auto result_range = arg.range();
  make_device_storage(result_storage, arg.size(), stream);
  TiledArray::make_device_storage(result_storage, arg.size(), stream);

  typename tile_type::tensor_type result(std::move(result_range),
                                         std::move(result_storage));

  /// copy the original Tensor
  auto& queue = TiledArray::BLASQueuePool::queue(stream);

  blas::copy(result.size(), arg.data(), 1, device_data(result.storage()), 1,
             queue);
  blas::copy(result.size(), arg.data(), 1,
             TiledArray::device_data(result.storage()), 1, queue);

  blas::scal(result.size(), a, device_data(result.storage()), 1, queue);
  blas::scal(result.size(), a, TiledArray::device_data(result.storage()), 1,
             queue);

  // std::stringstream stream_str;
  // stream_str << stream;
5 changes: 0 additions & 5 deletions external/cuda.cmake
@@ -44,11 +44,6 @@ sanitize_cuda_implicit_directories()
message(STATUS "CMAKE Implicit Include Directories: ${CMAKE_CUDA_IMPLICIT_INCLUDE_DIRECTORIES}")
message(STATUS "CMAKE Implicit Link Directories: ${CMAKE_CUDA_IMPLICIT_LINK_DIRECTORIES}")

##
## Umpire
##
include(external/umpire.cmake)

##
## LibreTT
##
5 changes: 0 additions & 5 deletions external/hip.cmake
@@ -20,11 +20,6 @@ foreach (library hipblas;rocthrust)
endif()
endforeach()

##
## Umpire
##
include(external/umpire.cmake)

##
## LibreTT
##