Feature/multi rhs #1318

Merged Oct 25, 2022 (77 commits)
Changes from 71 commits
1da7761
Add initial multi-RHS interface for Dirac and DiracMatrix classes
maddyscientist Aug 12, 2022
9c83b1a
tunecache is not auto-dumped if the time since last tuning exceeds 12…
maddyscientist Aug 12, 2022
ab149d5
Some initial cleanup of the eigen solver
maddyscientist Aug 12, 2022
f95ec63
Minor cleanup of multi_reduce.cu
maddyscientist Aug 14, 2022
854f184
Add support for non-sum reductions in blas reduce functors. Add two …
maddyscientist Aug 20, 2022
64c4cd3
Update is_enabled precision helper to use function arg instead of tem…
maddyscientist Aug 20, 2022
587f57e
Add support to make_set to handle std::vector<T*> to std::vector<refe…
maddyscientist Aug 21, 2022
fd979fb
Some dilution test cleanup
maddyscientist Aug 21, 2022
bbbbd7c
Update VectorIO::load/save functions to internally use vector of ref_…
maddyscientist Aug 21, 2022
ce6380a
math_helper should be included in blas_helper to ensure it is included
maddyscientist Aug 22, 2022
32a389d
Fix compilation warning
maddyscientist Aug 22, 2022
75a8034
Move some QUDA feature enablement to quda_define.h.in instead of requ…
maddyscientist Aug 22, 2022
6615c1d
Added new test file io_test: this does unit testing on i/o functionli…
maddyscientist Aug 22, 2022
0602fe7
deflation.cpp should use VectorIO class for load/saving vectors
maddyscientist Aug 22, 2022
4350fc0
Simplify eigensolver.cpp
maddyscientist Aug 22, 2022
450b359
Replace the blas ax_ functor with axy_ which allows for reading from …
maddyscientist Aug 23, 2022
66ecbe7
Improve error verbosity in multi-BLAS _U/_L functions
maddyscientist Aug 24, 2022
f1da382
Add ColorSpinorField_cref alias for const fields
maddyscientist Aug 24, 2022
5bf5c3f
Update multi-blas interface function to accept vectors of ColorSpinor…
maddyscientist Aug 24, 2022
1a0d4d4
Mark ColorSpinorField/LatticeField copy and move constructors to be n…
maddyscientist Aug 24, 2022
250cb83
Fix issue with 5bf5c3f66f48ac9bc36edae19de915b386548586, where for fu…
maddyscientist Aug 25, 2022
4c88d29
Add custom resize functions for std::vector<ColorSpinorField> instanc…
maddyscientist Aug 25, 2022
e6d40a7
Use custom resize function in blas_test and dslash_test: this signifi…
maddyscientist Aug 25, 2022
96a2cb1
Rewrite of eigensolvers to utilize vector of fields instead of vector…
maddyscientist Aug 27, 2022
d0e9855
Merge branch 'hotfix/arpack' of github.com:lattice/quda into feature/…
maddyscientist Aug 28, 2022
6a06844
Add SVD computation to ARPACK path
cpviolator Aug 29, 2022
29b49ac
Merge branch 'feature/multi-rhs' of https://github.com/lattice/quda i…
cpviolator Aug 29, 2022
06885a4
Fix resize bug
maddyscientist Aug 29, 2022
53b0edb
Add function to allow for ARPACK sorting, use 'A' param to return Rit…
cpviolator Aug 29, 2022
0ce2afd
Merge branch 'feature/multi-rhs' of https://github.com/lattice/quda i…
cpviolator Aug 29, 2022
dc740a3
Promote block axpy functions to be templated on coefficient type: thi…
maddyscientist Aug 30, 2022
68929f8
The member functions rotateVecs and blockRotate are now generic, and …
maddyscientist Aug 30, 2022
6a5aacd
Minor clean up of ARPACK eigen/singular value computation.
cpviolator Aug 30, 2022
b20749c
Minor clean up of ARPACK eigen/singular value computation.
cpviolator Aug 30, 2022
95bfc5c
Add init member to LatticeFieldParam, which is used to check if the p…
maddyscientist Aug 31, 2022
1b02e05
Generalize pair constructor for reference_wrapper to a variadic cons…
maddyscientist Aug 31, 2022
260d4ec
Remove all use of std::vector<ColorSpinorField*> from regular solvers
maddyscientist Aug 31, 2022
46039b5
Merge branch 'feature/multi-rhs' of github.com:lattice/quda into feat…
maddyscientist Aug 31, 2022
ed01e57
Delete accidental inclusion
maddyscientist Aug 31, 2022
ab28f80
Minor fix for max_deviation functor
maddyscientist Aug 31, 2022
efa1bc8
Added FieldTmp class for caching temporary fields. Removed all exp…
maddyscientist Sep 1, 2022
429ba11
Add missing header from last commit
maddyscientist Sep 1, 2022
41d4b6b
Remove ColorSpinorField_ref type: replaced all usage using vector_ref
maddyscientist Sep 1, 2022
7c0066a
Update multi-RHS Dirac interface to use vector_ref
maddyscientist Sep 1, 2022
7ceb35b
Cleanup for blas functions: added missing doxygen, all blas function …
maddyscientist Sep 2, 2022
f18c843
Remove now unnecessary const_cast usage when calling blas functions -…
maddyscientist Sep 2, 2022
83ed9b5
Initial multi-RHS awareness added to block Lanczos
maddyscientist Sep 2, 2022
cd85a7a
Add deflation to bicgstab
cpviolator Sep 2, 2022
228473d
Merge branch 'feature/multi-rhs' of https://github.com/lattice/quda i…
cpviolator Sep 2, 2022
b78a87c
Remove 'mat' argument from eigensolver member functions, use 'mat' ref…
cpviolator Sep 2, 2022
f3576c8
Make blas::zero function multi-RHS aware. Fix bug in EigenSolver::ro…
maddyscientist Sep 2, 2022
5cb7fc6
Fix PCG when no preconditioner is used
maddyscientist Sep 3, 2022
2a7fc1d
Merge branch 'develop' of github.com:lattice/quda into feature/multi-rhs
maddyscientist Sep 7, 2022
deb42f0
Fix issue with field cache: needs to be nuked if the comm partitionin…
maddyscientist Sep 7, 2022
add330a
Remove last remnants of Dirac::tmp1 / tmp2
maddyscientist Sep 8, 2022
b40818a
Fix for VectorIO interface, and add WAR for multigrid's use of Vector…
maddyscientist Sep 8, 2022
1359142
Add missing string header inclusion
maddyscientist Sep 8, 2022
a9b6a0e
Fix Jenkins error with BiCGStab
maddyscientist Sep 8, 2022
2e87cb7
Add PCG to the invert_test list
maddyscientist Sep 8, 2022
77018ba
Update multi-RHS interface functions to use const vector_ref lvalue in…
maddyscientist Sep 13, 2022
f7c2b15
Created cvector_ref alias for const vector_ref
maddyscientist Sep 16, 2022
ca7035f
Fix some long-standing issues in PCG, and apply some cleanup
maddyscientist Sep 16, 2022
0790fd0
Tweak default parameter max_res_increase to ensure convergence for al…
maddyscientist Sep 16, 2022
27c5b07
Fix bug when running io_test with multiple processes
maddyscientist Sep 20, 2022
284f951
Fix bug with dslash_ctest: test_split_grid is not yet initialized whe…
maddyscientist Sep 20, 2022
b37ed33
Address Mathias' review comments
maddyscientist Sep 21, 2022
dbf647e
Add Schwarz solvers to the invert_test gtest list. Change default CA…
maddyscientist Sep 21, 2022
c2ea87f
fix a few typos
mathiaswagner Sep 21, 2022
5ca4bc2
Fix bug in invert_test, introduced when adding Schwarz preconditioner…
maddyscientist Sep 27, 2022
6df7059
Delete dead code
maddyscientist Sep 27, 2022
327d369
Merge branch 'feature/multi-rhs' of github.com:lattice/quda into feat…
maddyscientist Sep 27, 2022
c0c9457
clean up headers, fix typo
mathiaswagner Sep 28, 2022
cdc47f9
Remove commented code
maddyscientist Sep 28, 2022
d46a748
Merge branch 'feature/multi-rhs' of github.com:lattice/quda into feat…
maddyscientist Sep 28, 2022
92a770c
Fix compiler warnings when QUDA_INTERFACE_NVTX=ON
maddyscientist Oct 5, 2022
b607f03
Remove some legacy std::vector<std::reference_wrapper<ColorSpinorField>>
maddyscientist Oct 5, 2022
a595b02
Merge branch 'develop' of github.com:lattice/quda into feature/multi-rhs
maddyscientist Oct 5, 2022
9 changes: 5 additions & 4 deletions include/blas_helper.cuh
@@ -5,6 +5,7 @@
 #include <convert.h>
 #include <float_vector.h>
 #include <array.h>
+#include <math_helper.cuh>
 
 //#define QUAD_SUM
 #ifdef QUAD_SUM
@@ -428,12 +429,12 @@ namespace quda
 
 template <template <typename...> class Functor,
 template <template <typename...> class, typename store_t, typename y_store_t, int, typename> class Blas,
-bool mixed, typename T, typename x_store_t, typename V, typename... Args>
-constexpr std::enable_if_t<mixed, void> instantiate(const T &a, const T &b, const T &c, V &x_, V &y_,
+bool mixed, typename T, typename x_store_t, typename Vx, typename Vy, typename... Args>
+constexpr std::enable_if_t<mixed, void> instantiate(const T &a, const T &b, const T &c, Vx &x_, Vy &y_,
 Args &&... args)
 {
-unwrap_t<V> &x(x_);
-unwrap_t<V> &y(y_);
+unwrap_t<Vx> &x(x_);
+unwrap_t<Vy> &y(y_);
 
 if (y.Precision() < x.Precision()) errorQuda("Y precision %d not supported", y.Precision());
 
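The change to `instantiate` in this hunk replaces the single template parameter `V` with separate `Vx` and `Vy`, so the types of `x_` and `y_` are deduced independently. The sketch below illustrates one plausible motivation: passing a read-only `x` together with a writable `y`. It is not QUDA code. `Field`, `unwrap`, `axpy_single`, `axpy_split` and `demo` are hypothetical stand-ins invented for this illustration, and the real `unwrap_t` may be defined differently.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a field type; this is not QUDA's ColorSpinorField.
struct Field {
  int precision;
  double value;
};

// Hypothetical stand-in for unwrap_t: strips std::reference_wrapper and
// leaves every other type unchanged.
template <typename T> struct unwrap { using type = T; };
template <typename T> struct unwrap<std::reference_wrapper<T>> { using type = T; };
template <typename T> using unwrap_t = typename unwrap<T>::type;

// Old shape: one parameter V, so x_ and y_ must have exactly the same type.
template <typename V> void axpy_single(double a, V &x_, V &y_)
{
  unwrap_t<V> &x(x_);
  unwrap_t<V> &y(y_);
  y.value += a * x.value;
}

// New shape: Vx and Vy are deduced separately, so x may be const while y is not.
template <typename Vx, typename Vy> void axpy_split(double a, Vx &x_, Vy &y_)
{
  unwrap_t<Vx> &x(x_);
  unwrap_t<Vy> &y(y_);
  y.value += a * x.value;
}

// Usage: y = y + a * x with a const input wrapped by std::cref.
double demo()
{
  const Field x {8, 2.0};
  Field y {8, 1.0};
  auto xr = std::cref(x); // std::reference_wrapper<const Field>
  auto yr = std::ref(y);  // std::reference_wrapper<Field>

  // axpy_single(3.0, xr, yr); // does not compile: V is deduced as two different types
  axpy_split(3.0, xr, yr);

  assert(y.value == 7.0); // 1.0 + 3.0 * 2.0
  return y.value;
}
```

With a single `V`, template argument deduction fails as soon as the two arguments differ in constness or wrapper type. Splitting the parameter keeps the function body unchanged while accepting mixed argument types.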