maddyscientist edited this page Dec 15, 2021 · 2 revisions

Development is in the feature/sycl branch: https://github.com/lattice/quda/tree/feature/sycl

Changes relative to develop can be seen in PR #1168: https://github.com/lattice/quda/pull/1168

Outstanding changes/issues

  • BlockReduce calls simplified to make it easier to implement in SYCL
  • reducer_t types added for reductions (reducer.h, transform_reduce.cuh)
  • multi-BLAS Args may not fit within max_kernel_arg_size (max_constant_size is used instead)
  • quda_target.h needs to be included from quda_internal.h
  • block size in the dslash_coarse kernel must evenly divide the number of threads
  • FAST_COMPILE_REDUCE versions of block_orthogonalize.cu and restrictor.cu can't use a block size larger than max_block_size
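
The divisibility item above most likely reflects a general SYCL rule: in an nd_range launch, the global range must be a multiple of the work-group (local) range in each dimension. The sketch below shows two common ways to satisfy that constraint in plain C++: padding the thread count up to a multiple of the block size, or picking the largest block size that divides the thread count. The helper names (`divides_evenly`, `padded_global_size`, `largest_dividing_block`) are hypothetical and are not part of QUDA; how dslash_coarse actually handles this is not stated here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not QUDA API): true when `block` evenly divides `threads`.
inline bool divides_evenly(std::size_t threads, std::size_t block)
{
  return block != 0 && threads % block == 0;
}

// Hypothetical helper: round `threads` up to the next multiple of `block`.
// A kernel launched with the padded size needs a bounds check
// (e.g. `if (id >= threads) return;`) for the extra work-items.
inline std::size_t padded_global_size(std::size_t threads, std::size_t block)
{
  assert(block != 0);
  return ((threads + block - 1) / block) * block;
}

// Hypothetical helper: largest block size <= max_block that evenly divides
// `threads`. Falls back to 1, which always divides.
inline std::size_t largest_dividing_block(std::size_t threads, std::size_t max_block)
{
  for (std::size_t b = max_block; b > 1; --b) {
    if (threads % b == 0) return b;
  }
  return 1;
}
```

Padding keeps the preferred block size at the cost of a few idle work-items, while choosing a dividing block avoids the bounds check but can yield a small block when the thread count has few factors.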