Feature/reproducible #1446

Open
wants to merge 40 commits into
base: develop

maddyscientist
Member

(This isn't ready to be merged; I'm creating this PR as a placeholder.)

…oduce new CMake-set types: real_t (QUDA_SCALAR_TYPE), the host-side scalar precision; complex_t, the complex version of this (replaces Complex); and device_reduce_t (QUDA_REDUCTION_TYPE). Eventually we will be able to set these to non-double types, but we're not there yet.
…educe_t are different types, e.g., double vs doubledouble
… a different type (needed when copying from deviation_t<doubledouble> to deviation_t<double>, for example)
…ble (need to split into 64-bit words) and small generic cleanup
…so updates the coalesced writing to sysmem to work with large reduce_t types, such that sizeof(device_reduce_t) / sizeof(atomic_type<device_reduce_t>) > warp_size, which previously was a restriction: we now use a warp-stride loop to do a coalesced write to sysmem
…MP at present and just a simple gather method for now
…to real_t done after the multi-process reduction
…r direct comparisons, use max error not error sum when multiple norms are used to check correctness, print out the deviation when verbosity >= QUDA_VERBOSE
…itations representing this being WIP (bin bounds LUT repeatedly recomputed on the host, bin bounds LUT presently in explicit constant, CG reduction not supported, warp reductions rather register heavy, etc.)
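
For readers unfamiliar with why the reduction type might differ from the host scalar type (e.g., double vs doubledouble, as in the commits above), here is a minimal host-side sketch of a double-double accumulator. It is not taken from this PR: `doubledouble_sketch`, `two_sum`, `accumulate`, `sum_naive` and `sum_dd` are hypothetical names, and the sketch uses the standard TwoSum error-free transformation rather than whatever the branch actually implements. It only illustrates accumulating in a wider type and converting to the scalar type after the reduction.

```cpp
#include <vector>

// Hypothetical stand-in for a double-double value (not QUDA's type):
// the represented number is hi + lo, with lo holding the rounding error.
struct doubledouble_sketch {
  double hi = 0.0;
  double lo = 0.0;
};

// Knuth's TwoSum: s is the rounded sum, e the exact rounding error,
// so that a + b == s + e holds exactly (barring overflow).
inline void two_sum(double a, double b, double &s, double &e)
{
  s = a + b;
  double bb = s - a;
  e = (a - (s - bb)) + (b - bb);
}

// Add one double into the accumulator, carrying the error term along.
inline void accumulate(doubledouble_sketch &acc, double x)
{
  double s, e;
  two_sum(acc.hi, x, s, e);
  e += acc.lo;
  // Renormalise so that hi carries the leading part again.
  acc.hi = s + e;
  acc.lo = e - (acc.hi - s);
}

// Plain double accumulation, for comparison.
inline double sum_naive(const std::vector<double> &v)
{
  double sum = 0.0;
  for (double x : v) sum += x;
  return sum;
}

// Accumulate in the wider type; convert to double only at the end.
inline double sum_dd(const std::vector<double> &v)
{
  doubledouble_sketch acc;
  for (double x : v) accumulate(acc, x);
  return acc.hi + acc.lo;
}
```

With the inputs {1.0, 1e100, 1.0, -1e100} and the reordering {1e100, 1.0, -1e100, 1.0}, `sum_naive` returns 0 and 1 respectively, while `sum_dd` returns 2 in both cases. This shows reduced sensitivity to summation order for this input only; it is not a general bit-reproducibility guarantee.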
@maddyscientist maddyscientist requested review from a team as code owners March 18, 2024 23:20