In the float case of tests/unit/warp/memory/tile/global_to_register.cu, when I change the validation atol to 0.0001, the validation reports mismatches.


Why would a load/store op reduce accuracy? I would expect a load followed by a store to leave the destination bit-identical to the source.
In addition, I changed the initialize function in tests/unit/testing_commons/testing_utils.cuh so that the elements are set to 0, 1, 2, ..., input_size - 1.

When I run the case again, it still mismatches.
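
To make my expectation concrete, here is a minimal host-only sketch of the check I have in mind. It is not the code from the repository: `make_iota`, `count_bit_mismatches`, and `count_atol_mismatches` are hypothetical helpers written only for illustration, and a plain copy stands in for the load/store round trip. Every integer below 2^24 is exactly representable as a float, so if the round trip really preserves bits, both counts should be zero for any atol.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: fill a buffer with 0, 1, 2, ..., n - 1.
// These values are exact in float as long as n <= 2^24.
static std::vector<float> make_iota(std::size_t n) {
    std::vector<float> v(n);
    for (std::size_t i = 0; i < n; ++i) v[i] = static_cast<float>(i);
    return v;
}

// Hypothetical helper: count elements whose 32-bit patterns differ.
static std::size_t count_bit_mismatches(const std::vector<float>& a,
                                        const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        std::uint32_t x, y;
        std::memcpy(&x, &a[i], sizeof x);  // reinterpret the bits without UB
        std::memcpy(&y, &b[i], sizeof y);
        if (x != y) ++mismatches;
    }
    return mismatches;
}

// Hypothetical helper: count elements that differ by more than atol.
static std::size_t count_atol_mismatches(const std::vector<float>& a,
                                         const std::vector<float>& b,
                                         float atol) {
    assert(a.size() == b.size());
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::fabs(a[i] - b[i]) > atol) ++mismatches;
    }
    return mismatches;
}
```

With `src = make_iota(input_size)` and `dst` produced by a bit-preserving copy of `src`, I would expect `count_bit_mismatches(src, dst)` and `count_atol_mismatches(src, dst, 0.0001f)` to both return 0. That is not what I see in the real test.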

I ran the case on an H100. Can you reproduce this?