While trying to cook up a test case that does this:
$ taco 'y(j) = A(i,j,k) * x(k)' -s='parallelize(j, CPUThread, Atomics)'
I noticed that the C++ API does not have the same default loop order as the command line tool. The command line tool seems to use (i,j,k) order by default, while the C++ API seems to use (j,i,k).
When using the (j,i,k) order, Taco generates code that accesses an iterator variable before it has been defined.
$ taco 'y(j) = A(i,j,k) * x(k)' -f=A:sss -s='reorder(j,i,k)'
[...]
for (int32_t jA = A2_pos[iA]; jA < A2_pos[(iA + 1)]; jA++) {
  int32_t j = A2_crd[jA];
  double tiy_val = 0.0;
  for (int32_t iA = A1_pos[0]; iA < A1_pos[1]; iA++) {
Note that the loop over jA takes its bounds from A2_pos[iA], but iA is not defined until the inner loop farther down.
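
For comparison, here is a hand-written sketch of what a well-formed loop nest for the same expression could look like in (i,j,k) order. It is not Taco's actual output. The function name ttv_ijk is hypothetical, and the arrays A3_pos, A3_crd and A_vals are assumed by analogy with the A1_pos/A2_pos/A2_crd names in the excerpt above; x and y are assumed dense. The point is only that with A stored as sss, the position range of level 2 is indexed by the level-1 position iA, so the iA loop has to enclose the jA loop unless the compiler transforms the access some other way.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference kernel for y(j) = A(i,j,k) * x(k), A in sss format.
 * A1_crd is omitted because i is reduced away and never used as an index. */
void ttv_ijk(const int32_t *A1_pos,
             const int32_t *A2_pos, const int32_t *A2_crd,
             const int32_t *A3_pos, const int32_t *A3_crd,
             const double *A_vals, const double *x,
             double *y, int32_t y_dim) {
  assert(y_dim >= 0);
  /* y is dense, so zero it before accumulating. */
  for (int32_t py = 0; py < y_dim; py++) {
    y[py] = 0.0;
  }
  /* Level 1 (i): iA is defined here, before anything reads A2_pos[iA]. */
  for (int32_t iA = A1_pos[0]; iA < A1_pos[1]; iA++) {
    /* Level 2 (j): the bounds depend on iA. */
    for (int32_t jA = A2_pos[iA]; jA < A2_pos[iA + 1]; jA++) {
      int32_t j = A2_crd[jA];
      double tk = 0.0;
      /* Level 3 (k): the bounds depend on jA. */
      for (int32_t kA = A3_pos[jA]; kA < A3_pos[jA + 1]; kA++) {
        int32_t k = A3_crd[kA];
        tk += A_vals[kA] * x[k];
      }
      /* Different i values can hit the same j, hence the accumulation. */
      y[j] += tk;
    }
  }
}
```

Because y[j] is updated from several iterations of the outer loop, this order is also the one where parallelizing needs atomics or a reduction, which is what the original test case was meant to exercise.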