Zinoex (Contributor) commented Sep 29, 2025

I have a use case where I need to iterate over the columns of a sparse matrix inside a kernel and access each column's nonzero indices and values, which fails without the functions added in this PR. Preferably, I would like to avoid manually keeping track of the internals during the iteration.

Below is an MWE that fails to compile without the added functions. The concrete use case is more complicated than the MWE, with additional logic and computation and multiple threads per column, but none of that is needed to show the error.

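using CUDA, SparseArrays

function mwe()
    # 1000×1000 sparse matrix with ~1% density, moved to the GPU
    A = cu(sprand(1000, 1000, 0.01))
    @cuda threads=256 kernel(A)
end

function kernel(A)
    # Each thread handles columns i, i + blockDim().x, ... (single block)
    i = threadIdx().x
    while i <= size(A, 2)
        col = @view A[:, i]
        inds = rowvals(col)       # row indices of the stored entries
        vals = nonzeros(col)      # values of the stored entries
        num_nonzeros = nnz(col)

        # Guard against empty columns before indexing into inds
        if num_nonzeros > 0
            v = col[inds[1]]
        end

        i += blockDim().x
    end
    return nothing
end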
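
For comparison, here is a minimal CPU-side sketch of the manual bookkeeping that the column view is meant to hide. It uses only the documented SparseArrays functions `rowvals`, `nonzeros`, and `nzrange` on the parent matrix; the helper name `column_sums` is hypothetical and not part of this PR.

```julia
using SparseArrays

# Hypothetical helper: sums the stored entries of each column of a CSC matrix
# by walking the compressed storage directly.
function column_sums(A::SparseMatrixCSC{Tv}) where {Tv}
    vals = nonzeros(A)            # values of all stored entries, column by column
    out = zeros(Tv, size(A, 2))
    for j in axes(A, 2)
        # nzrange(A, j) gives the positions in vals (and rowvals(A))
        # that belong to column j
        for k in nzrange(A, j)
            out[j] += vals[k]     # rowvals(A)[k] would be the row of this entry
        end
    end
    return out
end

A = sprand(10, 10, 0.5)
column_sums(A) ≈ vec(sum(A; dims = 1))
```

With a column view, the inner loop would instead run over `nonzeros(col)` and `rowvals(col)` without touching `nzrange` on the parent matrix.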

github-actions bot (Contributor) commented Sep 30, 2025

Your PR requires formatting changes to meet the project's style guidelines.
Please consider running Runic (git runic master) to apply these changes.

Suggested changes:
diff --git a/lib/cusparse/device.jl b/lib/cusparse/device.jl
index 90a7b5e6f..aa6778d5d 100644
--- a/lib/cusparse/device.jl
+++ b/lib/cusparse/device.jl
@@ -23,7 +23,7 @@ Base.length(g::CuSparseDeviceVector) = g.len
 Base.size(g::CuSparseDeviceVector) = (g.len,)
 SparseArrays.nnz(g::CuSparseDeviceVector) = g.nnz
 
-struct CuSparseDeviceMatrixCSC{Tv,Ti,A} <: SparseArrays.AbstractSparseMatrixCSC{Tv,Ti}
+struct CuSparseDeviceMatrixCSC{Tv, Ti, A} <: SparseArrays.AbstractSparseMatrixCSC{Tv, Ti}
     colPtr::CuDeviceVector{Ti, A}
     rowVal::CuDeviceVector{Ti, A}
     nzVal::CuDeviceVector{Tv, A}
diff --git a/test/libraries/cusparse/device.jl b/test/libraries/cusparse/device.jl
index 650a065e4..5b15c7bc8 100644
--- a/test/libraries/cusparse/device.jl
+++ b/test/libraries/cusparse/device.jl
@@ -39,7 +39,7 @@ end
             end
 
             out = CuVector{Ti}(undef, size(A, 2))
-            @cuda threads=size(A, 2) nnz_per_column_kernel(out, A)
+            @cuda threads = size(A, 2) nnz_per_column_kernel(out, A)
             out
         end
 
@@ -56,7 +56,7 @@ end
             function sum_per_column_kernel(out, A)
                 j = blockIdx().x
                 col = @view A[:, j]
-                
+
                 v = zero(Tv)
                 i = threadIdx().x
                 while i <= SparseArrays.nnz(col)
@@ -72,11 +72,11 @@ end
             end
 
             out = CuVector{Tv}(undef, size(A, 2))
-            @cuda threads=32 blocks=size(A, 2) sum_per_column_kernel(out, A)
+            @cuda threads = 32 blocks = size(A, 2) sum_per_column_kernel(out, A)
             out
         end
 
-        sum_per_column(A::SparseMatrixCSC) = vec(sum(A; dims=1))
+        sum_per_column(A::SparseMatrixCSC) = vec(sum(A; dims = 1))
 
         A = sprand(10, 10, 0.5)
         cuA = CuSparseMatrixCSC(A)
@@ -94,7 +94,7 @@ end
             end
 
             out = CuVector{Ti}(undef, size(A, 2))
-            @cuda threads=size(A, 2) last_nz_per_column_kernel(out, A)
+            @cuda threads = size(A, 2) last_nz_per_column_kernel(out, A)
             out
         end
 
@@ -105,4 +105,4 @@ end
 
         @test last_nz_per_column(A) == Vector(last_nz_per_column(cuA))
     end
-end
\ No newline at end of file
+end

maleadt (Member) commented Oct 8, 2025

Conceptually LGTM, and this is something SparseArrays.jl supports (and specializes), so seems useful to have. @kshyatt may have other comments though.

In any case, this would need a test. Can you add some covering this functionality?

maleadt added the "cuda array", "needs tests", and "enhancement" labels Oct 8, 2025
Zinoex (Contributor, Author) commented Oct 8, 2025

@maleadt I'm not sure where you want the tests placed or what style of tests, but I have added 3 tests for the added functionality. Please let me know if you want me to change something.

maleadt force-pushed the fm/cusparse_device_columnview branch from 118e6fc to ccc14f9 October 9, 2025 08:34
maleadt merged commit f7deec6 into JuliaGPU:master Oct 9, 2025
2 of 3 checks passed
Zinoex deleted the fm/cusparse_device_columnview branch October 9, 2025 13:24