If you want to contribute to spconv 2.x, feel free to start a new discussion on GitHub, or just email me.
- TF32 support
- Make `ConvAlgo.Native` runnable in KRSC layout, and use only this layout in the future
- PyTorch Int8 Support
- Move most functions in `spconv.pytorch.ops` to C++
- Ampere multi-stage gemm support
- Optimize CUDA Kernels for small-channel-size layers
- nvrtc support for gemm/conv kernels
- C++ only spconv
- TensorRT support
- Test spconv 2.x in torch-points3d and other frameworks
- Documentation on GitHub Pages
- Better tests
- TF32 support
We only need to add TF32 tensor core support to cumm; not hard.
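For reference, PyTorch already exposes TF32 through backend flags for its dense ops; a minimal sketch of what enabling TF32 looks like from the PyTorch side (these flags are existing PyTorch APIs — how spconv itself will expose a switch is still open):

```python
import torch

# PyTorch's existing TF32 switches for dense matmul / cuDNN ops.
# TF32 keeps float32's 8-bit exponent but truncates the mantissa
# to 10 bits inside the tensor cores; data stays float32 in memory.
torch.backends.cuda.matmul.allow_tf32 = True
torch.backends.cudnn.allow_tf32 = True

a = torch.randn(1024, 1024, device="cuda")
b = torch.randn(1024, 1024, device="cuda")
c = a @ b  # runs on TF32 tensor cores on Ampere+ when the flag is set
```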
- Make `ConvAlgo.Native` runnable in KRSC layout
Add a stride argument to the gemm kernels, then use offset + stride to make the kernel treat the KRSC weight layout as a "KC" matrix; see the NumPy sketch below.
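As a plain-NumPy illustration of that offset + stride trick (shapes here are made up for the example): for a fixed kernel position (r, s), a (K, R, S, C) weight is already a K x C matrix if the row stride is kept at R*S*C elements.

```python
import numpy as np

# Hypothetical KRSC weight: K output channels, RxS kernel, C input channels.
K, R, S, C = 8, 3, 3, 4
w_krsc = np.arange(K * R * S * C, dtype=np.float32).reshape(K, R, S, C)

r, s = 1, 2                       # one spatial position of the kernel
flat = w_krsc.reshape(-1)
offset = (r * S + s) * C          # start of the (r, s) slice in flat memory
itemsize = flat.itemsize

# View the slice as a K x C matrix: row stride jumps a full R*S*C block,
# column stride is one element -- no copy, just offset + strides.
kc = np.lib.stride_tricks.as_strided(
    flat[offset:], shape=(K, C),
    strides=(R * S * C * itemsize, itemsize))

assert np.array_equal(kc, w_krsc[:, r, s, :])
```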
- PyTorch Int8 Support
...
- Move most functions in `spconv.pytorch.ops` to C++
Pure engineering work.
- Ampere multi-stage gemm support
Not easy; we need a new pattern to write gemm kernels, sketched below.
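For intuition only: the multi-stage pattern keeps a ring of shared-memory buffers filled by asynchronous copies (cp.async) several tiles ahead of the math. Here is a toy Python simulation of just that schedule (the stage count and names are assumptions; the real implementation is CUDA code in cumm):

```python
from collections import deque

NUM_STAGES = 3          # assumed depth of the shared-memory buffer ring
num_tiles = 8           # K-dimension tiles of the gemm main loop

in_flight = deque()     # models committed cp.async groups

def issue_load(tile):   # stands in for cp.async into buffer tile % NUM_STAGES
    in_flight.append(tile)
    print(f"load  tile {tile} -> buffer {tile % NUM_STAGES}")

def wait_oldest():      # stands in for cp.async.wait_group on the oldest copy
    tile = in_flight.popleft()
    print(f"wait  tile {tile}")

# Prologue: fill NUM_STAGES - 1 buffers before any math starts.
for t in range(NUM_STAGES - 1):
    issue_load(t)

# Main loop: compute on the oldest ready tile while later loads are in flight.
for t in range(num_tiles):
    if t + NUM_STAGES - 1 < num_tiles:
        issue_load(t + NUM_STAGES - 1)   # keep the pipeline full
    wait_oldest()
    print(f"mma   tile {t}")
```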
- Optimize CUDA Kernels for small-channel-size layers
Modify cumm to support small kernels. Not hard, but needs time.
- nvrtc support for gemm/conv kernels
We need to rewrite the kernel params in cumm. Not easy.
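For context, runtime compilation from Python with NVIDIA's cuda-python bindings looks roughly like this (the toy kernel and architecture flag are placeholders; cumm would generate the real kernel source and handle launching itself):

```python
from cuda import nvrtc

# A toy kernel standing in for a generated gemm/conv kernel.
src = r"""
extern "C" __global__ void scale(float* x, float a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= a;
}
"""

err, prog = nvrtc.nvrtcCreateProgram(src.encode(), b"scale.cu", 0, [], [])
opts = [b"--gpu-architecture=compute_80"]   # placeholder arch
err, = nvrtc.nvrtcCompileProgram(prog, len(opts), opts)
err, ptx_size = nvrtc.nvrtcGetPTXSize(prog)
ptx = b" " * ptx_size
err, = nvrtc.nvrtcGetPTX(prog, ptx)
# `ptx` now holds PTX that can be loaded with cuModuleLoadData and launched.
```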
- C++ only spconv
Actually the code generation is easy; we can finish this quickly after moving ops to C++.
- TensorRT support
TensorRT support is the last feature in this plan. It needs lots of engineering work and prerequisites, and may take a long time.