SPCONV_DEVELOP_PLAN.md

Spconv 2.x Develop Plan

If you want to contribute to spconv 2.x, feel free to start a new discussion on GitHub, or just email me.

v2.2 Core Features

  • TF32 support
  • Make ConvAlgo.Native runnable in KRSC layout, and use only this layout in the future
  • PyTorch Int8 Support

v2.3 Core Features

  • Move most of the functions in spconv.pytorch.ops to C++
  • Ampere multi-stage gemm support
  • Optimize CUDA kernels for small-channel-size layers

v2.4 Core Features

  • nvrtc support for gemm/conv kernels
  • C++ only spconv
  • TensorRT support

Misc Features That Need Contributions

  • Test spconv 2.x in torch-points3d and other frameworks
  • Documentation on GitHub Pages
  • Better tests

Details

  1. TF32 support

We only need to add TF32 tensor core support to cumm. Not hard.
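
For contributors unfamiliar with TF32, the sketch below illustrates the precision involved: TF32 keeps the 8-bit exponent of float32 but only the top 10 mantissa bits. The helper name `to_tf32_truncate` is hypothetical, and truncation is an assumption made for simplicity; the rounding actually applied by hardware or by cumm may differ.

```python
import struct


def to_tf32_truncate(x):
    """Emulate TF32 precision by zeroing the low 13 mantissa bits of a float32.

    Assumption: plain truncation; real conversions may round differently.
    """
    # Reinterpret the float32 value as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Keep sign (1 bit), exponent (8 bits), and the top 10 mantissa bits.
    bits &= 0xFFFFE000
    # Reinterpret the masked bits as a float32 again.
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for value in (1.0, 1.0 + 2.0 ** -10, 1.0 + 2.0 ** -11):
        print(value, "->", to_tf32_truncate(value))
```

The smallest step above 1.0 that survives is 2^-10, which is why kernels using TF32 typically keep accumulation in float32.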

  1. Make ConvAlgo.Native runnable in KRSC layout

Add a stride argument to the gemm kernels, then use offset + stride to make a gemm kernel treat the KRSC layout as a "KC" matrix.
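
The offset + stride idea can be sketched in plain Python. This is only an illustration of the indexing, not cumm code: the sizes and the helper names (`krsc_index`, `kc_view`, `read_kc`) are hypothetical. For a contiguous KRSC weight, the slice at kernel offset (r, s) is a K x C matrix whose first element sits at offset (r * S + s) * C and whose rows are R * S * C elements apart.

```python
# Hypothetical sizes, chosen only for illustration.
K, R, S, C = 2, 3, 3, 4  # out channels, kernel height, kernel width, in channels

# Flat, contiguous KRSC weight buffer filled with distinct values.
weight = list(range(K * R * S * C))


def krsc_index(k, r, s, c):
    """Flat index of element (k, r, s, c) in a contiguous KRSC buffer."""
    return ((k * R + r) * S + s) * C + c


def kc_view(r, s):
    """Return (offset, row_stride) of the K x C matrix at kernel offset (r, s)."""
    offset = (r * S + s) * C  # start of the (r, s) slice inside filter k = 0
    row_stride = R * S * C    # distance between consecutive output channels
    return offset, row_stride


def read_kc(r, s, k, c):
    """Read element (k, c) of the strided K x C view without copying."""
    offset, row_stride = kc_view(r, s)
    return weight[offset + k * row_stride + c]


if __name__ == "__main__":
    print(kc_view(1, 2))
    print(read_kc(1, 2, 1, 3) == weight[krsc_index(1, 1, 2, 3)])
```

A gemm kernel that accepts a base offset and a row stride can therefore consume each (r, s) slice in place, with no repacking of the weight.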

  1. PyTorch Int8 Support

...

  1. Move most of the functions in spconv.pytorch.ops to C++

Pure engineering work.

  1. Ampere multi-stage gemm support

Not easy; we need to use a new pattern to write the gemm kernels.
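
As a rough picture of what "multi-stage" usually means, the sketch below simulates the issue order of a software-pipelined main loop: several tile loads are kept in flight in a ring of buffers while earlier tiles are computed. This is an assumption about the general pattern, not the planned cumm design; `multistage_schedule` and its event tuples are hypothetical names.

```python
def multistage_schedule(num_tiles, stages):
    """Return (action, tile, buffer) events for a pipelined gemm main loop.

    Assumption: a ring of `stages` buffers with `stages - 1` loads in flight.
    """
    events = []
    # Prologue: fill stages - 1 buffers before any compute starts.
    for tile in range(min(stages - 1, num_tiles)):
        events.append(("load", tile, tile % stages))
    # Main loop: issue the load for a future tile, then compute the current one.
    for tile in range(num_tiles):
        future = tile + stages - 1
        if future < num_tiles:
            events.append(("load", future, future % stages))
        events.append(("compute", tile, tile % stages))
    return events


if __name__ == "__main__":
    for event in multistage_schedule(num_tiles=4, stages=3):
        print(event)
```

With 3 stages, the load of tile 3 reuses buffer 0 only after tile 0 has been computed, so loads and compute can overlap without overwriting data that is still needed.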

  1. Optimize CUDA Kernels for small-channel-size layers

Modify cumm so that it supports small kernels. Not hard, but it takes time.

  1. nvrtc support for gemm/conv kernels

We need to rewrite the kernel params in cumm. Not easy.

  1. C++ only spconv

Code generation is actually easy; we can finish this quickly once the ops have been moved to C++.

  1. TensorRT support

TensorRT support is the last feature in this plan. It needs a lot of engineering work and has several prerequisites, so it may take a long time.