Commit 4e7b2bb

Make NumpyOps CPU kernels generic (#627)
* Make NumpyOps CPU kernels generic

  This PR makes most CPU kernels generic, so that they can take both float32 and float64 arrays (and hopefully in the future float16).

  I experimented with kernels in Cython + fused types and kernels as C++ with templates, I found the C++ template route more promising:

  - More compact/ergonomic implementations with fewer compile-time conditionals.
  - Opens up the possibility to easily use SIMD intrinsics in the future.

  To allow genericity in the NumpyOps method arguments, we use:

  - Fused types when we require a specific dimensionality;
  - np.ndarray otherwise.

  Some of the kernels are not made generic:

  - cpu_scatter_add: needs tests to verify that the op still works correctly.
  - cpu_position_encode: the position_encode op doesn't take float array(s).
  - lstm kernels: I need to look more deeply into them.

* Include C++ headers in sdist
* NumpyOps: Use workaround for cython/cython#4697
* Namespace-qualify memcpy
* ReLU kernel: never output -0.0
* Add fixes suggested by @svlandeg
1 parent 45249c2 commit 4e7b2bb
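
The commit message mentions C++ template kernels that accept both float32 and float64, and a ReLU kernel that never outputs -0.0. The following is a minimal sketch of that idea, not the kernel from this commit: the name `relu_inplace` and its signature are hypothetical. It relies on the IEEE 754 behaviour that `-0.0 <= 0` is true, so negative zero is overwritten with positive zero, whereas a multiply-by-mask formulation would turn a negative input into -0.0.

```cpp
#include <cmath>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical generic in-place ReLU (illustrative only, not thinc's code).
// One template covers float and double instead of one kernel per dtype.
template <typename T>
void relu_inplace(T* x, std::size_t n) {
    static_assert(std::is_floating_point<T>::value,
                  "relu_inplace expects a floating-point element type");
    for (std::size_t i = 0; i < n; ++i) {
        // -0.0 <= 0 is true, so negative zero is replaced by +0.0.
        // NaN compares false and is left unchanged.
        if (x[i] <= T(0)) {
            x[i] = T(0);
        }
    }
}
```

A caller would instantiate the template implicitly from the pointer type, for example `relu_inplace(values.data(), values.size())` with `values` being a `std::vector<float>` or a `std::vector<double>`.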

File tree

6 files changed: +651 -430 lines changed


MANIFEST.in

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-recursive-include thinc *.cu *.pyx *.pxd
+recursive-include thinc *.cu *.pyx *.pxd *.hh
 include LICENSE
 include README.md
 prune tmp/

setup.py

Lines changed: 1 addition & 1 deletion
@@ -113,7 +113,7 @@ def setup_package():
 version=about["__version__"],
 ext_modules=ext_modules,
 cmdclass={"build_ext": build_ext_subclass},
-package_data={"": ["*.pyx", "*.pxd", "*.pxi", "*.cu"]},
+package_data={"": ["*.pyx", "*.pxd", "*.pxi", "*.cu", "*.hh"]},
 )