BERT-series ONNX models are very large (x GB), so the real files are not easy to share. We can improve this process by overwriting the weights (initializers).
The weights can be replaced with fixed data (e.g. all 0.1, or another specified value), so that the model compresses well.
After sharing, the weights can be restored with NumPy-style random numbers.
This is only a sharing method: the generated model is not useful for evaluating accuracy.
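To illustrate why fixed data helps, here is a minimal sketch using only the Python standard library. It compares how well zlib compresses a buffer of random FP32 values against a buffer where every value is 0.1. The buffer size, the seed, and the choice of zlib are assumptions made for illustration; they are not part of the proposal.

```python
import random
import struct
import zlib


def compressed_size(values):
    """Pack values as little-endian FP32 and return (raw_bytes, zlib_bytes)."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return len(raw), len(zlib.compress(raw))


# Assumed toy size; real BERT initializers are far larger.
n = 100_000
rng = random.Random(0)
random_weights = [rng.gauss(0.0, 1.0) for _ in range(n)]
fixed_weights = [0.1] * n

raw_random, zip_random = compressed_size(random_weights)
raw_fixed, zip_fixed = compressed_size(fixed_weights)

print(f"random weights: {raw_random} bytes raw, {zip_random} bytes compressed")
print(f"fixed weights:  {raw_fixed} bytes raw, {zip_fixed} bytes compressed")
```

Both buffers have the same raw size, but the fixed-value buffer should shrink to a tiny fraction of it, while the random buffer barely compresses.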
For better usage:
An annotation will be added when writing fixed data, so that re-randomization can detect the affected tensors automatically.
The tensors to overwrite can be specified by name or by size.
Only FP32/FP16 tensors are supported.
0 removed.