feat: Add ONNX & OpenVINO backend support, and torch dtype kwargs in Sentence Transformers Components #8813
base: main
Pull Request Test Coverage Report for Build 13208168523
💛 - Coveralls
The currently failing tests are due to #8811. The PR is still WIP, but the tests pass locally.
Okay, after taking a look at how we could possibly do […] At this point, I will undraft this PR and take any suggestions or reviews on what I should fix or modify.
Related Issues
Proposed Changes:
Sentence Transformers-based components now expose a backend parameter that lets the user select a backend other than the default, torch. The additional supported backends are onnx and openvino. Documentation for these backends can be found here.
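As a rough illustration of the behaviour described above, the sketch below validates a backend name and collects the arguments a component might forward to the underlying model. The helper resolve_backend_kwargs, the placeholder model name, and the model_kwargs layout are hypothetical and are not taken from this PR's diff; only the three backend names come from the description.

```python
from typing import Any, Dict, Optional

# Backends named in this PR; "torch" is the default.
SUPPORTED_BACKENDS = ("torch", "onnx", "openvino")


def resolve_backend_kwargs(
    model: str,
    backend: str = "torch",
    model_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate the backend and gather init arguments."""
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unsupported backend {backend!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    # Copy model_kwargs so the caller's dictionary is never mutated.
    return {
        "model": model,
        "backend": backend,
        "model_kwargs": dict(model_kwargs or {}),
    }


if __name__ == "__main__":
    # "some-org/some-model" is a placeholder, not a real checkpoint.
    print(resolve_backend_kwargs("some-org/some-model", backend="onnx"))
```

Rejecting unknown backend names up front keeps a typo such as "onxx" from surfacing later as a harder-to-read model loading error.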
How did you test it?
Integration tests were added for each of the supported backends. I am opening this PR early so that the CI can help with some tests that I cannot run locally.
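One way to gate backend-specific integration tests on the installed transformers version is sketched below. It encodes the constraint stated in the reviewer notes (optimum-intel[openvino]==1.21.0 does not support transformers>=4.47). The helper names are hypothetical, and the parser assumes a plain numeric major.minor prefix; this is not the skip logic used in the PR itself.

```python
from typing import Tuple

# From the reviewer notes: optimum-intel[openvino]==1.21.0 does not
# support transformers>=4.47. Adjust once a newer optimum-intel is out.
FIRST_UNSUPPORTED_TRANSFORMERS = (4, 47)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '4.46.3'.

    Assumes the first two dot-separated fields are plain integers.
    """
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def openvino_tests_supported(transformers_version: str) -> bool:
    """Hypothetical guard: True if the openvino tests could run."""
    return parse_major_minor(transformers_version) < FIRST_UNSUPPORTED_TRANSFORMERS


if __name__ == "__main__":
    for candidate in ("4.46.3", "4.47.0", "4.48.1"):
        print(candidate, openvino_tests_supported(candidate))
```

A guard like this could feed a pytest.mark.skipif condition, so the openvino tests re-enable themselves once the version constraint is lifted.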
Notes for the reviewer
onnx and openvino backends at the same time due to a limitation with optimum-intel[openvino]==1.21.0, which does not support transformers>=4.47. The next optimum-intel update should add support for transformers>=4.47. For now, openvino tests are skipped.
onnx-gpu support
SentenceTransformersDiversityRanker
dtype quantization
Checklist
fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.