
feat: Add ONNX & OpenVINO backend support, and torch dtype kwargs in Sentence Transformers Components #8813

Open
wants to merge 12 commits into base: main

Conversation

@lbux (Contributor) commented on Feb 5, 2025

Related Issues

Proposed Changes:

Sentence Transformers-based components now expose a backend parameter that lets the user specify a backend other than the default, torch. Supported backends are onnx and openvino. Documentation for these backends can be found here.
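
A minimal usage sketch, assuming the new parameter mirrors sentence-transformers' own backend argument; the exact component signature may differ from what is finally merged:

```python
from haystack.components.embedders import SentenceTransformersTextEmbedder

# backend defaults to "torch"; "onnx" and "openvino" are the new options
embedder = SentenceTransformersTextEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2",
    backend="onnx",
)
embedder.warm_up()
result = embedder.run(text="ONNX-accelerated embedding")
print(len(result["embedding"]))
```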

How did you test it?

Integration tests were added for each of the supported backends. I am opening this PR early so that CI can assist with some of the tests, as I cannot run them locally.

Notes for the reviewer

  • There is a major dependency conflict in the test environment. We cannot install both the onnx and openvino backends at the same time due to a limitation in optimum-intel[openvino]==1.21.0, which does not support transformers>=4.47. The next optimum-intel release should add support for transformers>=4.47. For now, openvino tests are skipped.
  • I am looking into implementing explicit onnx-gpu support.
  • I am also looking into applying the changes to the SentenceTransformersDiversityRanker.
  • I am going to add tests for torch dtype quantization (a usage sketch follows this list).
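
For the torch dtype point above, a sketch of what such a test might exercise, assuming the component forwards model_kwargs to sentence-transformers the way sentence-transformers' own model_kwargs argument works:

```python
import torch
from haystack.components.embedders import SentenceTransformersDocumentEmbedder

# Load the underlying model in half precision; the model_kwargs passthrough
# is assumed to reach the sentence-transformers model loader unchanged.
embedder = SentenceTransformersDocumentEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2",
    model_kwargs={"torch_dtype": torch.float16},
)
embedder.warm_up()
```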

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue

@lbux lbux requested review from a team as code owners February 5, 2025 00:58
@lbux lbux requested review from dfokina and anakin87 and removed request for a team February 5, 2025 00:58
@coveralls (Collaborator) commented on Feb 5, 2025

Pull Request Test Coverage Report for Build 13208168523

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 12 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.01%) to 92.288%

Files with Coverage Reduction | New Missed Lines | %
--- | --- | ---
components/embedders/sentence_transformers_document_embedder.py | 2 | 96.83%
components/embedders/sentence_transformers_text_embedder.py | 2 | 96.36%
components/rankers/sentence_transformers_diversity.py | 8 | 94.84%

Totals (Coverage Status)

  • Change from base Build 13203068476: -0.01%
  • Covered Lines: 9155
  • Relevant Lines: 9920

💛 - Coveralls

@lbux (Contributor, Author) commented on Feb 5, 2025

The currently failing tests are due to #8811. The PR is still a WIP, but the tests pass locally.

@lbux lbux marked this pull request as draft February 5, 2025 01:14
@anakin87 (Member) commented on Feb 5, 2025

@lbux To fix the failing tests, try doing something similar to #8809. It should work...

@github-actions github-actions bot added the type:documentation Improvements on the docs label Feb 5, 2025
@lbux lbux changed the title feat: expose backend parameter in Sentence Transformer components for onnx and openvino models feat: Add ONNX & OpenVINO backend support, and torch dtype kwargs in Sentence Transformers Components Feb 6, 2025
@lbux (Contributor, Author) commented on Feb 7, 2025

Okay, after taking a look at how we could support onnxruntime-gpu, I don't think it makes sense for us to explicitly "support" it or add tests for it. While it should technically work, some of its requirements make testing it complicated. Currently, when using onnx, everything runs on the CPU, since the backend accelerates CPU inference; it also allows using quantized models. If a user installs the onnxruntime-gpu package instead of onnxruntime, they can pass model_kwargs={"provider": "CUDAExecutionProvider"} or model_kwargs={"provider": "TensorrtExecutionProvider"}, depending on whether their objective is to offload some of the operations to the GPU while using an unquantized model, or to offload and accelerate some of the operations while using statically quantized models, respectively. More information on the limitations can be found here.
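
For illustration, a sketch of the provider passthrough described above, assuming onnxruntime-gpu is installed and that model_kwargs reaches sentence-transformers unchanged:

```python
from haystack.components.embedders import SentenceTransformersTextEmbedder

# Requires the onnxruntime-gpu package instead of onnxruntime;
# otherwise the ONNX backend runs on the CPU.
embedder = SentenceTransformersTextEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2",
    backend="onnx",
    model_kwargs={"provider": "CUDAExecutionProvider"},
)
```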

At this point, I will undraft this PR and take any suggestions/reviews for what I should fix or modify.

@lbux lbux marked this pull request as ready for review February 7, 2025 20:58
Development

Successfully merging this pull request may close these issues.

Sentence Transformers components do not support ONNX or OpenVINO formats