Infinity embeddings #1125

Closed
wants to merge 25 commits into from


@canesin canesin commented Mar 19, 2025

Add support for Infinity embeddings, JSON responses, and pre-defined IDs in the API.

Overview

This PR adds support for local embedding generation using the Infinity Embedding library with a focus on the high-quality Snowflake Arctic Embed models.

Changes

1. New Embedding Backend

  • Added a new embedding backend infinity that uses the infinity-emb package
  • Allows local embedding generation using Hugging Face models without API calls
  • Default model is the top-performing "Snowflake/snowflake-arctic-embed-l-v2.0"

2. Model Caching

  • Implemented caching to load models only once and reuse them for subsequent embeddings
  • Prevents inefficient reloading of models on every embedding request

3. Asymmetric Embedding Support

  • Uses a separate query embedding function in the server configuration when a Snowflake model is selected
  • This follows Snowflake's specifications for optimal retrieval results
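
The asymmetric pattern can be sketched as a thin wrapper that prepends a query prefix before calling the same embedding function used for documents. The prefix value `"query: "` is an assumption that should be checked against the model card of the model in use, and `make_query_embedder` and `toy_embed` are hypothetical helpers, not the PR's functions.

```python
from typing import Callable, List

EmbedFn = Callable[[List[str]], List[List[float]]]

# Assumption: the query prefix for the chosen model; verify on its model card.
QUERY_PREFIX = "query: "


def make_query_embedder(embed_documents: EmbedFn, prefix: str = QUERY_PREFIX) -> EmbedFn:
    """Return a query-side embedder that prefixes each query text."""

    def embed_queries(queries: List[str]) -> List[List[float]]:
        return embed_documents([prefix + q for q in queries])

    return embed_queries


def toy_embed(texts: List[str]) -> List[List[float]]:
    # Toy stand-in for a real model: [text length, 1.0 if prefixed else 0.0].
    return [
        [float(len(t)), 1.0 if t.startswith(QUERY_PREFIX) else 0.0]
        for t in texts
    ]


# Documents are embedded as-is; queries go through the prefixed wrapper.
embed_query = make_query_embedder(toy_embed)
```

Documents are indexed with `toy_embed` directly, and only search queries pass through `embed_query`, which is what makes the retrieval asymmetric.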

4. Error Handling

  • Implemented retry logic for embedding generation with specific exception types
  • Handles connection, timeout, and value errors appropriately
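
For illustration, retrying only on specific exception types could look like the synchronous, standard-library sketch below. The decorator name, attempt count, and backoff values are assumptions; the PR may well use a different mechanism (for example an async retry library), which is not shown in this description.

```python
import time
from functools import wraps

# The three error families named above; anything else propagates immediately.
RETRYABLE = (ConnectionError, TimeoutError, ValueError)


def retry(attempts=3, base_delay=0.01, exceptions=RETRYABLE):
    """Retry the wrapped function on the given exceptions with exponential backoff."""

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    time.sleep(base_delay * 2 ** (attempt - 1))

        return wrapper

    return decorator


calls = {"n": 0}


@retry(attempts=3)
def flaky_embed(texts):
    # Hypothetical embedder that times out twice before succeeding.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return [[0.0] for _ in texts]
```

Restricting the retry to a tuple of exception types keeps genuine programming errors (such as a `KeyError`) from being silently retried.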

5. Example Implementation

  • Added a demo script lightrag_infinity_demo.py showing how to use Infinity embeddings
  • Demonstrates both document and query embedding with the asymmetric retrieval pattern

6. Configuration

  • Updated environment example and argument parsing to support the new embedding binding
  • No additional host configuration needed as embeddings are generated locally
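
As a rough sketch of how a new binding value might flow through argument parsing, assuming a flag named `--embedding-binding` and an environment variable named `EMBEDDING_BINDING` (both names, the other choices, and the defaults are assumptions for illustration, not taken from this PR's diff):

```python
import argparse
import os


def build_parser(env=None):
    """Build a parser whose defaults fall back to environment variables."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Embedding configuration sketch")
    parser.add_argument(
        "--embedding-binding",
        default=env.get("EMBEDDING_BINDING", "ollama"),
        choices=["ollama", "openai", "infinity"],  # hypothetical set of bindings
    )
    parser.add_argument(
        "--embedding-model",
        default=env.get("EMBEDDING_MODEL", "Snowflake/snowflake-arctic-embed-l-v2.0"),
    )
    return parser


def needs_host(binding: str) -> bool:
    # Local generation means there is no remote embedding host to configure.
    return binding != "infinity"
```

With this shape, selecting the binding on the command line or through the environment gives the same result, and host settings can be skipped when `needs_host` returns `False`.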

Dependencies

  • Added infinity-emb[all] to requirements.txt

Testing

The implementation has been tested using the Snowflake Arctic Embed model for both standalone usage and integration with the LightRAG API server.

KennyFukuya and others added 9 commits March 7, 2025 20:09
* fix: postgres kg fix

* fix: exporting id

* fix: passing id from vector db

* git pull add json_response parameter

* add logger for debugging json_response

* add json_response as api param

* add json_reponse

* add json_reponse

* fix original csv output

---------

Co-authored-by: Kenny Fukuya <[email protected]>
Co-authored-by: João <[email protected]>
@canesin canesin closed this Mar 19, 2025
@canesin canesin reopened this Mar 19, 2025

canesin commented Mar 19, 2025

I am opening the PR here to contribute back and to limit how far our fork's main branch diverges. It also includes the given-ID changes and the JSON response changes.
I will add tests and an example for both before removing the draft status.

@canesin canesin marked this pull request as ready for review March 20, 2025 05:45
@LarFii LarFii requested a review from danielaskdd March 20, 2025 08:47
@canesin canesin closed this Mar 21, 2025
@canesin canesin reopened this Mar 21, 2025
@canesin canesin closed this Mar 21, 2025
@canesin canesin deleted the infinity-embeddings branch March 21, 2025 14:42