I found a potential issue with the embedding model configuration of RAG corpus creation.
I would appreciate it if you could review and address this.
Environment details
OS type and version: macOS Sequoia 15.0.1
Python version: 3.13.1
pip version: 24.3.1
google-cloud-aiplatform version: 1.79.0
Steps to reproduce
Create a RAG corpus with a text-multilingual-embedding-002 model configuration but without a vector database specification
Observe that the embedding model defaults to text-embedding-005 instead of using the specified model
Code example
from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# Initialize Vertex AI
vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

# Configure embedding model to use text-multilingual-embedding-002
rag_embedding_model_config = rag.RagEmbeddingModelConfig(
    vertex_prediction_endpoint=rag.VertexPredictionEndpoint(
        publisher_model="projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/text-multilingual-embedding-002"
    )
)

# Create RagCorpus
rag_corpus = rag.create_corpus(
    display_name="my_corpus",
    backend_config=rag.RagVectorDbConfig(
        rag_embedding_model_config=rag_embedding_model_config,
    )
)
# The created corpus will use text-embedding-005 instead of text-multilingual-embedding-002
Root Cause
The issue is in _gapic_utils.py. The set_embedding_model_config() call is incorrectly indented, so it sits inside the if backend_config.vector_db is not None: block.
Due to this indentation, the embedding model configuration is only applied when a vector database is specified. The set_embedding_model_config() call should be at the same level as the vector database check to ensure the embedding model is set regardless of vector database configuration.
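To illustrate the indentation problem described above, here is a minimal, self-contained sketch. It does not reproduce the actual source of _gapic_utils.py. The classes BackendConfig, EmbeddingConfig and Corpus and the two apply_* functions are hypothetical stand-ins, and modelling the default as a field value is an assumption made only so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Optional

# Default model observed in this report when the configured one is ignored.
DEFAULT_EMBEDDING_MODEL = "text-embedding-005"


@dataclass
class EmbeddingConfig:
    """Hypothetical stand-in for the embedding model configuration."""
    publisher_model: str


@dataclass
class BackendConfig:
    """Hypothetical stand-in for the backend (vector DB) configuration."""
    vector_db: Optional[str] = None
    rag_embedding_model_config: Optional[EmbeddingConfig] = None


@dataclass
class Corpus:
    """Hypothetical stand-in for the corpus being built."""
    embedding_model: str = DEFAULT_EMBEDDING_MODEL
    vector_db: Optional[str] = None


def apply_backend_config_buggy(cfg: BackendConfig, corpus: Corpus) -> Corpus:
    if cfg.vector_db is not None:
        corpus.vector_db = cfg.vector_db
        # Bug pattern: the embedding model is only set inside the vector_db check.
        if cfg.rag_embedding_model_config is not None:
            corpus.embedding_model = cfg.rag_embedding_model_config.publisher_model
    return corpus


def apply_backend_config_fixed(cfg: BackendConfig, corpus: Corpus) -> Corpus:
    if cfg.vector_db is not None:
        corpus.vector_db = cfg.vector_db
    # Fix pattern: the embedding model is set at the same level as the vector_db check.
    if cfg.rag_embedding_model_config is not None:
        corpus.embedding_model = cfg.rag_embedding_model_config.publisher_model
    return corpus


# Embedding model configured, no vector database specified (the case in this report).
cfg = BackendConfig(
    rag_embedding_model_config=EmbeddingConfig("text-multilingual-embedding-002")
)
buggy = apply_backend_config_buggy(cfg, Corpus())
fixed = apply_backend_config_fixed(cfg, Corpus())
print(buggy.embedding_model)
print(fixed.embedding_model)
```

In this sketch the buggy variant leaves the default model in place whenever vector_db is None, while the fixed variant applies the configured model independently of the vector database setting.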