retrievalqa #29474
-
Hey @Anand-puthiyapurayil! 👋 I'm here to help you with any bugs, questions, or contributions you have. Feel free to ask me anything while you're waiting for a human maintainer.

I found a similar issue about token indices sequence length being longer than the specified maximum sequence length for the model. The error you're encountering suggests that the input sequence length exceeds the model's context window, which causes indexing errors. To resolve this, you can adjust the context window when initializing the model and make sure your inputs fit inside it:

```python
from langchain_community.llms import LlamaCpp

# Initialize the Llama model with a custom context window
llm = LlamaCpp(model_path="/path/to/llama/model", n_ctx=1024)

# Ensure your input sequences do not exceed the n_ctx value
input_sequence = "Your input text here"
if llm.get_num_tokens(input_sequence) > llm.n_ctx:
    # Truncate or split the input_sequence to fit within the n_ctx limit
    # (note: slicing by characters is only a rough cut, not an exact token count)
    input_sequence = input_sequence[:llm.n_ctx]

# Use the model for retrieval
result = llm.invoke(input_sequence)
```

Make sure that the text you actually send to the model stays within the configured context window.

To continue talking to Dosu, mention @dosu.
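If truncating loses too much of the retrieved context, splitting the input and querying chunk by chunk is usually the safer option. A minimal sketch, assuming `RecursiveCharacterTextSplitter` from `langchain_text_splitters` is available and reusing the `llm` and `input_sequence` from the example above (the chunk size of 800 is an arbitrary choice that leaves headroom below `n_ctx=1024`):

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split by token count so each chunk fits well inside the context window.
# chunk_size is measured in tokens because we pass the LLM's own token counter.
splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,                      # headroom below n_ctx=1024 for prompt + answer
    chunk_overlap=50,
    length_function=llm.get_num_tokens,  # count tokens, not characters
)

chunks = splitter.split_text(input_sequence)
partial_results = [llm.invoke(chunk) for chunk in chunks]
```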
-
```python
import os
import time
import re

from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
from utils.utils import load_config, initialize_embeddings, initialize_llm, load_faiss_store
from logger.logger import get_logger

# Initialize logger
logger = get_logger(__name__)

# Define Map-Reduce Prompts
map_prompt = PromptTemplate(
    input_variables=["context", "question"],
    template="""
You have the following chunk of data (could be a product or service):
{context}
User question: {question}
""",
)

reduce_prompt = PromptTemplate(
    input_variables=["summaries", "question"],
    template="""
We have partial answers from multiple chunks:
{summaries}
Combine them into a single, cohesive answer to: "{question}"
Requirements:
""",
)


def semantic_search_tool(query: str) -> str:
    """
    Enhanced product search that utilizes the LLM's answer and extracts product IDs.
    """
    max_retries = 3
    for attempt in range(max_retries):
        try:
            response = qa_chain.invoke(query)
            llm_answer = response.get("result", "")
            source_docs = response.get("source_documents", [])


def main():
    try:
        # Configuration setup
        this_dir = os.path.dirname(os.path.abspath(__file__))
        config_path = os.path.join(this_dir, "..", "config.yaml")
        config = load_config(config_path)


if __name__ == "__main__":
    main()
```
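The snippet stops before showing how `qa_chain` is built. For reference, here is a minimal sketch of what a map-reduce `RetrievalQA` setup using these two prompts could look like; the `vector_store` and `llm` names and the `k` value are assumptions, not taken from the original code:

```python
# Hypothetical wiring for the chain used inside semantic_search_tool.
# `llm` would come from initialize_llm(...) and `vector_store` from load_faiss_store(...).
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="map_reduce",
    retriever=vector_store.as_retriever(search_kwargs={"k": 5}),
    return_source_documents=True,         # needed for response["source_documents"]
    chain_type_kwargs={
        "question_prompt": map_prompt,    # applied to each retrieved chunk
        "combine_prompt": reduce_prompt,  # merges the per-chunk answers
    },
)
```

With `chain_type="map_reduce"`, each retrieved document is processed separately and the chain checks document lengths against what it believes the model's context limit is; that check is where the "A single document was longer than the context length" error originates.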
```python
def initialize_llm(
    save_dir: str,
    max_new_tokens: int = 500,
    temperature: float = 0.1,
    repetition_penalty: float = 1.2,
    top_p: float = 0.95,
    do_sample: bool = True,
):
    """
    Initializes a retrieval-optimized LLM for product ID extraction.
    """
    try:
        logger.info(f"Initializing retrieval LLM from: {save_dir}")
```
I have two functions here: one is the retrieval function, where I retrieve with RetrievalQA, and the other initializes the Llama model. Although my model has a 131k context window, this is what I get when I run it:

```
$ python -m mlscripts.product_retrieval
2025-01-29 18:47:33,016 - utils - INFO - Configuration loaded from D:\Anand\Jstore_Ai\usecase1\mlscripts..\config.yaml
2025-01-29 18:47:33,016 - utils - INFO - Initializing retrieval LLM from: output/BGI-llama
2025-01-29 18:47:33,016 - utils - INFO - Loading tokenizer...
2025-01-29 18:47:33,391 - utils - INFO - Tokenizer loaded with pad_token set
2025-01-29 18:47:33,391 - utils - INFO - Loading model...
2025-01-29 18:47:34,694 - utils - INFO - Model context length: 131072
2025-01-29 18:47:34,694 - utils - INFO - Configuring text-generation pipeline
Device set to use cuda:0
2025-01-29 18:47:34,695 - utils - INFO - Creating HuggingFacePipeline
Actual pipeline max input: 131072
2025-01-29 18:47:35,996 - utils - INFO - Embeddings initialized using model at: output/sbert_finetuned on device: cuda
2025-01-29 18:47:35,997 - utils - INFO - Loading FAISS vector store from: output/product_vector_store
2025-01-29 18:47:36,282 - utils - INFO - FAISS vector store loaded successfully.
Enter your product search query: show me some products under 2000
Token indices sequence length is longer than the specified maximum sequence length for this model (6944 > 1024). Running this sequence through the model will result in indexing errors
2025-01-29 18:48:41,656 - __main__ - ERROR - Attempt 1 failed: A single document was longer than the context length, we cannot handle this.
Traceback (most recent call last):
File "D:\Anand\Jstore_Ai\usecase1\mlscripts\product_retrieval.py", line 48, in semantic_search_tool
Token indices sequence length is longer than the specified maximum sequence length for this model (6944 > 1024). Running this sequence through the model will result in indexing errors
```

I'm getting this error, any help?
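A quick check that can help narrow down where the 1024 in that warning comes from; the model path is taken from the log above, so treat this as a sketch rather than part of the original script:

```python
from transformers import AutoConfig, AutoTokenizer

# Compare what the tokenizer and the model config each report as the limit.
tok = AutoTokenizer.from_pretrained("output/BGI-llama")
cfg = AutoConfig.from_pretrained("output/BGI-llama")
print("tokenizer.model_max_length     =", tok.model_max_length)
print("config.max_position_embeddings =", cfg.max_position_embeddings)

# If another component in the chain counts tokens with a different tokenizer
# (LangChain's generic token counter can fall back to a GPT-2 tokenizer,
# whose limit is 1024), its count may disagree with the 131072-token window
# the model itself reports.
```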