Hello,
I have privateGPT (v0.2, with several LLMs available but currently using abacusai/Smaug-72B-v0.1 as the tokenizer; local mode, default local config:

local:
  prompt_style: "llama2"
  llm_hf_repo_id: TheBloke/Mistral-7B-Instruct-v0.2-GGUF
  llm_hf_model_file: mistral-7b-instruct-v0.2.Q4_K_M.gguf
  embedding_hf_model_name: BAAI/bge-small-en-v1.5

) installed on my work M2 MacBook Pro, to evaluate whether this type of technology would be useful for us. The proof of concept is having it answer questions from locally ingested data: anything from a more user-friendly front end to a traditional knowledge repository / FAQ, to writing customized content based on existing examples, and so on.
What I'm stuck on now is getting privateGPT to be aware of all locally ingested files. I'm going with CSV imports (either prompt:response or empty:response format) since that seems to be the simplest route. If this moves forward it will need to ingest many more data types, which I know we'll have to be careful with given the limitations of text extraction from e.g. presentations and PDFs.
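For reference, this is roughly what I mean by those two CSV formats. A minimal sketch (my own illustration, not privateGPT code) of turning such a file into one text chunk per row, which is essentially what any ingestion step has to do with this layout:

```python
import csv
import io

def rows_to_chunks(csv_text: str) -> list[str]:
    """Turn each prompt:response row into a single text chunk.

    Rows with an empty first column (the empty:response format)
    keep only the response text.
    """
    chunks = []
    for prompt, response in csv.reader(io.StringIO(csv_text)):
        if prompt:
            chunks.append(f"Q: {prompt}\nA: {response}")
        else:
            chunks.append(response)
    return chunks

sample = "What is our refund policy?,30 days with receipt\n,Office hours are 9-5\n"
print(rows_to_chunks(sample))
# → ['Q: What is our refund policy?\nA: 30 days with receipt', 'Office hours are 9-5']
```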
Specifically, there seems to be a function that decides which chunks of text from the ingested documents to use as the system context, and answers are restricted to those chunks. I can't figure out how it determines which chunks to use and, more importantly, how to make that selection more relevant to the query at hand. I've tried making max_new_tokens and context_window much larger, to no avail, and did some quick searches through the Python code.
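My understanding (possibly wrong) is that this selection is embedding-based retrieval, which max_new_tokens and context_window don't influence: the query and every chunk are embedded, and only the top-k chunks by similarity are pasted into the context. A toy sketch of that mechanism, not privateGPT's actual code, with a made-up bag-of-words "embedding" standing in for a real model like bge-small-en-v1.5:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: a bag-of-words
    # vector keyed by lowercased alphanumeric token.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k_chunks(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank every ingested chunk by similarity to the query; only the
    # top k winners ever reach the LLM's context window.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "Office hours are 9 to 5 on weekdays.",
    "Our refund policy allows refunds within 30 days with a receipt.",
    "The cafeteria menu changes every Monday.",
]
print(top_k_chunks("What is the refund policy?", chunks, k=1))
# prints the refund-policy chunk, the only one sharing key terms with the query
```

If the real embedding model ranks the wrong chunks highest, or k is too small, the relevant text never reaches the model at all, which would explain the "context does not provide information" answers regardless of how large the context window is.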
But as it stands, even when I ask a question taken almost verbatim from the ingested documents, the appropriate chunk of text is not chosen for the system context, so the model replies that the context does not provide information on that.
What am I doing wrong? Do I need to update the local config with the appropriate llm_hf_repo_id, llm_hf_model_file, and embedding_hf_model_name for Smaug, whatever those are? I don't think my issue lies there, though.
Thank you for any assistance you can provide.