Conversation


@MaxAdmk MaxAdmk commented Jan 21, 2026

commit 1: refactor to Groq + llama-3.3-70b
commit 2: replace web search with local filesystem search

- Migrate all LLM nodes (query generation, reflection, final answer) from Gemini to Groq using llama-3.3-70b-versatile
- Keep Gemini exclusively for Google Search grounding to preserve citation metadata
- Add graceful handling for Google Search API quota exhaustion (429 / RESOURCE_EXHAUSTED)
  - Return safe fallback state instead of crashing the graph
- Guard against missing or partial API responses
  - Handle empty candidates and absent grounding metadata
  - Fall back to plain-text extraction when citations cannot be generated
- Improve robustness and documentation
  - Defensive access to graph state keys
  - Fix docstring inconsistencies and remove redundant comments

The search component operates on a user-provided directory and retrieves
relevant documentation snippets directly from markdown files.

Architecture decision:
I chose an extended-context, file-based search approach instead of a
broad snippet search or vector database. Technical documentation often
contains long code examples that are easily truncated by short RAG-style
snippets.

To preserve full examples while respecting Groq token limits, the search
retrieves a small number of highly relevant results (top_k=2) with a
larger context window (~100 lines). Relevance is determined using a
deterministic keyword and phrase scoring mechanism without external
dependencies.

This design keeps the agent lightweight, reproducible, and suitable for
offline evaluation on local documentation archives.
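
To make the approach concrete, here is a minimal sketch of such a scorer. It is an illustration under assumptions, not the PR's code: the actual `_search_markdown_directory` also boosts file-path matches and multi-word phrases (see the review below), and all names here are invented.

```python
from pathlib import Path

TOP_K = 2            # top_k from the description above
CONTEXT_LINES = 100  # approximate per-snippet context window from the description above

def _score_file(path: Path, terms: list[str]) -> tuple[int, str]:
    """Score one markdown file by keyword hits; return (score, snippet)."""
    try:
        text = path.read_text(encoding="utf-8")
    except (IOError, UnicodeDecodeError):
        return 0, ""
    lower = text.lower()
    score = sum(lower.count(t) for t in terms)
    if score == 0:
        return 0, ""
    lines = text.splitlines()
    # Centre the snippet on the first line mentioning any term, so long
    # code examples around the hit survive intact.
    first_hit = next(
        (i for i, ln in enumerate(lines) if any(t in ln.lower() for t in terms)),
        0,
    )
    start = max(0, first_hit - CONTEXT_LINES // 2)
    return score, "\n".join(lines[start : start + CONTEXT_LINES])

def search_markdown_dir(docs_dir: str, query: str) -> list[tuple[str, str]]:
    """Return (path, snippet) pairs for the TOP_K best-scoring files."""
    terms = [t for t in query.lower().split() if len(t) > 2]
    scored = []
    for p in Path(docs_dir).rglob("*.md"):
        score, snippet = _score_file(p, terms)
        if score > 0:
            scored.append((score, str(p), snippet))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(path, snippet) for _, path, snippet in scored[:TOP_K]]
```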

google-cla bot commented Jan 21, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@gemini-code-assist

Summary of Changes

Hello @MaxAdmk, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the agent's underlying language model infrastructure and its primary information retrieval method. It transitions the agent from relying on Google Gemini models for various tasks to utilizing Groq's Llama 3.3 70B, aiming for potentially different performance characteristics. More importantly, it replaces the previous web search capability with a new local filesystem search, allowing the agent to ground its responses in a provided directory of markdown documentation rather than external internet sources. This change shifts the agent's operational paradigm from general web research to domain-specific knowledge retrieval from local files.

Highlights

  • LLM Model Migration: The agent's core functionalities, including query generation, reflection, and answer generation, have been migrated from various Google Gemini models to Groq's llama-3.3-70b-versatile model.
  • Local Filesystem Search: The external web search mechanism has been replaced with a new local markdown file search capability, allowing the agent to retrieve information from a specified local directory.
  • CLI Argument for Local Docs: A new --dir command-line argument has been introduced to specify the path to the local directory containing markdown files for the agent to reference (a possible wiring is sketched after this list).
  • Stricter Answer Generation: The answer generation logic has been updated to strictly adhere to the provided local documentation snippets, requiring specific citation markers and providing a fallback message if insufficient information is found.
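
For illustration only, a sketch of how the flag might be wired up in cli_research.py. Only --dir and --initial-queries are mentioned in this thread; the rest of the interface is assumed.

```python
# Hypothetical CLI wiring; only --dir and --initial-queries are confirmed
# by this pull request, everything else is illustrative.
import argparse

parser = argparse.ArgumentParser(
    description="Research agent grounded in a local markdown directory"
)
parser.add_argument("question", help="The research question to answer")
parser.add_argument("--dir", dest="docs_dir",
                    help="Path to the directory of markdown files to search")
parser.add_argument("--initial-queries", type=int,
                    help="Number of initial search queries to generate")
args = parser.parse_args()
```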
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request refactors the agent to use Groq and Llama-3 models, and replaces web search with a local filesystem search over markdown files. The changes are substantial, introducing a custom local search implementation and updating the agent's graph and prompts accordingly.

My review highlights a critical issue with citation marker generation that could lead to incorrect source attribution. I've also noted several high and medium severity issues related to maintainability, including hardcoded values that override configurable arguments, brittle error handling, and broad exception catching. Addressing these points will improve the correctness and robustness of the new implementation.

```python
sources = []
result_chunks = []
for idx, (rel_path, snippet) in enumerate(snippets):
    marker = f"[S{idx}]"
```


critical

The citation markers (e.g., [S0], [S1]) are generated using enumerate's index. Since multiple web_research nodes can run in parallel, this will create conflicting markers (e.g., multiple [S0] sources from different files), leading to incorrect source attribution in the final answer. The markers must be unique across all parallel searches. You can use the unique id from the state to calculate a globally unique index for each snippet.

Suggested change

```diff
-    marker = f"[S{idx}]"
+    marker_id = state["id"] * 2 + idx
+    marker = f"[S{marker_id}]"
```
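
For intuition: with top_k=2 snippets per search, as described in the PR, the suggested formula gives each parallel branch a disjoint marker range: branch id 0 produces [S0] and [S1], branch id 1 produces [S2] and [S3], and so on.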

```python
if state.get("initial_search_query_count") is None:
    state["initial_search_query_count"] = configurable.number_of_initial_queries
# Limit query count to reduce token usage
state["initial_search_query_count"] = 2
```


high

The number of initial search queries is hardcoded to 2. This makes the --initial-queries command-line argument defined in cli_research.py ineffective. This behavior should be reconciled: either the CLI argument should be removed, or the logic should be updated to respect the argument if it's provided.
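
A minimal sketch of the second option, reusing the names from the snippet above; the low default would then live in the configuration rather than being forced here.

```python
# Respect a value supplied via CLI or state; only fall back to the
# configured default (which can be set to 2 to limit token usage).
if state.get("initial_search_query_count") is None:
    state["initial_search_query_count"] = configurable.number_of_initial_queries
```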

Comment on lines +341 to +342
```python
except Exception:
    return 0, ""
```


high

Catching Exception here is too broad and can hide unexpected errors, making debugging difficult. It's better to catch the more specific exceptions you expect when reading files, such as IOError or UnicodeDecodeError.

Suggested change

```diff
-except Exception:
-    return 0, ""
+except (IOError, UnicodeDecodeError):
+    return 0, ""
```

Comment on lines +32 to 37
```diff
 web_search_model: str = Field(
     default="gemini-2.0-flash",
     metadata={
-        "description": "The name of the language model to use for the agent's answer."
+        "description": "Gemini model name to use for Google Search grounding."
     },
 )
```


medium

The web_search_model configuration field is defined but appears to be unused. The new local search implementation in the web_research node does not use an LLM. This field should be removed to avoid confusion and keep the configuration clean.

Comment on lines +231 to +235
```python
valid_chunks = [
    r for r in all_results
    if not ("No local markdown results found" in r or "no docs_dir provided" in r or "Web search skipped" in r)
    and r.strip()
]
```


medium

The logic to filter out invalid search results relies on matching hardcoded error strings. This is brittle and will break if the error messages in the web_research node are changed. Consider returning a structured object from web_research that includes a status, or at least use constants for the error messages to make this more robust.
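
One possible shape for the constants approach, assuming all_results as in the snippet above; the constant names are hypothetical.

```python
# Hypothetical module-level constants shared by web_research and this
# filter, so producer and consumer cannot silently drift apart.
NO_RESULTS_MSG = "No local markdown results found"
NO_DOCS_DIR_MSG = "no docs_dir provided"
SEARCH_SKIPPED_MSG = "Web search skipped"
_SKIP_MARKERS = (NO_RESULTS_MSG, NO_DOCS_DIR_MSG, SEARCH_SKIPPED_MSG)

valid_chunks = [
    r for r in all_results
    if r.strip() and not any(marker in r for marker in _SKIP_MARKERS)
]
```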

```python
        return []

    # Small set of common English stopwords for filtering
    STOPWORDS = {
```


medium

The STOPWORDS set is defined inside the _search_markdown_directory function, which means it is recreated on every call. For better performance and code organization, this should be defined as a module-level constant outside the function.
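
A sketch of the suggested move; the stopword list and function signature below are illustrative, not the PR's code.

```python
# Module level: built once at import time instead of on every call.
STOPWORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from",
    "in", "is", "it", "of", "on", "or", "that", "the", "to", "with",
})

def _search_markdown_directory(query: str, docs_dir: str):
    terms = [t for t in query.lower().split() if t not in STOPWORDS]
    ...
```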

```python
    text_lower = text.lower()

    # Score file path matches (boost)
    path_score = sum(path_lower.count(t) * 3 for t in terms) + sum(path_lower.count(p) * 5 for p in phrases)
```


medium

The _search_markdown_directory function uses several "magic numbers" for scoring weights (e.g., 3, 5) and snippet slicing (e.g., 20, 100, 120). These should be defined as named constants at the top of the function or module. This improves readability and makes it easier to adjust these parameters in the future. For example: PATH_TERM_WEIGHT = 3, SNIPPET_LINES_BEFORE = 20.
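
For example, a sketch using the values from the snippet above; the constant names are suggestions, not the PR's code.

```python
# Hypothetical names for the scoring weights used above.
PATH_TERM_WEIGHT = 3
PATH_PHRASE_WEIGHT = 5

path_score = (
    sum(path_lower.count(t) * PATH_TERM_WEIGHT for t in terms)
    + sum(path_lower.count(p) * PATH_PHRASE_WEIGHT for p in phrases)
)
```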
