Refactor agent and replace web search with local search #190
base: main
Conversation
…rch flow

- Migrate all LLM nodes (query generation, reflection, final answer) from Gemini to Groq using llama-3.3-70b-versatile
- Keep Gemini exclusively for Google Search grounding to preserve citation metadata
- Add graceful handling for Google Search API quota exhaustion (429 / RESOURCE_EXHAUSTED)
  - Return a safe fallback state instead of crashing the graph (see the sketch below)
- Guard against missing or partial API responses
  - Handle empty candidates and absent grounding metadata
  - Fall back to plain-text extraction when citations cannot be generated
- Improve robustness and documentation
  - Defensive access to graph state keys
  - Fix docstring inconsistencies and remove redundant comments
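A minimal sketch of the quota-exhaustion fallback described above, assuming hypothetical names for the grounded-search callable and the state keys (not the PR's exact code):

```python
from typing import Any, Callable


def web_research_with_fallback(state: dict, grounded_search: Callable[[str], Any]) -> dict:
    """Illustrative only: degrade gracefully when Google Search grounding fails.

    `grounded_search` stands in for the Gemini + Google Search call; the state
    keys mirror the PR description but may not match the actual implementation.
    """
    try:
        response = grounded_search(state["search_query"])
    except Exception as exc:  # real code should catch the client's specific error class
        if "429" in str(exc) or "RESOURCE_EXHAUSTED" in str(exc):
            # Quota exhausted: return a safe fallback state instead of crashing the graph.
            return {"sources_gathered": [], "web_research_result": ["Web search skipped: quota exhausted"]}
        raise

    # Guard against missing or partial responses: empty candidates or absent grounding metadata.
    candidates = getattr(response, "candidates", None) or []
    if not candidates or getattr(candidates[0], "grounding_metadata", None) is None:
        # Fall back to plain-text extraction when citations cannot be generated.
        return {"sources_gathered": [], "web_research_result": [getattr(response, "text", "") or ""]}

    return {"sources_gathered": [], "web_research_result": [response.text]}
```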
The search component operates on a user-provided directory and retrieves relevant documentation snippets directly from markdown files.

Architecture decision: I chose an extended-context, file-based search approach instead of a broad snippet search or vector database. Technical documentation often contains long code examples that are easily truncated by short RAG-style snippets. To preserve full examples while respecting Groq token limits, the search retrieves a small number of highly relevant results (top_k=2) with a larger context window (~100 lines). Relevance is determined using a deterministic keyword and phrase scoring mechanism without external dependencies. This design keeps the agent lightweight, reproducible, and suitable for offline evaluation on local documentation archives.
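A rough sketch of the kind of deterministic, dependency-free keyword scoring described above; the function name, signature, and weighting are illustrative assumptions, not the PR's actual code:

```python
from pathlib import Path


def search_markdown_directory(docs_dir: str, query: str, top_k: int = 2, context_lines: int = 100) -> list[tuple[str, str]]:
    """Return up to top_k (relative_path, snippet) pairs ranked by keyword counts."""
    terms = [t for t in query.lower().split() if len(t) > 2]
    scored: list[tuple[int, str, str]] = []

    for md_file in Path(docs_dir).rglob("*.md"):
        try:
            text = md_file.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            continue
        score = sum(text.lower().count(t) for t in terms)
        if score > 0:
            # Keep a wide context window so long code examples survive intact.
            snippet = "\n".join(text.splitlines()[:context_lines])
            scored.append((score, str(md_file.relative_to(docs_dir)), snippet))

    scored.sort(key=lambda item: item[0], reverse=True)
    return [(path, snippet) for _, path, snippet in scored[:top_k]]
```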
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
Summary of Changes

Hello @MaxAdmk, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the agent's underlying language model infrastructure and its primary information retrieval method. It transitions the agent from relying on Google Gemini models for various tasks to utilizing Groq's Llama 3.3 70B, aiming for potentially different performance characteristics. More importantly, it replaces the previous web search capability with a new local filesystem search, allowing the agent to ground its responses in a provided directory of markdown documentation rather than external internet sources. This change shifts the agent's operational paradigm from general web research to domain-specific knowledge retrieval from local files.
Code Review
This pull request refactors the agent to use Groq and Llama-3 models, and replaces web search with a local filesystem search over markdown files. The changes are substantial, introducing a custom local search implementation and updating the agent's graph and prompts accordingly.
My review highlights a critical issue with citation marker generation that could lead to incorrect source attribution. I've also noted several high and medium severity issues related to maintainability, including hardcoded values that override configurable arguments, brittle error handling, and broad exception catching. Addressing these points will improve the correctness and robustness of the new implementation.
```python
sources = []
result_chunks = []
for idx, (rel_path, snippet) in enumerate(snippets):
    marker = f"[S{idx}]"
```
The citation markers (e.g., [S0], [S1]) are generated using enumerate's index. Since multiple web_research nodes can run in parallel, this will create conflicting markers (e.g., multiple [S0] sources from different files), leading to incorrect source attribution in the final answer. The markers must be unique across all parallel searches. You can use the unique id from the state to calculate a globally unique index for each snippet.
| marker = f"[S{idx}]" | |
| marker_id = state["id"] * 2 + idx | |
| marker = f"[S{marker_id}]" |
```python
if state.get("initial_search_query_count") is None:
    state["initial_search_query_count"] = configurable.number_of_initial_queries
# Limit query count to reduce token usage
state["initial_search_query_count"] = 2
```
The number of initial search queries is hardcoded to 2. This makes the --initial-queries command-line argument defined in cli_research.py ineffective. This behavior should be reconciled: either the CLI argument should be removed, or the logic should be updated to respect the argument if it's provided.
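A possible fix along the second route, sketched with the field names shown above (assuming they match the actual code):

```python
if state.get("initial_search_query_count") is None:
    # Respect the configured value (e.g. from --initial-queries) rather than
    # overriding it unconditionally with a hardcoded 2.
    state["initial_search_query_count"] = configurable.number_of_initial_queries
```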
```python
except Exception:
    return 0, ""
```
Using a bare except Exception: is too broad and can hide unexpected errors or catch system-level exceptions, making debugging difficult. It's better to catch more specific exceptions that you expect to occur, such as IOError or UnicodeDecodeError when reading files.
```diff
-except Exception:
-    return 0, ""
+except (IOError, UnicodeDecodeError):
+    return 0, ""
```
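For what it's worth, IOError has been an alias of OSError since Python 3.3, so `except (OSError, UnicodeDecodeError):` is equivalent and arguably states the intent more directly.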
```diff
 web_search_model: str = Field(
     default="gemini-2.0-flash",
     metadata={
-        "description": "The name of the language model to use for the agent's answer."
+        "description": "Gemini model name to use for Google Search grounding."
     },
 )
```
```python
valid_chunks = [
    r for r in all_results
    if not ("No local markdown results found" in r or "no docs_dir provided" in r or "Web search skipped" in r)
    and r.strip()
]
```
The logic to filter out invalid search results relies on matching hardcoded error strings. This is brittle and will break if the error messages in the web_research node are changed. Consider returning a structured object from web_research that includes a status, or at least use constants for the error messages to make this more robust.
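A sketch of the constants-based option, reusing `all_results` from the snippet above (the constant names are suggestions, not existing identifiers):

```python
# Defined once, e.g. near the web_research node that emits these messages,
# so the producer and the filter cannot silently drift apart.
NO_LOCAL_RESULTS_MSG = "No local markdown results found"
NO_DOCS_DIR_MSG = "no docs_dir provided"
WEB_SEARCH_SKIPPED_MSG = "Web search skipped"
INVALID_RESULT_MARKERS = (NO_LOCAL_RESULTS_MSG, NO_DOCS_DIR_MSG, WEB_SEARCH_SKIPPED_MSG)

valid_chunks = [
    r for r in all_results
    if r.strip() and not any(marker in r for marker in INVALID_RESULT_MARKERS)
]
```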
```python
    return []


# Small set of common English stopwords for filtering
STOPWORDS = {
```
```python
text_lower = text.lower()

# Score file path matches (boost)
path_score = sum(path_lower.count(t) * 3 for t in terms) + sum(path_lower.count(p) * 5 for p in phrases)
```
The _search_markdown_directory function uses several "magic numbers" for scoring weights (e.g., 3, 5) and snippet slicing (e.g., 20, 100, 120). These should be defined as named constants at the top of the function or module. This improves readability and makes it easier to adjust these parameters in the future. For example: PATH_TERM_WEIGHT = 3, SNIPPET_LINES_BEFORE = 20.
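For example (the constant names are illustrative; `path_lower`, `terms`, and `phrases` come from the snippet above, and only the 3, 5, and 20 values appear in the current code or this comment):

```python
PATH_TERM_WEIGHT = 3       # boost per query-term occurrence in the file path
PATH_PHRASE_WEIGHT = 5     # boost per full-phrase occurrence in the file path
SNIPPET_LINES_BEFORE = 20  # context lines kept before a matching line

path_score = (
    sum(path_lower.count(t) * PATH_TERM_WEIGHT for t in terms)
    + sum(path_lower.count(p) * PATH_PHRASE_WEIGHT for p in phrases)
)
```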
commit 1: refactor to Groq + llama-3.3-70b
commit 2: replace web search with local filesystem search