feat: make Tavily web search query context-aware when PDF is uploaded #258

Open
nkmohit wants to merge 3 commits into THU-MAIC:main from nkmohit:feature/context-aware-web-search-query

nkmohit (Contributor) commented Mar 24, 2026

Summary

This PR makes Tavily web search queries context-aware when a PDF is attached, and aligns both active generation paths with the same rewrite behavior.

Previously, the preview flow sent the raw requirement directly to /api/web-search, so vague prompts like “Tell me about this paper” were passed to Tavily unchanged even when parsed PDF text was available. This PR adds a shared server-side query rewrite step that can use uploaded PDF text and also improves overly long raw queries before they reach Tavily.

Related Issues

Fixes #246

Changes

  • Added a shared server-side search query builder that:
    • rewrites queries when PDF text is present
    • rewrites queries when the raw requirement exceeds 400 characters
    • uses the existing prompt system and JSON parsing flow
    • falls back to the normalized raw requirement if rewrite is unavailable or unusable
  • Added a new prompt template pair for web-search-query-rewrite
  • Updated /api/web-search to:
    • accept optional pdfText
    • resolve the model from request headers, matching other preview-generation routes
    • rewrite the search query before calling Tavily
  • Updated generation-preview to send parsed pdfText to /api/web-search
  • Updated generateClassroom(...) to reuse the same shared query-rewrite helper instead of keeping separate local rewrite logic
  • Kept Tavily request/response behavior unchanged apart from the improved query input
  • Kept web-search rewrite as best-effort: if rewrite model resolution or output fails, web search falls back to the raw query instead of failing the entire flow
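
The rewrite-or-fall-back behavior listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual helper: the `rewrite` callback stands in for the LLM call, and `normalize`, `buildQuery`, and the `usedFallback` field are hypothetical names. The 400-character threshold and the `hasPdfContext` / `rewriteAttempted` flags come from the description in this PR.

```typescript
// Hypothetical sketch of a best-effort search query builder.
const MAX_RAW_QUERY_CHARS = 400; // threshold stated in the PR description

interface SearchQueryResult {
  query: string;
  hasPdfContext: boolean;
  rewriteAttempted: boolean;
  usedFallback: boolean; // assumed field, for illustration only
}

// Stand-in for the LLM rewrite call; returns null when no usable query was produced.
type Rewriter = (requirement: string, pdfText: string) => Promise<string | null>;

// Collapse whitespace runs and trim (an assumed definition of "normalized").
function normalize(text: string | undefined): string {
  return (text ?? "").replace(/\s+/g, " ").trim();
}

// Expects already-normalized inputs, so nothing is normalized twice.
function shouldRewrite(requirement: string, pdfText: string): boolean {
  return pdfText.length > 0 || requirement.length > MAX_RAW_QUERY_CHARS;
}

async function buildQuery(
  rawRequirement: string,
  rawPdfText: string | undefined,
  rewrite: Rewriter,
): Promise<SearchQueryResult> {
  const requirement = normalize(rawRequirement);
  const pdfText = normalize(rawPdfText);
  const hasPdfContext = pdfText.length > 0;

  if (!shouldRewrite(requirement, pdfText)) {
    return { query: requirement, hasPdfContext, rewriteAttempted: false, usedFallback: false };
  }

  try {
    const rewritten = normalize((await rewrite(requirement, pdfText)) ?? "");
    if (rewritten.length > 0) {
      return { query: rewritten, hasPdfContext, rewriteAttempted: true, usedFallback: false };
    }
  } catch {
    // Best-effort: a failed rewrite must not fail the whole search.
  }
  // Rewrite was unavailable or unusable: fall back to the normalized raw requirement.
  return { query: requirement, hasPdfContext, rewriteAttempted: true, usedFallback: true };
}
```

Injecting the rewriter keeps the decision logic testable without a model, and every failure path ends in the same fallback return.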

Type of Change

  • New feature (non-breaking change that adds functionality)

Verification

Steps to reproduce / test

  1. Start the app locally with pnpm dev
  2. Go through the homepage -> generation-preview flow with:
    • web search enabled
    • a PDF attached
    • a vague requirement such as “Tell me about this paper”
  3. Confirm that /api/web-search is called with pdfText (not only the raw vague requirement) and that the request succeeds
  4. Optionally call /api/generate-classroom directly with pdfContent.text and enableWebSearch: true to exercise the server classroom-generation path as well

What you personally verified

  • Verified that the active preview flow now passes pdfText into /api/web-search
  • Verified that /api/web-search uses the same header-based model resolution pattern as the other preview generation APIs
  • Verified that preview web search logs show:
    • hasPdfContext: true
    • rewriteAttempted: true
  • Verified that the preview flow continues successfully through outlines / scene-content / scene-actions after the web-search change
  • Verified that the shared helper is reused by both /api/web-search and generateClassroom(...)
  • Verified that web-search rewrite failure does not hard-fail preview; it falls back to the raw requirement
  • Did not fully resolve the separate pre-existing inconsistency in generateClassroom(...) base model selection, where that job path still uses server-default model resolution and may fail if the server default provider is not configured

Evidence

  • CI passes (pnpm check && pnpm lint && npx tsc --noEmit)
  • Manually tested locally
  • Screenshots / recordings attached (if UI changes)

PDF attached: web search rewrite uses PDF context and returns relevant sources.
[Screenshot: 2026-03-24, 11:10:35 PM]

No PDF: long requirement is rewritten before Tavily and still returns relevant sources.
[Screenshot: 2026-03-24, 11:12:11 PM]

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have added/updated documentation as needed
  • My changes do not introduce new warnings

cosarah (Collaborator) left a comment


PR #258 Review: Context-Aware Web Search Query Rewrite

What Was Done Well

  • Clean extraction of a shared buildSearchQuery helper — eliminates duplication between /api/web-search and generateClassroom.
  • Solid best-effort / graceful-degradation pattern: every failure path falls back to the raw query with logging.
  • SearchQueryBuildResult metadata is well-designed for observability.
  • Prompt templates follow existing project conventions.

Issues

Important

1. maxOutputTokens set to model's full outputWindow for a tiny query-rewrite task
app/api/web-search/route.ts:56

maxOutputTokens: modelInfo?.outputWindow,

The rewrite prompt asks for a single JSON object with a query string under 320 chars. The model's outputWindow can be 128k tokens. Some providers allocate resources or charge based on the requested maxOutputTokens. Consider capping to a small value (e.g., 256–512 tokens) to avoid waste and speed up responses. The same pattern in classroom-generation.ts:200 is reasonable since it generates full scenes, but for a query rewrite it's excessive.
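
One way to express such a cap is sketched below. The `ModelInfo` shape, the constant, and the function name are assumptions made for illustration; the value 256 is simply the low end of the range suggested above.

```typescript
// Hypothetical cap for the rewrite-only call.
const REWRITE_MAX_OUTPUT_TOKENS = 256;

// Assumed minimal shape; only outputWindow is referenced in this review.
interface ModelInfo {
  outputWindow?: number;
}

// Never request more than the cap, and never more than the model allows.
function rewriteOutputTokenLimit(modelInfo?: ModelInfo): number {
  const outputWindow = modelInfo?.outputWindow;
  if (outputWindow === undefined || !Number.isFinite(outputWindow) || outputWindow <= 0) {
    return REWRITE_MAX_OUTPUT_TOKENS;
  }
  return Math.min(outputWindow, REWRITE_MAX_OUTPUT_TOKENS);
}
```

The call site would then pass `maxOutputTokens: rewriteOutputTokenLimit(modelInfo)` instead of the full window.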

2. Double-normalization in shouldRewriteSearchQuery
lib/server/search-query-builder.ts:34-36

shouldRewriteSearchQuery normalizes its inputs internally, but buildSearchQuery on line 47 passes already-normalized values into it. Not a bug (result is the same), but confusing and wasteful. The function should either accept raw inputs or skip re-normalization.

3. PDF text accepted without size limits at the API boundary
app/api/web-search/route.ts:29

pdfText is accepted with no validation or truncation at the API layer. normalizePdfExcerpt truncates to 7000 chars before the LLM, but a client could still send an arbitrarily large payload that gets parsed and held in memory. Consider adding an explicit check/truncation at the API boundary, or at minimum documenting the reliance on the framework body limit.
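
A boundary check along these lines might look like the sketch below. The function name and constant are hypothetical; the 7000-character figure mirrors the limit attributed to normalizePdfExcerpt above, and the route may prefer a different value.

```typescript
// Hypothetical boundary clamp for the optional pdfText field.
const MAX_PDF_TEXT_CHARS = 7000; // mirrors the excerpt limit mentioned above

// Accepts untrusted input: non-strings become "", long strings are truncated.
function clampPdfText(value: unknown, maxChars: number = MAX_PDF_TEXT_CHARS): string {
  if (typeof value !== "string") {
    return "";
  }
  // Note: slice counts UTF-16 code units, so a surrogate pair at the cut may be split.
  return value.length > maxChars ? value.slice(0, maxChars) : value;
}
```

This bounds what later stages see, but it does not bound the size of the request body that was already parsed.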

Minor

4. Content-Type header set twice
app/generation-preview/page.tsx:312-314

getApiHeaders() already sets 'Content-Type': 'application/json'. The explicit override on line 314 is redundant — remove it.

5. User prompt template contradicts itself on code fences
lib/generation/prompts/templates/web-search-query-rewrite/user.md:13-18

Says "no code fences" then immediately shows a code-fenced JSON example. This could confuse the LLM. Show the example without fences, or rephrase the instruction.
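
Independently of the prompt fix, the parsing side can be made tolerant of fenced output. The sketch below is an assumption-laden stand-in for the project's existing JSON parsing flow, not a copy of it: the `{ "query": ... }` shape and the 320-character bound are taken from the prompt description in this review, and `extractQuery` is a hypothetical name.

```typescript
// Hypothetical tolerant parser for the rewrite model's output.
const FENCE = "`".repeat(3); // built at runtime to avoid a literal fence in this example
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "i");

// Returns the rewritten query, or null when the output is unusable.
function extractQuery(modelOutput: string, maxChars: number = 320): string | null {
  // Unwrap an optional code-fenced block; otherwise use the text as-is.
  const fenced = modelOutput.match(FENCED_BLOCK);
  const candidate = (fenced ? fenced[1] : modelOutput).trim();
  try {
    const parsed: unknown = JSON.parse(candidate);
    if (typeof parsed === "object" && parsed !== null) {
      const value = (parsed as { query?: unknown }).query;
      if (typeof value === "string") {
        const query = value.trim();
        // Treat empty or over-long queries as unusable (assumed policy).
        return query.length > 0 && query.length <= maxChars ? query : null;
      }
    }
  } catch {
    // Not valid JSON: the caller falls back to the raw requirement.
  }
  return null;
}
```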

6. Inconsistent callLLM calling convention
app/api/web-search/route.ts:51-54 uses { system, prompt } while classroom-generation.ts:196-199 uses { messages: [...] } for the same logical operation. Minor consistency gap.


Summary

Well-structured PR with clean architecture and robust fallback behavior. The most actionable item is Issue #1 — capping maxOutputTokens for the rewrite call would be a meaningful cost/latency improvement for what is effectively a one-line JSON generation. The other issues are minor consistency and hygiene items.

Verdict: request changes (for issue #1 primarily)

nkmohit (Contributor, Author) commented Mar 25, 2026

Thanks for the detailed review, @cosarah. I pushed a follow-up commit addressing the review items.

Changes made:

  • Capped rewrite-only maxOutputTokens to 256 in both /api/web-search and generateClassroom(...)
  • Removed the double-normalization in shouldRewriteSearchQuery
  • Added explicit pdfText truncation at the /api/web-search boundary before entering the rewrite flow
  • Removed the redundant Content-Type override in generation-preview
  • Fixed the prompt JSON examples so they no longer conflict with the no-code-fences instruction
  • Aligned the rewrite callLLM usage to messages: [...] in both paths

The rewrite behavior itself is unchanged: it is still best-effort and still falls back to the normalized raw requirement if the rewrite step is unavailable or unusable.

One small note on the pdfText clamp: it now limits what the rewrite flow sees, but it still happens after req.json(), so it does not change request-parse memory behavior.
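
If request-parse memory ever needs addressing, one option is to reject obviously oversized requests from the declared Content-Length before calling req.json(). The sketch below is only an illustration: the 1 MiB limit, the function name, and the `HeaderReader` interface are assumptions, and a missing or inaccurate header means a framework-level body limit is still the real safeguard.

```typescript
// Hypothetical pre-parse size check based on the declared Content-Length.
const MAX_BODY_BYTES = 1024 * 1024; // assumed limit, for illustration

// Minimal interface satisfied by the standard Headers object.
interface HeaderReader {
  get(name: string): string | null;
}

// True only when the client declares a body larger than the limit.
function declaredBodyTooLarge(headers: HeaderReader, maxBytes: number = MAX_BODY_BYTES): boolean {
  const raw = headers.get("content-length");
  if (raw === null) {
    return false; // absent (e.g. chunked transfer): cannot tell from headers alone
  }
  const declared = Number(raw);
  return Number.isFinite(declared) && declared > maxBytes;
}
```

A route could return a 413 response when this check is true and only then proceed to parse the body.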

Ready for re-review when you have time.

Development

Successfully merging this pull request may close these issues.

[Feature]: Context-aware web search query when files are uploaded

2 participants