Add optional sentence-window retrieval controls to file_search in Responses API #5390
Open
Labels: enhancement (New feature or request)
Description
🚀 Describe the new functionality needed
Add support for sentence-window retrieval (also called contextual expansion / neighboring chunk expansion) as an optional parameter on file_search in the Responses API.
Proposed direction:
- Extend the file_search tool config with optional fields, e.g.:
  - sentence_window_size (integer, default 0)
  - optionally, window_unit ("sentences" now, extensible later)
- Example:
{ "type": "file_search", "sentence_window_size": 2 }
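To make the intended behavior concrete, here is a minimal sketch of how a window size could expand a single matched sentence into its neighbors. The function name `expand_sentence_window` and the assumption that the document is already split into a list of sentences are hypothetical and for illustration only; the proposal does not yet define how splitting or indexing would work.

```python
from typing import List


def expand_sentence_window(
    sentences: List[str], hit_index: int, window_size: int = 0
) -> List[str]:
    """Return the matched sentence plus up to `window_size` neighbors per side.

    Hypothetical helper: `sentences` is assumed to be the pre-split document
    and `hit_index` the position of the sentence that matched the query.
    """
    if window_size < 0:
        raise ValueError("window_size must be >= 0")
    if not 0 <= hit_index < len(sentences):
        raise IndexError("hit_index out of range")
    # Clamp the window at the start and end of the document.
    start = max(0, hit_index - window_size)
    end = min(len(sentences), hit_index + window_size + 1)
    return sentences[start:end]


doc = ["A.", "B.", "C.", "D.", "E."]
print(expand_sentence_window(doc, hit_index=2, window_size=1))
```

With the default of 0 the function returns only the matched sentence, which is consistent with the proposal that existing file_search behavior stays unchanged unless the field is set.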
💡 Why is this needed? What if we don't build it?
RAG quality often depends on not just the matched sentence/chunk, but its immediate context. Window retrieval improves answer grounding by including local neighboring text that contains definitions, qualifiers, and references.
Without this:
- Retrieval can return overly narrow snippets, reducing answer quality.
- Users must implement custom post-processing outside the API.
- Different clients will duplicate similar logic, creating inconsistent behavior.
- It is harder to evaluate and tune retrieval quality in a standard, portable way.
Other thoughts
Compatibility: Optional params on file_search seem the cleanest and most elegant path (this agrees with the current discussion).
Alternatives considered:
- Metadata-based config (metadata.file_search_sentence_window_size) is less discoverable/typed.
- Vendor extension (x_llama_stack) is flexible but fragments API usage.
- Separate tool adds surface area and complexity for a small behavioral variation.
Trade-offs / complexity:
- Must define interaction with chunking strategy and max context/token limits.
- Should add tests for boundary cases (start/end of doc, short docs, multilingual punctuation).
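One possible way to reconcile the window with a token limit is to grow it outward from the match, nearest neighbors first, and stop once the budget would be exceeded. The sketch below assumes a hypothetical `max_tokens` budget and uses whitespace splitting as a stand-in token counter; a real implementation would use the model's tokenizer and whatever limit semantics the API settles on.

```python
from typing import Callable, List


def whitespace_token_count(text: str) -> int:
    # Stand-in tokenizer (assumption): counts whitespace-separated words.
    return len(text.split())


def expand_within_budget(
    sentences: List[str],
    hit_index: int,
    window_size: int,
    max_tokens: int,
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> List[str]:
    """Grow a contiguous window around the match until size or budget is hit."""
    if not 0 <= hit_index < len(sentences):
        raise IndexError("hit_index out of range")
    start = end = hit_index  # inclusive bounds of the current window
    # The matched sentence is always kept, even if it alone exceeds the budget.
    used = count_tokens(sentences[hit_index])
    for distance in range(1, window_size + 1):
        # Try the left neighbor first, then the right one, at each distance.
        for candidate in (hit_index - distance, hit_index + distance):
            if not 0 <= candidate < len(sentences):
                continue  # ran past the start or end of the document
            cost = count_tokens(sentences[candidate])
            if used + cost > max_tokens:
                return sentences[start : end + 1]
            used += cost
            start = min(start, candidate)
            end = max(end, candidate)
    return sentences[start : end + 1]


doc = ["one two", "three four five", "six", "seven eight", "nine"]
print(expand_within_budget(doc, hit_index=2, window_size=2, max_tokens=6))
```

Checking the left neighbor before the right one is an arbitrary tie-break chosen for the sketch; the boundary cases listed above (start/end of doc, short docs) fall out of the range check, while multilingual punctuation would be a concern for the sentence splitter rather than for this step.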