Summary
This proposal adds an adaptive RAG prefetcher and topology-aware retrieval planning layer for docker-agent. The idea comes from analyzing Aether-Lang's sparse attention graph, hierarchical block tree, adaptive epsilon governor, and drift detector, then translating the useful algorithms into docker-agent's Go RAG/runtime architecture without importing the Aether runtime.
Motivation
Long-running agent sessions repeatedly query the same RAG sources, tool outputs, and nearby semantic regions. Today docker-agent indexes documents in the background and runs retrieval on demand, but it does not pre-warm likely next retrievals or adapt retrieval breadth based on topic drift. That leaves latency on the critical model/tool path and misses a chance to use existing turn signals.
Proposed direction
- Add a small RAG prefetch subsystem that records recent user prompts, RAG queries, returned chunks, tool calls, and turn boundaries.
- Represent recent activity as lightweight topology metadata: centroid, radius, variance, concentration, and drift.
- Use an adaptive threshold to decide whether the session is stable enough to prefetch nearby query candidates, or has drifted far enough that prefetching would be stale work and should be skipped.
- Warm query embeddings and optionally cached retrieval results for likely follow-up queries.
- Keep the feature opt-in and observable through debug logs/events/metrics.
Candidate implementation surfaces
- pkg/rag/manager.go: coordinate prefetch lifecycle around real Query calls.
- pkg/rag/strategy/vector_store.go: add query embedding/result prewarm hooks without changing indexing semantics.
- pkg/runtime/loop.go: emit turn/user/tool signals to the prefetcher at safe boundaries.
- pkg/config/latest/types.go and agent-schema.json: add latest-only config.
- docs/tools/rag/index.md and examples: document the feature.
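As a rough illustration of the query-embedding prewarm hook, the following Go sketch shows an opt-in cache that a manager could consult before embedding a real query. `Embedder`, `Prefetcher`, `NewPrefetcher`, `Warm`, and `Lookup` are all hypothetical names invented for this example; none of them is taken from the existing pkg/rag code, and the real integration points may look quite different.

```go
package main

import (
	"fmt"
	"sync"
)

// Embedder is a hypothetical stand-in for the component that embeds query
// text. A real one would likely also accept a context.Context.
type Embedder func(text string) ([]float64, error)

// Prefetcher is a hypothetical opt-in cache of query embeddings.
type Prefetcher struct {
	enabled bool
	embed   Embedder

	mu    sync.Mutex
	cache map[string][]float64
}

// NewPrefetcher builds a prefetcher; enabled=false makes Warm a no-op.
func NewPrefetcher(enabled bool, embed Embedder) *Prefetcher {
	return &Prefetcher{enabled: enabled, embed: embed, cache: make(map[string][]float64)}
}

// Warm embeds each candidate that is not cached yet and returns how many
// embeddings were added. It stops at the first embedding error.
// Concurrent callers may embed the same text twice; the result is still correct.
func (p *Prefetcher) Warm(candidates []string) (int, error) {
	if !p.enabled {
		return 0, nil
	}
	warmed := 0
	for _, q := range candidates {
		if _, ok := p.Lookup(q); ok {
			continue
		}
		vec, err := p.embed(q) // outside the lock: embedding may be slow
		if err != nil {
			return warmed, fmt.Errorf("prewarm %q: %w", q, err)
		}
		p.mu.Lock()
		p.cache[q] = vec
		p.mu.Unlock()
		warmed++
	}
	return warmed, nil
}

// Lookup returns a previously warmed embedding, if any.
func (p *Prefetcher) Lookup(query string) ([]float64, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	vec, ok := p.cache[query]
	return vec, ok
}

func main() {
	calls := 0
	fake := func(text string) ([]float64, error) { // toy embedder for the demo
		calls++
		return []float64{float64(len(text))}, nil
	}
	p := NewPrefetcher(true, fake)
	n, err := p.Warm([]string{"list containers", "inspect network", "list containers"})
	fmt.Println("warmed:", n, "err:", err, "embed calls:", calls)
	_, hit := p.Lookup("inspect network")
	fmt.Println("cache hit:", hit)
}
```

Keeping the cache behind a separate type means the indexing path is untouched, which matches the constraint of not changing indexing semantics. Eviction, cache-size limits, and result prewarming are left out of this sketch.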