Feature: @ Context References — Scoped File, Folder, Diff & URL Injection in Messages #682

@teknium1

## Overview

Add an @ reference system that lets users scope what context the agent sees by including @file:path, @folder:dir, @diff, @staged, @git:N, and @url:https://... in their messages. These references are expanded before the message reaches the LLM, and the referenced content is injected as additional context.

This is a common interaction pattern in modern AI coding tools. Cursor has @codebase, @file, @folder, @web, @docs. Aider has /add and /read. Toad has @ with fuzzy file search. Claude Code has context management commands. Hermes has none of these: users must manually paste file contents or hope the agent searches correctly.

Split from #502 for atomicity. The companion feature (.hermes.md project config) is tracked in #681.


## Research Findings

### How Competitors Do It

Cursor — Most mature:

  • @file — include specific file as context
  • @folder — scope to a directory
  • @codebase — semantic search across entire repo (embeddings)
  • @web — search the web and include results
  • @docs — reference indexed documentation
  • @git — reference git history/diffs
  • Fuzzy matching with auto-complete dropdown
  • .cursorignore for exclusions

Aider — CLI-native:

  • /add <file> — add to editable context
  • /read <file> — add as read-only context
  • /drop <file> — remove from context
  • Shows token cost of each added file

Toad — Terminal agent:

  • @ triggers fuzzy file search (respects .gitignore)
  • Tab completion for file paths

### Key Design Decision

The @ prefix is the prevailing convention (Cursor and Toad use it; Aider uses slash commands instead). Auto-complete is essential for usability, and token cost visibility helps prevent context overflow.


## Current State in Hermes Agent

No @ reference parsing exists. The message flow is:

  1. User types in prompt_toolkit TextArea (CLI) or sends chat message (gateway)
  2. CLI: paste reference expansion (regex for [Pasted text #N])
  3. Image attachment → multimodal content conversion
  4. Raw message string passed to run_conversation()

The natural insertion point for @ expansion is between step 2 and step 3 in CLI (cli.py ~line 3383), and in the gateway message handler for messaging platforms.

What we DO have:

  • read_file tool — agent can read files, but user must describe what to read
  • search_files tool — agent can search, but no structured @ syntax
  • Paste reference expansion — proves the "expand references in message" pattern works
  • SlashCommandCompleter — existing tab-completion infrastructure in prompt_toolkit

## Implementation Plan

### Skill vs. Tool Classification

This should be a core codebase change. It requires deterministic message preprocessing (regex parsing, file I/O, token estimation) before the LLM sees the message. It touches the CLI input pipeline and gateway message handler.

### Reference Types

| Reference | Syntax | Expansion |
|---|---|---|
| File | `@file:path/to/file.py` | Full file contents (or truncated with line range) |
| File range | `@file:path/to/file.py:10-50` | Lines 10-50 of file |
| Folder | `@folder:src/` | Directory tree listing (files + sizes) |
| Git diff | `@diff` | Current git diff output |
| Git staged | `@staged` | git diff --staged output |
| Git log | `@git:5` | Last 5 git log entries with diffs |
| URL | `@url:https://example.com` | Extracted web content (via web_extract) |
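
As an illustration of the file and file-range rows, the following is a minimal sketch of how a `@file:` target might be split into a path and an optional line range and then expanded. The names `split_file_target` and `expand_file` are hypothetical, and treating the range as 1-based and inclusive is an assumption drawn from the "Lines 10-50" wording.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Hypothetical pattern: a path, optionally followed by ":<start>-<end>".
_TARGET_RE = re.compile(r"^(?P<path>.+?)(?::(?P<start>\d+)-(?P<end>\d+))?$")


def split_file_target(target: str) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Split 'path/to/file.py:10-50' into ('path/to/file.py', (10, 50))."""
    match = _TARGET_RE.match(target)
    if match is None:
        raise ValueError(f"empty or invalid @file target: {target!r}")
    if match.group("start") is None:
        return match.group("path"), None
    start, end = int(match.group("start")), int(match.group("end"))
    if start < 1 or end < start:
        raise ValueError(f"invalid line range in {target!r}")
    return match.group("path"), (start, end)


def expand_file(target: str, root: Path) -> str:
    """Return the whole file, or only the requested lines (1-based, inclusive)."""
    rel_path, line_range = split_file_target(target)
    text = (root / rel_path).read_text(encoding="utf-8", errors="replace")
    if line_range is None:
        return text
    start, end = line_range
    return "\n".join(text.splitlines()[start - 1:end])
```

A range that runs past the end of the file simply yields the lines that exist; whether that case should produce a warning instead is left open here.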

### Detailed Specification

Parsing rules:

  • Match @(file|folder|git|url):<value> patterns and the bare keywords @diff and @staged in user messages
  • Must NOT match email addresses (user@domain.com): require a known prefix, plus the : delimiter for reference types that take a value
  • Must NOT match social media handles (@username): require a known prefix
  • References can appear anywhere in the message, including mid-sentence
  • Multiple references per message supported
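
The parsing rules above could be sketched with a single regular expression. This is an illustrative sketch, not the project's parser: `REF_RE`, `Reference`, and `parse_references` are hypothetical names, and stripping trailing sentence punctuation from a value is an added assumption so that references work mid-sentence.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# "@" must not follow a word character or another "@", which rules out
# emails such as user@domain.com. Only known prefixes are accepted, which
# rules out handles such as @username.
REF_RE = re.compile(
    r"(?<![\w@])@(?:(?P<kind>file|folder|git|url):(?P<value>\S+)"
    r"|(?P<bare>diff|staged)\b)"
)


@dataclass(frozen=True)
class Reference:
    kind: str
    value: Optional[str]  # None for the bare keywords @diff and @staged


def parse_references(message: str) -> Tuple[str, List[Reference]]:
    """Return (message with references removed, references in order)."""
    refs: List[Reference] = []

    def _collect(match: "re.Match[str]") -> str:
        if match.group("bare"):
            refs.append(Reference(match.group("bare"), None))
            return ""
        raw = match.group("value")
        # Assumption: trailing sentence punctuation is not part of the value.
        value = raw.rstrip(".,;!?")
        if not value:
            return match.group(0)  # nothing usable; leave the text untouched
        refs.append(Reference(match.group("kind"), value))
        return raw[len(value):]  # keep the stripped punctuation in the text

    stripped = REF_RE.sub(_collect, message)
    return " ".join(stripped.split()), refs
```

For example, `parse_references("compare @file:a.py with @diff")` yields a `file` reference with value `a.py` and a bare `diff` reference, while `bob@example.com` and `@alice` are left in the text unchanged.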

Expansion format:

[User's original message with @references removed]

--- Attached Context ---

📄 @file:src/main.py (2,340 tokens)
```python
[file contents]
```

📁 @folder:src/ (directory listing)
```
src/
├── main.py (156 lines)
├── config.py (42 lines)
├── utils/
│   ├── helpers.py (89 lines)
│   └── constants.py (23 lines)
```
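
A directory listing in the style shown above could be produced by a small recursive walk. The sketch below is one possible approach under stated assumptions: `count_lines`, `render_tree`, and `expand_folder` are hypothetical names, files are listed before subdirectories, and ignore files (`.gitignore`, `.hermesignore`) and symlink loops are not handled.

```python
from pathlib import Path
from typing import List


def count_lines(path: Path) -> int:
    """Count newline-delimited lines without decoding the file."""
    with path.open("rb") as handle:
        return sum(1 for _ in handle)


def render_tree(root: Path, prefix: str = "") -> List[str]:
    """Return tree lines for the entries below root, files before directories."""
    entries = sorted(root.iterdir(), key=lambda p: (p.is_dir(), p.name))
    lines: List[str] = []
    for index, entry in enumerate(entries):
        last = index == len(entries) - 1
        branch = "└── " if last else "├── "
        if entry.is_dir():
            lines.append(f"{prefix}{branch}{entry.name}/")
            # Children of the last entry no longer need the vertical guide.
            lines.extend(render_tree(entry, prefix + ("    " if last else "│   ")))
        else:
            lines.append(f"{prefix}{branch}{entry.name} ({count_lines(entry)} lines)")
    return lines


def expand_folder(root: Path) -> str:
    """Render the folder name followed by its tree."""
    return "\n".join([f"{root.name}/"] + render_tree(root))
```

A depth limit or entry cap would likely be needed for large repositories before a listing like this is injected into a message.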


**Token budget**:
- Show token estimate for each expanded reference
- Show total injected tokens: `[@ context: 4,230 tokens injected]`
- Warn if total exceeds 25% of model context window
- Hard limit: refuse to expand if total would exceed 50% of context
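
The budget rules above can be sketched as a single check. This is a rough illustration: the four-characters-per-token estimate is an assumption standing in for whatever tokenizer Hermes actually uses, and `estimate_tokens`, `BudgetResult`, and `check_budget` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

WARN_FRACTION = 0.25  # warn above 25% of the context window
HARD_FRACTION = 0.50  # refuse above 50% of the context window


def estimate_tokens(text: str) -> int:
    """Crude estimate, assuming roughly four characters per token."""
    return len(text) // 4


@dataclass
class BudgetResult:
    total_tokens: int
    allowed: bool
    warning: bool
    summary: str


def check_budget(expansions: List[str], context_window: int) -> BudgetResult:
    """Apply the warn/refuse thresholds to the combined expansions."""
    total = sum(estimate_tokens(text) for text in expansions)
    allowed = total <= context_window * HARD_FRACTION
    warning = allowed and total > context_window * WARN_FRACTION
    if allowed:
        summary = f"[@ context: {total:,} tokens injected]"
    else:
        summary = f"[@ context: refused, {total:,} tokens exceeds budget]"
    return BudgetResult(total, allowed, warning, summary)
```

With a 1,000-token window, 300 estimated tokens would be injected with a warning, and 600 would be refused.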

**CLI tab-completion**:
- After typing `@file:`, trigger file path completion (using prompt_toolkit completer)
- Respect `.gitignore` patterns
- Respect `.hermesignore` if present (#681)
- Show file size/line count in completion hints
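
The completion behavior above can be sketched as a pure function that a prompt_toolkit completer could call. The names `current_file_fragment` and `complete_paths` are hypothetical, and the candidate list is assumed to be already filtered by the ignore rules (for example, taken from `git ls-files --cached --others --exclude-standard`).

```python
from typing import Iterable, List, Optional, Tuple

FILE_PREFIX = "@file:"


def current_file_fragment(text_before_cursor: str) -> Optional[str]:
    """Return the partial path typed after '@file:', or None if not in one."""
    if not text_before_cursor or text_before_cursor[-1].isspace():
        return None
    word = text_before_cursor.split()[-1]
    if not word.startswith(FILE_PREFIX):
        return None
    return word[len(FILE_PREFIX):]


def complete_paths(
    text_before_cursor: str, candidates: Iterable[str]
) -> List[Tuple[str, int]]:
    """Return (completion text, start offset) pairs for matching paths.

    The offset is negative: it says how many already-typed characters
    the completion text replaces.
    """
    fragment = current_file_fragment(text_before_cursor)
    if fragment is None:
        return []
    return [
        (path, -len(fragment))
        for path in sorted(candidates)
        if path.startswith(fragment)
    ]
```

Inside a `Completer.get_completions(document, complete_event)` implementation, each pair could be yielded as `Completion(text, start_position=offset)`, with `document.text_before_cursor` as the input. File size or line count hints could go in `display_meta`.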

**Gateway support**:
- Parse @ references from plain text messages on Telegram/Discord/etc.
- File paths resolve relative to the agent's CWD
- `@url:` works on all platforms (uses web_extract tool internally)
- `@diff` and `@git:` work when CWD is a git repo
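
The git references could map to commands roughly as follows. This is a sketch under assumptions: `git_command` and `run_git` are hypothetical names, `@git:N` is read as `git log -n N --patch`, and a failure (such as a CWD that is not a git repo) is returned as a placeholder string rather than raised.

```python
import subprocess
from typing import List, Optional


def git_command(kind: str, value: Optional[str] = None) -> List[str]:
    """Build the git argv for a @diff, @staged, or @git:N reference."""
    if kind == "diff":
        return ["git", "diff"]
    if kind == "staged":
        return ["git", "diff", "--staged"]
    if kind == "git":
        if value is None or not value.isdecimal() or int(value) < 1:
            raise ValueError(f"@git expects a positive integer, got {value!r}")
        return ["git", "log", "-n", str(int(value)), "--patch"]
    raise ValueError(f"not a git reference kind: {kind!r}")


def run_git(kind: str, value: Optional[str] = None, cwd: str = ".") -> str:
    """Run the command in cwd and return its output or a placeholder."""
    result = subprocess.run(
        git_command(kind, value),
        cwd=cwd,
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return f"[git reference unavailable: {result.stderr.strip()}]"
    return result.stdout
```

Passing the arguments as a list avoids shell interpolation of user-supplied values, which matters on gateway platforms where the message text comes from chat.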

### Deliverables

- [ ] Message preprocessor: regex parser for @ references
- [ ] File expander: read files with optional line ranges
- [ ] Folder expander: directory tree with file metadata
- [ ] Git expander: diff, staged, log with diffs
- [ ] URL expander: fetch via web_extract
- [ ] Token estimator: count tokens per expansion, show totals
- [ ] CLI tab-completion for `@file:` paths (prompt_toolkit completer)
- [ ] Gateway integration: parse @ refs in messaging platform messages
- [ ] Budget enforcement: warn/refuse on excessive context injection
- [ ] Tests for parsing, expansion, edge cases, email/handle exclusion

---

## Pros & Cons

### Pros
- Gives users precise control over what the agent sees
- Reduces tokens wasted when the agent searches for the wrong files
- Follows established patterns — Cursor/Aider/Toad users will feel at home
- Tab completion makes it fast and discoverable
- Token budget visibility prevents silent context overflow
- Works on both CLI and messaging platforms

### Cons / Risks
- Parsing must carefully avoid false positives (emails, social handles)
- Large file expansion can consume significant context window
- Cross-platform: @ references on Telegram feel awkward (no tab completion)
- File paths are relative to CWD — may confuse users working in different directories
- Gateway users can't easily discover available files (no tab completion)

---

## Open Questions

- Should `@file:path` use `:` as delimiter (explicit) or just `@path` (shorter but ambiguous)?
- Should we support `@file:path/to/*.py` glob patterns for multiple files?
- How do we handle binary files referenced via `@file:`? (Skip with warning?)
- Should expanded context go at the top of the message (before user text) or bottom (after)?
- For gateway platforms without tab completion, should we offer a `/context add` command as alternative syntax?

---

## References

- [Cursor @ mentions](https://docs.cursor.com/) — @codebase, @file, @folder, @web
- [Aider context commands](https://aider.chat/docs/usage/commands.html) — /add, /read, /drop
- [Toad CLI](https://github.com/toadhq/toad) — @ fuzzy file search in terminal
- Existing code: cli.py paste reference expansion (~line 3383) — proves the pattern
- Existing code: `SlashCommandCompleter` — extensible for @ completion
- Supersedes Phase 2 of closed #502
- Related: #681 (.hermes.md project config — companion feature)
- Related: #489 (Semantic Codebase Search — future @codebase support)
- Related: #535 (PageRank Repo Map — future @codebase support)
- Note: @codebase semantic search is OUT OF SCOPE for this issue — see #489/#535
