Skip to content

Feature: CLI Rich Rendering & Block Navigation — Streaming Markdown, Syntax Diffs & Output Indexing #684

@teknium1

Description

@teknium1

Overview

Upgrade the Hermes CLI output rendering from basic ANSI text to rich streaming markdown with syntax-highlighted code blocks, colored diffs, content zone attribution, and navigable output blocks. This is a significant refactor of how cli.py handles agent output — moving from raw print() to a managed output buffer.

This brings the Hermes terminal experience to parity with Toad (best markdown rendering), Aider (best diff display), and Claude Code (best context management) — the three CLI agents that set the standard for terminal UX in 2025-2026.

Split from #504 for atomicity. The companion feature (status bar + token tracking) is tracked in #683.


Research Findings

Terminal Agent UX Leaders

Toad CLI — Best markdown rendering:

  • High-performance streaming markdown with tables, syntax highlighting
  • Notebook-like interactions: navigate/reuse previous conversation blocks
  • SVG export of conversation content

Aider — Best diff display:

  • /diff command to preview changes before applying
  • Git-native atomic commits with descriptive messages
  • Color-coded diff output with file headers

Claude Code — Best context management:

  • /compact with configurable summary focus
  • "Document & Clear" workflow (dump to file, clear window, resume)
  • Custom slash commands

Key Patterns Missing from Hermes CLI

  1. Streaming markdown: Output is plain text with basic ANSI. No table rendering, no header sizing, no nested list indentation during streaming.
  2. Diff highlighting: The patch tool returns diffs in tool results, but they're not syntax-highlighted or visually distinct.
  3. Content zones: No visual distinction between agent prose, tool calls, tool results, and system messages beyond basic formatting.
  4. Block navigation: Can't jump back to a specific code block, copy it, or reference it.

Current State in Hermes Agent

Output model (cli.py):

  • Agent output goes through _cprint() which uses print_formatted_text() with ANSI passthrough
  • Output is NOT in a scrollable panel — it just prints to stdout above the prompt_toolkit input area
  • The application runs with full_screen=False
  • Tool progress is displayed via print() from the agent thread with throttled repainting (0.25s intervals)
  • Response is wrapped in a simple box: ╭─ ⚕ Hermes ──╮ ... ╰──────╯

The fundamental challenge: Moving to rich rendering requires changing the output from "print and scroll away" to "managed buffer that can be navigated, searched, and re-rendered." This likely means switching to full_screen=True with a scrollable output Window.

Rich library is already imported but used minimally. The banner uses Rich panels and tables. Output is mostly plain text with ANSI codes.


Implementation Plan

Skill vs. Tool Classification

This is a core codebase change requiring a significant refactor of cli.py's output pipeline. Cannot be expressed as a skill.

Architecture Decision: Output Buffer

The key architectural question is how to manage output. Options:

Option A: Rich Console Capture

  • Keep full_screen=False
  • Use rich.console.Console with record=True for structured output
  • Render markdown via rich.markdown.Markdown
  • Render diffs via custom rich.syntax.Syntax
  • Still prints to stdout, but with much better formatting
  • Pro: minimal refactor. Con: no navigation, no scrollback management.

Option B: Full-Screen TUI with Output Panel

Option C: Hybrid (Recommended)

  • Phase 1: Option A (rich formatting via Rich console, no layout change)
  • Phase 2: Option B (full TUI overhaul if there's demand)

Phased Rollout

Phase 1: Rich Output Formatting (Option A — minimal refactor)

  • Replace _cprint() ANSI passthrough with Rich console rendering
  • Markdown rendering: headers, tables, code blocks with syntax highlighting, lists, blockquotes
  • Diff rendering: when patch tool returns a diff, render with green/red coloring and file headers
  • Content zone styling:
    • Agent prose: default color
    • Tool calls: dimmed/italic with emoji prefix (enhance existing system)
    • Tool results: distinct background or left-border
    • System messages: yellow/amber
    • Errors: red with actionable context
  • Inline code with distinct background color
  • Code blocks with language label and copy hint

Phase 2: Block Indexing & Navigation (requires Option B)

  • Assign IDs to output blocks: [1] code block, [2] diff, [3] table
  • /block <id> command to display a specific block
  • /copy <id> command to copy block content to clipboard
  • /blocks command to list all blocks with previews
  • Keyboard shortcuts: Ctrl+Up/Down to jump between blocks
  • Full-screen mode with scrollable output panel
  • Search within output: /search <term> or Ctrl+F

Phase 3: Advanced Features

  • /transcript command: navigable view of full conversation with blocks
  • /export command: save conversation as well-formatted markdown
  • Split-pane mode: show tool progress in a side panel while output streams
  • Subagent monitoring panel: when delegate_task is active, show status
  • Side-by-side diff display (togglable from unified)

Pros & Cons

Pros

  • Rich markdown makes output dramatically more readable
  • Diff highlighting makes code review faster and more accurate
  • Content zones reduce cognitive load (instantly see what's agent vs tool vs system)
  • Phase 1 is achievable without major refactoring (just improve rendering)
  • Block navigation (Phase 2) enables precise interaction with past output

Cons / Risks

  • prompt_toolkit layout changes are fragile and hard to test
  • Streaming markdown renderer must handle partial content (mid-code-block, mid-table)
  • Full-screen mode (Phase 2) breaks terminal scrollback history
  • Performance: complex rendering could slow streaming display
  • Terminal compatibility: mosh, tmux, SSH may not support all ANSI features
  • cli.py is already 3,620 lines — needs modularization before major refactor

Prerequisite Work


Open Questions

  • Option A (Rich formatting) or Option B (full TUI) for Phase 1? (Recommend A)
  • Should cli.py be split into modules before this work starts?
  • How do we handle terminals that don't support true color? Graceful 256/16-color degradation?
  • Should block IDs be persistent across scrollback or reset per response?
  • Is Textual (Rich's TUI framework) worth evaluating for Phase 2, or stay with prompt_toolkit?
  • Should we add a --plain flag for users who prefer unformatted output?

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions