diff --git a/CLAUDE.md b/CLAUDE.md
index 461f25d2..6dc7cd4c 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,6 +1,6 @@
 # Claude Code Log

-A Python CLI tool that converts Claude transcript JSONL files into readable HTML format.
+A Python CLI tool that converts Claude Code transcript JSONL files into readable HTML format.

 ## Project Overview

@@ -11,319 +11,89 @@ This tool processes Claude Code transcript files (stored as JSONL) and generates
 - **Interactive TUI (Terminal User Interface)**: Browse and manage Claude Code sessions with real-time navigation, summaries, and quick actions for HTML export and session resuming
 - **Project Hierarchy Processing**: Process entire `~/.claude/projects/` directory with linked index page
 - **Individual Session Files**: Generate separate HTML files for each session with navigation links
-- **Single File or Directory Processing**: Convert individual JSONL files or specific directories
 - **Session Navigation**: Interactive table of contents with session summaries and quick navigation
 - **Token Usage Tracking**: Display token consumption for individual messages and session totals
-- **Runtime Message Filtering**: JavaScript-powered filtering to show/hide message types (user, assistant, system, tool use, etc.)
-- **Chronological Ordering**: All messages sorted by timestamp across sessions
-- **Cross-Session Summary Matching**: Properly match async-generated summaries to their original sessions
-- **Date Range Filtering**: Filter messages by date range using natural language (e.g., "today", "yesterday", "last week")
-- **Rich Message Types**: Support for user/assistant messages, tool use/results, thinking content, images
-- **System Command Visibility**: Show system commands (like `init`) in expandable details with structured parsing
-- **Markdown Rendering**: Server-side markdown rendering with syntax highlighting using mistune
-- **Interactive Timeline**: Optional vis-timeline visualization showing message chronology across all types, with click-to-scroll navigation (implemented in JavaScript within the HTML template)
-- **Floating Navigation**: Always-available back-to-top button and filter controls
-- **Space-Efficient Layout**: Compact design optimised for content density
-- **CLI Interface**: Simple command-line tool using Click
+- **Runtime Message Filtering**: JavaScript-powered filtering to show/hide message types
+- **Interactive Timeline**: vis-timeline visualization with click-to-scroll navigation
+- **Date Range Filtering**: Filter messages using natural language (e.g., "today", "yesterday")
+- **Markdown Rendering**: Server-side markdown with syntax highlighting using mistune

 ## Usage

-### Interactive TUI (Terminal User Interface)
-
-The TUI provides an interactive interface for browsing and managing Claude Code sessions with real-time navigation, session summaries, and quick actions.
-
 ```bash
-# Launch TUI for all projects (default behavior)
-claude-code-log --tui
-
-# Launch TUI for specific project directory
-claude-code-log /path/to/project --tui
-
-# Launch TUI for specific Claude project
-claude-code-log my-project --tui # Automatically converts to ~/.claude/projects/-path-to-my-project
-```
-
-**TUI Features:**
-
-- **Session Listing**: Interactive table showing session IDs, summaries, timestamps, message counts, and token usage
-- **Smart Summaries**: Prioritizes Claude-generated summaries over first user messages for better session identification
-- **Working Directory Matching**: Automatically finds and opens projects matching your current working directory
-- **Quick Actions**:
-  - `h` or "Export to HTML" button: Generate and open session HTML in browser
-  - `c` or "Resume in Claude Code" button: Continue session with `claude -r `
-  - `r` or "Refresh" button: Reload session data from files
-  - `p` or "Projects View" button: Switch to project selector view
-- **Project Statistics**: Real-time display of total sessions, messages, tokens, and date range
-- **Cache Integration**: Leverages existing cache system for fast loading with automatic cache validation
-- **Keyboard Navigation**: Arrow keys to navigate, Enter to expand row details, `q` to quit
-- **Row Expansion**: Press Enter to expand selected row showing full summary, first user message, working directory, and detailed token usage
-
-**Working Directory Matching:**
-
-The TUI automatically identifies Claude projects that match your current working directory by:
-
-1. **Cache-based Matching**: Uses stored working directories from session messages (`cwd` field)
-2. **Path-based Matching**: Falls back to Claude's project naming convention (replacing `/` with `-`)
-3. **Smart Prioritization**: When multiple projects are found, prioritizes those matching your current directory
-4. **Subdirectory Support**: Matches parent directories, so you can run the TUI from anywhere within a project
-
-### Default Behavior (Process All Projects)
-
-```bash
-# Process all projects in ~/.claude/projects/ (default behavior)
+# Process all projects (default)
 claude-code-log

-# Explicitly process all projects
-claude-code-log --all-projects
+# Interactive TUI
+claude-code-log --tui
+
+# Single file/directory
+claude-code-log path/to/transcript.jsonl

-# Process all projects and open in browser
+# With browser open
 claude-code-log --open-browser

-# Process all projects with date filtering
+# Date filtering (natural language)
 claude-code-log --from-date "yesterday" --to-date "today"
 claude-code-log --from-date "last week"
-
-# Skip individual session files (only create combined transcripts)
-claude-code-log --no-individual-sessions
-```
-
-This creates:
-
-- `~/.claude/projects/index.html` - Master index with project cards and statistics
-- `~/.claude/projects/project-name/combined_transcripts.html` - Individual project pages
-- `~/.claude/projects/project-name/session-{session-id}.html` - Individual session pages
-
-### Single File or Directory Processing
-
-```bash
-# Single file
-claude-code-log transcript.jsonl
-
-# Specific directory
-claude-code-log /path/to/transcript/directory
-
-# Custom output location
-claude-code-log /path/to/directory -o combined_transcripts.html
-
-# Open in browser after conversion
-claude-code-log /path/to/directory --open-browser
-
-# Filter by date range (supports natural language)
-claude-code-log /path/to/directory --from-date "yesterday" --to-date "today"
-claude-code-log /path/to/directory --from-date "3 days ago" --to-date "yesterday"
 ```

-## File Structure
-
-- `claude_code_log/parser.py` - Data extraction and parsing from JSONL files
-- `claude_code_log/renderer.py` - HTML generation and template rendering
-- `claude_code_log/renderer_timings.py` - Performance timing instrumentation
-- `claude_code_log/converter.py` - High-level conversion orchestration
-- `claude_code_log/cli.py` - Command-line interface with project discovery
-- `claude_code_log/models.py` - Pydantic models for transcript data structures
-- `claude_code_log/tui.py` - Interactive Terminal User Interface using Textual
-- `claude_code_log/cache.py` - Cache management for performance optimization
-- `claude_code_log/templates/` - Jinja2 HTML templates
-  - `transcript.html` - Main transcript viewer template
-  - `index.html` - Project directory index template
-- `pyproject.toml` - Project configuration with dependencies
+**Output Files Generated:**
+- `~/.claude/projects/index.html` - Master index with project cards
+- `~/.claude/projects/project-name/combined_transcripts.html` - Combined project page
+- `~/.claude/projects/project-name/session-{id}.html` - Individual session pages

 ## Development

-The project uses:
-
-- Python 3.10+ with uv package management
-- Click for CLI interface and argument parsing
-- Textual for interactive Terminal User Interface
-- Pydantic for robust data modelling and validation
-- Jinja2 for HTML template rendering
-- dateparser for natural language date parsing
-- Standard library for JSON/HTML processing
-- Minimal dependencies for portability
-- mistune for quick Markdown rendering
-
-## Development Commands
-
-### Testing
-
-The project uses a categorized test system to avoid async event loop conflicts between different testing frameworks:
+See @CONTRIBUTING.md for detailed development setup, testing, architecture, and release process.

-#### Test Categories
+### Claude-Specific Testing Tips

-- **Unit Tests** (no mark): Fast, standalone tests with no external dependencies
-- **TUI Tests** (`@pytest.mark.tui`): Tests for the Textual-based Terminal User Interface
-- **Browser Tests** (`@pytest.mark.browser`): Playwright-based tests that run in real browsers
-- **Snapshot Tests**: HTML regression tests using syrupy (runs with unit tests)
-
-#### Snapshot Testing
-
-Snapshot tests detect unintended HTML output changes using [syrupy](https://github.com/syrupy-project/syrupy):
-
-```bash
-# Run snapshot tests
-uv run pytest -n auto test/test_snapshot_html.py -v
-
-# Update snapshots after intentional HTML changes
-uv run pytest -n auto test/test_snapshot_html.py --snapshot-update
-```
-
-#### Running Tests
+**Always use `-n auto` for parallel test execution:**

 ```bash
-# Run only unit tests (fast, recommended for development)
+# Unit tests (fast, recommended for development)
 just test
 # or: uv run pytest -n auto -m "not (tui or browser)" -v

-# Run TUI tests (isolated event loop)
+# TUI tests
 just test-tui
 # or: uv run pytest -n auto -m tui

-# Run browser tests (requires Chromium)
+# Browser tests
 just test-browser
 # or: uv run pytest -n auto -m browser

-# Run all tests in sequence (separated to avoid conflicts)
+# All tests
 just test-all
-
-# Run tests with coverage (all categories)
-just test-cov
-```
-
-#### Prerequisites
-
-Browser tests require Chromium to be installed:
-
-```bash
-uv run playwright install chromium
 ```

-#### Why Test Categories?
-
-The test suite is categorized because:
-
-- **TUI tests** use Textual's async event loop (`run_test()`)
-- **Browser tests** use Playwright's internal asyncio
-- **pytest-asyncio** manages async test execution
-
-Running all tests together can cause "RuntimeError: This event loop is already running" conflicts. The categorization ensures reliable test execution by isolating different async frameworks.
+**Tip:** Add `-x` to stop on first failure (e.g., `uv run pytest -n auto -m "not (tui or browser)" -v -x`).

-### Test Coverage
-
-Generate test coverage reports:
+### Code Quality

 ```bash
-# Run all tests with coverage (recommended)
-just test-cov
-
-# Or run coverage manually
-uv run pytest -n auto --cov=claude_code_log --cov-report=html --cov-report=term
-
-# Generate HTML coverage report only
-uv run pytest -n auto --cov=claude_code_log --cov-report=html
-
-# View coverage in terminal
-uv run pytest -n auto --cov=claude_code_log --cov-report=term-missing
+ruff format # Format code
+ruff check --fix # Lint and fix
+uv run pyright # Type check
+uv run ty check # Alternative type check
 ```

-HTML coverage reports are generated in `htmlcov/index.html`.
-
-### Code Quality
-
-- **Format code**: `ruff format`
-- **Lint and fix**: `ruff check --fix`
-- **Type checking**: `uv run pyright` and `uv run ty check`
-
 ### Performance Profiling

-Enable timing instrumentation to identify performance bottlenecks:
-
 ```bash
-# Enable timing output
 CLAUDE_CODE_LOG_DEBUG_TIMING=1 claude-code-log path/to/file.jsonl
-
-# Or export for a session
-export CLAUDE_CODE_LOG_DEBUG_TIMING=1
-claude-code-log path/to/file.jsonl
-```
-
-This outputs detailed timing for each rendering phase:
-
-```
-[TIMING] Initialization 0.001s (total: 0.001s)
-[TIMING] Deduplication (1234 messages) 0.050s (total: 0.051s)
-[TIMING] Session summary processing 0.012s (total: 0.063s)
-[TIMING] Main message processing loop 5.234s (total: 5.297s)
-[TIMING] Template rendering (30MB chars) 15.432s (total: 20.729s)
-
-[TIMING] Loop statistics:
-[TIMING] Total messages: 1234
-[TIMING] Average time per message: 4.2ms
-[TIMING] Slowest 10 messages:
-[TIMING] Message abc-123 (#42, assistant): 245.3ms
-[TIMING] ...
-
-[TIMING] Pygments highlighting:
-[TIMING] Total operations: 89
-[TIMING] Total time: 1.234s
-[TIMING] Slowest 10 operations:
-[TIMING] def-456: 50.2ms
-[TIMING] ...
-```
-
-The timing module is in `claude_code_log/renderer_timings.py`.
-
-### Testing & Style Guide
-
-- **Unit and Integration Tests**: See [test/README.md](test/README.md) for comprehensive testing documentation
-- **Visual Style Guide**: `uv run python scripts/generate_style_guide.py`
-- **Manual Testing**: Use representative test data in `test/test_data/`
-
-Test with Claude transcript JSONL files typically found in `~/.claude/projects/` directories.
-
-### Dependency management
-
-The project uses `uv` so:
-
-```sh
-# Add a new dep
-uv add textual
-
-# Remove a dep
-uv remove textual
 ```

-## Architecture Notes
-
-### Data Models
-
-The application uses Pydantic models to parse and validate transcript JSON data:
-
-- **TranscriptEntry**: Union of UserTranscriptEntry, AssistantTranscriptEntry, SummaryTranscriptEntry
-- **UsageInfo**: Token usage tracking (input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens)
-- **ContentItem**: Union of TextContent, ToolUseContent, ToolResultContent, ThinkingContent, ImageContent
-
-### Template System
-
-Uses Jinja2 templates for HTML generation:
-
-- **Session Navigation**: Generates table of contents with timestamp ranges and token summaries
-- **Message Rendering**: Handles different content types with appropriate formatting
-- **Token Display**: Shows usage for individual assistant messages and session totals
-
-### Token Usage Features
-
-- **Individual Messages**: Assistant messages display token usage in header
-- **Session Aggregation**: ToC shows total tokens consumed per session
-- **Format**: "Input: X | Output: Y | Cache Creation: Z | Cache Read: W"
-- **Data Source**: Extracted from AssistantMessage.usage field in JSONL
+### Timeline Component

-### Session Management
+The interactive timeline is implemented in JavaScript within `claude_code_log/templates/components/timeline.html`. It parses message types from CSS classes generated by the Python renderer.

-- **Session Detection**: Groups messages by sessionId field
-- **Summary Attachment**: Links session summaries via leafUuid -> message UUID -> session ID mapping
-- **Timestamp Tracking**: Records first and last timestamp for each session
-- **Navigation**: Generates clickable ToC with session previews and metadata
+**Important**: When adding new message types or modifying CSS class generation in `renderer.py`, ensure the timeline's message type detection logic is updated accordingly to maintain feature parity. Also, make sure that the filter is still applied consistently to the messages both in the main transcript and in the timeline. You can use Playwright to test browser runtime features.

-### Timeline Component
+## Architecture

-The interactive timeline is implemented in JavaScript within `claude_code_log/templates/components/timeline.html`. It parses message types from CSS classes generated by the Python renderer. **Important**: When adding new message types or modifying CSS class generation in `renderer.py`, ensure the timeline's message type detection logic is updated accordingly to maintain feature parity. Also, make sure that the filter is still applied consistently to the messages both in the main transcript and in the timeline. You can use Playwright to test browser runtime features.
+For detailed architecture documentation, see:
+- [dev-docs/rendering-architecture.md](dev-docs/rendering-architecture.md) - Data flow and rendering pipeline
+- [dev-docs/messages.md](dev-docs/messages.md) - Message type reference
+- [dev-docs/css-classes.md](dev-docs/css-classes.md) - CSS class combinations
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 00000000..14857e05
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,265 @@
+# Contributing to Claude Code Log
+
+This guide covers development setup, testing, architecture, and release processes for contributors.
+
+## Prerequisites
+
+- Python 3.10+
+- [uv](https://docs.astral.sh/uv/) package manager
+
+## Getting Started
+
+```bash
+git clone https://github.com/daaain/claude-code-log.git
+cd claude-code-log
+uv sync
+```
+
+## File Structure
+
+```
+claude_code_log/
+├── cli.py              # Command-line interface with project discovery
+├── tui.py              # Interactive Terminal User Interface (Textual)
+├── parser.py           # Data extraction and parsing from JSONL files
+├── renderer.py         # Format-neutral message processing and tree building
+├── renderer_timings.py # Performance timing instrumentation
+├── converter.py        # High-level conversion orchestration
+├── models.py           # Pydantic models for transcript data structures
+├── cache.py            # Cache management for performance optimization
+├── factories/          # Transcript entry to MessageContent transformation
+│   ├── meta_factory.py
+│   ├── user_factory.py
+│   ├── assistant_factory.py
+│   ├── tool_factory.py
+│   └── system_factory.py
+├── html/               # HTML-specific rendering
+│   ├── renderer.py
+│   ├── user_formatters.py
+│   ├── assistant_formatters.py
+│   ├── system_formatters.py
+│   ├── tool_formatters.py
+│   └── utils.py
+├── markdown/           # Markdown output rendering
+│   └── renderer.py
+└── templates/          # Jinja2 HTML templates
+    ├── transcript.html
+    ├── index.html
+    └── components/
+        └── timeline.html
+
+scripts/                # Development utilities
+test/test_data/         # Representative JSONL samples
+dev-docs/               # Architecture documentation
+```
+
+## Development Setup
+
+The project uses:
+
+- Python 3.10+ with uv package management
+- Click for CLI interface
+- Textual for Terminal User Interface
+- Pydantic for data modeling and validation
+- Jinja2 for HTML template rendering
+- mistune for Markdown rendering
+- dateparser for natural language date parsing
+
+### Dependency Management
+
+```bash
+# Add a new dependency
+uv add textual
+
+# Remove a dependency
+uv remove textual
+
+# Sync dependencies
+uv sync
+```
+
+## Testing
+
+The project uses a categorized test system to avoid async event loop conflicts.
+
+### Test Categories
+
+- **Unit Tests** (no mark): Fast, standalone tests
+- **TUI Tests** (`@pytest.mark.tui`): Textual-based TUI tests
+- **Browser Tests** (`@pytest.mark.browser`): Playwright-based browser tests
+- **Snapshot Tests**: HTML regression tests using syrupy
+
+### Running Tests
+
+```bash
+# Unit tests only (fast, recommended for development)
+just test
+# or: uv run pytest -n auto -m "not (tui or browser)" -v
+
+# TUI tests (isolated event loop)
+just test-tui
+
+# Browser tests (requires Chromium)
+just test-browser
+
+# All tests in sequence
+just test-all
+
+# Tests with coverage
+just test-cov
+```
+
+### Snapshot Testing
+
+Snapshot tests detect unintended HTML output changes using [syrupy](https://github.com/syrupy-project/syrupy):
+
+```bash
+# Run snapshot tests
+uv run pytest -n auto test/test_snapshot_html.py -v
+
+# Update snapshots after intentional HTML changes
+uv run pytest -n auto test/test_snapshot_html.py --snapshot-update
+```
+
+When snapshot tests fail:
+1. Review the diff to verify changes are intentional
+2. If intentional, run `--snapshot-update` to accept new output
+3. If unintentional, fix your code and re-run tests
+
+### Test Prerequisites
+
+Browser tests require Chromium:
+
+```bash
+uv run playwright install chromium
+```
+
+### Why Test Categories?
+
+The test suite is categorized because different async frameworks conflict:
+
+- **TUI tests** use Textual's async event loop (`run_test()`)
+- **Browser tests** use Playwright's internal asyncio
+- **pytest-asyncio** manages async test execution
+
+Running all tests together can cause "RuntimeError: This event loop is already running". The categorization ensures reliable test execution.
+
+### Test Coverage
+
+```bash
+# Run with coverage
+just test-cov
+
+# Or manually:
+uv run pytest -n auto --cov=claude_code_log --cov-report=html --cov-report=term
+```
+
+HTML coverage reports are generated in `htmlcov/index.html`.
+
+### Testing Resources
+
+- See [test/README.md](test/README.md) for comprehensive testing documentation
+- Visual Style Guide: `uv run python scripts/generate_style_guide.py`
+- Test data in `test/test_data/`
+
+## Code Quality
+
+```bash
+# Format code
+ruff format
+
+# Lint and fix
+ruff check --fix
+
+# Type checking
+uv run pyright
+uv run ty check
+```
+
+## Performance Profiling
+
+Enable timing instrumentation to identify bottlenecks:
+
+```bash
+CLAUDE_CODE_LOG_DEBUG_TIMING=1 claude-code-log path/to/file.jsonl
+```
+
+This outputs detailed timing for each rendering phase. The timing module is in `claude_code_log/renderer_timings.py`.
+
+## Architecture
+
+For detailed architecture documentation, see [dev-docs/rendering-architecture.md](dev-docs/rendering-architecture.md).
+
+### Data Flow Overview
+
+```
+JSONL File
+    ↓ (parser.py)
+list[TranscriptEntry]
+    ↓ (factories/)
+list[TemplateMessage] with MessageContent
+    ↓ (renderer.py)
+Tree of TemplateMessage (roots with children)
+    ↓ (html/renderer.py or markdown/renderer.py)
+Final output (HTML or Markdown)
+```
+
+### Data Models
+
+The application uses Pydantic models to parse and validate transcript JSON data:
+
+- **TranscriptEntry**: Union of User, Assistant, Summary, System, QueueOperation entries
+- **UsageInfo**: Token usage tracking (input/output tokens, cache tokens)
+- **ContentItem**: Union of Text, ToolUse, ToolResult, Thinking, Image content
+
+### Template System
+
+Uses Jinja2 templates for HTML generation:
+
+- **Session Navigation**: Table of contents with timestamp ranges and token summaries
+- **Message Rendering**: Handles different content types with appropriate formatting
+- **Token Display**: Shows usage for individual messages and session totals
+
+### Timeline Component
+
+The interactive timeline is implemented in JavaScript within `claude_code_log/templates/components/timeline.html`. When adding new message types or modifying CSS class generation, ensure the timeline's message type detection logic is updated accordingly.
+
+## Cache System
+
+The tool implements a caching system for performance:
+
+- **Location**: `.cache/` directory within each project folder
+- **Contents**: Pre-parsed session metadata (IDs, summaries, timestamps, token usage)
+- **Invalidation**: Automatic detection based on file modification times
+- **Performance**: 10-100x faster loading for large projects
+
+The cache automatically rebuilds when source files change or cache version changes.
+
+## Release Process
+
+The project uses automated releases with semantic versioning.
+
+### Quick Release
+
+```bash
+# Bump version and create release (patch/minor/major)
+just release-prep patch # Bug fixes
+just release-prep minor # New features
+just release-prep major # Breaking changes
+
+# Or specify exact version
+just release-prep 0.4.3
+
+# Preview what would be released
+just release-preview
+
+# Push to PyPI and create GitHub release
+just release-push
+```
+
+### GitHub Release Only
+
+```bash
+just github-release # For latest tag
+just github-release 0.4.2 # For specific version
+```
diff --git a/README.md b/README.md
index 9ce42066..e863ea55 100644
--- a/README.md
+++ b/README.md
@@ -136,167 +136,6 @@ claude-code-log /path/to/directory --from-date "yesterday" --to-date "today"
 claude-code-log /path/to/directory --from-date "3 days ago" --to-date "yesterday"
 ```

-## File Structure
-
-- `claude_code_log/parser.py` - Data extraction and parsing from JSONL files
-- `claude_code_log/renderer.py` - HTML generation and template rendering
-- `claude_code_log/converter.py` - High-level conversion orchestration
-- `claude_code_log/cli.py` - Command-line interface with project discovery
-- `claude_code_log/models.py` - Pydantic models for transcript data structures
-- `claude_code_log/templates/` - Jinja2 HTML templates
-  - `transcript.html` - Main transcript viewer template
-  - `index.html` - Project directory index template
-- `pyproject.toml` - Project configuration with dependencies
-
-## Development
-
-The project uses:
-
-- Python 3.10+ with uv package management
-- Click for CLI interface and argument parsing
-- Textual for interactive Terminal User Interface
-- Pydantic for robust data modeling and validation
-- dateparser for natural language date parsing
-- Standard library for JSON/HTML processing
-- Minimal dependencies for portability
-- mistune for quick Markdown rendering
-
-## Development Commands
-
-### Testing
-
-The project uses a categorized test system to avoid async event loop conflicts between different testing frameworks:
-
-#### Test Categories
-
-- **Unit Tests** (no mark): Fast, standalone tests with no external dependencies
-- **TUI Tests** (`@pytest.mark.tui`): Tests for the Textual-based Terminal User Interface
-- **Browser Tests** (`@pytest.mark.browser`): Playwright-based tests that run in real browsers
-
-#### Running Tests
-
-```bash
-# Run only unit tests (fast, recommended for development)
-uv run pytest -n auto -m "not (tui or browser)"
-
-# Run TUI tests (isolated event loop)
-uv run pytest -n auto -m tui
-
-# Run browser tests (requires Chromium)
-uv run pytest -n auto -m browser
-
-# Run all tests in sequence (separated to avoid conflicts)
-uv run pytest -n auto -m "not tui and not browser"; uv run pytest -n auto -m tui; uv run pytest -n auto -m browser
-```
-
-#### Prerequisites
-
-Browser tests require Chromium to be installed:
-
-```bash
-uv run playwright install chromium
-```
-
-#### Why Test Categories?
-
-The test suite is categorized because:
-
-- **TUI tests** use Textual's async event loop (`run_test()`)
-- **Browser tests** use Playwright's internal asyncio
-- **pytest-asyncio** manages async test execution
-
-Running all tests together can cause "RuntimeError: This event loop is already running" conflicts. The categorization ensures reliable test execution by isolating different async frameworks.
-
-### Test Coverage
-
-Generate test coverage reports:
-
-```bash
-# Run tests with coverage
-uv run pytest -n auto --cov=claude_code_log --cov-report=html --cov-report=term
-
-# Generate HTML coverage report only
-uv run pytest -n auto --cov=claude_code_log --cov-report=html
-
-# View coverage in terminal
-uv run pytest -n auto --cov=claude_code_log --cov-report=term-missing
-```
-
-HTML coverage reports are generated in `htmlcov/index.html`.
-
-**Comprehensive Testing & Style Guide**: The project includes extensive testing infrastructure and visual documentation. See [test/README.md](test/README.md) for details on:
-
-- **Unit Tests**: Template rendering, message type handling, edge cases
-- **Test Coverage**: 78%+ coverage across all modules with detailed reporting
-- **Visual Style Guide**: Interactive documentation showing all message types
-- **Representative Test Data**: Real-world JSONL samples for development
-- **Style Guide Generation**: Create visual documentation with `uv run python scripts/generate_style_guide.py`
-
-### Code Quality
-
-- **Format code**: `ruff format`
-- **Lint and fix**: `ruff check --fix`
-- **Type checking**: `uv run pyright` and `uv run ty check`
-
-### All Commands
-
-- **Test (Unit only)**: `uv run pytest -n auto`
-- **Test (TUI)**: `uv run pytest -n auto -m tui`
-- **Test (Browser)**: `uv run pytest -n auto -m browser`
-- **Test (All categories)**: `uv run pytest -n auto -m "not tui and not browser"; uv run pytest -n auto -m tui; uv run pytest -n auto -m browser`
-- **Test with Coverage**: `uv run pytest -n auto --cov=claude_code_log --cov-report=html --cov-report=term`
-- **Format**: `ruff format`
-- **Lint**: `ruff check --fix`
-- **Type Check**: `uv run pyright` and `uv run ty check`
-- **Generate Style Guide**: `uv run python scripts/generate_style_guide.py`
-
-Test with Claude transcript JSONL files typically found in `~/.claude/projects/` directories.
-
-## Release Process (For Maintainers)
-
-The project uses an automated release process with semantic versioning. Here's how to create and publish a new release:
-
-### Quick Release
-
-```bash
-# Bump version and create release (patch/minor/major)
-just release-prep patch # For bug fixes
-just release-prep minor # For new features
-just release-prep major # For breaking changes
-
-# Or specify exact version
-just release-prep 0.4.3
-
-# Preview what would be released
-just release-preview
-
-# Push to PyPI and create GitHub release
-just release-push
-```
-
-3. **GitHub Release Only**: If you need to create/update just the GitHub release:
-
-   ```bash
-   just github-release # For latest tag
-   just github-release 0.4.2 # For specific version
-   ```
-
-### Cache Structure and Benefits
-
-The tool implements a sophisticated caching system for performance:
-
-- **Cache Location**: `.cache/` directory within each project folder
-- **Session Metadata**: Pre-parsed session information (IDs, summaries, timestamps, token usage)
-- **Timestamp Index**: Enables fast date-range filtering without parsing full files
-- **Invalidation**: Automatic detection of stale cache based on file modification times
-- **Performance**: 10-100x faster loading for large projects with many sessions
-
-The cache is transparent to users and automatically rebuilds when:
-
-- Source JSONL files are modified
-- New sessions are added
-- Cache structure version changes
-
 ## Project Hierarchy Output

 When processing all projects, the tool generates:
@@ -373,6 +212,10 @@ uv sync
 uv run claude-code-log
 ```

+## Contributing
+
+See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, testing, and architecture documentation.
+
 ## TODO

 - tutorial overlay
diff --git a/claude_code_log/factories/assistant_factory.py b/claude_code_log/factories/assistant_factory.py
index 3f8b977d..06966c41 100644
--- a/claude_code_log/factories/assistant_factory.py
+++ b/claude_code_log/factories/assistant_factory.py
@@ -12,7 +12,6 @@
     AssistantTextMessage,
     ContentItem,
     MessageMeta,
-    TextContent,
     ThinkingContent,
     ThinkingMessage,
     UsageInfo,
@@ -69,14 +68,9 @@ def create_assistant_message(
     # Create AssistantTextMessage directly from items
     # (empty text already filtered by chunk_message_content)
     if items:
-        # Extract text content from items for dedup matching and simple renderers
-        text_content = "\n".join(
-            item.text for item in items if isinstance(item, TextContent)
-        )
         return AssistantTextMessage(
             meta,
             items=items,  # type: ignore[arg-type]
-            raw_text_content=text_content if text_content else None,
             token_usage=format_token_usage(usage) if usage else None,
         )
     return None
@@ -84,31 +78,22 @@ def create_assistant_message(

 def create_thinking_message(
     meta: MessageMeta,
-    tool_item: ContentItem,
+    thinking: ThinkingContent,
     usage: Optional[UsageInfo] = None,
 ) -> ThinkingMessage:
-    """Create ThinkingMessage from a thinking content item.
+    """Create ThinkingMessage from ThinkingContent.

     Args:
         meta: Message metadata.
-        tool_item: ThinkingContent or compatible object with 'thinking' attribute
+        thinking: ThinkingContent with thinking text and optional signature.
         usage: Optional token usage info to format and attach.

     Returns:
         ThinkingMessage containing the thinking text and optional signature.
     """
-    # Extract thinking text from the content item
-    if isinstance(tool_item, ThinkingContent):
-        thinking_text = tool_item.thinking.strip()
-        signature = getattr(tool_item, "signature", None)
-    else:
-        thinking_text = getattr(tool_item, "thinking", str(tool_item)).strip()
-        signature = None
-
-    # Create the content model (formatting happens in HtmlRenderer)
     return ThinkingMessage(
         meta,
-        thinking=thinking_text,
-        signature=signature,
+        thinking=thinking.thinking.strip(),
+        signature=thinking.signature,
         token_usage=format_token_usage(usage) if usage else None,
     )
diff --git a/claude_code_log/factories/tool_factory.py b/claude_code_log/factories/tool_factory.py
index 312472db..af55462a 100644
--- a/claude_code_log/factories/tool_factory.py
+++ b/claude_code_log/factories/tool_factory.py
@@ -20,11 +20,8 @@
 from ..models import (
     # Tool input models
     AskUserQuestionInput,
-    AskUserQuestionItem,
-    AskUserQuestionOption,
     BashInput,
     EditInput,
-    EditItem,
     ExitPlanModeInput,
     GlobInput,
     GrepInput,
@@ -34,7 +31,6 @@
     ReadInput,
     TaskInput,
     TodoWriteInput,
-    TodoWriteItem,
     ToolInput,
     ToolResultContent,
     ToolResultMessage,
@@ -74,149 +70,6 @@
 }


-# =============================================================================
-# Lenient Parsing Helpers
-# =============================================================================
-# These functions create typed models even when strict validation fails.
-# They use defaults for missing fields and skip invalid nested items.
- - -def _parse_todowrite_lenient(data: dict[str, Any]) -> TodoWriteInput: - """Parse TodoWrite input leniently, handling malformed data.""" - todos_raw = data.get("todos", []) - valid_todos: list[TodoWriteItem] = [] - for item in todos_raw: - if isinstance(item, dict): - try: - valid_todos.append(TodoWriteItem.model_validate(item)) - except Exception: - pass - elif isinstance(item, str): - valid_todos.append(TodoWriteItem(content=item)) - return TodoWriteInput(todos=valid_todos) - - -def _parse_bash_lenient(data: dict[str, Any]) -> BashInput: - """Parse Bash input leniently.""" - return BashInput( - command=data.get("command", ""), - description=data.get("description"), - timeout=data.get("timeout"), - run_in_background=data.get("run_in_background"), - ) - - -def _parse_write_lenient(data: dict[str, Any]) -> WriteInput: - """Parse Write input leniently.""" - return WriteInput( - file_path=data.get("file_path", ""), - content=data.get("content", ""), - ) - - -def _parse_edit_lenient(data: dict[str, Any]) -> EditInput: - """Parse Edit input leniently.""" - return EditInput( - file_path=data.get("file_path", ""), - old_string=data.get("old_string", ""), - new_string=data.get("new_string", ""), - replace_all=data.get("replace_all"), - ) - - -def _parse_multiedit_lenient(data: dict[str, Any]) -> MultiEditInput: - """Parse Multiedit input leniently.""" - edits_raw = data.get("edits", []) - valid_edits: list[EditItem] = [] - for edit in edits_raw: - if isinstance(edit, dict): - try: - valid_edits.append(EditItem.model_validate(edit)) - except Exception: - pass - return MultiEditInput(file_path=data.get("file_path", ""), edits=valid_edits) - - -def _parse_task_lenient(data: dict[str, Any]) -> TaskInput: - """Parse Task input leniently.""" - return TaskInput( - prompt=data.get("prompt", ""), - subagent_type=data.get("subagent_type", ""), - description=data.get("description", ""), - model=data.get("model"), - run_in_background=data.get("run_in_background"), - 
resume=data.get("resume"), - ) - - -def _parse_read_lenient(data: dict[str, Any]) -> ReadInput: - """Parse Read input leniently.""" - return ReadInput( - file_path=data.get("file_path", ""), - offset=data.get("offset"), - limit=data.get("limit"), - ) - - -def _parse_askuserquestion_lenient(data: dict[str, Any]) -> AskUserQuestionInput: - """Parse AskUserQuestion input leniently, handling malformed data.""" - questions_raw = data.get("questions", []) - valid_questions: list[AskUserQuestionItem] = [] - for q in questions_raw: - if isinstance(q, dict): - q_dict = cast(dict[str, Any], q) - try: - # Parse options leniently - options_raw = q_dict.get("options", []) - valid_options: list[AskUserQuestionOption] = [] - for opt in options_raw: - if isinstance(opt, dict): - try: - valid_options.append( - AskUserQuestionOption.model_validate(opt) - ) - except Exception: - pass - valid_questions.append( - AskUserQuestionItem( - question=str(q_dict.get("question", "")), - header=q_dict.get("header"), - options=valid_options, - multiSelect=bool(q_dict.get("multiSelect", False)), - ) - ) - except Exception: - pass - return AskUserQuestionInput( - questions=valid_questions, - question=data.get("question"), - ) - - -def _parse_exitplanmode_lenient(data: dict[str, Any]) -> ExitPlanModeInput: - """Parse ExitPlanMode input leniently.""" - return ExitPlanModeInput( - plan=data.get("plan", ""), - launchSwarm=data.get("launchSwarm"), - teammateCount=data.get("teammateCount"), - ) - - -# Mapping of tool names to their lenient parsers -TOOL_LENIENT_PARSERS: dict[str, Any] = { - "Bash": _parse_bash_lenient, - "Write": _parse_write_lenient, - "Edit": _parse_edit_lenient, - "MultiEdit": _parse_multiedit_lenient, - "Task": _parse_task_lenient, - "TodoWrite": _parse_todowrite_lenient, - "Read": _parse_read_lenient, - "AskUserQuestion": _parse_askuserquestion_lenient, - "ask_user_question": _parse_askuserquestion_lenient, # Legacy tool name - "ExitPlanMode": _parse_exitplanmode_lenient, -} - - # 
============================================================================= # Tool Input Creation # ============================================================================= @@ -227,7 +80,9 @@ def create_tool_input( ) -> Optional[ToolInput]: """Create typed tool input from raw dictionary. - Uses strict validation first, then lenient parsing if available. + Uses Pydantic model_validate for strict validation. On failure, returns None + and the caller should use ToolUseContent as the fallback (which preserves + all original data for display). Args: tool_name: The name of the tool (e.g., "Bash", "Read") @@ -235,18 +90,12 @@ def create_tool_input( Returns: A typed input model if parsing succeeds, None otherwise. - When None is returned, the caller should use ToolUseContent itself - as the fallback (it's part of the ToolInput union). """ model_class = TOOL_INPUT_MODELS.get(tool_name) if model_class is not None: try: return cast(ToolInput, model_class.model_validate(input_data)) except Exception: - # Try lenient parsing if available - lenient_parser = TOOL_LENIENT_PARSERS.get(tool_name) - if lenient_parser is not None: - return cast(ToolInput, lenient_parser(input_data)) return None return None diff --git a/claude_code_log/factories/transcript_factory.py b/claude_code_log/factories/transcript_factory.py index c70821e1..ae3a22d7 100644 --- a/claude_code_log/factories/transcript_factory.py +++ b/claude_code_log/factories/transcript_factory.py @@ -6,10 +6,10 @@ Also provides: - Conditional casts for TranscriptEntry discrimination -- Usage info normalization for Anthropic SDK compatibility +- Usage info normalization """ -from typing import Any, Callable, Optional, Sequence, cast +from typing import Any, Callable, Sequence, cast from pydantic import BaseModel @@ -75,38 +75,11 @@ def as_assistant_entry(entry: TranscriptEntry) -> AssistantTranscriptEntry | Non # ============================================================================= -def 
normalize_usage_info(usage_data: Any) -> Optional[UsageInfo]: - """Normalize usage data from various formats to UsageInfo.""" +def normalize_usage_info(usage_data: dict[str, Any] | None) -> UsageInfo | None: + """Normalize usage data from JSON to UsageInfo.""" if usage_data is None: return None - - # If it's already a UsageInfo instance, return as-is - if isinstance(usage_data, UsageInfo): - return usage_data - - # If it's a dict, validate and convert - if isinstance(usage_data, dict): - return UsageInfo.model_validate(usage_data) - - # Handle object-like access (e.g., from SDK types) - if hasattr(usage_data, "input_tokens"): - server_tool_use = getattr(usage_data, "server_tool_use", None) - if server_tool_use is not None and hasattr(server_tool_use, "model_dump"): - server_tool_use = server_tool_use.model_dump() - return UsageInfo( - input_tokens=getattr(usage_data, "input_tokens", None), - output_tokens=getattr(usage_data, "output_tokens", None), - cache_creation_input_tokens=getattr( - usage_data, "cache_creation_input_tokens", None - ), - cache_read_input_tokens=getattr( - usage_data, "cache_read_input_tokens", None - ), - service_tier=getattr(usage_data, "service_tier", None), - server_tool_use=server_tool_use, - ) - - return None + return UsageInfo.model_validate(usage_data) # ============================================================================= diff --git a/claude_code_log/factories/user_factory.py b/claude_code_log/factories/user_factory.py index 22b30f66..f14a4e4d 100644 --- a/claude_code_log/factories/user_factory.py +++ b/claude_code_log/factories/user_factory.py @@ -464,14 +464,4 @@ def create_user_message( # Duck-typed image content - convert to our Pydantic model items.append(ImageContent.model_validate(item.model_dump())) # type: ignore[union-attr] - # Extract text content from items for dedup matching and simple renderers - raw_text_content = "\n".join( - item.text for item in items if isinstance(item, TextContent) - ) - - # Return 
UserTextMessage with items list and cached raw text - return UserTextMessage( - items=items, - raw_text_content=raw_text_content if raw_text_content else None, - meta=meta, - ) + return UserTextMessage(items=items, meta=meta) diff --git a/claude_code_log/html/__init__.py b/claude_code_log/html/__init__.py index 168ba9e2..bd5f97b3 100644 --- a/claude_code_log/html/__init__.py +++ b/claude_code_log/html/__init__.py @@ -39,13 +39,10 @@ # Legacy formatters (still used) format_askuserquestion_result, format_exitplanmode_result, - # Tool summary and title - format_tool_use_title, - get_tool_summary, + # Generic render_params_table, ) from .system_formatters import ( - format_dedup_notice_content, format_hook_summary_content, format_session_header_content, format_system_content, @@ -56,7 +53,6 @@ BashOutputMessage, CommandOutputMessage, CompactedSummaryMessage, - DedupNoticeMessage, IdeDiagnostic, IdeNotificationContent, IdeOpenedFile, @@ -126,17 +122,13 @@ # Legacy formatters (still used) "format_askuserquestion_result", "format_exitplanmode_result", - # Tool summary and title - "format_tool_use_title", - "get_tool_summary", + # Generic "render_params_table", # system_formatters - "format_dedup_notice_content", "format_hook_summary_content", "format_session_header_content", "format_system_content", # system content models - "DedupNoticeMessage", "SessionHeaderMessage", # user_formatters (content models) "SlashCommandMessage", diff --git a/claude_code_log/html/assistant_formatters.py b/claude_code_log/html/assistant_formatters.py index 9bd725f1..3128a50c 100644 --- a/claude_code_log/html/assistant_formatters.py +++ b/claude_code_log/html/assistant_formatters.py @@ -10,6 +10,9 @@ Content models are defined in models.py, this module only handles formatting. 
""" +from typing import Callable, Optional + +from ..image_export import export_image from ..models import ( AssistantTextMessage, ImageContent, @@ -18,6 +21,9 @@ ) from .utils import escape_html, render_markdown_collapsible +# Type alias for image formatter callback +ImageFormatter = Callable[[ImageContent], str] + # ============================================================================= # Formatting Functions @@ -28,25 +34,29 @@ def format_assistant_text_content( content: AssistantTextMessage, line_threshold: int = 30, preview_line_count: int = 10, + image_formatter: Optional[ImageFormatter] = None, ) -> str: """Format assistant text content as HTML. Iterates through content.items preserving order: - TextContent: Rendered as markdown with collapsible support - - ImageContent: Rendered as inline tag with base64 data URL + - ImageContent: Rendered as inline tag Args: content: AssistantTextMessage with text/items to render line_threshold: Number of lines before content becomes collapsible preview_line_count: Number of preview lines to show when collapsed + image_formatter: Optional callback for image formatting. If None, uses + format_image_content() which embeds images as base64 data URLs. Returns: HTML string with markdown-rendered, optionally collapsible content """ + formatter = image_formatter or format_image_content parts: list[str] = [] for item in content.items: if isinstance(item, ImageContent): - parts.append(format_image_content(item)) + parts.append(formatter(item)) else: # TextContent if item.text.strip(): text_html = render_markdown_collapsible( @@ -83,7 +93,11 @@ def format_thinking_content( def format_image_content(image: ImageContent) -> str: - """Format image content as HTML. + """Format image content as HTML with embedded base64 data. + + This is the default image formatter for backward compatibility. + For other export modes (referenced, placeholder), use the renderer's + _format_image() method via the image_formatter callback. 
Args: image: ImageContent with base64 image data @@ -91,8 +105,10 @@ def format_image_content(image: ImageContent) -> str: Returns: HTML img tag with data URL """ - data_url = f"data:{image.source.media_type};base64,{image.source.data}" - return f'Uploaded image' + src = export_image(image, mode="embedded") + if src is None: + return "[Image]" + return f'image' def format_unknown_content(content: UnknownMessage) -> str: diff --git a/claude_code_log/html/renderer.py b/claude_code_log/html/renderer.py index 41113639..747671b0 100644 --- a/claude_code_log/html/renderer.py +++ b/claude_code_log/html/renderer.py @@ -1,5 +1,7 @@ """HTML renderer implementation for Claude Code transcripts.""" +from __future__ import annotations + from pathlib import Path from typing import TYPE_CHECKING, Any, Optional, Tuple, cast @@ -10,8 +12,8 @@ BashOutputMessage, CommandOutputMessage, CompactedSummaryMessage, - DedupNoticeMessage, HookSummaryMessage, + ImageContent, SessionHeaderMessage, SlashCommandMessage, SystemMessage, @@ -59,7 +61,6 @@ ) from ..utils import format_timestamp from .system_formatters import ( - format_dedup_notice_content, format_hook_summary_content, format_session_header_content, format_system_content, @@ -148,124 +149,160 @@ def __init__(self, image_export_mode: str = "embedded"): Args: image_export_mode: Image export mode - "placeholder", "embedded", or "referenced". - Currently only "embedded" is fully supported for HTML. """ super().__init__() self.image_export_mode = image_export_mode + self._output_dir: Path | None = None + self._image_counter = 0 # ------------------------------------------------------------------------- - # System Content Formatters + # Private Utility Methods # ------------------------------------------------------------------------- - def format_SystemMessage(self, message: SystemMessage) -> str: - """Format →
...
.""" - return format_system_content(message) + def _format_image(self, image: ImageContent) -> str: + """Format image based on export mode.""" + from ..image_export import export_image - def format_HookSummaryMessage(self, message: HookSummaryMessage) -> str: - """Format →
...
.""" - return format_hook_summary_content(message) + self._image_counter += 1 + src = export_image( + image, + self.image_export_mode, + output_dir=self._output_dir, + counter=self._image_counter, + ) + if src is None: + return "[Image]" + return f'image' - def format_SessionHeaderMessage(self, message: SessionHeaderMessage) -> str: - """Format →
...
.""" - return format_session_header_content(message) + # ------------------------------------------------------------------------- + # System Content Formatters + # ------------------------------------------------------------------------- + + def format_SystemMessage(self, content: SystemMessage, _: TemplateMessage) -> str: + return format_system_content(content) + + def format_HookSummaryMessage( + self, content: HookSummaryMessage, _: TemplateMessage + ) -> str: + return format_hook_summary_content(content) - def format_DedupNoticeMessage(self, message: DedupNoticeMessage) -> str: - """Format → ....""" - return format_dedup_notice_content(message) + def format_SessionHeaderMessage( + self, content: SessionHeaderMessage, _: TemplateMessage + ) -> str: + return format_session_header_content(content) # ------------------------------------------------------------------------- # User Content Formatters # ------------------------------------------------------------------------- - def format_UserTextMessage(self, message: UserTextMessage) -> str: - """Format → rendered markdown HTML.""" - return format_user_text_model_content(message) + def format_UserTextMessage( + self, content: UserTextMessage, _: TemplateMessage + ) -> str: + return format_user_text_model_content( + content, image_formatter=self._format_image + ) - def format_UserSlashCommandMessage(self, message: UserSlashCommandMessage) -> str: - """Format → /cmd.""" - return format_user_slash_command_content(message) + def format_UserSlashCommandMessage( + self, content: UserSlashCommandMessage, _: TemplateMessage + ) -> str: + return format_user_slash_command_content(content) - def format_SlashCommandMessage(self, message: SlashCommandMessage) -> str: - """Format → /cmd arg.""" - return format_slash_command_content(message) + def format_SlashCommandMessage( + self, content: SlashCommandMessage, _: TemplateMessage + ) -> str: + return format_slash_command_content(content) - def format_CommandOutputMessage(self, 
message: CommandOutputMessage) -> str: - """Format →
...
.""" - return format_command_output_content(message) + def format_CommandOutputMessage( + self, content: CommandOutputMessage, _: TemplateMessage + ) -> str: + return format_command_output_content(content) - def format_BashInputMessage(self, message: BashInputMessage) -> str: - """Format →
$ cmd
.""" - return format_bash_input_content(message) + def format_BashInputMessage( + self, content: BashInputMessage, _: TemplateMessage + ) -> str: + return format_bash_input_content(content) - def format_BashOutputMessage(self, message: BashOutputMessage) -> str: - """Format →
...
.""" - return format_bash_output_content(message) + def format_BashOutputMessage( + self, content: BashOutputMessage, _: TemplateMessage + ) -> str: + return format_bash_output_content(content) - def format_CompactedSummaryMessage(self, message: CompactedSummaryMessage) -> str: - """Format →
...
.""" - return format_compacted_summary_content(message) + def format_CompactedSummaryMessage( + self, content: CompactedSummaryMessage, _: TemplateMessage + ) -> str: + return format_compacted_summary_content(content) - def format_UserMemoryMessage(self, message: UserMemoryMessage) -> str: - """Format →
...
.""" - return format_user_memory_content(message) + def format_UserMemoryMessage( + self, content: UserMemoryMessage, _: TemplateMessage + ) -> str: + return format_user_memory_content(content) # ------------------------------------------------------------------------- # Assistant Content Formatters # ------------------------------------------------------------------------- - def format_AssistantTextMessage(self, message: AssistantTextMessage) -> str: - """Format → rendered markdown HTML.""" - return format_assistant_text_content(message) + def format_AssistantTextMessage( + self, content: AssistantTextMessage, _: TemplateMessage + ) -> str: + return format_assistant_text_content( + content, image_formatter=self._format_image + ) - def format_ThinkingMessage(self, message: ThinkingMessage) -> str: + def format_ThinkingMessage( + self, content: ThinkingMessage, _: TemplateMessage + ) -> str: """Format →
...
(foldable if >10 lines).""" - return format_thinking_content(message, line_threshold=10) + return format_thinking_content(content, line_threshold=10) - def format_UnknownMessage(self, message: UnknownMessage) -> str: + def format_UnknownMessage(self, content: UnknownMessage, _: TemplateMessage) -> str: """Format →
JSON dump
.""" - return format_unknown_content(message) + return format_unknown_content(content) # ------------------------------------------------------------------------- # Tool Input Formatters # ------------------------------------------------------------------------- - def format_BashInput(self, input: BashInput) -> str: + def format_BashInput(self, input: BashInput, _: TemplateMessage) -> str: """Format →
$ command
.""" return format_bash_input(input) - def format_ReadInput(self, input: ReadInput) -> str: + def format_ReadInput(self, input: ReadInput, _: TemplateMessage) -> str: """Format → file_path | ...
.""" return format_read_input(input) - def format_WriteInput(self, input: WriteInput) -> str: + def format_WriteInput(self, input: WriteInput, _: TemplateMessage) -> str: """Format → file path + syntax-highlighted content preview.""" return format_write_input(input) - def format_EditInput(self, input: EditInput) -> str: + def format_EditInput(self, input: EditInput, _: TemplateMessage) -> str: """Format → file path + diff of old_string/new_string.""" return format_edit_input(input) - def format_MultiEditInput(self, input: MultiEditInput) -> str: + def format_MultiEditInput(self, input: MultiEditInput, _: TemplateMessage) -> str: """Format → file path + multiple diffs.""" return format_multiedit_input(input) - def format_TaskInput(self, input: TaskInput) -> str: + def format_TaskInput(self, input: TaskInput, _: TemplateMessage) -> str: """Format →
prompt text
.""" return format_task_input(input) - def format_TodoWriteInput(self, input: TodoWriteInput) -> str: + def format_TodoWriteInput(self, input: TodoWriteInput, _: TemplateMessage) -> str: """Format → .""" return format_todowrite_input(input) - def format_AskUserQuestionInput(self, input: AskUserQuestionInput) -> str: + def format_AskUserQuestionInput( + self, input: AskUserQuestionInput, _: TemplateMessage + ) -> str: """Format → questions as definition list.""" return format_askuserquestion_input(input) - def format_ExitPlanModeInput(self, input: ExitPlanModeInput) -> str: + def format_ExitPlanModeInput( + self, input: ExitPlanModeInput, _: TemplateMessage + ) -> str: """Format → empty string (no content).""" return format_exitplanmode_input(input) - def format_ToolUseContent(self, content: ToolUseContent) -> str: + def format_ToolUseContent(self, content: ToolUseContent, _: TemplateMessage) -> str: """Format → key | value rows
.""" return render_params_table(content.input) @@ -273,35 +310,41 @@ def format_ToolUseContent(self, content: ToolUseContent) -> str: # Tool Output Formatters # ------------------------------------------------------------------------- - def format_ReadOutput(self, output: ReadOutput) -> str: + def format_ReadOutput(self, output: ReadOutput, _: TemplateMessage) -> str: """Format → syntax-highlighted file content.""" return format_read_output(output) - def format_WriteOutput(self, output: WriteOutput) -> str: + def format_WriteOutput(self, output: WriteOutput, _: TemplateMessage) -> str: """Format → status message (e.g. 'Wrote 42 bytes').""" return format_write_output(output) - def format_EditOutput(self, output: EditOutput) -> str: + def format_EditOutput(self, output: EditOutput, _: TemplateMessage) -> str: """Format → status message (e.g. 'Applied edit').""" return format_edit_output(output) - def format_BashOutput(self, output: BashOutput) -> str: + def format_BashOutput(self, output: BashOutput, _: TemplateMessage) -> str: """Format →
stdout/stderr
.""" return format_bash_output(output) - def format_TaskOutput(self, output: TaskOutput) -> str: + def format_TaskOutput(self, output: TaskOutput, _: TemplateMessage) -> str: """Format → rendered markdown of task result.""" return format_task_output(output) - def format_AskUserQuestionOutput(self, output: AskUserQuestionOutput) -> str: + def format_AskUserQuestionOutput( + self, output: AskUserQuestionOutput, _: TemplateMessage + ) -> str: """Format → user's answers as definition list.""" return format_askuserquestion_output(output) - def format_ExitPlanModeOutput(self, output: ExitPlanModeOutput) -> str: + def format_ExitPlanModeOutput( + self, output: ExitPlanModeOutput, _: TemplateMessage + ) -> str: """Format → status message.""" return format_exitplanmode_output(output) - def format_ToolResultContent(self, output: ToolResultContent) -> str: + def format_ToolResultContent( + self, output: ToolResultContent, _: TemplateMessage + ) -> str: """Format →
raw content
(fallback for unknown tools).""" return format_tool_result_content_raw(output) @@ -321,18 +364,21 @@ def _tool_title( return f"{prefix}{escaped_name} {escaped_summary}" return f"{prefix}{escaped_name}" - def title_TodoWriteInput(self, message: TemplateMessage) -> str: # noqa: ARG002 + def title_TodoWriteInput( + self, _input: TodoWriteInput, _message: TemplateMessage + ) -> str: """Title → '📝 Todo List'.""" return "📝 Todo List" - def title_AskUserQuestionInput(self, message: TemplateMessage) -> str: # noqa: ARG002 + def title_AskUserQuestionInput( + self, _input: AskUserQuestionInput, _message: TemplateMessage + ) -> str: """Title → '❓ Asking questions...'.""" return "❓ Asking questions..." - def title_TaskInput(self, message: TemplateMessage) -> str: + def title_TaskInput(self, input: TaskInput, message: TemplateMessage) -> str: """Title → '🔧 Task (subagent_type)'.""" content = cast(ToolUseMessage, message.content) - input = cast(TaskInput, content.input) escaped_name = escape_html(content.tool_name) escaped_subagent = ( escape_html(input.subagent_type) if input.subagent_type else "" @@ -346,19 +392,16 @@ def title_TaskInput(self, message: TemplateMessage) -> str: return f"🔧 {escaped_name} ({escaped_subagent})" return f"🔧 {escaped_name}" - def title_EditInput(self, message: TemplateMessage) -> str: + def title_EditInput(self, input: EditInput, message: TemplateMessage) -> str: """Title → '📝 Edit '.""" - input = cast(EditInput, cast(ToolUseMessage, message.content).input) return self._tool_title(message, "📝", input.file_path) - def title_WriteInput(self, message: TemplateMessage) -> str: + def title_WriteInput(self, input: WriteInput, message: TemplateMessage) -> str: """Title → '📝 Write '.""" - input = cast(WriteInput, cast(ToolUseMessage, message.content).input) return self._tool_title(message, "📝", input.file_path) - def title_ReadInput(self, message: TemplateMessage) -> str: + def title_ReadInput(self, input: ReadInput, message: TemplateMessage) -> str: 
"""Title → '📄 Read [, lines N-M]'.""" - input = cast(ReadInput, cast(ToolUseMessage, message.content).input) summary = input.file_path # Add line range info if available if input.limit is not None: @@ -369,40 +412,33 @@ def title_ReadInput(self, message: TemplateMessage) -> str: summary = f"{summary}, lines {offset + 1}-{offset + input.limit}" return self._tool_title(message, "📄", summary) - def title_GlobInput(self, message: TemplateMessage) -> str: + def title_GlobInput(self, input: GlobInput, message: TemplateMessage) -> str: """Title → '🔍 Glob [ in path]'.""" - input = cast(GlobInput, cast(ToolUseMessage, message.content).input) summary = input.pattern if input.path: summary = f"{summary} in {input.path}" return self._tool_title(message, "🔍", summary) - def title_BashInput(self, message: TemplateMessage) -> str: + def title_BashInput(self, input: BashInput, message: TemplateMessage) -> str: """Title → '💻 Bash '.""" - input = cast(BashInput, cast(ToolUseMessage, message.content).input) return self._tool_title(message, "💻", input.description) def _flatten_preorder( self, roots: list[TemplateMessage] - ) -> Tuple[ - list[Tuple[TemplateMessage, str, str, str]], - list[Tuple[str, list[Tuple[float, str]]]], - ]: + ) -> list[Tuple[TemplateMessage, str, str, str]]: """Flatten message tree via pre-order traversal, formatting each message. Traverses the tree depth-first (pre-order), computes title and formats content to HTML, building a flat list of (message, title, html, timestamp) tuples. - Also tracks timing statistics for Markdown and Pygments operations when - DEBUG_TIMING is enabled. + Also tracks and reports timing statistics for Markdown and Pygments operations + when DEBUG_TIMING is enabled. 
Args: roots: Root messages (typically session headers) with children populated Returns: - Tuple of: - - Flat list of (message, title, html_content, formatted_timestamp) tuples - - Operation timing data for reporting: [("Markdown", timings), ("Pygments", timings)] + Flat list of (message, title, html_content, formatted_timestamp) tuples """ flat: list[Tuple[TemplateMessage, str, str, str]] = [] @@ -425,26 +461,33 @@ def visit(msg: TemplateMessage) -> None: for root in roots: visit(root) - # Return timing data for reporting - operation_timings: list[Tuple[str, list[Tuple[float, str]]]] = [ - ("Markdown", markdown_timings), - ("Pygments", pygments_timings), - ] + # Report timing statistics for Markdown/Pygments operations + if DEBUG_TIMING: + report_timing_statistics( + [ + ("Markdown", markdown_timings), + ("Pygments", pygments_timings), + ] + ) - return flat, operation_timings + return flat def generate( self, messages: list[TranscriptEntry], title: Optional[str] = None, combined_transcript_link: Optional[str] = None, - output_dir: Optional[Path] = None, # noqa: ARG002 + output_dir: Optional[Path] = None, ) -> str: """Generate HTML from transcript messages.""" import time t_start = time.time() + # Set output directory for image export (used in "referenced" mode) + self._output_dir = output_dir + self._image_counter = 0 + if not title: title = "Claude Transcript" @@ -453,11 +496,7 @@ def generate( # Flatten tree via pre-order traversal, formatting content along the way with log_timing("Content formatting (pre-order)", t_start): - template_messages, operation_timings = self._flatten_preorder(root_messages) - - # Report timing statistics for Markdown/Pygments operations - if DEBUG_TIMING: - report_timing_statistics([], operation_timings) + template_messages = self._flatten_preorder(root_messages) # Render template with log_timing("Template environment setup", t_start): @@ -488,7 +527,7 @@ def generate_session( session_id: str, title: Optional[str] = None, 
cache_manager: Optional["CacheManager"] = None, - output_dir: Optional[Path] = None, # noqa: ARG002 + output_dir: Optional[Path] = None, ) -> str: """Generate HTML for a single session.""" # Filter messages for this session (SummaryTranscriptEntry.sessionId is always None) @@ -508,6 +547,7 @@ def generate_session( session_messages, title or f"Session {session_id[:8]}", combined_transcript_link=combined_link, + output_dir=output_dir, ) def generate_projects_index( diff --git a/claude_code_log/html/system_formatters.py b/claude_code_log/html/system_formatters.py index 2d71876d..08807033 100644 --- a/claude_code_log/html/system_formatters.py +++ b/claude_code_log/html/system_formatters.py @@ -12,7 +12,6 @@ from .ansi_colors import convert_ansi_to_html from ..models import ( - DedupNoticeMessage, HookSummaryMessage, SessionHeaderMessage, SystemMessage, @@ -92,30 +91,8 @@ def format_session_header_content(content: SessionHeaderMessage) -> str: return escaped_title -def format_dedup_notice_content(content: DedupNoticeMessage) -> str: - """Format a deduplication notice as HTML. - - Args: - content: DedupNoticeMessage with notice text and optional target link - - Returns: - HTML for the dedup notice display with optional anchor link - """ - escaped_notice = html.escape(content.notice_text) - - if content.target_message_id: - # Create clickable link to the target message - return ( - f'

' - f"{escaped_notice}

" - ) - else: - return f"

{escaped_notice}

" - - __all__ = [ "format_system_content", "format_hook_summary_content", "format_session_header_content", - "format_dedup_notice_content", ] diff --git a/claude_code_log/html/tool_formatters.py b/claude_code_log/html/tool_formatters.py index 77506fff..6ce93fe3 100644 --- a/claude_code_log/html/tool_formatters.py +++ b/claude_code_log/html/tool_formatters.py @@ -18,7 +18,7 @@ import binascii import json import re -from typing import Any, Optional, cast +from typing import Any, cast from .utils import ( escape_html, @@ -42,7 +42,6 @@ TaskInput, TaskOutput, TodoWriteInput, - ToolInput, ToolResultContent, WriteInput, WriteOutput, @@ -524,94 +523,6 @@ def format_task_input(task_input: TaskInput) -> str: return render_markdown_collapsible(task_input.prompt, "task-prompt") -# -- Tool Summary and Title --------------------------------------------------- - - -def get_tool_summary(parsed: Optional[ToolInput]) -> Optional[str]: - """Extract a one-line summary from parsed tool input for display in header. - - Returns a brief description or filename that can be shown in the message header - to save vertical space. - - Args: - parsed: Parsed tool input, or None if parsing failed/not available - """ - if isinstance(parsed, BashInput): - return parsed.description - - if isinstance(parsed, (ReadInput, EditInput, WriteInput)): - return parsed.file_path if parsed.file_path else None - - if isinstance(parsed, TaskInput): - return parsed.description if parsed.description else None - - # No summary for other tools or unparsed input - return None - - -def format_tool_use_title(tool_name: str, parsed: Optional[ToolInput]) -> str: - """Generate the title HTML for a tool use message. - - Returns HTML string for the message header, with tool name, icon, - and optional summary/metadata. 
- - Args: - tool_name: The tool name (e.g., "Bash", "Read", "Edit") - parsed: Parsed tool input, or None if parsing failed/not available - """ - escaped_name = escape_html(tool_name) - summary = get_tool_summary(parsed) - - # TodoWrite: fixed title - if tool_name == "TodoWrite": - return "📝 Todo List" - - # Task: show subagent_type and description - if isinstance(parsed, TaskInput): - escaped_subagent = ( - escape_html(parsed.subagent_type) if parsed.subagent_type else "" - ) - description = parsed.description - - if description and parsed.subagent_type: - escaped_desc = escape_html(description) - return f"🔧 {escaped_name} {escaped_desc} ({escaped_subagent})" - elif description: - escaped_desc = escape_html(description) - return f"🔧 {escaped_name} {escaped_desc}" - elif parsed.subagent_type: - return f"🔧 {escaped_name} ({escaped_subagent})" - else: - return f"🔧 {escaped_name}" - - # Edit/Write: use 📝 icon - if isinstance(parsed, (EditInput, WriteInput)): - if summary: - escaped_summary = escape_html(summary) - return ( - f"📝 {escaped_name} {escaped_summary}" - ) - else: - return f"📝 {escaped_name}" - - # Read: use 📄 icon - if isinstance(parsed, ReadInput): - if summary: - escaped_summary = escape_html(summary) - return ( - f"📄 {escaped_name} {escaped_summary}" - ) - else: - return f"📄 {escaped_name}" - - # Other tools: append summary if present - if summary: - escaped_summary = escape_html(summary) - return f"{escaped_name} {escaped_summary}" - - return escaped_name - - # -- Generic Parameter Table -------------------------------------------------- @@ -812,9 +723,6 @@ def format_tool_result_content_raw(tool_result: ToolResultContent) -> str: # Legacy formatters (still used) "format_askuserquestion_result", "format_exitplanmode_result", - # Tool summary and title - "get_tool_summary", - "format_tool_use_title", # Generic "render_params_table", ] diff --git a/claude_code_log/html/user_formatters.py b/claude_code_log/html/user_formatters.py index 098a467f..95a384ea 
100644 --- a/claude_code_log/html/user_formatters.py +++ b/claude_code_log/html/user_formatters.py @@ -8,6 +8,8 @@ - tool_formatters.py: tool use/result content """ +from typing import Callable, Optional + from .ansi_colors import convert_ansi_to_html from ..models import ( BashInputMessage, @@ -191,7 +193,10 @@ def format_user_text_content(text: str) -> str: return f"
{escaped_text}
" -def format_user_text_model_content(content: UserTextMessage) -> str: +def format_user_text_model_content( + content: UserTextMessage, + image_formatter: Optional[Callable[[ImageContent], str]] = None, +) -> str: """Format UserTextMessage model as HTML. Handles user text with optional IDE notifications, compacted summaries, @@ -199,11 +204,13 @@ def format_user_text_model_content(content: UserTextMessage) -> str: When `items` is set, iterates through the content items preserving order: - TextContent: Rendered as preformatted text - - ImageContent: Rendered as inline tag with base64 data URL + - ImageContent: Rendered as inline tag - IdeNotificationContent: Rendered as IDE notification blocks Args: content: UserTextMessage with text/items and optional flags/notifications + image_formatter: Optional callback for image formatting. If None, uses + format_image_content() which embeds images as base64 data URLs. Returns: HTML string combining all content items @@ -211,6 +218,7 @@ def format_user_text_model_content(content: UserTextMessage) -> str: # Import here to avoid circular dependency from .assistant_formatters import format_image_content + formatter = image_formatter or format_image_content parts: list[str] = [] for item in content.items: @@ -218,7 +226,7 @@ def format_user_text_model_content(content: UserTextMessage) -> str: notifications = format_ide_notification_content(item) parts.extend(notifications) elif isinstance(item, ImageContent): - parts.append(format_image_content(item)) + parts.append(formatter(item)) else: # TextContent # Regular user text as preformatted if item.text.strip(): diff --git a/claude_code_log/html/utils.py b/claude_code_log/html/utils.py index 44d5b753..8822ab3e 100644 --- a/claude_code_log/html/utils.py +++ b/claude_code_log/html/utils.py @@ -28,7 +28,6 @@ BashOutputMessage, CommandOutputMessage, CompactedSummaryMessage, - DedupNoticeMessage, HookSummaryMessage, MessageContent, SessionHeaderMessage, @@ -68,7 +67,6 @@ 
CommandOutputMessage: ["user", "command-output"], # Assistant message types AssistantTextMessage: ["assistant"], - DedupNoticeMessage: ["assistant", "dedup-notice"], # Styled as assistant # Tool message types ToolUseMessage: ["tool_use"], ToolResultMessage: ["tool_result"], # error added dynamically diff --git a/claude_code_log/image_export.py b/claude_code_log/image_export.py index 455ff655..98c21d10 100644 --- a/claude_code_log/image_export.py +++ b/claude_code_log/image_export.py @@ -17,8 +17,11 @@ def export_image( mode: str, output_dir: Path | None = None, counter: int = 0, -) -> str: - """Export image content based on the specified mode. +) -> str | None: + """Export image content and return the source URL/path. + + This is a format-agnostic function that handles image export logic + and returns just the src. Callers format the result as HTML or Markdown. Args: image: ImageContent with base64-encoded image data @@ -27,21 +30,20 @@ def export_image( counter: Image counter for generating unique filenames Returns: - Markdown/HTML image reference string based on mode: - - placeholder: "[Image]" - - embedded: "![image](data:image/...;base64,...)" - - referenced: "![image](images/image_0001.png)" + For "placeholder" mode: None (caller should render placeholder text) + For "embedded" mode: data URL (e.g., "data:image/png;base64,...") + For "referenced" mode: relative path (e.g., "images/image_0001.png") + For unsupported mode: None """ if mode == "placeholder": - return "[Image]" + return None - elif mode == "embedded": - data_url = f"data:{image.source.media_type};base64,{image.source.data}" - return f"![image]({data_url})" + if mode == "embedded": + return f"data:{image.source.media_type};base64,{image.source.data}" - elif mode == "referenced": + if mode == "referenced": if output_dir is None: - return "[Image: export directory not set]" + return None # Create images subdirectory images_dir = output_dir / "images" @@ -56,10 +58,10 @@ def export_image( image_data = 
base64.b64decode(image.source.data) filepath.write_bytes(image_data) - return f"![image](images/{filename})" + return f"images/{filename}" - else: - return f"[Image: unsupported mode '{mode}']" + # Unsupported mode + return None def _get_extension(media_type: str) -> str: diff --git a/claude_code_log/markdown/renderer.py b/claude_code_log/markdown/renderer.py index 1a14f67e..c1589903 100644 --- a/claude_code_log/markdown/renderer.py +++ b/claude_code_log/markdown/renderer.py @@ -1,9 +1,11 @@ """Markdown renderer implementation for Claude Code transcripts.""" +from __future__ import annotations + import json import re from pathlib import Path -from typing import TYPE_CHECKING, Any, Optional, cast +from typing import TYPE_CHECKING, Any, Optional from ..cache import get_library_version from ..utils import generate_unified_diff, strip_error_tags @@ -13,7 +15,6 @@ BashOutputMessage, CommandOutputMessage, CompactedSummaryMessage, - DedupNoticeMessage, HookSummaryMessage, ImageContent, SessionHeaderMessage, @@ -137,9 +138,15 @@ def _format_image(self, image: ImageContent) -> str: from ..image_export import export_image self._image_counter += 1 - return export_image( - image, self.image_export_mode, self._output_dir, self._image_counter + src = export_image( + image, + self.image_export_mode, + output_dir=self._output_dir, + counter=self._image_counter, ) + if src is None: + return "[Image]" + return f"![image]({src})" def _lang_from_path(self, path: str) -> str: """Get language hint from file extension.""" @@ -242,52 +249,51 @@ def _get_message_text(self, msg: TemplateMessage) -> str: # System Content Formatters # ------------------------------------------------------------------------- - def format_SystemMessage(self, message: SystemMessage) -> str: + def format_SystemMessage(self, content: SystemMessage, _: TemplateMessage) -> str: level_prefix = {"info": "ℹ️", "warning": "⚠️", "error": "❌"}.get( - message.level, "" + content.level, "" ) - return f"{level_prefix} 
{message.text}" + return f"{level_prefix} {content.text}" - def format_HookSummaryMessage(self, message: HookSummaryMessage) -> str: + def format_HookSummaryMessage( + self, content: HookSummaryMessage, _: TemplateMessage + ) -> str: parts: list[str] = [] - if message.has_output: + if content.has_output: parts.append("Hook produced output") - if message.hook_errors: - for error in message.hook_errors: + if content.hook_errors: + for error in content.hook_errors: parts.append(f"❌ Error: {error}") - if message.hook_infos: - for info in message.hook_infos: + if content.hook_infos: + for info in content.hook_infos: parts.append(f"ℹ️ {info}") return "\n\n".join(parts) if parts else "" - def format_SessionHeaderMessage(self, message: SessionHeaderMessage) -> str: + def format_SessionHeaderMessage( + self, content: SessionHeaderMessage, _: TemplateMessage + ) -> str: # Return just the anchor - it will be placed before the heading - session_short = message.session_id[:8] + session_short = content.session_id[:8] return f'' - def title_SessionHeaderMessage(self, message: TemplateMessage) -> str: + def title_SessionHeaderMessage( + self, content: SessionHeaderMessage, _: TemplateMessage + ) -> str: # Return the title with session ID and optional summary - content = cast(SessionHeaderMessage, message.content) session_short = content.session_id[:8] if content.summary: return f"📋 Session `{session_short}`: {content.summary}" return f"📋 Session `{session_short}`" - def format_DedupNoticeMessage(self, message: DedupNoticeMessage) -> str: # noqa: ARG002 - # Skip dedup notices in markdown output - return "" - - def title_DedupNoticeMessage(self, message: TemplateMessage) -> str: # noqa: ARG002 - # Skip dedup notices in markdown output - return "" - # ------------------------------------------------------------------------- # User Content Formatters # ------------------------------------------------------------------------- - def format_UserTextMessage(self, message: UserTextMessage) 
-> str: + def format_UserTextMessage( + self, content: UserTextMessage, _: TemplateMessage + ) -> str: parts: list[str] = [] - for item in message.items: + for item in content.items: if isinstance(item, ImageContent): parts.append(self._format_image(item)) elif isinstance(item, TextContent): @@ -296,63 +302,82 @@ def format_UserTextMessage(self, message: UserTextMessage) -> str: parts.append(self._code_fence(item.text)) return "\n\n".join(parts) - def title_UserTextMessage(self, message: TemplateMessage) -> str: - if excerpt := self._excerpt(self._get_message_text(message)): + def title_UserTextMessage( + self, _content: UserTextMessage, _message: TemplateMessage + ) -> str: + if excerpt := self._excerpt(self._get_message_text(_message)): return f"🤷 User: *{self._escape_stars(excerpt)}*" return "🤷 User" - def format_UserSlashCommandMessage(self, message: UserSlashCommandMessage) -> str: + def format_UserSlashCommandMessage( + self, content: UserSlashCommandMessage, _: TemplateMessage + ) -> str: # UserSlashCommandMessage has a text attribute (markdown), quote to protect it - return self._quote(message.text) if message.text.strip() else "" + return self._quote(content.text) if content.text.strip() else "" - def format_SlashCommandMessage(self, message: SlashCommandMessage) -> str: + def format_SlashCommandMessage( + self, content: SlashCommandMessage, _: TemplateMessage + ) -> str: parts: list[str] = [] # Command name is in the title, only include args and contents here - if message.command_args: - parts.append(f"**Args:** `{message.command_args}`") - if message.command_contents: - parts.append(self._code_fence(message.command_contents)) + if content.command_args: + parts.append(f"**Args:** `{content.command_args}`") + if content.command_contents: + parts.append(self._code_fence(content.command_contents)) return "\n\n".join(parts) - def title_SlashCommandMessage(self, message: TemplateMessage) -> str: - content = cast(SlashCommandMessage, message.content) + def 
title_SlashCommandMessage( + self, content: SlashCommandMessage, _message: TemplateMessage + ) -> str: # command_name already includes the leading slash return f"🤷 Command `{content.command_name}`" - def format_CommandOutputMessage(self, message: CommandOutputMessage) -> str: - if message.is_markdown: + def format_CommandOutputMessage( + self, content: CommandOutputMessage, _: TemplateMessage + ) -> str: + if content.is_markdown: # Quote markdown output to protect it - return self._quote(message.stdout) - return self._code_fence(message.stdout) + return self._quote(content.stdout) + return self._code_fence(content.stdout) - def format_BashInputMessage(self, message: BashInputMessage) -> str: - return self._code_fence(f"$ {message.command}", "bash") + def format_BashInputMessage( + self, content: BashInputMessage, _: TemplateMessage + ) -> str: + return self._code_fence(f"$ {content.command}", "bash") - def format_BashOutputMessage(self, message: BashOutputMessage) -> str: + def format_BashOutputMessage( + self, content: BashOutputMessage, _: TemplateMessage + ) -> str: # Combine stdout and stderr, strip ANSI codes for markdown output parts: list[str] = [] - if message.stdout: - parts.append(message.stdout) - if message.stderr: - parts.append(message.stderr) + if content.stdout: + parts.append(content.stdout) + if content.stderr: + parts.append(content.stderr) output = "\n".join(parts) output = re.sub(r"\x1b\[[0-9;]*m", "", output) return self._code_fence(output) - def format_CompactedSummaryMessage(self, message: CompactedSummaryMessage) -> str: + def format_CompactedSummaryMessage( + self, content: CompactedSummaryMessage, _: TemplateMessage + ) -> str: # Quote to protect embedded markdown - return self._quote(message.summary_text) + return self._quote(content.summary_text) - def format_UserMemoryMessage(self, message: UserMemoryMessage) -> str: - return self._code_fence(message.memory_text) + def format_UserMemoryMessage( + self, content: UserMemoryMessage, _: 
TemplateMessage + ) -> str: + return self._code_fence(content.memory_text) # ------------------------------------------------------------------------- # Assistant Content Formatters # ------------------------------------------------------------------------- - def format_AssistantTextMessage(self, message: AssistantTextMessage) -> str: + def format_AssistantTextMessage( + self, content: AssistantTextMessage, _: TemplateMessage + ) -> str: parts: list[str] = [] - for item in message.items: + for item in content.items: if isinstance(item, ImageContent): parts.append(self._format_image(item)) else: # TextContent @@ -361,22 +386,24 @@ def format_AssistantTextMessage(self, message: AssistantTextMessage) -> str: parts.append(self._quote(item.text)) return "\n\n".join(parts) - def format_ThinkingMessage(self, message: ThinkingMessage) -> str: - quoted = self._quote(message.thinking) + def format_ThinkingMessage( + self, content: ThinkingMessage, _: TemplateMessage + ) -> str: + quoted = self._quote(content.thinking) return self._collapsible("Thinking...", quoted) - def format_UnknownMessage(self, message: UnknownMessage) -> str: - return f"*Unknown content type: {message.type_name}*" + def format_UnknownMessage(self, content: UnknownMessage, _: TemplateMessage) -> str: + return f"*Unknown content type: {content.type_name}*" # ------------------------------------------------------------------------- # Tool Input Formatters # ------------------------------------------------------------------------- - def format_BashInput(self, input: BashInput) -> str: + def format_BashInput(self, input: BashInput, _: TemplateMessage) -> str: # Description is in the title, just show the command with $ prefix return self._code_fence(f"$ {input.command}", "bash") - def format_ReadInput(self, input: ReadInput) -> str: + def format_ReadInput(self, input: ReadInput, _: TemplateMessage) -> str: # File path goes in the collapsible summary of ReadOutput # Just show line range hint here if applicable 
if input.offset or input.limit: @@ -385,17 +412,17 @@ def format_ReadInput(self, input: ReadInput) -> str: return f"*(lines {start}–{end})*" return "" - def format_WriteInput(self, input: WriteInput) -> str: + def format_WriteInput(self, input: WriteInput, _: TemplateMessage) -> str: summary = f"{input.file_path}" content = self._code_fence(input.content, self._lang_from_path(input.file_path)) return self._collapsible(summary, content) - def format_EditInput(self, input: EditInput) -> str: + def format_EditInput(self, input: EditInput, _: TemplateMessage) -> str: # Diff is visible; result goes in collapsible in format_EditOutput diff_text = generate_unified_diff(input.old_string, input.new_string) return self._code_fence(diff_text, "diff") - def format_MultiEditInput(self, input: MultiEditInput) -> str: + def format_MultiEditInput(self, input: MultiEditInput, _: TemplateMessage) -> str: # All diffs visible; result goes in collapsible in format_EditOutput parts: list[str] = [] for i, edit in enumerate(input.edits, 1): @@ -404,17 +431,17 @@ def format_MultiEditInput(self, input: MultiEditInput) -> str: parts.append(self._code_fence(diff_text, "diff")) return "\n\n".join(parts) - def format_GlobInput(self, input: GlobInput) -> str: # noqa: ARG002 + def format_GlobInput(self, _input: GlobInput, _: TemplateMessage) -> str: # Pattern and path are in the title return "" - def format_GrepInput(self, input: GrepInput) -> str: + def format_GrepInput(self, input: GrepInput, _: TemplateMessage) -> str: # Pattern and path are in the title, only show glob filter if present if input.glob: return f"Glob: `{input.glob}`" return "" - def format_TaskInput(self, input: TaskInput) -> str: + def format_TaskInput(self, input: TaskInput, _: TemplateMessage) -> str: # Description is now in the title, just show prompt as collapsible return ( self._collapsible("Instructions", self._quote(input.prompt)) @@ -422,7 +449,7 @@ def format_TaskInput(self, input: TaskInput) -> str: else "" ) - def 
format_TodoWriteInput(self, input: TodoWriteInput) -> str: + def format_TodoWriteInput(self, input: TodoWriteInput, _: TemplateMessage) -> str: parts: list[str] = [] for todo in input.todos: status_icon = {"pending": "⬜", "in_progress": "🔄", "completed": "✅"}.get( @@ -432,17 +459,18 @@ def format_TodoWriteInput(self, input: TodoWriteInput) -> str: return "\n".join(parts) def format_AskUserQuestionInput( - self, - input: AskUserQuestionInput, # noqa: ARG002 + self, _input: AskUserQuestionInput, _: TemplateMessage ) -> str: # Input is rendered together with output in format_AskUserQuestionOutput return "" - def format_ExitPlanModeInput(self, input: ExitPlanModeInput) -> str: # noqa: ARG002 + def format_ExitPlanModeInput( + self, _input: ExitPlanModeInput, _: TemplateMessage + ) -> str: # Title contains "Exiting plan mode", body is empty return "" - def format_ToolUseContent(self, content: ToolUseContent) -> str: + def format_ToolUseContent(self, content: ToolUseContent, _: TemplateMessage) -> str: """Fallback for unknown tool inputs - render as key/value list.""" return self._render_params(content.input) @@ -471,29 +499,29 @@ def _render_params(self, params: dict[str, Any]) -> str: # Tool Output Formatters # ------------------------------------------------------------------------- - def format_ReadOutput(self, output: ReadOutput) -> str: + def format_ReadOutput(self, output: ReadOutput, _: TemplateMessage) -> str: summary = f"{output.file_path}" if output.file_path else "Content" lang = self._lang_from_path(output.file_path or "") content = self._code_fence(output.content, lang) return self._collapsible(summary, content) - def format_WriteOutput(self, output: WriteOutput) -> str: + def format_WriteOutput(self, output: WriteOutput, _: TemplateMessage) -> str: return f"✓ {output.message}" - def format_EditOutput(self, output: EditOutput) -> str: + def format_EditOutput(self, output: EditOutput, _: TemplateMessage) -> str: if msg := output.message: content = 
self._code_fence(msg, self._lang_from_path(output.file_path)) return self._collapsible(f"{output.file_path}", content) return "✓ Edited" - def format_BashOutput(self, output: BashOutput) -> str: + def format_BashOutput(self, output: BashOutput, _: TemplateMessage) -> str: # Strip ANSI codes for markdown output text = re.sub(r"\x1b\[[0-9;]*m", "", output.content) # Detect git diff output lang = "diff" if text.startswith("diff --git a/") else "" return self._code_fence(text, lang) - def format_GlobOutput(self, output: GlobOutput) -> str: + def format_GlobOutput(self, output: GlobOutput, _: TemplateMessage) -> str: if not output.files: return "*No files found*" return "\n".join(f"- `{f}`" for f in output.files) @@ -501,38 +529,21 @@ def format_GlobOutput(self, output: GlobOutput) -> str: # Note: GrepOutput is not used (tool results handled as raw strings) # Grep results fall back to format_ToolResultContent - def format_ToolResultMessage(self, message: ToolResultMessage) -> str: - """Override for special output handling.""" - if isinstance(message.output, AskUserQuestionOutput): - return self._format_ask_user_question(message, message.output) - - # TodoWrite success message - render as plain text, not code fence - if message.tool_name == "TodoWrite": - if isinstance(message.output, ToolResultContent): - if isinstance(message.output.content, str): - return message.output.content - return "" - - # Default: dispatch to output formatter - return self._dispatch_format(message.output) - - def _format_ask_user_question( - self, message: ToolResultMessage, output: AskUserQuestionOutput + def format_AskUserQuestionOutput( + self, output: AskUserQuestionOutput, message: TemplateMessage ) -> str: """Format AskUserQuestion with interleaved Q/options/A. - Uses message.message_index to look up paired input for question options. + Uses message.pair_first to look up paired input for question options. 
""" - # Get questions from paired input via message_index → TemplateMessage → pair + # Get questions from paired input via pair_first questions_map: dict[str, Any] = {} - if message.message_index is not None and self._ctx: - template_msg = self._ctx.get(message.message_index) - if template_msg and template_msg.pair_first is not None: - pair_msg = self._ctx.get(template_msg.pair_first) - if pair_msg and isinstance(pair_msg.content, ToolUseMessage): - input_content = pair_msg.content.input - if isinstance(input_content, AskUserQuestionInput): - questions_map = {q.question: q for q in input_content.questions} + if message.pair_first is not None and self._ctx: + pair_msg = self._ctx.get(message.pair_first) + if pair_msg and isinstance(pair_msg.content, ToolUseMessage): + input_content = pair_msg.content.input + if isinstance(input_content, AskUserQuestionInput): + questions_map = {q.question: q for q in input_content.questions} parts: list[str] = [] for qa in output.answers: @@ -551,91 +562,102 @@ def _format_ask_user_question( return "\n\n".join(parts).rstrip() - def format_TaskOutput(self, output: TaskOutput) -> str: + def format_TaskOutput(self, output: TaskOutput, _: TemplateMessage) -> str: # TaskOutput contains markdown, wrap in collapsible Report return self._collapsible("Report", self._quote(output.result)) - def format_ExitPlanModeOutput(self, output: ExitPlanModeOutput) -> str: + def format_ExitPlanModeOutput( + self, output: ExitPlanModeOutput, _: TemplateMessage + ) -> str: status = "✓ Approved" if output.approved else "✗ Not approved" if output.message: return f"{status}\n\n{output.message}" return status - def format_ToolResultContent(self, output: ToolResultContent) -> str: + def format_ToolResultContent( + self, output: ToolResultContent, message: TemplateMessage + ) -> str: """Fallback for unknown tool outputs.""" + # TodoWrite success message - render as plain text, not code fence + content = message.content + if isinstance(content, ToolResultMessage) 
and content.tool_name == "TodoWrite": + if isinstance(output.content, str): + return output.content + return "" + # Default: code fence if isinstance(output.content, str): - content = strip_error_tags(output.content) - return self._code_fence(content) + text = strip_error_tags(output.content) + return self._code_fence(text) return self._code_fence(json.dumps(output.content, indent=2), "json") # ------------------------------------------------------------------------- # Title Methods (for tool use dispatch) # ------------------------------------------------------------------------- - def title_BashInput(self, message: TemplateMessage) -> str: - input = cast(BashInput, cast(ToolUseMessage, message.content).input) + def title_BashInput(self, input: BashInput, _: TemplateMessage) -> str: if desc := input.description: return f"💻 Bash: *{self._escape_stars(desc)}*" return "💻 Bash" - def title_ReadInput(self, message: TemplateMessage) -> str: - input = cast(ReadInput, cast(ToolUseMessage, message.content).input) + def title_ReadInput(self, input: ReadInput, _: TemplateMessage) -> str: return f"👀 Read `{Path(input.file_path).name}`" - def title_WriteInput(self, message: TemplateMessage) -> str: - input = cast(WriteInput, cast(ToolUseMessage, message.content).input) + def title_WriteInput(self, input: WriteInput, _: TemplateMessage) -> str: return f"✍️ Write `{Path(input.file_path).name}`" - def title_EditInput(self, message: TemplateMessage) -> str: - input = cast(EditInput, cast(ToolUseMessage, message.content).input) + def title_EditInput(self, input: EditInput, _: TemplateMessage) -> str: return f"✏️ Edit `{Path(input.file_path).name}`" - def title_MultiEditInput(self, message: TemplateMessage) -> str: - input = cast(MultiEditInput, cast(ToolUseMessage, message.content).input) + def title_MultiEditInput(self, input: MultiEditInput, _: TemplateMessage) -> str: return f"✏️ MultiEdit `{Path(input.file_path).name}`" - def title_GlobInput(self, message: TemplateMessage) -> 
str: - input = cast(GlobInput, cast(ToolUseMessage, message.content).input) + def title_GlobInput(self, input: GlobInput, _: TemplateMessage) -> str: title = f"📂 Glob `{input.pattern}`" return f"{title} in `{input.path}`" if input.path else title - def title_GrepInput(self, message: TemplateMessage) -> str: - input = cast(GrepInput, cast(ToolUseMessage, message.content).input) + def title_GrepInput(self, input: GrepInput, _: TemplateMessage) -> str: base = f"🔎 Grep `{input.pattern}`" return f"{base} in `{input.path}`" if input.path else base - def title_TaskInput(self, message: TemplateMessage) -> str: - input = cast(TaskInput, cast(ToolUseMessage, message.content).input) + def title_TaskInput(self, input: TaskInput, _: TemplateMessage) -> str: subagent = f" ({input.subagent_type})" if input.subagent_type else "" if desc := input.description: return f"🤖 Task{subagent}: *{self._escape_stars(desc)}*" return f"🤖 Task{subagent}" - def title_TodoWriteInput(self, message: TemplateMessage) -> str: # noqa: ARG002 + def title_TodoWriteInput(self, _input: TodoWriteInput, _: TemplateMessage) -> str: return "✅ Todo List" - def title_AskUserQuestionInput(self, message: TemplateMessage) -> str: # noqa: ARG002 + def title_AskUserQuestionInput( + self, _input: AskUserQuestionInput, _: TemplateMessage + ) -> str: return "❓ Asking questions..." 
- def title_ExitPlanModeInput(self, message: TemplateMessage) -> str: # noqa: ARG002 + def title_ExitPlanModeInput( + self, _input: ExitPlanModeInput, _: TemplateMessage + ) -> str: return "📝 Exiting plan mode" - def title_ThinkingMessage(self, message: TemplateMessage) -> str: + def title_ThinkingMessage( + self, _content: ThinkingMessage, _message: TemplateMessage + ) -> str: # When paired with Assistant, use Assistant title with assistant excerpt - if message.is_first_in_pair and message.pair_last is not None: + if _message.is_first_in_pair and _message.pair_last is not None: if ( - pair_msg := self._ctx.get(message.pair_last) if self._ctx else None + pair_msg := self._ctx.get(_message.pair_last) if self._ctx else None ) and isinstance(pair_msg.content, AssistantTextMessage): if excerpt := self._excerpt(self._get_message_text(pair_msg)): return f"🤖 Assistant: *{self._escape_stars(excerpt)}*" return "🤖 Assistant" # Standalone thinking - if excerpt := self._excerpt(self._get_message_text(message)): + if excerpt := self._excerpt(self._get_message_text(_message)): return f"💭 Thinking: *{self._escape_stars(excerpt)}*" return "💭 Thinking" - def title_AssistantTextMessage(self, message: TemplateMessage) -> str: + def title_AssistantTextMessage( + self, _content: AssistantTextMessage, message: TemplateMessage + ) -> str: # When paired (after Thinking), skip title (already rendered with Thinking) if message.is_last_in_pair: return "" @@ -683,7 +705,7 @@ def _render_message(self, msg: TemplateMessage, level: int) -> str: parts.append(content) content = None # Don't output again below - # Heading with title (skip if empty, e.g., DedupNoticeMessage) + # Heading with title (skip if empty) title = self.title_content(msg) if title: heading_level = min(level, 6) # Markdown max is h6 diff --git a/claude_code_log/models.py b/claude_code_log/models.py index 82a399c7..80c388e6 100644 --- a/claude_code_log/models.py +++ b/claude_code_log/models.py @@ -544,9 +544,6 @@ class 
UserTextMessage(MessageContent): TextContent | ImageContent | IdeNotificationContent ] = field(default_factory=list) - # Cached raw text extracted from items (for dedup matching, simple renderers) - raw_text_content: Optional[str] = None - @property def message_type(self) -> str: return "user" @@ -589,9 +586,6 @@ class AssistantTextMessage(MessageContent): TextContent | ImageContent ] = field(default_factory=list) - # Cached raw text extracted from items (for dedup matching, simple renderers) - raw_text_content: Optional[str] = None - # Token usage string (formatted from UsageInfo when available) token_usage: Optional[str] = None @@ -673,28 +667,6 @@ def message_type(self) -> str: return "session_header" -@dataclass -class DedupNoticeMessage(MessageContent): - """Content for deduplication notices. - - Displayed when content is deduplicated (e.g., sidechain assistant - text that duplicates the Task tool result). Styled as assistant message - since it replaces sidechain assistant content. - - The `original` field preserves the original AssistantTextMessage so renderers - can optionally show the full content instead of the link. 
- """ - - notice_text: str - target_message_id: Optional[str] = None # Message ID for anchor link - original_text: Optional[str] = None # Original duplicated content (for debugging) - original: Optional["AssistantTextMessage"] = None # Original message for renderers - - @property - def message_type(self) -> str: - return "assistant" # Styled as assistant (replaces sidechain assistant) - - # ============================================================================= # Tool Message Models # ============================================================================= @@ -748,7 +720,7 @@ def message_type(self) -> str: # ============================================================================= # Tool Input Models # ============================================================================= -# Typed models for tool inputs (Phase 11 of MESSAGE_REFACTORING.md) +# Typed models for tool inputs. # These provide type safety and IDE autocompletion for tool parameters. diff --git a/claude_code_log/renderer.py b/claude_code_log/renderer.py index 61a618b5..d137ee3b 100644 --- a/claude_code_log/renderer.py +++ b/claude_code_log/renderer.py @@ -30,15 +30,20 @@ UsageInfo, # Structured content types AssistantTextMessage, + BashInputMessage, + BashOutputMessage, CommandOutputMessage, - DedupNoticeMessage, + CompactedSummaryMessage, + HookSummaryMessage, SessionHeaderMessage, SlashCommandMessage, SystemMessage, TaskOutput, + ThinkingMessage, ToolResultMessage, ToolUseMessage, UnknownMessage, + UserMemoryMessage, UserSlashCommandMessage, UserSteeringMessage, UserTextMessage, @@ -191,11 +196,6 @@ def is_session_header(self) -> bool: """Check if this message is a session header.""" return isinstance(self.content, SessionHeaderMessage) - @property - def has_markdown(self) -> bool: - """Check if this message has markdown content.""" - return self.content.has_markdown - @property def has_children(self) -> bool: """Check if this message has any children.""" @@ -612,7 +612,7 @@ def 
generate_template_messages( # Clean up sidechain duplicates on the tree structure # - Remove first UserTextMessage (duplicate of Task input prompt) - # - Replace last AssistantTextMessage (duplicate of Task output) with DedupNotice + # - Remove last AssistantTextMessage (duplicate of Task output) with log_timing("Cleanup sidechain duplicates", t_start): _cleanup_sidechain_duplicates(root_messages) @@ -813,8 +813,6 @@ class PairingIndices: tool_result: dict[tuple[str, str], TemplateMessage] # uuid -> TemplateMessage for system messages (parent-child pairing) uuid: dict[str, TemplateMessage] - # parent_uuid -> TemplateMessage for slash-command messages - slash_command_by_parent: dict[str, TemplateMessage] def _build_pairing_indices(messages: list[TemplateMessage]) -> PairingIndices: @@ -826,7 +824,6 @@ def _build_pairing_indices(messages: list[TemplateMessage]) -> PairingIndices: tool_use_index: dict[tuple[str, str], TemplateMessage] = {} tool_result_index: dict[tuple[str, str], TemplateMessage] = {} uuid_index: dict[str, TemplateMessage] = {} - slash_command_by_parent: dict[str, TemplateMessage] = {} for msg in messages: # Index tool_use and tool_result by (session_id, tool_use_id) @@ -841,17 +838,10 @@ def _build_pairing_indices(messages: list[TemplateMessage]) -> PairingIndices: if msg.meta.uuid and msg.type == "system": uuid_index[msg.meta.uuid] = msg - # Index slash-command user messages by parent_uuid - if msg.parent_uuid and isinstance( - msg.content, (SlashCommandMessage, UserSlashCommandMessage) - ): - slash_command_by_parent[msg.parent_uuid] = msg - return PairingIndices( tool_use=tool_use_index, tool_result=tool_result_index, uuid=uuid_index, - slash_command_by_parent=slash_command_by_parent, ) @@ -906,7 +896,6 @@ def _try_pair_by_index( Index-based pairing rules (can be any distance apart): - tool_use + tool_result (by tool_use_id within same session) - system parent + system child (by uuid/parent_uuid) - - system + slash-command (by uuid -> parent_uuid) 
""" # Tool use + tool result (by tool_use_id within same session) if current.type == "tool_use" and current.tool_use_id and current.session_id: @@ -919,13 +908,6 @@ def _try_pair_by_index( if current.parent_uuid in indices.uuid: _mark_pair(indices.uuid[current.parent_uuid], current) - # System command finding its slash-command child (by uuid -> parent_uuid) - if ( - current.type == "system" - and current.meta.uuid in indices.slash_command_by_parent - ): - _mark_pair(current, indices.slash_command_by_parent[current.meta.uuid]) - def _identify_message_pairs(messages: list[TemplateMessage]) -> None: """Identify and mark paired messages (e.g., command + output, tool use + result). @@ -975,8 +957,7 @@ def _reorder_paired_messages(messages: list[TemplateMessage]) -> list[TemplateMe Uses dictionary-based approach to find pairs efficiently: 1. Build index of all pair_last messages by tool_use_id - 2. Build index of slash-command pair_last messages by parent_uuid - 3. Single pass through messages, inserting pair_last immediately after pair_first + 2. 
Single pass through messages, inserting pair_last immediately after pair_first """ from datetime import datetime @@ -984,20 +965,11 @@ def _reorder_paired_messages(messages: list[TemplateMessage]) -> list[TemplateMe # Session ID is included to prevent cross-session pairing when sessions are resumed # Stores message references directly (not list positions) pair_last_index: dict[tuple[str, str], TemplateMessage] = {} - # Index slash-command pair_last messages by parent_uuid - slash_command_pair_index: dict[str, TemplateMessage] = {} for msg in messages: if msg.is_last_in_pair and msg.tool_use_id and msg.session_id: key = (msg.session_id, msg.tool_use_id) pair_last_index[key] = msg - # Index slash-command messages by parent_uuid - if ( - msg.is_last_in_pair - and msg.parent_uuid - and isinstance(msg.content, (SlashCommandMessage, UserSlashCommandMessage)) - ): - slash_command_pair_index[msg.parent_uuid] = msg # Create reordered list reordered: list[TemplateMessage] = [] @@ -1023,10 +995,6 @@ def _reorder_paired_messages(messages: list[TemplateMessage]) -> list[TemplateMe if key in pair_last_index: pair_last = pair_last_index[key] - # Check for system + slash-command pairs (via uuid -> parent_uuid) - if pair_last is None and msg.meta.uuid in slash_command_pair_index: - pair_last = slash_command_pair_index[msg.meta.uuid] - # Only append if we haven't already added this pair_last # (handles case where multiple pair_firsts match the same pair_last) if pair_last is not None: @@ -1285,47 +1253,12 @@ def _normalize_for_dedup(text: str) -> str: return _AGENT_ID_LINE_PATTERN.sub("", text).strip() -def _extract_task_result_text(tool_result_message: ToolResultMessage) -> Optional[str]: - """Extract text content from a Task tool result for deduplication matching. 
- - Args: - tool_result_message: The ToolResultMessage containing Task output - - Returns: - The extracted text content (normalized), or None if extraction fails - """ - output = tool_result_message.output - - # Handle parsed TaskOutput (preferred - has structured result field) - if isinstance(output, TaskOutput): - text = output.result.strip() if output.result else None - return _normalize_for_dedup(text) if text else None - - # Handle raw ToolResultContent (fallback for unparsed results) - if not isinstance(output, ToolResultContent): - return None - - content = output.content - if isinstance(content, str): - text = content.strip() if content else None - return _normalize_for_dedup(text) if text else None - - # Handle list of dicts (tool result format) - content_parts: list[str] = [] - for item in content: - text_val = item.get("text", "") - if isinstance(text_val, str): - content_parts.append(text_val) - result = "\n".join(content_parts).strip() - return _normalize_for_dedup(result) if result else None - - def _cleanup_sidechain_duplicates(root_messages: list[TemplateMessage]) -> None: """Clean up duplicate content in sidechains after tree is built. For each Task tool_use or tool_result with sidechain children: - Remove the first UserTextMessage (duplicate of Task input prompt) - - For tool_result: Replace last AssistantTextMessage matching result with DedupNotice + - For tool_result: Remove last AssistantTextMessage if it matches the result Sidechain messages can be children of either tool_use or tool_result depending on timestamp order - tool_use during execution, tool_result after completion. 
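Reviewer sketch, not part of the patch: the cleanup rule in the docstring above (remove the last sidechain assistant message whose text matches the Task result) can be illustrated as below. `Msg`, `normalize`, and `drop_last_duplicate` are hypothetical stand-ins. The real code works on `TemplateMessage` children and normalizes with `_normalize_for_dedup`, whose agent-ID pattern is not shown in this diff, so only whitespace stripping is assumed here.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    """Hypothetical, simplified stand-in for a sidechain child message."""

    type: str
    text: str
    is_sidechain: bool = False


def normalize(text: str) -> str:
    # Assumption: the real normalizer also strips agent-ID lines; this one
    # only trims surrounding whitespace.
    return text.strip()


def drop_last_duplicate(children: list[Msg], task_result_text: str) -> bool:
    """Delete the last sidechain assistant child matching the Task result.

    Scans from the end so that only the final matching message is removed.
    Returns True if a child was deleted.
    """
    target = normalize(task_result_text)
    if not target:
        return False
    for i in range(len(children) - 1, -1, -1):
        child = children[i]
        if (
            child.type == "assistant"
            and child.is_sidechain
            and normalize(child.text) == target
        ):
            del children[i]
            return True
    return False
```

Deleting by index while iterating in reverse is safe here because the loop stops right after the first deletion.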
@@ -1375,34 +1308,35 @@ def process_message(message: TemplateMessage) -> None: if not is_task_tool_result: return - task_result_text = _extract_task_result_text( - cast(ToolResultMessage, message.content) - ) - if not task_result_text: + # Extract task result text from parsed TaskOutput + tool_result_msg = cast(ToolResultMessage, message.content) + if not isinstance(task_output := tool_result_msg.output, TaskOutput): + return + if not (result := task_output.result): + return + if not (task_result_text := _normalize_for_dedup(result.strip())): return for i in range(len(children) - 1, -1, -1): child = children[i] child_content = child.content - # Get raw_text_content from content (UserTextMessage/AssistantTextMessage) - child_raw = getattr(child_content, "raw_text_content", None) - child_text = _normalize_for_dedup(child_raw) if child_raw else None if ( child.type == "assistant" and child.is_sidechain and isinstance(child_content, AssistantTextMessage) - and child_text - and child_text == task_result_text ): - # Replace with dedup notice pointing to the Task result - # Preserve original meta (sidechain/session flags) and original message - child.content = DedupNoticeMessage( - child_content.meta, - notice_text="Task summary — see result above", - target_message_id=message.message_id, - original_text=child_text, - original=child_content, + # Extract text on-demand for dedup check (only for sidechain assistant) + child_raw = "\n".join( + item.text + for item in child_content.items + if isinstance(item, TextContent) ) + child_text = _normalize_for_dedup(child_raw) if child_raw else None + else: + child_text = None + if child_text and child_text == task_result_text: + # Drop duplicate sidechain assistant message + del children[i] break for root in root_messages: @@ -2068,25 +2002,25 @@ class Renderer: - Subclasses override methods to implement format-specific rendering """ - def _dispatch_format(self, obj: Any) -> str: - """Dispatch to format_{ClassName} method based on 
object type.""" + def _dispatch_format(self, obj: Any, message: TemplateMessage) -> str: + """Dispatch to format_{ClassName}(obj, message) based on object type.""" for cls in type(obj).__mro__: if cls is object: break if method := getattr(self, f"format_{cls.__name__}", None): - return method(obj) + return method(obj, message) return "" - def _dispatch_title(self, obj: Any, message: "TemplateMessage") -> Optional[str]: - """Dispatch to title_{ClassName} method based on object type.""" + def _dispatch_title(self, obj: Any, message: TemplateMessage) -> Optional[str]: + """Dispatch to title_{ClassName}(obj, message) based on object type.""" for cls in type(obj).__mro__: if cls is object: break if method := getattr(self, f"title_{cls.__name__}", None): - return method(message) + return method(obj, message) return None - def format_content(self, message: "TemplateMessage") -> str: + def format_content(self, message: TemplateMessage) -> str: """Format message content by dispatching to type-specific method. Looks for a method named format_{ClassName} (e.g., format_SystemMessage). @@ -2098,9 +2032,9 @@ def format_content(self, message: "TemplateMessage") -> str: Returns: Formatted string (e.g., HTML), or empty string if no handler found. """ - return self._dispatch_format(message.content) + return self._dispatch_format(message.content, message) - def title_content(self, message: "TemplateMessage") -> str: + def title_content(self, message: TemplateMessage) -> str: """Get message title by dispatching to type-specific title method. Looks for a method named title_{ClassName} (e.g., title_ToolUseMessage). 
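Reviewer sketch, not part of the patch: the hunk above changes the dispatch helpers so that each handler receives both the content object and the enclosing message. A minimal, self-contained illustration of that MRO-based lookup follows. The `Message` class and the content classes are hypothetical stand-ins, and the sidechain flag is placed directly on the message here, whereas the real code reads `message.meta.is_sidechain`.

```python
from typing import Any, Optional


class Message:
    """Hypothetical stand-in for TemplateMessage."""

    def __init__(self, content: Any, is_sidechain: bool = False) -> None:
        self.content = content
        self.is_sidechain = is_sidechain


class BaseContent: ...
class AssistantText(BaseContent): ...
class SpecialAssistantText(AssistantText): ...  # has no handler of its own
class Unhandled: ...


class Renderer:
    def _dispatch_title(self, obj: Any, message: Message) -> Optional[str]:
        # Walk the class hierarchy from most to least specific and call the
        # first title_{ClassName} method found, passing both arguments.
        for cls in type(obj).__mro__:
            if cls is object:
                break
            if method := getattr(self, f"title_{cls.__name__}", None):
                return method(obj, message)
        return None

    def title_AssistantText(self, _content: AssistantText, message: Message) -> str:
        # The handler needs the message, not only the content, to pick a title.
        return "Sub-assistant" if message.is_sidechain else "Assistant"
```

Passing `(obj, message)` removes the need for `cast(...)` on `message.content` inside each handler, which appears to be the motivation for the signature change in this diff.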
@@ -2117,7 +2051,7 @@ def title_content(self, message: "TemplateMessage") -> str: if cls is object: break if method := getattr(self, f"title_{cls.__name__}", None): - return method(message) + return method(message.content, message) # Fallback: convert message_type to title case return message.content.message_type.replace("_", " ").replace("-", " ").title() @@ -2127,61 +2061,86 @@ def title_content(self, message: "TemplateMessage") -> str: # These methods return title strings for specific content types. # Override in subclasses for format-specific titles (e.g., HTML with icons). - def title_SystemMessage(self, message: "TemplateMessage") -> str: - content = cast("SystemMessage", message.content) + def title_SystemMessage(self, content: SystemMessage, _: TemplateMessage) -> str: return f"System {content.level.title()}" - def title_HookSummaryMessage(self, message: "TemplateMessage") -> str: # noqa: ARG002 + def title_HookSummaryMessage( + self, _content: HookSummaryMessage, _: TemplateMessage + ) -> str: return "System Hook" - def title_SlashCommandMessage(self, message: "TemplateMessage") -> str: # noqa: ARG002 + def title_SlashCommandMessage( + self, content: SlashCommandMessage, _message: TemplateMessage + ) -> str: return "Slash Command" - def title_CommandOutputMessage(self, message: "TemplateMessage") -> str: # noqa: ARG002 + def title_CommandOutputMessage( + self, _content: CommandOutputMessage, _: TemplateMessage + ) -> str: return "" # Empty title for command output - def title_BashInputMessage(self, message: "TemplateMessage") -> str: # noqa: ARG002 + def title_BashInputMessage( + self, _content: BashInputMessage, _: TemplateMessage + ) -> str: return "Bash command" - def title_BashOutputMessage(self, message: "TemplateMessage") -> str: # noqa: ARG002 + def title_BashOutputMessage( + self, _content: BashOutputMessage, _: TemplateMessage + ) -> str: return "" # Empty title for bash output - def title_CompactedSummaryMessage(self, message: "TemplateMessage") 
-> str: # noqa: ARG002 + def title_CompactedSummaryMessage( + self, _content: CompactedSummaryMessage, _: TemplateMessage + ) -> str: return "User (compacted conversation)" - def title_UserMemoryMessage(self, message: "TemplateMessage") -> str: # noqa: ARG002 + def title_UserMemoryMessage( + self, _content: UserMemoryMessage, _: TemplateMessage + ) -> str: return "Memory" - def title_UserSlashCommandMessage(self, message: "TemplateMessage") -> str: # noqa: ARG002 + def title_UserSlashCommandMessage( + self, _content: UserSlashCommandMessage, _: TemplateMessage + ) -> str: return "User (slash command)" - def title_UserTextMessage(self, message: "TemplateMessage") -> str: # noqa: ARG002 + def title_UserTextMessage( + self, _content: UserTextMessage, _message: TemplateMessage + ) -> str: return "User" - def title_UserSteeringMessage(self, message: "TemplateMessage") -> str: # noqa: ARG002 + def title_UserSteeringMessage( + self, _content: UserSteeringMessage, _: TemplateMessage + ) -> str: return "User (steering)" - def title_AssistantTextMessage(self, message: "TemplateMessage") -> str: + def title_AssistantTextMessage( + self, _content: AssistantTextMessage, message: TemplateMessage + ) -> str: # Sidechain assistant messages get special title if message.meta.is_sidechain: return "Sub-assistant" return "Assistant" - def title_ThinkingMessage(self, message: "TemplateMessage") -> str: # noqa: ARG002 + def title_ThinkingMessage( + self, _content: ThinkingMessage, _message: TemplateMessage + ) -> str: return "Thinking" - def title_UnknownMessage(self, message: "TemplateMessage") -> str: # noqa: ARG002 + def title_UnknownMessage(self, _content: UnknownMessage, _: TemplateMessage) -> str: return "Unknown Content" # Tool title methods (dispatch to input/output title methods) - def title_ToolUseMessage(self, message: "TemplateMessage") -> str: - content = cast("ToolUseMessage", message.content) + def title_ToolUseMessage( + self, content: ToolUseMessage, message: 
TemplateMessage + ) -> str: if title := self._dispatch_title(content.input, message): return title return content.tool_name # Default to tool name - def title_ToolResultMessage(self, message: "TemplateMessage") -> str: - content = cast("ToolResultMessage", message.content) + def title_ToolResultMessage( + self, content: ToolResultMessage, message: TemplateMessage + ) -> str: if content.is_error: return "Error" if title := self._dispatch_title(content.output, message): @@ -2189,48 +2148,44 @@ def title_ToolResultMessage(self, message: "TemplateMessage") -> str: return "" # Tool results typically don't need a title # Tool input title stubs (override in subclasses for custom titles) - # def title_BashInput(self, message: "TemplateMessage") -> str: ... - # def title_ReadInput(self, message: "TemplateMessage") -> str: ... - # def title_EditInput(self, message: "TemplateMessage") -> str: ... - # def title_TaskInput(self, message: "TemplateMessage") -> str: ... - # def title_TodoWriteInput(self, message: "TemplateMessage") -> str: ... + # def title_BashInput(self, input: "BashInput", message: "TemplateMessage") -> str: ... + # def title_ReadInput(self, input: "ReadInput", message: "TemplateMessage") -> str: ... + # def title_EditInput(self, input: "EditInput", message: "TemplateMessage") -> str: ... + # def title_TaskInput(self, input: "TaskInput", message: "TemplateMessage") -> str: ... + # def title_TodoWriteInput(self, input: "TodoWriteInput", message: "TemplateMessage") -> str: ... # ------------------------------------------------------------------------- # Format Method Stubs (override in subclasses) # ------------------------------------------------------------------------- # System content formatters - # def format_SystemMessage(self, message: "SystemMessage") -> str: return "" - # def format_HookSummaryMessage(self, message: "HookSummaryMessage") -> str: ... - # def format_SessionHeaderMessage(self, message: "SessionHeaderMessage") -> str: ... 
- # def format_DedupNoticeMessage(self, message: "DedupNoticeMessage") -> str: ... + # def format_SystemMessage(self, content: "SystemMessage", message: "TemplateMessage") -> str: ... + # def format_HookSummaryMessage(self, content: "HookSummaryMessage", _: "TemplateMessage") -> str: ... + # def format_SessionHeaderMessage(self, content: "SessionHeaderMessage", _: "TemplateMessage") -> str: ... # User content formatters - # def format_UserTextMessage(self, message: "UserTextMessage") -> str: ... - # def format_UserSteeringMessage(self, message: "UserSteeringMessage") -> str: ... - # def format_UserSlashCommandMessage(self, message: "UserSlashCommandMessage") -> str: ... - # def format_SlashCommandMessage(self, message: "SlashCommandMessage") -> str: ... - # def format_CommandOutputMessage(self, message: "CommandOutputMessage") -> str: ... - # def format_BashInputMessage(self, message: "BashInputMessage") -> str: ... - # def format_BashOutputMessage(self, message: "BashOutputMessage") -> str: ... - # def format_CompactedSummaryMessage(self, message: "CompactedSummaryMessage") -> str: ... - # def format_UserMemoryMessage(self, message: "UserMemoryMessage") -> str: ... + # def format_UserTextMessage(self, content: "UserTextMessage", _: "TemplateMessage") -> str: ... + # ... # Assistant content formatters - # def format_AssistantTextMessage(self, message: "AssistantTextMessage") -> str: ... - # def format_ThinkingMessage(self, message: "ThinkingMessage") -> str: ... - # def format_UnknownMessage(self, message: "UnknownMessage") -> str: ... + # def format_AssistantTextMessage(self, content: "AssistantTextMessage", _: "TemplateMessage") -> str: ... + # def format_ThinkingMessage(self, content: "ThinkingMessage", _: "TemplateMessage") -> str: ... + # def format_UnknownMessage(self, content: "UnknownMessage", _: "TemplateMessage") -> str: ... 
# Tool content formatters (dispatch to input/output formatters) - def format_ToolUseMessage(self, message: "ToolUseMessage") -> str: - """Dispatch to format_{InputClass} based on message.input type.""" - return self._dispatch_format(message.input) - - def format_ToolResultMessage(self, message: "ToolResultMessage") -> str: - """Dispatch to format_{OutputClass} based on message.output type.""" - return self._dispatch_format(message.output) + def format_ToolUseMessage( + self, content: ToolUseMessage, message: TemplateMessage + ) -> str: + """Dispatch to format_{InputClass} based on content.input type.""" + return self._dispatch_format(content.input, message) + + def format_ToolResultMessage( + self, content: ToolResultMessage, message: TemplateMessage + ) -> str: + """Dispatch to format_{OutputClass} based on content.output type.""" + return self._dispatch_format(content.output, message) # Tool input formatters - # def format_BashInput(self, input: "BashInput") -> str: ... + # def format_BashInput(self, input: "BashInput", _: "TemplateMessage") -> str: ... # def format_ReadInput(self, input: "ReadInput") -> str: ... # def format_WriteInput(self, input: "WriteInput") -> str: ... # def format_EditInput(self, input: "EditInput") -> str: ... diff --git a/claude_code_log/renderer_timings.py b/claude_code_log/renderer_timings.py index b6d43900..9645bf07 100644 --- a/claude_code_log/renderer_timings.py +++ b/claude_code_log/renderer_timings.py @@ -111,40 +111,14 @@ def timing_stat(list_name: str) -> Iterator[None]: def report_timing_statistics( - message_timings: list[Tuple[float, str, int, str]], operation_timings: list[Tuple[str, list[Tuple[float, str]]]], ) -> None: - """Report timing statistics for message rendering. + """Report timing statistics for rendering operations. Args: - message_timings: List of (duration, message_type, index, msg_id) tuples. - Can be empty if only operation timings are being reported. 
operation_timings: List of (name, timings) tuples where timings is a list of (duration, msg_id) e.g., [("Markdown", markdown_timings), ("Pygments", pygments_timings)] """ - # Report message loop statistics if available - if message_timings: - # Sort by duration descending - sorted_timings = sorted(message_timings, key=lambda x: x[0], reverse=True) - - # Calculate statistics - total_msg_time = sum(t[0] for t in message_timings) - avg_time = total_msg_time / len(message_timings) - - # Report slowest messages - print("\n[TIMING] Loop statistics:", flush=True) - print(f"[TIMING] Total messages: {len(message_timings)}", flush=True) - print( - f"[TIMING] Average time per message: {avg_time * 1000:.1f}ms", flush=True - ) - print("[TIMING] Slowest 10 messages:", flush=True) - for duration, msg_type, idx, msg_id in sorted_timings[:10]: - print( - f"[TIMING] Message {msg_id} (#{idx}, {msg_type}): {duration * 1000:.1f}ms", - flush=True, - ) - - # Report operation-specific statistics for operation_name, timings in operation_timings: if timings: sorted_ops = sorted(timings, key=lambda x: x[0], reverse=True) diff --git a/dev-docs/MESSAGE_REFACTORING.md b/dev-docs/MESSAGE_REFACTORING.md deleted file mode 100755 index 90881e40..00000000 --- a/dev-docs/MESSAGE_REFACTORING.md +++ /dev/null @@ -1,517 +0,0 @@ -# Message Rendering Refactoring Plan - -This document tracks the ongoing refactoring effort to improve the message rendering code in `renderer.py`. - -## Current State (dev/message-tree-refactoring) - -As of December 2025, significant refactoring has been completed. 
The architecture now separates format-neutral message processing from HTML-specific rendering: - -| Module | Lines | Notes | -|--------|-------|-------| -| `renderer.py` | 2525 | Format-neutral: tree building, pairing, hierarchy | -| `html/renderer.py` | 297 | HtmlRenderer: tree traversal, template rendering | -| `html/tool_formatters.py` | 950 | Tool use/result HTML formatting | -| `html/user_formatters.py` | 326 | User message HTML formatting | -| `html/assistant_formatters.py` | 90 | Assistant/thinking HTML formatting | -| `html/system_formatters.py` | 113 | System message HTML formatting | -| `html/utils.py` | 352 | Shared HTML utilities (markdown, escape, etc.) | -| `html/ansi_colors.py` | 261 | ANSI → HTML conversion | -| `models.py` | 858 | Content models, MessageModifiers | - -**Key architectural changes:** -- **Tree-first architecture** - `generate_template_messages()` returns tree roots, HtmlRenderer flattens via pre-order traversal -- **Format-neutral Renderer base class** - Subclasses (HtmlRenderer) implement format-specific rendering -- **Content models in models.py** - SessionHeaderContent, DedupNoticeContent, IdeNotificationContent, etc. -- **Formatter separation** - HTML formatters split by message type in `html/` directory - -## Motivation - -The refactoring aims to: - -1. **Improve maintainability** - Functions are too large (some 600+ lines) -2. **Better separation of concerns** - Move specialized utilities to dedicated modules -3. **Improve type safety** - Use typed objects instead of generic dictionaries -4. **Enable testing** - Large functions are difficult to unit test -5. **Performance profiling** - Timing instrumentation to identify bottlenecks - -## Related Refactoring Branches - -### dev/message-tree-refactoring (Current Branch) - -This branch implements tree-based message rendering. See [TEMPLATE_MESSAGE_CHILDREN.md](TEMPLATE_MESSAGE_CHILDREN.md) for details. 
- -**Completed Work:** -- ✅ Phase 1: Added `children: List[TemplateMessage]` field to TemplateMessage -- ✅ Phase 1: Added `flatten()` and `flatten_all()` methods for backward compatibility -- ✅ Phase 2: Implemented `_build_message_tree()` function -- ✅ **Phase 2.5: Tree-first architecture** (December 2025) - - `generate_template_messages()` now returns tree roots, not flat list - - `HtmlRenderer._flatten_preorder()` traverses tree, formats content, builds flat list - - Content formatting happens during pre-order traversal (single pass) - - Template unchanged - still receives flat list (Phase 3 future work) - -**Architecture:** -```text -TranscriptEntry[] → generate_template_messages() → root_messages (tree) - ↓ - HtmlRenderer._flatten_preorder() → flat_list - ↓ - template.render(messages=flat_list) -``` - -**Integration with this refactoring:** -- Tree structure enables future **recursive template rendering** (Phase 3 in TEMPLATE_MESSAGE_CHILDREN.md) -- Provides foundation for **Visitor pattern** output formats (HTML, Markdown, JSON) -- Format-neutral `Renderer` base class allows alternative renderer implementations - -### golergka's text-output-format Branch (ada7ef5) - -Adds text/markdown/chat output formats via new `content_extractor.py` module. 
- -**Key Changes:** -- Created `content_extractor.py` with dataclasses: `ExtractedText`, `ExtractedThinking`, `ExtractedToolUse`, `ExtractedToolResult`, `ExtractedImage` -- Refactored `render_message_content()` to use extraction layer (~70 lines changed) -- Added `text_renderer.py` for text-based output (426 lines) -- CLI `--format` option: html, text, markdown, chat - -**Relationship to This Refactoring:** - -| Aspect | golergka's Approach | This Refactoring | -|--------|---------------------|------------------| -| Focus | Multi-format output | Code organization | -| Data layer | ContentItem → ExtractedContent | TemplateMessage tree | -| Presentation | Separate renderers per format | Modular HTML renderer | -| Compatibility | Parallel to HTML | Refactor existing HTML | - -**Integration Assessment:** -- **Complementary**: golergka's extraction layer operates at ContentItem level, this refactoring at TemplateMessage level -- **Low conflict**: `content_extractor.py` is a new module, doesn't touch hierarchy/pairing code -- **Synergy opportunity**: Text renderer could benefit from tree structure for nested output -- **Risk**: `render_message_content()` changes in golergka's PR conflict with local changes - -**Recommendation:** Consider integrating golergka's work **after** completing Phase 3 (ANSI extraction) and Phase 4 (Tool formatters extraction). The content extraction layer is useful for multi-format support, but is tangential to the core refactoring goals of reducing renderer.py complexity. 
- -## Completed Phases - -### Phase 1: Timing Infrastructure (Commits: 56b2807, 8426f39) - -**Goal**: Centralize timing utilities and standardize timing instrumentation patterns - -**Changes**: -- ✅ Extracted timing utilities to `renderer_timings.py` module -- ✅ Moved `DEBUG_TIMING` environment variable handling to timing module -- ✅ Standardized `log_timing` context manager pattern - work goes INSIDE the `with` block -- ✅ Added support for dynamic phase names using lambda expressions -- ✅ Removed top-level `os` import from renderer.py (no longer needed) - -**Benefits**: -- All timing-related code centralized in one module -- Consistent timing instrumentation throughout renderer -- Easy to enable/disable timing with `CLAUDE_CODE_LOG_DEBUG_TIMING` environment variable -- Better insight into rendering performance - -### Phase 2: Tool Use Context Optimization (Commit: 56b2807) - -**Goal**: Simplify tool use context management and eliminate unnecessary pre-processing - -**Analysis**: -- `tool_use_context` was only used when processing tool results -- The "prompt" member stored for Task tools wasn't actually used in lookups -- Tool uses always appear before tool results chronologically -- No need for separate pre-processing pass - -**Changes**: -- ✅ Removed `_define_tool_use_context()` function (68 lines eliminated) -- ✅ Changed `tool_use_context` from `Dict[str, Dict[str, Any]]` to `Dict[str, ToolUseContent]` -- ✅ Build index inline when creating ToolUseContent objects during message processing -- ✅ Use attribute access instead of dictionary access for better type safety -- ✅ Replaced dead code in `render_message_content` with warnings - -**Benefits**: -- Eliminated entire pre-processing pass through messages -- Better type safety with ToolUseContent objects -- Cleaner code with inline index building -- ~70 lines of code removed - -### Phase 3: ANSI Color Module Extraction ✅ COMPLETE - -**Goal**: Extract ANSI color conversion to dedicated module - -**Changes**: -- ✅ 
Created `claude_code_log/html/ansi_colors.py` (261 lines) -- ✅ Moved `_convert_ansi_to_html()` → `convert_ansi_to_html()` -- ✅ Updated imports in `renderer.py` -- ✅ Updated test imports in `test_ansi_colors.py` - -**Result**: 242 lines removed from renderer.py (4246 → 4004) - -### Phase 4: Code Rendering Module Extraction ✅ COMPLETE - -**Goal**: Extract code-related rendering (Pygments highlighting, diff rendering) to dedicated module - -**Changes**: -- ✅ Created `claude_code_log/html/renderer_code.py` (330 lines) -- ✅ Moved `_highlight_code_with_pygments()` → `highlight_code_with_pygments()` -- ✅ Moved `_truncate_highlighted_preview()` → `truncate_highlighted_preview()` -- ✅ Moved `_render_single_diff()` → `render_single_diff()` -- ✅ Moved `_render_line_diff()` → `render_line_diff()` -- ✅ Updated imports in `renderer.py` -- ✅ Updated test imports in `test_preview_truncation.py` -- ✅ Removed unused Pygments imports from renderer.py - -**Result**: 274 lines removed from renderer.py (4004 → 3730) - -**Note**: The original Phase 4 plan targeted tool formatters (~600 lines), but due to tight coupling with `escape_html`, `render_markdown`, and other utilities, we extracted a cleaner subset: code highlighting and diff rendering. The remaining tool formatters could be extracted in a future phase once the shared utilities are better factored. 
- -### Phase 5: Message Processing Decomposition ✅ PARTIAL - -**Goal**: Break down the 687-line `_process_messages_loop()` into smaller functions - -**Changes**: -- ✅ Created `_process_system_message()` function (~88 lines) - handles hook summaries, commands, system messages -- ✅ Created `ToolItemResult` dataclass for structured tool processing results -- ✅ Created `_process_tool_use_item()` function (~84 lines) - handles tool_use content items -- ✅ Created `_process_tool_result_item()` function (~71 lines) - handles tool_result content items -- ✅ Created `_process_thinking_item()` function (~21 lines) - handles thinking content -- ✅ Created `_process_image_item()` function (~17 lines) - handles image content -- ✅ Replaced ~220 lines of nested conditionals with clean dispatcher pattern - -**Result**: `_process_messages_loop()` reduced from ~687 to ~460 lines (33% smaller) - -**Note**: File size increased slightly (3730 → 3814 lines) due to new helper functions, but the main loop is now much more maintainable with focused, testable helper functions. Further decomposition (session tracking, token usage extraction) could reduce it to ~200 lines but would require more complex parameter passing. 
- -### Phase 6: Message Pairing Simplification ✅ COMPLETE - -**Goal**: Simplify the complex pairing logic in `_identify_message_pairs()` - -**Changes**: -- ✅ Created `PairingIndices` dataclass to hold all lookup indices in one place -- ✅ Extracted `_build_pairing_indices()` function (~35 lines) - builds all indices in single pass -- ✅ Extracted `_mark_pair()` utility (~8 lines) - marks first/last message pairing -- ✅ Extracted `_try_pair_adjacent()` function (~25 lines) - handles adjacent message pairs -- ✅ Extracted `_try_pair_by_index()` function (~30 lines) - handles index-based pairing -- ✅ Simplified `_identify_message_pairs()` from ~120 lines to ~37 lines (69% smaller) - -**Result**: Pairing logic decomposed into focused helpers with clear responsibilities: -- `_build_pairing_indices()`: O(n) index building for tool_use, tool_result, uuid, slash_command lookups -- `_try_pair_adjacent()`: Handles system+slash, command+output, tool_use+result adjacent pairs -- `_try_pair_by_index()`: Handles index-based pairing for non-adjacent messages - -**Note**: File size increased slightly (3814 → 3853 lines) due to new helper functions, but the main pairing function is now much cleaner and each helper is independently testable. 
- -## Planned Future Phases - -### Phase 7: Message Type Documentation ✅ COMPLETE - -**Goal**: Document message types and CSS classes comprehensively - -**Completed Work**: -- ✅ Created comprehensive [css-classes.md](css-classes.md) with: - - Complete CSS class combinations (19 semantic patterns) - - CSS rule support status (24 full, 7 partial, 1 none) - - Pairing behavior documentation (pair_first/pair_last rules) - - Fold-bar support analysis -- ✅ Updated [messages.md](messages.md) with: - - Complete css_class trait mapping table - - Pairing patterns and rules by type - - Full tool table (16 tools with model info) - - Cross-references to css-classes.md - -### Phase 8: Testing Infrastructure ✅ COMPLETE - -**Goal**: Improve test coverage for refactored modules - -**Completed Work**: -- ✅ Created `test/test_phase8_message_variants.py` with tests for: - - Slash command rendering (`isMeta=True` flag) - - Queue operations skip behavior (enqueue/dequeue not rendered) - - CSS class modifiers composition (`error`, `sidechain`, combinations) - - Deduplication with modifiers -- ✅ Created `test/test_renderer.py` with edge case tests for: - - System message handling - - Write and Edit tool rendering -- ✅ Created `test/test_renderer_code.py` with tests for: - - Pygments highlighting (pattern matching, unknown extensions, ClassNotFound) - - Truncated highlighted preview - - Diff rendering edge cases (consecutive removals, hint line skipping) -- ✅ Simplified CSS by removing redundant `paired-message` class -- ✅ Updated snapshot tests and documentation - -**Test Files Added**: -- [test/test_phase8_message_variants.py](../test/test_phase8_message_variants.py) - Message type variants -- [test/test_renderer.py](../test/test_renderer.py) - Renderer edge cases -- [test/test_renderer_code.py](../test/test_renderer_code.py) - Code highlighting/diff tests - -**Coverage Notes**: -- Some lines in `renderer_code.py` (116-118, 319) are unreachable due to algorithm behavior -- Pygments 
`ClassNotFound` exception path covered via mock testing - -### Phase 9: Type Safety Improvements ✅ COMPLETE - -**Goal**: Replace string-based type checking with enums and typed structures - -**Completed Work**: -- ✅ Added `MessageType(str, Enum)` in `models.py` with all message types -- ✅ Added type guards for TranscriptEntry union narrowing (available for future use) -- ✅ Updated `renderer.py` to use `MessageType` enum for key comparisons -- ✅ Maintained backward compatibility via `str` base class - -**MessageType Enum Values**: -- JSONL entry types: `USER`, `ASSISTANT`, `SYSTEM`, `SUMMARY`, `QUEUE_OPERATION` -- Rendering types: `TOOL_USE`, `TOOL_RESULT`, `THINKING`, `IMAGE`, `BASH_INPUT`, `BASH_OUTPUT`, `SESSION_HEADER`, `UNKNOWN` -- System subtypes: `SYSTEM_INFO`, `SYSTEM_WARNING`, `SYSTEM_ERROR` - -**Type Guards Added**: -- `is_user_entry()`, `is_assistant_entry()`, `is_system_entry()`, `is_summary_entry()`, `is_queue_operation_entry()` -- `is_tool_use_content()`, `is_tool_result_content()`, `is_thinking_content()`, `is_image_content()`, `is_text_content()` - -**Note**: MessageModifiers dataclass deferred - existing boolean flags work well for now - -### Phase 10: Parser Simplification ✅ COMPLETE - -**Goal**: Simplify `extract_text_content()` using isinstance checks - -**Completed Work**: -- ✅ Added imports for Anthropic SDK types: `TextBlock`, `ThinkingBlock` -- ✅ Simplified `extract_text_content()` with clean isinstance checks -- ✅ Removed defensive `hasattr`/`getattr` patterns -- ✅ 23% code reduction (17 lines → 13 lines) - -**Before** (defensive pattern): -```python -if hasattr(item, "type") and getattr(item, "type") == "text": - text = getattr(item, "text", "") - if text: - text_parts.append(text) -``` - -**After** (clean isinstance): -```python -if isinstance(item, (TextContent, TextBlock)): - text_parts.append(item.text) -elif isinstance(item, (ThinkingContent, ThinkingBlock)): - continue -``` - -**Testing Evidence**: All 431 tests pass with simplified 
version -**Risk**: Low - maintains same behavior, fully tested - -### Phase 11: Tool Model Enhancement ✅ COMPLETE - -**Goal**: Add typed models for tool inputs (currently all generic `Dict[str, Any]`) - -**Completed Work**: -- ✅ Added 9 typed input models to `models.py`: - - `BashInput`, `ReadInput`, `WriteInput`, `EditInput`, `MultiEditInput` - - `GlobInput`, `GrepInput`, `TaskInput`, `TodoWriteInput` -- ✅ Created `ToolInput` union type for type-safe tool input handling -- ✅ Added `TOOL_INPUT_MODELS` mapping for tool name → model class lookup -- ✅ Added `parse_tool_input()` helper function with fallback to raw dict - -**Typed Input Models Added**: -```python -class BashInput(BaseModel): - command: str - description: Optional[str] = None - timeout: Optional[int] = None - run_in_background: Optional[bool] = None - dangerouslyDisableSandbox: Optional[bool] = None - -class ReadInput(BaseModel): - file_path: str - offset: Optional[int] = None - limit: Optional[int] = None - -class EditInput(BaseModel): - file_path: str - old_string: str - new_string: str - replace_all: Optional[bool] = None -``` - -**Note**: The `ToolUseContent.input` field remains `Dict[str, Any]` for backward compatibility. -The new typed models are available for optional use via `parse_tool_input()`. Existing -code continues to work unchanged with dictionary access. - -**Independence from Phase 12**: Phase 11 and Phase 12 are independent improvements. -Phase 12 focuses on architectural decomposition (splitting renderer.py into format-neutral -and format-specific modules), while Phase 11 provides typed tool input models as an -optional type-safety enhancement. The typed models can be adopted incrementally by any -code that wants to use them, independent of the format-neutral refactoring. 
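A minimal sketch of how a name-to-model lookup with a raw-dict fallback might behave. To keep the snippet dependency-free it uses dataclasses as a stand-in for the Pydantic models shown above, so no type validation happens here. The signature and body of `parse_tool_input()` below are assumptions for illustration, not the actual code in `models.py`.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional, Union


@dataclass
class BashInput:
    command: str
    description: Optional[str] = None
    timeout: Optional[int] = None


@dataclass
class ReadInput:
    file_path: str
    offset: Optional[int] = None
    limit: Optional[int] = None


ToolInput = Union[BashInput, ReadInput]

# Tool name -> model class, mirroring the TOOL_INPUT_MODELS idea.
TOOL_INPUT_MODELS: dict[str, type] = {"Bash": BashInput, "Read": ReadInput}


def parse_tool_input(
    tool_name: str, raw: dict[str, Any]
) -> Union[ToolInput, dict[str, Any]]:
    """Return a typed model when possible, otherwise the raw dict unchanged."""
    model = TOOL_INPUT_MODELS.get(tool_name)
    if model is None:
        return raw  # unknown tool: keep dictionary access working
    known = {f.name for f in fields(model)}
    try:
        # Ignore keys the model does not declare (assumed behavior).
        return model(**{k: v for k, v in raw.items() if k in known})
    except TypeError:
        # A required field is missing; fall back to the raw dict.
        return raw
```

Callers that want type safety can check `isinstance(result, BashInput)`, while existing code that indexes into the dictionary keeps working whenever the fallback path is taken.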
- -### Phase 12: Renderer Decomposition - Format Neutral ✅ COMPLETE - -**Goal**: Separate format-neutral logic from HTML-specific generation - -**Achieved Architecture** (December 2025): -``` -renderer.py (2525 lines) - Format-neutral -├── generate_template_messages() → returns tree roots -├── Renderer base class (subclassed by HtmlRenderer) -├── TemplateMessage, TemplateProject, TemplateSummary classes -├── Message processing loop with content model creation -├── Pairing & hierarchy logic -└── Deduplication - -html/ directory - HTML-specific -├── renderer.py (297 lines) - HtmlRenderer class -│ ├── _flatten_preorder() - tree traversal + formatting -│ ├── _format_message_content() - dispatches to formatters -│ └── generate(), generate_session() - template rendering -├── tool_formatters.py (950 lines) - Tool use/result formatters -├── user_formatters.py (326 lines) - User message formatters -├── assistant_formatters.py (90 lines) - Assistant/thinking formatters -├── system_formatters.py (113 lines) - System message formatters -├── utils.py (352 lines) - Markdown, escape, collapsibles -└── ansi_colors.py (261 lines) - ANSI → HTML conversion - -models.py (858 lines) - Content models -├── MessageContent base class and subclasses -├── SessionHeaderContent, DedupNoticeContent (renderer content) -├── IdeNotificationContent, UserTextContent (user content) -├── ReadOutput, EditOutput, etc. 
(tool output models) -└── MessageModifiers dataclass -``` - -**Implementation Steps** (completed differently than the original plan): - -| Step | Description | Status | -|------|-------------|--------| -| 1-5 | Initial HTML extraction | ✅ Complete | -| 6 | Split tool formatters (two-stage: parse + render) | ✅ Done via content models in models.py | -| 7 | Split message content renderers | ✅ Done via html/{user,assistant,system,tool}_formatters.py | -| 8 | Split _process_* message functions | ✅ Content models created during processing | -| 9 | Move generate_projects_index_html | ✅ Moved to html/renderer.py (see Step 9 Status) | -| 10-11 | Final organization | ✅ Complete | - -**Steps 6-8 Resolution**: -The original plan called for two-stage (parse + render) splits. This was achieved differently: -- **Content models** in `models.py` capture parsed data (SessionHeaderContent, IdeNotificationContent, ReadOutput, etc.) -- **Format-neutral processing** in `renderer.py` creates content models during message processing -- **HTML formatters** in `html/*.py` render content models to HTML -- **Tree-first architecture** means HtmlRenderer traverses tree and formats during pre-order walk - -**Step 9 Status**: -`generate_projects_index_html()` is now in `claude_code_log/html/renderer.py` (thin wrapper over `HtmlRenderer.generate_projects_index()`). - -**Dependencies**: -- Requires Phase 9 (type safety) for clean interfaces ✅ -- Benefits from Phase 10 (parser simplification) ✅ -- Tree-first architecture (TEMPLATE_MESSAGE_CHILDREN.md Phase 2.5) ✅ -- Enables golergka's multi-format integration - -**Risk**: High - requires careful refactoring -**Status**: ✅ COMPLETE - -## Recommended Execution Order - -For maximum impact with minimum risk: - -### Completed -1. ✅ **Phase 3 (ANSI)** - Low risk, self-contained, immediate ~250 line reduction -2. ✅ **Phase 4 (Code rendering)** - Medium risk, ~274 line reduction, clear boundaries -3.
✅ **Phase 5 (Processing)** - High impact, main loop 33% smaller -4. ✅ **Phase 6 (Pairing)** - Pairing function 69% smaller, clear helpers -5. ✅ **Phase 7 (Documentation)** - Complete CSS/message docs -6. ✅ **Phase 8 (Testing)** - Coverage gap tests, message variant tests, CSS simplification -7. ✅ **Phase 9 (Type Safety)** - MessageType enum and type guards added - -### Next Steps -8. ✅ **Phase 10 (Parser)** - Simplified extract_text_content() with isinstance checks -9. ✅ **Phase 11 (Tool Models)** - Added typed input models for 9 common tools -10. ✅ **Phase 12 (Format Neutral)** - HTML formatters in `html/` directory, content models in models.py -11. ✅ **Tree-first architecture** - `generate_template_messages()` returns tree roots (TEMPLATE_MESSAGE_CHILDREN.md Phase 2.5) - -**Current Status (December 2025):** -- All planned phases complete -- renderer.py reduced from 4246 to 2525 lines (41% reduction) -- Clean separation: format-neutral in renderer.py, HTML-specific in html/ directory -- Tree-first architecture enables future recursive template rendering - -**Future Work:** -- **Recursive templates** (TEMPLATE_MESSAGE_CHILDREN.md Phase 3): Pass tree roots directly to template with recursive macro -- **Alternative renderers**: Text/markdown renderer using Renderer base class -- **golergka integration**: Content models and tree structure ready for multi-format output - -## Metrics to Track - -| Metric | Baseline (v0.9) | Current (Dec 2025) | Target | -|--------|-----------------|-------------------|--------| -| renderer.py lines | 4246 | 2525 | ✅ <3000 | -| html/ directory | - | 2389 total | - | -| models.py lines | ~400 | 858 | - | -| Largest function | ~687 lines | ~300 lines | <100 lines | -| `_identify_message_pairs()` | ~120 lines | ~37 lines | ✅ | -| Typed tool input models | 0 | 9 | ✅ | -| Content models | 0 | 15+ | - | -| Module count | 3 | 11 | - | -| Test coverage | ~78% | ~78% | >85% | - -**html/ directory breakdown:** -- renderer.py: 297 lines 
(HtmlRenderer) -- tool_formatters.py: 950 lines -- user_formatters.py: 326 lines -- utils.py: 352 lines -- ansi_colors.py: 261 lines -- assistant_formatters.py: 90 lines -- system_formatters.py: 113 lines - -**Progress Summary**: -- renderer.py reduced by 41% (4246 → 2525 lines) -- Format-neutral/HTML separation complete -- Tree-first architecture implemented -- Content models moved to models.py -- HTML formatters organized by message type in html/ directory - -## Quality Gates - -Before merging any phase: - -- [ ] `just test-all` passes -- [ ] `uv run pyright` passes with 0 errors -- [ ] `ruff check` passes -- [ ] Snapshot tests unchanged (or intentionally updated) -- [ ] No performance regression (check with `CLAUDE_CODE_LOG_DEBUG_TIMING=1`) - -## Notes - -- All changes should maintain backward compatibility -- Each phase should be committed separately for easy review -- Consider feature flags for large changes during development -- Run against real Claude projects to verify visual correctness - -## References - -### Code Modules - Format Neutral -- [renderer.py](../claude_code_log/renderer.py) - Format-neutral rendering (2525 lines) -- [models.py](../claude_code_log/models.py) - Content models, MessageModifiers, type guards (858 lines) -- [renderer_code.py](../claude_code_log/renderer_code.py) - Code highlighting & diffs (330 lines) -- [renderer_timings.py](../claude_code_log/renderer_timings.py) - Timing utilities -- [parser.py](../claude_code_log/parser.py) - JSONL parsing - -### Code Modules - HTML Specific (html/ directory) -- [html/renderer.py](../claude_code_log/html/renderer.py) - HtmlRenderer class (297 lines) -- [html/tool_formatters.py](../claude_code_log/html/tool_formatters.py) - Tool HTML formatters (950 lines) -- [html/user_formatters.py](../claude_code_log/html/user_formatters.py) - User message formatters (326 lines) -- [html/assistant_formatters.py](../claude_code_log/html/assistant_formatters.py) - Assistant/thinking formatters (90 lines) -- 
[html/system_formatters.py](../claude_code_log/html/system_formatters.py) - System message formatters (113 lines) -- [html/utils.py](../claude_code_log/html/utils.py) - Markdown, escape, collapsibles (352 lines) -- [html/ansi_colors.py](../claude_code_log/html/ansi_colors.py) - ANSI color conversion (261 lines) - -### Documentation -- [css-classes.md](css-classes.md) - Complete CSS class reference with support status -- [messages.md](messages.md) - Message types, content models, tool documentation -- [FOLD_STATE_DIAGRAM.md](FOLD_STATE_DIAGRAM.md) - Fold system documentation -- [TEMPLATE_MESSAGE_CHILDREN.md](TEMPLATE_MESSAGE_CHILDREN.md) - Tree architecture (Phase 2.5 complete) - -### Tests -- [test/test_ansi_colors.py](../test/test_ansi_colors.py) - ANSI tests -- [test/test_preview_truncation.py](../test/test_preview_truncation.py) - Code preview tests -- [test/test_sidechain_agents.py](../test/test_sidechain_agents.py) - Integration tests -- [test/test_template_data.py](../test/test_template_data.py) - Tree building tests (TestTemplateMessageTree) -- [test/test_phase8_message_variants.py](../test/test_phase8_message_variants.py) - Message variants -- [test/test_renderer.py](../test/test_renderer.py) - Renderer edge cases -- [test/test_renderer_code.py](../test/test_renderer_code.py) - Code highlighting/diff tests - -### External -- golergka's branch: `remotes/golergka/feat/text-output-format` (commit ada7ef5) diff --git a/dev-docs/MESSAGE_REFACTORING2.md b/dev-docs/MESSAGE_REFACTORING2.md deleted file mode 100644 index fc84aaec..00000000 --- a/dev-docs/MESSAGE_REFACTORING2.md +++ /dev/null @@ -1,121 +0,0 @@ -# Message Refactoring Phase 2 - -## Vision - -The goal is to achieve a cleaner, type-driven architecture where: -1. **MessageContent type is the source of truth** - No need for separate `MessageModifiers` or `MessageType` checks -2. **Inverted relationship** - Instead of `TemplateMessage.content: MessageContent`, have `MessageContent.meta: MessageMeta` -3. 
**Leaner models** - Remove derived/redundant fields like `has_children`, `has_markdown`, `is_session_header`, `raw_text_content` -4. **Modular organization** - Split into `user_models.py`, `assistant_models.py`, `tools_models.py` with corresponding factories - -## Current State Analysis - -### What we've achieved ✓ - -- **Content types now determine behavior** (e.g., `UserSlashCommandMessage` vs `UserTextMessage`) -- **Dispatcher pattern** routes formatting based on content type -- **Removed `ContentBlock`** from `ContentItem` union - using our own types -- **Simplified `_process_regular_message`** - content type detection drives rendering -- **CSS_CLASS_REGISTRY** derives CSS classes from content types (in `html/utils.py`) -- **MessageModifiers removed** - only `is_sidechain` remains as a flag on `MessageMeta` -- **UserSteeringMessage** created for queue-operation "remove" messages -- **IdeNotificationContent** is now a plain dataclass (not a MessageContent subclass) -- **Inverted relationship achieved** - `MessageContent.meta` is the source of truth, `TemplateMessage.meta = content.meta` -- **Leaner TemplateMessage** - `has_markdown` delegates to content, `raw_text_content` moved to content classes -- **Title dispatch pattern** - `Renderer.title_content()` dispatches to `title_{ClassName}` methods - -### Factory Organization ✓ - -Completed reorganization from parsers to factories: - -``` -factories/ -├── __init__.py # Re-exports all public symbols -├── meta_factory.py # create_meta(transcript) -> MessageMeta -├── system_factory.py # create_system_message() -├── user_factory.py # create_user_message(), create_*_message() -├── assistant_factory.py # create_assistant_message(), create_thinking_message() -├── tool_factory.py # create_tool_use_message(), create_tool_result_message() -└── transcript_factory.py # create_transcript_entry(), create_content_item() -``` - -### MessageMeta as Required First Parameter ✓ - -All factory functions now require `MessageMeta` as 
the first positional parameter: - -```python -def create_user_message(meta: MessageMeta, content_list: list[ContentItem], ...) -> ... -def create_assistant_message(meta: MessageMeta, items: list[ContentItem]) -> ... -def create_tool_use_message(meta: MessageMeta, tool_item: ContentItem, ...) -> ... -def create_tool_result_message(meta: MessageMeta, tool_item: ContentItem, ...) -> ... -def create_thinking_message(meta: MessageMeta, tool_item: ContentItem) -> ... -``` - -This ensures every `MessageContent` subclass has valid metadata. - -### Goals Status - -| Goal | Status | Notes | -|------|--------|-------| -| Inverted relationship | ✅ Done | `MessageContent.meta` is source of truth, `TemplateMessage.meta = content.meta` | -| Leaner TemplateMessage | ✅ Done | `has_markdown` delegates to content, `raw_text_content` on content classes | -| Title dispatch | ✅ Done | `Renderer.title_content()` with `title_{ClassName}` methods | -| Pure MessageContent | ✅ Done | MessageContent has no render-time fields (relationship data on TemplateMessage) | -| TemplateMessage as primary | ✅ Done | RenderingContext registers TemplateMessage, holds pairing/hierarchy data | -| Models split | ❌ Optional | Still single `models.py` - could split if needed | - -### TemplateMessage Architecture ✓ - -TemplateMessage is now the primary render-time object, with clear separation of concerns: - -**MessageContent** (pure transcript data): -- `meta: MessageMeta` - metadata from transcript -- `message_type` property - type identifier -- `has_markdown` property - whether content has markdown - -**TemplateMessage** (render-time wrapper): -- `content: MessageContent` - the wrapped content -- `meta = content.meta` - convenience alias -- `message_index: Optional[int]` - index in RenderingContext.messages -- `message_id` property - formatted ID for HTML ("d-{index}" or "session-{id}") -- Pairing fields: `pair_first`, `pair_last`, `pair_duration` -- Pairing properties: `is_paired`, `is_first_in_pair`, 
`is_last_in_pair`, `pair_role` -- Hierarchy fields: `ancestry`, `children` -- Fold/unfold counts: `immediate_children_count`, `total_descendants_count`, etc. - -**RenderingContext**: -- `messages: list[TemplateMessage]` - registry of all messages -- `register(message: TemplateMessage) -> int` - assigns `message_index` -- `get(message_index: int) -> TemplateMessage` - lookup by index -- `tool_use_context: dict[str, ToolUseContent]` - for tool result pairing -- `session_first_message: dict[str, int]` - session header indices - -## Cache Considerations - -**Good news**: The cache stores `TranscriptEntry` objects (raw parsed data), not `TemplateMessage`: -```python -class CacheManager: - def load_cached_entries(...) -> Optional[list[TranscriptEntry]] - def save_cached_entries(...) -``` - -This means: -- Cache is at the parsing layer, not rendering layer -- Changing `TemplateMessage` structure won't break cache compatibility -- If we store `MessageContent` class names for deserialization, it's a parsing concern - -**Feasibility of the inversion**: Yes, because: -1. Cache stores raw transcript entries, not TemplateMessages -2. TemplateMessage is generated fresh from entries on each render -3. The relationship between MessageContent and its metadata is internal to rendering - -## Future: Models Split (Optional) - -If we decide to split models.py: - -- `models.py` - Base classes (`MessageContent`, `TranscriptEntry`, etc.) -- `user_models.py` - User message content types -- `assistant_models.py` - Assistant message content types -- `tools_models.py` - Tool use/result models - -This is optional and primarily a code organization improvement. 
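For reference, the `RenderingContext` registry described under "TemplateMessage Architecture" in this document can be sketched as below. This is a trimmed-down illustration under stated assumptions: the `TemplateMessage` here carries only a `text` field, `message_id` covers only the `d-{index}` form (the `session-{id}` form is omitted), and the method bodies are guesses at the simplest behavior consistent with the description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TemplateMessage:
    """Hypothetical, trimmed-down render-time wrapper."""
    text: str
    message_index: Optional[int] = None  # assigned by RenderingContext.register()

    @property
    def message_id(self) -> str:
        # Only the "d-{index}" form is shown in this sketch.
        return f"d-{self.message_index}"


@dataclass
class RenderingContext:
    """Registry of all messages for one render pass."""
    messages: list[TemplateMessage] = field(default_factory=list)

    def register(self, message: TemplateMessage) -> int:
        """Append the message and assign its index in the registry."""
        message.message_index = len(self.messages)
        self.messages.append(message)
        return message.message_index

    def get(self, message_index: int) -> TemplateMessage:
        """Look a message up by the index that register() returned."""
        return self.messages[message_index]
```

Because indices are assigned at registration time, pairing and hierarchy data can refer to other messages by integer index rather than by object reference.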
diff --git a/dev-docs/RENAME_CONTENT_TO_MESSAGE.md b/dev-docs/RENAME_CONTENT_TO_MESSAGE.md deleted file mode 100644 index def039fb..00000000 --- a/dev-docs/RENAME_CONTENT_TO_MESSAGE.md +++ /dev/null @@ -1,173 +0,0 @@ -# Refactoring Plan: Content → Message Naming - -## Goal - -Clarify the naming by using consistent suffixes: -- `*Content` = ContentItem members (JSONL parsing layer) -- `*Input` / `*Output` = Tool-specific parsing -- `*Message` = MessageContent subclasses (rendering layer) -- `*Model` = Pydantic JSONL transcript models - -## Phase 1: Free up "Message" names - -Rename Pydantic transcript message models to add `Model` suffix: - -| Current | New | -|---------|-----| -| `UserMessage` | `UserMessageModel` | -| `AssistantMessage` | `AssistantMessageModel` | - -These are only used in `UserTranscriptEntry.message` and `AssistantTranscriptEntry.message` for Pydantic deserialization. - -## Phase 2: Rename MessageContent subclasses to ...Message - -| Current | New | -|---------|-----| -| `UserTextContent` | `UserTextMessage` | -| `UserSteeringContent` | `UserSteeringMessage` | -| `UserSlashCommandContent` | `UserSlashCommandMessage` | -| `UserMemoryContent` | `UserMemoryMessage` | -| `AssistantTextContent` | `AssistantTextMessage` | -| `SlashCommandContent` | `SlashCommandMessage` | -| `CommandOutputContent` | `CommandOutputMessage` | -| `CompactedSummaryContent` | `CompactedSummaryMessage` | -| `BashInputContent` | `BashInputMessage` | -| `BashOutputContent` | `BashOutputMessage` | -| `SystemContent` | `SystemMessage` | -| `HookSummaryContent` | `HookSummaryMessage` | -| `SessionHeaderContent` | `SessionHeaderMessage` | -| `DedupNoticeContent` | `DedupNoticeMessage` | -| `UnknownContent` | `UnknownMessage` | -| `ThinkingContentModel` | `ThinkingMessage` | -| `ToolResultContentModel` | `ToolResultMessage` | - -Also update: -- `CSS_CLASS_REGISTRY` in `html/utils.py` -- All formatters in `html/*_formatters.py` -- All usages in `renderer.py`, `parser.py`, etc. 
- -## Phase 3: Tool message wrapper pattern with typed inputs/outputs - -### New type aliases - -```python -# Union of all specialized input types + ToolUseContent as generic fallback -ToolInput = Union[ - BashInput, ReadInput, WriteInput, EditInput, MultiEditInput, - GlobInput, GrepInput, TaskInput, TodoWriteInput, AskUserQuestionInput, - ExitPlanModeInput, NotebookEditInput, WebFetchInput, WebSearchInput, - KillShellInput, - ToolUseContent, # Generic fallback when no specialized parser -] - -# Renamed from ToolUseResult for symmetry -# Union of all specialized output types + ToolResultContent as generic fallback -ToolOutput = Union[ - ReadOutput, EditOutput, # ... more as they're implemented - ToolResultContent, # Generic fallback for unparsed results -] -``` - -### New ToolUseMessage - -```python -@dataclass -class ToolUseMessage(MessageContent): - """Message for tool invocations.""" - input: ToolInput # Specialized (BashInput, etc.) or generic (ToolUseContent) - tool_use_id: str # From ToolUseContent.id - tool_name: str # From ToolUseContent.name -``` - -### New ToolResultMessage - -```python -@dataclass -class ToolResultMessage(MessageContent): - """Message for tool results.""" - output: ToolOutput # Specialized (ReadOutput, etc.) 
or generic (ToolResultContent) - tool_use_id: str - tool_name: Optional[str] = None - file_path: Optional[str] = None - - @property - def is_error(self) -> bool: - if isinstance(self.output, ToolResultContent): - return self.output.is_error or False - return False -``` - -### Simple ThinkingMessage (no wrapper) - -```python -@dataclass -class ThinkingMessage(MessageContent): - thinking_text: str # The thinking content - signature: Optional[str] = None -``` - -## Phase 4: Update CSS_CLASS_REGISTRY - -Update to use new names: - -```python -CSS_CLASS_REGISTRY: dict[type[MessageContent], list[str]] = { - # System messages - SystemMessage: ["system"], - HookSummaryMessage: ["system", "system-hook"], - # User messages - UserTextMessage: ["user"], - UserSteeringMessage: ["user", "steering"], - SlashCommandMessage: ["user", "slash-command"], - UserSlashCommandMessage: ["user", "slash-command"], - UserMemoryMessage: ["user"], - CompactedSummaryMessage: ["user", "compacted"], - CommandOutputMessage: ["user", "command-output"], - # Assistant messages - AssistantTextMessage: ["assistant"], - # Tool messages - ToolUseMessage: ["tool_use"], - ToolResultMessage: ["tool_result"], - # Other messages - ThinkingMessage: ["thinking"], - SessionHeaderMessage: ["session_header"], - BashInputMessage: ["bash-input"], - BashOutputMessage: ["bash-output"], - UnknownMessage: ["unknown"], -} -``` - -## Naming Pattern Summary - -| Suffix | Layer | Examples | -|--------|-------|----------| -| `*Content` | ContentItem (JSONL parsing) | `TextContent`, `ToolUseContent`, `ToolResultContent`, `ThinkingContent`, `ImageContent` | -| `*Input` | Tool input parsing | `BashInput`, `ReadInput`, `TaskInput`, ... | -| `*Output` | Tool output parsing | `ReadOutput`, `EditOutput`, ... 
| -| `*Message` | MessageContent (rendering) | `UserTextMessage`, `ToolUseMessage`, `ThinkingMessage` | -| `*Model` | Pydantic JSONL models | `UserMessageModel`, `AssistantMessageModel` | - -## Files to Update - -| File | Changes | -|------|---------| -| `models.py` | All renames, new ToolInput/ToolOutput unions | -| `parser.py` | Update imports and usages | -| `renderer.py` | Update imports and usages | -| `html/utils.py` | Update CSS_CLASS_REGISTRY | -| `html/renderer.py` | Update dispatcher and imports | -| `html/user_formatters.py` | Update function signatures and imports | -| `html/assistant_formatters.py` | Update function signatures and imports | -| `html/tool_formatters.py` | Update to use ToolUseMessage/ToolResultMessage | -| `html/system_formatters.py` | Update function signatures and imports | -| `converter.py` | Update imports | -| `dev-docs/messages.md` | Update documentation | - -## Execution Order - -1. Phase 1: Rename `UserMessage` → `UserMessageModel`, `AssistantMessage` → `AssistantMessageModel` -2. Phase 2: Rename all MessageContent subclasses to `*Message` -3. Phase 3: Create `ToolInput`, `ToolOutput` unions; update `ToolUseMessage`, `ToolResultMessage` -4. Phase 4: Update CSS_CLASS_REGISTRY -5. Run tests, fix any remaining issues -6. Update documentation diff --git a/dev-docs/TEMPLATE_MESSAGE_CHILDREN.md b/dev-docs/TEMPLATE_MESSAGE_CHILDREN.md deleted file mode 100644 index 6a4c142b..00000000 --- a/dev-docs/TEMPLATE_MESSAGE_CHILDREN.md +++ /dev/null @@ -1,208 +0,0 @@ -# Template Message Children Architecture - -This document tracks the exploration of a children-based architecture for `TemplateMessage`, where messages can have nested children to form an explicit tree structure. 
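To make the idea of nested children concrete, here is a minimal sketch of a tree node and a pre-order flatten. The `Node` class and `flatten_preorder()` function are illustrative assumptions, not the project's `TemplateMessage` API. They only show the shape of the data and the traversal order.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical message node with explicit children."""
    label: str
    children: list["Node"] = field(default_factory=list)


def flatten_preorder(roots: list[Node]) -> list[Node]:
    """Return each node followed by its descendants, depth-first."""
    flat: list[Node] = []
    stack = list(reversed(roots))      # leftmost root is popped first
    while stack:
        node = stack.pop()
        flat.append(node)
        # Push children in reverse so the first child is visited next.
        stack.extend(reversed(node.children))
    return flat
```

An explicit stack is used instead of recursion so that deeply nested sidechains cannot hit Python's recursion limit. A recursive version would produce the same order.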
- -## Current Architecture (2025-12-13) - -### Data Flow -``` -TranscriptEntry[] → generate_template_messages() → root_messages (tree) - ↓ - HtmlRenderer._flatten_preorder() → flat_list - ↓ - template.render(messages=flat_list) -``` - -### TemplateMessage (current) -- `generate_template_messages()` returns **tree roots** (typically session headers) -- Each message has `children: List[TemplateMessage]` populated -- `ancestry` field preserved for CSS classes / JavaScript fold/unfold -- HtmlRenderer flattens via pre-order traversal before template rendering - -### Hierarchy Levels -``` -Level 0: Tree roots (messages without ancestry - typically session headers) -Level 1: User messages -Level 2: Assistant, System, Thinking -Level 3: Tool use/result -Level 4: Sidechain user/assistant/thinking -Level 5: Sidechain tools -``` - -**Note:** Tree roots are any messages with empty `ancestry`. This is typically session headers, but in degenerate cases (no session headers), user messages or other top-level messages become roots. - -### Sidechain Hierarchy Details (2025-12-24) - -Sidechain messages come from **Task tool** invocations (subagent spawning). Key findings from investigating real session data: - -#### Where Sidechain Children Attach - -Due to **pair reordering** (tool_result is moved right after its corresponding tool_use), sidechain messages become children of the **Task tool_result**, not the tool_use. 
- -#### Agent Type Patterns - -Real session data shows two distinct patterns depending on agent type: - -**Plan-type agents** (e.g., `-Users-dain-workspace-coderabbit-*`): -- Start with a **user prompt** (`UserTextMessage` with `isSidechain=true`, `parentUuid=null`) -- This user prompt duplicates the Task input - -Initial tree structure (before cleanup): -``` -tool_use (Task) ← 0 children (pair reordering moves tool_result here) -tool_result (Task) ← sidechain messages become children here - └─ user(sc): UserTextMessage ← Level 4: duplicate of Task input - └─ assistant(sc) ← Level 4: parented to user(sc) - └─ tool_use(sc) ← Level 5: parented to user(sc) - └─ tool_result(sc) ← Level 5 -``` - -After cleanup (user prompt removed, children adopted): -``` -tool_result (Task) - └─ assistant(sc) ← Now direct child of tool_result - └─ tool_use(sc) ← Adopted from removed user(sc) - └─ tool_result(sc) -``` - -**Explore-type agents** (e.g., `-src-deep-manifest`): -- Start directly with **assistant** (no user prompt) -- No cleanup needed for the first message - -``` -tool_result (Task) - └─ assistant(sc): AssistantTextMessage ← First child, kept as-is - └─ tool_use(sc) - └─ tool_result(sc) -``` - -#### Child Adoption During Cleanup - -When `_cleanup_sidechain_duplicates()` removes a UserTextMessage (the duplicate prompt), it must **adopt the removed message's children** to prevent orphaning Level 5 tool messages: - -```python -# In _cleanup_sidechain_duplicates() -if ( - children - and children[0].is_sidechain - and isinstance(children[0].content, UserTextMessage) -): - removed = children.pop(0) - # Adopt orphaned children (tool_use/tool_result from sidechain) - if removed.children: - children[:0] = removed.children -``` - -Without this adoption, the sidechain tool messages would be lost from the tree. - -#### Key Insight - -The hierarchy level is determined by message **type**, not by `parentUuid`. 
A sidechain user message (`parentUuid=null`) still appears at Level 4 because: -1. It has `isSidechain=true` -2. Its effective parent is determined by the Task tool_result (found via timestamp/session matching) -3. The tree-building algorithm correctly places it as a child of the Task tool_result - -### Template Rendering (current) -- Single `{% for message in messages %}` loop over flattened list -- Ancestry rendered as CSS classes for JavaScript DOM queries -- Fold/unfold uses `document.querySelectorAll('.message.${targetId}')` -- Tree structure used internally but template still receives flat list - -## Future: Recursive Template Rendering - -The next step would be to pass tree roots directly to the template and use a recursive macro, eliminating the flatten step. - -### Template Rendering (future) -Recursive macro approach (Note: html_content is now passed separately, not stored in message): -```jinja2 -{% macro render_message(message, html_content, depth=0) %} -
<div class="message"> - <div class="content">{{ html_content | safe }}</div> - {% if message.children %} - <div class="children"> - {% for child, child_html in message.children_with_html %} - {{ render_message(child, child_html, depth + 1) }} - {% endfor %} - </div> - {% endif %} - </div>
-{% endmacro %} - -{% for root, root_html in roots_with_html %} -{{ render_message(root, root_html) }} -{% endfor %} -``` - -### JavaScript Simplification (future) -With nested DOM structure, fold/unfold becomes trivial: -```javascript -// Hide all children -messageEl.querySelector('.children').style.display = 'none'; -// Show children -messageEl.querySelector('.children').style.display = ''; -``` - -This would require updating the fold/unfold JavaScript to work with the nested structure rather than CSS class queries. - -## Exploration Log - -### Phase 1: Foundation ✅ COMPLETE -- [x] Add `children` field to TemplateMessage (commit `7077f68`) -- [x] Keep existing flat-list behavior working -- [x] Add `flatten()` method for backward compatibility (commit `ed4d7b3`) - - Instance method `flatten()` returns self + all descendants in depth-first order - - Static method `flatten_all()` flattens list of root messages - - Unit tests in `test/test_template_data.py::TestTemplateMessageTree` - -### Phase 2: Tree Building ✅ COMPLETE -- [x] Create `_build_message_tree()` function (commit `83fcf31`) - - Takes flat list with `message_id` and `ancestry` already set - - Populates `children` field based on ancestry - - Returns list of root messages (those with empty ancestry) -- [x] Called after `_mark_messages_with_children()` in render pipeline -- [x] Integration tests verify tree building doesn't break HTML generation - -### Phase 2.5: Tree-First Architecture ✅ COMPLETE (2025-12-13) -- [x] `generate_template_messages()` now returns tree roots, not flat list (commit `c5048b9`) -- [x] `HtmlRenderer._flatten_preorder()` traverses tree, formats content, builds flat list -- [x] Content formatting happens during pre-order traversal (no separate pass) -- [x] Template unchanged - still receives flat list - -**Key insight:** The flat list was being passed to template AND the same messages had children populated. This caused confusion about which structure was authoritative. 
Now the tree is authoritative and the flat list is derived. - -### Phase 3: Template Migration (TODO - Future Work) -- [ ] Create recursive render macro -- [ ] Update DOM structure to use nested `.children` divs -- [ ] Migrate JavaScript fold/unfold to use nested DOM -- [ ] Pass `root_messages` directly to template (eliminate flatten step) - -### Challenges & Notes - -**Current State (2025-12-13):** -- Tree is the primary structure returned from `generate_template_messages()` -- HtmlRenderer flattens via pre-order traversal for template rendering -- This is cleaner than before: tree in → flat list out (explicit transformation) - -**Performance (2025-12-13):** -- Benchmark: 3.35s for 3917 messages across 5 projects -- Pre-order traversal + formatting is O(n) -- No caching needed - each message formatted exactly once - -**Why Keep Flat Template (for now):** -1. JavaScript fold/unfold relies on CSS class queries -2. Changing DOM structure requires JS migration -3. Current approach works correctly - -## Related Work - -### golergka's text-output-format PR -Created `content_extractor.py` for shared content parsing: -- Separates data extraction from presentation -- Dataclasses for extracted content: `ExtractedText`, `ExtractedToolUse`, etc. 
-- Could be extended for the tree-building approach - -### Visitor Pattern Consideration -For multi-format output (HTML, Markdown, JSON), consider: -- TemplateMessage as a tree data structure (no rendering logic) -- Visitor implementations for each output format -- Preparation in converter.py before any rendering diff --git a/dev-docs/TEMPLATE_MESSAGE_REFACTORING.md b/dev-docs/TEMPLATE_MESSAGE_REFACTORING.md deleted file mode 100644 index fee49101..00000000 --- a/dev-docs/TEMPLATE_MESSAGE_REFACTORING.md +++ /dev/null @@ -1,99 +0,0 @@ -# TemplateMessage Simplification Plan - -## Goal - -Simplify `TemplateMessage` by moving redundant fields to `MessageMeta` (accessible via `content.meta`) and adding properties to `MessageContent` subclasses. This prepares for the eventual replacement of `TemplateMessage` with `MessageContent` directly. - -## Completed Changes ✓ - -### Phase 1: Added `message_type` property to MessageContent subclasses ✓ - -Added to these subclasses that were missing it: -- `SystemMessage` → returns "system" -- `HookSummaryMessage` → returns "system" -- `ToolResultMessage` → returns "tool_result" -- `ToolUseMessage` → returns "tool_use" -- `UnknownMessage` → returns "unknown" -- `SessionHeaderMessage` → returns "session_header" -- `DedupNoticeMessage` → returns "dedup_notice" - -### Phase 2: Added `has_markdown` property ✓ - -- Added to `MessageContent` base class (returns `False` by default) -- Override in `AssistantTextMessage` → returns `True` -- Override in `ThinkingMessage` → returns `True` -- Override in `CompactedSummaryMessage` → returns `True` - -### Phase 3: Skip tool_use_id on base ✓ - -`tool_use_id` already exists as a field on `ToolUseMessage` and `ToolResultMessage`. -No base class property needed - access via `message.content.tool_use_id` when needed. - -### Phase 4: Added `meta` field to TemplateMessage ✓ - -Added `self.meta = content.meta if content else None` for easy transition. 
- -### Phase 5: Updated template to use new accessors ✓ - -- Changed `message.is_session_header` → `is_session_header(message)` (helper function) -- Changed `message.has_markdown` → `message.content.has_markdown if message.content else false` -- Removed dead `session_subtitle` code from template -- Added `is_session_header` helper to `html/utils.py` and template context - -### Phase 6: Converted parameters to properties ✓ - -In `TemplateMessage`: -- Removed `is_session_header` parameter, added property that checks `isinstance(self.content, SessionHeaderMessage)` -- Removed `has_markdown` parameter, added property that returns `self.content.has_markdown if self.content else False` -- Removed `session_subtitle` assignment (was never set anyway) -- Removed unused imports (`CompactedSummaryMessage`, `ThinkingMessage`) - -## Current TemplateMessage State - -### Parameters (in `__init__`) - -| Parameter | Status | Notes | -|-----------|--------|-------| -| `message_type` | KEEP | Still used for now | -| `raw_timestamp` | KEEP | Still used | -| `session_summary` | KEEP | Complex async matching | -| `session_id` | KEEP | Still used | -| `token_usage` | KEEP | Formatted display string | -| `tool_use_id` | KEEP | Used for tool messages | -| `title_hint` | KEEP | Used for tooltips | -| `message_title` | KEEP | Display title | -| `message_id` | KEEP | Hierarchy-assigned | -| `ancestry` | KEEP | Parent chain | -| `has_children` | KEEP | Tree structure flag | -| `uuid` | KEEP | Still used | -| `parent_uuid` | KEEP | Still used | -| `agent_id` | KEEP | Still used | -| `is_sidechain` | KEEP | Still used | -| `content` | KEEP | The MessageContent | - -### Properties (derived from content) - -| Property | Derivation | -|----------|------------| -| `meta` | `content.meta if content else None` | -| `is_session_header` | `isinstance(self.content, SessionHeaderMessage)` | -| `has_markdown` | `self.content.has_markdown if self.content else False` | - -### Instance attributes (set after 
init) - -- `raw_text_content` - For deduplication -- Fold/unfold counts and type maps -- Pairing metadata (`is_paired`, `pair_role`, `pair_duration`) -- `children` - Tree structure - -## Future Work - -The following fields could still be derived from `content.meta` in future refactoring: -- `raw_timestamp` → `content.meta.timestamp` -- `session_id` → `content.meta.session_id` -- `uuid` → `content.meta.uuid` -- `parent_uuid` → `content.meta.parent_uuid` -- `agent_id` → `content.meta.agent_id` -- `is_sidechain` → `content.meta.is_sidechain` -- `message_type` → `content.message_type` -- `message_title` → `content.message_title()` diff --git a/dev-docs/messages.md b/dev-docs/messages.md index e10a4f60..5c3cf78f 100644 --- a/dev-docs/messages.md +++ b/dev-docs/messages.md @@ -73,8 +73,8 @@ class TemplateMessage: # Display message_title: str # Display title (e.g., "User", "Assistant") - is_sidechain: bool # Sub-agent message flag - has_markdown: bool # Content should be rendered as markdown + is_sidechain: bool # Sub-agent message flag (via content.meta) + # Note: has_markdown is accessed via content.has_markdown # Note: CSS classes are derived from content type via CSS_CLASS_REGISTRY # Metadata @@ -263,6 +263,18 @@ class UserSteeringMessage(UserTextMessage): Steering messages represent user interrupts that cancel queued operations. 
+### User Memory + +- **Condition**: Contains `` tags +- **Content Model**: `UserMemoryMessage` +- **CSS Class**: `user` + +```python +@dataclass +class UserMemoryMessage(MessageContent): + memory_text: str # The memory content from the tag +``` + ### Sidechain User (Sub-agent) - **Condition**: `isSidechain: true` @@ -390,7 +402,7 @@ Tool results are wrapped in `ToolResultMessage` for rendering, which provides ad @dataclass class ToolResultMessage(MessageContent): tool_use_id: str - output: ToolOutput # Specialized (ReadOutput, EditOutput) or ToolResultContent + output: ToolOutput # Specialized output or ToolResultContent fallback is_error: bool = False tool_name: Optional[str] = None # Name of the tool file_path: Optional[str] = None # File path for Read/Edit/Write @@ -398,7 +410,12 @@ class ToolResultMessage(MessageContent): # ToolOutput is a union type for tool results ToolOutput = Union[ ReadOutput, + WriteOutput, EditOutput, + BashOutput, + TaskOutput, + AskUserQuestionOutput, + ExitPlanModeOutput, ToolResultContent, # Generic fallback for unparsed results ] ``` @@ -460,6 +477,7 @@ Assistant messages contain `ContentItem` instances that are: @dataclass class AssistantTextMessage(MessageContent): items: list[TextContent | ImageContent] # Interleaved text and images + token_usage: Optional[str] # Formatted token usage string ``` ### Sidechain Assistant @@ -480,6 +498,7 @@ class AssistantTextMessage(MessageContent): class ThinkingMessage(MessageContent): thinking: str # The thinking text signature: Optional[str] # Thinking block signature + token_usage: Optional[str] # Formatted token usage string ``` ```json @@ -527,8 +546,8 @@ The original `ToolUseContent` (Pydantic model) provides: | MultiEdit | `MultiEditInput` | file_path, edits[] | | Bash | `BashInput` | command, description, timeout, run_in_background | | Glob | `GlobInput` | pattern, path | -| Grep | `GrepInput` | pattern, path, glob, type, output_mode | -| Task | `TaskInput` | prompt, subagent_type, 
description, model | +| Grep | `GrepInput` | pattern, path, glob, type, output_mode, multiline, head_limit, offset | +| Task | `TaskInput` | prompt, subagent_type, description, model, run_in_background, resume | | TodoWrite | `TodoWriteInput` | todos[] | | AskUserQuestion | `AskUserQuestionInput` | questions[], question | | ExitPlanMode | `ExitPlanModeInput` | plan, launchSwarm, teammateCount | @@ -685,18 +704,6 @@ class SessionHeaderMessage(MessageContent): summary: Optional[str] = None # Session summary if available ``` -## 5.2 DedupNoticeMessage - -Deduplication notices are shown when content is deduplicated (e.g., sidechain assistant text that duplicates the Task tool result): - -```python -@dataclass -class DedupNoticeMessage(MessageContent): - notice_text: str # e.g., "Content omitted (duplicates Task result)" - target_uuid: Optional[str] = None # UUID of target message - target_message_id: Optional[str] = None # Resolved message ID for anchor link -``` - --- # Part 6: Infrastructure Models @@ -766,6 +773,7 @@ class BaseTranscriptEntry(BaseModel): timestamp: str # ISO 8601 timestamp isMeta: Optional[bool] = None # Slash command marker agentId: Optional[str] = None # Sub-agent ID + gitBranch: Optional[str] = None # Git branch name when available ``` --- @@ -876,6 +884,5 @@ Sub-agent messages (from `Task` tool): - [tool_factory.py](../claude_code_log/factories/tool_factory.py) - `create_tool_use_message()`, `create_tool_result_message()` - [system_factory.py](../claude_code_log/factories/system_factory.py) - `create_system_message()` - [meta_factory.py](../claude_code_log/factories/meta_factory.py) - `create_meta()` -- [TEMPLATE_MESSAGE_CHILDREN.md](TEMPLATE_MESSAGE_CHILDREN.md) - Tree architecture exploration -- [MESSAGE_REFACTORING.md](MESSAGE_REFACTORING.md) - Refactoring plan (Phase 1) -- [MESSAGE_REFACTORING2.md](MESSAGE_REFACTORING2.md) - Refactoring plan (Phase 2) +- [rendering-architecture.md](rendering-architecture.md) - Rendering pipeline and Renderer 
class hierarchy +- [rendering-next.md](rendering-next.md) - Future rendering improvements diff --git a/dev-docs/rendering-architecture.md b/dev-docs/rendering-architecture.md new file mode 100644 index 00000000..3b8e4d54 --- /dev/null +++ b/dev-docs/rendering-architecture.md @@ -0,0 +1,346 @@ +# Rendering Architecture + +This document describes how Claude Code transcript data flows from raw JSONL entries to final output (HTML, Markdown). The architecture separates concerns into distinct layers: + +1. **Parsing Layer** - Raw JSONL to typed transcript entries +2. **Factory Layer** - Transcript entries to `MessageContent` models +3. **Rendering Layer** - Format-neutral tree building and relationship processing +4. **Output Layer** - Format-specific rendering (HTML, Markdown) + +--- + +## 1. Data Flow Overview + +``` +JSONL File + ↓ (parser.py) +list[TranscriptEntry] + ↓ (factories/) +list[TemplateMessage] with MessageContent + ↓ (renderer.py: generate_template_messages) +Tree of TemplateMessage (roots with children) ++ RenderingContext (message registry) ++ Session navigation data + ↓ (html/renderer.py or markdown/renderer.py) +Final output (HTML or Markdown) +``` + +**Key cardinality rules**: +- Each transcript entry has a `uuid`, but a single entry's `list[ContentItem]` may be chunked and produce multiple `MessageContent` objects (e.g., tool_use items are split into separate messages) +- Each `MessageContent` gets exactly one `TemplateMessage` wrapper +- The `message_index` (assigned during registration) uniquely identifies a `TemplateMessage` within a render + +--- + +## 2. 
Naming Conventions + +The codebase uses consistent suffixes to distinguish layers: + +| Suffix | Layer | Examples | +|--------|-------|----------| +| `*Content` | ContentItem (JSONL parsing) | `TextContent`, `ToolUseContent`, `ThinkingContent`, `ImageContent` | +| `*Input` | Tool input models | `BashInput`, `ReadInput`, `TaskInput` | +| `*Output` | Tool output models | `ReadOutput`, `EditOutput`, `TaskOutput` | +| `*Message` | MessageContent (rendering) | `UserTextMessage`, `ToolUseMessage`, `AssistantTextMessage` | +| `*Model` | Pydantic JSONL models | `UserMessageModel`, `AssistantMessageModel` | + +**Key distinction**: +- `ToolUseContent` is the raw JSONL content item +- `ToolUseMessage` is the render-time wrapper containing a typed `ToolInput` +- `BashInput` is a specific tool input model parsed from `ToolUseContent.input` + +--- + +## 3. The Factory Layer + +Factories ([factories/](../claude_code_log/factories/)) transform raw transcript data into typed `MessageContent` models. Each factory focuses on a specific message category: + +| Factory | Creates | Key Function | +|---------|---------|--------------| +| [meta_factory.py](../claude_code_log/factories/meta_factory.py) | `MessageMeta` | `create_meta(entry)` | +| [user_factory.py](../claude_code_log/factories/user_factory.py) | User message types | `create_user_message(meta, content_list, ...)` | +| [assistant_factory.py](../claude_code_log/factories/assistant_factory.py) | Assistant messages | `create_assistant_message(meta, items)` | +| [tool_factory.py](../claude_code_log/factories/tool_factory.py) | Tool use/result | `create_tool_use_message(meta, item, ...)` | +| [system_factory.py](../claude_code_log/factories/system_factory.py) | System messages | `create_system_message(meta, ...)` | + +### Factory Pattern + +All factory functions require `MessageMeta` as the first parameter: + +```python +def create_user_message( + meta: MessageMeta, + content_list: list[ContentItem], + ... 
+) -> UserTextMessage | UserSlashCommandMessage | ... +``` + +This ensures every `MessageContent` has valid metadata accessible via `content.meta`. + +### Tool Input Parsing + +Tool inputs are parsed into typed models in [tool_factory.py:create_tool_input()](../claude_code_log/factories/tool_factory.py): + +```python +TOOL_INPUT_MODELS: dict[str, type[ToolInput]] = { + "Bash": BashInput, + "Read": ReadInput, + "Write": WriteInput, + ... +} + +def create_tool_input(tool_use: ToolUseContent) -> ToolInput: + model_class = TOOL_INPUT_MODELS.get(tool_use.name) + if model_class: + return model_class.model_validate(tool_use.input) + return tool_use # Fallback to raw ToolUseContent +``` + +### Tool Output Parsing + +Tool outputs use a **different approach** than inputs. While inputs are parsed via Pydantic `model_validate()`, outputs are extracted from text using **regex patterns** since tool results arrive as text content: + +```python +TOOL_OUTPUT_PARSERS: dict[str, ToolOutputParser] = { + "Read": parse_read_output, + "Edit": parse_edit_output, + "Write": parse_write_output, + "Bash": parse_bash_output, + "Task": parse_task_output, + ... +} + +def create_tool_output(tool_name, tool_result, file_path) -> ToolOutput: + if parser := TOOL_OUTPUT_PARSERS.get(tool_name): + if parsed := parser(tool_result, file_path): + return parsed + return tool_result # Fallback to raw ToolResultContent +``` + +Each parser extracts text from `ToolResultContent` and parses patterns like: +- `cat -n` format: `" 123→content"` for file content with line numbers +- Structured prefixes: `"The file ... has been updated."` for edit results + +--- + +## 4. The TemplateMessage Wrapper + +`TemplateMessage` ([renderer.py:132](../claude_code_log/renderer.py#L132)) wraps `MessageContent` with render-time state: + +**MessageContent** (pure transcript data): +- `meta: MessageMeta` - timestamp, session_id, uuid, is_sidechain, etc. +- `message_type` property - type identifier ("user", "assistant", etc.) 
+- `has_markdown` property - whether content contains markdown + +**TemplateMessage** (render-time wrapper): +- `content: MessageContent` - the wrapped content +- `meta` property - delegates to `content.meta` (`message.meta is message.content.meta`) +- `message_index: Optional[int]` - unique index in RenderingContext registry +- `message_id` property - formatted as `"d-{message_index}"` for HTML element IDs + +Relationship fields (populated by processing phases, using `message_index` for references): +- Pairing: `pair_first`, `pair_last`, `pair_duration`, `is_first_in_pair`, `is_last_in_pair` +- Hierarchy: `ancestry` (list of parent `message_index` values), `children` +- Fold/unfold: `immediate_children_count`, `total_descendants_count` + +--- + +## 5. Format-Neutral Processing Pipeline + +The core rendering pipeline is in [renderer.py:generate_template_messages()](../claude_code_log/renderer.py#L523). It returns: + +1. **Tree of TemplateMessage** - Session headers as roots with nested children +2. **Session navigation data** - For table of contents +3. **RenderingContext** - Message registry for `message_index` lookups + +### Processing Phases + +The pipeline processes messages through several phases: + +#### Phase 1: Message Loop +[_process_messages_loop()](../claude_code_log/renderer.py) creates `TemplateMessage` wrappers for each transcript entry. The loop handles: +- Inserting session headers at session boundaries +- Creating `MessageContent` via factories +- Registering messages in `RenderingContext` + +#### Phase 2: Pairing +[_identify_message_pairs()](../claude_code_log/renderer.py#L929) marks related messages: +- **Adjacent pairs**: thinking+assistant, bash-input+output, system+slash-command +- **Indexed pairs**: tool_use+tool_result (by tool_use_id) + +After identification, [_reorder_paired_messages()](../claude_code_log/renderer.py#L968) moves `pair_last` messages adjacent to their `pair_first`. 
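The indexed pairing and the reordering step can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's `_identify_message_pairs()` or `_reorder_paired_messages()` code: the `Msg` dataclass, its fields, and the helper names `pair_by_tool_use_id` and `reorder_pairs` are hypothetical stand-ins, and adjacent pairing (thinking+assistant, etc.) is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Msg:
    """Hypothetical stand-in for TemplateMessage (illustration only)."""
    index: int
    kind: str                          # e.g. "tool_use", "tool_result"
    tool_use_id: Optional[str] = None
    pair_first: Optional[int] = None   # index of the first message in the pair
    pair_last: Optional[int] = None    # index of the last message in the pair


def pair_by_tool_use_id(messages: list[Msg]) -> None:
    """Indexed pairing: link each tool_result to the tool_use with the same id."""
    uses = {
        m.tool_use_id: m
        for m in messages
        if m.kind == "tool_use" and m.tool_use_id
    }
    for m in messages:
        if m.kind == "tool_result" and m.tool_use_id in uses:
            first = uses[m.tool_use_id]
            first.pair_last = m.index
            m.pair_first = first.index


def reorder_pairs(messages: list[Msg]) -> list[Msg]:
    """Move each pair_last message so it directly follows its pair_first."""
    by_index = {m.index: m for m in messages}
    ordered: list[Msg] = []
    for m in messages:
        if m.pair_first is not None and m.pair_first in by_index:
            continue  # emitted right after its pair_first below
        ordered.append(m)
        if m.pair_last is not None and m.pair_last in by_index:
            ordered.append(by_index[m.pair_last])
    return ordered


# Two interleaved tool calls: results arrive in a different order than the uses.
msgs = [
    Msg(0, "tool_use", "a"),
    Msg(1, "tool_use", "b"),
    Msg(2, "tool_result", "b"),
    Msg(3, "tool_result", "a"),
]
pair_by_tool_use_id(msgs)
print([m.index for m in reorder_pairs(msgs)])
```

In this sketch the pair links are stored as indices rather than object references, mirroring the note above that relationship fields use `message_index` for references.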
+ +#### Phase 3: Hierarchy +[_build_message_hierarchy()](../claude_code_log/renderer.py) assigns `ancestry` based on message relationships: +- User messages at level 1 +- Assistant/system at level 2 +- Tool use/result at level 3 +- Sidechain messages at level 4+ + +#### Phase 4: Tree Building +[_build_message_tree()](../claude_code_log/renderer.py#L1226) populates `children` lists from `ancestry`: + +``` +Session Header (root) + └─ User message + └─ Assistant message + └─ Tool use + └─ Tool result + └─ Sidechain assistant (Task result children) +``` + +--- + +## 6. RenderingContext + +`RenderingContext` ([renderer.py:75](../claude_code_log/renderer.py#L75)) holds per-render state: + +```python +@dataclass +class RenderingContext: + messages: list[TemplateMessage] # All messages by index + tool_use_context: dict[str, ToolUseContent] # For result→use lookup + session_first_message: dict[str, int] # Session header indices + + def register(self, message: TemplateMessage) -> int: + """Assign message_index and add to registry.""" + + def get(self, message_index: int) -> Optional[TemplateMessage]: + """Lookup by index.""" +``` + +This enables parallel-safe rendering where each render operation gets its own context. + +--- + +## 7. The Renderer Class Hierarchy + +The base `Renderer` class ([renderer.py:2056](../claude_code_log/renderer.py#L2056)) defines the method-based dispatcher pattern. Subclasses implement format-specific rendering. + +### Dispatch Mechanism + +The dispatcher finds methods by content type name and passes both the typed object and the `TemplateMessage`: + +```python +def _dispatch_format(self, obj: Any, message: TemplateMessage) -> str: + """Dispatch to format_{ClassName}(obj, message) method.""" + for cls in type(obj).__mro__: + if cls is object: + break + if method := getattr(self, f"format_{cls.__name__}", None): + return method(obj, message) + return "" +``` + +For example, `ToolUseMessage` with `BashInput`: +1. 
`format_content(message)` calls `_dispatch_format(message.content, message)` +2. Finds `format_ToolUseMessage(content, message)` which calls `_dispatch_format(content.input, message)` +3. Finds `format_BashInput(input, message)` for the specific tool + +### Consistent (obj, message) Signature + +All `format_*` and `title_*` methods receive both parameters: + +```python +def format_BashInput(self, input: BashInput, _: TemplateMessage) -> str: + return format_bash_input(input) + +def title_BashInput(self, input: BashInput, message: TemplateMessage) -> str: + return self._tool_title(message, "💻", input.description) +``` + +This design gives handlers access to: +- **The typed object** (`input: BashInput`) for type-safe field access without casting +- **The full context** (`message: TemplateMessage`) for paired message lookups, ancestry, etc. + +Methods that don't need the message parameter use `_` or `_message` (for LSP compliance in overrides). + +### Title Dispatch + +Similar pattern for titles via `title_{ClassName}` methods: + +```python +def title_ToolUseMessage(self, content: ToolUseMessage, message: TemplateMessage) -> str: + if title := self._dispatch_title(content.input, message): + return title + return content.tool_name # Default fallback +``` + +### Subclass Implementations + +**HtmlRenderer** ([html/renderer.py](../claude_code_log/html/renderer.py)): +- Implements `format_*` methods by delegating to formatter functions +- `_flatten_preorder()` traverses tree, formats content, builds flat list for template +- Generates HTML via Jinja2 templates + +**MarkdownRenderer** ([markdown/renderer.py](../claude_code_log/markdown/renderer.py)): +- Implements `format_*` methods inline +- Writes directly to file/string without templates +- Simpler structure suited to plain text output + +--- + +## 8. 
HTML Formatter Organization + +HTML formatters are split by message category: + +| Module | Scope | Key Functions | +|--------|-------|---------------| +| [user_formatters.py](../claude_code_log/html/user_formatters.py) | User messages | `format_user_text_model_content()`, `format_bash_input_content()` | +| [assistant_formatters.py](../claude_code_log/html/assistant_formatters.py) | Assistant/thinking | `format_assistant_text_content()`, `format_thinking_content()` | +| [system_formatters.py](../claude_code_log/html/system_formatters.py) | System messages | `format_system_content()`, `format_session_header_content()` | +| [tool_formatters.py](../claude_code_log/html/tool_formatters.py) | Tool inputs/outputs | `format_bash_input()`, `format_read_output()`, etc. | +| [utils.py](../claude_code_log/html/utils.py) | Shared utilities | `render_markdown()`, `escape_html()`, `CSS_CLASS_REGISTRY` | + +--- + +## 9. CSS Class Derivation + +CSS classes are derived from content types using `CSS_CLASS_REGISTRY` in [html/utils.py](../claude_code_log/html/utils.py#L56): + +```python +CSS_CLASS_REGISTRY: dict[type[MessageContent], list[str]] = { + SystemMessage: ["system"], # level added dynamically + UserTextMessage: ["user"], + UserSteeringMessage: ["user", "steering"], + ToolUseMessage: ["tool_use"], + ToolResultMessage: ["tool_result"], # error added dynamically + ... +} +``` + +The function `css_class_from_message()` walks the content type's MRO to find matching classes, then adds dynamic modifiers (sidechain, error level). + +See [css-classes.md](css-classes.md) for the complete reference. + +--- + +## 10. Key Architectural Decisions + +### Content as Source of Truth + +`MessageContent.meta` holds all identity data. `TemplateMessage.meta` is the same object: +```python +assert message.meta is message.content.meta # Same object +``` + +Note that `meta.uuid` is the original transcript entry's UUID. 
Since a single entry may be split into multiple `MessageContent` objects (e.g., multiple tool_use items), several messages can share the same UUID. Use `message_index` for unique identification within a render. + +### Tree-First Architecture + +`generate_template_messages()` returns tree roots. Flattening for template rendering is an explicit step in `HtmlRenderer._flatten_preorder()`. This keeps the tree authoritative while supporting existing flat-list templates. + +### Separation of Concerns + +- **models.py**: Pure data structures, no rendering logic +- **factories/**: Data transformation, no I/O +- **renderer.py**: Format-neutral processing (pairing, hierarchy, tree) +- **html/**, **markdown/**: Format-specific output generation + +--- + +## Related Documentation + +- [messages.md](messages.md) - Complete message type reference +- [css-classes.md](css-classes.md) - CSS class combinations and rules +- [FOLD_STATE_DIAGRAM.md](FOLD_STATE_DIAGRAM.md) - Fold/unfold state machine diff --git a/dev-docs/rendering-next.md b/dev-docs/rendering-next.md new file mode 100644 index 00000000..9896ebbf --- /dev/null +++ b/dev-docs/rendering-next.md @@ -0,0 +1,153 @@ +# Rendering: Future Work + +This document captures potential improvements and future work for the rendering system. + +--- + +## 1. Recursive Template Rendering + +Currently, `HtmlRenderer._flatten_preorder()` flattens the message tree into a list for template rendering. The template uses a flat `{% for message in messages %}` loop with CSS class-based ancestry for JavaScript fold/unfold. + +### Goal + +Pass tree roots directly to the template and use a recursive macro: + +```jinja2 +{% macro render_message(message, html_content, depth=0) %} +
+
{{ html_content | safe }}
+ {% if message.children %} +
+ {% for child, child_html in message.children_with_html %} + {{ render_message(child, child_html, depth + 1) }} + {% endfor %} +
+ {% endif %} +
+{% endmacro %} + +{% for root, root_html in roots_with_html %} +{{ render_message(root, root_html) }} +{% endfor %} +``` + +### Benefits + +- **Simpler JavaScript**: Fold/unfold becomes trivial with nested DOM: + ```javascript + messageEl.querySelector('.children').style.display = 'none'; + ``` +- **Natural nesting**: DOM structure mirrors logical tree structure +- **Elimination of flatten step**: One less transformation + +### Migration Steps + +1. Create recursive render macro +2. Update DOM structure to use nested `.children` divs +3. Migrate JavaScript fold/unfold to use nested DOM +4. Pass `root_messages` directly to template + +### Considerations + +- JavaScript fold/unfold currently relies on CSS class queries (`.message.${targetId}`) +- Changing DOM structure requires JS migration +- Current approach works correctly, so this is an optional optimization + +--- + +## 2. Visitor Pattern for Multi-Format Output + +For cleaner multi-format support, consider a visitor pattern where each output format implements a visitor over the message tree. + +### Current Approach + +```python +class Renderer: + def format_content(self, message) -> str: + return self._dispatch_format(message.content, message) + +class HtmlRenderer(Renderer): + def format_SystemMessage(self, content, _) -> str: + return format_system_content(content) + +class MarkdownRenderer(Renderer): + def format_SystemMessage(self, content, _) -> str: + return f"## System\n{content.text}" +``` + +### Visitor Alternative + +```python +from typing import Protocol, TypeVar + +T = TypeVar("T", covariant=True) # result type produced by a visitor + +class MessageVisitor(Protocol[T]): + def visit_system_message(self, content: SystemMessage) -> T: ... + def visit_user_message(self, content: UserTextMessage) -> T: ... + # ...
+ +class HtmlVisitor(MessageVisitor[str]): + def visit_system_message(self, content): + return format_system_content(content) + +class MarkdownVisitor(MessageVisitor[str]): + def visit_system_message(self, content): + return f"## System\n{content.text}" +``` + +The current dispatcher approach works well; the visitor pattern would mainly help if we add many more output formats. + +### ✅ COMPLETED: Consistent (obj, message) Signatures + +Previously there was an asymmetry in method signatures: +- `format_{ClassName}(obj)` received the precise type directly +- `title_{ClassName}(message)` received the `TemplateMessage` wrapper + +**Resolution**: All `format_*` and `title_*` methods now consistently receive both parameters: + +```python +def format_BashInput(self, input: BashInput, _: TemplateMessage) -> str: + ... + +def title_BashInput(self, input: BashInput, message: TemplateMessage) -> str: + ... +``` + +This gives handlers access to both the specific type (for type-safe field access) and the full context (for paired message lookups, ancestry, etc.). Methods that don't need the message parameter use `_` or `_message` to indicate it's unused. + +--- + +## 3. Additional Tool Output Parsers + +Currently parsed: `ReadOutput`, `WriteOutput`, `EditOutput`, `BashOutput`, `TaskOutput`, `AskUserQuestionOutput`, `ExitPlanModeOutput` + +### Not Yet Parsed (fallback to `ToolResultContent`) + +- `GlobOutput` - Would enable structured file list display +- `GrepOutput` - Would enable structured search result display +- `WebFetchOutput` - Would enable structured web content display +- `WebSearchOutput` - Would enable structured search result display + +Adding these would improve rendering for those tool results. + +--- + +## 4. Performance Optimization + +Benchmarks (3.35s for 3917 messages) show adequate performance, but a few improvements remain possible: + +### Template Caching + +Jinja2 templates are already cached via `@lru_cache`. No action needed.
+ +### Pygments Caching + +Syntax highlighting is a significant portion of render time. Could cache highlighted code by content hash for repeated identical blocks. + +### Parallel Rendering + +`RenderingContext` is already designed for parallel-safe rendering. Could process multiple sessions in parallel with separate contexts. + +--- + +## Related Documentation + +- [rendering-architecture.md](rendering-architecture.md) - Current architecture +- [FOLD_STATE_DIAGRAM.md](FOLD_STATE_DIAGRAM.md) - Fold/unfold state machine diff --git a/test/__snapshots__/test_snapshot_html.ambr b/test/__snapshots__/test_snapshot_html.ambr index f0d7e69c..bb6c3651 100644 --- a/test/__snapshots__/test_snapshot_html.ambr +++ b/test/__snapshots__/test_snapshot_html.ambr @@ -9984,7 +9984,7 @@
- 📝 Todo List + 🛠️ TodoWrite
2025-06-14 10:02:00 @@ -9992,41 +9992,48 @@
-
-
- -
- - broken_todo - -
- -
- 🔄 - Implement core functionality - #2 -
- -
- - Add comprehensive tests - #3 -
- -
- - Write user documentation - #4 -
- -
- - Perform code review - #5 -
- -
-
+
+ + + + +
todos +
+ [ + "broken_todo", + { + "id": "2", + "content": "Implement core functionality", + "status": "... +
[
+    "broken_todo",
+    {
+      "id": "2",
+      "content": "Implement core functionality",
+      "status": "in_progress",
+      "priority": "high"
+    },
+    {
+      "id": "3",
+      "content": "Add comprehensive tests",
+      "status": "pending",
+      "priority": "medium"
+    },
+    {
+      "id": "4",
+      "content": "Write user documentation",
+      "status": "pending",
+      "priority": "low"
+    },
+    {
+      "id": "5",
+      "content": "Perform code review",
+      "status": "pending",
+      "priority": "medium"
+    }
+  ]
+
+
diff --git a/test/__snapshots__/test_snapshot_markdown.ambr b/test/__snapshots__/test_snapshot_markdown.ambr
index 4d18aa8a..0c45c779 100644
--- a/test/__snapshots__/test_snapshot_markdown.ambr
+++ b/test/__snapshots__/test_snapshot_markdown.ambr
@@ -284,13 +284,39 @@
   # 📋 Session `todowrit`
-  ## ✅ Todo List
-
-  - ⬜ broken_todo
-  - 🔄 Implement core functionality
-  - ⬜ Add comprehensive tests
-  - ⬜ Write user documentation
-  - ⬜ Perform code review
+  ## TodoWrite
+
+  **todos:**
+
+  ```json
+  [
+    "broken_todo",
+    {
+      "id": "2",
+      "content": "Implement core functionality",
+      "status": "in_progress",
+      "priority": "high"
+    },
+    {
+      "id": "3",
+      "content": "Add comprehensive tests",
+      "status": "pending",
+      "priority": "medium"
+    },
+    {
+      "id": "4",
+      "content": "Write user documentation",
+      "status": "pending",
+      "priority": "low"
+    },
+    {
+      "id": "5",
+      "content": "Perform code review",
+      "status": "pending",
+      "priority": "medium"
+    }
+  ]
+  ```
 # ---
 # name: TestTranscriptMarkdownSnapshots.test_multi_session_markdown
diff --git a/test/test_image_export.py b/test/test_image_export.py
new file mode 100644
index 00000000..71482417
--- /dev/null
+++ b/test/test_image_export.py
@@ -0,0 +1,88 @@
+"""Tests for image_export.py."""
+
+import pytest
+from pathlib import Path
+
+from claude_code_log.image_export import export_image
+from claude_code_log.models import ImageContent, ImageSource
+
+
+@pytest.fixture
+def sample_image() -> ImageContent:
+    """Create a sample ImageContent for testing."""
+    # Minimal valid PNG: 1x1 transparent pixel
+    png_data = (
+        "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR4nGNgYGBgAAAABQABpfZFQAAAAA"
+        "BJRU5ErkJggg=="
+    )
+    return ImageContent(
+        type="image",
+        source=ImageSource(type="base64", media_type="image/png", data=png_data),
+    )
+
+
+class TestExportImagePlaceholder:
+    """Tests for placeholder mode."""
+
+    def test_placeholder_returns_none(self, sample_image: ImageContent):
+        """Placeholder mode returns None (caller renders placeholder text)."""
+        result = export_image(sample_image, mode="placeholder")
+        assert result is None
+
+
+class TestExportImageEmbedded:
+    """Tests for embedded mode."""
+
+    def test_embedded_returns_data_url(self, sample_image: ImageContent):
+        """Embedded mode returns data URL."""
+        result = export_image(sample_image, mode="embedded")
+        assert result is not None
+        assert result.startswith("data:image/png;base64,")
+
+
+class TestExportImageReferenced:
+    """Tests for referenced mode."""
+
+    def test_referenced_without_output_dir_returns_none(
+        self, sample_image: ImageContent
+    ):
+        """Referenced mode without output_dir returns None."""
+        result = export_image(sample_image, mode="referenced", output_dir=None)
+        assert result is None
+
+    def test_referenced_creates_image_file(
+        self, sample_image: ImageContent, tmp_path: Path
+    ):
+        """Referenced mode creates image file and returns relative path."""
+        result = export_image(
+            sample_image,
+            mode="referenced",
+            output_dir=tmp_path,
+            counter=1,
+        )
+
+        assert result == "images/image_0001.png"
+        assert (tmp_path / "images" / "image_0001.png").exists()
+
+    def test_referenced_with_different_counter(
+        self, sample_image: ImageContent, tmp_path: Path
+    ):
+        """Referenced mode uses counter for filename."""
+        result = export_image(
+            sample_image,
+            mode="referenced",
+            output_dir=tmp_path,
+            counter=42,
+        )
+
+        assert result == "images/image_0042.png"
+        assert (tmp_path / "images" / "image_0042.png").exists()
+
+
+class TestExportImageUnsupportedMode:
+    """Tests for unsupported mode."""
+
+    def test_unsupported_mode_returns_none(self, sample_image: ImageContent):
+        """Unsupported mode returns None."""
+        result = export_image(sample_image, mode="unknown_mode")
+        assert result is None
diff --git a/test/test_sidechain_agents.py b/test/test_sidechain_agents.py
index ff0ffa17..d0423b6d 100644
--- a/test/test_sidechain_agents.py
+++ b/test/test_sidechain_agents.py
@@ -51,7 +51,7 @@ def test_agent_insertion():
 
 
 def test_deduplication_task_result_vs_sidechain():
-    """Test that sidechain assistant final message is deduplicated when it matches Task result."""
+    """Test that sidechain assistant final message is dropped when it matches Task result."""
     with tempfile.TemporaryDirectory() as tmpdir:
         tmpdir_path = Path(tmpdir)
 
@@ -71,13 +71,7 @@ def test_deduplication_task_result_vs_sidechain():
         html = generate_html(messages, title="Test")
 
         # Verify deduplication occurred:
-        # The sidechain assistant's final message should be replaced with a forward link
-        assert "Task summary" in html
-        assert "see result above" in html
-
-        # Verify the dedup notice has an anchor link to the Task result
-        assert 'href="#msg-' in html
-
+        # The sidechain assistant's final message should be dropped entirely
         # The actual content "I created the test file successfully" should only appear once
         # in the Task result, not in the sidechain assistant
         content_count = html.count("I created the test file successfully")
diff --git a/test/test_todowrite_rendering.py b/test/test_todowrite_rendering.py
index e3577e07..95d994f4 100644
--- a/test/test_todowrite_rendering.py
+++ b/test/test_todowrite_rendering.py
@@ -15,6 +15,7 @@
     TodoWriteItem,
     ToolUseMessage,
 )
+from claude_code_log.renderer import TemplateMessage
 
 
 class TestTodoWriteRendering:
@@ -230,8 +231,10 @@ def test_todowrite_vs_regular_tool_use(self):
         # Test both through the HtmlRenderer
         renderer = HtmlRenderer()
 
-        regular_html = renderer.format_ToolUseMessage(regular_tool)
-        todowrite_html = renderer.format_ToolUseMessage(todowrite_tool)
+        regular_msg = TemplateMessage(regular_tool)
+        todowrite_msg = TemplateMessage(todowrite_tool)
+        regular_html = renderer.format_ToolUseMessage(regular_tool, regular_msg)
+        todowrite_html = renderer.format_ToolUseMessage(todowrite_tool, todowrite_msg)
 
         # Edit tool should use diff formatting (not table)
         assert "edit-diff" in regular_html
diff --git a/test/test_tool_result_image_rendering.py b/test/test_tool_result_image_rendering.py
index b3b9b999..890011c1 100644
--- a/test/test_tool_result_image_rendering.py
+++ b/test/test_tool_result_image_rendering.py
@@ -1,7 +1,15 @@
-"""Test image rendering within tool results."""
+"""Test image rendering within tool results and assistant messages."""
 
 from claude_code_log.html.tool_formatters import format_tool_result_content_raw
-from claude_code_log.models import ToolResultContent
+from claude_code_log.html.assistant_formatters import format_assistant_text_content
+from claude_code_log.models import (
+    AssistantTextMessage,
+    ImageContent,
+    ImageSource,
+    MessageMeta,
+    TextContent,
+    ToolResultContent,
+)
 
 
 def test_tool_result_with_image():
@@ -153,3 +161,112 @@ def test_tool_result_structured_text_only():
 
     # Should not be treated as having images
     assert "Text and image content" not in html
+
+
+# =============================================================================
+# Assistant Message Image Tests
+# =============================================================================
+# These tests prepare for future image generation capabilities where Claude
+# might return ImageContent in assistant messages.
+
+
+def _make_meta() -> MessageMeta:
+    """Create a minimal MessageMeta for testing."""
+    return MessageMeta(
+        session_id="test-session",
+        timestamp="2024-01-01T00:00:00Z",
+        uuid="test-uuid",
+    )
+
+
+def test_assistant_message_with_image():
+    """Test assistant message containing an image (future image generation)."""
+    sample_image_data = "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mP8z8DwHwAFBQIAX8jx0gAAAABJRU5ErkJggg=="
+
+    content = AssistantTextMessage(
+        _make_meta(),
+        items=[
+            ImageContent(
+                type="image",
+                source=ImageSource(
+                    type="base64",
+                    media_type="image/png",
+                    data=sample_image_data,
+                ),
+            ),
+        ],
+    )
+
+    html = format_assistant_text_content(content)
+
+    # Should contain the image with proper data URL
+    assert "