27 changes: 25 additions & 2 deletions CLAUDE.md
@@ -1,13 +1,15 @@
# Claude Code Log

A Python CLI tool that converts Claude transcript JSONL files into readable HTML format.
A Python CLI tool that converts Claude Code transcript JSONL files into readable HTML, text, markdown, or chat format.

## Project Overview

This tool processes Claude Code transcript files (stored as JSONL) and generates clean, minimalist HTML pages with comprehensive session navigation and token usage tracking. It's designed to create a readable log of your Claude interactions with rich metadata and easy navigation.
This tool processes Claude Code transcript files (stored as JSONL) and generates clean, readable output in multiple formats with comprehensive session navigation and token usage tracking. It supports HTML for browser viewing, verbose text for detailed analysis, markdown for documentation, and compact chat format for quick conversation review.

## Key Features

- **Multiple Output Formats**: Generate HTML, plain text, markdown, or compact chat format from transcript files
- **Stdin Piping Support**: Pipe JSONL data directly for use in CI/CD pipelines and automation
- **Interactive TUI (Terminal User Interface)**: Browse and manage Claude Code sessions with real-time navigation, summaries, and quick actions for HTML export and session resuming
- **Project Hierarchy Processing**: Process entire `~/.claude/projects/` directory with linked index page
- **Individual Session Files**: Generate separate HTML files for each session with navigation links
@@ -113,10 +115,31 @@ claude-code-log /path/to/directory --from-date "yesterday" --to-date "today"
claude-code-log /path/to/directory --from-date "3 days ago" --to-date "yesterday"
```

### Text, Markdown, and Chat Output

Generate non-HTML formats for documentation or quick review:

```bash
# Verbose text format (timestamps, tokens, full tool details)
claude-code-log /path/to/directory --format text -o output.txt

# Markdown format (for documentation)
claude-code-log /path/to/directory --format markdown -o output.md

# Compact chat format (clean conversation, like Claude Code UI)
claude-code-log /path/to/directory --format chat -o chat.txt
```

**Format Comparison:**
- **text**: Verbose with timestamps, token usage, working directories
- **markdown**: Same as text with markdown heading hierarchy
- **chat**: Compact conversation flow with tool symbols (⏺ for tool use, ⎿ for results)

## File Structure

- `claude_code_log/parser.py` - Data extraction and parsing from JSONL files
- `claude_code_log/renderer.py` - HTML generation and template rendering
- `claude_code_log/text_renderer.py` - Plain text and markdown rendering
- `claude_code_log/converter.py` - High-level conversion orchestration
- `claude_code_log/cli.py` - Command-line interface with project discovery
- `claude_code_log/models.py` - Pydantic models for transcript data structures
33 changes: 33 additions & 0 deletions README.md
@@ -28,6 +28,7 @@ uvx claude-code-log@latest --open-browser

## Key Features

- **Multiple Output Formats**: Generate HTML, plain text, markdown, or compact chat output from transcript files
- **Interactive TUI (Terminal User Interface)**: Browse and manage Claude Code sessions with real-time navigation, summaries, and quick actions for HTML export and session resuming
- **Project Hierarchy Processing**: Process entire `~/.claude/projects/` directory with linked index page
- **Individual Session Files**: Generate separate HTML files for each session with navigation links
@@ -136,10 +137,42 @@ claude-code-log /path/to/directory --from-date "yesterday" --to-date "today"
claude-code-log /path/to/directory --from-date "3 days ago" --to-date "yesterday"
```

### Text, Markdown, and Chat Output

Convert transcripts to plain text, markdown, or compact chat format for documentation or terminal viewing:

```bash
# Generate plain text output (verbose with timestamps, token usage)
claude-code-log /path/to/directory --format text -o output.txt

# Generate markdown output
claude-code-log /path/to/directory --format markdown -o output.md

# Generate compact chat format (clean conversation flow)
claude-code-log /path/to/directory --format chat -o chat.txt

# Single file with chat format (most readable)
claude-code-log transcript.jsonl --format chat
```

**Format Comparison:**

- **text**: Verbose format with timestamps, token usage, working directories, and full tool details
- **markdown**: Same as text but with markdown heading hierarchy for better document integration
- **chat**: Compact format mimicking Claude Code UI - clean conversation flow with tool use symbols (⏺) and truncated results (⎿)

**Format Features:**
- Session headers with IDs and summaries (text and markdown only)
- User and assistant message separation (all formats)
- Tool use and tool result rendering (all formats)
- Thinking content blocks (text and markdown only)
- Clean, minimal output suited to quick review (chat only)
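
The CLI help text in this change says the default output path is "input file with appropriate extension based on format". A minimal sketch of that idea for a single input file follows. The mapping and the name `default_output_path` are assumptions inferred from the example commands above (`output.txt`, `output.md`, `chat.txt`), not the project's actual implementation, and directory inputs are not covered.

```python
from pathlib import Path

# Assumed mapping: the examples use .txt for text and chat, and .md for markdown.
EXTENSIONS = {"html": ".html", "text": ".txt", "markdown": ".md", "chat": ".txt"}


def default_output_path(input_path: Path, output_format: str) -> Path:
    """Return the input file path with its suffix swapped for the chosen format."""
    # Lowercase the format because the CLI option is declared case-insensitive.
    return input_path.with_suffix(EXTENSIONS[output_format.lower()])
```

Under these assumptions, `default_output_path(Path("transcript.jsonl"), "markdown")` yields `transcript.md`.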

## File Structure

- `claude_code_log/parser.py` - Data extraction and parsing from JSONL files
- `claude_code_log/renderer.py` - HTML generation and template rendering
- `claude_code_log/text_renderer.py` - Plain text and markdown rendering
- `claude_code_log/converter.py` - High-level conversion orchestration
- `claude_code_log/cli.py` - Command-line interface with project discovery
- `claude_code_log/models.py` - Pydantic models for transcript data structures
44 changes: 35 additions & 9 deletions claude_code_log/cli.py
@@ -10,7 +10,11 @@
import click
from git import Repo, InvalidGitRepositoryError

from .converter import convert_jsonl_to_html, process_projects_hierarchy
from .converter import (
    convert_jsonl_to_html,
    convert_jsonl_to_output,
    process_projects_hierarchy,
)
from .cache import CacheManager, get_library_version


@@ -338,12 +342,20 @@ def _clear_html_files(input_path: Path, all_projects: bool) -> None:
    "-o",
    "--output",
    type=click.Path(path_type=Path),
    help="Output HTML file path (default: input file with .html extension or combined_transcripts.html for directories)",
    help="Output file path (default: input file with appropriate extension based on format)",
)
@click.option(
    "-f",
    "--format",
    "output_format",
    type=click.Choice(["html", "text", "markdown", "chat"], case_sensitive=False),
    default="html",
    help="Output format: html, text, markdown, or chat (default: html)",
)
@click.option(
    "--open-browser",
    is_flag=True,
    help="Open the generated HTML file in the default browser",
    help="Open the generated HTML file in the default browser (only works with HTML format)",
)
@click.option(
    "--from-date",
@@ -358,12 +370,12 @@ def _clear_html_files(input_path: Path, all_projects: bool) -> None:
@click.option(
    "--all-projects",
    is_flag=True,
    help="Process all projects in ~/.claude/projects/ hierarchy and create linked HTML files",
    help="Process all projects in ~/.claude/projects/ hierarchy and create linked files",
)
@click.option(
    "--no-individual-sessions",
    is_flag=True,
    help="Skip generating individual session HTML files (only create combined transcript)",
    help="Skip generating individual session files (only create combined transcript)",
)
@click.option(
    "--no-cache",
@@ -388,6 +400,7 @@ def _clear_html_files(input_path: Path, all_projects: bool) -> None:
def main(
    input_path: Optional[Path],
    output: Optional[Path],
    output_format: str,
    open_browser: bool,
    from_date: Optional[str],
    to_date: Optional[str],
@@ -398,15 +411,27 @@ def main(
    clear_html: bool,
    tui: bool,
) -> None:
    """Convert Claude transcript JSONL files to HTML.
    """Convert Claude transcript JSONL files to HTML, text, or markdown.

    INPUT_PATH: Path to a Claude transcript JSONL file, directory containing JSONL files, or project path to convert. If not provided, defaults to ~/.claude/projects/ and --all-projects is used.
    """
    # Configure logging to show warnings and above
    logging.basicConfig(level=logging.WARNING, format="%(levelname)s: %(message)s")

    try:
        # Handle TUI mode
        # Validate incompatible options
        if output_format.lower() != "html" and tui:
            click.echo("Error: TUI mode only works with HTML format", err=True)
            sys.exit(1)

        if output_format.lower() != "html" and open_browser:
            click.echo("Warning: --open-browser only works with HTML format", err=True)

        if output_format.lower() != "html" and all_projects:
            click.echo("Error: --all-projects only works with HTML format", err=True)
            sys.exit(1)

Comment on lines 408 to +433
⚠️ Potential issue | 🟠 Major

Non-HTML default run still uses all-projects HTML path

Right now the all-projects guard for non-HTML formats only checks the explicit `--all-projects` flag:

if output_format.lower() != "html" and all_projects:
    ...

But later, when `input_path` is `None`, you implicitly set:

if input_path is None:
    input_path = Path.home() / ".claude" / "projects"
    all_projects = True

This means `claude-code-log --format text` (no `input_path`) will still go through `process_projects_hierarchy(...)` and generate HTML index files, ignoring the requested text format.

To align behavior with the intended restriction ("`--all-projects` only works with HTML format"), re-validate after you default `all_projects`:

@@
-    # Handle default case - process all projects hierarchy if no input path and --all-projects flag
-    if input_path is None:
-        input_path = Path.home() / ".claude" / "projects"
-        all_projects = True
+    # Handle default case - process all projects hierarchy if no input path
+    if input_path is None:
+        input_path = Path.home() / ".claude" / "projects"
+        all_projects = True
+
+    # After defaulting to all_projects, ensure non-HTML formats are rejected
+    if all_projects and output_format.lower() != "html":
+        click.echo(
+            "Error: --all-projects only works with HTML format", err=True
+        )
+        sys.exit(1)

This preserves the early guard for explicit `--all-projects` and also covers the implicit default case.

Also applies to: 529-575
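
One way to make this guard easy to test is to isolate the defaulting and the validation in a small pure function. The following is a minimal sketch rather than the project's code: `resolve_input` and `FormatError` are hypothetical names, and raising an exception stands in for the `click.echo(...)` plus `sys.exit(1)` used in `cli.py`.

```python
from pathlib import Path
from typing import Optional, Tuple


class FormatError(ValueError):
    """Hypothetical error for a format that is incompatible with the resolved mode."""


def resolve_input(
    input_path: Optional[Path], all_projects: bool, output_format: str
) -> Tuple[Path, bool]:
    """Apply the default input path first, then validate the format once."""
    # Default case: no input path implies the all-projects hierarchy.
    if input_path is None:
        input_path = Path.home() / ".claude" / "projects"
        all_projects = True

    # Validate after defaulting, so explicit and implicit all-projects runs
    # are rejected in the same way for non-HTML formats.
    if all_projects and output_format.lower() != "html":
        raise FormatError("--all-projects only works with HTML format")

    return input_path, all_projects
```

With this ordering, a call equivalent to `claude-code-log --format text` (no input path) fails the check instead of silently producing HTML.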

        # Handle TUI mode (HTML only)
        if tui:
            # Handle default case for TUI - use ~/.claude/projects if no input path
            if input_path is None:
@@ -571,19 +596,20 @@ def main(
f"Neither {input_path} nor {claude_path} exists"
)

        output_path = convert_jsonl_to_html(
        output_path = convert_jsonl_to_output(
            input_path,
            output,
            from_date,
            to_date,
            output_format,
            not no_individual_sessions,
            not no_cache,
        )
        if input_path.is_file():
            click.echo(f"Successfully converted {input_path} to {output_path}")
        else:
            jsonl_count = len(list(input_path.glob("*.jsonl")))
            if not no_individual_sessions:
            if output_format.lower() == "html" and not no_individual_sessions:
                session_files = list(input_path.glob("session-*.html"))
                click.echo(
                    f"Successfully combined {jsonl_count} transcript files from {input_path} to {output_path} and generated {len(session_files)} individual session files"
179 changes: 179 additions & 0 deletions claude_code_log/content_extractor.py
@@ -0,0 +1,179 @@
#!/usr/bin/env python3
"""Extract data from ContentItem objects without formatting.

This module provides shared content extraction logic used by both HTML and text renderers.
It separates data extraction from presentation formatting.
"""

import json
from typing import Any, Dict, List, Union, Optional
from dataclasses import dataclass

from .models import (
    ContentItem,
    TextContent,
    ToolUseContent,
    ToolResultContent,
    ThinkingContent,
    ImageContent,
)


@dataclass
class ExtractedText:
    """Extracted text content."""

    text: str


@dataclass
class ExtractedThinking:
    """Extracted thinking content."""

    thinking: str
    signature: Optional[str] = None


@dataclass
class ExtractedToolUse:
    """Extracted tool use content."""

    name: str
    id: str
    input: Dict[str, Any]


@dataclass
class ExtractedToolResult:
    """Extracted tool result content."""

    tool_use_id: str
    is_error: bool
    content: Union[str, List[Dict[str, Any]]]


@dataclass
class ExtractedImage:
    """Extracted image content."""

    media_type: str
    data: str


# Union type for all extracted content
ExtractedContent = Union[
    ExtractedText,
    ExtractedThinking,
    ExtractedToolUse,
    ExtractedToolResult,
    ExtractedImage,
]


def extract_content_data(content: ContentItem) -> Optional[ExtractedContent]:
    """Extract raw data from ContentItem without any formatting.

    Args:
        content: A ContentItem object (TextContent, ToolUseContent, etc.)

    Returns:
        Extracted data as a dataclass, or None if content type is unknown
    """
    # Handle TextContent
    if isinstance(content, TextContent) or (
        hasattr(content, "type") and getattr(content, "type") == "text"
    ):
        text = getattr(content, "text", str(content))
        return ExtractedText(text=text)

    # Handle ThinkingContent
    elif isinstance(content, ThinkingContent) or (
        hasattr(content, "type") and getattr(content, "type") == "thinking"
    ):
        thinking_text = getattr(content, "thinking", "")
        signature = getattr(content, "signature", None)
        return ExtractedThinking(thinking=thinking_text, signature=signature)

    # Handle ToolUseContent
    elif isinstance(content, ToolUseContent) or (
        hasattr(content, "type") and getattr(content, "type") == "tool_use"
    ):
        tool_name = getattr(content, "name", "unknown")
        tool_id = getattr(content, "id", "")
        tool_input = getattr(content, "input", {})
        return ExtractedToolUse(name=tool_name, id=tool_id, input=tool_input)

    # Handle ToolResultContent
    elif isinstance(content, ToolResultContent) or (
        hasattr(content, "type") and getattr(content, "type") == "tool_result"
    ):
        tool_use_id = getattr(content, "tool_use_id", "")
        is_error = getattr(content, "is_error", False)
        content_data = getattr(content, "content", "")
        return ExtractedToolResult(
            tool_use_id=tool_use_id, is_error=is_error, content=content_data
        )

    # Handle ImageContent
    elif isinstance(content, ImageContent) or (
        hasattr(content, "type") and getattr(content, "type") == "image"
    ):
        source = getattr(content, "source", {})
        media_type = (
            getattr(source, "media_type", "unknown")
            if hasattr(source, "media_type")
            else "unknown"
        )
        data = getattr(source, "data", "") if hasattr(source, "data") else ""
        return ExtractedImage(media_type=media_type, data=data)

    # Unknown content type
    return None


def format_tool_input_json(tool_input: Dict[str, Any], indent: int = 2) -> str:
    """Format tool input as indented JSON string.

    Args:
        tool_input: Tool input dictionary
        indent: Number of spaces for JSON indentation

    Returns:
        Formatted JSON string
    """
    return json.dumps(tool_input, indent=indent)


def is_text_content(content: ContentItem) -> bool:
    """Check if content is TextContent."""
    return isinstance(content, TextContent) or (
        hasattr(content, "type") and getattr(content, "type") == "text"
    )


def is_thinking_content(content: ContentItem) -> bool:
    """Check if content is ThinkingContent."""
    return isinstance(content, ThinkingContent) or (
        hasattr(content, "type") and getattr(content, "type") == "thinking"
    )


def is_tool_use_content(content: ContentItem) -> bool:
    """Check if content is ToolUseContent."""
    return isinstance(content, ToolUseContent) or (
        hasattr(content, "type") and getattr(content, "type") == "tool_use"
    )


def is_tool_result_content(content: ContentItem) -> bool:
    """Check if content is ToolResultContent."""
    return isinstance(content, ToolResultContent) or (
        hasattr(content, "type") and getattr(content, "type") == "tool_result"
    )


def is_image_content(content: ContentItem) -> bool:
    """Check if content is ImageContent."""
    return isinstance(content, ImageContent) or (
        hasattr(content, "type") and getattr(content, "type") == "image"
    )
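
The module docstring describes separating data extraction from presentation. As a rough illustration of how a renderer might consume the extracted dataclasses, here is a minimal, self-contained sketch. It redefines simplified stand-ins for three of the dataclasses so it runs on its own, and `render_chat_line` with its truncation rule is a hypothetical helper, not the PR's text renderer. The ⏺ and ⎿ symbols follow the chat format described in the README changes.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass
class ExtractedText:
    text: str


@dataclass
class ExtractedToolUse:
    name: str
    id: str
    input: Dict[str, Any]


@dataclass
class ExtractedToolResult:
    tool_use_id: str
    is_error: bool
    content: str  # simplified: the module above also allows a list of dicts


Extracted = Union[ExtractedText, ExtractedToolUse, ExtractedToolResult]


def render_chat_line(item: Extracted, max_len: int = 60) -> str:
    """Format one extracted item in a compact, chat-like style (hypothetical)."""
    if isinstance(item, ExtractedText):
        return item.text
    if isinstance(item, ExtractedToolUse):
        return f"⏺ {item.name}"
    # Remaining case is a tool result; truncate it to keep the output compact.
    body = item.content if len(item.content) <= max_len else item.content[:max_len] + "…"
    prefix = "⎿ error: " if item.is_error else "⎿ "
    return prefix + body
```

Because the renderer only sees plain dataclasses, the same extraction step could feed an HTML renderer and a text renderer without either one inspecting the original content objects.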