Support distributed tracing #6

AraiYuno · 2025-12-25T03:04:40Z

Distributed Tracing for Claude Code

Problem Statement

The current trace-claude-code skill creates standalone traces for each Claude Code session. However, in production environments, Claude Code often runs as part of a larger system:

- Agent orchestrators that invoke Claude Code as a sub-agent
- Slack bots that spawn Claude Code sessions to handle user requests
- CI/CD pipelines that use Claude Code for automated code changes
- Multi-agent systems where Claude Code is one of several agents

In these scenarios, operators need end-to-end visibility across the entire request lifecycle, not just isolated Claude Code traces. The Claude Code spans should nest under their parent orchestrator's trace.

Current (isolated):                    Desired (nested):
──────────────────                     ─────────────────
[Orchestrator Trace]                   [Orchestrator Trace]
  └── ...                                └── invoke_claude_code
                                              └── [Claude Code Session]
[Claude Code Trace]  (separate)                    ├── Turn 1
  ├── Turn 1                                       │   └── Tool calls...
  └── Turn 2                                       └── Turn 2

Requirements

- **Dynamic parent span injection**: Pass parent span context to Claude Code at session start (not via static environment variables)
- **Transparent nesting**: Claude Code traces appear as children of the orchestrator's span in the Braintrust UI
- **Backward compatible**: Standalone mode continues to work when no parent context is provided
- **SDK-agnostic**: Works regardless of how the parent span was created (Python SDK, TypeScript SDK, REST API)
- **No Claude Code modifications**: Solution works with hooks, not Claude Code internals

Proposed Solution: Session Context File

Overview

Introduce a session context file that orchestrators write before starting Claude Code. The hooks read this file to determine parent span context.

┌─────────────────────────────────────────────────────────────────────┐
│                    /tmp/braintrust_session.json                     │
│                                                                     │
│  {                                                                  │
│    "parent_span": "<exported_span_string>",                         │
│    "project": "my-project"                                          │
│  }                                                                  │
└─────────────────────────────────────────────────────────────────────┘
                          │
          ┌───────────────┴───────────────┐
          ▼                               ▼
┌─────────────────────┐       ┌─────────────────────┐
│    Orchestrator     │       │   Claude Code Hooks │
│  (writes context)   │       │   (reads context)   │
└─────────────────────┘       └─────────────────────┘

Why a File (Not Environment Variables)?

| Approach | Limitation |
|----------|------------|
| Environment variables | Static at process start; can't change per-session in containerized deployments |
| HTTP header | Claude Code doesn't expose hook access to incoming request headers |
| Stdin injection | Hooks receive Claude Code's internal event format, not custom data |
| Session context file | Dynamic, per-session, readable by hooks, writable by any orchestrator |

Session Context Schema

{
  "parent_span": "<braintrust_exported_span_string>",
  "project": "optional-project-override"
}

| Field | Required | Description |
|-------|----------|-------------|
| `parent_span` | Yes | Braintrust SDK exported span string (from `span.export()`) |
| `project` | No | Override project name (otherwise uses `BRAINTRUST_CC_PROJECT` or default) |

File Location

Default: /tmp/braintrust_session.json

Configurable via: BRAINTRUST_SESSION_CONTEXT_FILE environment variable
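
As a rough illustration of the lookup described above, the following Python sketch resolves the file path from `BRAINTRUST_SESSION_CONTEXT_FILE`, falls back to the default path, and treats a missing or malformed file as standalone mode. The helper name `load_session_context` is hypothetical and exists only for this example; the actual hooks are shell scripts.

```python
import json
import os

DEFAULT_CONTEXT_FILE = "/tmp/braintrust_session.json"


def load_session_context(env=None):
    """Return (parent_span, project), or (None, None) for standalone mode.

    Hypothetical helper that mirrors the schema above; not the real hook code.
    """
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the default path.
    path = env.get("BRAINTRUST_SESSION_CONTEXT_FILE") or DEFAULT_CONTEXT_FILE
    try:
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):
        # Missing, unreadable, or malformed file: behave as standalone.
        return None, None
    if not isinstance(data, dict):
        return None, None
    parent_span = data.get("parent_span")
    if not isinstance(parent_span, str) or not parent_span:
        # parent_span is the only required field.
        return None, None
    project = data.get("project")
    if not isinstance(project, str) or not project:
        project = None
    return parent_span, project
```

Returning `(None, None)` on every failure path keeps the backward-compatibility requirement simple: callers only need to check whether a parent span was found.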

Implementation Plan

Phase 1: Hook Changes

1.1 Update `common.sh`

Add function to read and parse session context:

SESSION_CONTEXT_FILE="${BRAINTRUST_SESSION_CONTEXT_FILE:-/tmp/braintrust_session.json}"

# Parse Braintrust SDK exported span format (SpanComponentsV3)
# Returns: span_id root_span_id (space-separated)
parse_exported_span() {
    local exported="$1"
    [ -z "$exported" ] && return 1

    # Pass the value as an argument rather than interpolating it into the
    # Python source, so quotes in the input cannot break or inject code.
    python3 -c "
import base64, sys
from uuid import UUID

try:
    data = base64.b64decode(sys.argv[1])
    num_uuids = data[2]
    uuids = {}
    offset = 3
    for _ in range(num_uuids):
        field_id = data[offset]
        uuid_bytes = data[offset + 1:offset + 17]
        uuids[field_id] = str(UUID(bytes=uuid_bytes))
        offset += 17
    span_id = uuids.get(3, '')      # SPAN_ID field
    root_span_id = uuids.get(4, '') # ROOT_SPAN_ID field
    if not (span_id and root_span_id):
        sys.exit(1)                 # fail explicitly when either ID is missing
    print(f'{span_id} {root_span_id}')
except Exception:
    sys.exit(1)
" "$exported"
}

# Read parent span info from session context
# Sets: PARENT_SPAN_ID, TRACE_ROOT_ID, CONTEXT_PROJECT
get_session_context() {
    PARENT_SPAN_ID=""
    TRACE_ROOT_ID=""
    CONTEXT_PROJECT=""

    if [ -f "$SESSION_CONTEXT_FILE" ]; then
        local exported project
        exported=$(jq -r '.parent_span // empty' "$SESSION_CONTEXT_FILE" 2>/dev/null)
        project=$(jq -r '.project // empty' "$SESSION_CONTEXT_FILE" 2>/dev/null)

        if [ -n "$exported" ]; then
            local parsed
            parsed=$(parse_exported_span "$exported")
            if [ -n "$parsed" ]; then
                PARENT_SPAN_ID=$(echo "$parsed" | cut -d' ' -f1)
                TRACE_ROOT_ID=$(echo "$parsed" | cut -d' ' -f2)
            fi
        fi

        [ -n "$project" ] && CONTEXT_PROJECT="$project"
    fi
}

1.2 Update `session_start.sh`

Modify to nest under parent span when context is available:

get_session_context

if [ -n "$PARENT_SPAN_ID" ] && [ -n "$TRACE_ROOT_ID" ]; then
    # Nested mode: Claude Code session is child of external span
    ROOT_SPAN_ID="$TRACE_ROOT_ID"
    SESSION_PARENT="$PARENT_SPAN_ID"
    debug "Nesting under parent span: $PARENT_SPAN_ID (root: $TRACE_ROOT_ID)"

    EVENT=$(jq -n \
        --arg span_id "$SESSION_SPAN_ID" \
        --arg root_span_id "$ROOT_SPAN_ID" \
        --arg parent "$SESSION_PARENT" \
        ... \
        '{
            span_id: $span_id,
            root_span_id: $root_span_id,
            span_parents: [$parent],
            ...
        }')
else
    # Standalone mode: Claude Code session is its own root
    ROOT_SPAN_ID="$SESSION_SPAN_ID"
    debug "Creating standalone session (no parent context)"

    EVENT=$(jq -n \
        --arg span_id "$SESSION_SPAN_ID" \
        --arg root_span_id "$ROOT_SPAN_ID" \
        ... \
        '{
            span_id: $span_id,
            root_span_id: $root_span_id,
            ...
        }')
fi

1.3 Update Other Hooks

Ensure user_prompt_submit.sh, post_tool_use.sh, and stop_hook.sh use the correct root_span_id from session state (which now may be an external trace root).
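
To make the span relationships concrete, here is a small Python sketch of the event shapes involved. The function names are hypothetical, only the `span_id`, `root_span_id`, and `span_parents` fields from the snippet above are modeled, and attaching child spans directly to the session span is an assumption based on the diagram in the problem statement.

```python
def build_session_event(session_span_id, parent_span_id=None, trace_root_id=None):
    """Build the session span's ID fields for nested or standalone mode."""
    if parent_span_id and trace_root_id:
        # Nested mode: join the external trace and point at the parent span.
        return {
            "span_id": session_span_id,
            "root_span_id": trace_root_id,
            "span_parents": [parent_span_id],
        }
    # Standalone mode: the session span is its own trace root.
    return {"span_id": session_span_id, "root_span_id": session_span_id}


def build_child_event(span_id, session_event):
    """Build ID fields for a span under the session (assumed: a turn span).

    The root is copied from session state, so it may be an external root.
    """
    return {
        "span_id": span_id,
        "root_span_id": session_event["root_span_id"],
        "span_parents": [session_event["span_id"]],
    }
```

The point of the second function is that later hooks never recompute the root: they reuse whatever `root_span_id` the session start stored, which is what keeps the whole session inside the orchestrator's trace.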

Phase 2: Documentation

2.1 Update SKILL.md

Add section on distributed tracing:

## Distributed Tracing

To nest Claude Code traces under a parent orchestrator span:

### 1. Create session context file

Before starting Claude Code, write the parent span context:

```bash
# Using Braintrust Python SDK
python3 << 'EOF'
import json
import subprocess
import braintrust

logger = braintrust.init_logger(project="my-project")
with logger.start_span(name="orchestrator") as span:
    # Write context for Claude Code
    with open("/tmp/braintrust_session.json", "w") as f:
        json.dump({"parent_span": span.export()}, f)

    # Now start Claude Code (it will read the context)
    subprocess.run(["claude", "-p", "..."])
EOF
```

### 2. Start Claude Code

Claude Code hooks automatically detect the session context and nest traces accordingly.

### 3. View nested traces

In Braintrust, the Claude Code session appears as a child of your orchestrator span.


#### 2.2 Add Integration Examples

Create examples for common orchestrator patterns:

- Python SDK orchestrator
- TypeScript/Node orchestrator
- Shell script orchestrator
- Docker/container orchestrator

### Phase 3: Testing

#### 3.1 Unit Tests

- Parse various exported span formats (v2, v3)
- Handle missing/malformed session context gracefully
- Verify span relationships are correct

#### 3.2 Integration Tests

- End-to-end test with Python orchestrator
- Verify traces appear nested in Braintrust UI
- Test backward compatibility (no context file = standalone mode)

## Exported Span Format Reference

The Braintrust SDK `span.export()` returns a base64-encoded `SpanComponentsV3` structure:

Byte 0: Version (3)
Byte 1: Object type (2 = PROJECT_LOGS)
Byte 2: Number of UUID fields
Bytes 3+: For each UUID: 1-byte field_id + 16-byte UUID
Remaining: JSON metadata

Field IDs:
1 = OBJECT_ID
2 = ROW_ID
3 = SPAN_ID ← Use for span_parents
4 = ROOT_SPAN_ID ← Use for root_span_id


The hooks must parse this format to extract `SPAN_ID` (for `span_parents`) and `ROOT_SPAN_ID` (for `root_span_id`).
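
The byte layout above can be exercised without the SDK. The Python sketch below parses a payload in that layout and includes a hypothetical `make_test_payload` helper that builds a synthetic payload for testing. It relies only on the layout as described in this section and has not been checked against strings produced by the real `span.export()`.

```python
import base64
import uuid

SPAN_ID_FIELD = 3
ROOT_SPAN_ID_FIELD = 4


def parse_exported_span(exported):
    """Return (span_id, root_span_id) from a base64 string in the layout above."""
    data = base64.b64decode(exported)
    if len(data) < 3:
        raise ValueError("payload too short")
    num_uuids = data[2]
    uuids = {}
    offset = 3
    for _ in range(num_uuids):
        if offset + 17 > len(data):
            raise ValueError("truncated UUID entry")
        field_id = data[offset]
        uuids[field_id] = str(uuid.UUID(bytes=data[offset + 1:offset + 17]))
        offset += 17
    if SPAN_ID_FIELD not in uuids or ROOT_SPAN_ID_FIELD not in uuids:
        raise ValueError("payload lacks SPAN_ID or ROOT_SPAN_ID")
    return uuids[SPAN_ID_FIELD], uuids[ROOT_SPAN_ID_FIELD]


def make_test_payload(span_id, root_span_id):
    """Build a synthetic payload for tests; not produced by the real SDK."""
    body = bytes([3, 2, 2])  # version 3, PROJECT_LOGS, two UUID fields
    body += bytes([SPAN_ID_FIELD]) + span_id.bytes
    body += bytes([ROOT_SPAN_ID_FIELD]) + root_span_id.bytes
    body += b"{}"  # trailing JSON metadata, ignored by the parser
    return base64.b64encode(body).decode("ascii")
```

A round trip such as `parse_exported_span(make_test_payload(uuid.uuid4(), uuid.uuid4()))` returns the two UUIDs as strings, which gives the unit tests in Phase 3 a fixture that does not depend on a live Braintrust project.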

## Configuration Reference

| Variable | Default | Description |
|----------|---------|-------------|
| `BRAINTRUST_SESSION_CONTEXT_FILE` | `/tmp/braintrust_session.json` | Path to session context file |
| `TRACE_TO_BRAINTRUST` | - | Enable tracing (`true`/`false`) |
| `BRAINTRUST_API_KEY` | - | Braintrust API key |
| `BRAINTRUST_CC_PROJECT` | `claude-code` | Default project name |

## Migration Guide

### For Existing Users

No changes required. Without a session context file, hooks behave exactly as before (standalone traces).

### For Orchestrator Developers

1. Before spawning Claude Code, create a span in your Braintrust logger
2. Write `{"parent_span": span.export()}` to `/tmp/braintrust_session.json`
3. Start Claude Code
4. (Optional) Clean up the context file after Claude Code exits
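
The four steps above can be wrapped in a context manager so that cleanup happens even if the Claude Code process fails. This is a minimal sketch: `session_context_file` is a hypothetical helper, and the write-then-rename step is an added precaution (so a hook never reads a half-written file), not something the proposal requires.

```python
import contextlib
import json
import os
import tempfile


@contextlib.contextmanager
def session_context_file(exported_span, path="/tmp/braintrust_session.json", project=None):
    """Write the session context file, yield its path, then remove it."""
    payload = {"parent_span": exported_span}
    if project:
        payload["project"] = project
    directory = os.path.dirname(path) or "."
    # Write to a temporary file in the same directory, then rename into place.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(payload, f)
        os.replace(tmp_path, path)
    except BaseException:
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
        raise
    try:
        yield path
    finally:
        # Step 4: remove the file so later sessions do not pick it up.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(path)
```

An orchestrator would pass the result of `span.export()` as `exported_span` and start Claude Code inside the `with` block.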

## Security Considerations

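- The session context file should live in a directory accessible only to the orchestrator and the Claude Code process; the default `/tmp` location is typically shared with other users, so restrict the file's permissions if you keep it there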
- In multi-tenant environments, use unique file paths per session: `BRAINTRUST_SESSION_CONTEXT_FILE=/tmp/session_${SESSION_ID}.json`
- Consider cleaning up context files after use to prevent stale context from affecting subsequent sessions
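
One way to follow the first two recommendations is sketched below, assuming a Python orchestrator. `tempfile.mkstemp` creates a uniquely named file that, on POSIX systems, is readable and writable only by the creating user. The helper name `create_private_context` and the file name prefix are hypothetical.

```python
import json
import os
import tempfile


def create_private_context(exported_span, base_env=None):
    """Create a per-session context file and the environment that points to it.

    Returns (path, env). The caller is responsible for deleting the file.
    """
    fd, path = tempfile.mkstemp(prefix="braintrust_session_", suffix=".json")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"parent_span": exported_span}, f)
    # Copy the environment so the parent process's variables stay untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["BRAINTRUST_SESSION_CONTEXT_FILE"] = path
    return path, env
```

The returned `env` can be passed to `subprocess.run(..., env=env)` when starting Claude Code, so concurrent sessions each read their own file.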

## Future Enhancements

1. **HTTP header propagation**: If Claude Code exposes request headers to hooks, support `X-Braintrust-Parent-Span` header
2. **OpenTelemetry context**: Parse W3C trace context format for broader compatibility
3. **Automatic cleanup**: Hooks could delete the context file after reading to prevent stale context
4. **Project inheritance**: Automatically use the parent span's project instead of requiring separate configuration
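
For the OpenTelemetry item, parsing a W3C `traceparent` header is straightforward. The sketch below handles the version `00` layout (`version-traceid-parentid-flags`) and rejects anything longer, which is a simplification. How the resulting IDs would map onto Braintrust span IDs is not defined in this proposal, so that step is left out.

```python
import re

# version (2 hex) - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header):
    """Parse a W3C traceparent header; return a dict, or None if invalid."""
    match = _TRACEPARENT_RE.match(header.strip())
    if not match:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the specification.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }
```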

ankrgyl · 2025-12-26T17:13:40Z

skills/trace-claude-code/hooks/common.sh

+    [ -z "$exported" ] && return 1
+
+    local result
+    result=$(python3 -c "


Can we do this without Python? Right now there's no requirement that you have a working Python interpreter, and ideally we'd like to keep it that way.

support distributed bt spans

59b9a5b

ankrgyl reviewed Dec 26, 2025
