Commit c2fb5df

kabir and claude committed
docs: restructure AI documentation into .agents and .claude directories
- Move skills to .agents/skills for universal agent access
- Create .claude/architecture with EventQueue deep-dive docs
- Add CLAUDE.md symlink to AGENTS.md for auto-discovery
- Link .claude/skills to .agents/skills for Claude compatibility
- Protect .claude/settings.local.json in .gitignore

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
1 parent d11484b commit c2fb5df

File tree

9 files changed: +879 -4 lines changed

.claude/architecture/EVENTQUEUE.md

Lines changed: 151 additions & 0 deletions
@@ -0,0 +1,151 @@
1+
# EventQueue Architecture - A2A Java SDK
2+
3+
> **Quick Reference** for event processing, queue management, and task lifecycle
4+
5+
## Overview
6+
7+
The EventQueue architecture guarantees:
8+
1. **Events persist BEFORE clients see them** (no unpersisted events visible)
9+
2. **Serial processing** eliminates concurrent update race conditions
10+
3. **Task state drives queue lifecycle** (fire-and-forget support, late reconnections)
11+
12+
## Architecture Diagram
13+
14+
```
AgentExecutor.execute() [YOUR CODE]

AgentEmitter → MainQueue.enqueueEvent()

MainEventBus.submit() [ALL events queue here FIRST]

MainEventBusProcessor.take() [single background thread]

1. TaskStore.save() FIRST ← Persist before visibility
2. PushNotificationSender.send()
3. MainQueue.distributeToChildren() ← Clients see LAST

ChildQueue → EventConsumer → ResultAggregator → Client
```
29+
30+
**Key Insight**: All events flow through a single-threaded processor that persists events BEFORE distributing to clients.
31+
32+
---
33+
34+
## Core Components
35+
36+
### MainEventBus
37+
**Location**: `server-common/.../events/MainEventBus.java`
38+
39+
- `@ApplicationScoped` CDI bean - single instance shared by all MainQueues
40+
- `LinkedBlockingDeque<MainEventBusContext>` - thread-safe centralized queue
41+
- `submit(taskId, eventQueue, item)` - enqueue events (called by MainQueue)
42+
- `take()` - blocking consumption (called by MainEventBusProcessor)
43+
44+
**Guarantees**: Events persist BEFORE distribution, serial processing, push notifications AFTER persistence
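
The bus described above can be pictured as a thin wrapper around a `LinkedBlockingDeque`. The following is a minimal sketch, not the SDK's implementation: `BusContext` and `SketchEventBus` are hypothetical names, and the real `submit(taskId, eventQueue, item)` also carries the originating queue, which is omitted here for brevity.

```java
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical stand-in for MainEventBusContext: one submitted item and its task.
record BusContext(String taskId, Object item) {}

// Sketch of a centralized, thread-safe bus shared by all producers.
class SketchEventBus {
    private final LinkedBlockingDeque<BusContext> queue = new LinkedBlockingDeque<>();

    // Producer side: appends to the tail; an unbounded deque never blocks here.
    void submit(String taskId, Object item) {
        queue.add(new BusContext(taskId, item));
    }

    // Consumer side: blocks until an item is available, then removes the head.
    BusContext take() throws InterruptedException {
        return queue.take();
    }

    int size() {
        return queue.size();
    }
}
```

Because every producer appends to the same deque and a single consumer removes from the head, events leave the bus in the order they were submitted.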
45+
46+
### MainEventBusProcessor
47+
**Location**: `server-common/.../events/MainEventBusProcessor.java`
48+
49+
A single background thread named "MainEventBusProcessor" processes each event in a fixed order:
50+
1. `TaskManager.process(event)` → persist to TaskStore
51+
2. `PushNotificationSender.send()` → notifications
52+
3. `mainQueue.distributeToChildren()` → clients receive
53+
54+
**Exception Handling**: Converts `TaskStoreException` to `InternalError` events and continues processing subsequent events
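
The three-step ordering and the error conversion can be sketched as a small loop. This is an illustration under assumptions: `ProcessorSketch`, `Store`, and the string-based log are hypothetical, and the source does not say whether a push notification is still sent when persistence fails, so this sketch skips it.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

class ProcessorSketch {
    // Hypothetical stand-in for TaskStore; save() may fail.
    interface Store { void save(String event) throws Exception; }

    final BlockingQueue<String> bus = new LinkedBlockingQueue<>();
    final List<String> log = new CopyOnWriteArrayList<>(); // order of side effects
    private final Store store;

    ProcessorSketch(Store store) { this.store = store; }

    // Handles exactly one event: persist, then notify, then distribute.
    void processOne() throws InterruptedException {
        String event = bus.take();
        try {
            store.save(event);
            log.add("persist:" + event);
        } catch (Exception e) {
            // Persistence failed: hand clients an error event and keep the loop alive.
            log.add("distribute:error(" + event + ")");
            return;
        }
        log.add("notify:" + event);
        log.add("distribute:" + event);
    }

    // The single background thread repeats processOne() until interrupted.
    Thread start() {
        Thread t = new Thread(() -> {
            try {
                while (true) processOne();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "MainEventBusProcessor");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

A failed save affects only the event being processed; the next call to `processOne()` handles the following event normally.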
55+
56+
### EventQueue System
57+
**Location**: `server-common/.../events/EventQueue.java`
58+
59+
**Queue Types**:
60+
- **MainQueue**: No local queue - events submit directly to MainEventBus
61+
- **ChildQueue**: Has local queue for client consumption
62+
63+
**Characteristics**: Bounded (1000 events), thread-safe, graceful shutdown, hook support
64+
65+
### QueueManager
66+
**Location**: `server-common/.../events/QueueManager.java`
67+
68+
- `createOrTap(taskId)` → Get existing MainQueue or create new
69+
- `tap(taskId)` → Create ChildQueue for existing MainQueue
70+
- **Default**: InMemoryQueueManager (thread-safe ConcurrentHashMap)
71+
- **Replicated**: ReplicatedQueueManager (Kafka-based)
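
A rough sketch of the `createOrTap` / `tap` contract on top of a `ConcurrentHashMap` follows. The class and type names are hypothetical, and the sketch assumes that `createOrTap` hands back the MainQueue when it creates one and a ChildQueue when one already exists; the real `InMemoryQueueManager` may differ in its details.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicBoolean;

class QueueManagerSketch {
    interface SketchQueue {}

    static final class Main implements SketchQueue {
        final String taskId;
        Main(String taskId) { this.taskId = taskId; }
        Child tap() { return new Child(this); }
    }

    static final class Child implements SketchQueue {
        final Main parent;
        Child(Main parent) { this.parent = parent; }
    }

    private final ConcurrentMap<String, Main> queues = new ConcurrentHashMap<>();

    // New task: create and return the main queue. Known task: return a child of it.
    SketchQueue createOrTap(String taskId) {
        AtomicBoolean created = new AtomicBoolean(false);
        Main main = queues.computeIfAbsent(taskId, id -> {
            created.set(true);
            return new Main(id);
        });
        return created.get() ? main : main.tap();
    }

    // Returns a child of an existing main queue, or null when the task has no queue.
    Child tap(String taskId) {
        Main main = queues.get(taskId);
        return main == null ? null : main.tap();
    }
}
```

`computeIfAbsent` makes the create-or-lookup step atomic, so two concurrent requests for the same task cannot each create a separate main queue.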
72+
73+
### EventConsumer & ResultAggregator
74+
**Locations**: `server-common/.../events/EventConsumer.java`, `server-common/.../tasks/ResultAggregator.java`
75+
76+
**EventConsumer**: Polls queue, returns `Flow.Publisher<Event>`, closes queue on final event
77+
78+
**ResultAggregator** bridges EventConsumer and DefaultRequestHandler:
79+
- `consumeAndBreakOnInterrupt()` - Non-streaming (polls until terminal/AUTH_REQUIRED)
80+
- `consumeAndEmit()` - Streaming (returns Flow.Publisher immediately)
81+
- `consumeAll()` - Simple consumption
82+
83+
---
84+
85+
## Key Concepts
86+
87+
### Queue Structure
88+
- MainQueue has NO local queue (events → MainEventBus directly)
89+
- Only ChildQueues have local queues
90+
- `MainQueue.dequeueEventItem()` throws `UnsupportedOperationException`
91+
- `MainQueue.size()` returns `mainEventBus.size()`
92+
- `ChildQueue.size()` returns local queue size
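
The asymmetry between the two queue types can be illustrated with a small sketch. All names here are hypothetical, events are plain strings, and the shared `BlockingQueue` only stands in for the MainEventBus.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

class QueueStructureSketch {
    static class MainQ {
        private final BlockingQueue<String> bus; // stands in for MainEventBus
        private final List<ChildQ> children = new CopyOnWriteArrayList<>();

        MainQ(BlockingQueue<String> bus) { this.bus = bus; }

        // No local queue: events go straight to the shared bus.
        void enqueue(String event) { bus.add(event); }

        String dequeue() {
            throw new UnsupportedOperationException("MainQueue has no local queue");
        }

        int size() { return bus.size(); }

        ChildQ tap() {
            ChildQ child = new ChildQ();
            children.add(child);
            return child;
        }

        // Called by the processor once the event has been persisted.
        void distributeToChildren(String event) {
            for (ChildQ child : children) child.local.offer(event);
        }
    }

    static class ChildQ {
        // Bounded local queue that a client-facing consumer polls.
        final BlockingQueue<String> local = new LinkedBlockingQueue<>(1000);

        String dequeue() { return local.poll(); }

        int size() { return local.size(); }
    }
}
```

An event enqueued on the main queue is therefore invisible to any child until something drains the bus and calls `distributeToChildren`.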
93+
94+
### Terminal Events
95+
Events that cause polling loop exit:
96+
- `TaskStatusUpdateEvent` with `isFinal() == true`
97+
- `Message` (legacy)
98+
- `Task` with state: COMPLETED, CANCELED, FAILED, REJECTED, UNKNOWN
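
The three rules above amount to a single predicate. The sketch below uses simplified, hypothetical event shapes (`StatusUpdate`, `Message`, `Task`) and a reduced `State` enum; only the terminal states are taken from the list above, the non-terminal ones are included for illustration.

```java
import java.util.EnumSet;
import java.util.Set;

class TerminalEventsSketch {
    enum State { SUBMITTED, WORKING, AUTH_REQUIRED, COMPLETED, CANCELED, FAILED, REJECTED, UNKNOWN }

    // Hypothetical, simplified event shapes.
    record StatusUpdate(State state, boolean isFinal) {}
    record Message(String text) {}
    record Task(State state) {}

    static final Set<State> TERMINAL = EnumSet.of(
        State.COMPLETED, State.CANCELED, State.FAILED, State.REJECTED, State.UNKNOWN);

    // True when the event should end the polling loop.
    static boolean endsPolling(Object event) {
        if (event instanceof StatusUpdate update) return update.isFinal();
        if (event instanceof Message) return true;
        if (event instanceof Task task) return TERMINAL.contains(task.state());
        return false;
    }
}
```

Note that a `Task` in `AUTH_REQUIRED` is not terminal under this predicate; it is handled by the separate interrupt path described next.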
99+
100+
### AUTH_REQUIRED Special Case
101+
- Returns task to client immediately
102+
- Agent continues in background
103+
- Queue stays open, async cleanup
104+
- Future events update TaskStore
105+
106+
---
107+
108+
## Deep Dives
109+
110+
For detailed documentation on specific aspects:
111+
112+
- **[Queue Lifecycle & Two-Level Protection](eventqueue/LIFECYCLE.md)**
113+
- THE BIG IDEA: fire-and-forget, late reconnections
114+
- TaskStateProvider interface and state-driven cleanup
115+
- Memory management and cleanup modes
116+
117+
- **[Request Flows](eventqueue/FLOWS.md)**
118+
- Non-streaming vs streaming flows
119+
- DefaultRequestHandler orchestration
120+
- Background cleanup patterns
121+
122+
- **[Usage Scenarios & Pitfalls](eventqueue/SCENARIOS.md)**
123+
- Fire-and-forget pattern (TCK)
124+
- Late resubscription scenarios
125+
- Tapping and multiple consumers
126+
- Common mistakes to avoid
127+
128+
---
129+
130+
## Key Files Reference
131+
132+
| Component | Path |
133+
|-----------|------|
134+
| MainEventBus | `server-common/.../events/MainEventBus.java` |
135+
| MainEventBusProcessor | `server-common/.../events/MainEventBusProcessor.java` |
136+
| EventQueue | `server-common/.../events/EventQueue.java` |
137+
| QueueManager | `server-common/.../events/QueueManager.java` |
138+
| InMemoryQueueManager | `server-common/.../events/InMemoryQueueManager.java` |
139+
| EventConsumer | `server-common/.../events/EventConsumer.java` |
140+
| ResultAggregator | `server-common/.../tasks/ResultAggregator.java` |
141+
| DefaultRequestHandler | `server-common/.../requesthandlers/DefaultRequestHandler.java` |
142+
| TaskStateProvider | `server-common/.../tasks/TaskStateProvider.java` |
143+
| AgentEmitter | `server-common/.../tasks/AgentEmitter.java` |
144+
145+
---
146+
147+
## Related Documentation
148+
149+
- **Main Architecture**: `AGENTS.md` - High-level system overview
150+
- **Task Persistence**: See TaskStore exception handling in main docs
151+
- **Replication**: `extras/queue-manager-replicated/README.md`
Lines changed: 223 additions & 0 deletions
@@ -0,0 +1,223 @@
1+
# Request Flows - EventQueue Processing
2+
3+
> Deep-dive on streaming vs non-streaming request handling
4+
5+
## Non-Streaming Flow (`onMessageSend()`)
6+
7+
**Location**: `DefaultRequestHandler.java`
8+
9+
```
1. initMessageSend()
   → Create TaskManager & RequestContext

2. queueManager.createOrTap(taskId)
   → Get/create EventQueue (MainQueue or ChildQueue)

3. registerAndExecuteAgentAsync()
   → Start AgentExecutor in background thread

4. resultAggregator.consumeAndBreakOnInterrupt(consumer)
   → Poll queue until terminal event or AUTH_REQUIRED
   → Blocking wait for events

5. cleanup(queue, task, async)
   → Close queue immediately OR in background

6. Return Task/Message to client
```
28+
29+
### Terminal Events
30+
31+
Events that cause polling loop exit:
32+
- `TaskStatusUpdateEvent` with `isFinal() == true`
33+
- `Message` (legacy)
34+
- `Task` with state: COMPLETED, CANCELED, FAILED, REJECTED, UNKNOWN
35+
36+
### AUTH_REQUIRED Special Case
37+
38+
**Behavior**:
39+
- Returns current task to client immediately
40+
- Agent continues running in background
41+
- Queue stays open, cleanup happens async
42+
- Future events update TaskStore
43+
44+
**Why**: Allows client to handle authentication prompt while agent waits for credentials.
45+
46+
---
47+
48+
## Streaming Flow (`onMessageSendStream()`)
49+
50+
**Location**: `DefaultRequestHandler.java`
51+
52+
```
1. initMessageSend()
   → Same as non-streaming

2. queueManager.createOrTap(taskId)
   → Same

3. registerAndExecuteAgentAsync()
   → Same

4. resultAggregator.consumeAndEmit(consumer)
   → Returns Flow.Publisher<Event> immediately
   → Non-blocking

5. processor() wraps publisher:
   - Validates task ID
   - Adds task to QueueManager
   - Stores push notification config
   - Sends push notifications

6. cleanup(queue, task, true)
   → ALWAYS async for streaming

7. Return Flow.Publisher<StreamingEventKind>
```
77+
78+
### Key Difference
79+
80+
**Non-Streaming**: Blocks until terminal event, then returns Task/Message
81+
**Streaming**: Returns Flow.Publisher immediately, client receives events as they arrive
82+
83+
**Cleanup**: Streaming ALWAYS uses async cleanup (background thread)
84+
85+
---
86+
87+
## EventConsumer Details
88+
89+
**Location**: `server-common/.../events/EventConsumer.java`
90+
91+
**Purpose**: Consumes events from EventQueue and exposes as reactive stream
92+
93+
**Key Methods**:
94+
- `consume()` → Returns `Flow.Publisher<Event>`
95+
- Polls queue with 500ms timeout
96+
- Closes queue on final event
97+
- Thread-safe concurrent consumption
98+
99+
**Usage**:
100+
```java
import java.util.concurrent.Flow;

// Wrap an existing EventQueue in a consumer
EventConsumer consumer = new EventConsumer(eventQueue);
// consume() returns a publisher; subscribe to receive events as they arrive
Flow.Publisher<Event> publisher = consumer.consume();
```
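
To make the polling behaviour concrete, here is a minimal sketch of a consumer that polls with a 500ms timeout and completes the stream on a final event. It is an assumption-laden illustration, not the SDK's `EventConsumer`: `ConsumerSketch` is a hypothetical class, events are plain strings, polling starts when a subscriber attaches, and "closing the queue" is reduced to ending the polling loop.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

class ConsumerSketch {
    private final BlockingQueue<String> queue;
    private final Predicate<String> isFinal;

    ConsumerSketch(BlockingQueue<String> queue, Predicate<String> isFinal) {
        this.queue = queue;
        this.isFinal = isFinal;
    }

    // Returns a publisher; polling begins once a subscriber attaches, so no event is dropped.
    Flow.Publisher<String> consume() {
        return subscriber -> {
            SubmissionPublisher<String> publisher = new SubmissionPublisher<>();
            publisher.subscribe(subscriber);
            Thread poller = new Thread(() -> {
                try {
                    while (true) {
                        String event = queue.poll(500, TimeUnit.MILLISECONDS);
                        if (event == null) continue;     // timed out, poll again
                        publisher.submit(event);
                        if (isFinal.test(event)) break;  // final event ends the loop
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    publisher.close();                   // signals onComplete downstream
                }
            });
            poller.setDaemon(true);
            poller.start();
        };
    }
}
```

The timeout matters because it lets the loop wake up periodically instead of blocking forever on an empty queue, which is where a real consumer could check for shutdown or agent failure.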
105+
106+
---
107+
108+
## ResultAggregator Modes
109+
110+
**Location**: `server-common/.../tasks/ResultAggregator.java`
111+
112+
Bridges EventConsumer and DefaultRequestHandler with three consumption modes:
113+
114+
### 1. consumeAndBreakOnInterrupt()
115+
116+
**Used by**: `onMessageSend()` (non-streaming)
117+
118+
**Behavior**:
119+
- Polls queue until terminal event or AUTH_REQUIRED
120+
- Returns `EventTypeAndInterrupt(event, interrupted)`
121+
- Blocking operation
122+
- Exits early on AUTH_REQUIRED (interrupted = true)
123+
124+
**Use Case**: Non-streaming requests that need single final response
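
The blocking loop can be sketched as follows. This is a simplified illustration: events are plain strings, the class name is hypothetical, and the record only mirrors the `EventTypeAndInterrupt(event, interrupted)` pair named above.

```java
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class BreakOnInterruptSketch {
    record EventTypeAndInterrupt(String event, boolean interrupted) {}

    static final Set<String> TERMINAL =
        Set.of("COMPLETED", "CANCELED", "FAILED", "REJECTED", "UNKNOWN");

    // Blocks the caller until a terminal event or AUTH_REQUIRED arrives.
    static EventTypeAndInterrupt consumeAndBreakOnInterrupt(BlockingQueue<String> queue)
            throws InterruptedException {
        while (true) {
            String event = queue.poll(500, TimeUnit.MILLISECONDS);
            if (event == null) continue; // nothing yet, keep waiting
            if (event.equals("AUTH_REQUIRED")) {
                return new EventTypeAndInterrupt(event, true);  // early exit, agent still running
            }
            if (TERMINAL.contains(event)) {
                return new EventTypeAndInterrupt(event, false); // normal completion
            }
            // Non-terminal events (e.g. WORKING) are skipped in this sketch.
        }
    }
}
```

After an `AUTH_REQUIRED` exit the queue still holds whatever the agent emits later, which is why the queue must stay open in that case.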
125+
126+
### 2. consumeAndEmit()
127+
128+
**Used by**: `onMessageSendStream()` (streaming)
129+
130+
**Behavior**:
131+
- Returns all events as `Flow.Publisher<Event>`
132+
- Non-blocking, immediate return
133+
- Client subscribes to stream
134+
- Events delivered as they arrive
135+
136+
**Use Case**: Streaming requests where client wants all events in real-time
137+
138+
### 3. consumeAll()
139+
140+
**Used by**: `onCancelTask()`
141+
142+
**Behavior**:
143+
- Consumes all events from queue
144+
- Returns first `Message` or final `Task` found
145+
- Simple consumption without streaming
146+
- Blocks until queue exhausted
147+
148+
**Use Case**: Task cancellation where final state matters
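
One plausible reading of "first `Message` or final `Task`" is sketched below: a `Message` short-circuits, otherwise the last `Task` seen wins. That interpretation, the class name, and the list-based input are assumptions made for illustration.

```java
import java.util.List;
import java.util.Optional;

class ConsumeAllSketch {
    // Hypothetical, simplified event shapes.
    record Message(String text) {}
    record Task(String state) {}

    // Walks all events; returns the first Message, else the last Task, else empty.
    static Optional<Object> consumeAll(List<Object> events) {
        Object lastTask = null;
        for (Object event : events) {
            if (event instanceof Message) return Optional.of(event);
            if (event instanceof Task) lastTask = event;
        }
        return Optional.ofNullable(lastTask);
    }
}
```

For a cancellation, the events would typically end with a `Task` in state `CANCELED`, which is what the caller receives.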
149+
150+
---
151+
152+
## Flow Comparison Table
153+
154+
| Aspect | Non-Streaming | Streaming |
155+
|--------|---------------|-----------|
156+
| **ResultAggregator Mode** | consumeAndBreakOnInterrupt | consumeAndEmit |
157+
| **Return Type** | Task/Message | Flow.Publisher |
158+
| **Blocking** | Yes (until terminal event) | No (immediate return) |
159+
| **Cleanup** | Immediate or async | Always async |
160+
| **AUTH_REQUIRED** | Early exit, return task | Continue streaming |
161+
| **Use Case** | Simple request/response | Real-time event updates |
162+
163+
---
164+
165+
## Cleanup Integration
166+
167+
### Non-Streaming Cleanup Decision
168+
169+
```java
if (event instanceof Message || isFinalEvent(event)) {
    if (!interrupted) {
        cleanup(queue, task, false); // Immediate: wait for agent, close queue
    } else {
        cleanup(queue, task, true);  // Async: close in background (AUTH_REQUIRED case)
    }
}
```
178+
179+
**Logic**:
180+
- Terminal event + not interrupted → Immediate cleanup (wait for agent, close queue)
181+
- Terminal event + interrupted (AUTH_REQUIRED) → Async cleanup (agent still running)
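
The difference between the two modes can be sketched with `CompletableFuture`. The signature below is hypothetical (the real `cleanup(queue, task, async)` takes different arguments); `agentDone` stands for a future that completes when the agent finishes and `closeQueue` for whatever closes the queue.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

class CleanupSketch {
    static CompletableFuture<Void> cleanup(CompletableFuture<Void> agentDone,
                                           Runnable closeQueue,
                                           boolean async,
                                           Executor executor) {
        if (async) {
            // Return right away; the queue closes on the executor once the agent finishes.
            return agentDone.thenRunAsync(closeQueue, executor);
        }
        // Immediate: block the calling thread until the agent finishes, then close inline.
        agentDone.join();
        closeQueue.run();
        return CompletableFuture.completedFuture(null);
    }
}
```

In the async case the caller can return the task to the client while the agent is still running, at the cost of the queue staying open a little longer.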
182+
183+
### Streaming Cleanup
184+
185+
```java
cleanup(queue, task, true); // ALWAYS async for streaming
```
188+
189+
**Logic**: Streaming always uses async cleanup because:
190+
- Publisher already returned to client
191+
- Events may still be processing
192+
- Queue cleanup happens in background
193+
194+
---
195+
196+
## Thread Model
197+
198+
### Agent Execution Thread
199+
- `CompletableFuture.runAsync(agentExecutor::execute, executor)`
200+
- Agent runs in background thread pool
201+
- Enqueues events to MainQueue
202+
203+
### MainEventBusProcessor Thread
204+
- Single background thread: "MainEventBusProcessor"
205+
- Processes events from MainEventBus
206+
- Persists to TaskStore, distributes to ChildQueues
207+
208+
### Consumer Thread
209+
- Non-streaming: Request handler thread (blocking)
210+
- Streaming: Subscriber thread (reactive)
211+
- Polls ChildQueue for events
212+
213+
### Cleanup Thread
214+
- Async cleanup: Background thread pool
215+
- Immediate cleanup: Request handler thread
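
The interplay of the agent, processor, and consumer threads can be sketched end to end. Everything here is a simplified stand-in: strings replace events, a list replaces the TaskStore, and a bounded `LinkedBlockingQueue` replaces the ChildQueue. The consumer checks that every event it sees has already been persisted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class ThreadModelSketch {
    static List<String> run() throws Exception {
        BlockingQueue<String> bus = new LinkedBlockingQueue<>();
        BlockingQueue<String> child = new LinkedBlockingQueue<>(1000);
        List<String> store = new CopyOnWriteArrayList<>();
        ExecutorService agentPool = Executors.newFixedThreadPool(2);
        ExecutorService processor =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "MainEventBusProcessor"));
        try {
            // Agent execution thread: emits events onto the bus.
            CompletableFuture<Void> agent = CompletableFuture.runAsync(() -> {
                bus.add("working");
                bus.add("completed");
            }, agentPool);

            // Processor thread: persist first, then distribute to the child queue.
            processor.submit(() -> {
                for (int i = 0; i < 2; i++) {
                    String event = bus.take();
                    store.add(event);
                    child.put(event);
                }
                return null;
            });

            // Consumer thread (the caller): polls the child queue.
            List<String> seen = new ArrayList<>();
            while (seen.size() < 2) {
                String event = child.poll(500, TimeUnit.MILLISECONDS);
                if (event == null) continue;
                if (!store.contains(event)) throw new IllegalStateException("unpersisted event visible");
                seen.add(event);
            }
            agent.join();
            return seen;
        } finally {
            agentPool.shutdownNow();
            processor.shutdownNow();
        }
    }
}
```

Only the processor thread writes to the store and to the child queue, so the persist-then-distribute order holds no matter how many agent threads are producing.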
216+
217+
---
218+
219+
## Related Documentation
220+
221+
- **[Main Overview](../EVENTQUEUE.md)** - Architecture and components
222+
- **[Lifecycle](LIFECYCLE.md)** - Queue lifecycle and cleanup
223+
- **[Scenarios](SCENARIOS.md)** - Real-world usage patterns
