diff --git a/DOCUMENTATION.md b/DOCUMENTATION.md index bd4c6c1..d21c3fc 100644 --- a/DOCUMENTATION.md +++ b/DOCUMENTATION.md @@ -10,7 +10,10 @@ The project is a monorepo containing two primary components: * **Batch Manager**: Optimizes high-volume embedding requests. * **Detailed Logger**: Provides per-request file logging for debugging. * **OpenAI-Compatible Endpoints**: `/v1/chat/completions`, `/v1/embeddings`, etc. -2. **The Resilience Library (`rotator_library`)**: This is the core engine that provides high availability. It is consumed by the proxy app to manage a pool of API keys, handle errors gracefully, and ensure requests are completed successfully even when individual keys or provider endpoints face issues. +2. **The Resilience Library (`rotator_library`)**: This is the core engine that provides high availability. It is consumed by the proxy app to manage a pool of API keys, handle errors gracefully, and ensure requests are completed successfully even when individual keys or provider endpoints face issues. It also includes: + * **HiveMind Ensemble Manager**: Orchestrates parallel model execution (Swarm and Fusion modes) with intelligent arbitration. + * **Key Management**: Advanced concurrency control and intelligent key selection. + * **Error Handling**: Escalating cooldowns and automatic recovery. This architecture cleanly separates the API interface from the resilience logic, making the library a portable and powerful tool for any application needing robust API key management. @@ -315,6 +318,148 @@ The `CooldownManager` handles IP or account-level rate limiting that affects all --- +## 2.10. HiveMind Ensemble (`ensemble/`) + +The **HiveMind Ensemble** system enables parallel model execution with intelligent arbitration, supporting two distinct modes: + +### 2.10.1. Swarm Mode + +**Purpose**: Execute the same model multiple times in parallel to generate diverse responses, then synthesize them into a single high-quality output. 
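To make the temperature-jitter behavior above concrete, here is a minimal sketch of how per-drone jitter (±delta around the request temperature, clamped to the valid range) could be computed. This is an illustrative helper only; `jittered_temperatures` is a hypothetical name, not the library's actual API.

```python
import random

def jittered_temperatures(base_temp: float, delta: float, count: int) -> list[float]:
    """Sketch (not the library's code): vary temperature per drone by up to
    +/- delta, clamping each result to the valid [0.0, 2.0] range."""
    if delta <= 0:
        # No jitter configured: every drone uses the base temperature.
        return [base_temp] * count
    return [
        max(0.0, min(2.0, base_temp + random.uniform(-delta, delta)))
        for _ in range(count)
    ]
```

With `delta=0.2` and three drones, each drone receives a temperature within ±0.2 of the request's base temperature, which is what produces the response diversity the arbiter later synthesizes.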
+ +**Key Features**: +- **Temperature Jitter**: Randomly varies temperature across drones (±delta) to increase response diversity +- **Adversarial Mode**: Dedicates N drones as critical reviewers with adversarial prompts to stress-test solutions +- **Blind Switch**: Optionally hides model names from the arbiter to reduce synthesis bias +- **Self-Arbitration**: Can use the same model as arbiter to save costs + +**Configuration** (`ensemble_configs/swarms/*.json`): +- Folder-based preset system with model-specific overrides +- Default configuration applies to all swarms unless overridden +- Preset-based discovery: `{base_model}-{preset_id}[swarm]` format + +**Example Usage**: +```python +response = await client.acompletion( + model="gpt-4o-mini-default[swarm]", + messages=[{"role": "user", "content": "Explain AI"}] +) +# → 3 parallel calls to gpt-4o-mini with temperature jitter +# → Arbiter synthesizes responses into final answer +``` + +### 2.10.2. Fusion Mode + +**Purpose**: Combine responses from multiple specialized models with role-based routing and weighted synthesis. 
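A small sketch of how blind mode interacts with role labels when responses are presented to the arbiter: roles survive, model names are stripped. The helper name `response_label` is hypothetical; the label formats follow the examples used elsewhere in this document.

```python
def response_label(index: int, model: str, role: str = "", blind: bool = True) -> str:
    """Sketch (hypothetical helper): build the arbiter-facing label for one
    response. Blind mode keeps the role but hides the model name."""
    if blind:
        return f"Response {index} ({role} role)" if role else f"Response {index}"
    return f"Response {index} ({model} - {role})" if role else f"Response {index} ({model})"
```

So a blind fusion shows the arbiter "Response 1 (Architect role)", while a non-blind one shows "Response 1 (gpt-4o - Architect)".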
+ +**Key Features**: +- **Role Assignment**: Each specialist model receives a custom system prompt defining its expertise +- **Weight Descriptions**: Guide arbiter on which specialist to trust for specific domains +- **Role Templates**: Reusable role definitions stored in `ensemble_configs/roles/` +- **Blind Mode**: Hides model names while preserving role labels +- **Multi-Provider Support**: Can mix models from different providers in a single fusion + +**Configuration** (`ensemble_configs/fusions/*.json`): +- Each fusion defined in its own JSON file or as an array in a single file +- Specialists can reference role templates via `role_template` field +- Supports `weight_description` for arbiter context + +**Example Configuration**: +```json +{ + "id": "dev-team", + "specialists": [ + { + "model": "gpt-4o", + "role": "Architect", + "system_prompt": "Focus on scalability and system design.", + "weight_description": "Expert in architecture. Trust for design decisions." + }, + { + "model": "claude-3-opus", + "role": "Security", + "role_template": "security-expert" + } + ], + "arbiter": { + "model": "gpt-4o", + "strategy": "synthesis", + "blind": true + } +} +``` + +### 2.10.3. Arbitration Strategies + +Strategies define how the arbiter synthesizes responses. Stored as plain text files in `ensemble_configs/strategies/*.txt` with `{responses}` placeholder. + +**Built-in Strategies**: +- **synthesis**: Combine best elements from all responses +- **best_of_n**: Select and refine the strongest response +- **code_review**: Code-specific evaluation criteria + +**Custom Strategies**: Users can add their own `.txt` files with custom synthesis prompts. + +### 2.10.4. Recursive Mode + +**Purpose**: Enable autonomous arbiter decision-making for low-consensus scenarios. 
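Since the arbiter communicates its internal state through textual markers, the proxy side needs only light parsing. The following is a sketch under assumed names (`parse_consensus` and `final_synthesis` are hypothetical helpers, not the shipped implementation) of how the `[CONSENSUS: X/10]` and `[FINAL SYNTHESIS:]` markers might be handled.

```python
import re

def parse_consensus(text: str):
    """Sketch: extract X from the arbiter's [CONSENSUS: X/10] marker.
    Returns None when the marker is absent (non-recursive responses)."""
    m = re.search(r"\[CONSENSUS:\s*(\d+)\s*/\s*10\]", text)
    return int(m.group(1)) if m else None

def final_synthesis(text: str) -> str:
    """Sketch: keep only the user-facing output after [FINAL SYNTHESIS:].
    Falls back to the full text if the marker never appeared."""
    return text.split("[FINAL SYNTHESIS:]", 1)[-1].strip()
```

The consensus score can then be compared against the configured threshold and logged at WARN level when it falls below it.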
+ +**Mechanism**: +- Arbiter assesses consensus (1-10 scale) +- If consensus < threshold: arbiter performs internal critique reasoning +- If consensus >= threshold: proceeds directly to synthesis +- All internal reasoning wrapped in `[INTERNAL]` tags (filtered from user output) + +**Markers**: +- `[CONSENSUS: X/10]`: Logged at WARN level if below threshold +- `[CONFLICTS: ...]`: Identified disagreement points +- `[CRITIQUE: ...]`: Internal reasoning about conflicts +- `[FINAL SYNTHESIS:]`: Start of user-facing output + +### 2.10.5. Usage Tracking + +HiveMind responses include standard OpenAI-compatible usage fields **plus** supplementary `hivemind_details`: + +**Standard Fields** (aggregated totals from all models): +- `prompt_tokens`: Total prompt tokens (drones/specialists + arbiter) +- `completion_tokens`: Total completion tokens +- `total_tokens`: Grand total + +**Supplementary Breakdown** (`hivemind_details`): +```json +{ + "mode": "swarm" | "fusion", + "drone_count" | "specialist_count": 3, + "drone_tokens" | "specialist_tokens": 450, + "arbiter_tokens": 200, + "total_cost_usd": 0.00123, + "latency_ms": 1523.45 +} +``` + +**Important**: Consumers should use standard `usage` fields for billing/analytics. The `hivemind_details` provides debugging context. + +### 2.10.6. 
Architecture + +**Components**: +- **EnsembleManager** (`manager.py`): Orchestration engine + - Detects ensemble requests (`is_ensemble()`) + - Prepares drones/specialists (`_prepare_drones()`, `_prepare_fusion_models()`) + - Executes parallel calls (`_execute_parallel()`) + - Builds arbiter prompts (`_build_arbiter_prompt()`) + - Handles streaming (`_call_arbiter_streaming()`) + +- **ConfigLoader** (`config_loader.py`): Configuration management + - Loads swarm presets, fusions, strategies, and role templates + - Supports both single-item and array-based file formats + - Validates and merges configurations + +**Integration**: +- Initialized in `RotatingClient.__init__()` +- Intercepts requests in `acompletion()` before normal routing +- Inherits all retry/resilience logic from RotatingClient + +--- + ## 3. Provider Specific Implementations The library handles provider idiosyncrasies through specialized "Provider" classes in `src/rotator_library/providers/`. diff --git a/README.md b/README.md index 72736a4..1a2fc49 100644 --- a/README.md +++ b/README.md @@ -12,6 +12,11 @@ This project provides a powerful solution for developers building complex applic ## Features - **Universal API Endpoint**: Simplifies development by providing a single, OpenAI-compatible interface for diverse LLM providers. 
+- **HiveMind Ensemble**: Parallel model execution with intelligent arbitration in two modes: + - **Swarm Mode**: Run multiple copies of the same model with temperature jitter, adversarial critique, and consensus-based synthesis + - **Fusion Mode**: Combine responses from different specialized models with role-based routing and weighted synthesis + - **Recursive Refinement**: Autonomous arbiter decision-making for low-consensus scenarios with internal critique reasoning + - **Streaming Support**: Full streaming support with real-time arbiter synthesis - **High Availability**: The underlying library ensures your application remains operational by gracefully handling transient provider errors and API key-specific issues. - **Resilient Performance**: A global timeout on all requests prevents your application from hanging on unresponsive provider APIs. - **Advanced Concurrency Control**: A single API key can be used for multiple concurrent requests. By default, it supports concurrent requests to *different* models. With configuration (`MAX_CONCURRENT_REQUESTS_PER_KEY_`), it can also support multiple concurrent requests to the *same* model using the same key. @@ -340,11 +345,56 @@ curl -X POST http://127.0.0.1:8000/v1/chat/completions \ }' ``` +### HiveMind Ensemble - Parallel Model Execution + +HiveMind enables you to run multiple models in parallel with intelligent arbitration. Use the `[swarm]` suffix or pre-configured fusion IDs. 
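Because ensemble routing is driven purely by the model string, any OpenAI-compatible client can target a swarm by suffixing the model name. A tiny sketch (the `swarm_model_id` helper is hypothetical, not part of this project):

```python
def swarm_model_id(base_model: str, preset: str = "") -> str:
    """Sketch: build a HiveMind swarm model ID, either the preset form
    '{base_model}-{preset}[swarm]' or the short '{base_model}[swarm]' form."""
    return f"{base_model}-{preset}[swarm]" if preset else f"{base_model}[swarm]"
```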
+ +**Swarm Mode** (same model, multiple executions): +```bash +# Explicit preset format +curl -X POST http://127.0.0.1:8000/v1/chat/completions \ +-H "Content-Type: application/json" \ +-H "Authorization: Bearer a-very-secret-and-unique-key" \ +-d '{ + "model": "gpt-4o-mini-aggressive[swarm]", + "messages": [{"role": "user", "content": "Explain quantum computing"}] +}' + +# Short format (requires omit_id: true in preset) +curl -X POST http://127.0.0.1:8000/v1/chat/completions \ +-H "Content-Type: application/json" \ +-H "Authorization: Bearer a-very-secret-and-unique-key" \ +-d '{ + "model": "gpt-4o-mini[swarm]", + "messages": [{"role": "user", "content": "Explain quantum computing"}] +}' +``` + +**Fusion Mode** (multiple specialist models): +```bash +curl -X POST http://127.0.0.1:8000/v1/chat/completions \ +-H "Content-Type: application/json" \ +-H "Authorization: Bearer a-very-secret-and-unique-key" \ +-d '{ + "model": "dev-team[fusion]", + "messages": [{"role": "user", "content": "Review this API design"}] +}' +``` + +HiveMind automatically: +- Executes models in parallel +- Applies temperature jitter for diversity (Swarm mode) +- Routes to specialized models with role prompts (Fusion mode) +- Synthesizes responses using an arbiter model +- Aggregates usage and cost across all calls + +For detailed configuration and advanced features, see the [HiveMind User Guide](docs/HiveMind_User_Guide.md). + ### Available API Endpoints - `POST /v1/chat/completions`: The main endpoint for making chat requests. - `POST /v1/embeddings`: The endpoint for creating embeddings. -- `GET /v1/models`: Returns a list of all available models from your configured providers. +- `GET /v1/models`: Returns a list of all available models from your configured providers (includes HiveMind fusions and swarms). - `GET /v1/providers`: Returns a list of all configured providers. - `POST /v1/token-count`: Calculates the token count for a given message payload. 
diff --git a/docs/HiveMind Plan.md b/docs/HiveMind Plan.md new file mode 100644 index 0000000..880c8cf --- /dev/null +++ b/docs/HiveMind Plan.md @@ -0,0 +1,1290 @@ +# HiveMind Ensemble (Swarm/Fusion) - Implementation Plan (REVISED) + +## Goal Description + +Implement a sophisticated orchestration engine called "HiveMind Ensemble" that enables two distinct modes of parallel model execution: + +1. **Swarm Mode**: Multiple parallel calls to the **same model** (called "Drones") with optional configuration for temperature variation, adversarial critique, and recursive self-correction. +2. **Fusion Mode**: Multiple parallel calls to **different models** (called "Models" or "Specialists" when roles are assigned) with optional role-based routing and context-aware synthesis. + +Both modes use an "Arbiter" (judge model) to synthesize responses with configurable strategies and optional recursive refinement. + +--- + +## Terminology + +- **HiveMind Ensemble**: The overall feature/system (may be shortened to "HiveMind" after first mention) +- **Swarm**: Parallel execution of the same model + - **Drone**: Individual instance in a Swarm +- **Fusion**: Parallel execution of different models + - **Model**: Individual model in a Fusion (generic term) + - **Specialist**: A Model with an assigned role and weight +- **Arbiter**: The judge/synthesizer model that produces the final response + +--- + +## Architecture Overview + +### Request Flow + +``` +User Request (model: "gemini-1.5-flash[swarm]") + ↓ +EnsembleManager.is_ensemble()? → Yes + ↓ +EnsembleManager.handle_request() + ↓ +┌─────────────────────────────────────────┐ +│ 1. Configuration Resolution │ +│ - Load config for this ensemble │ +│ - Determine: Swarm or Fusion? │ +│ - Get Arbiter config │ +└─────────────────────────────────────────┘ + ↓ +┌─────────────────────────────────────────┐ +│ 2. 
Drone/Model Preparation │ +│ For Swarm: │ +│ - Create N Drones (same model) │ +│ - Apply temp jitter (optional) │ +│ - Mark M as adversarial (optional) │ +│ For Fusion: │ +│ - Load constituent models │ +│ - Apply role prompts (optional) │ +└─────────────────────────────────────────┘ + ↓ +┌─────────────────────────────────────────┐ +│ 3. Parallel Execution │ +│ - asyncio.gather() all calls │ +│ - Each call uses RotatingClient │ +│ - Apply retry logic per drone/model │ +│ - Collect responses + metadata │ +└─────────────────────────────────────────┘ + ↓ +┌─────────────────────────────────────────┐ +│ 4. Response Processing │ +│ - Apply blind switch (optional) │ +│ - Format for Arbiter consumption │ +└─────────────────────────────────────────┘ + ↓ +┌─────────────────────────────────────────┐ +│ 5. Arbitration │ +│ - Load strategy prompt │ +│ - Inject role/weight context │ +│ - For Recursive Mode: │ +│ • Give arbiter autonomy │ +│ • Arbiter decides Round 2 │ +│ - For Non-Recursive: │ +│ • Direct synthesis only │ +│ - Call Arbiter (with streaming) │ +└─────────────────────────────────────────┘ + ↓ +┌─────────────────────────────────────────┐ +│ 6. Final Output │ +│ - Stream Arbiter's response to user │ +│ - Aggregate usage from all calls │ +│ - Log execution summary │ +└─────────────────────────────────────────┘ +``` + +--- + +## Core Components + +### 1. 
EnsembleManager Class + +**File**: `src/rotator_library/ensemble_manager.py` + +**Responsibilities**: +- Load and validate `ensemble_config.json` +- Detect Swarm requests (`[swarm]` notation) vs Fusion requests (config-based) +- Orchestrate parallel execution with retry logic +- Manage arbitration with streaming support +- Handle recursive refinement (single arbiter call with autonomous decision) + +**Key Methods**: + +#### `__init__(self, config_path, rotating_client)` +- Load configuration file +- Store reference to RotatingClient +- Build lookup tables for fast ensemble detection +- Validate configuration schema +- Initialize usage aggregator + +#### `is_ensemble(self, model_id: str) -> bool` +- Check if model_id matches a Fusion config (exact match from config) +- Check if model_id contains `[swarm]` notation +- Handle conflict detection (if provider has real model with same name) +- Return: `True` if ensemble, `False` otherwise + +#### `resolve_conflicts(self, base_model: str) -> str` +- Default format: `base_model[swarm]` +- Check if this conflicts with provider's real models +- If conflict, try: `base_model[hive]`, `base_model[max]`, etc. +- Log warning about conflict resolution +- Return: Final ensemble ID to use + +#### `handle_request(self, request_params: dict) -> AsyncGenerator` +Main orchestration method. Returns a streaming generator for the Arbiter's response. + +**Steps**: +1. **Identify Type**: Swarm or Fusion +2. **Load Config**: Get specific config or use defaults +3. **Prepare Drones/Models**: + - Build list of execution targets + - Apply temperature jitter (Swarm) + - Apply role prompts (Fusion) + - Mark adversarial instances +4. **Execute Parallel Calls**: + - Use `asyncio.gather()` with exception handling + - Each call goes through RotatingClient (inherits retry logic) + - Require at least 1 successful response + - Log failures as errors +5. 
**Aggregate Usage**: + - Sum all `prompt_tokens`, `completion_tokens`, `total_tokens` + - Calculate combined cost (using existing cost calculation) +6. **Process Responses**: + - Extract content from each response + - Apply blind switch if enabled (keep roles, strip model names) + - Format for Arbiter +7. **Build Arbiter Prompt**: + - Load strategy prompt template + - Inject adversarial context (if applicable) + - Inject role/weight context (Fusion) + - For recursive mode: Add autonomous decision instructions +8. **Call Arbiter with Streaming**: + - Stream Arbiter's synthesis to user + - Parse internal markers (if recursive mode) + - Aggregate Arbiter's usage into total +9. **Return**: Stream final response with combined usage metadata + +#### `_prepare_drones(self, config: dict, base_model: str, request_params: dict) -> List[dict]` +For Swarm mode: +- Create N copies of request params +- **Temperature Jitter**: + ```python + base_temp = request_params.get('temperature', 0.7) + jitter_config = config.get('temperature_jitter', {}) + if jitter_config.get('enabled', False): + delta = jitter_config.get('delta', 0.0) + for i in range(count): + temp = base_temp + random.uniform(-delta, delta) + temp = max(0.0, min(2.0, temp)) # Clamp + drones[i]['temperature'] = temp + ``` +- **Adversarial Prompts**: + ```python + adv_config = config.get('adversarial_config', {}) + if adv_config.get('enabled', False): + count = adv_config['count'] + prompt = adv_config['prompt'] + for i in range(count): + drones[i]['messages'].insert(0, { + 'role': 'system', + 'content': prompt + }) + drones[i]['_is_adversarial'] = True # Metadata for logging + ``` +- **Model ID**: All drones use `base_model` (without `[swarm]` suffix) + +#### `_prepare_models(self, config: dict, request_params: dict) -> List[dict]` +For Fusion mode: +- For each model in fusion config: + - Clone request params + - Set model ID from config + - If role defined: + - Apply `system_prompt_append` (prepend to messages) + - 
Store role metadata for context + - If weight defined: + - Store weight for arbiter context +- Return list of prepared calls with metadata + +#### `_execute_parallel(self, prepared_calls: List[dict]) -> Tuple[List[dict], dict]` +- Execute all calls in parallel: + ```python + results = await asyncio.gather( + *[self.rotating_client.acompletion(**params) for params in prepared_calls], + return_exceptions=True + ) + ``` +- Filter out exceptions/None values +- Log each failure as ERROR (drones should not fail) +- Require at least 1 success, else raise exception +- Aggregate usage: + ```python + total_usage = { + 'prompt_tokens': sum(r.usage.prompt_tokens for r in results if r), + 'completion_tokens': sum(r.usage.completion_tokens for r in results if r), + 'total_tokens': sum(r.usage.total_tokens for r in results if r) + } + ``` +- Return: `(successful_responses, total_usage)` + +#### `_format_for_arbiter(self, responses: List[dict], config: dict, mode: str, metadata: List[dict]) -> str` +Build formatted text for arbiter input. + +**Blind Switch Logic**: +- If `blind=True`: + - Labels: "Response 1 (Architect role)", "Response 2 (Security role)" + - Do NOT include model names +- If `blind=False`: + - Labels: "Response 1 (GPT-4o - Architect)", "Response 2 (Claude-3-opus - Security)" + +**Adversarial Context** (if adversarial drones present): +``` +NOTE: Responses marked [ADVERSARIAL] were specifically prompted to critique and find flaws. +Their purpose is to stress-test the solution. Consider their critiques when synthesizing. +``` + +**Format**: +``` +Response 1 (GPT-4o - Architect): +[content] + +Response 2 (Claude-3-opus - Security): +[content] + +Response 3 [ADVERSARIAL]: +[content] +``` + +#### `_build_arbiter_prompt(self, formatted_responses: str, config: dict, mode: str) -> List[dict]` +Build complete messages array for arbiter. + +**System Prompt Components**: +1. **Base Strategy**: Load from `arbitration_strategies[strategy_name]` +2. 
**Role/Weight Context** (Fusion only): + ``` + You are synthesizing responses from specialists with the following expertise: + - GPT-4o (Architect): Expert in system design and scalability. Trust this model for architectural decisions. + - Claude-3-opus (Security): Expert in vulnerability assessment. Trust this model for security concerns. + ``` +3. **Adversarial Context** (if applicable): + ``` + Some responses are marked [ADVERSARIAL]. These drones were specifically instructed to critique + and find edge cases. Their purpose is quality assurance through skeptical analysis. + ``` +4. **Recursive Mode Instructions** (if enabled): + ``` + AUTONOMOUS DECISION PROTOCOL: + 1. Analyze the responses and assess consensus (agreement level 1-10) + 2. If consensus >= 7/10: Proceed directly to synthesis + 3. If consensus < 7/10: + a. Identify specific conflict points + b. Internally trigger a critique phase + c. For each response, reason about how it would address the conflicts + d. Then synthesize the final answer + + Log your internal reasoning with markers: + [CONSENSUS: X/10] + [CONFLICTS: bullet list] + [CRITIQUE REASONING: ...] + [FINAL SYNTHESIS:] + + IMPORTANT: Only return the FINAL SYNTHESIS to the user. All internal reasoning + should be wrapped in [INTERNAL] tags for logging purposes only. + ``` +5. **Output Format**: + ``` + Provide your synthesis as a complete, high-quality response to the user's original query. + Do not mention that you are combining responses unless directly relevant. + ``` + +**User Message**: Original user query + formatted responses + +Return: Complete messages array for arbiter call + +#### `_call_arbiter_streaming(self, messages: List[dict], arbiter_model: str, original_params: dict) -> AsyncGenerator` +Call arbiter and stream response. 
+ +- Clone original request params +- Set model to `arbiter_model` +- Set `messages` to constructed arbiter prompt +- Set `stream=True` +- Call via RotatingClient.acompletion (returns async generator) +- **Parse Stream**: + - Extract internal markers (consensus score, conflicts) for logging + - Strip `[INTERNAL]` sections from user-facing output + - Yield only synthesis content to user +- **Aggregate Usage**: Track arbiter's usage separately +- Return: Streaming generator + +--- + +### 2. Configuration Structure + +**Folder-Based Approach**: Instead of a single config file, HiveMind uses a directory structure: + +``` +ensemble_configs/ +├── swarms/ +│ ├── default.json # Default swarm settings +│ ├── gemini-flash.json # Custom swarm for gemini-flash +│ └── gpt4o.json # Custom swarm for gpt-4o +├── fusions/ +│ ├── dev-team.json # Dev team fusion +│ └── creative-writers.json # Creative writers fusion +└── strategies/ + ├── synthesis.txt # Synthesis strategy prompt + ├── best_of_n.txt # Best-of-N strategy + └── code_review.txt # Code review strategy +``` + +**Loading Logic**: +- Load all JSON files from each subfolder +- Merge swarm configs (specific model configs override defaults) +- Detect duplicate fusion IDs → apply conflict resolution +- Load strategy templates from `.txt` files + +**Benefits**: +- Easy to add new configs (drop file in folder) +- Version control friendly (one file per fusion/config) +- Community sharing (share individual fusion configs) + +--- + +### 3. 
Configuration Schemas + +#### Swarm Config + +**File**: `ensemble_configs/swarms/default.json` + +```json +{ + "suffix": "[swarm]", + "count": 3, + + "temperature_jitter": { + "enabled": true, + "delta": 0.2 + }, + + "arbiter": { + "model": "self", + "strategy": "synthesis", + "blind": true, + "note": "Arbiter should be a decent reasoning model (e.g., GPT-4o, Claude 3+, Gemini 1.5 Pro+)" + }, + + "adversarial_config": { + "enabled": false, + "count": 1, + "prompt": "You are a Senior Principal Engineer with 15+ years of experience..." + }, + + "recursive_mode": { + "enabled": false, + "consensus_threshold": 7, + "note": "Requires a reasoning-capable arbiter model" + } +} +``` + +#### Model-Specific Swarm Config + +**File**: `ensemble_configs/swarms/gemini-flash.json` + +```json +{ + "model": "gemini-1.5-flash", + "arbiter": { + "model": "gpt-4o", + "strategy": "synthesis", + "blind": true + } +} +``` + +#### Fusion Config + +**File**: `ensemble_configs/fusions/dev-team.json` + +```json +{ + "id": "dev-team", + "description": "A team of specialized models for software development", + "models": [ + { + "model": "gpt-4o", + "role": "Architect", + "system_prompt_append": "Focus on architectural patterns, scalability, and system design.", + "weight": "Expert in system design and scalability. Trust for architectural decisions and structural integrity." + }, + { + "model": "claude-3-opus", + "role": "Security Specialist", + "system_prompt_append": "Focus on security vulnerabilities, edge cases, and potential exploits.", + "weight": "Expert in security and vulnerability assessment. Trust for identifying security flaws and attack vectors." + }, + { + "model": "gemini-1.5-pro", + "role": "Code Reviewer", + "system_prompt_append": "Focus on code quality, performance, and best practices.", + "weight": "Expert in code quality and performance optimization. Trust for maintainability and efficiency concerns." 
+ } + ], + "arbiter": { + "model": "gpt-4o", + "strategy": "synthesis", + "blind": true, + "note": "Requires a reasoning-capable model for best results" + }, + "recursive_mode": { + "enabled": false, + "consensus_threshold": 7 + } +} +``` + +#### Strategy Template + +**File**: `ensemble_configs/strategies/synthesis.txt` + +``` +You are an expert synthesizer. Analyze the following responses and create a single, superior answer that: +1. Combines the best elements from each response +2. Resolves any conflicts or contradictions +3. Ensures completeness and accuracy +4. Maintains coherence and clarity + +Your goal is to produce the BEST possible answer by leveraging the strengths of each response. + +Responses: +{responses} +``` + + +--- + +## Detailed Feature Specifications + +### 1. Temperature Jitter (Swarm Only) + +**Purpose**: Introduce controlled randomness to increase response diversity. + +**Configuration**: +```json +"temperature_jitter": { + "enabled": true, + "delta": 0.2 +} +``` + +**Implementation**: +- Get base temperature from request (default 0.7) +- For each Drone: `temp = base_temp + random.uniform(-delta, delta)` +- Clamp to `[0.0, 2.0]` +- If request has `temperature=0`, disable jitter automatically + +--- + +### 2. Adversarial Mode (Swarm Only) + +**Purpose**: Inject critical analysis to stress-test solutions. + +**Configuration**: +```json +"adversarial_config": { + "enabled": false, + "count": 1, + "prompt": "You are a Senior Principal Engineer..." +} +``` + +**Implementation**: +- Select first N drones as adversarial +- Prepend adversarial system prompt +- Tag responses as `[ADVERSARIAL]` in arbiter input +- **Arbiter Context**: Explain adversarial purpose: + ``` + NOTE: This mode is designed for SYNTHESIS strategy. Adversarial responses + critique the solution to ensure all angles are considered. Integrate their + insights to strengthen the final answer. + ``` + +--- + +### 3. 
Role Assignment & Weights (Fusion Only)
+
+**Purpose**: Specialize models and guide arbiter on expertise.
+
+**Configuration** (per model):
+```json
+{
+  "model": "gpt-4o",
+  "role": "Architect",
+  "system_prompt_append": "Focus on scalability.",
+  "weight": "Expert in system design. Trust for architectural decisions."
+}
+```
+
+**Fields**:
+- `role`: Display name (for user reference and arbiter labels)
+- `system_prompt_append`: Instructions sent to the model
+- `weight`: Context for arbiter (what to trust this model for)
+
+**Arbiter Context Injection**:
+```
+Specialist Expertise:
+- Architect (GPT-4o): Expert in system design. Trust for architectural decisions.
+- Security (Claude): Expert in vulnerabilities. Trust for security concerns.
+```
+
+---
+
+### 4. Arbitration Strategies
+
+**Purpose**: Flexible synthesis logic via prompt engineering.
+
+**Built-in**:
+- `synthesis`: Combine all responses into best version
+- `best_of_n`: Select and refine the strongest response
+- `code_review`: Code-specific evaluation
+
+**User-Defined**: Users add custom `.txt` strategy prompts to `ensemble_configs/strategies/`.
+
+**Template Variables**:
+- `{responses}`: Formatted response text
+- `{role_context}`: Weight/expertise descriptions
+- `{adversarial_note}`: Context about adversarial drones
+
+---
+
+### 5. Blind Switch
+
+**Purpose**: Remove model identifiers to prevent bias, while keeping role context.
+
+**Default**: `blind: true` (enabled by default)
+
+**Per-Config**: Each swarm config and fusion config can override:
+
+```json
+"arbiter": {
+  "blind": true
+}
+```
+
+**Implementation**:
+- `blind=true`: "Response 1 (Architect role)", "Response 2 (Security role)"
+- `blind=false`: "Response 1 (GPT-4o - Architect)", "Response 2 (Claude - Security)"
+
+**Key Change**: Roles are ALWAYS preserved. Only model names are stripped.
+
+---
+
+### 6. Recursive/Reflective Mode
+
+**Purpose**: Multi-round refinement for low-consensus situations.
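The stream-side half of this mode (strip internal reasoning, surface only the final synthesis) can be pictured with a small async filter. This is a sketch under assumed names (`filter_synthesis` is hypothetical); note the marker may be split across stream chunks, so buffering is required.

```python
import asyncio

FINAL_MARKER = "[FINAL SYNTHESIS:]"

async def filter_synthesis(chunks):
    """Sketch (hypothetical helper): buffer streamed text until the
    [FINAL SYNTHESIS:] marker appears, then pass everything after it
    through. If the marker never appears, nothing is yielded."""
    buffer, started = "", False
    async for chunk in chunks:
        if started:
            yield chunk
            continue
        buffer += chunk
        idx = buffer.find(FINAL_MARKER)
        if idx != -1:
            started = True
            tail = buffer[idx + len(FINAL_MARKER):]
            if tail:
                yield tail
```

A production version would also extract `[CONSENSUS: ...]` and `[CONFLICTS: ...]` from the buffered prefix for logging before discarding it.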
+ +**Configuration**: +```json +"recursive_mode": { + "enabled": false, + "consensus_threshold": 7, + "note": "Arbiter model must be capable of internal reasoning (e.g., GPT-4o, Claude 3.5+, Gemini 1.5 Pro+)" +} +``` + +**REVISED APPROACH** (Single Arbiter Call): + +Instead of multiple requests, the arbiter is given **autonomous decision-making** via prompt. + +> [!NOTE] +> The arbiter model should be a **decent reasoning model** to handle internal critique and consensus analysis effectively. Models like GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro are recommended. + +**Arbiter Prompt** (when recursive enabled): +``` +You have autonomous decision-making authority. Follow this protocol: + +1. ASSESSMENT PHASE: + - Analyze the provided responses + - Rate consensus level (1-10) + - Log: [CONSENSUS: X/10] + +2. DECISION PHASE: + If consensus >= 7/10: + - Proceed directly to synthesis + + If consensus < 7/10: + - Identify conflict points + - Log: [CONFLICTS: ...] + - For each response, reason internally about how it would address conflicts + - Log: [CRITIQUE REASONING: ...] + +3. SYNTHESIS PHASE: + - Create final answer incorporating all insights + - Log: [FINAL SYNTHESIS:] + +IMPORTANT: Wrap all internal reasoning in [INTERNAL] tags. Only the content +after [FINAL SYNTHESIS:] will be shown to the user. +``` + +**Stream Processing**: +- EnsembleManager parses the stream +- Extract `[CONSENSUS: X/10]` → Log at WARN level if < threshold +- Extract `[CONFLICTS: ...]` → Log conflicts +- Strip all `[INTERNAL]` sections from user output +- Yield only `[FINAL SYNTHESIS:]` content to user + +**Logging**: +``` +[HiveMind] Recursive mode active. Consensus: 5/10 [WARN] +[HiveMind] Conflicts identified: [list] +[HiveMind] Arbiter performing internal critique... +[HiveMind] Final synthesis complete +``` + +--- + +### 7. Streaming Support + +**Behavior**: Respects the `stream` boolean from the original request. 
+ +**Implementation**: +- Drone/Model calls are NOT streamed (collected in parallel) +- Arbiter call respects `stream` parameter: + - If `stream=true`: Stream arbiter's response + - If `stream=false`: Return complete arbiter response +- EnsembleManager passes through arbiter's streaming behavior +- Parse and filter internal markers during streaming +- Return clean synthesis to user + +**Flow**: +```python +async def handle_request(...) -> AsyncGenerator: + # 1. Collect drone responses (non-streaming) + responses = await self._execute_parallel(...) + + # 2. Build arbiter prompt + messages = self._build_arbiter_prompt(...) + + # 3. Stream arbiter response + arbiter_stream = self._call_arbiter_streaming(...) + + # 4. Parse and yield + async for chunk in arbiter_stream: + # Filter [INTERNAL] sections + if not chunk.startswith('[INTERNAL]'): + yield chunk +``` + +--- + +### 8. Usage & Cost Tracking + +**Aggregation**: +- Track usage from each Drone/Model call +- Track usage from Arbiter call +- Sum ALL usage fields: + ```python + total_usage = { + 'prompt_tokens': sum(all_calls), + 'completion_tokens': sum(all_calls), + 'cached_tokens': sum(all_calls), # If available + 'reasoning_tokens': sum(all_calls), # If available + 'total_tokens': sum(all_calls), + # Include any other usage fields from responses + } + ``` + +**Cost Calculation**: +- Use `UsageManager.calculate_cost()` if available (preferred) +- Fallback to `litellm.completion_cost()` if needed +- Calculate cost per call +- Sum total cost +- **Note**: This should be one of the last features to implement +- Include in final response metadata + +**Response Format**: +```json +{ + "usage": { + "prompt_tokens": 5000, + "completion_tokens": 800, + "total_tokens": 5800, + "hivemind_details": { + "drone_count": 3, + "arbiter_tokens": 1200, + "total_cost_usd": 0.045 + } + } +} +``` + +--- + +## Integration Points + +### 1. 
RotatingClient Modification
+
+**File**: `src/rotator_library/client.py`
+
+```python
+class RotatingClient:
+    def __init__(self, ...):
+        # Existing init
+        self.ensemble_manager = EnsembleManager(
+            config_path=os.path.join(os.path.dirname(__file__), '../../ensemble_config.json'),
+            rotating_client=self
+        )
+
+    def acompletion(self, request=None, **kwargs):
+        model = kwargs.get('model')
+
+        # Check if ensemble
+        if self.ensemble_manager.is_ensemble(model):
+            # Return streaming generator from ensemble manager
+            return self.ensemble_manager.handle_request(
+                request=request,
+                **kwargs
+            )
+
+        # Normal flow
+        if kwargs.get('stream'):
+            return self._streaming_acompletion_with_retry(...)
+        else:
+            return self._execute_with_retry(...)
+```
+
+---
+
+### 2. Model List Integration
+
+```python
+async def get_all_available_models(self, grouped=True):
+    # Existing provider models
+    all_provider_models = await self._fetch_provider_models()
+
+    # Add fusion models
+    fusion_ids = self.ensemble_manager.get_fusion_ids()
+    if fusion_ids:
+        all_provider_models['hivemind'] = fusion_ids
+
+    return all_provider_models
+```
+
+**Note**: Swarm model listing is **TBD**. The user notes the set is "not infinite", and a better discovery system still needs to be designed.
+
+---
+
+### 3. Logging
+
+**Log Levels**:
+- INFO: Normal operations (starting swarm, drone count, completion)
+- DEBUG: Detailed execution (per-drone temps, prompt construction)
+- WARNING: Low consensus, conflicts, partial failures
+- ERROR: Drone failures, arbiter failures
+
+**Examples**:
+```python
+lib_logger.info(f"[HiveMind] Processing Swarm: {model_id} ({count} Drones)")
+lib_logger.debug(f"[HiveMind] Drone {i+1}: temp={temp:.2f}, adversarial={is_adv}")
+lib_logger.warning(f"[HiveMind] Recursive mode: Consensus 5/10 - below threshold")
+lib_logger.error(f"[HiveMind] Drone {i+1} failed: {error}")
+lib_logger.info(f"[HiveMind] Total cost: ${total_cost:.4f} ({total_tokens} tokens)")
+```
+
+---
+
+## Edge Cases & Error Handling
+
+### 1. 
Partial Failures + +**Scenario**: Some Drones fail due to errors. + +**Handling**: +- Each drone call uses RotatingClient → **inherits existing retry/key rotation logic** +- If a drone still fails after retries, log as ERROR +- Continue with successful responses +- **Minimum**: Require at least 1 successful response +- If all fail, raise exception with details + +**No Special Logic Needed**: RotatingClient already handles retries, rate limits, key rotation. + +--- + +### 2. Arbiter Failure + +**Scenario**: Arbiter call fails. + +**Handling**: +- Arbiter call uses RotatingClient → **inherits retry/resilience logic** +- If arbiter fails after retries: + - Log ERROR + - Fallback: Return first **non-adversarial** drone response + - Log: `[HiveMind] Arbiter failed. Returning first non-adversarial response.` + +--- + +### 3. Naming Conflicts + +**Scenario**: Provider has `gemini-1.5-flash[swarm]` as real model, or duplicate fusion IDs exist. + +**Handling**: +- Default naming: `model-name[swarm]` or fusion ID from config +- On conflict detected: + - Append numeric suffix: `-1`, `-2`, `-3`, etc. + - Example: `gemini-1.5-flash[swarm]` → `gemini-1.5-flash[swarm]-1` + - Example: `dev-team` → `dev-team-1` +- Log: `[HiveMind] Conflict detected. Renamed 'dev-team' to 'dev-team-1'.` +- Store resolved names in runtime cache +- **Applies to**: Both swarm suffixes AND fusion IDs + +--- + +### 4. Streaming Parse Errors + +**Scenario**: Can't parse `[CONSENSUS: X/10]` from recursive mode stream. + +**Handling**: +- Log warning +- Continue streaming synthesis +- Skip logging consensus score + +--- + +### 5. Invalid Configuration + +**Scenario**: User config has invalid fusion (missing model, invalid strategy). 
+ +**Handling**: +- On startup, validate all fusions +- Log errors for invalid configs +- Skip invalid fusions +- Continue with valid ones + +--- + +## Implementation Phases + +### **Phase 1: Foundation (Core Infrastructure)** + +**Goal**: Set up basic structure and config loading. + +**Tasks**: +1. Create `ensemble_manager.py` skeleton + - Define `EnsembleManager` class + - Implement `__init__` with folder-based config loading + - Load and merge configs from `ensemble_configs/` directory + - Add config validation (JSON schema) + +2. Create config directory structure + - `ensemble_configs/swarms/default.json` + - `ensemble_configs/fusions/` (empty initially) + - `ensemble_configs/strategies/synthesis.txt` + +3. Integrate into `RotatingClient` + - Import `EnsembleManager` + - Initialize in `__init__` with config directory path + - Add placeholder check in `acompletion` + +4. Implement `is_ensemble()` + - Detect `[swarm]` suffix + - Detect fusion IDs from config + - Add conflict detection logic + +**Deliverables**: +- ✅ Folder-based config structure created +- ✅ Configs load and merge correctly +- ✅ Ensemble detection works +- ✅ Conflict resolution (numeric suffixes) works +- ✅ No runtime errors + +**Testing**: +- Unit test folder-based config loading +- Unit test config merging (swarm defaults + model-specific) +- Unit test `is_ensemble()` with various inputs +- Test conflict detection and numeric suffix generation +- Test duplicate fusion ID handling + +--- + +### **Phase 2: Basic Swarm (Non-Streaming)** + +**Goal**: Get basic swarm working without advanced features. + +**Tasks**: +1. Implement `_prepare_drones()` + - Clone request params N times + - Set model to base (strip `[swarm]`) + - No jitter or adversarial yet + +2. Implement `_execute_parallel()` + - Use `asyncio.gather()` with drone calls + - Handle exceptions gracefully + - Aggregate usage stats + +3. 
Implement `_format_for_arbiter()` + - Basic formatting (numbered responses) + - No blind switch yet + +4. Implement `_build_arbiter_prompt()` + - Load synthesis strategy + - Simple system prompt + user message + - No recursive mode yet + +5. Implement `_call_arbiter()` (NON-streaming first) + - Call arbiter via RotatingClient + - Return complete response + - Aggregate arbiter usage + +6. Wire up `handle_request()` (non-streaming) + - Connect all steps + - Return arbiter's response + - Include combined usage + +**Deliverables**: +- ✅ Swarm executes 3 drones in parallel +- ✅ Arbiter synthesizes responses +- ✅ Final response returned (non-streaming) +- ✅ Usage aggregated correctly + +**Testing**: +- Integration test: Call `gemini-1.5-flash[swarm]` +- Verify 3 drone calls + 1 arbiter call +- Verify synthesis quality (manual) +- Verify usage statistics + +--- + +### **Phase 3: Streaming Support** + +**Goal**: Enable streaming for arbiter response. + +**Tasks**: +1. Modify `_call_arbiter()` to `_call_arbiter_streaming()` + - Set `stream=True` + - Return async generator + - Track usage from stream + +2. Update `handle_request()` to return generator + - Yield arbiter stream chunks + - Aggregate usage at end + +3. Test streaming end-to-end + - Verify chunks arrive in real-time + - Verify complete response matches non-streaming + +**Deliverables**: +- ✅ Arbiter response streams to user +- ✅ No buffering of full response +- ✅ Usage still aggregated correctly + +**Testing**: +- Integration test with streaming +- Compare output to non-streaming version +- Test error handling mid-stream + +--- + +### **Phase 4: Advanced Swarm Features** + +**Goal**: Add jitter, adversarial, blind switch. + +**Tasks**: +1. **Temperature Jitter**: + - Add jitter logic to `_prepare_drones()` + - Test with different delta values + - Verify clamping + +2. **Adversarial Mode**: + - Inject adversarial prompts + - Tag responses in formatting + - Add arbiter context explanation + +3. 
**Blind Switch**: + - Modify `_format_for_arbiter()` + - Strip model names when `blind=true` + - Keep roles always + +**Deliverables**: +- ✅ Jitter produces varied temps +- ✅ Adversarial drones produce critiques +- ✅ Blind mode strips model names + +**Testing**: +- Test each feature independently +- Test combinations (jitter + adversarial) +- Manual review of adversarial effectiveness + +--- + +### **Phase 5: Fusion Mode** + +**Goal**: Enable multi-model mixtures with roles. + +**Tasks**: +1. Implement `_prepare_models()` + - Load models from fusion config + - Apply role system prompts + - Store metadata for arbiter + +2. Update `_format_for_arbiter()` for roles + - Include role labels + - Apply blind switch for model names + +3. Implement role/weight context injection + - Build specialist expertise text + - Inject into arbiter system prompt + +4. Add example fusion to config + - "dev-team" with 3 specialists + +**Deliverables**: +- ✅ Fusion calls multiple models +- ✅ Arbiter receives role context +- ✅ Synthesis respects expertise weights + +**Testing**: +- Test "dev-team" fusion with coding question +- Verify role prompts are applied +- Manual review: Does arbiter trust specialists appropriately? + +--- + +### **Phase 6: Recursive Mode** + +**Goal**: Enable autonomous arbiter decision-making for low consensus. + +**Tasks**: +1. Update `_build_arbiter_prompt()` for recursive + - Add autonomous protocol instructions + - Define `[INTERNAL]` marker format + - Include consensus threshold + +2. Implement stream parsing in `_call_arbiter_streaming()` + - Extract `[CONSENSUS: X/10]` + - Extract `[CONFLICTS: ...]` + - Strip `[INTERNAL]` sections from user output + +3. 
Add logging for recursive flow + - Log consensus score at WARN if low + - Log identified conflicts + - Log critique phase activation + +**Deliverables**: +- ✅ Arbiter autonomously decides Round 2 +- ✅ Internal reasoning logged but not shown to user +- ✅ Low consensus triggers critique + +**Testing**: +- Test with intentionally ambiguous prompt +- Verify arbiter produces `[CONSENSUS: 4/10]` +- Verify critique reasoning appears in logs +- Verify final synthesis is improved + +--- + +### **Phase 7: Polish & Production** + +**Goal**: Production-ready with documentation and examples. + +**Tasks**: +1. **Comprehensive Logging**: + - Add execution time tracking + - Add cost tracking per request + - Log summary at end of each request + +2. **Error Messages**: + - User-friendly error for invalid ensemble IDs + - Clear message when streaming not supported (N/A now) + - Helpful message on config errors + +3. **Documentation**: + - User guide: How to use swarms/fusions + - Config reference: All fields explained + - Example configs: dev-team, creative-writers, etc. + +4. **Example Configs**: + - Add 2-3 preset fusions to default config (commented out) + - Document swarm notation in README + +5. 
**Performance Testing**: + - Benchmark latency (3-drone swarm) + - Benchmark token usage vs single call + - Document cost multiplier + +**Deliverables**: +- ✅ Comprehensive logs for debugging +- ✅ User documentation complete +- ✅ Example configs provided +- ✅ Performance benchmarks documented + +**Testing**: +- Full end-to-end tests for all features +- Load testing with multiple concurrent swarms +- Manual testing of all examples + +--- + +## Example Configurations + +### Preset Fusion 1: Dev Team + +```json +{ + "id": "dev-team", + "description": "Software development team with architecture, security, and code review specialists", + "models": [ + { + "model": "gpt-4o", + "role": "Architect", + "system_prompt_append": "Focus on system design, scalability, and architectural patterns.", + "weight": "Expert in system design and scalability. Trust for architectural decisions." + }, + { + "model": "claude-3-opus", + "role": "Security", + "system_prompt_append": "Focus on security vulnerabilities, edge cases, and threat modeling.", + "weight": "Expert in security and vulnerability assessment. Trust for security concerns." + }, + { + "model": "gemini-1.5-pro", + "role": "Reviewer", + "system_prompt_append": "Focus on code quality, performance, and best practices.", + "weight": "Expert in code quality and optimization. Trust for performance and maintainability." + } + ], + "arbiter": { + "model": "gpt-4o", + "strategy": "code_review", + "blind": false + } +} +``` + +--- + +## User Configuration Examples + +### Simple Swarm Usage + +User request: +``` +Model: gemini-1.5-flash[swarm] +Messages: [{"role": "user", "content": "Write a function to parse CSV"}] +``` + +Result: 3 calls to `gemini-1.5-flash`, synthesized by `gemini-1.5-flash` (self-arbiter). 
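The `model[swarm]` / `model-preset[swarm]` notation above can be sketched as a small parser. This is an illustrative sketch only, not the library's actual implementation; `KNOWN_PRESETS` is a hypothetical stand-in for the preset IDs loaded from `ensemble_configs/swarms/`.

```python
import re
from typing import Optional, Tuple

# Hypothetical stand-in for preset IDs loaded from ensemble_configs/swarms/
KNOWN_PRESETS = {"default", "aggressive"}

def parse_swarm_id(model_id: str) -> Optional[Tuple[str, str]]:
    """Split 'base-preset[swarm]' into (base_model, preset_id); None if not a swarm ID."""
    match = re.fullmatch(r"(?P<prefix>.+)\[swarm\]", model_id)
    if match is None:
        return None
    prefix = match.group("prefix")
    # If the last hyphenated token names a known preset, treat it as the preset ID;
    # otherwise the whole prefix is the base model and the default preset applies.
    base, sep, tail = prefix.rpartition("-")
    if sep and tail in KNOWN_PRESETS:
        return base, tail
    return prefix, "default"

parse_swarm_id("gemini-1.5-flash[swarm]")   # → ("gemini-1.5-flash", "default")
parse_swarm_id("gpt-4o-aggressive[swarm]")  # → ("gpt-4o", "aggressive")
parse_swarm_id("gpt-4o")                    # → None
```

One caveat of this naive split: a base model whose name happens to end in a preset ID would be mis-parsed, which is exactly the kind of naming conflict the conflict-resolution rules above are meant to handle.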
+ +--- + +### Custom Arbiter for Swarm + +Config override (per-model): +```json +{ + "swarm_configs": { + "gemini-1.5-flash": { + "arbiter": { + "model": "gpt-4o", + "strategy": "synthesis" + } + } + } +} +``` + +User request: `gemini-1.5-flash[swarm]` +Result: 3 calls to flash, synthesized by gpt-4o. + +--- + +### Fusion Usage + +User request: +``` +Model: dev-team +Messages: [{"role": "user", "content": "Review this API endpoint: [code]"}] +``` + +Result: Parallel calls to gpt-4o, claude, gemini with role prompts. Arbiter synthesizes with role context. + +--- + +## Default Configuration Answer + +Based on user feedback: + +1. **Default Swarm Suffix**: `[swarm]` +2. **Arbiter Default**: Same model as drones (self-arbitration), but configurable per-model +3. **Streaming**: Required for arbiter's final response ✅ +4. **Cost Warnings**: None (user discretion) +5. **Preset Configs**: Only using provided examples (dev-team) + +--- + +## Testing Strategy + +### Unit Tests + +`tests/test_ensemble_manager.py`: +- Config loading and validation +- `is_ensemble()` detection +- Conflict resolution +- Drone preparation (jitter, adversarial) +- Model preparation (roles, weights) +- Response formatting (blind switch) + +### Integration Tests + +`tests/test_swarm_integration.py`: +- Basic 3-drone swarm +- Swarm with jitter enabled +- Swarm with adversarial mode +- Streaming swarm response + +`tests/test_fusion_integration.py`: +- Multi-model fusion +- Role context injection +- Weight-based synthesis + +`tests/test_recursive_integration.py`: +- Low consensus triggering critique +- Consensus score parsing +- Internal marker stripping + +### Manual Scenarios + +1. **Simple Swarm**: `gpt-4o[swarm]` with straightforward question +2. **Adversarial Swarm**: Enable adversarial, ask for code, verify critique +3. **Fusion**: Use "dev-team" with API review +4. 
**Recursive**: Use ambiguous prompt, verify low consensus handling + +--- + +## Performance Benchmarks (Expected) + +### Latency +- Single call: ~2s +- Swarm (3 drones): ~2s (parallel) + ~2s (arbiter) = **~4s** +- Swarm + Recursive: ~4s + arbiter internal critique time = **~5-6s** + +### Token Usage +- Single call: 1000 input + 500 output = 1500 tokens +- Swarm (3 drones): + - Drones: 1000 × 3 + 500 × 3 = 4500 tokens + - Arbiter: 1000 + 1500 (from drones) = 2500 input + 600 output + - Total: **~7600 tokens** (5x single call) + +### Cost Multiplier +- Typical swarm: **4-6x** cost of single call +- Fusion (different models): Varies by model costs + +--- + +## Summary + +This revised plan addresses all user feedback: + +✅ Confidence scoring only in recursive mode +✅ Adversarial context explained to arbiter +✅ Weight field for arbiter expertise guidance +✅ Blind switch keeps roles, strips model names +✅ Recursive mode as single autonomous arbiter call +✅ Default naming: `model[swarm]` +✅ Streaming required for arbiter response +✅ Usage/cost aggregated from all calls +✅ Existing retry/resilience logic leveraged +✅ Detailed implementation phases (7 phases) +✅ Example configs provided + +Ready for implementation! 
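As a concrete illustration of the usage-aggregation item in the summary above, here is a minimal sketch of summing OpenAI-style usage dicts across drone/specialist and arbiter calls. The field names follow this plan; the function itself is hypothetical, not part of the library API.

```python
from collections import Counter

def aggregate_usage(per_call_usage: list) -> dict:
    """Sum every numeric usage field across all drone/specialist and arbiter calls.

    Using a Counter means optional fields (cached_tokens, reasoning_tokens) are
    summed when present and simply absent from the total otherwise.
    """
    total = Counter()
    for usage in per_call_usage:
        for field, value in usage.items():
            if isinstance(value, (int, float)):
                total[field] += value
    return dict(total)

calls = [
    {"prompt_tokens": 1000, "completion_tokens": 500},   # drone 1
    {"prompt_tokens": 1000, "completion_tokens": 500},   # drone 2
    {"prompt_tokens": 2500, "completion_tokens": 600, "reasoning_tokens": 120},  # arbiter
]
aggregate_usage(calls)
# → {'prompt_tokens': 4500, 'completion_tokens': 1600, 'reasoning_tokens': 120}
```

The aggregated dict is what would populate the standard `usage` fields, with the per-call figures kept separately for the `hivemind_details` breakdown.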
diff --git a/docs/HiveMind Task.md b/docs/HiveMind Task.md new file mode 100644 index 0000000..65c00ed --- /dev/null +++ b/docs/HiveMind Task.md @@ -0,0 +1,93 @@ +# HiveMind Ensemble (Swarm/Fusion) Implementation + +## Phase 1: Core Infrastructure +- [x] Design and Plan + - [x] Explore codebase + - [x] Create comprehensive implementation plan +- [x] Create `src/rotator_library/ensemble_manager.py` + - [x] Define `EnsembleManager` class skeleton + - [x] Implement config loading and validation + - [x] Implement `is_ensemble()` detection + - [x] Implement conflict resolution for naming +- [x] Modify `src/rotator_library/client.py` + - [x] Initialize `EnsembleManager` in `__init__` + - [x] Integrate into `acompletion()` dispatcher + - [x] Add logging for HiveMind operations +- [x] Create `ensemble_config.json` + - [x] Define schema for Fusions + - [x] Define schema for Swarm defaults + - [x] Define arbitration strategies + +## Phase 2: Basic Swarm Mode +- [x] Implement Swarm Features + - [x] `_prepare_drones()` - basic cloning + - [x] `_execute_parallel()` - asyncio.gather + - [x] `_format_for_arbiter()` - response aggregation + - [x] `_build_arbiter_prompt()` - synthesis strategy + - [x] `_call_arbiter()` - judge execution +- [x] Testing + - [x] Test basic 3-drone swarm + - [x] Test arbiter synthesis + - [x] Test partial failures + +## Phase 3: Advanced Swarm Features +- [x] Temperature Jitter + - [x] Implement jitter logic + - [x] Test randomness and clamping +- [x] Adversarial Mode + - [x] Implement adversarial prompt injection + - [x] Test with configurable count +- [x] Blind Switch + - [x] Implement response anonymization + - [x] Test with blind=true/false +- [ ] Confidence Scoring (Moved to Recursive Mode) + - [ ] Implement score extraction + - [ ] Add logging for scores + +## Phase 4: Fusion Mode +- [/] Implement Fusion Features + - [x] `_prepare_models()` - multi-model setup (implemented as `_prepare_fusion_models`) + - [x] Role assignment and prompts + - [x] 
Role context for Arbiter (Labels implemented, but explicit expertise context block missing) + - [x] Weight system (Weights parsed but not used in arbiter context) +- [ ] Testing + - [ ] Test 2-model fusion + - [ ] Test role context injection + - [ ] Test specialist descriptions + +## Phase 5: Recursive/Reflective Mode +- [x] Implement Recursion (Single-Call Autonomous Mode) + - [x] Consensus check logic (via Prompt & Stream Parsing) + - [x] Conflict extraction (via Stream Parsing) + - [x] `_trigger_round_2()` implementation (Replaced by Autonomous Decision Protocol) + - [x] Max rounds enforcement (N/A for Single Call) +- [ ] Testing + - [ ] Test low-confidence trigger + - [ ] Test Round 2 critique + - [ ] Test final re-synthesis + +## Phase 6: Polish & Edge Cases +- [ ] Error Handling + - [x] Partial failure handling + - [ ] Arbiter failure fallback + - [x] Infinite recursion prevention (N/A) +- [ ] Performance + - [x] Latency logging + - [x] Token usage tracking + - [x] Rate limit mitigation (Inherited from RotatingClient) +- [x] Documentation + - [x] User guide + - [x] Example configs + - [x] API reference + +## Verification +- [ ] Automated Tests + - [ ] test_ensemble_manager.py (all 8 test cases) + - [ ] test_swarm_logic.py + - [ ] test_fusion_logic.py + - [ ] test_recursion.py +- [ ] Manual Tests + - [ ] Scenario 1: Simple Swarm + - [ ] Scenario 2: Adversarial Swarm + - [ ] Scenario 3: Fusion with Roles + - [ ] Scenario 4: Recursive Refinement diff --git a/docs/HiveMind_API.md b/docs/HiveMind_API.md new file mode 100644 index 0000000..0ada7c3 --- /dev/null +++ b/docs/HiveMind_API.md @@ -0,0 +1,554 @@ +# HiveMind Ensemble API Reference + +## EnsembleManager + +Main class for orchestrating HiveMind Ensemble requests. + +### `__init__(rotating_client, config_dir=None)` + +Initialize the ensemble manager. 
+ +**Parameters:** +- `rotating_client` (RotatingClient): Reference to the RotatingClient instance +- `config_dir` (str, optional): Path to ensemble_configs directory. Defaults to `src/rotator_library/ensemble_configs` + +**Example:** +```python +client = RotatingClient() +# EnsembleManager is automatically initialized +manager = client.ensemble_manager +``` + +### `is_ensemble(model_id: str) -> bool` + +Check if a model ID represents an ensemble request. + +**Parameters:** +- `model_id` (str): Full model ID from user request + +**Returns:** +- `bool`: True if ensemble (swarm or fusion), False otherwise + +**Example:** +```python +manager.is_ensemble("gpt-4o[swarm]") # True +manager.is_ensemble("dev-team") # True +manager.is_ensemble("gpt-4o") # False +``` + +### `get_base_model(swarm_id: str) -> tuple` + +Extract base model name and preset ID from swarm ID. + +**Parameters:** +- `swarm_id` (str): Swarm model ID (e.g., "gpt-4o-aggressive[swarm]", "gpt-4o[swarm]") + +**Returns:** +- `tuple`: (base_model_name, preset_id) + - For `"gpt-4o-aggressive[swarm]"` returns `("gpt-4o", "aggressive")` + - For `"gpt-4o[swarm]"` returns `("gpt-4o", "default")` or omit_id preset + +**Example:** +```python +base, preset = manager.get_base_model("gpt-4o-aggressive[swarm]") +# base = "gpt-4o", preset = "aggressive" + +base, preset = manager.get_base_model("gpt-4o[swarm]") +# base = "gpt-4o", preset = "default" or omit_id preset for gpt-4o +``` + +### `get_fusion_ids() -> List[str]` + +Get list of all configured fusion IDs. + +**Returns:** +- `List[str]`: List of fusion identifiers + +**Example:** +```python +fusion_ids = manager.get_fusion_ids() # ["dev-team", "creative-writers"] +``` + +### `handle_request(request, **kwargs) -> Response | AsyncGenerator` + +Main entry point for ensemble execution. + +**Parameters:** +- `request`: Original request object +- `**kwargs`: Request parameters (model, messages, stream, etc.) 
+ +**Returns:** +- `Response`: Complete response (if stream=False) +- `AsyncGenerator`: Streaming response generator (if stream=True) + +**Example:** +```python +# Non-streaming +response = await client.acompletion( + model="gpt-4o[swarm]", + messages=[{"role": "user", "content": "Test"}], + stream=False +) + +# Streaming +async for chunk in client.acompletion( + model="gpt-4o[swarm]", + messages=[{"role": "user", "content": "Test"}], + stream=True +): + print(chunk) +``` + +--- + +## ConfigLoader + +Manages configuration loading for ensemble modes. + +### `load_all() -> None` + +Load all configurations from directory structure. + +**Side Effects:** +- Populates `swarm_default`, `swarm_configs`, `fusion_configs`, `strategies` + +### `get_swarm_config(preset_id: str) -> Dict[str, Any]` + +Get swarm configuration for a specific preset. + +**Parameters:** +- `preset_id` (str): Preset ID (e.g., "default", "aggressive") + +**Returns:** +- `Dict[str, Any]`: Preset configuration + +### `get_preset_for_model(base_model: str) -> str` + +Get the preset ID to use when calling `model[swarm]` (short form). + +**Parameters:** +- `base_model` (str): Base model name (e.g., "gpt-4o-mini") + +**Returns:** +- `str`: Preset ID (omit_id preset for this model, or "default") + +**Example:** +```python +# If aggressive.json has omit_id=true and base_models=["gpt-4o-mini"] +preset = loader.get_preset_for_model("gpt-4o-mini") # "aggressive" + +# For models without omit_id preset +preset = loader.get_preset_for_model("claude-3-haiku") # "default" +``` + +### `get_fusion_config(fusion_id: str) -> Optional[Dict[str, Any]]` + +Get fusion configuration by ID. + +**Parameters:** +- `fusion_id` (str): Fusion identifier + +**Returns:** +- `Dict[str, Any]` | `None`: Fusion configuration or None if not found + +### `get_strategy(strategy_name: str) -> Optional[str]` + +Get strategy template by name. 
+ +**Parameters:** +- `strategy_name` (str): Strategy identifier + +**Returns:** +- `str` | `None`: Strategy template or None if not found + +### `get_all_fusion_ids() -> List[str]` + +Get list of all fusion IDs with [fusion] suffix. + +**Returns:** +- `List[str]`: List of fusion identifiers + +### `get_all_swarm_model_ids() -> List[str]` + +Get all discoverable swarm model variants for /v1/models endpoint. + +**Discovery Rules:** +- Preset WITH `base_models` + `omit_id: true` → `{model}[swarm]` +- Preset WITH `base_models` + `omit_id: false` → `{model}-{preset}[swarm]` +- Preset WITHOUT `base_models` → Not included (invisible +) + +**Returns:** +- `List[str]`: List of swarm model IDs for discovery + +**Example:** +```python +# With aggressive.json: {"omit_id": true, "base_models": ["gpt-4o-mini"]} +# With default.json: {"omit_id": false, "base_models": ["gpt-4o", "claude-3-haiku"]} + +swarm_ids = loader.get_all_swarm_model_ids() +# [ +# "gpt-4o-mini[swarm]", # From aggressive (omit_id=true) +# "gpt-4o-default[swarm]", # From default (omit_id=false) +# "claude-3-haiku-default[swarm]" # From default (omit_id=false) +# ] +``` + +--- + +## Response Object + +HiveMind responses follow the standard OpenAI response format with additional usage details. + +### `Response.usage` + +Usage statistics for the request. + +**Standard Fields (OpenAI-Compatible):** + +These fields contain the **complete aggregated totals** from all models (drones/specialists + arbiter). They are fully compatible with existing tooling and billing systems. 
+ +- `prompt_tokens` (int): **Total** prompt tokens from all models +- `completion_tokens` (int): **Total** completion tokens from all models +- `total_tokens` (int): **Total** tokens (sum of prompt + completion) +- `cached_tokens` (int, optional): **Total** cached tokens if supported +- `reasoning_tokens` (int, optional): **Total** reasoning tokens if supported + +**HiveMind Ensemble-Specific Fields (Supplementary):** + +- `hivemind_details` (dict): **Breakdown information** for observability (does NOT replace standard fields) + +**Important**: Always use the standard fields for billing, quotas, and analytics. They contain the correct aggregated totals. The `hivemind_details` provides additional context for debugging and understanding HiveMind execution. + +### `Response.usage.hivemind_details` + +Supplementary breakdown dictionary containing: + +**Common Fields:** +- `mode` (str): "swarm" or "fusion" +- `arbiter_tokens` (int): Tokens used by arbiter +- `total_cost_usd` (float): Estimated total cost in USD +- `latency_ms` (float): Total execution time in milliseconds + +**Swarm-Specific:** +- `drone_count` (int): Number of drones executed +- `drone_tokens` (int): Total tokens from all drones + +**Fusion-Specific:** +- `specialist_count` (int): Number of specialists executed +- `specialist_tokens` (int): Total tokens from all specialists + +**Example:** +```python +response = await client.acompletion(model="gpt-4o[swarm]", ...) 
+ +# Standard fields contain TOTAL aggregated usage +usage = response.usage +print(f"Total tokens: {usage.total_tokens}") # e.g., 650 (drones 450 + arbiter 200) +print(f"Prompt tokens: {usage.prompt_tokens}") # e.g., 400 (all models combined) +print(f"Completion tokens: {usage.completion_tokens}") # e.g., 250 (all models combined) + +# Supplementary breakdown for observability +details = usage.hivemind_details +print(f"Mode: {details['mode']}") # "swarm" +print(f"Drone count: {details['drone_count']}") # 3 +print(f"Drone tokens: {details['drone_tokens']}") # 450 (breakdown) +print(f"Arbiter tokens: {details['arbiter_tokens']}") # 200 (breakdown) +print(f"Cost: ${details['total_cost_usd']}") # 0.00123 +print(f"Latency: {details['latency_ms']}ms") # 1523.45 + +# Note: drone_tokens + arbiter_tokens = total_tokens +# The standard usage fields are what billing systems should use +``` + +--- + +## Configuration Schema + +### Swarm Configuration + +**File Location:** `ensemble_configs/swarms/{preset_id}.json` + +**Preset-Based System**: Each swarm preset defines behavior for multiple models via `base_models`. 
+ +**Schema:** +```json +{ + "id": "string (REQUIRED, preset identifier, must match filename)", + "description": "string (optional)", + + "base_models": [ + "string (model IDs for /v1/models discovery)" + ], + + "omit_id": "boolean (default: false, controls discovery format)", + "count": "integer (default: 3, number of drones)", + + "temperature_jitter": { + "enabled": "boolean", + "delta": "float (temperature variance, ±delta)" + }, + + "arbiter": { + "model": "string ('self' or model ID)", + "strategy": "string (strategy template name)", + "blind": "boolean (default: true, hides model names)" + }, + + "adversarial_config": { + "enabled": "boolean", + "count": "integer (number of adversarial drones)", + "prompt": "string (system prompt for adversarial drones)" + }, + + "recursive_mode": { + "enabled": "boolean", + "consensus_threshold": "integer (1-10 scale)" + } +} +``` + +**Key Fields:** +- `id`: Preset identifier, used in `{model}-{id}[swarm]` format +- `base_models`: OPTIONAL. Controls /v1/models discovery only. Does NOT restrict runtime usage. +- `omit_id`: OPTIONAL. 
If `true`, shows as `{model}[swarm]` in /v1/models (hides explicit format to reduce clutter) + +**Discovery vs Runtime:** +- **Discovery**: `base_models` and `omit_id` control what appears in /v1/models +- **Runtime**: Explicit format `{model}-{preset}[swarm]` works with ANY model/preset combo + +### Fusion Configuration + +**File Location:** `ensemble_configs/fusions/*.json` + +**Schema:** +```json +{ + "id": "string (unique fusion identifier)", + "description": "string (optional)", + + "specialists": [ + { + "model": "string (model ID)", + "role": "string (optional, specialist role name)", + "system_prompt": "string (optional, role-specific instructions)", + "weight": "float (optional, importance weight, default: 1.0)", + "weight_description": "string (optional, expertise description for arbiter)", + "role_template": "string (optional, reference to role template from roles/ directory)" + } + ], + + "arbiter": { + "model": "string (model ID)", + "strategy": "string (strategy name)", + "blind": "boolean (default: true)" + }, + + "recursive_mode": { + "enabled": "boolean", + "consensus_threshold": "integer (1-10 scale)" + } +} +``` + +### Strategy Template + +**File Location:** `ensemble_configs/strategies/*.txt` + +**Format:** +Plain text file with `{responses}` placeholder. + +**Example:** +``` +You are an expert synthesizer. Analyze the following responses and create a single, superior answer. + +{responses} + +Provide your synthesis as a complete, high-quality response. +``` + +--- + +## Error Handling + +### Common Exceptions + +**`ValueError`**: Invalid model ID or configuration +```python +try: + response = await client.acompletion(model="invalid-fusion", ...) +except ValueError as e: + print(f"Configuration error: {e}") +``` + +**`RuntimeError`**: All drones/specialists failed +```python +try: + response = await client.acompletion(model="gpt-4o[swarm]", ...) 
+except RuntimeError as e: + print(f"Execution error: {e}") +``` + +### Partial Failures + +If some drones/specialists fail but at least one succeeds, HiveMind continues with successful responses and logs warnings. + +**Logs:** +``` +[ERROR] [HiveMind] Drone 2/3 failed: Rate limit exceeded +[WARNING] [HiveMind] 1/3 drones failed. Proceeding with 2 successful responses. +``` + +--- + +## Logging + +HiveMind uses the `rotator_library.ensemble` logger. + +**Log Levels:** +- `INFO`: Normal operations (processing, completion) +- `DEBUG`: Detailed execution (temperatures, prompts) +- `WARNING`: Low consensus, partial failures, conflicts +- `ERROR`: Drone failures, critical issues + +**Example Configuration:** +```python +import logging + +# Enable HiveMind debug logging +logging.getLogger("rotator_library.ensemble").setLevel(logging.DEBUG) + +# Example logs: +# [INFO] [HiveMind] Processing Swarm request: gpt-4o[swarm] (base: gpt-4o, 3 drones, streaming: False) +# [DEBUG] [HiveMind] Drone 1: temperature=0.82, adversarial=False +# [DEBUG] [HiveMind] Arbiter prompt built: 2 messages +# [INFO] [HiveMind] Swarm completed successfully. Total usage: 650 tokens. Latency: 1234.56ms, Cost: $0.001200 +``` + +--- + +## Advanced Usage + +### Custom Arbiter Models + +Use different arbiter models for different fusions: + +```json +{ + "id": "research-team", + "specialists": [...], + "arbiter": { + "model": "gpt-4o", // Use GPT-4o specifically + "strategy": "synthesis" + } +} +``` + +### Self-Arbiter + +Use the same model as arbiter (saves one API call): + +```json +{ + "arbiter": { + "model": "self", // Use base model as arbiter + "strategy": "best_of_n" + } +} +``` + +### Multiple Strategies + +Create task-specific strategies: + +**`ensemble_configs/strategies/math_solver.txt`:** +``` +You are a mathematics expert. Review these solutions: + +{responses} + +Identify the correct approach, verify calculations, and provide the final answer with step-by-step explanation. 
+``` + +Usage: +```json +{ + "arbiter": { + "strategy": "math_solver" + } +} +``` + +--- + +## Migration Guide + +### From Single Model to Swarm + +**Before:** +```python +response = await client.acompletion( + model="gpt-4o-mini", + messages=[{"role": "user", "content": "Explain AI"}] +) +``` + +**After:** +```python +response = await client.acompletion( + model="gpt-4o-mini[swarm]", # Add [swarm] suffix + messages=[{"role": "user", "content": "Explain AI"}] +) +``` + +### From Multiple Calls to Fusion + +**Before:** +```python +arch_response = await client.acompletion(model="gpt-4o", ...) +sec_response = await client.acompletion(model="claude-3-opus", ...) +# Manually combine responses +``` + +**After:** +Create fusion config, then: +```python +response = await client.acompletion( + model="dev-team", # All in one call + messages=[...] +) +``` + +--- + +## Performance Metrics + +Typical latencies (3 drones/specialists, non-streaming): + +| Model Type | Drones/Specialists | Avg Latency | +|------------|-------------------|-------------| +| gpt-4o-mini[swarm] | 3 | 1.2-2.0s | +| gpt-4o[swarm] | 3 | 2.0-3.5s | +| dev-team (fusion) | 3 | 2.5-4.0s | + +**Note**: Streaming reduces perceived latency as arbiter output begins immediately after drone/specialist completion. + +--- + +## Limitations + +1. **Cost**: Multiple API calls increase costs proportionally +2. **Rate Limits**: May hit rate limits faster with parallel calls +3. **Latency**: Total time = max(drone time) + arbiter time +4. **Model Availability**: All models must be available simultaneously +5. 
**Token Limits**: Large responses may exceed context windows + +--- + +## Support + +For issues, questions, or feature requests: +- Check logs (`rotator_library.ensemble`) +- Review configuration files +- Verify API keys and model availability +- See [User Guide](./HiveMind_User_Guide.md) for common patterns diff --git a/docs/HiveMind_User_Guide.md b/docs/HiveMind_User_Guide.md new file mode 100644 index 0000000..3308c34 --- /dev/null +++ b/docs/HiveMind_User_Guide.md @@ -0,0 +1,445 @@ +# HiveMind Ensemble User Guide + +## Overview + +**HiveMind Ensemble** is a powerful feature that enables parallel model execution with intelligent arbitration. It supports two modes: + +- **Swarm Mode**: Multiple parallel calls to the **same model** (called "Drones") +- **Fusion Mode**: Multiple parallel calls to **different models** (called "Specialists") + +Both modes use an "Arbiter" model to synthesize the responses into a single, high-quality answer. + +--- + +## Quick Start + +### Swarm Mode + +Call the same model multiple times in parallel and synthesize results: + +```python +from rotator_library.client import RotatingClient + +client = RotatingClient() + +# Short form - uses preset with omit_id=true or default preset +response = await client.acompletion( + model="gpt-4o-mini[swarm]", + messages=[{"role": "user", "content": "What is quantum computing?"}], + stream=False +) + +# Explicit preset format - works with ANY model + ANY preset +response = await client.acompletion( + model="claude-3-haiku-aggressive[swarm]", # Use 'aggressive' preset + messages=[{"role": "user", "content": "What is quantum computing?"}], + stream=False +) + +print(response.choices[0].message.content) +print(f"Total tokens: {response.usage.total_tokens}") +print(f"Drone count: {response.usage.hivemind_details['drone_count']}") +print(f"Cost: ${response.usage.hivemind_details['total_cost_usd']}") +``` + +### Fusion Mode + +Use multiple specialized models working together: + +```python +# dev-team 
fusion uses 3 specialist models +response = await client.acompletion( + model="dev-team", + messages=[{"role": "user", "content": "Review this function"}], + stream=False +) + +print(response.choices[0].message.content) +print(f"Specialists: {response.usage.hivemind_details['specialist_count']}") +``` + +--- + +## Swarm Mode + +### How It Works + +1. **Preparation**: Creates N copies of your request (N drones) +2. **Execution**: Runs all drones in parallel +3. **Arbitration**: An arbiter model synthesizes all responses +4. **Result**: Returns the arbiter's synthesis + +### Preset-Based System + +Swarms use a **preset-based configuration** system. Each preset is a JSON file in `ensemble_configs/swarms/` that defines behavior for multiple models. + +**Model Name Formats**: +- **Short form**: `{model}[swarm]` → uses preset with `omit_id: true` OR `default` preset +- **Explicit form**: `{model}-{preset}[swarm]` → always uses specified preset + +**Examples**: +```python +# Short form +await client.acompletion(model="gpt-4o-mini[swarm]", ...) # Uses omit_id preset or default + +# Explicit form +await client.acompletion(model="gpt-4o-mini-aggressive[swarm]", ...) # Uses aggressive preset +await client.acompletion(model="claude-3-haiku-default[swarm]", ...) 
# Explicit default +``` + +**Key Features**: +- **`base_models`**: Controls /v1/models discovery (which models appear for this preset) +- **`omit_id`**: Controls discovery format (short vs explicit in /v1/models) +- **Runtime**: Explicit format works with ANY model/preset combo regardless of base_models + +### Configuration + +Swarm presets in `src/rotator_library/ensemble_configs/swarms/`: + +**`default.json`** - Global fallback: +```json +{ + "id": "default", + "description": "Standard balanced settings", + "base_models": [ + "gpt-4o", "gpt-4o-mini", + "claude-3-5-sonnet", "claude-3-haiku", + "gemini-1.5-pro", "gemini-1.5-flash" + ], + "omit_id": false, + "count": 3, + "temperature_jitter": { + "enabled": true, + "delta": 0.2 + }, + "arbiter": { + "model": "self", + "strategy": "synthesis", + "blind": true + }, + "adversarial_config": { + "enabled": false, + "count": 1, + "prompt": "You are a critical reviewer..." + }, + "recursive_mode": { + "enabled": false, + "consensus_threshold": 7 + } +} +``` + +**Custom preset** (e.g., `aggressive.json`): +```json +{ + "id": "aggressive", + "base_models": ["gpt-4o-mini", "gemini-1.5-flash"], + "omit_id": true, // Shows as model[swarm] in /v1/models + "count": 5, + "temperature_jitter": { + "enabled": true, + "delta": 0.3 + }, + "adversarial_config": { + "enabled": true, + "count": 2 + } +} +``` + +### Advanced Features + +#### Temperature Jitter + +Introduces randomness to increase response diversity: + +```json +"temperature_jitter": { + "enabled": true, + "delta": 0.2 // ±0.2 variance +} +``` + +Each drone gets a slightly different temperature: `base_temp ± delta` + +#### Adversarial Mode + +Converts the last N drones to critical reviewers: + +```json +"adversarial_config": { + "enabled": true, + "count": 1, + "prompt": "You are a Senior Principal Engineer. Find flaws, edge cases, and potential issues." 
+} +``` + +#### Blind Switch + +Hides model names from arbiter (enabled by default): + +```json +"arbiter": { + "blind": true // Arbiter sees "Response 1" instead of "Response 1 (GPT-4o)" +} +``` + +#### Recursive Mode + +Enables autonomous arbiter critique for low-consensus responses: + +```json +"recursive_mode": { + "enabled": true, + "consensus_threshold": 7 // If consensus < 7/10, performs internal critique +} +``` + +#### Discovery vs Runtime + +**Discovery (/ v1/models endpoint)**: +- Preset WITH `base_models` + `omit_id: true` → `{model}[swarm]` +- Preset WITH `base_models` + `omit_id: false` → `{model}-{preset}[swarm]` +- Preset WITHOUT `base_models` → Not shown (invisible) + +**Runtime (actual API calls)**: +- Short form `model[swarm]` → Uses omit_id preset OR default +- Explicit form `model-preset[swarm]` → ALWAYS works with ANY model/preset combo +- `base_models` has NO runtime restrictions + +--- + +## Fusion Mode + +### How It Works + +1. **Preparation**: Assigns role-specific prompts to each specialist +2. **Execution**: Runs all specialists in parallel +3. **Arbitration**: Arbiter synthesizes with role context +4. 
**Result**: Returns the arbiter's synthesis + +### Configuration + +Fusion models are configured in `src/rotator_library/ensemble_configs/fusions/`: + +**`dev-team.json`** - Example fusion: +```json +{ + "id": "dev-team", + "description": "Software development team with specialized roles", + "specialists": [ + { + "model": "gpt-4o", + "role": "Architect", + "system_prompt": "Focus on architectural patterns, scalability, and system design.", + "weight": 1.5 + }, + { + "model": "claude-3-opus", + "role": "Security Specialist", + "system_prompt": "Focus on security vulnerabilities and potential exploits.", + "weight": 1.0 + }, + { + "model": "gemini-1.5-pro", + "role": "Code Reviewer", + "system_prompt": "Focus on code quality, performance, and best practices.", + "weight": 1.0 + } + ], + "arbiter": { + "model": "gpt-4o", + "strategy": "synthesis", + "blind": true + } +} +``` + +### Creating Custom Fusions + +1. Create a new JSON file in `ensemble_configs/fusions/` +2. Define specialists with roles and prompts +3. Choose an arbiter model and strategy +4. Use the fusion ID as the model name + +Example: `creative-writers.json`: +```json +{ + "id": "creative-writers", + "description": "Creative writing team", + "specialists": [ + { + "model": "claude-3-opus", + "role": "Storyteller", + "system_prompt": "Focus on narrative, character development, and plot.", + "weight": 1.5 + }, + { + "model": "gpt-4o", + "role": "Editor", + "system_prompt": "Focus on clarity, grammar, and style.", + "weight": 1.0 + } + ], + "arbiter": { + "model": "gpt-4o", + "strategy": "synthesis" + } +} +``` + +Usage: +```python +response = await client.acompletion( + model="creative-writers", + messages=[{"role": "user", "content": "Write a short story about AI"}] +) +``` + +--- + +## Arbitration Strategies + +Strategies are text prompts in `ensemble_configs/strategies/`: + +**`synthesis.txt`** - Combine all responses: +``` +You are an expert synthesizer. 
Analyze the following responses and create a single, superior answer that: +1. Combines the best elements from each response +2. Resolves any conflicts or contradictions +3. Ensures completeness and accuracy +4. Maintains coherence and clarity + +{responses} +``` + +**`best_of_n.txt`** - Select and refine the best: +``` +Review these responses and identify the strongest one. Then refine and enhance it. + +{responses} +``` + +**`code_review.txt`** - Code-specific evaluation: +``` +You are a senior code reviewer. Analyze these code responses and provide: +1. Best implementation approach +2. Security considerations +3. Performance optimization suggestions +4. Final recommended code + +{responses} +``` + +### Creating Custom Strategies + +Create a `.txt` file in `ensemble_configs/strategies/` with your prompt template. Use `{responses}` as a placeholder for the formatted responses. + +--- + +## Streaming Support + +HiveMind respects the `stream` parameter: + +```python +# Streaming swarm +async for chunk in client.acompletion( + model="gpt-4o[swarm]", + messages=[{"role": "user", "content": "Explain AI"}], + stream=True # Stream arbiter's response +): + if hasattr(chunk.choices[0].delta, 'content') and chunk.choices[0].delta.content: + print(chunk.choices[0].delta.content, end='', flush=True) +``` + +**Note**: Drones/specialists execute in parallel (not streamed). Only the arbiter's final synthesis is streamed. 
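The chunk-handling pattern above can be wrapped in a small helper. Below is a minimal standalone sketch: `collect_stream` is a hypothetical convenience function (not part of the library), and `fake_stream` mocks OpenAI-style chunk objects with `SimpleNamespace` so the snippet runs without a live client.

```python
import asyncio
from types import SimpleNamespace

async def collect_stream(stream) -> str:
    """Accumulate the arbiter's streamed content deltas into one string."""
    parts = []
    async for chunk in stream:
        delta = chunk.choices[0].delta
        # Deltas may omit `content` (e.g. role-only or final chunks), so guard it
        if getattr(delta, "content", None):
            parts.append(delta.content)
    return "".join(parts)

# Mocked chunk stream standing in for client.acompletion(..., stream=True)
async def fake_stream():
    for text in ["Synthesized ", "answer", None]:
        yield SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

print(asyncio.run(collect_stream(fake_stream())))
# → Synthesized answer
```

With the real client, the same helper would simply be fed the async generator returned by `client.acompletion(..., stream=True)`.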
+ +--- + +## Usage & Cost Tracking + +All HiveMind responses include detailed usage information in **standard OpenAI-compatible fields** plus additional HiveMind-specific breakdown: + +```python +response = await client.acompletion( + model="gpt-4o-mini[swarm]", + messages=[{"role": "user", "content": "Test"}] +) + +# ✅ STANDARD usage fields (compatible with all tooling) +# These contain the TOTAL aggregated usage (drones/specialists + arbiter) +print(f"Prompt tokens: {response.usage.prompt_tokens}") # Total from all models +print(f"Completion tokens: {response.usage.completion_tokens}") # Total from all models +print(f"Total tokens: {response.usage.total_tokens}") # Grand total + +# ✅ SUPPLEMENTARY HiveMind details (breakdown for observability) +# These provide additional context but do NOT replace standard fields +details = response.usage.hivemind_details +print(f"Mode: {details['mode']}") # "swarm" or "fusion" +print(f"Drone/Specialist count: {details.get('drone_count') or details.get('specialist_count')}") +print(f"Drone/Specialist tokens: {details.get('drone_tokens') or details.get('specialist_tokens')}") +print(f"Arbiter tokens: {details['arbiter_tokens']}") +print(f"Total cost: ${details['total_cost_usd']}") +print(f"Latency: {details['latency_ms']}ms") +``` + +**Important**: Consumers should use the standard usage fields (`prompt_tokens`, `completion_tokens`, `total_tokens`) for billing and analytics. These already include the complete totals. The `hivemind_details` field provides a breakdown for debugging and observability. 
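Because the standard fields already carry the complete totals, billing or analytics code never needs to re-derive them from `hivemind_details`. A minimal sketch of that aggregation, with `SimpleNamespace` mocks standing in for real response objects (`aggregate_usage` and `mock_response` are hypothetical helpers, not library APIs):

```python
from types import SimpleNamespace

def aggregate_usage(responses):
    """Sum the standard OpenAI-compatible usage fields across responses.

    The standard fields already include drone/specialist AND arbiter tokens,
    so no per-component math from hivemind_details is needed for billing.
    """
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for r in responses:
        totals["prompt_tokens"] += r.usage.prompt_tokens
        totals["completion_tokens"] += r.usage.completion_tokens
        totals["total_tokens"] += r.usage.total_tokens
    return totals

def mock_response(prompt, completion):
    # Stand-in for a real acompletion() result
    return SimpleNamespace(usage=SimpleNamespace(
        prompt_tokens=prompt,
        completion_tokens=completion,
        total_tokens=prompt + completion,
    ))

print(aggregate_usage([mock_response(400, 250), mock_response(120, 80)]))
# → {'prompt_tokens': 520, 'completion_tokens': 330, 'total_tokens': 850}
```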
+
+---
+
+## Best Practices
+
+### Model Selection
+
+**Swarm Mode**:
+- Use for: Same model, different parameters (temperature jitter)
+- Best for: Brainstorming, diverse perspectives, consensus building
+- Models: Fast models (gpt-4o-mini, gemini-flash) for cost efficiency
+
+**Fusion Mode**:
+- Use for: Different models, specialized expertise
+- Best for: Complex tasks requiring multiple skill sets
+- Models: Mix strengths (GPT for reasoning, Claude for safety, Gemini for code)
+
+### Cost Optimization
+
+1. **Use smaller models for drones**: `gpt-4o-mini[swarm]` instead of `gpt-4o[swarm]`
+2. **Limit drone count**: Default is 3, but 2 is often sufficient
+3. **Use "self" arbiter**: Saves one API call
+4. **Monitor `hivemind_details`**: Track costs per request
+
+### Performance Tips
+
+1. **Parallel execution is fast**: All drones/specialists run simultaneously
+2. **Streaming reduces perceived latency**: Users see output immediately
+3. **Check `latency_ms`**: Identify slow requests
+
+---
+
+## Troubleshooting
+
+### No ensemble detected
+
+**Problem**: Model isn't recognized as an ensemble
+**Solution**: Check spelling, ensure the `[swarm]` suffix or fusion ID exists
+
+### All drones failed
+
+**Problem**: All parallel calls failed
+**Solution**: Check API keys, rate limits, model availability
+
+### High costs
+
+**Problem**: HiveMind is expensive
+**Solution**: Reduce drone count, use smaller models, limit to critical requests
+
+### Poor synthesis quality
+
+**Problem**: Arbiter output isn't good
+**Solution**: Use a better arbiter model (gpt-4o, claude-3-opus), try a different strategy
+
+---
+
+## API Reference
+
+See [API.md](./API.md) for detailed API documentation.
diff --git a/src/rotator_library/README.md b/src/rotator_library/README.md index c020799..08ef605 100644 --- a/src/rotator_library/README.md +++ b/src/rotator_library/README.md @@ -4,6 +4,13 @@ A robust, asynchronous, and thread-safe Python library for managing a pool of AP ## Key Features +- **HiveMind Ensemble**: Parallel model execution with intelligent arbitration + - **Swarm Mode**: Execute the same model multiple times with temperature jitter, adversarial prompts, and synthesis + - **Fusion Mode**: Combine responses from different specialized models with role-based routing + - **Recursive Refinement**: Autonomous low-consensus handling with internal critique reasoning + - **Configurable Strategies**: Customizable arbitration strategies for different use cases + - **Role Templates**: Reusable specialist role definitions for consistent fusion configurations + - **Blind Mode**: Option to hide model names from arbiter to reduce bias - **Asynchronous by Design**: Built with `asyncio` and `httpx` for high-performance, non-blocking I/O. - **Advanced Concurrency Control**: A single API key can be used for multiple concurrent requests. By default, it supports concurrent requests to *different* models. With configuration (`MAX_CONCURRENT_REQUESTS_PER_KEY_`), it can also support multiple concurrent requests to the *same* model using the same key. - **Smart Key Management**: Selects the optimal key for each request using a tiered, model-aware locking strategy to distribute load evenly and maximize availability. 
@@ -136,6 +143,33 @@ async def stream_example(): asyncio.run(stream_example()) ``` +**HiveMind Ensemble Example:** + +```python +async def hivemind_example(): + async with RotatingClient(api_keys=api_keys) as client: + # Swarm Mode: Multiple parallel calls to same model + swarm_response = await client.acompletion( + model="gpt-4o-mini-default[swarm]", + messages=[{"role": "user", "content": "Explain quantum computing"}] + ) + print(swarm_response.choices[0].message.content) + print(f"Total tokens: {swarm_response.usage.total_tokens}") + print(f"Drones: {swarm_response.usage.hivemind_details['drone_count']}") + + # Fusion Mode: Multiple specialist models + fusion_response = await client.acompletion( + model="dev-team[fusion]", + messages=[{"role": "user", "content": "Review this API design"}] + ) + print(fusion_response.choices[0].message.content) + print(f"Specialists: {fusion_response.usage.hivemind_details['specialist_count']}") + +asyncio.run(hivemind_example()) +``` + +See the [HiveMind User Guide](../../docs/HiveMind_User_Guide.md) and [API Reference](../../docs/HiveMind_API.md) for detailed configuration options. + #### `async def aembedding(self, **kwargs) -> Any:` A wrapper around `litellm.aembedding` that provides the same key management and retry logic for embedding requests. diff --git a/src/rotator_library/client.py b/src/rotator_library/client.py index b1485d0..df8cfc7 100644 --- a/src/rotator_library/client.py +++ b/src/rotator_library/client.py @@ -33,6 +33,7 @@ from .credential_manager import CredentialManager from .background_refresher import BackgroundRefresher from .model_definitions import ModelDefinitions +from .ensemble import EnsembleManager class StreamedAPIError(Exception): @@ -128,6 +129,9 @@ def __init__( if max_val < 1: lib_logger.warning(f"Invalid max_concurrent for '{provider}': {max_val}. 
Setting to 1.") self.max_concurrent_requests_per_key[provider] = 1 + + # Initialize HiveMind ensemble manager + self.ensemble_manager = EnsembleManager(rotating_client=self) def _is_model_ignored(self, provider: str, model_id: str) -> bool: """ @@ -636,6 +640,15 @@ async def _execute_with_retry( kwargs = self._convert_model_params(**kwargs) # The main rotation loop. It continues as long as there are untried credentials and the global deadline has not been exceeded. + + # Resolve model ID early, before any credential operations + # This ensures consistent model ID usage for acquisition, release, and tracking + resolved_model = self._resolve_model_id(model, provider) + if resolved_model != model: + lib_logger.info(f"Resolved model '{model}' to '{resolved_model}'") + model = resolved_model + kwargs["model"] = model # Ensure kwargs has the resolved model for litellm + while ( len(tried_creds) < len(credentials_for_provider) and time.time() < deadline ): @@ -689,13 +702,8 @@ async def _execute_with_retry( provider_plugin = self._get_provider_instance(provider) - # Convert model name to ID if custom mapping exists - resolved_model = self._resolve_model_id(model, provider) - if resolved_model != model: - lib_logger.info(f"Resolved model '{model}' to '{resolved_model}'") - litellm_kwargs["model"] = resolved_model - # Update the model variable for subsequent logging - model = resolved_model + # Model ID is already resolved before the loop, and kwargs['model'] is updated. + # No further resolution needed here. 
# Apply model-specific options for custom providers if provider_plugin and hasattr(provider_plugin, "get_model_options"): @@ -996,6 +1004,14 @@ async def _streaming_acompletion_with_retry( consecutive_quota_failures = 0 + # Resolve model ID early, before any credential operations + # This ensures consistent model ID usage for acquisition, release, and tracking + resolved_model = self._resolve_model_id(model, provider) + if resolved_model != model: + lib_logger.info(f"Resolved model '{model}' to '{resolved_model}'") + model = resolved_model + kwargs["model"] = model # Ensure kwargs has the resolved model for litellm + try: while ( len(tried_creds) < len(credentials_for_provider) @@ -1071,13 +1087,8 @@ async def _streaming_acompletion_with_retry( provider_plugin = self._get_provider_instance(provider) - # Convert model name to ID if custom mapping exists - resolved_model = self._resolve_model_id(model, provider) - if resolved_model != model: - lib_logger.info(f"Resolved model '{model}' to '{resolved_model}'") - litellm_kwargs["model"] = resolved_model - # Update the model variable for subsequent logging - model = resolved_model + # Model ID is already resolved before the loop, and kwargs['model'] is updated. + # No further resolution needed here. # Apply model-specific options for custom providers if provider_plugin and hasattr( @@ -1606,8 +1617,15 @@ def acompletion( Returns: The completion response object, or an async generator for streaming responses, or None if all retries fail. 
""" - # Handle iflow provider: remove stream_options to avoid HTTP 406 model = kwargs.get("model", "") + + # Check if this is an ensemble request (HiveMind) + if model and self.ensemble_manager.is_ensemble(model): + lib_logger.debug(f"[HiveMind] Detected ensemble request: {model}") + # Delegate to ensemble manager + return self.ensemble_manager.handle_request(request=request, **kwargs) + + # Handle iflow provider: remove stream_options to avoid HTTP 406 provider = model.split("/")[0] if "/" in model else "" if provider == "iflow" and "stream_options" in kwargs: @@ -1755,7 +1773,9 @@ async def get_available_models(self, provider: str) -> List[str]: async def get_all_available_models( self, grouped: bool = True ) -> Union[Dict[str, List[str]], List[str]]: - """Returns a list of all available models, either grouped by provider or as a flat list.""" + """Returns a list of all available models, either grouped by provider or as a flat list. + + MISSING FEATURE FIX: Now includes HiveMind fusion models.""" lib_logger.info("Getting all available models...") all_providers = list(self.all_credentials.keys()) @@ -1772,6 +1792,19 @@ async def get_all_available_models( else: all_provider_models[provider] = result + # MISSING FEATURE FIX: Add HiveMind fusion models + if self.ensemble_manager: + fusion_ids = self.ensemble_manager.config_loader.get_all_fusion_ids() + if fusion_ids: + all_provider_models["hivemind_fusion"] = fusion_ids + lib_logger.info(f"Added {len(fusion_ids)} HiveMind fusion models") + + # Add HiveMind swarm models + swarm_models = self.ensemble_manager.config_loader.get_all_swarm_model_ids() + if swarm_models: + all_provider_models["hivemind_swarm"] = swarm_models + lib_logger.info(f"Added {len(swarm_models)} HiveMind swarm model variants") + lib_logger.info("Finished getting all available models.") if grouped: return all_provider_models diff --git a/src/rotator_library/ensemble/__init__.py b/src/rotator_library/ensemble/__init__.py new file mode 100644 index 
0000000..6dbd382 --- /dev/null +++ b/src/rotator_library/ensemble/__init__.py @@ -0,0 +1,9 @@ +""" +HiveMind Ensemble Module + +This module provides parallel model execution (Swarm/Fusion) with intelligent arbitration. +""" + +from .manager import EnsembleManager + +__all__ = ['EnsembleManager'] diff --git a/src/rotator_library/ensemble/config_loader.py b/src/rotator_library/ensemble/config_loader.py new file mode 100644 index 0000000..5719dd7 --- /dev/null +++ b/src/rotator_library/ensemble/config_loader.py @@ -0,0 +1,430 @@ +""" +Configuration loader for HiveMind ensemble configs. + +Loads and validates configurations from the ensemble_configs directory structure. +""" + +import os +import json +import logging +import copy +from pathlib import Path +from typing import Dict, List, Any, Optional + +lib_logger = logging.getLogger("rotator_library.ensemble") + + +class ConfigLoader: + """Loads and manages ensemble configurations from folder structure.""" + + def __init__(self, config_dir: str): + """ + Initialize the config loader. 
+ + Args: + config_dir: Path to ensemble_configs directory (relative to rotator_library) + """ + self.config_dir = Path(config_dir) + self.swarms_dir = self.config_dir / "swarms" + self.fusions_dir = self.config_dir / "fusions" + self.strategies_dir = self.config_dir / "strategies" + self.roles_dir = self.config_dir / "roles" + + # Loaded configurations + self.swarm_default: Optional[Dict[str, Any]] = None + self.swarm_configs: Dict[str, Dict[str, Any]] = {} + self.fusion_configs: Dict[str, Dict[str, Any]] = {} + self.strategies: Dict[str, str] = {} + self.role_templates: Dict[str, Dict[str, Any]] = {} + + # Track model -> preset mapping for omit_id presets + self.omit_id_presets: Dict[str, str] = {} # {"gpt-4o-mini": "aggressive"} + + def load_all(self) -> None: + """Load all configurations from the directory structure.""" + lib_logger.info("[HiveMind] Loading ensemble configurations...") + + # Create directories if they don't exist + self._ensure_directories() + + # Load swarm configurations + self._load_swarm_configs() + + # Load fusion configurations + self._load_fusion_configs() + + # Load strategy templates + self._load_strategies() + + # Load role templates + self._load_roles() + + # Count swarm presets (files in swarms directory) + swarm_preset_count = len(list(self.swarms_dir.glob("*.json"))) if self.swarms_dir.exists() else 0 + + lib_logger.info( + f"[HiveMind] Loaded {swarm_preset_count} swarm presets, " + f"{len(self.fusion_configs)} fusion configs, " + f"{len(self.strategies)} strategies, " + f"{len(self.role_templates)} roles" + ) + + def _ensure_directories(self) -> None: + """Create config directories if they don't exist.""" + for directory in [self.swarms_dir, self.fusions_dir, self.strategies_dir, self.roles_dir]: + directory.mkdir(parents=True, exist_ok=True) + + def _load_swarm_configs(self) -> None: + """Load swarm configurations from swarms/ directory. + + Only supports preset-based format with 'id' and 'base_models'. 
+ Also builds omit_id mapping for default preset resolution. + """ + if not self.swarms_dir.exists(): + lib_logger.warning(f"[HiveMind] Swarms directory not found: {self.swarms_dir}") + return + + # Load default.json first + default_path = self.swarms_dir / "default.json" + if default_path.exists(): + try: + with open(default_path, 'r', encoding='utf-8') as f: + self.swarm_default = json.load(f) + lib_logger.debug("[HiveMind] Loaded default swarm config") + except Exception as e: + lib_logger.error(f"[HiveMind] Failed to load default swarm config: {e}") + else: + lib_logger.warning("[HiveMind] No default swarm config found") + + # Build omit_id mapping: scan all presets with omit_id=true + for config_file in self.swarms_dir.glob("*.json"): + # Skip example files + if config_file.stem.endswith('.example'): + continue + + try: + with open(config_file, 'r', encoding='utf-8') as f: + config = json.load(f) + + preset_id = config.get("id") + omit_id = config.get("omit_id", False) + base_models = config.get("base_models", []) + + if preset_id and omit_id and base_models: + # Register this preset as the default for these models + for model in base_models: + if model in self.omit_id_presets: + lib_logger.warning( + f"[HiveMind] Model '{model}' already has omit_id preset '{self.omit_id_presets[model]}'. " + f"Overriding with '{preset_id}'" + ) + self.omit_id_presets[model] = preset_id + lib_logger.debug(f"[HiveMind] Registered '{model}[swarm]' -> preset '{preset_id}'") + + except Exception as e: + lib_logger.warning(f"Failed to process swarm config {config_file.name}: {e}") + + # All swarm configs now use preset-based format (id + base_models) + # Discovery is handled by get_all_swarm_model_ids() + # Individual preset configs loaded on-demand via get_swarm_config() + + def _load_fusion_configs(self) -> None: + """Load fusion configurations from fusions/ directory. + + Supports two formats: + 1. Single fusion: {"id": "...", "specialists": [...], ...} + 2. 
Multiple fusions: {"fusions": [{"id": "...", ...}, ...]} + """ + if not self.fusions_dir.exists(): + lib_logger.warning(f"[HiveMind] Fusions directory not found: {self.fusions_dir}") + return + + for config_file in self.fusions_dir.glob("*.json"): + # Skip example files + if config_file.stem.endswith('.example'): + continue + + try: + with open(config_file, 'r', encoding='utf-8') as f: + config = json.load(f) + + # Check if this is the new array format + if "fusions" in config: + # New format: {"fusions": [...]} + fusions_list = config.get("fusions", []) + if not isinstance(fusions_list, list): + lib_logger.warning( + f"[HiveMind] Config '{config_file.name}' has 'fusions' but it's not a list" + ) + continue + + for fusion in fusions_list: + self._register_fusion(fusion, config_file.name) + else: + # Old format: {"id": "...", "specialists": [...], ...} + self._register_fusion(config, config_file.name) + + except Exception as e: + lib_logger.error(f"[HiveMind] Failed to load fusion config '{config_file.name}': {e}") + + def _register_fusion(self, fusion: Dict[str, Any], source_file: str) -> None: + """Register a single fusion configuration.""" + fusion_id = fusion.get("id") + if not fusion_id: + lib_logger.warning( + f"[HiveMind] Fusion in '{source_file}' missing 'id' field" + ) + return + + # Check for duplicate IDs + if fusion_id in self.fusion_configs: + lib_logger.warning( + f"[HiveMind] Duplicate fusion ID '{fusion_id}'. " + f"Config from '{source_file}' will override previous." 
+ ) + + self.fusion_configs[fusion_id] = fusion + lib_logger.debug(f"[HiveMind] Loaded fusion config '{fusion_id}'") + + def _load_strategies(self) -> None: + """Load strategy templates from strategies/ directory.""" + if not self.strategies_dir.exists(): + lib_logger.warning(f"[HiveMind] Strategies directory not found: {self.strategies_dir}") + return + + for strategy_file in self.strategies_dir.glob("*.txt"): + # Skip example files + if strategy_file.stem.endswith('.example'): + continue + + try: + with open(strategy_file, 'r', encoding='utf-8') as f: + content = f.read() + + strategy_name = strategy_file.stem + self.strategies[strategy_name] = content + lib_logger.debug(f"[HiveMind] Loaded strategy '{strategy_name}'") + + except Exception as e: + lib_logger.error( + f"[HiveMind] Failed to load strategy '{strategy_file.name}': {e}" + ) + + def _load_roles(self) -> None: + """Load role templates from roles/ directory. + + Supports two formats: + 1. Single role: {"name": "...", "system_prompt": "...", ...} + 2. 
Multiple roles: {"roles": [{"name": "...", ...}, ...]} + """ + if not self.roles_dir.exists(): + lib_logger.warning(f"[HiveMind] Roles directory not found: {self.roles_dir}") + return + + for role_file in self.roles_dir.glob("*.json"): + # Skip example files + if role_file.stem.endswith('.example'): + continue + + try: + with open(role_file, 'r', encoding='utf-8') as f: + data = json.load(f) + + # Check if this is the new array format + if "roles" in data: + # New format: {"roles": [...]} + roles_list = data.get("roles", []) + if not isinstance(roles_list, list): + lib_logger.warning( + f"[HiveMind] Role file '{role_file.name}' has 'roles' but it's not a list" + ) + continue + + for role in roles_list: + self._register_role(role, role_file.name) + else: + # Old format: {"name": "...", "system_prompt": "...", ...} + # Use filename as role_id + role_id = role_file.stem + self.role_templates[role_id] = data + lib_logger.debug(f"[HiveMind] Loaded role template '{role_id}'") + + except Exception as e: + lib_logger.error( + f"[HiveMind] Failed to load role template '{role_file.name}': {e}" + ) + + def _register_role(self, role: Dict[str, Any], source_file: str) -> None: + """Register a single role template.""" + # Use 'name' field as role_id, convert to lowercase with hyphens + role_name = role.get("name") + if not role_name: + lib_logger.warning( + f"[HiveMind] Role in '{source_file}' missing 'name' field" + ) + return + + # Convert name to role_id (e.g., "Security Expert" -> "security-expert") + role_id = role_name.lower().replace(" ", "-") + + # Check for duplicate IDs + if role_id in self.role_templates: + lib_logger.warning( + f"[HiveMind] Duplicate role ID '{role_id}'. " + f"Role from '{source_file}' will override previous." 
+ ) + + self.role_templates[role_id] = role + lib_logger.debug(f"[HiveMind] Loaded role template '{role_id}' from array") + + def get_preset_for_model(self, base_model: str) -> str: + """ + Get the preset ID to use for a model when using model[swarm] syntax. + + Resolution order: + 1. If model has an omit_id preset, use that + 2. Otherwise, use "default" + + Args: + base_model: Base model name (e.g., "gpt-4o-mini") + + Returns: + Preset ID to use + """ + if base_model in self.omit_id_presets: + preset = self.omit_id_presets[base_model] + lib_logger.debug(f"[HiveMind] Model '{base_model}' using omit_id preset '{preset}'") + return preset + + lib_logger.debug(f"[HiveMind] Model '{base_model}' using default preset") + return "default" + + def get_swarm_config(self, preset_id: str) -> Dict[str, Any]: + """ + Get swarm configuration for a specific preset. + + Args: + preset_id: Preset ID (e.g., "default", "aggressive") + + Returns: + Configuration dictionary with defaults applied + """ + # Try to load preset config file + config_file = self.swarms_dir / f"{preset_id}.json" + + if not config_file.exists(): + lib_logger.warning(f"[HiveMind] Swarm preset '{preset_id}' not found") + # Return default config if available + return copy.deepcopy(self.swarm_default) if self.swarm_default else {} + + try: + with open(config_file, 'r', encoding='utf-8') as f: + config = json.load(f) + + # Validate it's a preset-based config + if "id" not in config or "base_models" not in config: + lib_logger.warning( + f"[HiveMind] Swarm config '{preset_id}' missing 'id' or 'base_models'" + ) + return copy.deepcopy(self.swarm_default) if self.swarm_default else {} + + return config + + except Exception as e: + lib_logger.error(f"[HiveMind] Failed to load swarm preset '{preset_id}': {e}") + return copy.deepcopy(self.swarm_default) if self.swarm_default else {} + + def get_fusion_config(self, fusion_id: str) -> Optional[Dict[str, Any]]: + """ + Get fusion configuration by ID. 
+ + Args: + fusion_id: Fusion identifier + + Returns: + Fusion configuration or None if not found + """ + return self.fusion_configs.get(fusion_id) + + def get_strategy(self, strategy_name: str) -> Optional[str]: + """ + Get strategy template by name. + + Args: + strategy_name: Strategy identifier + + Returns: + Strategy template string or None if not found + """ + return self.strategies.get(strategy_name) + + def get_role_template(self, role_id: str) -> Optional[Dict[str, Any]]: + """ + Get role template by ID. + + Args: + role_id: Role template identifier (e.g., "architect", "security-expert") + + Returns: + Role template dictionary or None if not found + """ + return self.role_templates.get(role_id) + + def get_all_fusion_ids(self) -> List[str]: + """Get list of all fusion IDs with [fusion] suffix.""" + return [f"{fusion_id}[fusion]" for fusion_id in self.fusion_configs.keys()] + + def get_all_swarm_model_ids(self) -> List[str]: + """ + Get all discoverable swarm model variants. + + Only includes presets with base_models defined. + Discovery format depends on omit_id: + - omit_id=true: Shows as {base_model}[swarm] (short form only) + - omit_id=false: Shows as {base_model}-{preset_id}[swarm] (explicit form only) + + Note: Explicit form always WORKS at runtime regardless of omit_id, + but omit_id controls what appears in /v1/models for discoverability. 
+ + Returns: + List of swarm model IDs for /v1/models endpoint + """ + swarm_models = [] + + for config_file in self.swarms_dir.glob("*.json"): + # Skip example files + if config_file.stem.endswith('.example'): + continue + + try: + with open(config_file, 'r', encoding='utf-8') as f: + config = json.load(f) + + preset_id = config.get("id") + base_models = config.get("base_models", []) + omit_id = config.get("omit_id", False) + + if not preset_id: + lib_logger.debug(f"Swarm config {config_file.name} missing 'id', skipping") + continue + + if not base_models: + lib_logger.debug(f"Swarm config {preset_id} has no base_models, not discoverable") + continue + + # Generate model IDs based on omit_id setting + for base_model in base_models: + if omit_id: + # Show short form only (to avoid clutter) + model_id = f"{base_model}[swarm]" + else: + # Show explicit form only + model_id = f"{base_model}-{preset_id}[swarm]" + + swarm_models.append(model_id) + + except Exception as e: + lib_logger.warning(f"Failed to process swarm config {config_file.name}: {e}") + + lib_logger.info(f"Discovered {len(swarm_models)} swarm model variants") + return swarm_models diff --git a/src/rotator_library/ensemble/manager.py b/src/rotator_library/ensemble/manager.py new file mode 100644 index 0000000..5823578 --- /dev/null +++ b/src/rotator_library/ensemble/manager.py @@ -0,0 +1,1438 @@ +""" +EnsembleManager - Core orchestration for HiveMind (Swarm/Fusion) feature. + +This module manages parallel model execution with intelligent arbitration. +""" + +import os +import logging +import asyncio +import random +import copy +import time +import re +from typing import Dict, List, Any, Optional, Set + +import litellm + +from .config_loader import ConfigLoader + +lib_logger = logging.getLogger("rotator_library.ensemble") + + +class EnsembleManager: + """ + Manages ensemble execution (Swarm and Fusion modes). 
+ + Responsibilities: + - Detect ensemble requests (swarm suffix or fusion ID) + - Load and manage configurations + - Handle naming conflicts + - Orchestrate parallel execution (implemented in later phases) + """ + + def __init__(self, rotating_client, config_dir: Optional[str] = None): + """ + Initialize the ensemble manager. + + Args: + rotating_client: Reference to RotatingClient for making API calls + config_dir: Path to ensemble_configs directory (relative to this file) + """ + self.rotating_client = rotating_client + + # Default config directory (relative to this file) + if config_dir is None: + config_dir = os.path.join( + os.path.dirname(__file__), + "..", + "ensemble_configs" + ) + + # Initialize config loader + self.config_loader = ConfigLoader(config_dir) + self.config_loader.load_all() + + # Cache for resolved ensemble names (for conflict resolution) + self._resolved_names: Dict[str, str] = {} + + # Cache for provider models (loaded from RotatingClient) + self._provider_models: Optional[Set[str]] = None + + # Initialize provider models + self._load_provider_models() + + lib_logger.info("[HiveMind] Ensemble Manager initialized") + + def is_ensemble(self, model_id: str) -> bool: + """ + Check if a model ID represents an ensemble request. 
+ + Args: + model_id: Full model ID from user request + + Returns: + True if this is an ensemble (swarm or fusion), False otherwise + """ + # BUGFIX: Check for conflict first (Provider Model Shadowing) + # If the model ID exists in provider models, it's NOT an ensemble request + # (unless we've already resolved it, but this check is for the raw request) + if self._provider_models is None: + self._load_provider_models() + + if model_id in self._provider_models: + return False + + # Check for fusion suffix + if model_id.endswith("[fusion]"): + return True + + # Check for swarm suffix + if self._is_swarm_request(model_id): + return True + + return False + + def _is_swarm_request(self, model_id: str) -> bool: + """ + Check if model ID contains swarm suffix. + + Supports new preset-based format: {base_model}-{preset_id}[swarm] + + Args: + model_id: Model ID to check + + Returns: + True if this is a swarm request + """ + return model_id.endswith("[swarm]") + + def get_base_model(self, swarm_id: str) -> tuple: + """ + Extract base model name and preset ID from swarm ID. 
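The suffix-parsing convention can be illustrated with a minimal standalone sketch. The `parse_swarm_id` helper and its `known_presets` set are hypothetical, for illustration only; the real implementation checks preset files on disk via `ConfigLoader` and honors `omit_id` presets:

```python
def parse_swarm_id(swarm_id: str, known_presets: set) -> tuple:
    """Split '{base_model}-{preset_id}[swarm]' into (base_model, preset_id)."""
    if swarm_id.endswith("[swarm]"):
        swarm_id = swarm_id[: -len("[swarm]")]
    if "-" in swarm_id:
        base, candidate = swarm_id.rsplit("-", 1)
        # Only treat the last segment as a preset if it is actually configured
        if candidate in known_presets:
            return base, candidate
    # No explicit preset: fall back to the default preset
    return swarm_id, "default"

print(parse_swarm_id("gpt-4o-mini-default[swarm]", {"default", "aggressive"}))
# → ('gpt-4o-mini', 'default')
```

Note that a base model containing hyphens (e.g. `gpt-4o-mini[swarm]`) parses correctly because the last segment is only consumed when it matches a known preset.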
+ + Supports formats: + - {base_model}-{preset_id}[swarm] → (base_model, preset_id) + - {base_model}[swarm] → (base_model, omit_id preset or "default") + + Args: + swarm_id: Swarm model ID (e.g., "gpt-4o-default[swarm]" or "gpt-4o[swarm]") + + Returns: + Tuple of (base_model_name, preset_id) + """ + # Remove [swarm] suffix first + if swarm_id.endswith("[swarm]"): + swarm_id = swarm_id[:-7] # Remove "[swarm]" + + # Parse: {base_model}-{preset_id} + # preset_id is the last segment after the last hyphen + if "-" in swarm_id: + # Split and check if last segment is a preset ID + parts = swarm_id.rsplit("-", 1) + potential_preset = parts[1] + + # Check if it's a valid preset ID in our configs + config_file = self.config_loader.swarms_dir / f"{potential_preset}.json" + if config_file.exists(): + # This is a preset ID, so base_model is everything before it + return parts[0], potential_preset + + # No explicit preset: use omit_id preset or default + base_model = swarm_id + preset_id = self.config_loader.get_preset_for_model(base_model) + + return base_model, preset_id + + def resolve_conflicts(self, ensemble_id: str) -> str: + """ + Resolve naming conflicts by appending numeric suffixes. + + If an ensemble ID conflicts with a real provider model, + append -1, -2, -3, etc. until unique. 
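In isolation, the numeric-suffix scheme amounts to the following sketch. The `resolve_name` helper is illustrative only; the real method additionally caches resolved names and bounds the search:

```python
def resolve_name(name: str, taken: set) -> str:
    """Append -1, -2, ... until the name no longer collides with a taken one."""
    if name not in taken:
        return name
    counter = 1
    while f"{name}-{counter}" in taken:
        counter += 1
    return f"{name}-{counter}"

print(resolve_name("gpt-4o[fusion]", {"gpt-4o[fusion]", "gpt-4o[fusion]-1"}))
# → gpt-4o[fusion]-2
```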
+
+        Args:
+            ensemble_id: Original ensemble ID (swarm or fusion)
+
+        Returns:
+            Resolved unique ensemble ID
+        """
+        # Check cache first
+        if ensemble_id in self._resolved_names:
+            return self._resolved_names[ensemble_id]
+
+        # Load provider models if not cached
+        if self._provider_models is None:
+            self._load_provider_models()
+
+        # Check for conflict
+        if ensemble_id not in self._provider_models:
+            # No conflict, use original
+            self._resolved_names[ensemble_id] = ensemble_id
+            return ensemble_id
+
+        # Conflict detected, search for an available suffix (bounded at 100 attempts)
+        counter = 1
+        while counter <= 100:
+            candidate = f"{ensemble_id}-{counter}"
+            if candidate not in self._provider_models:
+                lib_logger.warning(
+                    f"[HiveMind] Naming conflict detected. "
+                    f"Renamed '{ensemble_id}' to '{candidate}'"
+                )
+                self._resolved_names[ensemble_id] = candidate
+                return candidate
+            counter += 1
+
+        # Safety fallback (shouldn't happen in practice)
+        lib_logger.error(
+            f"[HiveMind] Could not resolve naming conflict for '{ensemble_id}' "
+            f"after 100 attempts"
+        )
+        return f"{ensemble_id}-{counter}"
+
+    def _load_provider_models(self) -> None:
+        """
+        Load all provider models from RotatingClient.
+
+        This is used for conflict detection.
+        """
+        try:
+            self._provider_models = set()
+
+            # BUGFIX: Populate provider models from RotatingClient.model_definitions
+            if hasattr(self.rotating_client, 'model_definitions'):
+                defs = self.rotating_client.model_definitions.definitions
+                for provider, models in defs.items():
+                    for model_name in models.keys():
+                        self._provider_models.add(model_name)
+                        self._provider_models.add(f"{provider}/{model_name}")
+
+            lib_logger.debug(f"[HiveMind] Loaded {len(self._provider_models)} provider models for conflict detection")
+
+        except Exception as e:
+            lib_logger.error(f"[HiveMind] Failed to load provider models: {e}")
+            self._provider_models = set()
+
+    def get_fusion_ids(self) -> List[str]:
+        """
+        Get list of all configured fusion IDs.
+ + Returns: + List of fusion identifiers + """ + return self.config_loader.get_all_fusion_ids() + + def _prepare_drones( + self, + config: Dict[str, Any], + base_model: str, + request_params: Dict[str, Any] + ) -> List[Dict[str, Any]]: + """ + Prepare drone configurations for parallel execution. + + Creates N identical copies of the request parameters with the base model. + Advanced features (jitter, adversarial) will be added in Phase 4. + + Args: + config: Swarm configuration + base_model: Base model to use for all drones + request_params: Original request parameters + + Returns: + List of drone configurations ready for parallel execution + """ + count = config.get("count", 3) + drones = [] + + # Get temperature jitter config + temp_jitter_config = config.get("temperature_jitter", {}) + jitter_enabled = temp_jitter_config.get("enabled", False) + jitter_delta = temp_jitter_config.get("delta", 0.2) + + # Get adversarial config + adversarial_config = config.get("adversarial_config", {}) + adversarial_enabled = adversarial_config.get("enabled", False) + adversarial_count = adversarial_config.get("count", 1) + adversarial_prompt = adversarial_config.get("prompt", "") + + lib_logger.debug(f"[HiveMind] Preparing {count} drones for base model '{base_model}'") + if adversarial_enabled: + lib_logger.debug(f"[HiveMind] Adversarial mode enabled: {adversarial_count} critical drones") + + for i in range(count): + # Clone the request params + # BUGFIX: Use deepcopy to avoid shared mutable state + drone_params = copy.deepcopy(request_params) + + # Override model with base model (strip [swarm] suffix) + drone_params["model"] = base_model + + # Phase 4: Determine if this drone should be adversarial + # Last N drones become adversarial + is_adversarial = False + if adversarial_enabled and adversarial_prompt: + adversarial_start_index = count - adversarial_count + if i >= adversarial_start_index: + is_adversarial = True + + # Inject adversarial system prompt + if "messages" in 
drone_params: + # Insert adversarial system message at the beginning + adversarial_message = { + "role": "system", + "content": adversarial_prompt + } + drone_params["messages"].insert(0, adversarial_message) + + lib_logger.debug( + f"[HiveMind] Drone {i+1}/{count}: ADVERSARIAL - injected critical analysis prompt" + ) + + # Phase 4: Apply temperature jitter if enabled + if jitter_enabled: + base_temp = drone_params.get("temperature", 1.0) + + # Apply random jitter + jitter = random.uniform(-jitter_delta, jitter_delta) + new_temp = base_temp + jitter + + # Clamp to valid range [0.0, 2.0] + new_temp = max(0.0, min(2.0, new_temp)) + + drone_params["temperature"] = new_temp + + lib_logger.debug( + f"[HiveMind] Drone {i+1}/{count}: Applied temperature jitter " + f"({base_temp:.2f} → {new_temp:.2f}, delta: {jitter:+.2f})" + ) + + # Store drone metadata for logging + drone_params["_drone_index"] = i + 1 + drone_params["_total_drones"] = count + drone_params["_is_adversarial"] = is_adversarial + + drones.append(drone_params) + + temp_display = drone_params.get("temperature", "default") + if isinstance(temp_display, float): + temp_display = f"{temp_display:.2f}" + + lib_logger.debug( + f"[HiveMind] Drone {i+1}/{count}: model={base_model}, temp={temp_display}" + ) + + return drones + + def _prepare_fusion_models( + self, + config: Dict[str, Any], + request_params: Dict[str, Any] + ) -> List[Dict[str, Any]]: + """ + Prepare specialist model configurations for fusion execution. + + Each specialist model gets a role-specific system prompt and + processes the same user query. 
+ + Args: + config: Fusion configuration + request_params: Original request parameters + + Returns: + List of specialist model configurations with metadata + """ + specialists = config.get("specialists", []) + models = [] + + lib_logger.debug(f"[HiveMind] Preparing {len(specialists)} specialist models for fusion") + + for i, specialist in enumerate(specialists): + specialist_num = i + 1 + + # Resolve role template if specified + if "role_template" in specialist: + template_id = specialist["role_template"] + template = self.config_loader.get_role_template(template_id) + + if template: + # Merge template with specialist config (specialist overrides template) + specialist = {**template, **specialist} + lib_logger.debug(f"[HiveMind] Resolved role template '{template_id}' for specialist {specialist_num}") + else: + lib_logger.warning(f"[HiveMind] Role template '{template_id}' not found for specialist {specialist_num}") + + specialist_model = specialist.get("model") + specialist_role = specialist.get("role", specialist.get("name", f"Specialist {specialist_num}")) + specialist_prompt = specialist.get("system_prompt", "") + specialist_weight = specialist.get("weight", 1.0) + # MISSING FEATURE FIX: Extract weight description for arbiter context + specialist_weight_desc = specialist.get("weight_description", "") + + if not specialist_model: + lib_logger.warning( + f"[HiveMind] Specialist {specialist_num} missing model, skipping" + ) + continue + + # Clone request params + # BUGFIX: Use deepcopy + model_params = copy.deepcopy(request_params) + + # Set specialist model + model_params["model"] = specialist_model + + # Inject role-specific system prompt if provided + if specialist_prompt and "messages" in model_params: + role_message = { + "role": "system", + "content": specialist_prompt + } + model_params["messages"].insert(0, role_message) + + # Store specialist metadata + model_params["_specialist_index"] = specialist_num + model_params["_specialist_role"] = specialist_role + 
model_params["_specialist_weight"] = specialist_weight + model_params["_specialist_weight_description"] = specialist_weight_desc + model_params["_total_specialists"] = len(specialists) + + models.append(model_params) + + lib_logger.debug( + f"[HiveMind] Specialist {specialist_num}/{len(specialists)}: " + f"role={specialist_role}, model={specialist_model}, weight={specialist_weight}" + ) + + return models + + async def _execute_parallel( + self, + drones: List[Dict[str, Any]], + request: Any + ) -> tuple: + """ + Execute all drone requests in parallel. + + Uses asyncio.gather to execute all drones concurrently. + Aggregates usage statistics from all successful responses. + + Args: + drones: List of drone configurations + request: Original request object + + Returns: + Tuple of (successful_responses, aggregated_usage) + """ + lib_logger.info(f"[HiveMind] Executing {len(drones)} drones in parallel...") + + # Create tasks for all drones + tasks = [] + for i, drone_params in enumerate(drones): + # Call acompletion directly (will use RotatingClient's retry logic) + # Remove metadata fields before calling + clean_params = {k: v for k, v in drone_params.items() if not k.startswith('_')} + + task = self.rotating_client._execute_with_retry( + litellm.acompletion, # Use litellm.acompletion directly + request=request, + **clean_params + ) + tasks.append(task) + + # Execute all drones in parallel + results = await asyncio.gather(*tasks, return_exceptions=True) + + # Process results + successful_responses = [] + failed_count = 0 + aggregated_usage = {} + + for i, result in enumerate(results): + drone_index = i + 1 + + if isinstance(result, Exception): + # Drone failed + failed_count += 1 + lib_logger.error( + f"[HiveMind] Drone {drone_index}/{len(drones)} failed: {result}" + ) + continue + + # Drone succeeded + successful_responses.append(result) + + # Aggregate usage - dynamically sum ALL numeric usage fields + if hasattr(result, 'usage') and result.usage: + usage = 
result.usage
+
+                # Iterate through all attributes of the usage object
+                for attr_name in dir(usage):
+                    # Skip private/magic attributes
+                    if attr_name.startswith('_'):
+                        continue
+
+                    try:
+                        attr_value = getattr(usage, attr_name)
+
+                        # Only aggregate numeric fields (int or float)
+                        if isinstance(attr_value, (int, float)) and not isinstance(attr_value, bool):
+                            if attr_name not in aggregated_usage:
+                                aggregated_usage[attr_name] = 0
+                            aggregated_usage[attr_name] += attr_value
+                    except (AttributeError, TypeError):
+                        # Skip non-accessible or non-numeric attributes
+                        continue
+
+            lib_logger.debug(
+                f"[HiveMind] Drone {drone_index}/{len(drones)} completed successfully"
+            )
+
+        # Check if we have at least one successful response
+        if not successful_responses:
+            raise RuntimeError(
+                f"[HiveMind] All {len(drones)} drones failed. Cannot proceed with arbitration."
+            )
+
+        if failed_count > 0:
+            lib_logger.warning(
+                f"[HiveMind] {failed_count}/{len(drones)} drones failed. "
+                f"Proceeding with {len(successful_responses)} successful responses."
+            )
+
+        lib_logger.info(
+            f"[HiveMind] Parallel execution complete: {len(successful_responses)}/{len(drones)} succeeded. "
+            f"Total tokens: {aggregated_usage.get('total_tokens', 0)}"
+        )
+
+        return successful_responses, aggregated_usage
+
+    def _format_for_arbiter(
+        self,
+        responses: List[Any],
+        config: Dict[str, Any],
+        specialist_metadata: Optional[List[Dict[str, Any]]] = None
+    ) -> tuple:
+        """
+        Format drone/specialist responses for arbiter consumption.
+
+        Creates a structured text format with numbered responses.
+        Phase 4: Implements Blind Switch to strip model names.
+        Phase 5: Adds role labels for fusion specialists.
+        MISSING FEATURE FIX: Extracts specialist metadata for arbiter context.
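The dynamic usage aggregation in `_execute_parallel` can be exercised on its own. The `FakeUsage` class below is a test double standing in for a provider usage object, not a real litellm type:

```python
class FakeUsage:
    """Stand-in for a provider usage object with arbitrary numeric fields."""
    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)

def aggregate_usage(usages) -> dict:
    """Sum every public numeric attribute across a list of usage objects."""
    totals = {}
    for usage in usages:
        for attr_name in dir(usage):
            if attr_name.startswith('_'):
                continue
            attr_value = getattr(usage, attr_name)
            # Exclude bools: isinstance(True, int) is True in Python
            if isinstance(attr_value, (int, float)) and not isinstance(attr_value, bool):
                totals[attr_name] = totals.get(attr_name, 0) + attr_value
    return totals

print(aggregate_usage([
    FakeUsage(prompt_tokens=10, completion_tokens=5, total_tokens=15),
    FakeUsage(prompt_tokens=8, completion_tokens=4, total_tokens=12, reasoning_tokens=3),
]))
# → {'completion_tokens': 9, 'prompt_tokens': 18, 'total_tokens': 27, 'reasoning_tokens': 3}
```

Because fields are discovered with `dir()` rather than a fixed list, provider-specific counters (e.g. a `reasoning_tokens` field) are picked up automatically when only some responses report them.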
+ + Args: + responses: List of successful drone/specialist responses + config: Swarm or fusion configuration + specialist_metadata: Optional list of specialist metadata (for fusion mode) + + Returns: + Tuple of (formatted_text, metadata_for_arbiter) + metadata_for_arbiter is None for swarm mode, list of dicts for fusion mode + """ + lib_logger.debug(f"[HiveMind] Formatting {len(responses)} responses for arbiter") + + # Check if blind mode is enabled + arbiter_config = config.get("arbiter", {}) + blind_mode = arbiter_config.get("blind", True) # Default ON + + formatted_parts = [] + arbiter_metadata = [] # MISSING FEATURE FIX: Collect metadata for arbiter + + for i, response in enumerate(responses): + response_num = i + 1 + + # Extract content from response + content = "" + if hasattr(response, 'choices') and response.choices: + # Standard OpenAI-style response + choice = response.choices[0] + if hasattr(choice, 'message') and hasattr(choice.message, 'content'): + content = choice.message.content + elif hasattr(choice, 'text'): + content = choice.text + + if not content: + lib_logger.warning( + f"[HiveMind] Response {response_num} has no content, skipping" + ) + continue + + # Phase 5: Determine label (with fusion role support) + label = f"Response {response_num}" + + # Check if this is fusion mode with specialist metadata + if specialist_metadata and i < len(specialist_metadata): + specialist = specialist_metadata[i] + role = specialist.get("_specialist_role", "Unknown") + model_name = specialist.get("model", "unknown") + weight_desc = specialist.get("_specialist_weight_description", "") + + # MISSING FEATURE FIX: Build metadata for arbiter context + arbiter_metadata.append({ + "role": role, + "model": model_name, + "weight_description": weight_desc + }) + + if blind_mode: + # Blind mode: show role but not model + label = f"{role}" + else: + # Non-blind: show role and model + label = f"{role} ({model_name})" + + lib_logger.debug( + f"[HiveMind] Fusion specialist 
{response_num}: role={role}, blind={blind_mode}" + ) + else: + # Swarm mode fallback + if blind_mode: + label = f"Response {response_num}" + else: + model_name = "unknown" + if hasattr(response, 'model'): + model_name = response.model + label = f"Response {response_num} (Model: {model_name})" + + # Format: "Label:\n\n" + formatted_parts.append(f"{label}:\n{content}\n") + + # Join all responses + formatted_text = "\n".join(formatted_parts) + + lib_logger.debug( + f"[HiveMind] Formatted {len(formatted_parts)} responses " + f"({len(formatted_text)} characters total, blind_mode={blind_mode})" + ) + + # Return metadata only if fusion mode + metadata_for_arbiter = arbiter_metadata if arbiter_metadata else None + + return formatted_text, metadata_for_arbiter + + def _build_arbiter_prompt( + self, + formatted_responses: str, + config: Dict[str, Any], + original_messages: List[Dict[str, str]], + specialist_metadata: Optional[List[Dict[str, Any]]] = None + ) -> List[Dict[str, str]]: + """ + Build the complete prompt for the arbiter model. + + Loads the strategy template and constructs the message array. + Phase 6: Adds recursive mode instructions for autonomous decision-making. + MISSING FEATURE FIX: Adds specialist expertise context with weights for fusion mode. 
+ + Args: + formatted_responses: Formatted drone/specialist responses + config: Swarm or fusion configuration + original_messages: Original user messages + specialist_metadata: Optional metadata about specialists (for fusion mode) + + Returns: + Complete messages array for arbiter + """ + lib_logger.debug("[HiveMind] Building arbiter prompt") + + # Get strategy template + arbiter_config = config.get("arbiter", {}) + strategy_name = arbiter_config.get("strategy", "synthesis") + + strategy_template = self.config_loader.get_strategy(strategy_name) + + if not strategy_template: + lib_logger.warning( + f"[HiveMind] Strategy '{strategy_name}' not found, using default" + ) + strategy_template = "Synthesize the following responses into a single, high-quality answer:\n{responses}" + + # Replace {responses} placeholder + strategy_prompt = strategy_template.replace("{responses}", formatted_responses) + + # MISSING FEATURE FIX: Add specialist expertise context for fusion mode + if specialist_metadata: + expertise_lines = ["\n\nSPECIALIST EXPERTISE:"] + expertise_lines.append("You are synthesizing responses from specialists with the following expertise:\n") + + for spec in specialist_metadata: + role = spec.get('role', 'Unknown') + model = spec.get('model', 'Unknown') + weight_desc = spec.get('weight_description', '') + + if weight_desc: + expertise_lines.append(f"- {role} ({model}): {weight_desc}") + else: + expertise_lines.append(f"- {role} ({model}): Subject matter expert") + + expertise_lines.append("\nConsider each specialist's domain expertise when synthesizing your response.") + strategy_prompt += "\n".join(expertise_lines) + + lib_logger.debug(f"[HiveMind] Added specialist expertise context for {len(specialist_metadata)} specialists") + + # Phase 6: Add recursive mode instructions if enabled + recursive_config = config.get("recursive_mode", {}) + if recursive_config.get("enabled", False): + consensus_threshold = recursive_config.get("consensus_threshold", 7) + + 
recursive_instructions = f"""
+
+AUTONOMOUS DECISION PROTOCOL:
+You have autonomous decision-making authority. Follow this protocol:
+
+1. ASSESSMENT PHASE:
+   - Analyze the provided responses
+   - Rate consensus level (1-10 scale)
+   - Output: [CONSENSUS: X/10]
+
+2. DECISION PHASE:
+   If consensus >= {consensus_threshold}/10:
+   - Proceed directly to synthesis
+
+   If consensus < {consensus_threshold}/10:
+   - Identify specific conflict points
+   - Output: [CONFLICTS: <list of specific conflict points>]
+   - For each response, reason internally about how it addresses the conflicts
+   - Output: [CRITIQUE: <your reasoning>]
+
+3. SYNTHESIS PHASE:
+   - Create final answer incorporating all insights
+   - Output: [FINAL SYNTHESIS:]
+   - Provide your complete response after this marker
+
+IMPORTANT: Wrap all internal reasoning (CONSENSUS, CONFLICTS, CRITIQUE) in [INTERNAL] tags.
+Only the content after [FINAL SYNTHESIS:] will be shown to the user.
+
+Example format:
+[INTERNAL]
+[CONSENSUS: 5/10]
+[CONFLICTS: Response 1 suggests X, Response 2 suggests Y]
+[CRITIQUE: Analyzing the conflict...]
+[/INTERNAL]
+[FINAL SYNTHESIS:]
+<your complete answer>
+
+"""
+            strategy_prompt += recursive_instructions
+            lib_logger.info(
+                f"[HiveMind] Recursive mode enabled (consensus threshold: {consensus_threshold}/10)"
+            )
+
+        # Build messages array
+        messages = [
+            {
+                "role": "system",
+                "content": strategy_prompt
+            }
+        ]
+
+        # Add original user query
+        if original_messages:
+            # Find the last user message
+            for msg in reversed(original_messages):
+                if msg.get("role") == "user":
+                    messages.append({
+                        "role": "user",
+                        "content": msg.get("content", "")
+                    })
+                    break
+
+        lib_logger.debug(f"[HiveMind] Arbiter prompt built: {len(messages)} messages")
+
+        return messages
+
+    async def _call_arbiter(
+        self,
+        messages: List[Dict[str, str]],
+        config: Dict[str, Any],
+        request: Any
+    ) -> tuple:
+        """
+        Call the arbiter model to synthesize responses.
+
+        Non-streaming version for Phase 2.
+        Streaming support will be added in Phase 3.
+
+        Args:
+            messages: Constructed arbiter messages
+            config: Swarm or fusion configuration
+            request: Original request object
+
+        Returns:
+            Tuple of (arbiter_response, arbiter_usage)
+        """
+        # Get arbiter model
+        arbiter_config = config.get("arbiter", {})
+        arbiter_model = arbiter_config.get("model", "self")
+
+        # If "self", we need to determine which model to use
+        # For swarm, this will be handled by caller
+        # For now, just use as-is
+
+        lib_logger.info(f"[HiveMind] Calling arbiter model: {arbiter_model}")
+
+        # Build params for arbiter call
+        arbiter_params = {
+            "model": arbiter_model,
+            "messages": messages,
+            "stream": False  # Non-streaming for Phase 2
+        }
+
+        # Call arbiter through RotatingClient
+        # Use _execute_with_retry for consistency
+        arbiter_response = await self.rotating_client._execute_with_retry(
+            litellm.acompletion,
+            request=request,
+            **arbiter_params
+        )
+
+        # Extract usage - dynamically capture ALL numeric usage fields
+        arbiter_usage = {}
+
+        if hasattr(arbiter_response, 'usage') and arbiter_response.usage:
+            usage = arbiter_response.usage
+
+            # Iterate through all attributes of the usage object
+            for attr_name in dir(usage):
+                # Skip private/magic attributes
+                if attr_name.startswith('_'):
+                    continue
+
+                try:
+                    attr_value = getattr(usage, attr_name)
+
+                    # Only capture numeric fields (int or float)
+                    if isinstance(attr_value, (int, float)) and not isinstance(attr_value, bool):
+                        arbiter_usage[attr_name] = attr_value
+                except (AttributeError, TypeError):
+                    # Skip non-accessible or non-numeric attributes
+                    continue
+
+        lib_logger.info(
+            f"[HiveMind] Arbiter completed. Tokens: {arbiter_usage.get('total_tokens', 0)}"
+        )
+
+        return arbiter_response, arbiter_usage
+
+    async def _call_arbiter_streaming(
+        self,
+        messages: List[Dict[str, str]],
+        config: Dict[str, Any],
+        request: Any
+    ):
+        """
+        Call the arbiter model with streaming enabled.
+
+        Yields arbiter response chunks while tracking usage.
+ Phase 6: Filters [INTERNAL] markers for recursive mode. + + Args: + messages: Constructed arbiter messages + config: Swarm or fusion configuration + request: Original request object + + Yields: + Response chunks from arbiter + Final yield includes usage metadata + """ + # Get arbiter model + arbiter_config = config.get("arbiter", {}) + arbiter_model = arbiter_config.get("model", "self") + + lib_logger.info(f"[HiveMind] Calling arbiter model (streaming): {arbiter_model}") + + # Build params for arbiter call + arbiter_params = { + "model": arbiter_model, + "messages": messages, + "stream": True # Enable streaming + } + # Call arbiter through RotatingClient's streaming method + stream_generator = self.rotating_client._streaming_acompletion_with_retry( + request=request, + **arbiter_params + ) + + # Track usage from stream + arbiter_usage = { + 'prompt_tokens': 0, + 'completion_tokens': 0, + 'total_tokens': 0 + } + + # Phase 6: Track recursive mode state + recursive_enabled = config.get("recursive_mode", {}).get("enabled", False) + in_internal_block = False + internal_buffer = [] + + # Stream chunks and collect usage + async for chunk in stream_generator: + # Check if this chunk has usage info (typically the last chunk) + if hasattr(chunk, 'usage') and chunk.usage: + usage = chunk.usage + arbiter_usage['prompt_tokens'] = getattr(usage, 'prompt_tokens', 0) + arbiter_usage['completion_tokens'] = getattr(usage, 'completion_tokens', 0) + arbiter_usage['total_tokens'] = getattr(usage, 'total_tokens', 0) + + # Include other fields + for field in ['cached_tokens', 'reasoning_tokens']: + if hasattr(usage, field): + arbiter_usage[field] = getattr(usage, field, 0) + + # BUGFIX: Robust handling of [INTERNAL] markers to prevent data loss + if recursive_enabled and hasattr(chunk, 'choices') and chunk.choices: + delta = chunk.choices[0].delta if hasattr(chunk.choices[0], 'delta') else None + if delta and hasattr(delta, 'content') and delta.content: + content = delta.content + + # 
Handle [INTERNAL] start + if '[INTERNAL]' in content: + parts = content.split('[INTERNAL]') + before_internal = parts[0] + + # Yield content before marker + if before_internal: + chunk.choices[0].delta.content = before_internal + yield chunk + + in_internal_block = True + + # Handle content after marker (start of internal) + if len(parts) > 1: + remaining = parts[1] + # Check if it also ends in this chunk + if '[/INTERNAL]' in remaining: + internal_parts = remaining.split('[/INTERNAL]') + internal_buffer.append(internal_parts[0]) + + # Process buffer + full_internal = ''.join(internal_buffer) + self._log_recursive_markers(full_internal, config) + internal_buffer = [] + in_internal_block = False + + # Yield content after [/INTERNAL] + after_internal = internal_parts[1] + if after_internal: + chunk.choices[0].delta.content = after_internal + yield chunk + else: + internal_buffer.append(remaining) + + continue # Done with this chunk + + # Handle [/INTERNAL] end (if we are in block) + if in_internal_block and '[/INTERNAL]' in content: + parts = content.split('[/INTERNAL]') + internal_buffer.append(parts[0]) + + # Process buffer + full_internal = ''.join(internal_buffer) + self._log_recursive_markers(full_internal, config) + internal_buffer = [] + in_internal_block = False + + # Yield content after marker + after_internal = parts[1] + if after_internal: + chunk.choices[0].delta.content = after_internal + yield chunk + continue + + # If inside internal block, buffer it + if in_internal_block: + internal_buffer.append(content) + continue + + # Yield the chunk to caller (normal flow or filtered) + yield chunk + + lib_logger.info( + f"[HiveMind] Arbiter streaming completed. 
Tokens: {arbiter_usage['total_tokens']}" + ) + + # Return usage as final metadata + # Caller will handle usage aggregation + yield {"_hivemind_usage": arbiter_usage} + + def _log_recursive_markers(self, internal_content: str, config: Dict[str, Any]): + """ + Parse and log recursive mode markers from internal reasoning. + + Phase 6: Extracts consensus scores, conflicts, and critique reasoning. + + Args: + internal_content: Content between [INTERNAL] tags + config: Configuration with recursive threshold + """ + + # Extract consensus score + consensus_match = re.search(r'\[CONSENSUS:\s*(\d+)/10\]', internal_content) + if consensus_match: + consensus_score = int(consensus_match.group(1)) + threshold = config.get("recursive_mode", {}).get("consensus_threshold", 7) + + if consensus_score < threshold: + lib_logger.warning( + f"[HiveMind] Recursive mode: Consensus {consensus_score}/10 " + f"(below threshold {threshold}/10) - arbiter performing critique" + ) + else: + lib_logger.info( + f"[HiveMind] Recursive mode: Consensus {consensus_score}/10 " + f"(>= threshold {threshold}/10) - proceeding to synthesis" + ) + + # Extract conflicts if present + conflicts_match = re.search(r'\[CONFLICTS:\s*([^\]]+)\]', internal_content) + if conflicts_match: + conflicts = conflicts_match.group(1).strip() + lib_logger.info(f"[HiveMind] Conflicts identified: {conflicts}") + + # Log that critique is happening + if '[CRITIQUE:' in internal_content: + lib_logger.debug("[HiveMind] Arbiter performing internal critique reasoning") + + + async def _handle_swarm_streaming( + self, + config: Dict[str, Any], + base_model: str, + request: Any, + **kwargs + ): + """ + Handle streaming swarm request. + + Executes drones in parallel, then streams arbiter response. + Aggregates usage and injects into stream. 
+ + Args: + config: Swarm configuration + base_model: Base model name + request: Original request object + **kwargs: Request parameters + + Yields: + Arbiter response chunks with aggregated usage + """ + # Steps 1-4: Same as non-streaming (collect drone responses) + drones = self._prepare_drones(config, base_model, kwargs) + drone_responses, drone_usage = await self._execute_parallel(drones, request) + formatted_responses = self._format_for_arbiter(drone_responses, config) + + original_messages = kwargs.get("messages", []) + arbiter_messages = self._build_arbiter_prompt( + formatted_responses, + config, + original_messages + ) + + # Handle "self" arbiter model + arbiter_config = config.get("arbiter", {}) + arbiter_model = arbiter_config.get("model", "self") + if arbiter_model == "self": + arbiter_model = base_model + lib_logger.debug(f"[HiveMind] Using self-arbiter: {arbiter_model}") + + # BUGFIX: Use deepcopy for config + config_copy = copy.deepcopy(config) + config_copy["arbiter"] = arbiter_config.copy() + config_copy["arbiter"]["model"] = arbiter_model + + # Call arbiter in streaming mode + arbiter_usage = {} + async for chunk in self._call_arbiter_streaming(arbiter_messages, config_copy, request): + # Check for usage metadata + if isinstance(chunk, dict) and "_hivemind_usage" in chunk: + arbiter_usage = chunk["_hivemind_usage"] + continue # Don't yield metadata chunk + + # For SSE chunks, check if this is the final chunk with usage + # and update with aggregated usage + if hasattr(chunk, 'usage') and chunk.usage: + # This is the final chunk - aggregate total usage + total_usage = { + 'prompt_tokens': drone_usage['prompt_tokens'] + arbiter_usage.get('prompt_tokens', 0), + 'completion_tokens': drone_usage['completion_tokens'] + arbiter_usage.get('completion_tokens', 0), + 'total_tokens': drone_usage['total_tokens'] + arbiter_usage.get('total_tokens', 0) + } + + for field in ['cached_tokens', 'reasoning_tokens']: + if field in drone_usage or field in 
arbiter_usage: + total_usage[field] = drone_usage.get(field, 0) + arbiter_usage.get(field, 0) + + # Update chunk usage with aggregated values + chunk.usage.prompt_tokens = total_usage['prompt_tokens'] + chunk.usage.completion_tokens = total_usage['completion_tokens'] + chunk.usage.total_tokens = total_usage['total_tokens'] + + for field in ['cached_tokens', 'reasoning_tokens']: + if field in total_usage: + setattr(chunk.usage, field, total_usage[field]) + + lib_logger.info( + f"[HiveMind] Streaming swarm completed. " + f"Total usage: {total_usage['total_tokens']} tokens " + f"(Drones: {drone_usage['total_tokens']}, Arbiter: {arbiter_usage.get('total_tokens', 0)})" + ) + + yield chunk + + async def _handle_fusion_streaming( + self, + config: Dict[str, Any], + request: Any, + **kwargs + ): + """ + Handle streaming fusion request. + + Executes specialists in parallel, then streams arbiter response. + Aggregates usage and injects into stream. + + Args: + config: Fusion configuration + request: Original request object + **kwargs: Request parameters + + Yields: + Arbiter response chunks with aggregated usage + """ + # Prepare specialist models + specialist_models = self._prepare_fusion_models(config, kwargs) + + if not specialist_models: + raise ValueError("[HiveMind] No valid specialists found for fusion") + + # Execute specialists in parallel + specialist_responses, specialist_usage = await self._execute_parallel( + specialist_models, request + ) + + # Format responses with role labels and extract metadata + formatted_responses, specialist_metadata_for_arbiter = self._format_for_arbiter( + specialist_responses, + config, + specialist_metadata=specialist_models + ) + + # Build arbiter prompt with specialist expertise context + original_messages = kwargs.get("messages", []) + arbiter_messages = self._build_arbiter_prompt( + formatted_responses, + config, + original_messages, + specialist_metadata=specialist_metadata_for_arbiter # MISSING FEATURE FIX: Pass metadata + ) + 
+ # Get arbiter model + arbiter_config = config.get("arbiter", {}) + arbiter_model = arbiter_config.get("model", "gpt-4o") + + lib_logger.debug(f"[HiveMind] Using arbiter model: {arbiter_model}") + + # Update config + # BUGFIX: Use deepcopy + config_copy = copy.deepcopy(config) + config_copy["arbiter"] = arbiter_config.copy() + config_copy["arbiter"]["model"] = arbiter_model + + # Stream arbiter + arbiter_usage = {} + async for chunk in self._call_arbiter_streaming(arbiter_messages, config_copy, request): + if isinstance(chunk, dict) and "_hivemind_usage" in chunk: + arbiter_usage = chunk["_hivemind_usage"] + continue + + if hasattr(chunk, 'usage') and chunk.usage: + # Final chunk - aggregate usage + total_usage = { + 'prompt_tokens': specialist_usage['prompt_tokens'] + arbiter_usage.get('prompt_tokens', 0), + 'completion_tokens': specialist_usage['completion_tokens'] + arbiter_usage.get('completion_tokens', 0), + 'total_tokens': specialist_usage['total_tokens'] + arbiter_usage.get('total_tokens', 0) + } + + for field in ['cached_tokens', 'reasoning_tokens']: + if field in specialist_usage or field in arbiter_usage: + total_usage[field] = specialist_usage.get(field, 0) + arbiter_usage.get(field, 0) + + chunk.usage.prompt_tokens = total_usage['prompt_tokens'] + chunk.usage.completion_tokens = total_usage['completion_tokens'] + chunk.usage.total_tokens = total_usage['total_tokens'] + + for field in ['cached_tokens', 'reasoning_tokens']: + if field in total_usage: + setattr(chunk.usage, field, total_usage[field]) + + lib_logger.info( + f"[HiveMind] Fusion streaming completed. " + f"Total usage: {total_usage['total_tokens']} tokens " + f"(Specialists: {specialist_usage['total_tokens']}, Arbiter: {arbiter_usage.get('total_tokens', 0)})" + ) + + yield chunk + + async def handle_request(self, request, **kwargs): + """ + Handle an ensemble request (swarm or fusion). + + This is the main entry point for ensemble execution. 
+ + Args: + request: Original request object + **kwargs: Request parameters + + Returns: + Response from arbiter (streaming or complete) + """ + model_id = kwargs.get("model") + + if not model_id: + raise ValueError("Model ID is required") + + # Resolve conflicts + resolved_id = self.resolve_conflicts(model_id) + + # Determine type + if resolved_id in self.config_loader.fusion_configs: + config = self.config_loader.get_fusion_config(resolved_id) + specialists = config.get("specialists", []) + is_streaming = kwargs.get("stream", False) + + # Phase 6: Track execution start time + start_time = time.time() + + lib_logger.info( + f"[HiveMind] Processing Fusion request: {resolved_id} " + f"({len(specialists)} specialists, streaming: {is_streaming})" + ) + + # Route based on streaming mode + if is_streaming: + # Streaming fusion + return self._handle_fusion_streaming( + config=config, + request=request, + **kwargs + ) + + # Non-streaming fusion + specialist_models = self._prepare_fusion_models(config, kwargs) + + if not specialist_models: + raise ValueError(f"[HiveMind] No valid specialists found for fusion '{resolved_id}'") + + specialist_responses, specialist_usage = await self._execute_parallel( + specialist_models, request + ) + + # Format responses and extract metadata for arbiter + formatted_responses, specialist_metadata_for_arbiter = self._format_for_arbiter( + specialist_responses, + config, + specialist_metadata=specialist_models + ) + + original_messages = kwargs.get("messages", []) + arbiter_messages = self._build_arbiter_prompt( + formatted_responses, + config, + original_messages, + specialist_metadata=specialist_metadata_for_arbiter # MISSING FEATURE FIX: Pass metadata + ) + + arbiter_config = config.get("arbiter", {}) + arbiter_model = arbiter_config.get("model", "gpt-4o") + + # BUGFIX: Use deepcopy + config_copy = copy.deepcopy(config) + config_copy["arbiter"] = arbiter_config.copy() + config_copy["arbiter"]["model"] = arbiter_model + + arbiter_response, 
arbiter_usage = await self._call_arbiter( + arbiter_messages, + config_copy, + request + ) + + # Aggregate usage - dynamically sum ALL numeric fields from both sources + total_usage = {} + + # Helper function to merge usage dictionaries + for usage_dict in [specialist_usage, arbiter_usage]: + for field, value in usage_dict.items(): + if field not in total_usage: + total_usage[field] = 0 + total_usage[field] += value + + # Phase 6: Calculate latency and cost + end_time = time.time() + latency_ms = (end_time - start_time) * 1000 + + # Try to calculate cost using litellm + total_cost = 0.0 + try: + total_cost = litellm.completion_cost(completion_response=arbiter_response) + except Exception as e: + lib_logger.debug(f"[HiveMind] Could not calculate cost: {e}") + + # Add hivemind_details to usage + hivemind_details = { + "mode": "fusion", + "specialist_count": len(specialists), + "specialist_tokens": specialist_usage['total_tokens'], + "arbiter_tokens": arbiter_usage['total_tokens'], + "total_cost_usd": round(total_cost, 6), + "latency_ms": round(latency_ms, 2) + } + + + if hasattr(arbiter_response, 'usage'): + # IMPORTANT: Standard usage fields contain the TOTAL aggregated usage + # (specialists + arbiter). This ensures consumers can parse usage normally. + + # Dynamically set ALL usage fields from total_usage + for field, value in total_usage.items(): + try: + setattr(arbiter_response.usage, field, value) + except (AttributeError, TypeError): + # Skip if field cannot be set + lib_logger.debug(f"[HiveMind] Could not set usage field '{field}'") + + # Add hivemind_details as SUPPLEMENTARY breakdown information + # This does NOT replace standard fields, but provides additional context + arbiter_response.usage.hivemind_details = hivemind_details + + lib_logger.info( + f"[HiveMind] Fusion completed successfully. " + f"Total usage: {total_usage['total_tokens']} tokens " + f"(Specialists: {specialist_usage['total_tokens']}, Arbiter: {arbiter_usage['total_tokens']}). 
" + f"Latency: {latency_ms:.2f}ms, Cost: ${total_cost:.6f}" + ) + + return arbiter_response + + elif self._is_swarm_request(resolved_id): + base_model, preset_id = self.get_base_model(resolved_id) + config = self.config_loader.get_swarm_config(preset_id) + count = config.get("count", 3) + is_streaming = kwargs.get("stream", False) + + # Phase 6: Track execution start time + start_time = time.time() + + lib_logger.info( + f"[HiveMind] Processing Swarm request: {resolved_id} " + f"(base: {base_model}, preset: {preset_id}, {count} drones, streaming: {is_streaming})" + ) + + # Phase 3B: Route based on streaming mode + if is_streaming: + # Streaming mode - return async generator + return self._handle_swarm_streaming( + config=config, + base_model=base_model, + request=request, + **kwargs + ) + else: + # Non-streaming mode - return complete response + # Step 1: Prepare drones + drones = self._prepare_drones(config, base_model, kwargs) + + # Step 2: Execute drones in parallel + drone_responses, drone_usage = await self._execute_parallel(drones, request) + + # Step 3: Format responses for arbiter + formatted_responses = self._format_for_arbiter(drone_responses, config) + + # Step 4: Build arbiter prompt + original_messages = kwargs.get("messages", []) + arbiter_messages = self._build_arbiter_prompt( + formatted_responses, + config, + original_messages + ) + + # Step 5: Handle "self" arbiter model + arbiter_config = config.get("arbiter", {}) + arbiter_model = arbiter_config.get("model", "self") + if arbiter_model == "self": + arbiter_model = base_model + lib_logger.debug(f"[HiveMind] Using self-arbiter: {arbiter_model}") + + # Update config with resolved arbiter model + # BUGFIX: Use deepcopy + config_copy = copy.deepcopy(config) + config_copy["arbiter"] = arbiter_config.copy() + config_copy["arbiter"]["model"] = arbiter_model + + # Step 6: Call arbiter + arbiter_response, arbiter_usage = await self._call_arbiter( + arbiter_messages, + config_copy, + request + ) + + # Step 
7: Aggregate total usage - dynamically sum ALL numeric fields from both sources + total_usage = {} + + # Helper function to merge usage dictionaries + for usage_dict in [drone_usage, arbiter_usage]: + for field, value in usage_dict.items(): + if field not in total_usage: + total_usage[field] = 0 + total_usage[field] += value + + # Phase 6: Calculate latency and cost + end_time = time.time() + latency_ms = (end_time - start_time) * 1000 + + # Try to calculate cost using litellm + total_cost = 0.0 + try: + total_cost = litellm.completion_cost(completion_response=arbiter_response) + except Exception as e: + lib_logger.debug(f"[HiveMind] Could not calculate cost: {e}") + + # Add hivemind_details to usage + hivemind_details = { + "mode": "swarm", + "drone_count": count, + "drone_tokens": drone_usage['total_tokens'], + "arbiter_tokens": arbiter_usage['total_tokens'], + "total_cost_usd": round(total_cost, 6), + "latency_ms": round(latency_ms, 2) + } + + # Step 8: Update arbiter response with aggregated usage + if hasattr(arbiter_response, 'usage'): + # IMPORTANT: Standard usage fields contain the TOTAL aggregated usage + # (drones + arbiter). This ensures consumers can parse usage normally. + + # Dynamically set ALL usage fields from total_usage + for field, value in total_usage.items(): + try: + setattr(arbiter_response.usage, field, value) + except (AttributeError, TypeError): + # Skip if field cannot be set + lib_logger.debug(f"[HiveMind] Could not set usage field '{field}'") + + # Add hivemind_details as SUPPLEMENTARY breakdown information + # This does NOT replace standard fields, but provides additional context + arbiter_response.usage.hivemind_details = hivemind_details + + lib_logger.info( + f"[HiveMind] Swarm completed successfully. " + f"Total usage: {total_usage['total_tokens']} tokens " + f"(Drones: {drone_usage['total_tokens']}, Arbiter: {arbiter_usage['total_tokens']}). 
" + f"Latency: {latency_ms:.2f}ms, Cost: ${total_cost:.6f}" + ) + + return arbiter_response + + else: + raise ValueError(f"Unknown ensemble type for model: {model_id}") diff --git a/src/rotator_library/ensemble_configs/README.md b/src/rotator_library/ensemble_configs/README.md new file mode 100644 index 0000000..6a90d52 --- /dev/null +++ b/src/rotator_library/ensemble_configs/README.md @@ -0,0 +1,292 @@ +# HiveMind Ensemble Configuration Guide + +This directory contains the configuration for HiveMind Ensemble (Swarm/Fusion) feature. + +## Directory Structure + +``` +ensemble_configs/ +├── swarms/ # Swarm preset configurations +│ ├── default.json # Default global settings (fallback) +│ └── *.json # Preset configurations (e.g., aggressive.json, balanced.json) +├── fusions/ # Fusion configurations (multi-model teams) +│ └── *.json # Individual fusion definitions or arrays of fusions +├── strategies/ # Arbitration strategy templates +│ └── *.txt # Strategy prompt templates with {responses} placeholder +└── roles/ # Reusable role template definitions + └── *.json # Role templates for fusion specialists +``` + +## Configuration Files + +### Swarm Configuration (Preset-Based) + +HiveMind uses a **preset-based system** for swarm configurations. Each preset defines a configuration that can be applied to multiple base models. 
+ +**Format Options**: +- Explicit: `{base_model}-{preset_id}[swarm]` +- Short (if `omit_id: true`): `{base_model}[swarm]` + +**Example**: +- `gpt-4o-mini-aggressive[swarm]` - explicitly uses the `aggressive.json` preset +- `gpt-4o-mini[swarm]` - uses `default.json` preset OR a custom preset with `omit_id: true` +- `gpt-4o-mini-default[swarm]` - always uses `default.json` even if omit_id preset exists + +**Preset File Structure** (`swarms/{preset_id}.json`): +```json +{ + "id": "aggressive", + "description": "High diversity swarm with adversarial critique", + "base_models": ["gpt-4o-mini", "gemini-1.5-flash", "claude-3-haiku"], + "count": 5, + "temperature_jitter": { + "enabled": true, + "delta": 0.3 + }, + "adversarial_config": { + "enabled": true, + "count": 2, + "prompt": "You are a critical reviewer. Find flaws and edge cases." + }, + "arbiter": { + "model": "self", + "strategy": "synthesis", + "blind": true + }, + "recursive_mode": { + "enabled": true, + "consensus_threshold": 6 + } +} +``` + +**Key Fields**: +- `id`: Preset identifier (must match filename) +- `base_models`: List of models this preset applies to (enables discovery) +- `omit_id` (optional): If `true`, this preset becomes the default for its `base_models` when using `{model}[swarm]` syntax +- `count`: Number of drones to spawn +- `temperature_jitter`: Randomize temperature for diversity +- `adversarial_config`: Enable critical analysis drones +- `arbiter`: Synthesis configuration +- `recursive_mode`: Autonomous low-consensus handling + +**Omit ID Feature**: When a preset has `"omit_id": true`, it becomes the default for its specified models: +- `gpt-4o-mini[swarm]` → uses the `omit_id` preset instead of `default.json` +- `gpt-4o-mini-default[swarm]` → always uses `default.json` (explicit fallback) +- `gpt-4o-mini-aggressive[swarm]` → always uses `aggressive.json` (explicit) + +**Important**: `omit_id` controls ONLY what appears in `/v1/models` for discoverability, not what works at runtime: +- 
Explicit format (`model-preset[swarm]`) always works regardless of `omit_id` or `base_models` +- You can use ANY model with ANY preset explicitly (e.g., `claude-3-opus-aggressive[swarm]` works even if Claude isn't in aggressive's base_models) + +**Discovery Rules** (`/v1/models` endpoint): +- Preset WITH `base_models` + `omit_id: true` → Shows as `{model}[swarm]` only (explicit form hidden to avoid clutter) +- Preset WITH `base_models` + `omit_id: false` → Shows as `{model}-{preset}[swarm]` only +- Preset WITHOUT `base_models` → Never shown (invisible preset, but still usable with explicit syntax) + +**`base_models` Purpose**: +- Controls ONLY which models appear in `/v1/models` for this preset +- Does NOT restrict runtime usage - any model can use any preset with explicit syntax +- If empty/missing, preset is "invisible" but fully functional when explicitly referenced + +### Fusion Configuration (Multi-Model Teams) + +Fusions combine responses from different specialized models. Each fusion can have role-based routing and specialist expertise. + +**Single Fusion Format** (`fusions/{fusion-id}.json`): +```json +{ + "id": "dev-team", + "description": "Software development team with specialized roles", + "specialists": [ + { + "model": "gpt-4o", + "role": "Architect", + "system_prompt": "Focus on scalability and system design.", + "weight": 1.5, + "weight_description": "Expert in architecture. Trust for design decisions." + }, + { + "model": "claude-3-opus", + "role_template": "security-expert" + } + ], + "arbiter": { + "model": "gpt-4o", + "strategy": "code_review", + "blind": false + }, + "recursive_mode": { + "enabled": false, + "consensus_threshold": 7 + } +} +``` + +**Array Format** (multiple fusions in one file): +```json +{ + "fusions": [ + { + "id": "dev-team", + "specialists": [...] + }, + { + "id": "creative-writers", + "specialists": [...] 
+ } + ] +} +``` + +**Specialist Fields**: +- `model`: Provider/model ID +- `role`: Display name for this specialist +- `system_prompt`: Role-specific instructions sent to the model +- `weight`: Numeric importance (for future use) +- `weight_description`: Expertise description for arbiter context +- `role_template`: Reference to a reusable role template (see Roles section) + +**Arbiter Configuration**: +- `model`: Model ID for synthesis (or "self" to use first specialist) +- `strategy`: Strategy template name (from `strategies/` directory) +- `blind`: If `true`, hides model names from arbiter (preserves roles) + +### Role Templates (Reusable Configurations) + +Role templates allow you to define reusable specialist configurations that can be referenced by multiple fusions. + +**Single Role Format** (`roles/{role-id}.json`): +```json +{ + "name": "Security Expert", + "system_prompt": "You are a cybersecurity expert. Focus on vulnerabilities, edge cases, and threat modeling.", + "weight": 1.2, + "weight_description": "Expert in security and vulnerability assessment. Trust for security concerns." +} +``` + +**Array Format** (multiple roles in one file): +```json +{ + "roles": [ + { + "name": "Architect", + "system_prompt": "Focus on system design and scalability.", + "weight_description": "Expert in architectural patterns." + }, + { + "name": "Security Expert", + "system_prompt": "Focus on vulnerabilities and threats.", + "weight_description": "Expert in security assessment." + } + ] +} +``` + +**Usage in Fusions**: +```json +{ + "specialists": [ + { + "model": "claude-3-opus", + "role_template": "security-expert" + } + ] +} +``` + +**Override Behavior**: Specialist configs can override any field from the referenced template. + +### Strategy Templates + +Each strategy is a plain text file defining how the arbiter should synthesize responses. 
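
Applying such a template amounts to substituting the formatted drone or specialist outputs into its `{responses}` placeholder. A minimal sketch of that substitution (a hypothetical helper assuming plain `str.format`-style filling; the library's actual renderer may differ):

```python
def render_strategy(template: str, labelled_responses: list[tuple[str, str]]) -> str:
    """Fill a strategy template's {responses} placeholder with labelled outputs."""
    blocks = [
        f"--- Response {i} ({label}) ---\n{text}"
        for i, (label, text) in enumerate(labelled_responses, start=1)
    ]
    return template.format(responses="\n\n".join(blocks))

template = "You are an expert synthesizer.\n\n{responses}\n\nProvide one final answer."
prompt = render_strategy(template, [
    ("Architect", "Use a message queue."),
    ("Security Expert", "Validate all inputs."),
])
print(prompt)
```

The labels shown ("Architect", "Security Expert") are illustrative; whether a label is a role only or a role plus model name depends on the arbiter's `blind` setting.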
+ +**File Location**: `strategies/{strategy-name}.txt` + +**Placeholder**: Use `{responses}` where formatted responses should be injected. + +**Example** (`strategies/synthesis.txt`): +``` +You are an expert synthesizer. Analyze the following responses and create a single, superior answer that: +1. Combines the best elements from each response +2. Resolves any conflicts or contradictions +3. Ensures completeness and accuracy +4. Maintains coherence and clarity + +{responses} + +Provide your synthesis as a complete, high-quality response. +``` + +## Adding New Configurations + +1. **New Swarm Preset**: Create `{preset_id}.json` in `swarms/` with `id` and `base_models` fields +2. **New Fusion**: Create `{fusion_id}.json` in `fusions/` OR add to an existing array file +3. **New Strategy**: Create `{strategy_name}.txt` in `strategies/` +4. **New Role Template**: Create `{role_id}.json` in `roles/` OR add to an existing array file + +All configs are loaded automatically on startup! + +## Advanced Features + +### Temperature Jitter (Swarm) +Randomizes temperature across drones to increase response diversity: +```json +"temperature_jitter": { + "enabled": true, + "delta": 0.2 +} +``` +Each drone gets `base_temp ± delta` (clamped to [0.0, 2.0]). + +### Adversarial Mode (Swarm) +Dedicates N drones as critical reviewers: +```json +"adversarial_config": { + "enabled": true, + "count": 1, + "prompt": "You are a Senior Principal Engineer. Find flaws and edge cases." +} +``` +Last N drones receive the adversarial prompt. Responses are marked `[ADVERSARIAL]` in arbiter input. + +### Recursive Mode (Swarm & Fusion) +Enables autonomous arbiter decision-making: +```json +"recursive_mode": { + "enabled": true, + "consensus_threshold": 7 +} +``` +If consensus < threshold, arbiter performs internal critique before synthesis. All internal reasoning is logged but hidden from user. 
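
The consensus gate can be illustrated with a short sketch that mirrors the `[CONSENSUS: N/10]` marker the arbiter emits inside its `[INTERNAL]` block (the helper name here is hypothetical; the library performs this check internally):

```python
import re

def consensus_decision(internal: str, threshold: int = 7) -> str:
    """Decide the arbiter's next step from its internal reasoning markers.

    `internal` is the text between [INTERNAL] and [/INTERNAL] tags.
    """
    match = re.search(r"\[CONSENSUS:\s*(\d+)/10\]", internal)
    if match and int(match.group(1)) < threshold:
        return "critique"   # low agreement: critique before synthesizing
    return "synthesis"      # high agreement (or no score): synthesize directly

print(consensus_decision("[CONSENSUS: 5/10] [CONFLICTS: drones 1 and 3 disagree]"))
# → critique
print(consensus_decision("[CONSENSUS: 9/10]"))
# → synthesis
```

A missing consensus marker falls through to synthesis, so a non-compliant arbiter response still produces output.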

+
+### Blind Switch
+Controls whether model names are shown to arbiter:
+```json
+"arbiter": {
+  "blind": true
+}
+```
+- `true`: "Response 1 (Architect role)" (hides model names)
+- `false`: "Response 1 (GPT-4o - Architect)" (shows model names)
+
+Roles are **always preserved** regardless of blind setting.
+
+## Usage Examples
+
+**Swarm Request**:
+```bash
+curl -X POST http://localhost:8000/v1/chat/completions \
+-H "Content-Type: application/json" \
+-d '{"model": "gpt-4o-mini-aggressive[swarm]", "messages": [...]}'
+```
+
+**Fusion Request**:
+```bash
+curl -X POST http://localhost:8000/v1/chat/completions \
+-H "Content-Type: application/json" \
+-d '{"model": "dev-team[fusion]", "messages": [...]}'
+```
+
+For detailed usage and API reference, see:
+- [HiveMind User Guide](../../../docs/HiveMind_User_Guide.md)
+- [HiveMind API Reference](../../../docs/HiveMind_API.md)
diff --git a/src/rotator_library/ensemble_configs/fusions/dev-team.json b/src/rotator_library/ensemble_configs/fusions/dev-team.json
new file mode 100644
index 0000000..4acdd1e
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/fusions/dev-team.json
@@ -0,0 +1,37 @@
+{
+  "id": "dev-team",
+  "description": "A team of specialized models for software development",
+  "specialists": [
+    {
+      "model": "gpt-4o",
+      "role": "Architect",
+      "system_prompt": "You are a Software Architect. Focus on architectural patterns, scalability, and system design.",
+      "weight": 1.5,
+      "weight_description": "Expert in system design and scalability. Trust for architectural decisions and structural integrity."
+    },
+    {
+      "model": "claude-3-opus",
+      "role": "Security Specialist",
+      "system_prompt": "You are a Security Expert. Focus on security vulnerabilities, edge cases, and potential exploits.",
+      "weight": 1.2,
+      "weight_description": "Expert in security and vulnerability assessment. Trust for identifying security flaws and attack vectors."
+    },
+    {
+      "model": "gemini-1.5-pro",
+      "role": "Code Reviewer",
+      "system_prompt": "You are a Code Quality Expert. 
Focus on code quality, performance, and best practices.", + "weight": 1.0, + "weight_description": "Expert in code quality and performance optimization. Trust for maintainability and efficiency concerns." + } + ], + "arbiter": { + "model": "gpt-4o", + "strategy": "synthesis", + "blind": true, + "note": "Fusion mode uses blind=true to hide model names while preserving roles" + }, + "recursive_mode": { + "enabled": false, + "consensus_threshold": 7 + } +} diff --git a/src/rotator_library/ensemble_configs/fusions/fusion.example.json b/src/rotator_library/ensemble_configs/fusions/fusion.example.json new file mode 100644 index 0000000..556b0cf --- /dev/null +++ b/src/rotator_library/ensemble_configs/fusions/fusion.example.json @@ -0,0 +1,64 @@ +{ + "id": "dev-team", + "description": "Software development team with specialized roles and expertise", + + "_FIELD_DOCUMENTATION": "=== FUSION CONFIGURATION ===", + "_id": "REQUIRED. Fusion identifier. Used in model name as: {id}[fusion]", + "_description": "OPTIONAL. Human-readable description of this fusion's purpose.", + + "_specialists": "REQUIRED. Array of specialist model configurations. Each specialist processes the same query with a specialized role/perspective.", + "specialists": [ + { + "_model": "REQUIRED. Provider/model ID (e.g., 'gpt-4o', 'anthropic/claude-3-5-sonnet', 'gemini/gemini-1.5-pro')", + "model": "gpt-4o", + + "_role": "OPTIONAL. Display name for this specialist. Used in arbiter input as 'Role: {response}'. Default: 'Specialist {index}'", + "role": "Architect", + + "_system_prompt": "OPTIONAL. Role-specific instructions injected as system message. Defines this specialist's perspective/expertise.", + "system_prompt": "You are a Software Architect with deep expertise in system design, scalability, and architectural patterns. 
Focus on:\n- System design and component architecture\n- Scalability and performance considerations\n- Design patterns and best practices\n- Technology stack decisions\n- Long-term maintainability\n\nProvide architectural guidance and recommendations.",
+
+      "_weight": "OPTIONAL (default: 1.0). Numeric importance for future weighted synthesis. Currently used for metadata only.",
+      "weight": 1.5,
+
+      "_weight_description": "OPTIONAL. Natural language description of this specialist's expertise. Injected into arbiter context to guide synthesis.",
+      "weight_description": "Expert in architecture and scalability. Trust for design decisions, system architecture, and performance considerations.",
+
+      "_role_template": "OPTIONAL. Reference to reusable role template from roles/ directory. Template fields are merged (specialist config overrides template). Cannot be used together with explicit role/system_prompt.",
+      "role_template": null
+    },
+    {
+      "model": "claude-3-5-sonnet",
+
+      "_role_template_usage": "Example of using a role template instead of inline configuration",
+      "role_template": "security-expert",
+
+      "_note": "When using role_template, you can still override fields like model, weight, etc. The template provides role, system_prompt, weight_description as defaults."
+    },
+    {
+      "model": "gemini/gemini-1.5-pro",
+      "role": "Code Reviewer",
+      "system_prompt": "You are a Senior Code Reviewer focused on code quality, maintainability, and best practices. Analyze:\n- Code clarity and readability\n- Error handling and edge cases\n- Testing strategy and coverage\n- Documentation and comments\n- DRY, SOLID, and other principles\n\nProvide actionable code review feedback.",
+      "weight": 1.2,
+      "weight_description": "Expert in code quality and maintainability. Trust for code review, testing, and best practices."
+    }
+  ],
+
+  "_arbiter": "REQUIRED. Configuration for the model that synthesizes specialist responses.",
+  "arbiter": {
+    "_model": "'self' uses the first specialist model, or specify an explicit model. Should be reasoning-capable for complex synthesis.",
+    "model": "gpt-4o",
+
+    "_strategy": "Strategy template name (from strategies/ directory). Default: 'synthesis'. Try 'code_review' for development tasks.",
+    "strategy": "synthesis",
+
+    "_blind": "If true, hides model names from arbiter (shows roles only). If false, shows both role and model. Default: false for fusions.",
+    "blind": false
+  },
+
+  "_recursive_mode": "OPTIONAL. Same as swarm recursive mode. Enables autonomous critique for low-consensus scenarios.",
+  "recursive_mode": {
+    "enabled": false,
+    "consensus_threshold": 7
+  }
+}
diff --git a/src/rotator_library/ensemble_configs/fusions/multi-provider-test.json b/src/rotator_library/ensemble_configs/fusions/multi-provider-test.json
new file mode 100644
index 0000000..7fa217c
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/fusions/multi-provider-test.json
@@ -0,0 +1,21 @@
+{
+  "fusions": [
+    {
+      "id": "multi-provider",
+      "description": "Multi-provider fusion hitting all providers - minimal specialist config test",
+      "arbiter": {
+        "model": "gemini/gemini-2.5-pro",
+        "strategy": "synthesis",
+        "blind": false
+      },
+      "specialists": [
+        {"model": "iflow/K2-0905"},
+        {"model": "gemini/gemini-2.5-flash"},
+        {"model": "nvidia_nim/qwen/qwen3-coder-480b-a35b-instruct"},
+        {"model": "qwen_code/qwen3-coder-plus"},
+        {"model": "gemini_cli/gemini-2.5-flash-lite"},
+        {"model": "opencode/big-pickle"}
+      ]
+    }
+  ]
+}
diff --git a/src/rotator_library/ensemble_configs/roles/architect.json b/src/rotator_library/ensemble_configs/roles/architect.json
new file mode 100644
index 0000000..9620729
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/roles/architect.json
@@ -0,0 +1,6 @@
+{
+  "name": "Architect",
+  "system_prompt": "You are a Software Architect. Focus on architectural patterns, scalability, and system design. Consider:\n- System architecture and design patterns\n- Scalability and performance implications\n- Technology stack decisions\n- Component interactions and dependencies\n- Long-term maintainability",
+  "weight": 1.5,
+  "weight_description": "Expert in system design and scalability. Trust for architectural decisions and structural integrity."
+}
diff --git a/src/rotator_library/ensemble_configs/roles/code-reviewer.json b/src/rotator_library/ensemble_configs/roles/code-reviewer.json
new file mode 100644
index 0000000..2165529
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/roles/code-reviewer.json
@@ -0,0 +1,6 @@
+{
+  "name": "Code Reviewer",
+  "system_prompt": "You are a Code Quality Expert. Focus on code quality, performance, and best practices. Consider:\n- Code readability and maintainability\n- Performance optimization opportunities\n- Best practices and design patterns\n- Error handling and edge cases\n- Testing and documentation",
+  "weight": 1.0,
+  "weight_description": "Expert in code quality and performance optimization. Trust for maintainability and efficiency concerns."
+}
diff --git a/src/rotator_library/ensemble_configs/roles/role.example.json b/src/rotator_library/ensemble_configs/roles/role.example.json
new file mode 100644
index 0000000..ae645e8
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/roles/role.example.json
@@ -0,0 +1,14 @@
+{
+  "_FIELD_DOCUMENTATION": "=== ROLE TEMPLATE (Single Format) ===",
+  "_name": "REQUIRED. Display name for this role. Converted to role_id (lowercase, hyphens). Used as: role_template: 'security-expert'",
+  "name": "Security Expert",
+
+  "_system_prompt": "OPTIONAL. Default system prompt for this role. Can be overridden by specialist config.",
+  "system_prompt": "You are a cybersecurity expert with deep knowledge of secure coding practices, threat modeling, and vulnerability assessment. Focus on:\n- Security vulnerabilities and exploits\n- Authentication and authorization flaws\n- Data privacy and protection\n- Input validation and sanitization\n- Cryptography and secure communication\n- OWASP Top 10 and common attack vectors\n\nProvide security-focused analysis and recommendations.",
+
+  "_weight": "OPTIONAL (default: 1.0). Default weight for this role. Can be overridden by specialist config.",
+  "weight": 1.2,
+
+  "_weight_description": "OPTIONAL. Default expertise description. Can be overridden by specialist config.",
+  "weight_description": "Expert in security and vulnerability assessment. Trust for security concerns, threat modeling, and secure coding practices."
+}
diff --git a/src/rotator_library/ensemble_configs/roles/roles-array.example.json b/src/rotator_library/ensemble_configs/roles/roles-array.example.json
new file mode 100644
index 0000000..45cfaaf
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/roles/roles-array.example.json
@@ -0,0 +1,25 @@
+{
+  "_FIELD_DOCUMENTATION": "=== ROLE TEMPLATE (Array Format) ===",
+  "_roles": "Array of role template definitions. Each role can be referenced independently by its converted name.",
+  "roles": [
+    {
+      "_name": "Converted to role_id (e.g., 'Performance Engineer' → 'performance-engineer')",
+      "name": "Performance Engineer",
+      "system_prompt": "You are a performance engineering specialist. Focus on optimization, profiling, and scalability.",
+      "weight": 1.3,
+      "weight_description": "Expert in performance optimization and scalability analysis."
+    },
+    {
+      "name": "UX Designer",
+      "system_prompt": "You are a UX/UI designer with expertise in user-centered design and accessibility.",
+      "weight": 1.1,
+      "weight_description": "Expert in user experience, interface design, and accessibility standards."
+    },
+    {
+      "name": "DevOps Engineer",
+      "system_prompt": "You are a DevOps specialist focused on CI/CD, infrastructure, deployment, and monitoring.",
+      "weight": 1.2,
+      "weight_description": "Expert in deployment, infrastructure, and operational excellence."
+    }
+  ]
+}
diff --git a/src/rotator_library/ensemble_configs/roles/security-expert.json b/src/rotator_library/ensemble_configs/roles/security-expert.json
new file mode 100644
index 0000000..405160d
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/roles/security-expert.json
@@ -0,0 +1,6 @@
+{
+  "name": "Security Expert",
+  "system_prompt": "You are a Security Expert. Focus on security vulnerabilities, edge cases, and potential exploits. Consider:\n- Security vulnerabilities and attack vectors\n- Input validation and sanitization\n- Authentication and authorization\n- Data protection and privacy\n- Security best practices and standards",
+  "weight": 1.2,
+  "weight_description": "Expert in security and vulnerability assessment. Trust for identifying security flaws and attack vectors."
+}
diff --git a/src/rotator_library/ensemble_configs/strategies/best_of_n.txt b/src/rotator_library/ensemble_configs/strategies/best_of_n.txt
new file mode 100644
index 0000000..72cc165
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/strategies/best_of_n.txt
@@ -0,0 +1,10 @@
+You are evaluating multiple responses to select and refine the best one. For each response, assess:
+1. Accuracy and correctness
+2. Completeness of coverage
+3. Clarity and coherence
+4. Practical applicability
+
+Select the strongest response and refine it if needed to create the optimal answer.
+
+Responses:
+{responses}
diff --git a/src/rotator_library/ensemble_configs/strategies/code_review.txt b/src/rotator_library/ensemble_configs/strategies/code_review.txt
new file mode 100644
index 0000000..236224f
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/strategies/code_review.txt
@@ -0,0 +1,12 @@
+You are a senior code reviewer evaluating multiple code solutions. Assess each based on:
+1. Correctness and functionality
+2. Error handling and edge cases
+3. Performance and efficiency
+4. Security considerations
+5. Code quality and maintainability
+6. Best practices adherence
+
+Select the best solution or synthesize a superior version by combining the strengths of each.
+
+Responses:
+{responses}
diff --git a/src/rotator_library/ensemble_configs/strategies/strategy.example.txt b/src/rotator_library/ensemble_configs/strategies/strategy.example.txt
new file mode 100644
index 0000000..76032a0
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/strategies/strategy.example.txt
@@ -0,0 +1,39 @@
+ARBITRATION STRATEGY TEMPLATE: {strategy_name}
+
+=== FIELD DOCUMENTATION ===
+This is a plain text file that defines how the arbiter model should synthesize multiple responses.
+
+PLACEHOLDER: {responses}
+- This will be replaced with formatted drone/specialist responses
+- Format: "Response 1:\n\n\nResponse 2:\n\n..."
+- For fusion: "Role (Model):\n\n..." (if blind=false) or "Role:\n\n..." (if blind=true)
+
+SPECIALIST EXPERTISE (Fusion only):
+- If fusion mode, an additional "SPECIALIST EXPERTISE" section is auto-appended
+- Lists each specialist's role, model, and weight_description
+- Helps arbiter understand domain expertise when synthesizing
+
+RECURSIVE MODE:
+- If enabled, additional "AUTONOMOUS DECISION PROTOCOL" instructions are appended
+- Guides arbiter through consensus assessment and conflict resolution
+- Internal reasoning is wrapped in [INTERNAL] tags and hidden from user
+===
+
+=== EXAMPLE STRATEGY ===
+
+You are an expert synthesizer with deep analytical capabilities.
+
+Your task is to analyze the following responses and create a single, superior answer that combines the best insights from each perspective.
+
+{responses}
+
+Guidelines for synthesis:
+1. **Identify Core Insights**: Extract key points and unique perspectives from each response
+2. **Resolve Conflicts**: If responses disagree, evaluate which perspective is most sound based on evidence and reasoning
+3. **Merge Complementary Ideas**: Combine non-conflicting insights into a cohesive whole
+4. **Fill Gaps**: If all responses miss something important, include it based on your own expertise
+5. **Maintain Accuracy**: Never introduce hallucinations - stay grounded in the provided responses
+6. **Ensure Completeness**: Address all aspects of the original query
+7. **Optimize Clarity**: Present the final answer in clear, well-structured language
+
+Your synthesized response should be more comprehensive and insightful than any individual response while maintaining accuracy and coherence.
diff --git a/src/rotator_library/ensemble_configs/strategies/synthesis.txt b/src/rotator_library/ensemble_configs/strategies/synthesis.txt
new file mode 100644
index 0000000..d58be68
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/strategies/synthesis.txt
@@ -0,0 +1,10 @@
+You are an expert synthesizer. Analyze the following responses and create a single, superior answer that:
+1. Combines the best elements from each response
+2. Resolves any conflicts or contradictions
+3. Ensures completeness and accuracy
+4. Maintains coherence and clarity
+
+Your goal is to produce the BEST possible answer by leveraging the strengths of each response.
+
+Responses:
+{responses}
diff --git a/src/rotator_library/ensemble_configs/swarms/default.json b/src/rotator_library/ensemble_configs/swarms/default.json
new file mode 100644
index 0000000..3d3dadb
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/swarms/default.json
@@ -0,0 +1,38 @@
+{
+  "id": "default",
+  "description": "Standard swarm configuration with balanced settings",
+  "base_models": [
+    "gpt-4o",
+    "gpt-4o-mini",
+    "claude-3-5-sonnet",
+    "claude-3-haiku",
+    "gemini-1.5-pro",
+    "gemini-1.5-flash"
+  ],
+  "omit_id": false,
+  "count": 3,
+
+  "temperature_jitter": {
+    "enabled": true,
+    "delta": 0.2
+  },
+
+  "arbiter": {
+    "model": "self",
+    "strategy": "synthesis",
+    "blind": true,
+    "note": "Arbiter should be a decent reasoning model (e.g., GPT-4o, Claude 3+, Gemini 1.5 Pro+)"
+  },
+
+  "adversarial_config": {
+    "enabled": false,
+    "count": 1,
+    "prompt": "You are a Senior Principal Engineer with 15+ years of experience. Your role is to find edge cases, security vulnerabilities, performance bottlenecks, and incorrect assumptions. Be thorough and critical in your analysis. Focus on:\n- Edge cases that could cause failures\n- Security implications and potential exploits\n- Performance and scalability concerns\n- Maintainability and code quality issues\n- Incorrect assumptions in the solution\n\nProvide constructive criticism to improve the solution."
+  },
+
+  "recursive_mode": {
+    "enabled": false,
+    "consensus_threshold": 7,
+    "note": "Requires a reasoning-capable arbiter model"
+  }
+}
diff --git a/src/rotator_library/ensemble_configs/swarms/preset.example.json b/src/rotator_library/ensemble_configs/swarms/preset.example.json
new file mode 100644
index 0000000..6f918cd
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/swarms/preset.example.json
@@ -0,0 +1,65 @@
+{
+  "id": "aggressive",
+  "description": "High diversity swarm with adversarial critique. Use for complex problems requiring multiple perspectives and critical analysis.",
+
+  "_FIELD_DOCUMENTATION": "=== SWARM PRESET CONFIGURATION ===",
+  "_id": "REQUIRED. Preset identifier. Must match filename (e.g., 'aggressive' for aggressive.json). Used in model name: {base_model}-{id}[swarm]",
+  "_description": "OPTIONAL. Human-readable description of this preset's purpose and characteristics.",
+
+  "_base_models": "OPTIONAL. List of models this preset applies to. Controls /v1/models discovery. If omitted, preset is invisible but still usable with explicit syntax.",
+  "base_models": [
+    "gpt-4o-mini",
+    "gemini-1.5-flash",
+    "claude-3-haiku"
+  ],
+
+  "_omit_id": "OPTIONAL (default: false). If true, shows as {model}[swarm] in /v1/models instead of {model}-{id}[swarm]. Becomes the default preset for these models. Explicit format always works regardless of this setting.",
+  "omit_id": false,
+
+  "_count": "REQUIRED. Number of parallel drone executions (2-10 recommended). More drones = more diversity but higher cost.",
+  "count": 5,
+
+  "_temperature_jitter": "OPTIONAL. Adds random temperature variation to each drone for increased response diversity.",
+  "temperature_jitter": {
+    "_enabled": "Enable/disable jitter",
+    "enabled": true,
+
+    "_delta": "Maximum temperature deviation (±delta). Each drone gets base_temp ± random(0, delta). Clamped to [0.0, 2.0]",
+    "delta": 0.3
+  },

+  "_adversarial_config": "OPTIONAL. Dedicates the last N drones as critical reviewers with a custom prompt.",
+  "adversarial_config": {
+    "_enabled": "Enable/disable adversarial drones",
+    "enabled": true,
+
+    "_count": "Number of drones to convert to adversarial mode (taken from the end of the drone list)",
+    "count": 2,
+
+    "_prompt": "System prompt injected into adversarial drones. Should instruct them to find flaws, edge cases, and issues.",
+    "prompt": "You are a Senior Principal Engineer with 15+ years of experience. Your role is to find edge cases, security vulnerabilities, performance bottlenecks, and incorrect assumptions. Be thorough and critical in your analysis. Focus on:\n- Edge cases that could cause failures\n- Security implications and potential exploits\n- Performance and scalability concerns\n- Maintainability and code quality issues\n- Incorrect assumptions in the solution\n\nProvide constructive criticism to improve the solution."
+  },
+
+  "_arbiter": "REQUIRED. Configuration for the model that synthesizes all drone responses into a final answer.",
+  "arbiter": {
+    "_model": "'self' uses the base model as arbiter, or specify an explicit model (e.g., 'gpt-4o', 'claude-3-5-sonnet'). Should be a reasoning-capable model.",
+    "model": "self",
+
+    "_strategy": "Name of strategy template file (from strategies/ directory, without .txt extension). Default: 'synthesis'",
+    "strategy": "synthesis",
+
+    "_blind": "If true, hides model names from arbiter to reduce bias. Still shows drone numbers (Response 1, Response 2, etc.)",
+    "blind": true
+  },
+
+  "_recursive_mode": "OPTIONAL. Enables autonomous arbiter critique when consensus is low. Requires reasoning-capable arbiter.",
+  "recursive_mode": {
+    "_enabled": "Enable/disable recursive refinement",
+    "enabled": true,
+
+    "_consensus_threshold": "Threshold (1-10 scale). If arbiter detects consensus < threshold, performs internal critique before synthesis.",
+    "consensus_threshold": 6,
+
+    "_note": "Arbiter internally evaluates consensus, identifies conflicts, critiques responses, then synthesizes. Internal reasoning is logged but hidden from user output."
+  }
+}
diff --git a/src/rotator_library/ensemble_configs/swarms/test-gemini.json b/src/rotator_library/ensemble_configs/swarms/test-gemini.json
new file mode 100644
index 0000000..4b8f60e
--- /dev/null
+++ b/src/rotator_library/ensemble_configs/swarms/test-gemini.json
@@ -0,0 +1,16 @@
+{
+  "id": "test-gemini",
+  "description": "Test swarm for Gemini 2.5 Flash",
+  "base_models": ["gemini/gemini-2.5-flash"],
+  "omit_id": false,
+  "count": 3,
+  "arbiter": {
+    "model": "self",
+    "strategy": "synthesis",
+    "blind": false
+  },
+  "temperature_jitter": {
+    "enabled": true,
+    "delta": 0.3
+  }
+}