feat: LLM performance optimizations and experiment pipelines #43

Merged
stedrew merged 19 commits into denoslab:main from legend5teve:feat/visualization-module-plots
Mar 21, 2026

Conversation

@legend5teve
Collaborator

Summary

  • Input-hash caching (Plan C): Skip redundant LLM API calls when agent inputs (edge, belief, inbox, margin) haven't changed between decision rounds — applied across all 3 LLM call sites (predeparture, destination-mode, route-mode)
  • Parallel LLM dispatch (Plan A): Restructure process_pending_departures into a two-phase collect-then-process pattern using ThreadPoolExecutor, firing all non-cached predeparture LLM calls concurrently (up to MAX_CONCURRENT_LLM=20)
  • Time-based simulation loop: Replace getMinExpectedNumber() vehicle-count check with SIM_END_TIME_S to prevent early SUMO termination before agents depart
  • RQ1–RQ4 experiment scripts: Automated parameter sweep runners for four research questions (info quality, social trust, Pareto frontier, population heterogeneity)
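
The input-hash caching in the first bullet could look roughly like the sketch below. It assumes the four inputs (edge, belief, inbox, margin) are JSON-serializable; `input_hash`, `CachedDecider`, and the `call_llm` callback are hypothetical names for illustration, not the PR's actual API. Rounding the margin before hashing is also an assumption, added so that tiny float changes do not invalidate the cache.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List, Optional


def input_hash(edge: str, belief: Dict[str, float], inbox: List[Any], margin: float) -> str:
    """Hash the decision inputs; equal inputs always give equal digests."""
    payload = json.dumps(
        # Assumption: margin is rounded to 0.1 so float jitter does not bust the cache.
        {"edge": edge, "belief": belief, "inbox": inbox, "margin": round(margin, 1)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachedDecider:
    """Reuse the previous decision when the inputs have not changed."""

    def __init__(self, call_llm: Callable[..., Any]) -> None:
        self._call_llm = call_llm  # hypothetical stand-in for the real LLM call
        self._last_hash: Optional[str] = None
        self._last_result: Any = None
        self.calls = 0  # number of real (non-cached) LLM calls made

    def decide(self, edge: str, belief: Dict[str, float], inbox: List[Any], margin: float) -> Any:
        digest = input_hash(edge, belief, inbox, margin)
        if digest == self._last_hash:
            return self._last_result  # cache hit: skip the API call
        self.calls += 1
        self._last_result = self._call_llm(edge, belief, inbox, margin)
        self._last_hash = digest
        return self._last_result
```

Keeping one hash per agent (rather than a global cache) matches the "unchanged between decision rounds" framing: only the most recent inputs are compared.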

Test plan

  • All 301 existing tests pass
  • Run a full simulation end-to-end to verify LLM caching and parallel dispatch work correctly
  • Verify MAX_CONCURRENT_LLM env var is respected
  • Run one RQ experiment script to confirm parameter sweep pipeline works

🤖 Generated with Claude Code

legend5teve and others added 17 commits March 5, 2026 16:05
 Changes to be committed:
	modified:   agentevac/simulation/main.py
	modified:   agentevac/simulation/spawn_events.py
	modified:   agentevac/utils/replay.py
	modified:   sumo/Repaired.netecfg
	modified:   sumo/Repaired.sumocfg
  Module updated: agentevac/utils/replay.py

  - Fixed RouteReplay._load_schedule(...) so it only reads step and veh_id for replayable events:
      - route_change
      - departure_release
  - Non-replayable events like agent_cognition and metrics_snapshot are now ignored without touching veh_id.

  Cause

  - The loader was accessing rec["veh_id"] before checking the event type.
  - metrics_snapshot records do not have veh_id, so replay loading crashed with KeyError.

  Verification

  1. python3 -m py_compile agentevac/utils/replay.py passed.
  2. Reproduced the failing case with a small local script:

  - one route_change
  - one agent_cognition
  - one metrics_snapshot
  - replay load now succeeds and only indexes the route-change step.
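
The fix described above amounts to checking the event type before touching `veh_id`. A minimal sketch, assuming the replay log is JSON Lines with `event`, `step`, and `veh_id` keys (the key name `event` and the function `load_schedule` are assumptions here, not the actual `RouteReplay._load_schedule` code):

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List

# Event types named in the commit message as replayable.
REPLAYABLE_EVENTS = {"route_change", "departure_release"}


def load_schedule(lines: Iterable[str]) -> Dict[int, List[dict]]:
    """Index replayable records by step; ignore every other event type."""
    schedule: Dict[int, List[dict]] = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        rec = json.loads(line)
        # Check the event type first: metrics_snapshot records carry no veh_id.
        if rec.get("event") not in REPLAYABLE_EVENTS:
            continue
        step = int(rec["step"])
        veh_id = rec["veh_id"]  # safe now: only replayable events reach this line
        schedule[step].append({"event": rec["event"], "veh_id": veh_id})
    return dict(schedule)
```

Running it on the three-record case from the verification steps indexes only the route-change step.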
…g, and per-agent heterogeneity

- Add compute_signal_conflict() using Jensen-Shannon divergence in belief_model.py
- Restructure all three LLM prompts (pre-departure, destination, route) to expose
  raw env vs. social disagreement via your_observation/neighbor_assessment/
  information_conflict/combined_belief fields
- Add conflict_assessment field to all Pydantic response models
- Add conflict recording to metrics (record_conflict_sample, compute_average_signal_conflict)
- Implement distance-based noise scaling (proposal Eq. 1): effective sigma scales
  with fire margin / reference distance via DIST_REF_M config
- Add per-agent parameter heterogeneity via sample_profile_params() with truncated
  normal distributions; configurable via *_SPREAD env vars (default 0 = legacy)
- Fix stale subjective_information reference in scenarios.py
- Add experiment stage scripts (stages 0-5) for RQ1/RQ2/RQ3 sweeps
- Add comprehensive tests for all new features (291 tests passing)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
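
For reference, Jensen-Shannon divergence between two discrete distributions can be computed as in the sketch below. It assumes both inputs are already normalized probability vectors of equal length; `js_divergence` is a hypothetical helper, and the actual signature of `compute_signal_conflict()` is not shown in this PR.

```python
import math
from typing import Sequence


def _kl_bits(p: Sequence[float], q: Sequence[float]) -> float:
    """KL divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical inputs, 1 for disjoint ones."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Mixture distribution; m_i > 0 wherever p_i > 0 or q_i > 0, so no division by zero.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl_bits(p, m) + 0.5 * _kl_bits(q, m)
```

With base-2 logarithms the value is bounded in [0, 1] and symmetric in its arguments, which makes it convenient as a conflict score between an agent's own observation and its neighbors' assessment.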
… recording

- Add observation-based exposure function for no_notice scenario that uses
  agent belief state and route length instead of route-specific fire data
- Enable expected_utility in all three scenarios (no_notice, alert_guided,
  advice_guided) with scenario-aware LLM policy text
- Update menu filtering to retain travel time and utility for no_notice agents
- Fix NET_FILE default, which pointed at .rou.xml (route file) instead of .net.xml
  (network file); the wrong file left EDGE_SHAPE empty and all exposure scores at zero
- Fix exposure recording to fire only on decision rounds instead of every
  simulation step, preventing dilution of the exposure average
- Add Repaired.net.xml to repo; update SUMO configs to use local net file

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…e parameters

Use SIM_END_TIME_S (default 1200s) to control simulation duration instead of
relying on getMinExpectedNumber(), which ended the run early when no agents had
departed yet. Remove dummy t_0 vehicle from route file. Add --sim-end-time
CLI flag and SIM_END_TIME_S env var. Update fire source growth rates and
timing for more aggressive spread scenarios.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add input-hash caching (Plan C) across all 3 LLM call sites to skip
  redundant API calls when agent inputs haven't changed between rounds
- Add parallel LLM dispatch (Plan A) for process_pending_departures using
  ThreadPoolExecutor — two-phase collect-then-process pattern fires all
  non-cached predeparture LLM calls concurrently (up to MAX_CONCURRENT_LLM)
- Add 4 new fields to AgentRuntimeState for cache state tracking
- Add RQ1–RQ4 experiment runner scripts for automated parameter sweeps

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
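
The two-phase collect-then-process pattern can be sketched as follows. This is an illustration under stated assumptions, not the body of `process_pending_departures`: `dispatch_pending`, the `pending` and `cached` dictionaries, and the `call_llm` callback are hypothetical, and only the `MAX_CONCURRENT_LLM` name and its limit of 20 come from the PR.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Limit named in the PR description; assumed here to be read once from the environment.
MAX_CONCURRENT_LLM = int(os.environ.get("MAX_CONCURRENT_LLM", "20"))


def dispatch_pending(
    pending: Dict[str, dict],
    cached: Dict[str, str],
    call_llm: Callable[[dict], str],
    max_workers: int = MAX_CONCURRENT_LLM,
) -> Dict[str, str]:
    """Collect uncached agents, call the LLM for them concurrently, then merge in order."""
    # Phase 1: collect. Cached agents are resolved without an API call.
    decisions = {aid: cached[aid] for aid in pending if aid in cached}
    to_call = [aid for aid in pending if aid not in cached]

    if to_call:
        workers = max(1, min(max_workers, len(to_call)))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Executor.map returns results in input order, so zip pairs them correctly.
            results = pool.map(lambda aid: call_llm(pending[aid]), to_call)
            for aid, decision in zip(to_call, results):
                decisions[aid] = decision

    # Phase 2: process. Results are applied on the calling thread, in pending order.
    return {aid: decisions[aid] for aid in pending}
```

Keeping phase 2 on the calling thread means any simulation state is only mutated sequentially; the threads do nothing but wait on the network.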
Stop the simulation loop as soon as every spawned vehicle has departed
and arrived at its destination, instead of running until SIM_END_TIME_S.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@stedrew
Contributor

stedrew commented Mar 19, 2026

@legend5teve Can you resolve the merge conflicts?

legend5teve and others added 2 commits March 20, 2026 12:58
Restructure LLM decision prompts with explicit priority levels
(safety > official guidance > risk assessment), add EOC guidance_source
to operator briefings, and fix early termination to check actual
arrivals via metrics.arrived_count() instead of active vehicle count.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
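
The resulting loop condition (run until SIM_END_TIME_S, but stop early once every spawned vehicle has arrived) might be sketched like this. `run_loop` and its `step` and `arrived_count` callbacks are hypothetical stand-ins for the simulation step and `metrics.arrived_count()`; the 1200 s default is taken from the earlier commit message.

```python
from typing import Callable


def run_loop(
    step: Callable[[], None],
    arrived_count: Callable[[], int],
    spawned_total: int,
    sim_end_time_s: float = 1200.0,
    step_length_s: float = 1.0,
) -> float:
    """Advance the simulation; return the simulated time at which the loop stopped."""
    t = 0.0
    while t < sim_end_time_s:
        step()  # a real run would advance the simulator here
        t += step_length_s
        # Compare arrivals against the spawned total rather than the active vehicle
        # count: before anyone departs, the active count is zero but arrivals are too,
        # so the loop keeps running.
        if spawned_total > 0 and arrived_count() >= spawned_total:
            break
    return t
```

Counting arrivals avoids the failure mode described earlier in this PR, where a vehicle-count check ended the run before any agent had departed.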
…dule-plots

# Conflicts:
#	agentevac/agents/agent_state.py
#	agentevac/analysis/metrics.py
#	agentevac/simulation/main.py
#	sumo/Repaired.netecfg
#	sumo/Repaired.sumocfg
@stedrew stedrew merged commit 0f77025 into denoslab:main Mar 21, 2026
1 check passed

2 participants