Merged
19 commits
3623b4c
refactor: Change scenario prompts in agents/scenarios.py
legend5teve Mar 5, 2026
df2f056
chore: update the suggested routes and destination for Lytton
legend5teve Mar 6, 2026
477b0de
fix: update replay.py to fix key error
legend5teve Mar 6, 2026
0f4ac33
feat: optimize the visualization module for plotting statistic result…
legend5teve Mar 9, 2026
747d1ae
Merge branch 'main' into feat/visualization-module-plots
legend5teve Mar 9, 2026
4f3172f
feat: implement timeline analysis for evacuation in scripts/plot_agen…
legend5teve Mar 11, 2026
aaca505
Merge branch 'main' into feat/visualization-module-plots
legend5teve Mar 11, 2026
1c7ff71
chore: add test cases to cover newly added features; update doc strin…
legend5teve Mar 11, 2026
44bbd68
chore: update plotting scales according to actual KPI scales
legend5teve Mar 11, 2026
a1f935f
feat: log run parameters for plotting modules
legend5teve Mar 11, 2026
077be18
Merge branch 'main' into feat/visualization-module-plots
legend5teve Mar 11, 2026
bc3874e
feat: implement signal conflict modeling, distance-based noise scalin…
legend5teve Mar 16, 2026
d73e226
Merge branch 'main' into feat/visualization-module-plots
legend5teve Mar 16, 2026
ef0602e
feat: extend utility scoring to all scenarios and fix hazard exposure…
legend5teve Mar 16, 2026
73ab269
feat: replace vehicle-count loop with time-based sim end and tune fir…
legend5teve Mar 17, 2026
b357efe
feat: add LLM input-hash caching and parallel predeparture LLM dispatch
legend5teve Mar 18, 2026
edd71e9
feat: add early termination when all agents have evacuated
legend5teve Mar 19, 2026
e0c0695
feat: add prioritized prompt framework and fix arrival-based termination
legend5teve Mar 20, 2026
36014f4
Merge remote-tracking branch 'origin/main' into feat/visualization-mo…
legend5teve Mar 20, 2026
4 changes: 2 additions & 2 deletions CLAUDE.md
@@ -50,7 +50,7 @@ python -m pytest tests/

**Key CLI flags for the simulation:** `--scenario` (no_notice|alert_guided|advice_guided), `--messaging` (on|off), `--events` (on|off), `--web-dashboard` (on|off), `--metrics` (on|off), `--overlays` (on|off).

**Key environment variables:** `OPENAI_MODEL` (default: `gpt-4o-mini`), `DECISION_PERIOD_S` (default: `5.0`), `NET_FILE` (default: `sumo/Repaired.rou.xml`), `SUMO_CFG` (default: `sumo/Repaired.sumocfg`), `RUN_MODE`, `REPLAY_LOG_PATH`, `EVENTS_LOG_PATH`, `METRICS_LOG_PATH`.
**Key environment variables:** `OPENAI_MODEL` (default: `gpt-4o-mini`), `DECISION_PERIOD_S` (default: `5.0`), `NET_FILE` (default: `sumo/Repaired.net.xml`), `SUMO_CFG` (default: `sumo/Repaired.sumocfg`), `RUN_MODE`, `REPLAY_LOG_PATH`, `EVENTS_LOG_PATH`, `METRICS_LOG_PATH`.

## Architecture

@@ -80,7 +80,7 @@ python -m pytest tests/

At the top of the file (labeled `USER CONFIG`):
- `CONTROL_MODE` — `"destination"` (default) or `"route"`
- `NET_FILE` — path to SUMO route/network file (overridable via `NET_FILE` env var; default: `sumo/Repaired.rou.xml`)
- `NET_FILE` — path to SUMO route/network file (overridable via `NET_FILE` env var; default: `sumo/Repaired.net.xml`)
- `DESTINATION_LIBRARY` / `ROUTE_LIBRARY` — hardcoded choice menus for agents
- `OPENAI_MODEL` / `DECISION_PERIOD_S` — overridable via env vars

6 changes: 5 additions & 1 deletion agentevac/agents/agent_state.py
@@ -30,7 +30,7 @@
import math
import random
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple
from typing import Any, Dict, List, Optional, Tuple


@dataclass
@@ -66,6 +66,10 @@ class AgentRuntimeState:
decision_history: List[Dict[str, Any]] = field(default_factory=list)
observation_history: List[Dict[str, Any]] = field(default_factory=list)
has_departed: bool = True
last_input_hash: Optional[int] = None
last_llm_choice_idx: Optional[int] = None
last_llm_reason: Optional[str] = None
last_llm_action: Optional[str] = None


# Global registry of all agent states, keyed by vehicle ID.
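The four `last_*` fields added to `AgentRuntimeState` above support the LLM input-hash cache introduced in commit b357efe. A minimal sketch of how such a cache could work — the field names come from the diff, but the `cached_decide` wrapper and `llm_call` hook are invented for illustration:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AgentRuntimeState:
    # Subset of the fields added in the diff above.
    last_input_hash: Optional[int] = None
    last_llm_choice_idx: Optional[int] = None


def cached_decide(state: AgentRuntimeState, prompt: str,
                  llm_call: Callable[[str], int]) -> int:
    """Skip the LLM call when the prompt is byte-identical to the last one."""
    h = hash(prompt)
    if state.last_input_hash == h and state.last_llm_choice_idx is not None:
        return state.last_llm_choice_idx  # cache hit: reuse previous choice
    choice = llm_call(prompt)
    state.last_input_hash = h
    state.last_llm_choice_idx = choice
    return choice


calls = []

def fake_llm(prompt: str) -> int:
    calls.append(prompt)
    return 2

state = AgentRuntimeState()
first = cached_decide(state, "menu v1", fake_llm)
second = cached_decide(state, "menu v1", fake_llm)  # identical input: no new call
```

The point of hashing the rendered prompt rather than individual inputs is that any change to belief, menu, or observations changes the prompt text and naturally invalidates the cache.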
88 changes: 74 additions & 14 deletions agentevac/agents/routing_utility.py
@@ -12,7 +12,7 @@
- ``E(option)`` : expected exposure score (``_expected_exposure``).
- ``C(option)`` : travel cost in equivalent minutes (``_travel_cost``).

Expected exposure combines four components:
Expected exposure combines four components (in ``alert_guided`` and ``advice_guided``):
1. **risk_sum** : Sum of edge-level fire risk scores along the route (or fastest path).
Scaled by a severity multiplier that increases with ``p_risky`` and ``p_danger``.
2. **blocked_edges** : Number of edges along the route that are currently inside a
@@ -23,8 +23,14 @@
4. **uncertainty_penalty** : A penalty proportional to ``(1 - confidence)`` that
discourages fragile choices when the agent is unsure of the hazard.

Annotated menus (``annotate_menu_with_expected_utility``) are used in the *advice_guided*
scenario so the LLM receives pre-computed utility context alongside each option.
In ``no_notice`` mode, agents lack route-specific fire data. Exposure is instead
estimated from the agent's general belief state scaled by route length — longer routes
mean more time exposed to whatever danger the agent perceives.

Annotated menus (``annotate_menu_with_expected_utility``) are computed for all three
scenarios so the LLM always receives a utility score. The *precision* of the exposure
estimate varies by information regime: belief-only (no_notice), current fire state
(alert_guided), or current fire state with full route-head data (advice_guided).
"""

from typing import Any, Dict, List
@@ -107,6 +113,51 @@ def _travel_cost(menu_item: Dict[str, Any]) -> float:
return _num(edge_count, 0.0) * 0.25


def _observation_based_exposure(
menu_item: Dict[str, Any],
belief: Dict[str, Any],
psychology: Dict[str, Any],
) -> float:
"""Estimate hazard exposure when route-specific fire data is unavailable.

Used in the ``no_notice`` scenario where agents have only their own noisy
observation of the current edge. Without per-route fire metrics (risk_sum,
blocked_edges, min_margin_m), exposure is derived from the agent's general
belief state scaled by route length:

hazard_level = 0.3 * p_risky + 0.7 * p_danger + 0.4 * perceived_risk
length_factor = len_edges * 0.15
exposure = hazard_level * length_factor + uncertainty_penalty

Longer routes are penalised more because a longer route means more time
spent driving through a potentially hazardous environment. The coefficients
prioritise ``p_danger`` (0.7) over ``p_risky`` (0.3) to maintain consistency
with the severity weighting in ``_expected_exposure``.

Args:
menu_item: A destination or route dict.
belief: The agent's current Bayesian belief dict.
psychology: The agent's current psychology dict (perceived_risk, confidence).

Returns:
Expected exposure score (>= 0; higher = more hazardous).
"""
p_risky = _num(belief.get("p_risky"), 1.0 / 3.0)
p_danger = _num(belief.get("p_danger"), 1.0 / 3.0)
perceived_risk = _num(psychology.get("perceived_risk"), p_danger)
confidence = _num(psychology.get("confidence"), 0.0)

hazard_level = 0.3 * p_risky + 0.7 * p_danger + 0.4 * perceived_risk
len_edges = _num(
menu_item.get("len_edges", menu_item.get("len_edges_fastest_path")),
1.0,
)
length_factor = len_edges * 0.15
uncertainty_penalty = max(0.0, 1.0 - confidence) * 0.75

return hazard_level * length_factor + uncertainty_penalty


def _expected_exposure(
menu_item: Dict[str, Any],
belief: Dict[str, Any],
@@ -227,28 +278,38 @@ def annotate_menu_with_expected_utility(
belief: Dict[str, Any],
psychology: Dict[str, Any],
profile: Dict[str, Any],
scenario: str = "advice_guided",
) -> List[Dict[str, Any]]:
"""Annotate each menu option in-place with its expected utility and component breakdown.

For *destination* mode, unreachable options (``reachable=False``) receive
``expected_utility=None`` and a minimal component dict to signal their exclusion.

The annotated menu is later filtered by ``scenarios.filter_menu_for_scenario`` so
that utility scores are only visible to agents in the *advice_guided* regime.
The ``scenario`` parameter controls which exposure function is used:

- ``"no_notice"``: ``_observation_based_exposure`` — uses only the agent's belief
state and route length (no route-specific fire data).
- ``"alert_guided"`` / ``"advice_guided"``: ``_expected_exposure`` — uses route-
specific fire metrics (risk_sum, blocked_edges, min_margin_m).

Args:
menu: List of destination or route dicts (mutated in-place).
mode: ``"destination"`` or ``"route"`` — selects the scoring function.
belief: The agent's current Bayesian belief dict.
psychology: The agent's current psychology dict.
profile: The agent's profile dict (supplies ``lambda_e``, ``lambda_t``).
scenario: Active information regime (``"no_notice"``, ``"alert_guided"``,
or ``"advice_guided"``). Controls which exposure function is used.

Returns:
The same ``menu`` list, with each item updated to include:
- ``expected_utility`` : Scalar utility score or ``None`` if unreachable.
- ``utility_components``: Dict with lambda_e, lambda_t, expected_exposure,
travel_cost (and ``reachable=False`` if unreachable).
"""
use_observation_exposure = str(scenario).strip().lower() == "no_notice"
exposure_fn = _observation_based_exposure if use_observation_exposure else _expected_exposure

for item in menu:
if mode == "destination":
if not item.get("reachable", False):
Expand All @@ -260,18 +321,17 @@ def annotate_menu_with_expected_utility(
"reachable": False,
}
continue
expected_exposure = _expected_exposure(item, belief, psychology)
travel_cost = _travel_cost(item)
utility = score_destination_utility(item, belief, psychology, profile)
else:
expected_exposure = _expected_exposure(item, belief, psychology)
travel_cost = _travel_cost(item)
utility = score_route_utility(item, belief, psychology, profile)

lambda_e = max(0.0, _num(profile.get("lambda_e"), 1.0))
lambda_t = max(0.0, _num(profile.get("lambda_t"), 0.1))
expected_exposure = exposure_fn(item, belief, psychology)
travel_cost = _travel_cost(item)
utility = -((lambda_e * expected_exposure) + (lambda_t * travel_cost))

item["expected_utility"] = round(utility, 4)
item["utility_components"] = {
"lambda_e": round(max(0.0, _num(profile.get("lambda_e"), 1.0)), 4),
"lambda_t": round(max(0.0, _num(profile.get("lambda_t"), 0.1)), 4),
"lambda_e": round(lambda_e, 4),
"lambda_t": round(lambda_t, 4),
"expected_exposure": round(expected_exposure, 4),
"travel_cost": round(travel_cost, 4),
}
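The no_notice exposure estimate documented in the `_observation_based_exposure` docstring above can be exercised standalone. A minimal sketch using the coefficients from the diff; the helper is a simplified copy (no `_num` coercion) and the menu item values are invented:

```python
def observation_based_exposure(menu_item, belief, psychology):
    """Simplified copy of _observation_based_exposure from the diff above."""
    p_risky = belief.get("p_risky", 1.0 / 3.0)
    p_danger = belief.get("p_danger", 1.0 / 3.0)
    perceived_risk = psychology.get("perceived_risk", p_danger)
    confidence = psychology.get("confidence", 0.0)

    # p_danger (0.7) is weighted above p_risky (0.3), matching the severity
    # ordering used by _expected_exposure.
    hazard_level = 0.3 * p_risky + 0.7 * p_danger + 0.4 * perceived_risk
    length_factor = menu_item.get("len_edges", 1.0) * 0.15
    uncertainty_penalty = max(0.0, 1.0 - confidence) * 0.75
    return hazard_level * length_factor + uncertainty_penalty


exposure = observation_based_exposure(
    {"len_edges": 10},
    {"p_risky": 0.2, "p_danger": 0.5},
    {"perceived_risk": 0.4, "confidence": 0.6},
)
```

Here hazard_level = 0.06 + 0.35 + 0.16 = 0.57, length_factor = 1.5, and the uncertainty penalty is 0.4 * 0.75 = 0.3, so the route scores roughly 1.155 — doubling `len_edges` would raise only the length-scaled term, not the penalty.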
51 changes: 27 additions & 24 deletions agentevac/agents/scenarios.py
@@ -5,8 +5,9 @@

**no_notice** — No official warning exists yet.
Agents rely solely on their own noisy margin observations and natural-language
messages from neighbours. Menu items contain only minimal fields (name,
reachability). This represents the typical onset of a rapidly spreading wildfire
messages from neighbours. Menu items include route identity, travel time, and
an observation-based utility score (local road knowledge), but no fire-specific
risk metrics. This represents the typical onset of a rapidly spreading wildfire
before emergency services have issued formal guidance.

**alert_guided** — Official alerts broadcast general hazard information.
@@ -68,7 +69,7 @@ def load_scenario_config(mode: str) -> Dict[str, Any]:
"forecast_visible": False,
"route_head_forecast_visible": False,
"official_route_guidance_visible": False,
"expected_utility_visible": False,
"expected_utility_visible": True,
"neighborhood_observation_visible": True,
}
if name == "alert_guided":
@@ -81,7 +82,7 @@
"forecast_visible": True,
"route_head_forecast_visible": False,
"official_route_guidance_visible": False,
"expected_utility_visible": False,
"expected_utility_visible": True,
"neighborhood_observation_visible": True,
}
return {
@@ -194,22 +195,27 @@ def filter_menu_for_scenario(
for item in menu:
out = dict(item)
if not cfg["official_route_guidance_visible"]:
# Remove advisory labels produced by the operator briefing logic.
# Remove advisory labels and authority source produced by the operator briefing logic.
out.pop("advisory", None)
out.pop("briefing", None)
out.pop("reasons", None)

if not cfg["expected_utility_visible"]:
# Remove pre-computed utility scores so agents cannot use them as a shortcut.
out.pop("expected_utility", None)
out.pop("utility_components", None)
out.pop("guidance_source", None)

if cfg["mode"] == "no_notice":
# Reduce to the bare minimum an agent could reasonably know without warnings.
# Keep fields an agent could plausibly know from local familiarity:
# route identity, reachability, travel time/length (local knowledge),
# and observation-based utility scores.
if control_mode == "destination":
keep_keys = {"idx", "name", "dest_edge", "reachable", "note"}
keep_keys = {
"idx", "name", "dest_edge", "reachable", "note",
"travel_time_s_fastest_path", "len_edges_fastest_path",
"expected_utility", "utility_components",
}
else:
keep_keys = {"idx", "name", "len_edges"}
keep_keys = {
"idx", "name", "len_edges",
"expected_utility", "utility_components",
}
out = {k: v for k, v in out.items() if k in keep_keys}

prompt_menu.append(out)
@@ -240,17 +246,14 @@ def scenario_prompt_suffix(mode: str) -> str:
)
if cfg["mode"] == "alert_guided":
return (
"This is an alert-guided scenario: official alerts describe the fire, but they do not prescribe a route. "
# "Use forecast and hazard cues, but make your own navigation choice."
"but do not prescribe a specific route. Do NOT invent route guidance. Use the provided official alert content, "
"hazard and forecast cues (if provided), and local road conditions to choose when, where and how to evacuate."

"This is an alert-guided scenario: official alerts describe the fire but do not prescribe a specific route. "
"Do NOT invent route guidance. Use the provided official alert content, "
"hazard and forecast cues, and local road conditions to decide when, where, and how to evacuate."
)
return (
"This is an advice-guided scenario: official alerts include route-oriented guidance. "
"You may use advisories, briefings, and expected utility as formal support. "
# "ADVICE-GUIDED scenario: officials issue an evacuation *order* (leave immediately) and include route-oriented guidance (may be high-level and may change)."
"Default to following designated routes/instructions unless they are blocked, unsafe "
"or extremely congested; if deviating, state why and pick the safest feasible alternative. Stay responsive to updates."

"This is an advice-guided evacuation: the Emergency Operations Center has issued official route guidance for your area. "
"Follow routes marked advisory='Recommended' unless they are physically blocked or impassable. "
"If you must deviate from official guidance, state why and choose the safest feasible alternative. "
"Delayed departure or ignoring recommended routes increases your exposure to dangerous fire conditions. "
"Stay responsive to updated guidance as conditions change."
)
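The no_notice key-filtering in `filter_menu_for_scenario` above reduces each menu item to fields an agent could know from local familiarity. A standalone sketch of that filtering for destination mode — the keep-set is copied from the diff, while the sample menu item and its values are invented:

```python
def filter_no_notice_destination(item):
    """Strip fire-specific fields, keeping the no_notice keep-set from the diff."""
    keep_keys = {
        "idx", "name", "dest_edge", "reachable", "note",
        "travel_time_s_fastest_path", "len_edges_fastest_path",
        "expected_utility", "utility_components",
    }
    return {k: v for k, v in item.items() if k in keep_keys}


item = {
    "idx": 0,
    "name": "Lytton North",
    "reachable": True,
    "travel_time_s_fastest_path": 540.0,
    "risk_sum": 4.2,        # fire-specific: must not reach a no_notice agent
    "blocked_edges": 1,     # fire-specific: must not reach a no_notice agent
    "expected_utility": -1.3,
}
filtered = filter_no_notice_destination(item)
```

After filtering, `risk_sum` and `blocked_edges` are gone while route identity, travel time, and the observation-based utility score survive — which is exactly the information regime the updated module docstring describes.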
4 changes: 4 additions & 0 deletions agentevac/analysis/metrics.py
@@ -137,6 +137,10 @@ def record_arrival(self, agent_id: str, sim_t_s: float) -> None:
self._arrival_times[agent_id] = float(sim_t_s)
self._last_seen_time[agent_id] = float(sim_t_s)

def arrived_count(self) -> int:
"""Return the number of agents that have arrived at their destination."""
return len(self._arrival_times)

def observe_active_vehicles(self, active_vehicle_ids: List[str], sim_t_s: float) -> None:
"""Update the active-vehicle set for live bookkeeping only.

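The new `arrived_count()` accessor pairs naturally with commit edd71e9 ("add early termination when all agents have evacuated"). A sketch of how a simulation loop could use it — `record_arrival` mirrors the diff, but the surrounding metrics class and the termination wiring are assumptions:

```python
class EvacuationMetrics:
    """Minimal stand-in for the metrics tracker touched in the diff above."""

    def __init__(self):
        self._arrival_times = {}
        self._last_seen_time = {}

    def record_arrival(self, agent_id: str, sim_t_s: float) -> None:
        # Mirrors the diff: first arrival time wins, last-seen always updates.
        self._arrival_times.setdefault(agent_id, float(sim_t_s))
        self._last_seen_time[agent_id] = float(sim_t_s)

    def arrived_count(self) -> int:
        """Return the number of agents that have arrived at their destination."""
        return len(self._arrival_times)


metrics = EvacuationMetrics()
total_agents = 2
for agent_id, t in [("veh0", 120.0), ("veh1", 185.0), ("veh1", 190.0)]:
    metrics.record_arrival(agent_id, t)

# Hypothetical early-termination check a time-stepped loop could run each tick.
all_evacuated = metrics.arrived_count() >= total_agents
```

Counting distinct keys in `_arrival_times` (rather than raw arrival events) makes the check robust to duplicate arrival notifications for the same vehicle.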