
Commit b5357f9

optimizations to the agentic workflow - force rerunning d8 upon extracting crash from DB, etc...
1 parent 0b2fb15 commit b5357f9

File tree

6 files changed: +136 additions, −20 deletions

Sources/Agentic_System/agents/EBG_crash.py

Lines changed: 1 addition & 0 deletions
@@ -240,6 +240,7 @@ def setup_agents(self, crash_program_hash: Optional[str] = None):
                 execute_javascript_program_tool,
                 list_d8_flags_tool,
                 list_v8_trace_options_tool,
+                trace_v8_analysis_tool,
                 read_from_generate_folder_tool,
                 list_generate_folder_tool,
             ],

Sources/Agentic_System/prompts/EBG-crash-prompts/runtime_analyzer.txt

Lines changed: 12 additions & 2 deletions
@@ -31,9 +31,19 @@ which flags you need to use run list_v8_trace_options. Here your goal is to figu
 based on the information that was returned to you from the DB analyzer. Your goal should be to figure out what and how the
 JS program runs and create a plan towards figuring out a path forward in terms of analyzing the v8 code base to better understand
 how to fix the system.
+You should execute d8 with tracing flags (via trace_v8_analysis and/or execute_javascript_program) before finalizing conclusions.
+If a crash hash is available, you must call trace_v8_analysis for that hash before Stage 5 and include concrete evidence from its stderr/stdout output.
+Your final Stage 5 answer must cite: trace_v8_analysis flags_used, return_code, and at least one raw crash line from trace output.
 
-If database evidence indicates synthetic/non-reproducible data (for example fake crash markers), do not launch heavy debugger flows.
-In that case, report the limitation and proceed with static/runtime trace evidence only.
+CRITICAL EVIDENCE PRIORITY:
+1) raw runtime artifacts (stderr, signal, fatal line, stack trace) from direct d8 execution and trace_v8_analysis
+2) trace output metadata from tools
+3) database summaries/aggregates
+
+Never let database aggregate summaries override contradictory raw runtime artifacts.
+If the DB indicates a "fake crash" but the raw runtime output has concrete crash evidence (signal/fatal/stack), classify it as an inconsistent DB state and continue the analysis using the raw evidence.
+
+Do not skip stages only because the crash is synthetic; gate interpretation confidence, not execution.
 
 
 ## STAGE 3:
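The evidence-priority rule in the updated prompt can be sketched as a small reconciliation function. The crash-marker regexes and labels below are illustrative assumptions, not the system's actual schema:

```python
import re

# Illustrative markers for "raw runtime crash evidence" (signal / fatal line / stack frame).
CRASH_PATTERNS = [
    r"Fatal error",
    r"Received signal \d+",
    r"#\d+ 0x[0-9a-f]+",
]

def classify_crash(db_label: str, raw_stderr: str) -> str:
    """Raw runtime artifacts outrank DB summaries, per the prompt's priority list."""
    has_raw_evidence = any(re.search(p, raw_stderr) for p in CRASH_PATTERNS)
    if has_raw_evidence:
        # A DB "fake crash" label contradicted by raw evidence is an inconsistent DB state.
        return "inconsistent-db" if db_label == "fake crash" else "confirmed-crash"
    if db_label == "fake crash":
        return "synthetic"
    return "unconfirmed"
```

Here `classify_crash("fake crash", "Received signal 11 SEGV_MAPERR")` yields `"inconsistent-db"`, so the analysis continues on raw evidence rather than halting on the DB label.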

Sources/Agentic_System/prompts/EBG-crash-prompts/variant_analysis.txt

Lines changed: 9 additions & 0 deletions
@@ -24,10 +24,19 @@ Look for similar instances of the crashing code across the codebase.
 Go through the returned RAG entries of V8Search and validate variants; if you can confirm the same bug
 exists in the code, please save the code as valid_variant to the RAG.
 
+Evidence priority during validation:
+1) raw runtime crash artifacts from d8 runs (stderr/signal/fatal line/stack)
+2) trace metadata and tool outputs
+3) DB summaries
 
 ## STAGE 3
 After performing variant analysis, use `JSGenerator` and `Debugger` to create programs
 that crash in a similar manner.
+You should run d8 with trace flags for crash confirmation context before declaring a variant valid.
+For every candidate variant marked valid, you must include one trace_v8_analysis or traced execute_javascript_program result with:
+- flags used
+- return code
+- stderr/stdout crash evidence
 
 Only call `Debugger` after you have a concrete JS artifact path and a specific hypothesis to test.
 Do not run debugger loops when there is no reproducible crash signal.
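The per-variant evidence requirement above could be enforced mechanically. The field names (`flags_used`, `return_code`, `stderr`) are assumed for illustration and may not match the tools' actual result schema:

```python
REQUIRED_EVIDENCE_FIELDS = ("flags_used", "return_code", "stderr")

def variant_has_required_evidence(trace_result: dict) -> bool:
    """Check that a candidate variant's trace record carries the evidence listed above."""
    if any(field not in trace_result for field in REQUIRED_EVIDENCE_FIELDS):
        return False
    # A clean exit with empty stderr is not crash evidence.
    return trace_result["return_code"] != 0 and bool(trace_result["stderr"].strip())
```

A variant would only be marked valid_variant when this gate passes for at least one traced run.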

Sources/Agentic_System/prompts/EBG-crash-prompts/variant_manager.txt

Lines changed: 5 additions & 0 deletions
@@ -22,6 +22,11 @@ Call `RuntimeAnalyzer` in the following format:
 PLEASE PROVIDE A REASON AND PROOF REGARDING YOUR REASONING."
 }
 
+CRITICAL EVIDENCE FOR STAGE 1 RESULTS:
+- Raw runtime crash artifacts (stderr, signal, fatal line, stack) from direct d8 traces.
+- If raw artifacts and DB summaries conflict, treat it as a DB inconsistency and continue with raw evidence.
+- If a crash is found/selected, RuntimeAnalyzer should include at least one trace_v8_analysis result for that crash hash.
+- The Stage 1 summary must include: flags used, return code, and quoted stderr/stdout evidence from trace_v8_analysis.
 
 ## STAGE 2 Call "VariantAnalysis"
 After you have received the reasoning behind the crash/bug, your goal is to call variant analysis:
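The Stage 1 reporting requirement could be templated along these lines. This formatter is hypothetical and not part of the commit; it only shows the three mandated pieces of evidence in one block:

```python
def format_stage1_summary(flags_used: str, return_code: int, stderr_excerpt: str) -> str:
    """Assemble the Stage 1 evidence block the prompt above requires (sketch)."""
    return (
        "STAGE 1 EVIDENCE\n"
        f"flags used: {flags_used}\n"
        f"return code: {return_code}\n"
        f'stderr evidence: "{stderr_excerpt}"'
    )
```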

Sources/Agentic_System/tools/EBG_tools.py

Lines changed: 93 additions & 15 deletions
@@ -5,6 +5,7 @@
 from fuzzywuzzy import fuzz
 from functools import wraps
 from decimal import Decimal
+from collections import OrderedDict
 
 import psycopg2
 import psycopg2.extras
@@ -13,6 +14,7 @@
 import hashlib
 import random
 import re
+import json
 
 # Environment variables (optional, for remote PostgreSQL):
 # - POSTGRES_HOST: Remote PostgreSQL host/IP (if set, connects to remote instead of local container)
@@ -46,6 +48,8 @@
 
 # Lazy initialization - folder is created only when needed, not at module import
 _GENERATE_FOLDER_HASHS = None
+_DB_QUERY_CACHE = OrderedDict()
+_DB_QUERY_CACHE_MAX = 128
 
 def _get_varianal_folder():
     """Get or create the variant analysis folder path (lazy initialization)."""
@@ -62,6 +66,69 @@ def json_serial(obj):
     raise TypeError(f"Type {type(obj)} not serializable")
 
 
+def _normalize_sql_whitespace(query: str) -> str:
+    return " ".join((query or "").strip().split())
+
+
+def _is_read_only_sql(query: str) -> bool:
+    normalized = _normalize_sql_whitespace(query).lower()
+    return normalized.startswith("select ") or normalized.startswith("with ") or normalized.startswith("explain ")
+
+
+def _build_cache_key(query: str, exec_params: tuple) -> str:
+    return f"{query}||{json.dumps(exec_params, default=str, ensure_ascii=True)}"
+
+
+def _cache_get(cache_key: str):
+    if cache_key not in _DB_QUERY_CACHE:
+        return None
+    _DB_QUERY_CACHE.move_to_end(cache_key)
+    return _DB_QUERY_CACHE[cache_key]
+
+
+def _cache_set(cache_key: str, value: str) -> None:
+    _DB_QUERY_CACHE[cache_key] = value
+    _DB_QUERY_CACHE.move_to_end(cache_key)
+    while len(_DB_QUERY_CACHE) > _DB_QUERY_CACHE_MAX:
+        _DB_QUERY_CACHE.popitem(last=False)
+
+
+def _validate_and_prepare_sql(query: str, params: list) -> tuple:
+    query = (query or "").strip()
+    params = [] if params is None else params
+    pg_matches = re.findall(r"\$(\d+)", query)
+    percent_placeholder_count = len(re.findall(r"(?<!%)%s", query))
+
+    if pg_matches and percent_placeholder_count:
+        return None, None, "Database error: mixed placeholder styles are not allowed (use either $n or %s)."
+
+    if pg_matches:
+        positions = [int(match) for match in pg_matches]
+        max_pos = max(positions)
+        if len(params) != max_pos:
+            return (
+                None,
+                None,
+                f"Database error: positional placeholder mismatch (expected exactly {max_pos} params for $ placeholders, got {len(params)}).",
+            )
+        normalized_query = re.sub(r"\$\d+", "%s", query)
+        exec_params = tuple(params[pos - 1] for pos in positions)
+        return normalized_query, exec_params, None
+
+    if percent_placeholder_count:
+        if len(params) != percent_placeholder_count:
+            return (
+                None,
+                None,
+                f"Database error: placeholder mismatch (expected exactly {percent_placeholder_count} params for %s placeholders, got {len(params)}).",
+            )
+        return query, tuple(params), None
+
+    if len(params) > 0:
+        return None, None, "Database error: query has no placeholders but params were provided."
+    return query, tuple(), None
+
 
 @tool
 def db_query(query: str, params: list = None) -> str:
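The `$n`-to-`%s` normalization performed by `_validate_and_prepare_sql` can be illustrated in isolation. This standalone sketch mirrors the regex logic above but raises an exception instead of returning an error-string triple:

```python
import re

def normalize_placeholders(query, params):
    """Rewrite PostgreSQL-style $1, $2, ... placeholders to psycopg2-style %s (sketch)."""
    positions = [int(m) for m in re.findall(r"\$(\d+)", query)]
    if positions and len(params) != max(positions):
        raise ValueError("placeholder/param count mismatch")
    normalized = re.sub(r"\$\d+", "%s", query)
    # Reorder params to follow the placeholders' appearance order in the query.
    exec_params = tuple(params[p - 1] for p in positions)
    return normalized, exec_params

q, p = normalize_placeholders(
    "SELECT * FROM crashes WHERE hash = $1 AND fuzzer = $2", ["abc", 7]
)
# q == "SELECT * FROM crashes WHERE hash = %s AND fuzzer = %s", p == ("abc", 7)
```

Queries without placeholders pass through unchanged, which matches the helper's final branch.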
@@ -77,22 +144,19 @@ def db_query(query: str, params: list = None) -> str:
     """
     conn = None
     try:
-        params = [] if params is None else params
-
-        # Accept PostgreSQL-style placeholders ($1, $2, ...) and normalize to psycopg2 (%s).
-        positional_matches = re.findall(r"\$(\d+)", query)
-        if positional_matches:
-            max_pos = max(int(m) for m in positional_matches)
-            if len(params) < max_pos:
-                return (
-                    f"Database error: not enough parameters for positional placeholders "
-                    f"(expected at least {max_pos}, got {len(params)})"
-                )
-            normalized_query = re.sub(r"\$\d+", "%s", query)
-            exec_params = tuple(params[int(m) - 1] for m in positional_matches)
+        normalized_query, exec_params, validation_error = _validate_and_prepare_sql(query, params)
+        if validation_error:
+            return validation_error
+
+        read_only_query = _is_read_only_sql(normalized_query)
+        cache_key = _build_cache_key(_normalize_sql_whitespace(normalized_query), exec_params)
+        if read_only_query:
+            cached = _cache_get(cache_key)
+            if cached is not None:
+                return cached
         else:
-            normalized_query = query
-            exec_params = tuple(params)
+            # Keep cache only for stable read-only data.
+            _DB_QUERY_CACHE.clear()
 
         conn = psycopg2.connect(
             host=POSTGRES_HOST,
@@ -103,8 +167,17 @@ def db_query(query: str, params: list = None) -> str:
         )
         cursor = conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor)
         cursor.execute(normalized_query, exec_params)
+        if cursor.description is None:
+            conn.commit()
+            return json.dumps(
+                {"status": "ok", "rows_affected": cursor.rowcount},
+                default=json_serial,
+                indent=2,
+            )
         rows = cursor.fetchall()
         result_json = json.dumps(rows, default=json_serial, indent=2)
+        if read_only_query:
+            _cache_set(cache_key, result_json)
         return result_json
     except psycopg2.Error as e:
         return f"Database error: {e}"
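The caching added to `db_query` relies on `OrderedDict`-based LRU eviction via `_cache_get`/`_cache_set`. A miniature version with a bound of 3 (the tool uses 128) makes the eviction order visible:

```python
from collections import OrderedDict

CACHE = OrderedDict()
CACHE_MAX = 3  # small bound for illustration; the tool uses 128

def cache_set(key, value):
    CACHE[key] = value
    CACHE.move_to_end(key)
    while len(CACHE) > CACHE_MAX:
        CACHE.popitem(last=False)  # drop the least-recently-used entry

def cache_get(key):
    if key not in CACHE:
        return None
    CACHE.move_to_end(key)  # a hit marks the entry as recently used
    return CACHE[key]

for k in "abc":
    cache_set(k, k.upper())
cache_get("a")       # touch "a"; "b" becomes least recently used
cache_set("d", "D")  # exceeds the bound, so "b" is evicted
# list(CACHE) == ["c", "a", "d"]
```

Clearing the whole cache on any non-read-only query (and after `db_store_generated_program`) is a coarse but safe invalidation strategy: stale reads are impossible at the cost of losing all cached results on every write.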
@@ -926,6 +999,10 @@ def trace_v8_analysis(
     with open(filepath_js, "w") as f:
         f.write(js_code)
 
+    # Enforce baseline tracing when the caller does not specify tracing options.
+    if presets is None and custom_flags is None:
+        presets = ["tiering", "maglev", "ignition"]
+
     flags = ["--allow-natives-syntax"]
 
     if presets:
@@ -1165,6 +1242,7 @@ def db_store_generated_program(js_program: str, fuzzer_id: int) -> str:
     # Fetch the inserted row
     row = cursor.fetchone()
     conn.commit()
+    _DB_QUERY_CACHE.clear()
 
     # If row is None, the program already existed (conflict)
     if row is None:

Sources/Agentic_System/tools/FoG_tools_ika.py

Lines changed: 16 additions & 3 deletions
@@ -1633,11 +1633,24 @@ def _execute_javascript_program_executor(params: dict) -> str:
     if not template_js_path:
         return "Error: template_js_path parameter is required"
 
-    if "--allow-natives-syntax" not in d8_flags:
-        d8_flags += " --allow-natives-syntax"
+    required_flags = [
+        "--allow-natives-syntax",
+        "--trace-opt",
+        "--trace-deopt",
+        "--trace-maglev-graph-building",
+        "--print-bytecode",
+    ]
+    for flag in required_flags:
+        if flag not in d8_flags:
+            d8_flags += f" {flag}"
+    d8_flags = d8_flags.strip()
 
     d8 = run_command(f"{D8_PATH} {d8_flags} {template_js_path}")
-    return f"Program execution result:\n{d8.stderr}\n{d8.stdout}"
+    return (
+        "Program execution result:\n"
+        f"[flags used] {d8_flags}\n"
+        f"{d8.stderr}\n{d8.stdout}"
+    )
 
 execute_javascript_program_tool = IkaTools(
     name="execute_javascript_program",
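The flag-merging loop above can be exercised standalone. Note that the simple substring check (also used by the commit's code) is a simplification: a required flag that happens to be a substring of an already-present flag would be silently skipped.

```python
def ensure_flags(d8_flags, required):
    """Append any required d8 flags missing from the flag string (sketch of the loop above)."""
    for flag in required:
        if flag not in d8_flags:  # substring containment, as in the diff
            d8_flags += f" {flag}"
    return d8_flags.strip()

flags = ensure_flags("--trace-opt", ["--allow-natives-syntax", "--trace-opt", "--trace-deopt"])
# flags == "--trace-opt --allow-natives-syntax --trace-deopt"
```

A stricter variant would split `d8_flags` into tokens and compare whole flags, avoiding false positives on prefix collisions.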
