**Description:** Iterate on `scripts/optimize_strategies.py` to improve robustness, extensibility, and user experience after the initial implementation (PR #72). Focus areas include stronger validation, clearer feedback for long-running jobs, more flexible parameter handling, consistent logging, and forward-compatible result storage.

**Acceptance Criteria:**

- CLI validates YAML and CLI overrides with clear, actionable error messages for invalid ranges, missing/unknown fields, and conflicting options.
- Optional parallel evaluation mode (config flag and/or CLI flag) for grid/random search that preserves deterministic behavior when seeds are fixed.
- Result storage schema includes a simple version field and continues to read older rows safely (defensive parsing for missing/extra columns; see the sketch after this list).
- Logging uses the standard `logging` module with at least `--verbose` and `--quiet` behaviors, replacing ad-hoc `stderr` writes.
- Categorical (non-numeric) parameter support is either minimally implemented for a representative case or explicitly documented as a future design (with clear limitations).
- All tests pass; coverage for `scripts/optimize_strategies.py` is maintained at or above the current baseline.
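
For the storage-versioning criterion, a minimal sketch of what defensive row parsing could look like; the `schema_version` field, `ResultRow` shape, and column names here are hypothetical illustrations, not the script's actual schema:

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

CURRENT_SCHEMA_VERSION = 2  # hypothetical: bump whenever columns are added or renamed


@dataclass
class ResultRow:
    schema_version: int
    win_rate_delta: float
    diversity: Optional[float]  # column added later; older rows will not have it


def parse_row(raw: Mapping[str, Any]) -> ResultRow:
    """Read a stored result row, tolerating missing and extra columns."""
    # Rows written before versioning was introduced are treated as version 1.
    version = int(raw.get("schema_version", 1))
    return ResultRow(
        schema_version=version,
        win_rate_delta=float(raw.get("win_rate_delta", 0.0)),
        # Defensive: only read the newer column when it is actually present.
        diversity=float(raw["diversity"]) if "diversity" in raw else None,
    )


print(parse_row({"win_rate_delta": "0.12", "unexpected_column": "ignored"}))
```
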
See the sections below for details on statistical analysis, regression detection, and report formats.

## Strategy Parameter Optimization

The `optimize_strategies.py` script (Phase 11, M11.4) provides automated strategy parameter tuning using optimization algorithms to find well-balanced strategy configurations. It helps reduce dominant strategy win rate deltas and improve strategic diversity.

### Optimization Algorithms

Three optimization algorithms are supported:

1. **Grid Search** (`--algorithm grid`): Exhaustive search over all parameter combinations. Best for small parameter spaces or when you need guaranteed coverage.
2. **Random Search** (`--algorithm random`): Randomly samples parameter configurations. Efficient for large parameter spaces where exhaustive search is impractical (a minimal sampling sketch follows this list).
3. **Bayesian Optimization** (`--algorithm bayesian`): Uses Gaussian processes to intelligently explore the parameter space. Requires the `scikit-optimize` package (`pip install scikit-optimize`).
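
To make the random-search idea concrete, here is a small self-contained sketch (not the script's actual code; the `evaluate` function is a placeholder for running sweeps and scoring them):

```python
import random

# Parameter ranges, mirroring the shape of the YAML config shown below.
PARAM_RANGES = {
    "stability_low": (0.5, 0.8),
    "stability_critical": (0.3, 0.5),
    "faction_low_legitimacy": (0.3, 0.6),
}


def evaluate(params: dict) -> float:
    """Placeholder objective: the real script would run sweeps here (lower is better)."""
    return abs(params["stability_low"] - 0.65) + abs(params["stability_critical"] - 0.4)


def random_search(n_samples: int, seed: int = 42):
    rng = random.Random(seed)  # fixed seed keeps the search reproducible
    best_params, best_score = None, float("inf")
    for _ in range(n_samples):
        candidate = {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}
        score = evaluate(candidate)
        if score < best_score:
            best_params, best_score = candidate, score
    return best_params, best_score


print(random_search(n_samples=50))
```
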
### Configuration

Optimization can be configured via `content/config/optimization.yml` or CLI flags:

```yaml
parameters:
  stability_low:
    min: 0.5
    max: 0.8
    step: 0.1  # For grid search
  stability_critical:
    min: 0.3
    max: 0.5
  faction_low_legitimacy:
    min: 0.3
    max: 0.6

targets:
  - name: win_rate_delta
    weight: 1.0
    direction: minimize
  - name: diversity
    weight: 0.5
    direction: maximize

settings:
  algorithm: random
  n_samples: 50
  tick_budget: 100
  seeds: [42, 123, 456]
  strategies: [balanced, aggressive, diplomatic]
```
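
As an illustration of how the `targets` entries above might be folded into a single score, here is a sketch that assumes a simple weighted sum in which maximized metrics are negated (the actual script may combine objectives differently, for example by keeping them separate for Pareto analysis):

```python
def combined_score(metrics: dict, targets: list) -> float:
    """Combine per-target metric values into one number where lower is better."""
    total = 0.0
    for target in targets:
        value = metrics[target["name"]]
        weight = target.get("weight", 1.0)
        # Flip the sign for maximized targets so improvements always lower the score.
        sign = -1.0 if target.get("direction") == "maximize" else 1.0
        total += sign * weight * value
    return total


targets = [
    {"name": "win_rate_delta", "weight": 1.0, "direction": "minimize"},
    {"name": "diversity", "weight": 0.5, "direction": "maximize"},
]
print(combined_score({"win_rate_delta": 0.12, "diversity": 0.90}, targets))  # -> -0.33
```
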
### Running Optimization

**Basic optimization with grid search:**

```bash
uv run python scripts/optimize_strategies.py optimize --algorithm grid
```

**Random search with more samples:**

```bash
uv run python scripts/optimize_strategies.py optimize --algorithm random --samples 100
```

**Bayesian optimization:**

```bash
uv run python scripts/optimize_strategies.py optimize --algorithm bayesian --samples 50
```
### Optimization Targets

The optimizer supports multiple optimization targets:

- **win_rate_delta**: Minimize the maximum win rate difference between strategies. Lower values indicate better balance.
- **diversity**: Maximize strategic diversity (different strategies succeed in different scenarios). Uses entropy-based scoring.
- **stability**: Target average stability across simulations.

Multi-objective optimization produces a Pareto frontier showing trade-offs between competing objectives.
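
For intuition, the two headline metrics could be computed roughly as follows (a sketch under assumed definitions; the script's exact formulas may differ):

```python
import math


def win_rate_delta(win_rates: dict) -> float:
    """Spread between the best- and worst-performing strategies (lower is better)."""
    return max(win_rates.values()) - min(win_rates.values())


def diversity(win_counts: dict) -> float:
    """Normalized entropy of wins across strategies (1.0 means perfectly even)."""
    total = sum(win_counts.values())
    if total == 0 or len(win_counts) < 2:
        return 0.0
    probs = [count / total for count in win_counts.values() if count > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(win_counts))


print(win_rate_delta({"balanced": 0.40, "aggressive": 0.35, "diplomatic": 0.25}))  # 0.15
print(round(diversity({"balanced": 40, "aggressive": 35, "diplomatic": 25}), 3))
```
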
### Pareto Frontier

The Pareto frontier represents configurations that are optimal in some dimension: no other configuration is better in all objectives simultaneously. View the frontier with:

```bash
uv run python scripts/optimize_strategies.py pareto --database build/sweep_results.db
```

This helps identify trade-offs such as:

- Balance vs. difficulty (easier games may be more balanced)
- Diversity vs. stability (more diverse outcomes may have wider stability ranges)

#### Interpreting the Pareto Frontier

The Pareto frontier is a set of parameter configurations where no single configuration is strictly better than another across all objectives. Each point on the frontier represents a trade-off:

- **If you want the most balanced game:** Look for points with the lowest `win_rate_delta`.
- **If you want the most diverse strategies:** Look for points with the highest `diversity` score.
- **If you want a compromise:** Choose a point that balances both objectives, or use the weights in your optimization config to bias toward your design goals.
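
The dominance rule behind the frontier can be sketched in a few lines (assuming every objective has already been oriented so that lower is better, for example by negating `diversity`):

```python
def dominates(a: dict, b: dict) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(a[k] <= b[k] for k in a) and any(a[k] < b[k] for k in a)


def pareto_frontier(points: list) -> list:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q is not p)]


candidates = [
    {"win_rate_delta": 0.10, "neg_diversity": -0.95},
    {"win_rate_delta": 0.12, "neg_diversity": -0.98},
    {"win_rate_delta": 0.08, "neg_diversity": -0.90},
    {"win_rate_delta": 0.15, "neg_diversity": -0.80},  # dominated by the first candidate
]
print(pareto_frontier(candidates))  # the first three survive
```
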

**Visualizing the Pareto Frontier:**

You can plot the Pareto points (e.g., `win_rate_delta` vs. `diversity`) using your favorite plotting tool or spreadsheet. This helps you see the shape of the trade-off curve and pick a configuration that fits your needs.
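
For instance, a few lines of matplotlib are enough for a quick look (the points below are illustrative; substitute the values exported from your own run):

```python
import matplotlib.pyplot as plt

# Illustrative frontier points: (win_rate_delta, diversity) pairs from a finished run.
frontier = [(0.10, 0.95), (0.12, 0.98), (0.08, 0.90)]

deltas, diversities = zip(*frontier)
plt.scatter(deltas, diversities)
plt.xlabel("win_rate_delta (lower is better)")
plt.ylabel("diversity (higher is better)")
plt.title("Pareto frontier trade-offs")
plt.show()
```
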

**Example:**

If the Pareto frontier includes:

| win_rate_delta | diversity | stability |
|----------------|-----------|-----------|
| 0.10 | 0.95 | 0.70 |
| 0.12 | 0.98 | 0.68 |
| 0.08 | 0.90 | 0.72 |

You might choose the third row for the best balance (lowest `win_rate_delta`), the second for the best diversity, or the first as a compromise.

**Tip:** No point on the Pareto frontier is strictly "best"; the right choice depends on your design priorities.

### Output Files

After optimization, results are saved to the output directory (default: `build/optimization/`):

- `optimization_result.json`: Full optimization data including all evaluated configurations
- `optimization_report.md`: Human-readable Markdown report with best parameters and Pareto frontier

### Integration with Result Storage

Optimization results are automatically stored in the sweep results database (`build/sweep_results.db`) for historical tracking. Query past optimization runs:

```bash
uv run python scripts/optimize_strategies.py pareto --limit 10
```

### CLI Options

| Flag | Description |
|------|-------------|
| `--config, -c` | Path to YAML configuration file |
| `--samples, -n` | Number of samples for random/bayesian search |
| `--ticks, -t` | Tick budget per sweep simulation |
| `--seed` | Random seed for reproducibility |
| `--output-dir, -o` | Output directory for results |
| `--database, -d` | Path to sweep results database |
| `--json` | Output as JSON instead of files |
| `--verbose, -v` | Print progress information |
| `--no-store` | Skip storing result in database |

### Troubleshooting & FAQ

- **Q: My optimization run is very slow or seems stuck.**
  - Try reducing the number of samples or using random search instead of grid search for large parameter spaces.
  - Use the `--verbose` flag to monitor progress.
- **Q: I get an error about missing or invalid parameters.**
  - Check your YAML config and CLI flags for typos or out-of-range values. All parameter names must match those in your strategy config.
- **Q: The Pareto frontier is empty or has only one point.**
  - This can happen if all configurations are dominated or if your parameter ranges are too narrow. Try expanding the search space or adjusting your targets.
- **Q: How do I add a new optimization target?**
  - Edit your config to add a new target (e.g., `stability`) and re-run the optimizer. See the config example above.

### Example Workflow

1. **Run initial optimization:**

   ```bash
   uv run python scripts/optimize_strategies.py optimize --algorithm random --samples 50 --verbose
   ```

2. **Review results:**

   ```bash
   cat build/optimization/optimization_report.md
   ```

3. **Apply best parameters:** Update `src/gengine/ai_player/strategies.py` with the discovered optimal values for `StrategyConfig` (see the sketch after this list).

4. **Validate with batch sweeps:**

   ```bash
   uv run python scripts/run_batch_sweeps.py --output-dir build/validation
   ```

5. **Generate balance report:**

   ```bash
   uv run python scripts/analyze_balance.py report --database build/sweep_results.db
   ```
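
For step 3, applying the tuned values might look roughly like the sketch below. The field names are hypothetical, inferred from the parameter names in the optimization config; check the real `StrategyConfig` definition in `src/gengine/ai_player/strategies.py` before editing it.

```python
# Hypothetical illustration only: the real StrategyConfig may define different fields.
from dataclasses import dataclass


@dataclass
class StrategyConfig:
    stability_low: float = 0.7
    stability_critical: float = 0.4
    faction_low_legitimacy: float = 0.5


# Values copied from the "best parameters" section of optimization_report.md.
tuned = StrategyConfig(
    stability_low=0.65,
    stability_critical=0.35,
    faction_low_legitimacy=0.45,
)
print(tuned)
```
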
## Balance Iteration Workflow

### Recommended Workflow

1. **Initial Exploration:** Run batch sweeps with diverse parameter combinations to establish baseline metrics.
2. **Tournament Validation:** Run focused tournaments on specific strategy combinations.
3. **Analysis:** Use the analysis script to identify dominant strategies, underpowered/overpowered actions, and unused content.
4. **Parameter Optimization:** Use `optimize_strategies.py` to find balanced parameter configurations automatically.
5. **Adjustment:** Apply optimized parameters or modify authored content based on findings.
6. **Regression Testing:** Re-run batch sweeps to validate improvements and ensure no regressions.

## CI Integration
A nightly CI workflow automatically runs tournaments and batch sweeps, archiving results for ongoing balance review. See `.github/workflows/ai-tournament.yml` for details.

## Usage Tips

- Use different world configs and seeds to stress-test balance across scenarios.
- For large parameter spaces, start with `sampling.mode: random` and a reduced `sample_count`.
- Review the analysis report regularly to guide design iteration.

## Best Practices & Advanced Tips

- Start with a broad parameter sweep to understand the landscape, then narrow in on promising regions.
- Use multiple random seeds to avoid overfitting to a single scenario.
- Regularly review the Markdown and JSON reports to track progress and spot regressions.
- Archive your optimization results and reports for future reference and reproducibility.
- For advanced analysis, export Pareto points and plot them to visualize trade-offs.