RegimeForge is a regime-aware reinforcement learning workbench for synthetic trading research. It combines hidden-regime market simulators, multiple discrete and continuous RL baselines, live dashboards, and experiment tooling for ablations, OOD sweeps, and artifact-driven analysis.
The public project name is RegimeForge. The internal Python package remains regime_lens for compatibility.
Most toy RL trading repos stop at a single equity curve. RegimeForge is designed to answer a harder question: did the agent actually learn market structure, or did it just get lucky on one trajectory?
To make that inspectable, the project focuses on three things:
- Synthetic and semi-realistic market configurations with explicit hidden regimes such as bull, bear, chop, and shock
- Multiple policy families, including discrete DQN agents and continuous actor-critic agents
- Instrumentation that exposes training progress, checkpoint performance, regime alignment, expert specialization, and reproducibility metadata
- Synthetic hidden-regime market environments with configurable transition matrices and per-regime dynamics
- Semi-realistic data mode that fits regime parameters from local price CSVs and injects them into training configs
- Discrete agent variants: vanilla DQN, Oracle DQN, HMM+DQN, RCMoE-DQN, and Transformer-DQN
- Continuous actor-critic variants for PPO, SAC, and gated RCMoE actor-critic experiments
- Dreamer-style world-model experiments with RSSM dynamics and latent-space actor/critic training
- Multi-asset and non-stationary continuous market paths for robustness checks
- Terminal dashboard and FastAPI artifact dashboard for inspecting runs and checkpoints
- Experiment runner for smoke tests, full benchmarks, ablations, OOD sweeps, parallel execution, and report rebuilds
- Artifact pipeline for model weights, resume state, summaries, metrics, stats, explainability, and reproducibility metadata
- Visualization helpers for curves, heatmaps, policy surfaces, gate evolution, and LaTeX export
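As a rough illustration of what a hidden-regime environment with a configurable transition matrix and per-regime dynamics can look like, here is a minimal standalone sketch. It does not use the regime_lens package; the names `TRANSITIONS`, `DYNAMICS`, and `simulate`, along with all numeric values, are assumptions made up for this example.

```python
import math
import random

REGIMES = ["bull", "bear", "chop", "shock"]

# Hypothetical row-stochastic matrix: TRANSITIONS[i][j] = P(next = j | current = i).
TRANSITIONS = [
    [0.95, 0.02, 0.02, 0.01],
    [0.03, 0.93, 0.03, 0.01],
    [0.04, 0.04, 0.91, 0.01],
    [0.20, 0.20, 0.30, 0.30],
]

# Hypothetical per-regime (drift, volatility) of per-step log returns.
DYNAMICS = {
    "bull": (0.0008, 0.010),
    "bear": (-0.0010, 0.015),
    "chop": (0.0000, 0.006),
    "shock": (-0.0050, 0.040),
}


def simulate(steps, seed=0, start_price=100.0):
    """Return a list of (regime_name, price) pairs for one synthetic path."""
    rng = random.Random(seed)
    regime = 0  # start in the first regime
    price = start_price
    path = []
    for _ in range(steps):
        drift, vol = DYNAMICS[REGIMES[regime]]
        # Price follows a log-normal step whose parameters depend on the regime.
        price *= math.exp(rng.gauss(drift, vol))
        path.append((REGIMES[regime], price))
        # Sample the next hidden regime from the current row of the matrix.
        regime = rng.choices(range(len(REGIMES)), weights=TRANSITIONS[regime])[0]
    return path


if __name__ == "__main__":
    for name, price in simulate(5, seed=42):
        print(f"{name:5s} {price:8.2f}")
```

In a setup like this the agent would observe only prices (or features derived from them), while the regime labels stay available for evaluation and for oracle-style baselines.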
flowchart LR
A["Market Environments"] --> B["Training Manager"]
B --> C["Discrete + Continuous Agents"]
C --> D["Artifacts"]
D --> E["TUI Dashboard"]
D --> F["Web Dashboard"]
D --> G["Experiment Reports"]
D --> H["Visualization Exports"]
- dqn: standard DQN baseline with no explicit regime model
- oracle_dqn: upper-bound baseline with true regime one-hot appended to observations
- hmm_dqn: lightweight GMM detector plus DQN policy, used as the current HMM-style proxy baseline
- rcmoe_dqn: regime-conditioned mixture-of-experts DQN with gate routing and expert specialization analysis
- transformer_dqn: DQN with a Transformer encoder over flattened observation histories
- world_model: Dreamer-style latent dynamics agent for discrete market experiments
- ppo: continuous actor-critic policy over allocation weights
- sac: off-policy continuous actor-critic policy over allocation weights
- rcmoe actor-critic: gated expert actor-critic path for continuous PPO/SAC variants
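To make the gate-routing idea behind rcmoe_dqn more concrete, the sketch below blends per-expert Q-values with softmax gate weights and reports which expert dominates. It is a generic mixture-of-experts illustration in plain Python, not the code in rcmoe.py; the function names `softmax`, `mix_q_values`, and `dominant_expert` are assumptions for this example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mix_q_values(gate_logits, expert_q):
    """Blend Q-values from several experts.

    gate_logits: one logit per expert.
    expert_q:    expert_q[k][a] is expert k's Q-value for action a.
    """
    weights = softmax(gate_logits)
    n_actions = len(expert_q[0])
    return [
        sum(w * q[a] for w, q in zip(weights, expert_q))
        for a in range(n_actions)
    ]


def dominant_expert(gate_logits):
    """Index of the expert with the largest gate weight."""
    return max(range(len(gate_logits)), key=lambda k: gate_logits[k])


if __name__ == "__main__":
    # Two experts, three actions (for example short / flat / long).
    q = mix_q_values([2.0, 0.0], [[0.1, 0.0, 0.9], [0.8, 0.1, 0.0]])
    print(q, dominant_expert([2.0, 0.0]))
```

Logging the dominant expert per step alongside the true regime is one simple way to check whether experts specialize by regime.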
The Rich TUI remains the best live training view. The FastAPI dashboard is an artifact-first viewer for completed or running experiment directories.
- Overview: latest reward, return, epsilon, loss, checkpoints, runtime, and baseline snapshots
- Regime Lens: live regime routing, gate accuracy, clustering alignment, and timeline context
- Expert Deep Dive: expert activations, utilization, dominance, and specialization summaries
- Performance: financial metrics, baseline comparison, and per-regime breakdowns
- Config: reproducibility-focused configuration and runtime context
Start the web dashboard:
cd D:\RL\backend
D:\miniconda\envs\statshell\python.exe -m regime_lens.web --artifact-root D:\RL\backend\artifacts
Keyboard shortcuts:
- 1-5: switch views
- Tab/Shift+Tab: change focus
- Space: pause or resume training
- r: toggle regime detail
- e: toggle expert detail
- q: exit the dashboard
RL/
|-- backend/
| |-- regime_lens/
| | |-- config.py
| | |-- agent_io.py
| | |-- market.py
| | |-- dqn.py
| | |-- oracle_dqn.py
| | |-- hmm_dqn.py
| | |-- rcmoe.py
| | |-- transformer_agent.py
| | |-- world_model.py
| | |-- training.py
| | |-- tui.py
| | |-- web.py
| | |-- run_experiments.py
| | |-- continuous_agent.py
| | |-- continuous_market.py
| | |-- config_io.py
| | |-- data.py
| | |-- stats_ext.py
| | |-- explainability.py
| | `-- visualization.py
| |-- README.md
| `-- pyproject.toml
|-- docs/
| |-- architecture.md
| |-- experiments.md
| `-- ui-guide.md
|-- scripts/
| `-- start_tui.ps1
`-- README.md
The project targets Python 3.12+.
cd D:\RL\backend
D:\miniconda\envs\statshell\python.exe -m pip install -e .
From the repository root:
powershell -ExecutionPolicy Bypass -File D:\RL\scripts\start_tui.ps1
Direct Python entry:
cd D:\RL\backend
D:\miniconda\envs\statshell\python.exe -m regime_lens.tui --fresh --lang en --charset unicode
cd D:\RL\backend
D:\miniconda\envs\statshell\python.exe -m regime_lens.run_experiments plan --suite full
D:\miniconda\envs\statshell\python.exe -m regime_lens.run_experiments run --suite smoke --experiment-name demo_smoke
D:\miniconda\envs\statshell\python.exe -m regime_lens.run_experiments serve --artifact-root D:\RL\backend\artifacts
- smoke: short pipeline validation run
- full: core benchmark matrix across agent families and baselines
- ablation: RCMoE sweeps across expert count, gate width, hidden size, and load-balancing weight
- ood: generalization sweeps under altered persistence, switching frequency, volatility, or drift
- all: full benchmark plus ablations plus OOD suites
Generated reports are written under backend/artifacts/_experiments.
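For intuition about what an OOD sweep over regime persistence changes, the sketch below rewrites the diagonal of a transition matrix and rescales the remaining probability mass. It is an illustrative assumption about how such a perturbation could be built, not the logic used by run_experiments; `BASE`, `with_persistence`, and `expected_dwell` are hypothetical names.

```python
def with_persistence(matrix, persistence):
    """Return a copy of a row-stochastic matrix whose diagonal equals `persistence`.

    Off-diagonal entries keep their relative proportions and are rescaled
    so that every row still sums to 1.
    """
    if not 0.0 <= persistence <= 1.0:
        raise ValueError("persistence must lie in [0, 1]")
    n = len(matrix)
    out = []
    for i, row in enumerate(matrix):
        off_mass = sum(p for j, p in enumerate(row) if j != i)
        if off_mass == 0.0:
            # Absorbing row: spread the leftover mass evenly over other regimes.
            share = (1.0 - persistence) / (n - 1) if n > 1 else 0.0
            out.append([persistence if j == i else share for j in range(n)])
        else:
            scale = (1.0 - persistence) / off_mass
            out.append([persistence if j == i else p * scale
                        for j, p in enumerate(row)])
    return out


def expected_dwell(persistence):
    """Mean number of consecutive steps spent in a regime (geometric holding time)."""
    return 1.0 / (1.0 - persistence)


# Hypothetical three-regime training matrix.
BASE = [
    [0.90, 0.06, 0.04],
    [0.05, 0.90, 0.05],
    [0.10, 0.10, 0.80],
]

if __name__ == "__main__":
    for p in (0.5, 0.9, 0.99):
        shifted = with_persistence(BASE, p)
        print(p, expected_dwell(p), shifted[0])
```

Lower persistence means faster switching, so evaluating a trained agent on such shifted matrices probes whether it tracks regimes or merely memorized the training dwell times.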
Continuous-action smoke example:
cd D:\RL\backend
D:\miniconda\envs\statshell\python.exe -m regime_lens.run_experiments run --suite smoke --algorithm sac --continuous-actions --episodes 1 --evaluation-episodes 1
- The repository is artifact-first. TUI, Web dashboard, reports, and visualization helpers all read the same run/checkpoint layout.
- Agent-specific observation shaping lives in backend/regime_lens/agent_io.py; train, eval, policy-surface, and explainability paths should go through that adapter instead of open-coding Oracle/HMM/context transforms.
- The PowerShell launcher is Windows-oriented, but the Python package entry points are usable directly.
- Artifact directories are intentionally excluded from version control.
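The note about routing observation shaping through a single adapter can be illustrated with a small dispatch function. This is a sketch of the pattern only: `shape_observation` and its parameters are hypothetical and are not the actual agent_io.py interface, and feeding the detector output to hmm_dqn as a one-hot vector is an assumption made for the example.

```python
def _one_hot(index, size):
    """Return a list of `size` floats with a 1.0 at `index`."""
    if not 0 <= index < size:
        raise ValueError("regime index out of range")
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def shape_observation(agent, obs, *, true_regime=None,
                      detected_regime=None, n_regimes=4):
    """Hypothetical single entry point for agent-specific observation shaping.

    Train, eval, and analysis code would all call this one function
    instead of re-implementing the transforms for each agent.
    """
    base = list(obs)
    if agent == "oracle_dqn":
        if true_regime is None:
            raise ValueError("oracle_dqn needs the true regime")
        return base + _one_hot(true_regime, n_regimes)
    if agent == "hmm_dqn":
        # Assumption: the detector's regime guess is appended as a one-hot.
        if detected_regime is None:
            raise ValueError("hmm_dqn needs a detected regime")
        return base + _one_hot(detected_regime, n_regimes)
    # Agents without regime inputs see the raw observation.
    return base


if __name__ == "__main__":
    print(shape_observation("dqn", [0.1, -0.2]))
    print(shape_observation("oracle_dqn", [0.1, -0.2], true_regime=2))
```

Keeping the transform in one place means training and evaluation cannot silently disagree about the observation layout a checkpoint expects.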
See CONTRIBUTING.md for local setup, coding expectations, and pull request guidance.
This repository is released under the MIT License.