Merged
2 changes: 1 addition & 1 deletion CLAUDE.md
@@ -16,7 +16,7 @@ Run `uv run pytest tests/` after every implementation or test revision. Never as

### 4. Never Skip Failing Tests

-Investigate root cause and fix the underlying issue. Never use `pytest.mark.skip` or `xfail` to hide failures. Skips are only acceptable for hardware/EP requirements (CUDA, DirectML, AVX).
+Investigate root cause and fix the underlying issue. Never use `pytest.mark.skip` or `xfail` to hide failures. Skips are only acceptable for hardware/EP requirements (CUDA, Dml, AVX).
Comment thread
timenick marked this conversation as resolved.

## Development Commands

10 changes: 5 additions & 5 deletions README.md
@@ -27,9 +27,9 @@
| **QNN** | Qualcomm NPU (Snapdragon X Elite) | 🟢 Ready | `--ep qnn` | `--device npu` |
| **OpenVINO** | Intel NPU (Meteor Lake / Lunar Lake) | 🟢 Ready | `--ep openvino` | `--device npu` |
| **VitisAI** | AMD NPU (Ryzen AI) | 🟢 Ready | `--ep vitisai` | `--device npu` |
-| **TensorRT** | NVIDIA discrete GPUs | 🔶 Planned | `--ep tensorrt` | `--device gpu` |
+| **NvTensorRTRTX** | NVIDIA discrete GPUs | 🔶 Planned | `--ep nv_tensorrt_rtx` | `--device gpu` |
| **MIGraphX** | AMD discrete GPUs | 🔶 Planned | `--ep migraphx` | `--device gpu` |
-| **DirectML** | Hardware-agnostic GPU backend | 🔶 Planned | `--ep dml` | `--device gpu` |
+| **Dml** | Hardware-agnostic GPU backend | 🔶 Planned | `--ep dml` | `--device gpu` |
Comment thread
timenick marked this conversation as resolved.
| **CPU** | Cross-platform fallback | ⚪ Always available | `--ep cpu` | `--device cpu` |

> **Tip:** Use `--device auto` and ModelKit picks the best available device — NPU first, then GPU, then CPU.
@@ -398,7 +398,7 @@ Supported tasks include:
|:----------|:-------|:-----------|
| 🟡 **Kickoff** | Q4 2025 | Internal prototype, core primitive commands |
| 🟢 **Early Access** | Q1 2026 | First external testers, config + build pipeline, hub catalog |
-| 🔵 **Public Beta** | Q2 2026 | Open source, agent skills, AI Toolkit integration |
+| 🔵 **Public Beta** | Q2 2026 | Open source, agent skills, Foundry Toolkit integration |
| 🟣 **RC** | Q3-Q4 2026 | **LLM support** (with LoRA), broader device coverage, MLIR |

<details>
@@ -418,11 +418,11 @@ Supported tasks include:
**Q2 2026 — Public Beta**
- Open source release
- Agent-ready skills for coding assistants (Claude Code, Cursor, Copilot)
-- AI Toolkit for VS Code integration
+- Foundry Toolkit for VS Code integration

**Q3-Q4 2026 — Release Candidate**
- LLM support (decoder-only architectures with LoRA adapters)
-- TensorRT, MIGraphX, and DirectML execution providers
+- NvTensorRTRTX, MIGraphX, and Dml execution providers
- MLIR-based optimization backend
- Public SDK and framework APIs

4 changes: 2 additions & 2 deletions pyproject.toml
@@ -380,8 +380,8 @@ markers = [
"qnn: marks tests for QNN backend",
"openvino: marks tests for OpenVINO backend",
"cuda: marks tests requiring CUDA runtime",
-"directml: marks tests for DirectML backend",
-"tensorrt: marks tests for TensorRT backend",
+"dml: marks tests for Dml backend",
+"nv_tensorrt_rtx: marks tests for NvTensorRTRTX backend",
"vitisai: marks tests for AMD Vitis AI backend",
"training: marks tests requiring training-specific ORT features",
]
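
Review note: the renamed markers tie in with the CLAUDE.md rule that skips are acceptable only for hardware/EP requirements. A minimal sketch of how a marker could be translated into a skip decision follows. The mapping to ONNX Runtime provider names and the `skip_reason` helper are assumptions for illustration and are not part of this PR.

```python
# Assumed mapping from the pytest markers registered above to ONNX Runtime
# execution provider names. This table is illustrative, not taken from the repo.
MARKER_TO_PROVIDER = {
    "qnn": "QNNExecutionProvider",
    "openvino": "OpenVINOExecutionProvider",
    "cuda": "CUDAExecutionProvider",
    "dml": "DmlExecutionProvider",
    "nv_tensorrt_rtx": "NvTensorRTRTXExecutionProvider",
    "vitisai": "VitisAIExecutionProvider",
}


def skip_reason(marker, available_providers):
    """Return a skip message if the marker's provider is missing, else None.

    Markers without a provider mapping (e.g. "training") never trigger a skip
    here, so they cannot be used to hide a failing test.
    """
    provider = MARKER_TO_PROVIDER.get(marker)
    if provider is None or provider in available_providers:
        return None
    return f"{marker}: {provider} is not available on this machine"


if __name__ == "__main__":
    # On a CPU-only machine a dml-marked test would be skipped with a reason.
    print(skip_reason("dml", ["CPUExecutionProvider"]))
```

In a real suite, the list of available providers would presumably come from the runtime, and the returned reason would be passed to a conditional skip in `conftest.py`.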
6 changes: 3 additions & 3 deletions scripts/e2e_eval/README.md
@@ -58,7 +58,7 @@ uv run python scripts/e2e_eval/run_eval.py
# Filter by priority / task / group
uv run python scripts/e2e_eval/run_eval.py --priority P0
uv run python scripts/e2e_eval/run_eval.py --task image-classification
-uv run python scripts/e2e_eval/run_eval.py --group AITK
+uv run python scripts/e2e_eval/run_eval.py --group "Foundry Toolkit"

# Single ad-hoc model
uv run python scripts/e2e_eval/run_eval.py --hf-model microsoft/resnet-50
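
Review note: unlike `AITK`, the new group name contains a space, so the quotes in the `--group` example are required. The sketch below uses Python's `shlex` to approximate POSIX shell word splitting and show what the script would receive in each case. It does not reproduce how `run_eval.py` parses its arguments.

```python
import shlex

# Without quotes the shell splits the group name into two separate arguments.
unquoted = shlex.split("run_eval.py --group Foundry Toolkit")

# With quotes the group name arrives as a single argument value.
quoted = shlex.split('run_eval.py --group "Foundry Toolkit"')

print(unquoted)  # ['run_eval.py', '--group', 'Foundry', 'Toolkit']
print(quoted)    # ['run_eval.py', '--group', 'Foundry Toolkit']
```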
@@ -84,7 +84,7 @@ uv run python scripts/e2e_eval/run_eval.py --retry-failed
| `--task` | — | Filter by HF task |
| `--priority` | — | Filter: `P0`, `P1`, `P2` |
| `--model-type` | — | Filter by model_type (e.g. `bert`) |
-| `--group` | — | Filter by group (e.g. `AITK`) |
+| `--group` | — | Filter by group (e.g. `Foundry Toolkit`) |
| `--device` | `auto` | Target device |
| `--timeout` | 600 | Per-model timeout (seconds) |
| `--list` | off | List models and exit |
@@ -118,7 +118,7 @@ uv run python scripts/e2e_eval/generate_report.py --input-dir eval_results/2026-
| **P1** | Important — tracked closely, regressions flagged |
| **P2** | Extended coverage — best-effort |

-Groups (`AITK`, `Benchmark`, `Top200`, etc.) categorize models by source/purpose.
+Groups (`Foundry Toolkit`, `Benchmark`, `Top200`, etc.) categorize models by source/purpose.

### Failure Classification

2 changes: 1 addition & 1 deletion scripts/e2e_eval/build_registry.py
@@ -257,7 +257,7 @@ def build_registry(
 if is_p0:
     priority = "P0"
     group = p0_group_lookup.get((model_id, task)) or p0_model_group.get(
-        model_id, "AITK"
+        model_id, "Foundry Toolkit"
     )
 else:
     priority = "P1"
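
Review note: the changed default sits at the end of a two-step fallback. The sketch below isolates that lookup so the precedence is easy to see. The `resolve_p0_group` wrapper and the sample dictionaries are hypothetical; only the lookup expression mirrors the hunk above.

```python
DEFAULT_P0_GROUP = "Foundry Toolkit"


def resolve_p0_group(model_id, task, p0_group_lookup, p0_model_group):
    """Resolve a P0 model's group: (model, task) entry, then model entry, then default.

    Because the first step uses `or`, an empty or None (model, task) entry
    also falls through to the model-level lookup.
    """
    return p0_group_lookup.get((model_id, task)) or p0_model_group.get(
        model_id, DEFAULT_P0_GROUP
    )


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    by_model_and_task = {("org/model-a", "image-classification"): "Benchmark"}
    by_model = {"org/model-b": "Top200"}

    print(resolve_p0_group("org/model-a", "image-classification", by_model_and_task, by_model))
    print(resolve_p0_group("org/model-b", "fill-mask", by_model_and_task, by_model))
    print(resolve_p0_group("org/model-c", "fill-mask", by_model_and_task, by_model))
```

Any P0 model that previously defaulted to `AITK` now lands in `Foundry Toolkit`, so existing registries or saved results that still filter on the old group name may need regenerating.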
4 changes: 2 additions & 2 deletions scripts/e2e_eval/models_viewer_static.html

Large diffs are not rendered by default.
