diff --git a/ai-data-integration/SKILL.md b/ai-data-integration/SKILL.md
index 3bb5531..7ba2a62 100644
--- a/ai-data-integration/SKILL.md
+++ b/ai-data-integration/SKILL.md
@@ -1,7 +1,15 @@
 ---
 name: ai-data-integration
 description: "Use this skill when connecting AI or LLMs to data platforms. Covers MCP servers for warehouses, natural-language-to-SQL, embeddings for data discovery, LLM-powered enrichment, and AI agent data access patterns. Common phrases: \"text-to-SQL\", \"MCP server for Snowflake\", \"LLM data enrichment\", \"AI agent access\". Do NOT use for general data integration (use data-integration) or dbt modeling (use dbt-transforms)."
-model_tier: reasoning
+model:
+  preferred: sonnet
+  acceptable: [sonnet, opus]
+  minimum: sonnet
+  allow_downgrade: false
+  reasoning_demand: medium
+  conditions:
+    - when: "designing novel security tier taxonomy from scratch"
+      hold_at: opus
 version: 1.0.0
 ---
 
@@ -28,7 +36,9 @@ Expert guidance for integrating AI/LLM capabilities with data engineering system
 
 | reasoning_demand | preferred | acceptable | minimum |
 |-----------------|-----------|------------|---------|
-| high | Opus | Sonnet | Sonnet |
+| medium | Sonnet | Sonnet, Opus | Sonnet |
+
+Condition: designing novel security tier taxonomy from scratch → hold at Opus.
 
 ## Core Principles
 
diff --git a/client-delivery/SKILL.md b/client-delivery/SKILL.md
index 00fe303..d9c0cf2 100644
--- a/client-delivery/SKILL.md
+++ b/client-delivery/SKILL.md
@@ -1,7 +1,12 @@
 ---
 name: client-delivery
 description: "Use this skill when managing a consulting data cleaning engagement. Covers engagement setup, schema profiling, security tier selection, project scaffolding, deliverable generation, and client handoff. Common phrases: \"set up a cleaning project\", \"profile this schema\", \"data cleaning engagement\", \"generate deliverables\", \"client handoff\". Do NOT use for writing dbt models (use dbt-transforms), DuckDB queries (use duckdb), or pipeline orchestration (use data-pipelines)."
-model_tier: analytical
+model:
+  preferred: sonnet
+  acceptable: [sonnet, opus]
+  minimum: sonnet
+  allow_downgrade: false
+  reasoning_demand: medium
 version: 1.0.0
 ---
 
@@ -33,7 +38,7 @@ Guides data cleaning engagements from discovery through client handoff. Microsof
 
 | reasoning_demand | preferred | acceptable | minimum |
 |-----------------|-----------|------------|---------|
-| medium | Sonnet | Opus, Haiku | Haiku |
+| medium | Sonnet | Sonnet, Opus | Sonnet |
 
 ## Core Principles
 
diff --git a/data-integration/SKILL.md b/data-integration/SKILL.md
index 3e5bbcd..5f1b411 100644
--- a/data-integration/SKILL.md
+++ b/data-integration/SKILL.md
@@ -1,7 +1,12 @@
 ---
 name: data-integration
 description: "Use this skill when designing data integrations or connecting systems. Covers iPaaS platforms (Workato, MuleSoft, Boomi), dlt pipelines, API patterns, CDC, webhooks, and Reverse ETL. Common phrases: \"connect these systems\", \"build a dlt pipeline\", \"event-driven architecture\", \"change data capture\". Do NOT use for stream processing frameworks (use event-streaming) or pipeline scheduling (use data-pipelines)."
-model_tier: analytical
+model:
+  preferred: sonnet
+  acceptable: [sonnet, opus]
+  minimum: sonnet
+  allow_downgrade: false
+  reasoning_demand: medium
 version: 1.0.0
 ---
 
@@ -27,7 +32,7 @@ This skill covers enterprise data integration patterns. It does NOT cover: basic
 
 | reasoning_demand | preferred | acceptable | minimum |
 |-----------------|-----------|------------|---------|
-| medium | Sonnet | Opus, Haiku | Haiku |
+| medium | Sonnet | Sonnet, Opus | Sonnet |
 
 ## Core Principles
 
diff --git a/data-pipelines/SKILL.md b/data-pipelines/SKILL.md
index ce28cbd..57ffc23 100644
--- a/data-pipelines/SKILL.md
+++ b/data-pipelines/SKILL.md
@@ -1,7 +1,12 @@
 ---
 name: data-pipelines
 description: "Use this skill when scheduling, orchestrating, or monitoring data pipelines. Covers Dagster assets, Airflow DAGs, Prefect flows, sensors, retries, alerting, and cross-tool integrations (dagster-dbt, dagster-dlt). Common phrases: \"schedule this pipeline\", \"Dagster vs Airflow\", \"add retry logic\", \"pipeline alerting\", \"consulting pipeline\". Do NOT use for building transformations (use dbt-transforms or python-data-engineering) or designing integration patterns (use data-integration)."
-model_tier: analytical
+model:
+  preferred: sonnet
+  acceptable: [sonnet, opus]
+  minimum: sonnet
+  allow_downgrade: false
+  reasoning_demand: medium
 version: 1.0.0
 ---
 
@@ -31,7 +36,7 @@ Expert guidance for orchestrating data pipelines. Dagster-first for greenfield p
 
 | reasoning_demand | preferred | acceptable | minimum |
 |-----------------|-----------|------------|---------|
-| medium | Sonnet | Opus, Haiku | Haiku |
+| medium | Sonnet | Sonnet, Opus | Sonnet |
 
 ## Core Principles
 
diff --git a/dbt-transforms/SKILL.md b/dbt-transforms/SKILL.md
index ffe8a59..27333a9 100644
--- a/dbt-transforms/SKILL.md
+++ b/dbt-transforms/SKILL.md
@@ -1,7 +1,12 @@
 ---
 name: dbt-transforms
 description: "Use this skill when building or reviewing dbt models, tests, or project structure. Triggers on analytics engineering tasks including staging/marts layers, materializations, incremental strategies, Jinja macros, sources, warehouse configuration, DuckDB adapter, data cleaning, and deduplication patterns. Common phrases: \"dbt model\", \"write a dbt test\", \"incremental strategy\", \"semantic layer\", \"dbt DuckDB\", \"cleaning patterns\". Do NOT use for Python DataFrame code (use python-data-engineering), pipeline scheduling (use data-pipelines), or standalone DuckDB queries without dbt (use duckdb)."
-model_tier: analytical
+model:
+  preferred: sonnet
+  acceptable: [sonnet, opus]
+  minimum: sonnet
+  allow_downgrade: false
+  reasoning_demand: medium
 version: 1.0.0
 ---
 
@@ -26,7 +31,7 @@ Comprehensive dbt guidance covering project structure, modeling, testing, CI/CD,
 
 | reasoning_demand | preferred | acceptable | minimum |
 |-----------------|-----------|------------|---------|
-| medium | Sonnet | Opus, Haiku | Haiku |
+| medium | Sonnet | Sonnet, Opus | Sonnet |
 
 ## Core Principles
 
diff --git a/dlt-extract/SKILL.md b/dlt-extract/SKILL.md
index 0a13bcf..1d9eff8 100644
--- a/dlt-extract/SKILL.md
+++ b/dlt-extract/SKILL.md
@@ -1,7 +1,12 @@
 ---
 name: dlt-extract
 description: "Use this skill when building DLT pipelines for file-based or consulting data extraction. Covers Excel/CSV/SharePoint ingestion via DLT, destination swapping (DuckDB dev to warehouse prod), schema contracts for cleaning, and portable pipeline patterns. Common phrases: \"dlt pipeline for files\", \"extract Excel with dlt\", \"portable data pipeline\", \"dlt filesystem source\". Do NOT use for core DLT concepts like REST API or SQL database sources (use data-integration) or pipeline scheduling (use data-pipelines)."
-model_tier: analytical
+model:
+  preferred: sonnet
+  acceptable: [sonnet, opus]
+  minimum: sonnet
+  allow_downgrade: false
+  reasoning_demand: medium
 version: 1.0.0
 ---
 
@@ -33,7 +38,7 @@ File-based extraction and consulting portability only. Hands off to data-integra
 
 | reasoning_demand | preferred | acceptable | minimum |
 |-----------------|-----------|------------|---------|
-| medium | Sonnet | Opus, Haiku | Haiku |
+| medium | Sonnet | Sonnet, Opus | Sonnet |
 
 ## Core Principles
 
diff --git a/duckdb/SKILL.md b/duckdb/SKILL.md
index 5345e64..02a91d6 100644
--- a/duckdb/SKILL.md
+++ b/duckdb/SKILL.md
@@ -1,7 +1,12 @@
 ---
 name: duckdb
 description: "Use this skill when working with DuckDB for local data analysis, file ingestion, or data exploration. Covers reading CSV/Excel/Parquet/JSON files into DuckDB, SQL analytics on local data, data profiling, cleaning transformations, and export to various formats. Common phrases: \"analyze this CSV\", \"DuckDB query\", \"local data analysis\", \"read Excel in SQL\", \"profile this data\". Do NOT use for dbt model building (use dbt-transforms with DuckDB adapter) or cloud warehouse administration."
-model_tier: analytical
+model:
+  preferred: sonnet
+  acceptable: [sonnet, opus]
+  minimum: sonnet
+  allow_downgrade: false
+  reasoning_demand: medium
 version: 1.0.0
 ---
 
@@ -26,7 +31,7 @@ Local-first SQL analytics on files. Read, profile, clean, and export data withou
 
 | reasoning_demand | preferred | acceptable | minimum |
 |-----------------|-----------|------------|---------|
-| medium | Sonnet | Opus, Haiku | Haiku |
+| medium | Sonnet | Sonnet, Opus | Sonnet |
 
 ## Core Principles
 
diff --git a/event-streaming/SKILL.md b/event-streaming/SKILL.md
index ebb6e55..ab63430 100644
--- a/event-streaming/SKILL.md
+++ b/event-streaming/SKILL.md
@@ -1,7 +1,12 @@
 ---
 name: event-streaming
 description: "Use this skill when building real-time or near-real-time data pipelines. Covers Kafka, Flink, Spark Streaming, Snowpipe, BigQuery streaming, materialized views, and batch-vs-streaming decisions. Common phrases: \"real-time pipeline\", \"Kafka consumer\", \"streaming vs batch\", \"low latency ingestion\". Do NOT use for batch integration patterns (use data-integration) or pipeline orchestration (use data-pipelines)."
-model_tier: reasoning
+model:
+  preferred: sonnet
+  acceptable: [sonnet, opus]
+  minimum: sonnet
+  allow_downgrade: false
+  reasoning_demand: medium
 version: 1.0.0
 ---
 
@@ -37,7 +42,7 @@ Do NOT use for: batch ETL (use `dbt-transforms`), static data modeling, SQL opti
 
 | reasoning_demand | preferred | acceptable | minimum |
 |-----------------|-----------|------------|---------|
-| high | Opus | Sonnet | Sonnet |
+| medium | Sonnet | Sonnet, Opus | Sonnet |
 
 ## Core Principles
 
diff --git a/pipeline/config/budgets.json b/pipeline/config/budgets.json
index 22dbcf9..4bb6fb3 100644
--- a/pipeline/config/budgets.json
+++ b/pipeline/config/budgets.json
@@ -12,6 +12,7 @@
   "overrides": {
     "python-data-engineering/references/data-validation-patterns.md": {
       "reference_max_words": 1130,
+      "reference_max_tokens": 1500,
      "reason": "3% over target (1130w vs 1100w). Contains Pydantic, Pandera, and Great Expectations patterns \u2014 all three frameworks are essential for the skill's validation coverage. Trimming further would remove one framework entirely."
    }
  }
diff --git a/pipeline/config/model-routing.yaml b/pipeline/config/model-routing.yaml
index a45bdb0..19cf69b 100644
--- a/pipeline/config/model-routing.yaml
+++ b/pipeline/config/model-routing.yaml
@@ -1,6 +1,6 @@
 # Model Routing Configuration
 # See specs/SKILL-MODEL-ROUTING-SPEC.md for full specification.
-spec_version: "1.3"
+spec_version: "1.4"
 
 # Default model preferences by skill classification
 defaults:
@@ -19,9 +19,9 @@
 
 # Budget zone thresholds (percentage of max_simultaneous_tokens)
 zones:
-  green: 0.70   # 0-70%: use preferred models
-  yellow: 0.90  # 70-90%: downgrade low/medium reasoning_demand
-  red: 1.00     # 90-100%: downgrade all to minimum tier
+  green: 0.70
+  yellow: 0.90
+  red: 1.00
 
 # Task type defaults
 task_types:
@@ -35,5 +35,17 @@ task_types:
     preferred: opus
     description: "Debugging, architecture, complex reasoning"
 
+# Tier-to-model mapping
+tiers:
+  haiku:
+    claude_code: claude-haiku-4-5-20251001
+    cost_ratio: 1
+  sonnet:
+    claude_code: claude-sonnet-4-6
+    cost_ratio: 5
+  opus:
+    claude_code: claude-opus-4-6
+    cost_ratio: 25
+
 # Per-skill overrides (empty by default — suites populate as needed)
 overrides: {}
diff --git a/pipeline/scripts/budget-report.py b/pipeline/scripts/budget-report.py
index 8c9b35c..e9888d5 100644
--- a/pipeline/scripts/budget-report.py
+++ b/pipeline/scripts/budget-report.py
@@ -59,7 +59,7 @@ def get_budget_limits(rel_path, classification, budgets):
         word_key = classification + "_max_words"
         token_key = classification + "_max_tokens"
         if word_key in override:
-            return override[word_key], override[token_key]
+            return override[word_key], override.get(token_key, budgets.get(token_key))
 
     word_key = classification + "_max_words"
     token_key = classification + "_max_tokens"
diff --git a/pipeline/specs/SKILL-MODEL-ROUTING-SPEC.md b/pipeline/specs/SKILL-MODEL-ROUTING-SPEC.md
index 90b38f3..af13b18 100644
--- a/pipeline/specs/SKILL-MODEL-ROUTING-SPEC.md
+++ b/pipeline/specs/SKILL-MODEL-ROUTING-SPEC.md
@@ -87,16 +87,16 @@ tiers:
     cost_ratio: 1  # Baseline
 
   sonnet:
-    claude_code: claude-sonnet-4-5
+    claude_code: claude-sonnet-4-6
     codex: gpt-5.3-codex
     description: "Analytical tasks — classification, multi-factor decisions, standard coding"
-    cost_ratio: 8   # ~8x haiku
+    cost_ratio: 5   # ~5x haiku (Sonnet 4.6: $3/$15 per MTok)
 
   opus:
     claude_code: claude-opus-4-6
    codex: gpt-5.3-codex-xl  # hypothetical — map to best available
-    description: "Complex reasoning — debugging, architecture, novel problem solving"
-    cost_ratio: 60  # ~60x haiku
+    description: "Adversarial security analysis, formal verification, vulnerability chain synthesis"
+    cost_ratio: 25  # ~25x haiku (Opus: $15/$75 per MTok)
 
 # Session defaults
 defaults:
@@ -607,11 +607,11 @@ tiers:
     claude_code: claude-haiku-4-5
     cost_ratio: 1
   sonnet:
-    claude_code: claude-sonnet-4-5
-    cost_ratio: 8
+    claude_code: claude-sonnet-4-6
+    cost_ratio: 5
   opus:
     claude_code: claude-opus-4-6
-    cost_ratio: 60
+    cost_ratio: 25
 
 budget_zones:
   yellow_threshold: 0.70
@@ -688,10 +688,10 @@ Use eval results to refine routing decisions:
 │ Coordinator .............. haiku (routing only)          │
 │ Mechanical specialist .... haiku (tracing, matching)     │
 │ Analytical specialist .... sonnet (classification, code) │
-│ Reasoning specialist ..... opus (debugging, architecture)│
+│ Reasoning specialist ..... opus (adversarial, formal)    │
 │                                                          │
 │ COST RATIOS                                              │
-│ haiku = 1x | sonnet = ~8x | opus = ~60x                  │
+│ haiku = 1x | sonnet = ~5x | opus = ~25x                  │
 │                                                          │
 │ BUDGET ZONES                                             │
 │ Green (0-70%) ..... use preferred models                 │
diff --git a/python-data-engineering/SKILL.md b/python-data-engineering/SKILL.md
index 972f371..b1f837c 100644
--- a/python-data-engineering/SKILL.md
+++ b/python-data-engineering/SKILL.md
@@ -1,7 +1,12 @@
 ---
 name: python-data-engineering
 description: "Use this skill when writing Python code for data pipelines or transformations. Covers Polars, Pandas, PySpark DataFrames, dbt Python models, API extraction scripts, and data validation with Pydantic or Pandera. Common phrases: \"Polars vs Pandas\", \"PySpark DataFrame\", \"validate this data\", \"Python extraction script\". Do NOT use for SQL-based dbt models (use dbt-transforms) or integration architecture (use data-integration)."
-model_tier: analytical
+model:
+  preferred: sonnet
+  acceptable: [sonnet, opus]
+  minimum: sonnet
+  allow_downgrade: false
+  reasoning_demand: medium
 version: 1.0.0
 ---
 
@@ -25,7 +30,7 @@ Activate when: choosing between DataFrame libraries, writing Polars/Pandas/PySpa
 
 | reasoning_demand | preferred | acceptable | minimum |
 |-----------------|-----------|------------|---------|
-| medium | Sonnet | Opus, Haiku | Haiku |
+| medium | Sonnet | Sonnet, Opus | Sonnet |
 
 ## Core Principles