Skip to content

feat: add per-phase Claude model configuration (claude_model + review_model)#272

Merged
umputun merged 2 commits intoumputun:masterfrom
dwilberger:feat/claude-model-config
Apr 14, 2026
Merged

feat: add per-phase Claude model configuration (claude_model + review_model)#272
umputun merged 2 commits intoumputun:masterfrom
dwilberger:feat/claude-model-config

Conversation

@dwilberger
Copy link
Copy Markdown
Contributor

Summary

  • Add claude_model and review_model config parameters to control which Claude model is used for task execution vs review phases
  • Add --claude-model and --review-model CLI flags with standard precedence (CLI > local config > global config > default)
  • Create separate review executor when review_model differs from claude_model, used by all review phases (first review, fix loops, codex eval, finalize)
  • Empty values preserve current behavior (no --model flag passed to Claude CLI)

Motivation

Review phases run 5-7 parallel Claude agents that can iterate multiple times. Using the same model (e.g., Opus) for both tasks and reviews is unnecessarily expensive. This change allows using a capable model for tasks while using a faster/cheaper model for reviews.

Usage

# in ~/.config/ralphex/config
claude_model = opus
review_model = sonnet
# or via CLI flags
ralphex --claude-model opus --review-model sonnet docs/plans/sprint-1.md

Test plan

  • Config parsing tests: claude_model and review_model parsed from INI
  • Merge logic tests: local overrides global, empty preserves existing
  • Executor test: --model flag injected when Model is set, absent when empty
  • Full build: go build ./... passes
  • Full test suite: all pre-existing tests pass (failures are pre-existing Docker/permission issues)
  • No secrets or env values in diff

Closes #271

Add claude_model and review_model config parameters to control which
Claude model is used for task execution vs review phases. This allows
using a more capable model (e.g., opus) for tasks while using a faster
model (e.g., sonnet) for review iterations, reducing cost on
review-heavy runs.

- Add Model field to ClaudeExecutor, injecting --model flag when set
- Create separate review executor when review_model differs
- Add --claude-model and --review-model CLI flags
- Empty values preserve current behavior (no --model flag)

Closes umputun#271
@dwilberger dwilberger requested a review from umputun as a code owner April 5, 2026 20:15
Copy link
Copy Markdown
Owner

@umputun umputun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice feature, thx. couple things to address:

  1. naming - claude_model is misleading, it only applies to task/planning phases, not to claude globally. task_model would be more accurate and symmetric with review_model. same for the CLI flag - --task-model / --review-model reads better than --claude-model / --review-model
  2. config comment typo - pkg/config/defaults/config:33 says # claude_model = but should be # review_model =
  3. custom wrappers - --model flag is silently ignored by custom providers (claude_command config). not a bug, but worth documenting that model config only applies with the default Claude CLI
  4. missing processor test - the ReviewClaude executor separation logic in runner.go has no test coverage. existing tests only exercise the fallback path (ReviewClaude: nil). a test with distinct task/review mock executors verifying the review phase calls the right one would close this gap
  5. no docs update - README, CLAUDE.md, and llms.txt document all config options but don't mention the new fields or CLI flags

the approach itself is clean - separate executor only when models differ, correct config merge chain, backward compatible. no scope creep, tests pass, linter clean.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds per-phase Claude model selection so task execution and review phases can run on different Claude models, reducing cost for review-heavy runs while preserving existing behavior when unset.

Changes:

  • Introduces claude_model and review_model config keys (plus --claude-model / --review-model CLI flags) and wires them into runner construction.
  • Extends ClaudeExecutor with a Model field that injects --model <value> into the Claude CLI invocation when configured.
  • Routes all review-related phases (first review, review loops, external eval, finalize) through a dedicated review executor when models differ.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
pkg/processor/runner.go Builds/uses separate Claude executor for review phases when ReviewModel differs.
pkg/executor/executor.go Adds Model field and injects --model flag into CLI args.
pkg/executor/executor_test.go Adds test coverage for --model flag injection behavior.
pkg/config/values.go Parses and merges new claude_model / review_model keys from INI configs.
pkg/config/values_test.go Adds tests for parsing and merge behavior of new model fields.
pkg/config/defaults/config Documents the new config keys in the embedded defaults template.
pkg/config/config.go Plumbs new Values fields into the runtime config.Config.
cmd/ralphex/main.go Adds CLI flags and resolves effective model values passed into the processor runner.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

# review_model: model to use for review phases (first review, fix loops, finalize)
# falls back to claude_model if not set, then to Claude Code's default
# use a cheaper/faster model (e.g., sonnet) for reviews to reduce cost
# claude_model =
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The template config comment for the review_model setting ends with # claude_model =, which is misleading and makes it look like the key name is claude_model. This should be # review_model = to match the documented setting.

Suggested change
# claude_model =
# review_model =

Copilot uses AI. Check for mistakes.
Comment thread pkg/executor/executor.go
Comment on lines +221 to +224
// inject --model flag if a model override is configured
if e.Model != "" {
args = append(args, "--model", e.Model)
}
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ClaudeExecutor.Run unconditionally appends --model <Model> when Model is set. If users already include --model in claude_args (or if wrapper scripts inject it), this can lead to duplicate flags with ambiguous precedence. Consider either stripping any existing --model from the configured args before appending, or explicitly documenting/enforcing that Model overrides any --model present in Args (e.g., by removing earlier occurrences).

Copilot uses AI. Check for mistakes.
Comment thread cmd/ralphex/main.go
Comment on lines +845 to +849
// resolve review model: CLI flag > config file > claude_model > empty
reviewModel := req.Config.ReviewModel
if o.ReviewModel != "" {
reviewModel = o.ReviewModel
}
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment says review model precedence includes a fallback to claude_model, but the code here only applies CLI/config values and passes them through. The actual fallback happens later in processor.New (when ReviewModel is empty). Updating the comment (or performing the fallback here) would avoid confusion for future maintainers.

Copilot uses AI. Check for mistakes.
@pySilver
Copy link
Copy Markdown

@dwilberger do you plan to continue work on it?

@dwilberger
Copy link
Copy Markdown
Contributor Author

dwilberger commented Apr 14, 2026

I do plan on working on it, I'm using this project daily and want to contribute. Unfortunatelly I'm quite busy this week, so I'll do it next weekend. If anyone wants to co-author the fixes, feel free @pySilver @umputun

umputun added a commit that referenced this pull request Apr 14, 2026
Rename claude_model to task_model for clarity, fix config comment typo,
add processor test for ReviewClaude executor separation, update docs
(README, CLAUDE.md, llms.txt), and normalize --flag=value form in docs.
@umputun umputun requested a review from Copilot April 14, 2026 18:18
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cmd/ralphex/main.go
Comment on lines 839 to +866
@@ -848,6 +862,8 @@ func createRunner(req executePlanRequest, o opts, log processor.Logger, holder *
CodexEnabled: codexEnabled,
FinalizeEnabled: req.Config.FinalizeEnabled,
DefaultBranch: req.BaseRef,
ClaudeModel: claudeModel,
ReviewModel: reviewModel,
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new per-phase model settings are only applied via createRunner, but plan mode (--plan) constructs processor.Config{...} separately and doesn't set ClaudeModel/ReviewModel. As a result, claude_model / --claude-model won't affect interactive plan creation even though it uses the Claude executor. Consider reusing the same model-resolution logic for plan mode (or passing ClaudeModel there) so configuration is consistent across modes.

Copilot uses AI. Check for mistakes.
Comment thread cmd/ralphex/main.go
Comment on lines +839 to +847
// resolve claude model: CLI flag > config file > empty (use CLI default)
claudeModel := req.Config.ClaudeModel
if o.ClaudeModel != "" {
claudeModel = o.ClaudeModel
}

// resolve review model: CLI flag > config file > claude_model > empty
reviewModel := req.Config.ReviewModel
if o.ReviewModel != "" {
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createRunner uses if o.ClaudeModel != "" / if o.ReviewModel != "" to detect CLI overrides. With go-flags, a user can explicitly pass an empty value via --claude-model= / --review-model=; today that can't override a non-empty config value because it looks the same as “not provided”. To preserve the stated precedence (CLI > config) for empty values too, consider tracking claudeModelSet/reviewModelSet using isFlagSet (similar to the duration *Set flags) and applying the override when the flag is set even if the value is empty.

Suggested change
// resolve claude model: CLI flag > config file > empty (use CLI default)
claudeModel := req.Config.ClaudeModel
if o.ClaudeModel != "" {
claudeModel = o.ClaudeModel
}
// resolve review model: CLI flag > config file > claude_model > empty
reviewModel := req.Config.ReviewModel
if o.ReviewModel != "" {
// detect whether a string flag was explicitly provided on the CLI, including empty values
hasCLIStringFlag := func(name string) bool {
long := "--" + name
for _, arg := range os.Args[1:] {
if arg == long || strings.HasPrefix(arg, long+"=") {
return true
}
}
return false
}
claudeModelSet := hasCLIStringFlag("claude-model")
reviewModelSet := hasCLIStringFlag("review-model")
// resolve claude model: CLI flag > config file > empty (use CLI default)
claudeModel := req.Config.ClaudeModel
if claudeModelSet {
claudeModel = o.ClaudeModel
}
// resolve review model: CLI flag > config file > claude_model > empty
reviewModel := req.Config.ReviewModel
if reviewModelSet {

Copilot uses AI. Check for mistakes.
Comment on lines +1217 to +1224
e.Run(context.Background(), "test")
assert.Contains(t, capturedArgs, "--model")
assert.Contains(t, capturedArgs, "sonnet")
})

t.Run("model empty does not inject --model flag", func(t *testing.T) {
e := &ClaudeExecutor{cmdRunner: mock}
e.Run(context.Background(), "test")
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test ignores the Result returned by Run(). Asserting result.Error (e.g., require.NoError) would prevent false positives if Run starts returning an error while still capturing args, and keeps the test focused on argument injection.

Suggested change
e.Run(context.Background(), "test")
assert.Contains(t, capturedArgs, "--model")
assert.Contains(t, capturedArgs, "sonnet")
})
t.Run("model empty does not inject --model flag", func(t *testing.T) {
e := &ClaudeExecutor{cmdRunner: mock}
e.Run(context.Background(), "test")
result := e.Run(context.Background(), "test")
require.NoError(t, result.Error)
assert.Contains(t, capturedArgs, "--model")
assert.Contains(t, capturedArgs, "sonnet")
})
t.Run("model empty does not inject --model flag", func(t *testing.T) {
e := &ClaudeExecutor{cmdRunner: mock}
result := e.Run(context.Background(), "test")
require.NoError(t, result.Error)

Copilot uses AI. Check for mistakes.
@umputun umputun merged commit 4cd3e35 into umputun:master Apr 14, 2026
10 checks passed
umputun added a commit that referenced this pull request Apr 16, 2026
… findings

- rename claude_model -> task_model and --claude-model -> --task-model for symmetry with review_model/--review-model
- pass task model to plan mode (was missing)
- strip duplicate --model flags from claude_args to prevent CLI errors
- fix typo in pkg/config/defaults/config: review_model section had `# claude_model =` example
- add processor test for ReviewClaude executor separation (nil fallback + distinct review executor)
- add ReviewModel merge test, normalize --flag=value form in docs
- update docs (README, CLAUDE.md, llms.txt, custom-providers.md)
- update review-claude tests for post-codex-skip behavior added in #282

picks up unmerged follow-up commits from feat/claude-model-config branch
umputun added a commit that referenced this pull request Apr 16, 2026
… findings (#284)

* feat: rename claude_model to task_model, address unmerged #272 review findings

- rename claude_model -> task_model and --claude-model -> --task-model for symmetry with review_model/--review-model
- pass task model to plan mode (was missing)
- strip duplicate --model flags from claude_args to prevent CLI errors
- fix typo in pkg/config/defaults/config: review_model section had `# claude_model =` example
- add processor test for ReviewClaude executor separation (nil fallback + distinct review executor)
- add ReviewModel merge test, normalize --flag=value form in docs
- update docs (README, CLAUDE.md, llms.txt, custom-providers.md)
- update review-claude tests for post-codex-skip behavior added in #282

picks up unmerged follow-up commits from feat/claude-model-config branch

* fix: address copilot review on PR 284

- stripFlag now handles --flag=value and bare --flag forms (was only handling --flag value space-form), prevents duplicate/orphaned --model in claude_args
- add stripFlag tests for equals form, bare flag, repeated occurrences, prefix-only false-match
- doc wording: clarify that task_model/review_model are passed as --model to the configured claude_command and may be ignored or implemented by custom wrappers
@dwilberger dwilberger deleted the feat/claude-model-config branch April 18, 2026 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: per-phase Claude model configuration (claude_model + review_model)

4 participants