feat(ai): add Anthropic provider with chat parity (1/5) by gariasf · Pull Request #1983 · we-promise/sure

gariasf · 2026-05-25T14:30:50Z

Summary

Introduces Provider::Anthropic as a first-class LLM provider alongside Provider::Openai, implementing the LlmConcept#chat_response contract over the official anthropic Ruby SDK. Chat (text + streaming + tool use) is at parity. Batch ops, PDF, and RAG land in stacked follow-up PRs.

This is PR 1 of 5 in the Anthropic integration series. Each PR ships independently green.

Why Anthropic

The Sure assistant currently only speaks to OpenAI. Anthropic offers (a) native PDF understanding without rasterization, (b) explicit prompt-cache control, (c) competitive Sonnet pricing for chat and Haiku for batch ops, and (d) an alternative for users who already standardize on Claude. Adding a parallel provider gives operators a choice without locking anyone in.

What's in this PR

New code

Provider::Anthropic — uses Messages API for sync + streaming responses, wraps SDK errors as Provider::Anthropic::Error, records LLM usage with provider "anthropic", threads Langfuse traces with the same span name (chat_response) so dashboards stay comparable across providers.
Provider::Anthropic::ChatConfig — builds Messages API requests; tags the system prompt and the final tool definition with cache_control: { type: "ephemeral" } (Anthropic's prompt caching).
Provider::Anthropic::MessageFormatter — translates Chat#conversation_messages into Anthropic's content-block schema. Each historical assistant turn that used tools is rebuilt as the required tool_use block + paired user-role tool_result block before the next assistant turn. In-flight function_results (the responder's second-leg call) get the same treatment.
Provider::Anthropic::ChatParser — converts the SDK's Message (or a hash-shaped equivalent) into the shared Provider::LlmConcept::ChatResponse Data struct, surfacing both text blocks and tool_use blocks as function requests.

Wiring

Provider::Registry — new anthropic factory; reads ANTHROPIC_ACCESS_TOKEN / ANTHROPIC_API_KEY / Setting.anthropic_access_token, plus optional ANTHROPIC_BASE_URL (for Bedrock / Vertex proxies) and ANTHROPIC_MODEL. Listed under both :llm and the default available_providers set so existing supports_model? routing picks the right provider for claude-* models.
Setting — adds anthropic_access_token, anthropic_model, anthropic_base_url, and llm_provider (default "openai") fields.
User#ai_available? — now returns true when either OpenAI or Anthropic is configured; exposes openai_configured? / anthropic_configured? for consumers.
Chat.default_model — resolves from Setting.llm_provider, so installs that flip the provider don't have to update every chat default.
Assistant::Responder — passes a new conversation_history: kwarg (raw Array<Message>) alongside the existing messages: (OpenAI-shaped) so providers without hosted conversation state can rebuild context. The OpenAI provider accepts and ignores the new kwarg — no behavior change.
Provider::LlmConcept — signature extended to document the new kwarg.

What's not in this PR (intentional)

❌ auto_categorize, auto_detect_merchants, enhance_provider_merchants — raise a clear not yet implemented error on this provider. PR 2/5 lands these via forced tool calls (tool_choice: {type: "tool", name: ...}) plus the cost-ledger migration.
❌ process_pdf, extract_bank_statement — same story. PR 3/5 uses native document content blocks.
❌ Vector::Store::* / RAG — PR 4/5 ships a pgvector-backed store so RAG works without Anthropic needing a hosted vector store.
❌ Settings UI panel — PR 5/5 mirrors _openai_settings.html.erb and adds the data-retention disclosure (parity with OpenAI's panel).
❌ Live VCR cassettes for chat_response. Cassettes need to be recorded with a real ANTHROPIC_API_KEY; mocked tests cover the wiring in the meantime.

Configuration (new env vars)

Env	Setting field	Default
`ANTHROPIC_ACCESS_TOKEN` (or) `ANTHROPIC_API_KEY`	`Setting.anthropic_access_token`	nil
`ANTHROPIC_MODEL`	`Setting.anthropic_model`	`claude-sonnet-4-6`
`ANTHROPIC_BASE_URL`	`Setting.anthropic_base_url`	nil (uses api.anthropic.com)
`ANTHROPIC_REQUEST_TIMEOUT`	—	600
`ANTHROPIC_MAX_TOKENS`	—	4096
`LLM_PROVIDER`	`Setting.llm_provider`	`openai`

Test plan

bin/rails test test/models/provider/anthropic_test.rb — 8 tests, init / supports_model / supports_pdf_processing / effective_model / chat_response success + error + tool_use paths
bin/rails test test/models/provider/anthropic/ — 14 tests across ChatConfig, ChatParser, MessageFormatter
bin/rails test test/models/provider/registry_test.rb — 8 tests, includes new :anthropic factory + env-empty fallback to Setting
bin/rails test test/models/provider/openai_test.rb — 27 tests, unchanged
Full suite: 4358 runs, 18008 assertions, 0 failures, 26 pre-existing skips, 1 pre-existing env error (libvips unavailable on the dev host)
bin/rubocop clean on all touched files
bin/brakeman --no-pager — 0 security warnings
Record live VCR cassettes (follow-up — needs ANTHROPIC_API_KEY)
System test exercising a chat turn through Provider::Anthropic (follow-up — bundled with cassettes)

Compliance notes

Anthropic's standard tier does not train on API inputs; 30-day Trust & Safety retention. The settings UI in PR 5/5 surfaces this alongside OpenAI's equivalent disclosure.
The financial-advice disclaimer already shown in the chat applies equally to either provider.
anthropic gem is MIT-licensed, compatible with Sure's AGPLv3.
OpenAI integration is untouched.

Summary by CodeRabbit

New Features
- Anthropic added as an alternative LLM provider with configurable credentials and optional custom endpoint.
Behavior Changes
- Default LLM selection now respects the deploy-time provider setting and prefers the configured provider.
- Chat handling now includes prior conversation history to improve responses.
- User AI availability now reflects either OpenAI or Anthropic being configured.
Chores
- Added Anthropic client dependency to the project.
Tests
- Expanded Anthropic-related test coverage (configuration, parsing, formatting, streaming).

Introduces Provider::Anthropic alongside Provider::Openai, implementing the LlmConcept chat_response contract over the official anthropic Ruby SDK. Batch ops, PDF, and RAG land in follow-up PRs. - Provider::Anthropic uses Messages API for sync and streaming responses - ChatConfig builds requests with ephemeral prompt-cache markers on the system prompt and the last tool definition - MessageFormatter reconstructs multi-turn history (text + tool_use + tool_result blocks) from raw Message records, including the paired user-role tool_result turn Anthropic requires after every tool_use - ChatParser maps Anthropic Message into the shared ChatResponse Data - Registry, Setting, User, Chat default model wired for ANTHROPIC_* envs and Setting.anthropic_*; LLM_PROVIDER selects between providers - Responder forwards raw conversation_history (Array<Message>) so providers without hosted conversation state can rebuild context - OpenAI provider accepts and ignores the new kwarg (no behavior change) Tests cover provider init, model gating, MessageFormatter for all turn shapes, ChatConfig request building (max_tokens, system cache, tool conversion), ChatParser for text / tool_use / mixed blocks, Registry discovery, and mocked chat_response success / error / function_request paths. Live VCR cassettes recorded in a follow-up with a real key. Stacked PRs: 2/5 batch ops + cost ledger, 3/5 PDF, 4/5 pgvector RAG, 5/5 settings UI + disclosure.

coderabbitai · 2026-05-25T14:30:58Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds Anthropic as an LLM provider: dependency and settings, Anthropic provider implementation (client, model gating, chat orchestration, Langfuse/usage), request/response shaping (ChatConfig/MessageFormatter/ChatParser), registry + app routing updates, responder history handling, and comprehensive tests.

Changes

Anthropic Provider Integration

Layer / File(s)	Summary
Dependency and settings setup `Gemfile`, `app/models/setting.rb`	Adds `anthropic` gem (~> 1.0) and new `anthropic_access_token`, `anthropic_model`, `anthropic_base_url` settings; makes `llm_provider` default configurable via `ENV["LLM_PROVIDER"]` and adds `anthropic_access_token` to encrypted fields.
Provider contract: conversation_history param `app/models/provider/llm_concept.rb`, `app/models/provider/openai.rb`	Adds `conversation_history: []` keyword to `Provider::LlmConcept#chat_response` and adds `Provider::Openai.configured?`/signature update for compatibility.
Provider registry and discovery `app/models/provider/registry.rb`, `test/models/provider/registry_test.rb`	Registers Anthropic in the `:llm` providers list and constructs `Provider::Anthropic` from ANTHROPIC_* env vars or `Setting` fallbacks; extends registry tests to cover Anthropic credential resolution.
Anthropic provider implementation `app/models/provider/anthropic.rb`	Adds `Provider::Anthropic` with client initialization, `effective_model`/`configured?`, model-support gating, `chat_response` orchestration (sync/stream), Langfuse logging, usage normalization/persistence, and error handling.
ChatConfig / MessageFormatter / ChatParser `app/models/provider/anthropic/chat_config.rb`, `app/models/provider/anthropic/message_formatter.rb`, `app/models/provider/anthropic/chat_parser.rb`	Builds Anthropic request (`system_`, `tools`, `messages`), formats conversation history and function results into Anthropic blocks, strips OpenAI-only `strict` from schemas, and parses Anthropic responses into a universal ChatResponse including function_requests.
Application-level routing and history `app/models/chat.rb`, `app/models/assistant/responder.rb`, `app/models/assistant/builtin.rb`	`Chat.default_model` now selects provider defaults based on `Setting.llm_provider` and each provider’s `configured?`; `Assistant::Responder` forwards raw chat message records as `conversation_history` and uses OpenAI-shaped `messages` payloads; minor builtin comment update.
User availability & integration tests `app/models/user.rb`, `test/models/chat_test.rb`	`User#ai_available?` now checks either provider configured; chat tests updated to use `Chat.default_model` and verify provider-selection routing.
Anthropic provider tests `test/models/provider/anthropic/*`	Adds tests for ChatConfig request building, MessageFormatter history/tool/result behavior, ChatParser parsing of text/tool_use blocks, Provider::Anthropic config/model support, error wrapping, streaming behavior, usage recording, and tool-use → function_request mapping.

Sequence Diagram(s)

sequenceDiagram
  participant Responder as Assistant::Responder
  participant Anthropic as Provider::Anthropic
  participant Config as Provider::Anthropic::ChatConfig
  participant Formatter as Provider::Anthropic::MessageFormatter
  participant Client as ::Anthropic::Client
  participant Parser as Provider::Anthropic::ChatParser
  participant Langfuse as Langfuse

  Responder->>Anthropic: chat_response(prompt, messages:, conversation_history:)
  Anthropic->>Config: build_request(model)
  Config->>Formatter: build messages (from prompt, history, function_results)
  Anthropic->>Langfuse: create trace
  Anthropic->>Client: messages.create / messages.stream
  Client-->>Anthropic: response + usage
  Anthropic->>Parser: parse response -> ChatResponse
  Anthropic->>Langfuse: end generation (success/error)
  Anthropic-->>Responder: return parsed ChatResponse + usage

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

we-promise/sure#223: Overlaps LLM invocation site and conversation-history forwarding changes.
we-promise/sure#1384: Overlaps Assistant::Responder conversation-history forwarding and payload shaping.
we-promise/sure#225: Related Chat.default_model/provider selection changes.

Suggested labels

enhancement

Suggested reviewers

jjmata
sokie

"🐰 In Claude's new glade I hop and sing,
A second path for chat this change will bring.
History in paws, messages neat and clear,
Two providers now, both ready to hear. 🥕"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 10.53% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly identifies the main change as adding an Anthropic provider with chat feature parity and correctly marks this as part 1 of a multi-part series.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/anthropic-foundation

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c1dbb51553

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

coderabbitai

Actionable comments posted: 4

🧹 Nitpick comments (1)

app/models/provider/anthropic/chat_config.rb (1)

78-81: ⚡ Quick win

Strip strict for both symbol and string keyed schemas.

Only deleting :strict misses "strict" keys and can forward OpenAI-only schema flags to Anthropic.

♻️ Proposed fix

     def anthropic_input_schema(schema)
       schema = schema.deep_dup
-      schema.delete(:strict) if schema.is_a?(Hash)
+      if schema.is_a?(Hash)
+        schema.delete(:strict)
+        schema.delete("strict")
+      end
       schema
     end

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/models/provider/anthropic/chat_config.rb` around lines 78 - 81, In
anthropic_input_schema, the code only removes the symbol key :strict which
leaves string-keyed "strict" present and can leak OpenAI-only flags to
Anthropic; after deep_dup and the Hash check, delete both schema.delete(:strict)
and schema.delete('strict') (or otherwise normalize keys and remove strict) so
both symbol and string keyed schemas are stripped of the strict flag before
returning.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/models/chat.rb`:
- Around line 58-62: The current selection uses Setting.llm_provider to pick a
model even if that provider lacks credentials; update the logic in the
model-selection block so it only returns Provider::Anthropic.effective_model
when Setting.llm_provider == "anthropic" AND the Anthropic provider is actually
configured (e.g. check Provider::Anthropic.credentials_present? or a similar
availability method), otherwise fall back to Provider::Openai.effective_model
(and vice‑versa: only return OpenAI model if Provider::Openai is configured);
modify the conditional around Setting.llm_provider to validate provider
availability before returning its model so downstream chat creation won't pick
an unavailable provider.

In `@app/models/provider/anthropic.rb`:
- Around line 75-87: The chat_response method in Provider::Anthropic (method
chat_response in app/models/provider/anthropic.rb) must accept the messages:
keyword because Assistant::Responder calls llm.chat_response(..., messages: ...,
conversation_history: ...); update the method signature to include messages: nil
(or appropriate default), ensure any internal logic uses or forwards that
messages argument (e.g., to the API call or to the conversation_history
handling), and propagate messages when delegating to helper methods so the call
no longer raises ArgumentError: unknown keyword: :messages.

In `@app/models/setting.rb`:
- Around line 13-16: The new secret field anthropic_access_token is currently
stored unencrypted; add :anthropic_access_token to the
EncryptedSettingFields::ENCRYPTED_FIELDS list so it is persisted encrypted
(update the constant/array where ENCRYPTED_FIELDS is declared), run or add any
necessary schema/migration or spec updates that assert encrypted fields include
anthropic_access_token, and ensure the existing default behavior using
ENV["ANTHROPIC_ACCESS_TOKEN"] / ENV["ANTHROPIC_API_KEY"] remains unchanged in
the Setting model.

In `@test/models/provider/anthropic_test.rb`:
- Around line 43-47: Provider::Anthropic.effective_model currently uses
ENV.fetch("ANTHROPIC_MODEL", Setting.anthropic_model) which eagerly calls
Setting.anthropic_model; change the implementation to avoid evaluating the
default when ENV is present (use ENV.fetch("ANTHROPIC_MODEL") {
Setting.anthropic_model } or explicit conditional with ENV.key?), and update the
test in test/models/provider/anthropic_test.rb so that inside
ClimateControl.modify("ANTHROPIC_MODEL" => "claude-haiku-4-5") you also assert
that Setting.anthropic_model is not invoked (e.g., expect(Setting).not_to
receive(:anthropic_model)) before calling Provider::Anthropic.effective_model to
harden the test.

---

Nitpick comments:
In `@app/models/provider/anthropic/chat_config.rb`:
- Around line 78-81: In anthropic_input_schema, the code only removes the symbol
key :strict which leaves string-keyed "strict" present and can leak OpenAI-only
flags to Anthropic; after deep_dup and the Hash check, delete both
schema.delete(:strict) and schema.delete('strict') (or otherwise normalize keys
and remove strict) so both symbol and string keyed schemas are stripped of the
strict flag before returning.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: af0254e9-6d68-4426-b4aa-991f674874c0

📥 Commits

Reviewing files that changed from the base of the PR and between d8a12ad and c1dbb51.

⛔ Files ignored due to path filters (1)

Gemfile.lock is excluded by !**/*.lock

📒 Files selected for processing (17)

Gemfile
app/models/assistant/responder.rb
app/models/chat.rb
app/models/provider/anthropic.rb
app/models/provider/anthropic/chat_config.rb
app/models/provider/anthropic/chat_parser.rb
app/models/provider/anthropic/message_formatter.rb
app/models/provider/llm_concept.rb
app/models/provider/openai.rb
app/models/provider/registry.rb
app/models/setting.rb
app/models/user.rb
test/models/provider/anthropic/chat_config_test.rb
test/models/provider/anthropic/chat_parser_test.rb
test/models/provider/anthropic/message_formatter_test.rb
test/models/provider/anthropic_test.rb
test/models/provider/registry_test.rb

Surface fixes raised by Codex + CodeRabbit on PR 1/5: - Provider::Anthropic#chat_response now accepts (and ignores) a `messages:` kwarg. Assistant::Responder passes both `messages:` (OpenAI-shape) and `conversation_history:` (raw Message records) for cross-provider parity, so the previous signature raised ArgumentError on the first chat turn through the Anthropic provider. - Provider::Anthropic#supports_model? bypasses the `claude` prefix gate when a custom base_url is configured, mirroring the OpenAI provider. Bedrock-shaped IDs like `anthropic.claude-sonnet-4-5-20250929-v1:0` and `claude-opus-4@20250514` are otherwise rejected by Assistant::Provided#get_model_provider and the chat dies. - Setting.anthropic_access_token is now in EncryptedSettingFields::ENCRYPTED_FIELDS so the Anthropic API key is encrypted at rest like every other provider secret. Previously plaintext while siblings (openai_access_token, twelve_data_api_key, external_assistant_token) were ciphertext. - Chat.default_model falls back to whichever provider is actually configured. Previously, with LLM_PROVIDER=anthropic but no Anthropic credentials, the default model resolved to a Claude ID that no registered provider supported, so chats failed even when OpenAI was fully configured. Adds Provider::{Anthropic,Openai}#configured? class methods for the readable callsite. - Provider::Anthropic.effective_model uses `ENV["ANTHROPIC_MODEL"].presence || Setting.anthropic_model` so the Setting lookup is only performed when the env var is absent — the previous `ENV.fetch(KEY, default)` evaluated the default arg eagerly on every call. - Provider::Anthropic::ChatConfig#anthropic_input_schema strips both `:strict` and `"strict"` keys so JSON-decoded schemas with string keys cannot leak the OpenAI-only flag through to Anthropic. Test coverage added: supports_model? bypass on custom endpoints, chat_response messages: kwarg compatibility, default_model fallback in the three credential combinations, configured? against ENV + Setting, strict-flag stripping for both key types, and a `Setting.expects(:anthropic_model).never` assertion proving the ENV-precedence test now exercises the lazy path. All 4365 tests pass (1 pre-existing libvips env error unrelated).

Implements auto_categorize, auto_detect_merchants, and enhance_provider_merchants on Provider::Anthropic via forced tool calls, plus the cost-ledger plumbing they need. - Provider::Anthropic::AutoCategorizer, AutoMerchantDetector, ProviderMerchantEnhancer each define a single output tool whose input_schema mirrors the desired output, then force the model to call it via tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }. Anthropic guarantees the tool_use.input matches the schema, so there is no JSON parsing fragility, no <think> tag stripping, and no json_object/json_schema fallback ladders. - Concerns::UsageRecorder mirrors the OpenAI sibling but persists cache_creation_input_tokens / cache_read_input_tokens to dedicated columns instead of metadata. - Migration adds cache_creation_tokens, cache_read_tokens (nullable integers) to llm_usages. OpenAI rows leave them null. - LlmUsage::PRICING gains Claude 4.x rows (opus-4-7 $15/$75, sonnet-4-6 $3/$15, haiku-4-5 $1/$5 per MTok). infer_provider returns "anthropic" for claude-* via the existing exact/prefix lookup. - Provider::Anthropic#chat_response now persists cache columns directly rather than stashing them in metadata. - 25-transaction batch cap mirrors the OpenAI provider so the cost ledger sees the same shape regardless of which provider ran a batch. Tests cover the forced-tool-call path, null/None normalization, case-insensitive merchant matching, the missing-tool_use error path, and Anthropic-specific pricing + provider inference on LlmUsage. Stacked on #1983 (PR 1/5). 3/5 PDF + vision next.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

app/models/provider/anthropic.rb (1)
170-194: ⚠️ Potential issue | 🔴 Critical

Fix Anthropic streaming event class constants

stream_chat_response matches ::Anthropic::Streaming::TextEvent / ::Anthropic::Streaming::MessageStopEvent, but client.messages.stream yields Anthropic::Helpers::Streaming::TextEvent and Anthropic::Helpers::Streaming::MessageStopEvent. Update the constants so the case branches fire and streaming doesn’t fail to recognize events (app/models/provider/anthropic.rb:170-194).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/models/provider/anthropic.rb` around lines 170 - 194, In
stream_chat_response, the case currently checks
::Anthropic::Streaming::TextEvent and ::Anthropic::Streaming::MessageStopEvent
but client.messages.stream yields Anthropic::Helpers::Streaming events; update
the event class references in the case to
Anthropic::Helpers::Streaming::TextEvent and
Anthropic::Helpers::Streaming::MessageStopEvent so the when branches in
stream_chat_response (which iterates stream.each from client.messages.stream)
match and handle events correctly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/models/chat_test.rb`:
- Around line 65-86: The tests assume Chat.default_model equals
Provider::<X>::DEFAULT_MODEL but Chat.default_model prefers
Provider::<X>::effective_model when set; make the tests deterministic by either
stubbing the provider effective_model to return the DEFAULT_MODEL or clearing
any env/model overrides before asserting. Specifically, in the tests referencing
Chat.default_model (the three cases using Provider::Anthropic::DEFAULT_MODEL and
Provider::Openai::DEFAULT_MODEL), stub Provider::Anthropic.effective_model and
Provider::Openai.effective_model to return nil (or to return the corresponding
DEFAULT_MODEL) so the method falls back predictably, or explicitly clear
OPENAI_MODEL/ANTHROPIC_MODEL values via stubs/mocks prior to the assert. Ensure
you modify the tests that set Setting.stubs(:llm_provider) and
Provider::<...>.stubs(:configured?) to also control
Provider::<...>.stubs(:effective_model) to avoid intermittent failures.

---

Outside diff comments:
In `@app/models/provider/anthropic.rb`:
- Around line 170-194: In stream_chat_response, the case currently checks
::Anthropic::Streaming::TextEvent and ::Anthropic::Streaming::MessageStopEvent
but client.messages.stream yields Anthropic::Helpers::Streaming events; update
the event class references in the case to
Anthropic::Helpers::Streaming::TextEvent and
Anthropic::Helpers::Streaming::MessageStopEvent so the when branches in
stream_chat_response (which iterates stream.each from client.messages.stream)
match and handle events correctly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 33a04125-557d-4510-9928-75a3d949397c

📥 Commits

Reviewing files that changed from the base of the PR and between c1dbb51 and 714cf0b.

📒 Files selected for processing (8)

app/models/chat.rb
app/models/provider/anthropic.rb
app/models/provider/anthropic/chat_config.rb
app/models/provider/openai.rb
app/models/setting.rb
test/models/chat_test.rb
test/models/provider/anthropic/chat_config_test.rb
test/models/provider/anthropic_test.rb

CodeRabbit flagged on PR review: the new default_model tests asserted against Provider::*::DEFAULT_MODEL, but Chat.default_model actually returns Provider::*.effective_model.presence (which reads OPENAI_MODEL / ANTHROPIC_MODEL from the environment). With either env var set, the tests would fail intermittently even though routing was correct. - New default_model tests now assert against the provider's effective_model directly, so they verify the routing decision (which provider's value wins) without coupling to the constant. - Pre-existing "creates with default model" assertions had the same brittleness; switch them to compare against Chat.default_model so the chosen model is whatever the env / Setting cascade resolves to. Verified by running `ANTHROPIC_MODEL=claude-haiku-4-5 OPENAI_MODEL=gpt-4o bin/rails test test/models/chat_test.rb` — 16 runs, 0 failures (previously 2 pre-existing failures + 0 from the new tests).

Implements auto_categorize, auto_detect_merchants, and enhance_provider_merchants on Provider::Anthropic via forced tool calls, plus the cost-ledger plumbing they need. - Provider::Anthropic::AutoCategorizer, AutoMerchantDetector, ProviderMerchantEnhancer each define a single output tool whose input_schema mirrors the desired output, then force the model to call it via tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }. Anthropic guarantees the tool_use.input matches the schema, so there is no JSON parsing fragility, no <think> tag stripping, and no json_object/json_schema fallback ladders. - Concerns::UsageRecorder mirrors the OpenAI sibling but persists cache_creation_input_tokens / cache_read_input_tokens to dedicated columns instead of metadata. - Migration adds cache_creation_tokens, cache_read_tokens (nullable integers) to llm_usages. OpenAI rows leave them null. - LlmUsage::PRICING gains Claude 4.x rows (opus-4-7 $15/$75, sonnet-4-6 $3/$15, haiku-4-5 $1/$5 per MTok). infer_provider returns "anthropic" for claude-* via the existing exact/prefix lookup. - Provider::Anthropic#chat_response now persists cache columns directly rather than stashing them in metadata. - 25-transaction batch cap mirrors the OpenAI provider so the cost ledger sees the same shape regardless of which provider ran a batch. Tests cover the forced-tool-call path, null/None normalization, case-insensitive merchant matching, the missing-tool_use error path, and Anthropic-specific pricing + provider inference on LlmUsage. Stacked on #1983 (PR 1/5). 3/5 PDF + vision next.

- Provider::Anthropic#supports_pdf_processing? bypasses prefix gate for custom endpoints, mirroring supports_model? - Provider::Anthropic#initialize raises Error when custom_endpoint? AND model.blank?, parity with Provider::Openai - stream_chat_response captures partial usage on mid-stream errors and records it via the new on_partial callback so chat_response can skip the duplicate error row in the outer rescue - safe_accumulated_message swallows the secondary failure when the SDK cannot reconstruct a snapshot - langfuse_client memoizes properly (||= instead of =) so repeated calls don't churn Langfuse instances - MessageFormatter sorts tool_calls by created_at then id so the message array is deterministic across replays; skips tool_calls missing both provider_call_id and provider_id rather than sending `id: nil` and getting rejected by Anthropic - Setting.anthropic_access_token default falls back through ENV["ANTHROPIC_API_KEY"].presence (was missing .presence, so an empty-string env value bled through) - User#openai_configured? / #anthropic_configured? delegate to the Provider::* class methods — single source of truth - Assistant::Responder renames the OpenAI-shape history builder conversation_history → openai_messages_payload so the kwarg name matches the local method name (messages: openai_messages_payload, conversation_history: chat_message_records) - Assistant::Builtin stale-history comment updated to reference both builders Adds a streaming chat_response test using ad-hoc subclasses of the SDK event types so the case/when dispatch matches via is_a? without stubbing class-level === behavior.

Implements auto_categorize, auto_detect_merchants, and enhance_provider_merchants on Provider::Anthropic via forced tool calls, plus the cost-ledger plumbing they need. - Provider::Anthropic::AutoCategorizer, AutoMerchantDetector, ProviderMerchantEnhancer each define a single output tool whose input_schema mirrors the desired output, then force the model to call it via tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }. Anthropic guarantees the tool_use.input matches the schema, so there is no JSON parsing fragility, no <think> tag stripping, and no json_object/json_schema fallback ladders. - Concerns::UsageRecorder mirrors the OpenAI sibling but persists cache_creation_input_tokens / cache_read_input_tokens to dedicated columns instead of metadata. - Migration adds cache_creation_tokens, cache_read_tokens (nullable integers) to llm_usages. OpenAI rows leave them null. - LlmUsage::PRICING gains Claude 4.x rows (opus-4-7 $15/$75, sonnet-4-6 $3/$15, haiku-4-5 $1/$5 per MTok). infer_provider returns "anthropic" for claude-* via the existing exact/prefix lookup. - Provider::Anthropic#chat_response now persists cache columns directly rather than stashing them in metadata. - 25-transaction batch cap mirrors the OpenAI provider so the cost ledger sees the same shape regardless of which provider ran a batch. Tests cover the forced-tool-call path, null/None normalization, case-insensitive merchant matching, the missing-tool_use error path, and Anthropic-specific pricing + provider inference on LlmUsage. Stacked on #1983 (PR 1/5). 3/5 PDF + vision next.

mintlify · 2026-05-25T19:19:32Z

Docs PR opened: we-promise/docs#82

Added documentation for the new Anthropic LLM provider, including configuration, environment variables, supported models, and provider selection via LLM_PROVIDER.

jjmata · 2026-05-26T01:14:29Z

Solid foundation PR. A few observations:

Message format handling: The MessageFormatter rebuilding tool-use turns into the required tool_use block + paired user-role tool_result block is the trickiest part of Anthropic's Messages API (Anthropic requires tool results to be in the user role, not assistant). Glad to see it's explicitly handled rather than bolted on later.

Prompt caching: Attaching cache_control: { type: "ephemeral" } to the system prompt and final tool definition in ChatConfig is the correct Anthropic caching pattern. System prompts and tool definitions are the highest-leverage targets since they're large and repeated on every turn.

conversation_history: kwarg on Responder: Backward-compatible (OpenAI path ignores it). Worth confirming that chat_message_records is correctly serialized into Anthropic's content-block format by MessageFormatter — specifically that tool-use turns from prior assistant messages round-trip correctly when retrieved from the database.

Chat.default_model via Setting.llm_provider: Clean abstraction. One edge case to confirm: if llm_provider is "openai" but no OpenAI key is configured (while an Anthropic key is), User#ai_available? would return true (Anthropic configured), but the chat would try to use an OpenAI model. The PR description says Chat.default_model resolves from Setting.llm_provider, so if the operator set llm_provider = "anthropic", the default model would be claude-sonnet-4-6. That's correct. The edge case only applies if an operator sets llm_provider = "openai" with no key, which is a misconfiguration.

VCR cassettes: Noted as a follow-up requiring a real ANTHROPIC_API_KEY. The mocked tests cover the wiring adequately for a foundation PR.

Stacked PRs (#1984–#1987): All currently in draft — appropriate while #1983 awaits review. The series is well-structured with each PR independently green.

Generated by Claude Code

@jjmata

Addresses @jjmata's "worth confirming" note on PR #1983: tool-use turns from prior assistant messages must round-trip correctly when retrieved from the database. - New `ChatParser → ToolCall::Function → MessageFormatter` test walks the full path: Anthropic response with a tool_use block → ChatFunctionRequest → ToolCall::Function.from_function_request → persisted on the AssistantMessage → MessageFormatter rebuild on the next turn. Asserts the original `tool_use.id` is preserved end-to-end as both `tool_use.id` and the paired `tool_result.tool_use_id`, and that the original `input` hash and serialized result content survive. - New multi-tool assistant turn test confirms two tool_use blocks on a single assistant message render as two tool_use blocks followed by two paired tool_result blocks in a single user-role follow-up, matching Anthropic's required alternation. Both tests exercise the existing PR1 code without behavior changes.

gariasf · 2026-05-26T08:39:31Z

Thanks for the review, @jjmata. Quick responses on each point:

Round-trip confirmation — addressed in b26306d8. Added a ChatParser → ToolCall::Function → MessageFormatter round-trip test that walks the full path: Anthropic response w/ a tool_use block → ChatFunctionRequest → ToolCall::Function.from_function_request → persisted on the AssistantMessage → MessageFormatter rebuild on the next turn. Asserts the original tool_use.id survives end-to-end as both the assistant turn's tool_use.id and the paired user turn's tool_result.tool_use_id. Also added a multi-tool turn test confirming two tool_use blocks render as two tool_use blocks followed by two paired tool_result blocks (Anthropic's required alternation).

Chat.default_model edge case — prefers_anthropic && Provider::Anthropic.configured? short-circuits the first branch when Anthropic isn't actually configured. The next branch (elsif Provider::Openai.configured?) wins. If neither is configured the final else returns OpenAI's effective_model for backward compat. The reverse case you described — llm_provider="openai" + only Anthropic key set — falls through elsif Provider::Anthropic.configured? and returns Claude, which is the right call.

VCR cassettes — yes, slated as a follow-up bundled with #1985 (PDF tests need a real PDF + key too). Mocked client coverage is the bar for the foundation PR.

Stacked PRs all rebased and pushed earlier today; each independently green.

Implements auto_categorize, auto_detect_merchants, and enhance_provider_merchants on Provider::Anthropic via forced tool calls, plus the cost-ledger plumbing they need. - Provider::Anthropic::AutoCategorizer, AutoMerchantDetector, ProviderMerchantEnhancer each define a single output tool whose input_schema mirrors the desired output, then force the model to call it via tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }. Anthropic guarantees the tool_use.input matches the schema, so there is no JSON parsing fragility, no <think> tag stripping, and no json_object/json_schema fallback ladders. - Concerns::UsageRecorder mirrors the OpenAI sibling but persists cache_creation_input_tokens / cache_read_input_tokens to dedicated columns instead of metadata. - Migration adds cache_creation_tokens, cache_read_tokens (nullable integers) to llm_usages. OpenAI rows leave them null. - LlmUsage::PRICING gains Claude 4.x rows (opus-4-7 $15/$75, sonnet-4-6 $3/$15, haiku-4-5 $1/$5 per MTok). infer_provider returns "anthropic" for claude-* via the existing exact/prefix lookup. - Provider::Anthropic#chat_response now persists cache columns directly rather than stashing them in metadata. - 25-transaction batch cap mirrors the OpenAI provider so the cost ledger sees the same shape regardless of which provider ran a batch. Tests cover the forced-tool-call path, null/None normalization, case-insensitive merchant matching, the missing-tool_use error path, and Anthropic-specific pricing + provider inference on LlmUsage. Stacked on #1983 (PR 1/5). 3/5 PDF + vision next.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/models/provider/anthropic/message_formatter_test.rb`:
- Around line 94-100: The test uses OpenStruct (see anthropic_response in
test/models/provider/anthropic/message_formatter_test.rb) but doesn’t require
it; add require "ostruct" at the top of that test file so OpenStruct is always
defined (place the require above the test class or before the anthropic_response
fixture/usage).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8eda0b17-1836-4ff3-b8c6-2ae23ea43747

📥 Commits

Reviewing files that changed from the base of the PR and between 6675331 and b26306d.

📒 Files selected for processing (1)

test/models/provider/anthropic/message_formatter_test.rb

sure-design · 2026-05-26T12:16:37Z

Three functional blockers still open after the latest commits — flagging for resolution before merge:

1. Streaming class namespace mismatch (silent failure)
stream_chat_response matches ::Anthropic::Streaming::TextEvent / ::Anthropic::Streaming::MessageStopEvent, but client.messages.stream yields Anthropic::Helpers::Streaming::TextEvent / Anthropic::Helpers::Streaming::MessageStopEvent. The case branches never fire, so streaming returns empty output silently.

2. Provider::Anthropic#chat_response missing messages: kwarg
Assistant::Responder calls llm.chat_response(..., messages: ..., conversation_history: ...). If the Anthropic method signature doesn't accept messages:, every Anthropic-backed chat turn raises ArgumentError: unknown keyword: :messages at runtime — not caught by the current unit tests because they don't go through Responder.

3. anthropic_access_token stored unencrypted in Setting
The field needs to be added to ENCRYPTED_FIELDS alongside the other secret tokens. Storing it in plaintext means DB dumps expose the key.

None of these are blocked by missing VCR cassettes — they're wiring bugs that should be reproducible with existing stubs.

Generated by Claude Code

OpenStruct is moving out of Ruby's default load path (warning in 3.4+, removed in 3.5+). Tests work today because ActiveSupport transitively loads it, but that's incidental. Match the existing convention in test/controllers/settings/hostings_controller_test.rb which explicitly requires ostruct for the same reason.

Implements auto_categorize, auto_detect_merchants, and enhance_provider_merchants on Provider::Anthropic via forced tool calls, plus the cost-ledger plumbing they need. - Provider::Anthropic::AutoCategorizer, AutoMerchantDetector, ProviderMerchantEnhancer each define a single output tool whose input_schema mirrors the desired output, then force the model to call it via tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }. Anthropic guarantees the tool_use.input matches the schema, so there is no JSON parsing fragility, no <think> tag stripping, and no json_object/json_schema fallback ladders. - Concerns::UsageRecorder mirrors the OpenAI sibling but persists cache_creation_input_tokens / cache_read_input_tokens to dedicated columns instead of metadata. - Migration adds cache_creation_tokens, cache_read_tokens (nullable integers) to llm_usages. OpenAI rows leave them null. - LlmUsage::PRICING gains Claude 4.x rows (opus-4-7 $15/$75, sonnet-4-6 $3/$15, haiku-4-5 $1/$5 per MTok). infer_provider returns "anthropic" for claude-* via the existing exact/prefix lookup. - Provider::Anthropic#chat_response now persists cache columns directly rather than stashing them in metadata. - 25-transaction batch cap mirrors the OpenAI provider so the cost ledger sees the same shape regardless of which provider ran a batch. Tests cover the forced-tool-call path, null/None normalization, case-insensitive merchant matching, the missing-tool_use error path, and Anthropic-specific pricing + provider inference on LlmUsage. Stacked on #1983 (PR 1/5). 3/5 PDF + vision next.

gariasf · 2026-05-27T08:11:04Z

@sure-design Thanks for the second pass. Walking through each — all three are already resolved on the current tip (17e9e0d9):

1. Streaming class namespace mismatch — this isn't actually a mismatch. The anthropic gem ships an explicit alias in lib/anthropic/streaming.rb:

module Anthropic
  Streaming = Anthropic::Helpers::Streaming
end

::Anthropic::Streaming::TextEvent and ::Anthropic::Helpers::Streaming::TextEvent resolve to the same class object — Ruby's constant resolution follows the assignment. The case/when branches in stream_chat_response match correctly via is_a? (which is what Class#=== delegates to).

Empirical confirmation: anthropic_test.rb:147 creates ad-hoc subclasses of ::Anthropic::Streaming::TextEvent and ::Anthropic::Streaming::MessageStopEvent, fires them through the stream proc, and asserts that two text deltas and one final-response chunk are emitted. The test passes — if the namespaces were distinct classes the case/when wouldn't dispatch and the assertions would fail.

2. Provider::Anthropic#chat_response missing messages: kwarg — fixed in 714cf0bb. Signature on the current tip (anthropic.rb:96):

def chat_response(
  prompt,
  model:,
  instructions: nil,
  functions: [],
  function_results: [],
  messages: nil,           # ← accepted (ignored, Anthropic uses conversation_history)
  conversation_history: [],
  ...
)

Regression test at anthropic_test.rb:100 exercises a Responder-shaped call with both messages: and conversation_history: kwargs.

3. anthropic_access_token stored unencrypted — fixed in 714cf0bb. :anthropic_access_token is in EncryptedSettingFields::ENCRYPTED_FIELDS in setting.rb:77 alongside :openai_access_token, :twelve_data_api_key, :external_assistant_token, etc. The MessageEncryptor wrapper at setting.rb:80-89 intercepts reads (decrypt) and writes (encrypt) for every listed field; existing plaintext values fall back via the InvalidSignature/InvalidMessage rescue and re-encrypt on next write.

Worth re-checking against 17e9e0d9 if the bot was running against an older snapshot.

gariasf · 2026-05-27T08:18:55Z

@coderabbitai review again plz

coderabbitai · 2026-05-27T08:19:12Z

@gariasf Sure! I'll kick off a fresh review of the PR against the latest commits.

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

app/models/assistant/responder.rb (1)

123-132: ⚡ Quick win

Avoid double-querying chat history for both payload builders.

Line 126 and Line 141 run nearly the same query; this adds avoidable overhead every request. Reuse one fetched record set and derive both shapes from it.

♻️ Proposed refactor

+    def complete_chat_messages
+      return [] unless chat&.messages
+      `@complete_chat_messages` ||= chat.messages
+                                      .where(type: [ "UserMessage", "AssistantMessage" ], status: "complete")
+                                      .includes(:tool_calls)
+                                      .ordered
+                                      .to_a
+    end
+
     def chat_message_records
-      return [] unless chat&.messages
-
-      chat.messages
-          .where(type: [ "UserMessage", "AssistantMessage" ], status: "complete")
-          .where.not(id: message.id)
-          .includes(:tool_calls)
-          .ordered
-          .to_a
+      complete_chat_messages.reject { |m| m.id == message.id }
     end

     def openai_messages_payload
       messages = []
-      return messages unless chat&.messages
-
-      chat.messages
-          .where(type: [ "UserMessage", "AssistantMessage" ], status: "complete")
-          .includes(:tool_calls)
-          .ordered
-          .each do |chat_message|
+      complete_chat_messages.each do |chat_message|
         if chat_message.tool_calls.any?
           messages << {

Also applies to: 137-145

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/models/assistant/responder.rb` around lines 123 - 132, The
chat_message_records method currently causes two nearly identical DB queries by
re-fetching chat.messages for both payload builders; instead, fetch the records
once into a local variable (using chat.messages.where(type:
["UserMessage","AssistantMessage"], status: "complete").where.not(id:
message.id).includes(:tool_calls).ordered.to_a) inside chat_message_records (or
the caller), then pass or derive the two required shapes from that single
in-memory array rather than re-running chat.messages for each payload builder;
update any callers that expect separate queries to accept the cached array and
compute their specific shapes from it.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/models/provider/anthropic.rb`:
- Around line 265-266: The current logs call Rails.logger.warn with
e.full_message which can expose sensitive request/output payloads; update the
logging in the Langfuse trace creation/error handlers (the Rails.logger.warn
calls that reference e.full_message and e.message) to only log the exception
class, sanitized message, and a trace identifier (if available) without the full
backtrace or payload, e.g., replace usage of e.full_message with a short
sanitized string (exception class + e.message + trace_id) and remove any raw
payload/backtrace printing; apply the same change for the other occurrences
around the Rails.logger.warn calls at the later error handlers mentioned (the
similar calls near lines 289-290 and 300-301).

In `@app/models/provider/anthropic/message_formatter.rb`:
- Around line 114-123: parse_arguments can return non-Hash values
(scalars/arrays) which makes Anthropic tool_use.input invalid; modify
parse_arguments so it always returns a Hash: compute the parsed result (handle
nil, Hash, String with JSON.parse/rescue, and other types), then if the result
is not a Hash wrap it as a Hash like { "input" => result } (or symbolized :input
if codebase prefers), and return that Hash; update parse_arguments to perform
this normalization so tool_use.input is always a JSON object.

---

Nitpick comments:
In `@app/models/assistant/responder.rb`:
- Around line 123-132: The chat_message_records method currently causes two
nearly identical DB queries by re-fetching chat.messages for both payload
builders; instead, fetch the records once into a local variable (using
chat.messages.where(type: ["UserMessage","AssistantMessage"], status:
"complete").where.not(id: message.id).includes(:tool_calls).ordered.to_a) inside
chat_message_records (or the caller), then pass or derive the two required
shapes from that single in-memory array rather than re-running chat.messages for
each payload builder; update any callers that expect separate queries to accept
the cached array and compute their specific shapes from it.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: eac52cd6-9fb5-4fa9-bffe-84b46d04b0ae

📥 Commits

Reviewing files that changed from the base of the PR and between 0342958 and c120a05.

⛔ Files ignored due to path filters (1)

Gemfile.lock is excluded by !**/*.lock

📒 Files selected for processing (19)

Gemfile
app/models/assistant/builtin.rb
app/models/assistant/responder.rb
app/models/chat.rb
app/models/provider/anthropic.rb
app/models/provider/anthropic/chat_config.rb
app/models/provider/anthropic/chat_parser.rb
app/models/provider/anthropic/message_formatter.rb
app/models/provider/llm_concept.rb
app/models/provider/openai.rb
app/models/provider/registry.rb
app/models/setting.rb
app/models/user.rb
test/models/chat_test.rb
test/models/provider/anthropic/chat_config_test.rb
test/models/provider/anthropic/chat_parser_test.rb
test/models/provider/anthropic/message_formatter_test.rb
test/models/provider/anthropic_test.rb
test/models/provider/registry_test.rb

… history fetch Addresses three open CodeRabbit findings on PR #1983. - Provider::Anthropic Langfuse rescue branches no longer include `e.full_message` in `Rails.logger.warn`. `full_message` bundles the backtrace + cause chain and on some SDK error types includes the serialized request/response payload (prompt, model output). Logs now report `#{e.class}: #{e.message}` only. Three sites: create_langfuse_trace, log_langfuse_generation, upsert_langfuse_trace. Note: Provider::Openai has the same pattern (copy-pasted source) — harmonization deferred to a follow-up cleanup PR; this commit fixes only the Anthropic provider to keep PR scope tight. - MessageFormatter#parse_arguments now coerces any non-Hash parsed result to `{}`. Anthropic's Messages API requires `tool_use.input` to be a JSON object (map); a stored ToolCall::Function record whose arguments parse to a scalar, bool, or array (corrupt row, legacy data, cross-provider bleed) would otherwise produce a payload the API rejects. Normal flow stores Hash arguments end-to-end so the fix is defensive — adds 2 tests covering scalar/array JSON strings and non-String non-Hash inputs. - Assistant::Responder dedups the chat-history fetch. The previous layout fired two near-identical `chat.messages.where(...).includes( :tool_calls).ordered` queries per LLM turn (one for the OpenAI-shape payload, one for the raw-records kwarg). A new memoized `complete_chat_messages` fetches once; `chat_message_records` filters out the current message via `Array#reject`, `openai_messages_payload` iterates the cached array unchanged. One SQL query per turn instead of two. Memoization scope = single Responder instance (per LLM call), so cache invalidation is not a concern. All 4370 tests pass (1 pre-existing libvips env error unrelated). Rubocop + brakeman clean.

Implements auto_categorize, auto_detect_merchants, and enhance_provider_merchants on Provider::Anthropic via forced tool calls, plus the cost-ledger plumbing they need. - Provider::Anthropic::AutoCategorizer, AutoMerchantDetector, ProviderMerchantEnhancer each define a single output tool whose input_schema mirrors the desired output, then force the model to call it via tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }. Anthropic guarantees the tool_use.input matches the schema, so there is no JSON parsing fragility, no <think> tag stripping, and no json_object/json_schema fallback ladders. - Concerns::UsageRecorder mirrors the OpenAI sibling but persists cache_creation_input_tokens / cache_read_input_tokens to dedicated columns instead of metadata. - Migration adds cache_creation_tokens, cache_read_tokens (nullable integers) to llm_usages. OpenAI rows leave them null. - LlmUsage::PRICING gains Claude 4.x rows (opus-4-7 $15/$75, sonnet-4-6 $3/$15, haiku-4-5 $1/$5 per MTok). infer_provider returns "anthropic" for claude-* via the existing exact/prefix lookup. - Provider::Anthropic#chat_response now persists cache columns directly rather than stashing them in metadata. - 25-transaction batch cap mirrors the OpenAI provider so the cost ledger sees the same shape regardless of which provider ran a batch. Tests cover the forced-tool-call path, null/None normalization, case-insensitive merchant matching, the missing-tool_use error path, and Anthropic-specific pricing + provider inference on LlmUsage. Stacked on #1983 (PR 1/5). 3/5 PDF + vision next.

gariasf · 2026-05-27T08:44:30Z

Latest review round addressed in 5ab9bb33:

🟠 Sanitized Langfuse warn logs (sensitive payload leak) — inline reply on the affected thread.
🟠 Normalized parse_arguments so tool_use.input is always a Hash — inline reply on the affected thread.
🟡 Deduped the chat-history fetch in Assistant::Responder. New memoized complete_chat_messages runs once per LLM turn; chat_message_records filters out the current message via Array#reject, openai_messages_payload iterates the cached array. One SQL query per turn instead of two. (This was in the review-body Nitpick section without a dedicated inline thread, hence the summary here.)

Stack rebased onto the latest c120a050 (main merge), all 4409 tests green, rubocop + brakeman clean.

github-actions Bot added the not-gittensor label May 25, 2026

gariasf mentioned this pull request May 25, 2026

feat(ai): Anthropic batch ops + LLM cost ledger (2/5) #1984

Draft

8 tasks

gariasf marked this pull request as ready for review May 25, 2026 15:44

superagent-security Bot added contributor:verified Contributor passed trust analysis. pr:verified PR passed security analysis. labels May 25, 2026

chatgpt-codex-connector Bot reviewed May 25, 2026

View reviewed changes

Comment thread app/models/provider/anthropic.rb

Comment thread app/models/setting.rb Outdated

coderabbitai Bot reviewed May 25, 2026

View reviewed changes

Comment thread app/models/chat.rb Outdated

Comment thread app/models/provider/anthropic.rb

Comment thread app/models/setting.rb Outdated

Comment thread test/models/provider/anthropic_test.rb Outdated

coderabbitai Bot reviewed May 25, 2026

View reviewed changes

Comment thread test/models/chat_test.rb Outdated

coderabbitai Bot reviewed May 26, 2026

View reviewed changes

Comment thread test/models/provider/anthropic/message_formatter_test.rb

jjmata mentioned this pull request May 26, 2026

feat: add Google Gemini as alternative LLM provider #1968

Closed

6 tasks

Merge branch 'main' into feature/anthropic-foundation

c120a05

coderabbitai Bot reviewed May 27, 2026

View reviewed changes

Comment thread app/models/provider/anthropic.rb Outdated

Comment thread app/models/provider/anthropic/message_formatter.rb

jjmata removed the not-gittensor label May 27, 2026

Conversation

gariasf commented May 25, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why Anthropic

What's in this PR

What's not in this PR (intentional)

Configuration (new env vars)

Test plan

Compliance notes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mintlify Bot commented May 25, 2026

Uh oh!

jjmata commented May 26, 2026

Uh oh!

gariasf commented May 26, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sure-design commented May 26, 2026

Uh oh!

gariasf commented May 27, 2026

Uh oh!

gariasf commented May 27, 2026

Uh oh!

coderabbitai Bot commented May 27, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

gariasf commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

gariasf commented May 25, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 25, 2026 •

edited

Loading