feat(ai): Anthropic batch ops + LLM cost ledger (2/5)#1984

Draft
gariasf wants to merge 3 commits into feature/anthropic-foundation from feature/anthropic-batch-ops

@gariasf
Collaborator

@gariasf gariasf commented May 25, 2026

Summary

Implements auto_categorize, auto_detect_merchants, and enhance_provider_merchants on Provider::Anthropic via forced tool calls, plus the cost-ledger plumbing they need. This is PR 2 of 5, stacked on #1983.

Why forced tool calls

Anthropic has no first-class JSON-mode flag. The idiomatic replacement is to define a single output tool whose input_schema mirrors the desired output, then force the model to invoke it with tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }. The model returns exactly one tool_use block whose input is already structured according to the schema, so no free-text JSON parsing is needed.

Net effect:

  • No `<think>` tag stripping
  • No `json_schema` ↔ `json_object` ↔ `none` fallback ladder
  • No `parse_json_flexibly` cascade with markdown-code-block heuristics
  • One clear failure mode ("model did not invoke the tool"), raised as Provider::Anthropic::Error

The three Anthropic task classes end up ~30% smaller than their OpenAI siblings while covering the same surface.
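The forced-tool-call pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the schema fields (transaction_id, category_name), the helper names build_forced_tool_request and extract_tool_input, and the MissingToolUseError class are all hypothetical stand-ins (the PR raises Provider::Anthropic::Error instead).

```ruby
# Hypothetical output tool; the real input_schema in this PR may differ.
REPORT_TOOL = {
  name: "report_categorizations",
  description: "Report one category per transaction.",
  input_schema: {
    type: "object",
    properties: {
      categorizations: {
        type: "array",
        items: {
          type: "object",
          properties: {
            transaction_id: { type: "string" },
            category_name: { type: ["string", "null"] }
          },
          required: ["transaction_id", "category_name"]
        }
      }
    },
    required: ["categorizations"]
  }
}.freeze

# Stand-in for Provider::Anthropic::Error (assumption for this sketch).
class MissingToolUseError < StandardError; end

# Builds a Messages API request body that forces the single output tool.
def build_forced_tool_request(model:, prompt:, tool: REPORT_TOOL, max_tokens: 1024)
  {
    model: model,
    max_tokens: max_tokens,
    tools: [tool],
    tool_choice: { type: "tool", name: tool[:name], disable_parallel_tool_use: true },
    messages: [{ role: "user", content: prompt }]
  }
end

# Extracts the forced tool's input from a parsed response hash (string keys).
# Raises when the model did not invoke the tool, the one failure mode to handle.
def extract_tool_input(response, tool_name)
  block = Array(response["content"]).find do |b|
    b["type"] == "tool_use" && b["name"] == tool_name
  end
  raise MissingToolUseError, "model did not invoke #{tool_name}" unless block

  block["input"]
end
```

The request hash would be handed to whatever HTTP client or SDK wrapper the provider uses; only the tools and tool_choice keys are specific to this technique.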

What's in this PR

Task classes (each is small: forced-tool-call request + schema + normalize)

  • Provider::Anthropic::AutoCategorizer → tool: report_categorizations
  • Provider::Anthropic::AutoMerchantDetector → tool: report_merchants
  • Provider::Anthropic::ProviderMerchantEnhancer → tool: report_enhancements

Each:

  • Caps batch size at 25 (parity with OpenAI) and raises a clear error above it.
  • Tags Langfuse spans with the operation name so dashboards aggregate across providers.
  • Records usage via Concerns::UsageRecorder (mirror of the OpenAI sibling).
  • Normalizes "null" strings to nil and matches results case-insensitively against the user's category / merchant list.
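As a rough illustration of the batch cap and the normalization step listed above, a sketch might look like this. The names ensure_batch_size!, normalize_name, BatchTooLargeError, and NULL_SENTINELS are hypothetical, and returning nil for a name that is not in the user's list is an assumption about the desired behavior, not something this PR confirms.

```ruby
MAX_BATCH_SIZE = 25 # parity with the OpenAI provider, per the PR description

# Hypothetical error class for this sketch.
class BatchTooLargeError < StandardError; end

# Raises a clear error when a batch exceeds the cap.
def ensure_batch_size!(transactions)
  return if transactions.size <= MAX_BATCH_SIZE

  raise BatchTooLargeError,
        "Too many transactions (#{transactions.size}); max is #{MAX_BATCH_SIZE} per request"
end

# String values treated as "no answer" (assumed set: "null" and "None").
NULL_SENTINELS = %w[null none].freeze

# Maps a model-returned name onto the user's own list, ignoring case.
# Returns nil for nil, blank, sentinel, or unknown names (assumption).
def normalize_name(raw, allowed_names)
  return nil if raw.nil?

  candidate = raw.to_s.strip
  return nil if candidate.empty? || NULL_SENTINELS.include?(candidate.downcase)

  allowed_names.find { |name| name.casecmp?(candidate) }
end
```

Returning the user's original spelling rather than the model's keeps downstream lookups exact.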

Cost ledger

  • Migration 20260525120000_add_anthropic_cache_tokens_to_llm_usages.rb adds nullable cache_creation_tokens and cache_read_tokens integer columns. OpenAI rows leave them null; Anthropic rows populate them.
  • LlmUsage::PRICING gains claude-opus-4-7 ($15/$75), claude-opus-4-6 ($15/$75), claude-sonnet-4-6 ($3/$15), claude-sonnet-4-5 ($3/$15), claude-haiku-4-5 ($1/$5) per MTok (Anthropic public pricing).
  • LlmUsage.infer_provider returns "anthropic" for claude-* via the existing exact/prefix lookup — no code change needed beyond the PRICING rows.
  • Provider::Anthropic#chat_response (introduced in PR 1) now persists cache columns directly instead of stashing them in metadata.
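To make the pricing and provider-inference bullets concrete, here is a minimal sketch of an exact-then-prefix lookup with per-MTok cost math. It is not the LlmUsage implementation: the hash layout (prompt / completion keys), the method names, and the longest-prefix tie-break are assumptions; only the rates and the "openai" default come from this PR's text.

```ruby
# Rates in USD per million tokens, copied from the PR description.
PRICING = {
  "claude-opus-4-7"   => { prompt: 15.0, completion: 75.0 },
  "claude-sonnet-4-6" => { prompt: 3.0,  completion: 15.0 },
  "claude-haiku-4-5"  => { prompt: 1.0,  completion: 5.0 }
}.freeze

# Exact match first, then the longest key that prefixes the model ID
# (so a dated ID resolves to its base row). Tie-break is an assumption.
def pricing_for(model)
  return PRICING[model] if PRICING.key?(model)

  prefix = PRICING.keys.select { |key| model.start_with?(key) }.max_by(&:length)
  prefix && PRICING[prefix]
end

# claude-* maps to "anthropic"; the anthropic.* and anthropic/* shapes are
# the Bedrock / Vertex IDs mentioned in a later commit of this PR.
def infer_provider(model)
  return "anthropic" if model.start_with?("claude-", "anthropic.", "anthropic/")

  "openai"
end

# Returns the estimated cost in USD, or nil when no rate is stored.
def calculate_cost(model:, prompt_tokens:, completion_tokens:)
  rates = pricing_for(model)
  return nil unless rates

  ((prompt_tokens * rates[:prompt]) + (completion_tokens * rates[:completion])) / 1_000_000.0
end
```

For example, one million prompt tokens plus one million completion tokens on claude-sonnet-4-6 works out to 18.0 dollars under the rates above.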

Not changed (intentional)

  • Anthropic cache-creation tokens are billed at ~1.25x the input rate and cache reads at 0.1x. LlmUsage.calculate_cost continues to bill both at the regular input rate for now. This deliberately under-counts cache writes slightly and over-counts cache reads, with the net error depending on cache lifetime; it will be refined in a follow-up if real-world bills warrant it.
  • OpenAI batch ops are untouched.
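The size of the approximation noted above can be estimated with a small sketch. The two functions below are hypothetical and compare flat-rate billing (the current behavior as described) against a cache-aware variant using the ~1.25x and 0.1x multipliers quoted in the bullet; the token counts in the usage note are invented for illustration.

```ruby
CACHE_WRITE_MULTIPLIER = 1.25 # approximate, as quoted in the PR description
CACHE_READ_MULTIPLIER  = 0.1

# Current approach as described: every input-side token at the plain input rate.
def input_cost_flat(rate_per_mtok, input:, cache_creation:, cache_read:)
  (input + cache_creation + cache_read) * rate_per_mtok / 1_000_000.0
end

# Possible follow-up: weight cache writes and reads by their multipliers.
def input_cost_cache_aware(rate_per_mtok, input:, cache_creation:, cache_read:)
  weighted = input +
             (cache_creation * CACHE_WRITE_MULTIPLIER) +
             (cache_read * CACHE_READ_MULTIPLIER)
  weighted * rate_per_mtok / 1_000_000.0
end
```

With a $3/MTok input rate, 1,000 uncached input tokens, 2,000 cache-creation tokens, and 10,000 cache-read tokens, the flat estimate is about $0.039 while the cache-aware estimate is about $0.0135, so read-heavy workloads are overstated by the flat approach.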

Test plan

  • test/models/provider/anthropic/auto_categorizer_test.rb — 3 tests: forced-tool-call request shape, null/None normalization, missing-tool_use error path
  • test/models/provider/anthropic/auto_merchant_detector_test.rb — 3 tests: same shape + case-insensitive user_merchants matching
  • test/models/provider/anthropic/provider_merchant_enhancer_test.rb — 2 tests: forced-tool-call mapping + error path
  • test/models/llm_usage_test.rb (new file) — Claude pricing math, provider inference
  • All PR 1 tests still green
  • Full suite: 4371 runs, 18048 assertions, 0 failures, 26 pre-existing skips, 1 pre-existing libvips env error
  • bin/rubocop clean
  • bin/brakeman --no-pager clean

Migration

bin/rails db:migrate

Backwards compatible: new columns are nullable and existing OpenAI write paths are untouched.

@coderabbitai
Contributor

coderabbitai Bot commented May 25, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: eb7a0217-2197-422f-abe9-6227a16472d5

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@gariasf gariasf force-pushed the feature/anthropic-batch-ops branch from f7b0ff8 to 487b714 Compare May 25, 2026 17:50
gariasf added a commit that referenced this pull request May 25, 2026
Implements process_pdf and extract_bank_statement on Provider::Anthropic
using the native `document` content block — no rasterization, no text
pre-extraction.

- Provider::Anthropic::PdfProcessor classifies the document, summarizes
  it, and extracts statement metadata via a forced report_document_analysis
  tool whose input_schema mirrors the existing Provider::Openai output
  (document_type from Import::DOCUMENT_TYPES, summary, extracted_data).
- Provider::Anthropic::BankStatementExtractor returns the same
  { transactions, period, account_holder, account_number, bank_name,
  opening_balance, closing_balance } shape via report_bank_statement so
  downstream pdf_import code is provider-agnostic.
- Both attach the PDF as
  { type: "document", source: { type: "base64", media_type: "application/pdf", data: <b64> } }
  — Claude 3.5+ / 4.x accept this natively (up to 32MB / 100 pages).
  No pdf-reader, no pdftoppm, no chunking for typical statements.
- supports_pdf_processing? (introduced in PR 1) already returns true for
  claude-* models, gating process_pdf with a clear error otherwise.
- Cost ledger rows are persisted via the shared UsageRecorder concern,
  including cache_creation/cache_read tokens.

Tests verify the document block shape, tool_choice forcing, normalized
document_type for unknown classifications, transaction normalization
(date / amount / reference → notes), and the missing-tool_use error
path. Blank pdf_content raises before any client call.

Stacked on #1984 (PR 2/5). 4/5 pgvector RAG next.
@gariasf gariasf force-pushed the feature/anthropic-batch-ops branch from 487b714 to 1d35650 Compare May 25, 2026 17:58
gariasf added a commit that referenced this pull request May 25, 2026
@gariasf gariasf force-pushed the feature/anthropic-batch-ops branch from 1d35650 to a35b5ae Compare May 25, 2026 18:30
gariasf added a commit that referenced this pull request May 25, 2026
@gariasf gariasf force-pushed the feature/anthropic-batch-ops branch from a35b5ae to 7cd947e Compare May 26, 2026 08:39
gariasf added a commit that referenced this pull request May 26, 2026
@gariasf gariasf force-pushed the feature/anthropic-batch-ops branch from 7cd947e to 8913007 Compare May 27, 2026 08:09
gariasf added a commit that referenced this pull request May 27, 2026
gariasf added 3 commits May 27, 2026 10:42
Implements auto_categorize, auto_detect_merchants, and
enhance_provider_merchants on Provider::Anthropic via forced tool calls,
plus the cost-ledger plumbing they need.

- Provider::Anthropic::AutoCategorizer, AutoMerchantDetector,
  ProviderMerchantEnhancer each define a single output tool whose
  input_schema mirrors the desired output, then force the model to call
  it via tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }.
  Anthropic guarantees the tool_use.input matches the schema, so there
  is no JSON parsing fragility, no <think> tag stripping, and no
  json_object/json_schema fallback ladders.
- Concerns::UsageRecorder mirrors the OpenAI sibling but persists
  cache_creation_input_tokens / cache_read_input_tokens to dedicated
  columns instead of metadata.
- Migration adds cache_creation_tokens, cache_read_tokens (nullable
  integers) to llm_usages. OpenAI rows leave them null.
- LlmUsage::PRICING gains Claude 4.x rows (opus-4-7 $15/$75, sonnet-4-6
  $3/$15, haiku-4-5 $1/$5 per MTok). infer_provider returns "anthropic"
  for claude-* via the existing exact/prefix lookup.
- Provider::Anthropic#chat_response now persists cache columns directly
  rather than stashing them in metadata.
- 25-transaction batch cap mirrors the OpenAI provider so the cost
  ledger sees the same shape regardless of which provider ran a batch.

Tests cover the forced-tool-call path, null/None normalization,
case-insensitive merchant matching, the missing-tool_use error path,
and Anthropic-specific pricing + provider inference on LlmUsage.

Stacked on #1983 (PR 1/5). 3/5 PDF + vision next.

- LlmUsage.infer_provider now returns "anthropic" for Bedrock /
  Vertex shaped IDs (anthropic.* and anthropic/*), so cost-ledger
  filtering by provider stays correct even when no per-MTok rate is
  stored. Previously these IDs fell through to the "openai" default.
- AutoCategorizer drops the redundant nil sentinel from the
  category_name enum: the union type [string, null] already permits
  null, and some JSON Schema validators reject nil literals inside
  enum arrays.

Same rationale as the PR1 ostruct fix: explicit require so the tests
don't depend on ActiveSupport's transitive load when Ruby 3.5+ removes
OpenStruct from the default load path.
@gariasf gariasf force-pushed the feature/anthropic-batch-ops branch from 8913007 to 764424e Compare May 27, 2026 08:42
gariasf added a commit that referenced this pull request May 27, 2026