feat(ai): Anthropic batch ops + LLM cost ledger (2/5) #1984
Draft
gariasf wants to merge 3 commits into
gariasf added a commit that referenced this pull request on May 25, 2026:
Implements process_pdf and extract_bank_statement on Provider::Anthropic
using the native `document` content block — no rasterization, no text
pre-extraction.
- Provider::Anthropic::PdfProcessor classifies the document, summarizes
it, and extracts statement metadata via a forced report_document_analysis
tool whose input_schema mirrors the existing Provider::Openai output
(document_type from Import::DOCUMENT_TYPES, summary, extracted_data).
- Provider::Anthropic::BankStatementExtractor returns the same
{ transactions, period, account_holder, account_number, bank_name,
opening_balance, closing_balance } shape via report_bank_statement so
downstream pdf_import code is provider-agnostic.
- Both attach the PDF as
{ type: "document", source: { type: "base64", media_type: "application/pdf", data: <b64> } }
— Claude 3.5+ / 4.x accept this natively (up to 32MB / 100 pages).
No pdf-reader, no pdftoppm, no chunking for typical statements.
- supports_pdf_processing? (introduced in PR 1) already returns true for
claude-* models, gating process_pdf with a clear error otherwise.
- Cost ledger rows are persisted via the shared UsageRecorder concern,
including cache_creation/cache_read tokens.
Tests verify the document block shape, tool_choice forcing, normalized
document_type for unknown classifications, transaction normalization
(date / amount / reference → notes), and the missing-tool_use error
path. Blank pdf_content raises before any client call.
Stacked on #1984 (PR 2/5). 4/5 pgvector RAG next.
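The `document` block described in the commit message above can be sketched as follows. This is a minimal illustration, not the PR's code: `build_pdf_content` and `MAX_PDF_BYTES` are hypothetical names, and the size check simply applies the 32MB figure quoted above.

```ruby
require "base64"

# Hypothetical constant; the 32MB figure is taken from the commit message.
MAX_PDF_BYTES = 32 * 1024 * 1024

# Builds the content array for a user message: the native `document`
# block carrying the base64-encoded PDF, followed by a text instruction.
def build_pdf_content(pdf_content, instruction)
  # Fail before any client call when there is nothing to send.
  raise ArgumentError, "pdf_content is blank" if pdf_content.nil? || pdf_content.empty?
  raise ArgumentError, "PDF exceeds 32MB" if pdf_content.bytesize > MAX_PDF_BYTES

  [
    {
      type: "document",
      source: {
        type: "base64",
        media_type: "application/pdf",
        data: Base64.strict_encode64(pdf_content)
      }
    },
    { type: "text", text: instruction }
  ]
end

content = build_pdf_content("%PDF-1.4 example bytes", "Classify and summarize this document.")
puts content.first[:source][:media_type] # prints application/pdf
```

`Base64.strict_encode64` is used so the payload contains no line breaks; the page-count limit is not checked in this sketch.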
Implements auto_categorize, auto_detect_merchants, and
enhance_provider_merchants on Provider::Anthropic via forced tool calls,
plus the cost-ledger plumbing they need.
- Provider::Anthropic::AutoCategorizer, AutoMerchantDetector,
ProviderMerchantEnhancer each define a single output tool whose
input_schema mirrors the desired output, then force the model to call
it via tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }.
Anthropic guarantees the tool_use.input matches the schema, so there
is no JSON parsing fragility, no <think> tag stripping, and no
json_object/json_schema fallback ladders.
- Concerns::UsageRecorder mirrors the OpenAI sibling but persists
cache_creation_input_tokens / cache_read_input_tokens to dedicated
columns instead of metadata.
- Migration adds cache_creation_tokens, cache_read_tokens (nullable
integers) to llm_usages. OpenAI rows leave them null.
- LlmUsage::PRICING gains Claude 4.x rows (opus-4-7 $15/$75, sonnet-4-6
$3/$15, haiku-4-5 $1/$5 per MTok). infer_provider returns "anthropic"
for claude-* via the existing exact/prefix lookup.
- Provider::Anthropic#chat_response now persists cache columns directly
rather than stashing them in metadata.
- 25-transaction batch cap mirrors the OpenAI provider so the cost
ledger sees the same shape regardless of which provider ran a batch.
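The forced-tool-call pattern in the list above might look roughly like this. It is a sketch under assumptions: the schema fields, `build_request`, `extract_tool_input`, and `MissingToolUseError` are illustrative names rather than the PR's actual code; only the tool name and the `tool_choice` shape come from the text.

```ruby
# Single output tool whose input_schema describes the desired result.
# The field names below are illustrative assumptions.
CATEGORIZATION_TOOL = {
  name: "report_categorizations",
  description: "Report one category per transaction.",
  input_schema: {
    type: "object",
    properties: {
      categorizations: {
        type: "array",
        items: {
          type: "object",
          properties: {
            transaction_id: { type: "string" },
            category_name: { type: ["string", "null"] }
          },
          required: ["transaction_id", "category_name"]
        }
      }
    },
    required: ["categorizations"]
  }
}.freeze

# Hypothetical error class standing in for the provider's own error type.
class MissingToolUseError < StandardError; end

# Request parameters that force the model to call the one output tool.
def build_request(model:, prompt:)
  {
    model: model,
    max_tokens: 1024,
    tools: [CATEGORIZATION_TOOL],
    tool_choice: {
      type: "tool",
      name: CATEGORIZATION_TOOL[:name],
      disable_parallel_tool_use: true
    },
    messages: [{ role: "user", content: prompt }]
  }
end

# Pulls the structured result out of a response hash (string keys assumed),
# raising when the expected tool_use block is absent.
def extract_tool_input(response, tool_name)
  block = response.fetch("content", []).find do |b|
    b["type"] == "tool_use" && b["name"] == tool_name
  end
  raise MissingToolUseError, "no tool_use block for #{tool_name}" unless block

  block.fetch("input")
end
```

Because the result arrives as the `input` of a `tool_use` block, the caller reads a hash directly instead of parsing JSON out of free text.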
Tests cover the forced-tool-call path, null/None normalization,
case-insensitive merchant matching, the missing-tool_use error path,
and Anthropic-specific pricing + provider inference on LlmUsage.
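The null/None normalization and case-insensitive matching mentioned in the tests could be sketched like this. `normalize_name` and `NULL_SENTINELS` are hypothetical, and returning `nil` for names that are not in the user's list is an assumption about the intended behaviour.

```ruby
# Strings the model may emit in place of a real null (assumed set).
NULL_SENTINELS = %w[null none].freeze

# Maps a raw model-supplied name onto the user's canonical spelling.
# Returns nil for nil, blank, or sentinel values, and (by assumption)
# for names that match nothing in allowed_names.
def normalize_name(raw, allowed_names)
  return nil if raw.nil?

  value = raw.to_s.strip
  return nil if value.empty? || NULL_SENTINELS.include?(value.downcase)

  allowed_names.find { |name| name.casecmp?(value) }
end

puts normalize_name("groceries", ["Groceries", "Rent"]) # prints Groceries
```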
Stacked on #1983 (PR 1/5). 3/5 PDF + vision next.
- LlmUsage.infer_provider now returns "anthropic" for Bedrock / Vertex
shaped IDs (anthropic.* and anthropic/*), so cost-ledger filtering by
provider stays correct even when no per-MTok rate is stored. Previously
these IDs fell through to the "openai" default.
- AutoCategorizer drops the redundant nil sentinel from the
category_name enum: the union type [string, null] already permits null,
and some JSON Schema validators reject nil literals inside enum arrays.
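A minimal sketch of the provider inference described above, assuming a plain prefix check. The real method works through the PRICING lookup, so `infer_provider` here is a simplified stand-in and the model IDs in the usage lines are only illustrative.

```ruby
# Prefixes treated as Anthropic: direct claude-* IDs plus the
# Bedrock / Vertex shaped anthropic.* and anthropic/* forms.
ANTHROPIC_PREFIXES = ["claude-", "anthropic.", "anthropic/"].freeze

# Returns the provider name for a model ID, defaulting to "openai"
# as the commit message describes.
def infer_provider(model)
  return "anthropic" if model.to_s.start_with?(*ANTHROPIC_PREFIXES)

  "openai"
end

puts infer_provider("anthropic/claude-sonnet-4-6") # prints anthropic
puts infer_provider("gpt-4.1")                     # prints openai
```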
Same rationale as the PR 1 ostruct fix: require ostruct explicitly so the tests don't depend on ActiveSupport's transitive load once Ruby 3.5+ removes OpenStruct from the default load path.
Summary
Implements `auto_categorize`, `auto_detect_merchants`, and `enhance_provider_merchants` on `Provider::Anthropic` via forced tool calls, plus the cost-ledger plumbing they need. This is PR 2 of 5, stacked on #1983.

Why forced tool calls
Anthropic has no first-class JSON-mode flag. The idiomatic replacement is to define a single output tool whose `input_schema` mirrors the desired output, then force the model to invoke it with `tool_choice: { type: "tool", name: ..., disable_parallel_tool_use: true }`. The model returns exactly one `tool_use` block whose `input` is guaranteed to validate against the schema.

Net effect:
- […] `Provider::Anthropic::Error`

The three Anthropic task classes end up ~30% smaller than their OpenAI siblings while covering the same surface.

What's in this PR
Task classes (each is small: forced-tool-call request + schema + normalize)
- `Provider::Anthropic::AutoCategorizer` → tool: `report_categorizations`
- `Provider::Anthropic::AutoMerchantDetector` → tool: `report_merchants`
- `Provider::Anthropic::ProviderMerchantEnhancer` → tool: `report_enhancements`

Each:
- […] `Concerns::UsageRecorder` (mirror of the OpenAI sibling).
- […] `"null"` strings and case-insensitive matches against the user's category / merchant list.

Cost ledger
- `20260525120000_add_anthropic_cache_tokens_to_llm_usages.rb` adds nullable `cache_creation_tokens` and `cache_read_tokens` integer columns. OpenAI rows leave them null; Anthropic rows populate them.
- `LlmUsage::PRICING` gains `claude-opus-4-7` ($15/$75), `claude-opus-4-6` ($15/$75), `claude-sonnet-4-6` ($3/$15), `claude-sonnet-4-5` ($3/$15), `claude-haiku-4-5` ($1/$5) per MTok (Anthropic public pricing).
- `LlmUsage.infer_provider` returns `"anthropic"` for `claude-*` via the existing exact/prefix lookup; no code change needed beyond the PRICING rows.
- `Provider::Anthropic#chat_response` (introduced in PR 1) now persists cache columns directly instead of stashing them in `metadata`.

Not changed (intentional)
- […] `LlmUsage.calculate_cost` continues to bill them at the regular input rate for now, a deliberate slight over/under depending on cache lifetime, refined in a follow-up if real-world bills warrant it.

Test plan
- `test/models/provider/anthropic/auto_categorizer_test.rb`: 3 tests (forced-tool-call request shape, null/None normalization, missing-tool_use error path)
- `test/models/provider/anthropic/auto_merchant_detector_test.rb`: 3 tests (same shape + case-insensitive user_merchants matching)
- `test/models/provider/anthropic/provider_merchant_enhancer_test.rb`: 2 tests (forced-tool-call mapping + error path)
- `test/models/llm_usage_test.rb` (new file): Claude pricing math, provider inference
- `bin/rubocop` clean
- `bin/brakeman --no-pager` clean

Migration
Backwards compatible: new columns are nullable and existing OpenAI write paths are untouched.
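As a rough illustration of the cost-ledger behaviour noted under "Not changed", the sketch below bills cache tokens at the regular input rate using the per-MTok prices listed in this description. `CLAUDE_PRICING` and `estimate_cost` are hypothetical names, and adding the cache token counts on top of `input_tokens` is an assumption about how the columns combine, not the actual `LlmUsage.calculate_cost` logic.

```ruby
# USD per million tokens (MTok), taken from the pricing rows above.
CLAUDE_PRICING = {
  "claude-opus-4-7"   => { input: 15, output: 75 },
  "claude-sonnet-4-6" => { input: 3,  output: 15 },
  "claude-haiku-4-5"  => { input: 1,  output: 5 }
}.freeze

TOKENS_PER_MTOK = 1_000_000

# Returns the estimated cost in USD as an exact Rational, or nil when
# no rate is stored for the model. Nil cache columns (as on OpenAI rows)
# count as zero.
def estimate_cost(model:, input_tokens:, output_tokens:,
                  cache_creation_tokens: nil, cache_read_tokens: nil)
  rates = CLAUDE_PRICING[model]
  return nil unless rates

  # Assumption: cache tokens are billed like ordinary input tokens.
  billable_input = input_tokens + cache_creation_tokens.to_i + cache_read_tokens.to_i
  Rational(billable_input * rates[:input] + output_tokens * rates[:output], TOKENS_PER_MTOK)
end

cost = estimate_cost(model: "claude-sonnet-4-6", input_tokens: 10_000,
                     output_tokens: 2_000, cache_read_tokens: 5_000)
puts cost.to_f # prints 0.075
```

Here 15,000 billable input tokens at $3/MTok contribute $0.045 and 2,000 output tokens at $15/MTok contribute $0.030, for $0.075 in total.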