Skip to content

Bug: Pending→posted reconciliation can silently destroy a distinct same-amount transaction #2013

@greatjourney589

Description

@greatjourney589

Summary

For providers without a stable linking ID across the pending→posted transition (SimpleFIN, Lunchflow, Enable Banking), reconciliation falls back to matching a pending transaction purely by source + amount + currency within an 8-day window — no name, no merchant, no provider id. When such a match is found, the incoming posted transaction claims the pending entry (overwrites its external_id, name, date, amount) instead of being inserted as a new row.

Same-amount transactions are extremely common (two $20 ATM withdrawals, a $4.50 coffee twice in a week, two $50 payments to the same biller). When an unrelated posted transaction B has the same amount as a still-pending transaction A, B overwrites A — two real transactions collapse into one. The auto_claimed_pending_ids machinery then actively prevents the lost transaction from being recreated on later syncs, making the loss permanent and invisible.

Affected code

Match (PRIORITY 2 fallback) — app/models/account/provider_import_adapter.rb:102-109

# PRIORITY 2: Fallback to EXACT amount match (for SimpleFIN and providers without linking IDs)
if pending_match.nil?
  pending_match = find_pending_transaction(date: date, amount: amount, currency: currency, source: source)
  if pending_match
    Rails.logger.info("Reconciling pending→posted via exact amount match: claiming entry #{pending_match.id} ...")
  end
end

Claim (overwrite + permanent suppression) — app/models/account/provider_import_adapter.rb:111-134

if pending_match
  old_pending_external_id = pending_match.external_id
  pending_entry_date      = pending_match.date
  entry = pending_match                                   # ← reuse the OTHER transaction's row
  entry.assign_attributes(external_id: external_id)       # ← overwrite its identity
  # ...
  if old_pending_external_id.present?
    existing_claims = Array.wrap(ex["auto_claimed_pending_ids"])
    ex["auto_claimed_pending_ids"] = (existing_claims + [ old_pending_external_id ]).uniq  # ← suppress re-import forever
  end
end

The amount-only query — app/models/account/provider_import_adapter.rb:717-743

def find_pending_transaction(date:, amount:, currency:, source:, date_window: 8)
  candidates = account.entries
    .joins("INNER JOIN transactions ...")
    .where(source: source)
    .where(amount: amount)        # "exact match - this is the strongest signal" (no name, no merchant)
    .where(currency: currency)
    .where(date: (date - date_window.days)..date)
    .where(/* extra -> provider ->> 'pending' = true */)
    .order(date: :desc)
  candidates.first               # ← picks one with no corroborating signal, no single-candidate guard
end

Note: the sibling find_pending_transaction_fuzzy (:760) does accept merchant_id/name and guards on "exactly one candidate" — but the exact-amount path here uses neither corroborating signal nor a single-candidate guard.

Second, divergent path doing the same amount-only guess — app/models/entry.rb:159-195 (reconcile_pending_duplicates):

# PRIORITY 1: Look for posted transaction with EXACT amount match
exact_candidates = acct.entries
  .joins("INNER JOIN transactions ...")
  .where.not(id: pending_entry.id)
  .where(currency: pending_entry.currency)
  .where(amount: pending_entry.amount)
  .where(date: pending_entry.date..(pending_entry.date + date_window.days)) # different window/direction
  .where(not_pending_sql)
  .limit(2)
# auto-excludes the pending entry when exactly one candidate

So the same amount-only heuristic is implemented twice, inconsistently: an import-time claim/overwrite (no single-candidate guard) and a post-import auto-exclude sweep (different window, opposite date direction, has a single-candidate guard).

Impact (silent data loss + balance corruption)

  • Two legitimately distinct transactions with the same amount collapse into one row → understated spend and a corrupted account balance.
  • The overwrite mutates external_id, name, date, and amount of the claimed entry, so the original pending transaction's data is destroyed in place.
  • auto_claimed_pending_ids records the old external_id so the sync engine excludes it from re-import — the lost transaction can never be recreated, so the loss is permanent and invisible (no error, no duplicate to notice).
  • The "exactly one candidate" guard does not protect this path — the single-pending/single-posted same-amount case is exactly what triggers the false merge, and find_pending_transaction has no such guard anyway.

Steps to reproduce

  1. Use a provider without a pending→posted linking id (SimpleFIN / Lunchflow / Enable Banking).
  2. Sync a pending transaction A: amount -20.00, name "ATM Withdrawal", date day 1.
  3. Before A posts, sync an unrelated posted transaction B: amount -20.00, different merchant/name, date day 3 (within the 8-day window).
  4. Observed: B claims A's entry — only one -20.00 row remains (with B's id/name, A's preserved pending date), so 2 × -$20 becomes 1 × -$20; balance is off by $20. On the next sync, A is not recreated because its old external_id is in auto_claimed_pending_ids.
  5. Expected: A and B remain two separate transactions; total spend -$40; balance correct.

Why the fix is structural

Correct reconciliation needs a provider-supplied stable identity across the pending→posted transition — but only Plaid supplies one (pending_transaction_id, PRIORITY 1 at :93-100). Fixing this properly means:

  1. Require corroborating signals for ID-less providers. Never claim/merge on amount + currency + window alone. Require matching merchant/name (normalized) and/or other provider metadata, and treat a bare amount match as insufficient rather than "the strongest signal".
  2. Unify the two divergent code paths — the import-time claim in Account::ProviderImportAdapter and the post-import sweep in Entry.reconcile_pending_duplicates — into one reconciliation model with a single, consistent window, date-direction, and tolerance. Today they disagree on all three.
  3. Prefer non-destructive reconciliation. Rather than overwriting the pending row in place, link/supersede it (or only auto-exclude with a high-confidence, single-candidate + corroborating-signal guard) so an incorrect merge is recoverable and auditable.
  4. Make auto_claimed_pending_ids reversible / auditable so a wrong claim does not permanently suppress a real transaction's re-import.

Acceptance criteria

  • An ID-less provider never merges/claims a pending entry on amount+currency+window alone (requires corroborating merchant/name or provider id).
  • Two distinct same-amount transactions (one pending, one unrelated posted) within the window remain two rows; balance and total spend are correct.
  • The import-time claim and the post-import sweep share one reconciliation implementation with consistent window/direction/tolerance.
  • A wrongly-suppressed transaction can be re-imported (auto-claim is reversible/auditable).
  • Regression tests cover: single pending + single unrelated posted of equal amount (must NOT merge), and a genuine pending→posted of the same provider transaction (must reconcile).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions