API gateway and server for LiquiFact, the global invoice liquidity network on Stellar. This repo provides the Express-based REST API for invoice uploads, escrow state, and future Stellar integration.
Part of the LiquiFact stack: frontend (Next.js) | backend (this repo) | contracts (Soroban).
- Node.js 20+ (LTS recommended)
- npm 9+
- Docker & Docker Compose (for local PostgreSQL)
- Clone the repo
  git clone <this-repo-url>
  cd liquifact-backend
- Install dependencies
  npm install
- Configure environment
  cp .env.example .env
  # Edit .env with your database configuration

  > [!IMPORTANT]
  > Startup Validation Gate: The application validates all required environment variables at boot time before binding to a port. If the configuration is invalid (e.g. JWT_SECRET is shorter than 32 characters or KYC keys are half-configured), the server will print a redacted error summary showing the failed keys and exit immediately. Secret values are never exposed in validation output.
- Start database services
  docker-compose -f docker-compose.dev.yml up -d
- Run database migrations
  npm run db:migrate
For a complete, tested mapping of every environment variable to its type, default, consumer, and secret status, see docs/configuration.md.
The backend includes a TTL-based response-cache middleware backed by an in-memory store. Caching is applied to expensive read endpoints to reduce latency and database load.
Every cached response includes an X-Cache header:
| Value | Meaning |
|---|---|
| HIT | Response was served from cache |
| MISS | Response was generated by the handler |
Clients can bypass the cache by sending a Cache-Control: no-cache request header.
| Endpoint | Cache key format | TTL |
|---|---|---|
| GET /api/marketplace | marketplace:<tenantId>:<originalUrl> | 15s |
| GET /api/investor/locks | investor:locks:<tenantId>:<originalUrl> | 15s |
| GET /api/investor/locks/:invoiceId | investor:lock:<tenantId>:<invoiceId>:<funderAddress> | 15s |
Cache keys always include the tenant ID, ensuring that data from one tenant can never be served to another, even if they share the same query parameters.
Cache entries are automatically invalidated when the underlying data changes:
| Trigger | Prefix invalidated |
|---|---|
| Invoice state transition | marketplace: |
| Investor commitment persisted | investor: |
| Investor commitment updated | investor: |
The default store is MemoryCacheStore (src/services/cacheStore.js). It supports:
- TTL-based expiry with lazy eviction
- Prefix-based invalidation for targeted cache flushing
- Singleton access via getSharedStore() for use by both middleware and write-side services
Cache store errors are caught and logged; they never block the request.
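The store behaviour listed above (TTL expiry, lazy eviction, prefix invalidation) can be sketched in a few lines. This is a minimal illustration, not the code in src/services/cacheStore.js: the class name SketchCacheStore, the invalidatePrefix method name, and the injectable clock are assumptions made for the example.

```javascript
// Hypothetical sketch of a TTL store with lazy eviction and prefix invalidation.
class SketchCacheStore {
  constructor(now = () => Date.now()) {
    this.entries = new Map(); // key -> { value, expiresAt }
    this.now = now;           // injectable clock, useful in tests
  }

  set(key, value, ttlMs) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  // Lazy eviction: an expired entry is removed only when it is next read.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  // Drop every key that starts with the prefix, e.g. "marketplace:".
  invalidatePrefix(prefix) {
    let removed = 0;
    for (const key of this.entries.keys()) {
      if (key.startsWith(prefix)) {
        this.entries.delete(key);
        removed += 1;
      }
    }
    return removed;
  }
}
```

Because every key begins with a namespace and the tenant ID, a write-side service could flush one namespace (for example `marketplace:`) without touching investor entries.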
Optional Sentry error tracking is supported through the SENTRY_DSN environment variable. When enabled, the server scrubs sensitive values before sending events, including:
- Invoice payload bodies and invoice-related fields
- Authorization headers and bearer tokens
- JWT claims (issuer, audience) and algorithms
- API keys and secret values
- Stellar XDR / Stellar-specific payloads
The /metrics endpoint exposes Prometheus-formatted metrics via a dedicated route handler at GET /metrics. It is served only to clients that present the configured bearer token or, when no token is configured, connect from a loopback address.
- Bearer token configured — If METRICS_BEARER_TOKEN is set, every request must carry an Authorization: Bearer <token> header. The comparison uses a constant-time algorithm (safeEqual) to prevent timing side-channel attacks.
- No token (private-network mode) — If METRICS_BEARER_TOKEN is unset, only requests originating from loopback addresses (127.0.0.1, ::1, ::ffff:127.0.0.1) are allowed. This is suitable for Prometheus scrapers running on the same host.
- All other requests receive a uniform 401 Unauthorized response with no indication of whether the failure was a missing token, wrong token, or non-loopback origin.
Loopback detection reads the direct TCP connection address from req.socket.remoteAddress. The X-Forwarded-For header is never consulted, so a remote attacker cannot spoof a loopback origin by setting X-Forwarded-For: 127.0.0.1.
There is no app.set('trust proxy', ...) call in this application. If one is added in the future (which would cause req.ip to resolve from X-Forwarded-For values), the middleware already ignores req.ip for loopback checks and reads the socket directly, making it resilient to such configuration changes.
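The access decision described in this section can be sketched as a single predicate. This is an illustration under assumptions, not the repo's middleware: isMetricsRequestAllowed is a hypothetical name, and the hash-then-compare body of safeEqual is one common way to get a constant-time comparison, not necessarily how the real helper is written.

```javascript
const crypto = require('node:crypto');

// Constant-time string comparison. Hashing both sides first yields
// equal-length buffers, which crypto.timingSafeEqual requires.
function safeEqual(a, b) {
  const ha = crypto.createHash('sha256').update(String(a)).digest();
  const hb = crypto.createHash('sha256').update(String(b)).digest();
  return crypto.timingSafeEqual(ha, hb);
}

const LOOPBACK = new Set(['127.0.0.1', '::1', '::ffff:127.0.0.1']);

// `expectedToken` stands in for METRICS_BEARER_TOKEN. `req` only needs
// `headers` (Node lower-cases header names) and `socket.remoteAddress`.
function isMetricsRequestAllowed(req, expectedToken) {
  if (expectedToken) {
    const header = req.headers.authorization || '';
    const match = /^Bearer (.+)$/.exec(header);
    return match !== null && safeEqual(match[1], expectedToken);
  }
  // Private-network mode: trust only the direct TCP peer address.
  // X-Forwarded-For and req.ip are deliberately not consulted.
  return LOOPBACK.has(req.socket.remoteAddress);
}
```

A caller would answer every `false` with the same 401 body, so the response does not reveal which check failed.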
| Variable | Required | Default | Description |
|---|---|---|---|
| METRICS_BEARER_TOKEN | No | unset | Static bearer token for /metrics. Generate with openssl rand -hex 32. When unset, only loopback is allowed. |
- ✅ Valid bearer token → 200
- ✅ Missing Authorization header → 401 (uniform body)
- ✅ Wrong token → 401 (uniform body, indistinguishable from missing)
- ✅ Basic auth → 401 (non-Bearer scheme)
- ✅ Loopback with no token → 200
- ✅ Non-loopback with no token → 401
- ✅ X-Forwarded-For: 127.0.0.1 from non-loopback socket → blocked (spoof rejected)
- ✅ ::1 and ::ffff:127.0.0.1 loopback variants → 200
- ✅ Authorization header casing (authorization, Authorization, AUTHORIZATION) → all accepted
| Endpoint | Type | Dependencies checked | Response |
|---|---|---|---|
| GET /health | Liveness | None (process alive) | 200 { status: "ok" } |
| GET /healthz | Liveness | None (Kubernetes convention alias) | 200 { status: "ok" } |
| GET /ready | Readiness | Soroban RPC, KYC provider, indexer staleness | 200/503 with per-check detail |
| GET /readyz | Readiness | Critical: DB (via knex SELECT 1), Soroban RPC | 200/503 with per-check detail |
The /readyz probe is designed for orchestrated deployments (Kubernetes, Nomad, etc.)
to distinguish "process is alive" from "process is ready to serve traffic".
- Liveness probes (/health, /healthz) never touch external dependencies.
- Readiness probe (/readyz) only checks critical upstream dependencies (DB, Soroban RPC).
- Credentials and internal hostnames are stripped from error responses.
Health state is also surfaced as a Prometheus gauge (readiness_gauge).
Environment variables:
- SENTRY_DSN — Optional Sentry DSN. Example: https://<PUBLIC_KEY>@o<ORG_ID>.ingest.sentry.io/<PROJECT_ID>
- SENTRY_RELEASE — Optional release tag. Defaults to package version when available.
- SENTRY_ENVIRONMENT — Optional environment tag. Defaults to NODE_ENV.
Do not store secrets in source control. Use .env locally and deployment secrets in production.
The application exposes Prometheus metrics on GET /metrics (subject to the same auth rules). Additional gauges added for background job observability:
- liquifact_job_queue_depth: Number of pending jobs currently waiting in the main queues of all registered job queues.
- liquifact_job_retry_queue_size: Number of jobs currently waiting in retry queues across registered job queues.
- liquifact_worker_inflight_count: Number of jobs currently being processed by registered background workers.
These gauges are updated by sampling registered JobQueue and BackgroundWorker instances and are intentionally bounded to avoid high-cardinality labels.
The body_size_limit_rejections_total counter tracks every request rejected with HTTP 413 Payload Too Large, labelled by the body parser type that rejected it:
| Label type | Trigger |
|---|---|
| json | Rejected by the global JSON body parser (default 100 KB) |
| urlencoded | Rejected by the URL-encoded body parser (default 50 KB) |
| invoice | Rejected by the stricter invoice upload parser (default 512 KB) |
| unknown | Rejected by the generic error handler when content-type cannot be determined |
This counter is designed for DoS detection: a sudden spike in any type label indicates a potential attack attempting to overwhelm the API with oversized payloads.
The following PromQL alerts detect rapid increases in body-size rejections. A sustained rate of 10+ rejections per minute is a strong signal of a volumetric DoS attempt.
# Alert when JSON body-size rejections exceed 10 per minute (potential DoS)
rate(body_size_limit_rejections_total{type="json"}[5m]) > 0.167
# Alert when urlencoded body-size rejections exceed 10 per minute
rate(body_size_limit_rejections_total{type="urlencoded"}[5m]) > 0.167
# Aggregate alert across ALL body-size limit types
sum(rate(body_size_limit_rejections_total[5m])) > 0.167
Tuning guidance:
| Environment | Suggested rate threshold | Rationale |
|---|---|---|
| Development / CI | > 0.5 (30/min) | Higher baseline from automated test traffic |
| Production (normal) | > 0.167 (10/min) | Expected occasional oversized payloads from legitimate clients |
| Production (locked down) | > 0.017 (1/min) | Very low tolerance — almost all oversized payloads are malicious |
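The thresholds in this table follow from the fact that PromQL's rate() returns a per-second average, so a per-minute budget is divided by 60. The helper below is a hypothetical convenience for deriving a threshold for a different budget; it is not part of the repo.

```javascript
// Convert a "rejections per minute" budget into the per-second value that a
// PromQL rate() expression is compared against, rounded to three decimals.
function perMinuteToRateThreshold(perMinute) {
  return Number((perMinute / 60).toFixed(3));
}

console.log(perMinuteToRateThreshold(30)); // Development / CI budget
console.log(perMinuteToRateThreshold(10)); // Production (normal) budget
console.log(perMinuteToRateThreshold(1));  // Production (locked down) budget
```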
For production deployments, include the full YAML alert rule from docs/prometheus-rules.yml.
Import the pre-built Grafana dashboard to visualize body-size limit rejection metrics over time:
- Open Grafana → + → Import.
- Upload or paste docs/grafana-dashboard.json.
- Select your Prometheus data source.
- Click Import.
The dashboard includes the following panels:
| Panel | Type | Description |
|---|---|---|
| Rejection Rate by Type | Time series | rate(body_size_limit_rejections_total[5m]) per type label (json, urlencoded, invoice, unknown) + aggregate |
| Current Rejection Rate | Stat | Live rate with green/yellow/red background thresholds matching alert severity |
| Rejections by Type (Current) | Bar gauge | Instant per-type rates for quick scanning |
| Cumulative Rejections (Last Hour) | Bar gauge | increase()[1h] per type — sustained values > 600 suggest probing |
| Cumulative Rejections Over Time | Time series | Hourly increase per type over the selected time window |
| Alert Threshold Reference | Bar gauge | Combined rate with visual threshold markers |
| Historical Rejection Heatmap | Time series (step) | All-types aggregate with color-coded severity bands |
This backend supports an optional external KYC provider adapter. When KYC_PROVIDER_URL and KYC_PROVIDER_API_KEY are both configured, the service submits verification requests to the provider and maps provider statuses onto internal KYC_STATUSES:
- pending
- verified
- rejected
- exempted
Incoming provider status updates are ingested through a signed webhook endpoint:
POST /api/kyc/webhook
The webhook verifies the X-Signature HMAC signature using KYC_PROVIDER_SECRET before persisting the SME record.
When no provider keys are present, the service gracefully falls back to the in-memory mock provider behavior used for local development and tests.
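A signed-webhook check of the kind described above might look like the sketch below. The source only states that X-Signature is an HMAC verified with KYC_PROVIDER_SECRET; the SHA-256 digest, the hex encoding, and the function name verifyWebhookSignature are assumptions for illustration and should be replaced with whatever the provider actually specifies.

```javascript
const crypto = require('node:crypto');

// Assumption: the provider sends hex(HMAC-SHA256(rawBody, secret)) in X-Signature.
// `rawBody` must be the exact bytes received, not a re-serialised JSON object.
function verifyWebhookSignature(rawBody, signatureHex, secret) {
  const expected = crypto.createHmac('sha256', secret).update(rawBody).digest();
  const received = Buffer.from(String(signatureHex || ''), 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length
    && crypto.timingSafeEqual(received, expected);
}
```

Verification runs before any persistence step, so a request with a bad signature never reaches the SME record.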
The API enforces a strict matching between STELLAR_NETWORK and SOROBAN_RPC_URL at boot time. This prevents misconfiguration where a passphrase (network identity) is paired with an incompatible RPC endpoint, which would cause on-chain validation failures.
| Network | Passphrase | RPC URL |
|---|---|---|
| TESTNET | Test SDF Network ; September 2015 | https://soroban-testnet.stellar.org |
| MAINNET | Public Global Stellar Network ; September 2014 | https://soroban.stellar.org |
| FUTURENET | Test SDF Future Network ; October 2022 | https://rpc-futurenet.stellar.org |
Set both variables in your .env:
STELLAR_NETWORK=TESTNET
SOROBAN_RPC_URL=https://soroban-testnet.stellar.org

Do NOT use custom RPC URLs. The validation will reject any deviation from the expected RPC for the selected network.
On startup, src/index.js calls validateStellarConfig() from src/config/stellar.js. If the network/RPC combination is invalid, the server fails to start with a clear error message:
Error: Mismatch: STELLAR_NETWORK=TESTNET requires SOROBAN_RPC_URL="https://soroban-testnet.stellar.org", but got "https://custom-rpc.example.com". This combination would cause on-chain validation failures.
- The validation is a hard fail - no partial or degraded operation is permitted.
- This ensures the backend never signs transactions with a mismatched network, which could result in fund loss.
- The passphrase is derived from the network constant and is not user-configurable.
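The boot-time check can be pictured as a lookup against the network table followed by a strict equality test. The sketch below is not the contents of src/config/stellar.js; the function name validateStellarConfigSketch and the return shape are assumptions, and only the network/RPC pairs come from the table in this section.

```javascript
// Expected RPC endpoint per network, copied from the table in this section.
const EXPECTED_RPC = Object.freeze({
  TESTNET: 'https://soroban-testnet.stellar.org',
  MAINNET: 'https://soroban.stellar.org',
  FUTURENET: 'https://rpc-futurenet.stellar.org',
});

// Throws on any unknown network or mismatched RPC URL (hard fail, no fallback).
function validateStellarConfigSketch(env) {
  const network = env.STELLAR_NETWORK;
  if (typeof network !== 'string' || !Object.hasOwn(EXPECTED_RPC, network)) {
    throw new Error(`Unknown STELLAR_NETWORK=${network}`);
  }
  const expected = EXPECTED_RPC[network];
  if (env.SOROBAN_RPC_URL !== expected) {
    throw new Error(
      `Mismatch: STELLAR_NETWORK=${network} requires SOROBAN_RPC_URL="${expected}", ` +
      `but got "${env.SOROBAN_RPC_URL}".`
    );
  }
  return { network, rpcUrl: expected };
}
```

Calling it with `process.env` before the server binds a port would reproduce the fail-fast behaviour described above.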
Service-to-service callers authenticate with the X-API-Key request header. Keys are loaded from the API_KEYS environment variable — no database connection is opened per request.
Set API_KEYS as a semicolon-separated list of JSON objects:
API_KEYS={"key":"lf_svc_key_001","clientId":"billing-service","scopes":["invoices:read","invoices:write"]};{"key":"lf_old_key_001","clientId":"legacy-service","scopes":["invoices:read"],"revoked":true}
| Field | Type | Required | Description |
|---|---|---|---|
| key | string | yes | Raw API key — must start with lf_, minimum 10 chars |
| clientId | string | yes | Unique identifier for the service client |
| scopes | string[] | yes | Permissions granted; valid values: invoices:read, invoices:write, escrow:read |
| revoked | boolean | no | Set true to reject the key without removing it from the list |
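To make the API_KEYS format concrete, here is a minimal sketch of parsing and validating the variable at startup. The function name parseApiKeys and the error messages are hypothetical; the field rules are the ones in the table above.

```javascript
const VALID_SCOPES = new Set(['invoices:read', 'invoices:write', 'escrow:read']);

// Parse a semicolon-separated list of JSON objects into a validated registry.
// Caveat of this sketch: a literal ";" inside a JSON string would break the split.
function parseApiKeys(raw) {
  return String(raw || '')
    .split(';')
    .map((part) => part.trim())
    .filter(Boolean)
    .map((part, index) => {
      const entry = JSON.parse(part);
      const { key, clientId, scopes } = entry;
      if (typeof key !== 'string' || !key.startsWith('lf_') || key.length < 10) {
        throw new Error(`API_KEYS entry ${index}: key must start with "lf_" and be at least 10 chars`);
      }
      if (typeof clientId !== 'string' || clientId === '') {
        throw new Error(`API_KEYS entry ${index}: clientId is required`);
      }
      if (!Array.isArray(scopes) || !scopes.every((s) => VALID_SCOPES.has(s))) {
        throw new Error(`API_KEYS entry ${index}: scopes must be an array of known scopes`);
      }
      // `revoked` is optional and defaults to false.
      return { key, clientId, scopes, revoked: entry.revoked === true };
    });
}
```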
- Add the new key entry to API_KEYS and deploy — old and new key both work.
- Update callers to use the new key.
- Set "revoked": true on the old entry and redeploy — old key is rejected immediately.
- No database per request — the former SQLite-backed path (src/middleware/apiKey.js) has been retired; the registry is parsed from environment variables at startup.
- Timing-safe comparison — authenticateApiKey iterates every registry entry on every request using crypto.timingSafeEqual; it never short-circuits on the first match, preventing timing-based key enumeration.
- No key material in logs — the raw key is never written to any log line; failed lookups record only a 401 response.
- Revocation without a database update — setting revoked: true and redeploying is sufficient; no DB update needed.
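The "iterate every entry, never short-circuit" lookup can be sketched as follows. This is an illustration, not the code in the apiKeyAuth middleware: findApiClient and digest are hypothetical names, and hashing both sides before crypto.timingSafeEqual is an assumption made so the buffers always have equal length.

```javascript
const crypto = require('node:crypto');

// Fixed-length digest so timingSafeEqual never throws on length mismatch.
function digest(value) {
  return crypto.createHash('sha256').update(String(value)).digest();
}

// Compare the presented key against EVERY registry entry, even after a match,
// so response time does not depend on where (or whether) the key is found.
function findApiClient(registry, presentedKey) {
  const presented = digest(presentedKey);
  let found = null;
  for (const entry of registry) {
    const equal = crypto.timingSafeEqual(digest(entry.key), presented);
    if (equal && !entry.revoked && found === null) {
      found = { clientId: entry.clientId, scopes: entry.scopes };
    }
  }
  return found; // null means "respond 401"; the raw key is never logged
}
```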
const { authenticateApiKey } = require('./middleware/apiKeyAuth');
// No scope requirement — any valid, non-revoked key is accepted
router.use(authenticateApiKey());
// Require a specific scope
router.use(authenticateApiKey({ requiredScope: 'invoices:write' }));

On success, req.apiClient is set to { clientId, scopes }.
The production image is built with the following hardening measures, which address container-security best practices (CIS Docker Benchmark):
| Property | Value |
|---|---|
| Base image | node:20-slim |
| Build strategy | Multi-stage (deps → runtime) |
| Runtime user | appuser (UID 1001, non-root) |
| Dependency install | npm ci --omit=dev against committed package-lock.json |
| Health probe | GET /readyz (HTTP 200 = healthy) |
| Exposed port | 3001 |
The final image creates a dedicated appuser/appgroup (UID/GID 1001) and
switches to that identity before CMD. The process therefore runs without
root privileges, limiting the blast radius of any application-layer exploit.
RUN groupadd --gid 1001 appgroup \
&& useradd --uid 1001 --gid appgroup --no-create-home --shell /bin/false appuser
...
USER appuser

npm ci requires package-lock.json to be present and in sync with
package.json. If the lockfile is absent or diverged the build fails
immediately — no silent version drift into production.
Note: package-lock.json must be committed to the repository. The .gitignore was updated to allow this file; run npm install locally and commit the generated lockfile before building the Docker image.
The deps stage installs dependencies; the runtime stage copies only the
resolved node_modules and application source. Build tooling, npm itself,
and any intermediate files never reach the final layer.
# Build the hardened image
docker build -t liquifact-backend:latest .
# Run with required environment variables
docker run --rm \
-p 3001:3001 \
-e NODE_ENV=production \
-e DATABASE_URL=postgresql://user:pass@host:5432/db \
liquifact-backend:latest
# Verify the container runs as a non-root user
docker run --rm --entrypoint id liquifact-backend:latest
# Expected output: uid=1001(appuser) gid=1001(appgroup) groups=1001(appgroup)

| Scenario | Behaviour |
|---|---|
| package-lock.json missing | npm ci fails; docker build exits non-zero |
| package-lock.json out of sync | npm ci fails; docker build exits non-zero |
| Health check during startup | --start-period=5s absorbs boot time; probe retried up to 3× |
| Non-root file permissions | chown -R appuser:appgroup /app grants app full access to its own tree |
| Command | Description |
|---|---|
| npm run dev | Start API with watch mode |
| npm run dev:ts | Start API with TS runtime (optional) |
| npm run start | Start API |
| npm run typecheck | Run TypeScript type checking (no emit) |
| npm run build | Compile src/ to dist/ |
| npm run start:dist | Start compiled output from dist/ |
| npm run lint | Run ESLint on src/ |
| npm test | Run load helper tests and structured error tests |
| npm run db:migrate | Run database migrations |
| npm run db:rollback | Rollback last migration |
| npm run db:seed | Run database seeds |
| npm run db:migrate:down | Rollback last migration |
| npm run db:migrate:create <name> | Create new migration file |
| npm run db:migrate:reset | Reset database (drop & re-run) |
| npm run test:coverage | Run helper/API tests with coverage |
| npm run load:baseline | Run the core endpoint load baseline suite |
Default port: 3001.
The application uses an in-memory cache store (MemoryCacheStore) by default for token metadata and other transient data. The store is bounded so that memory use cannot grow without limit.
- Eviction Policy: Configurable maxEntries bound (defaults to 5000) with Least Recently Used (LRU) eviction. Expired entries are also lazily evicted on get().
- Metrics: Emits hit, miss, and eviction counts to Prometheus via standard counters:
  - soroban_footprint_cache_hits_total
  - soroban_footprint_cache_misses_total
  - soroban_footprint_cache_evictions_total
- Configuration: Optional and disabled by default. Set REDIS_ESCROW_CACHE_ENABLED=true with REDIS_URL to enable it.
- Tuning: REDIS_ESCROW_CACHE_TTL_SECONDS is strictly clamped to 5..300, and REDIS_ESCROW_LEDGER_GAP_THRESHOLD controls ledger-gap invalidation.
Incremental TypeScript setup and migration guidance lives in docs/typescript-plan.md.
This project uses node-pg-migrate for database schema management with PostgreSQL. The migration system provides:
- SQL-first migration control with rollback support
- Multi-tenant architecture with Row Level Security (RLS)
- Production-safe transaction handling
- Comprehensive audit logging
# Start PostgreSQL and Redis
docker-compose -f docker-compose.dev.yml up -d
# Run migrations
npm run db:migrate

- Multi-tenant isolation with tenant-scoped data (see docs/multi-tenancy.md)
- Soft deletes for data recovery
- Audit trail for compliance
- UUID primary keys for distributed systems
- JSONB metadata for schema flexibility
📖 Full documentation: See DB_MIGRATIONS.md for a comprehensive migration guide, troubleshooting, and deployment procedures.
The API is documented using the OpenAPI 3.0 specification.
- OpenAPI JSON: GET /openapi.json - Machine-readable API specification
- Interactive Docs: GET /docs - Swagger UI for exploring and testing the API
- Correlation Strategy: See docs/invoice-correlation.md for details on how invoiceId correlates with on-chain Stellar and Soroban data.
- Signing Modes: See docs/ops-signing.md for details on the escrow transaction signing modes (delegated, custodial, stubbed).
- Multi-Tenancy Model: See docs/multi-tenancy.md for details on the multi-tenant architecture and data isolation constraints.
The documentation covers all public endpoints including health checks, invoice management, escrow operations, and investment opportunities.
- Marketplace: GET /api/marketplace - Search and sort invoices by yield, maturity, and funded ratio. Supports advanced filtering (yieldBpsMin, maturityDateTo, fundedRatioMin, etc.) and both cursor-based and offset pagination.

  Cursor pagination (recommended) — stable under inserts/deletes; use the nextCursor value from one response as the cursor param in the next request. Cursors are opaque and HMAC-signed; any modification returns 400.

  Offset pagination (legacy) — use page + limit as before. nextCursor and hasMore are also returned so clients can migrate incrementally.

  | Param | Mode | Description |
  |---|---|---|
  | cursor | Cursor | Opaque cursor from previous nextCursor; invalidates page |
  | limit | Both | Page size (1–100, default 10) |
  | page | Offset | 1-based page number (ignored when cursor present) |
  | sortBy | Both | yield_bps \| maturity_date \| funded_ratio \| amount \| created_at |
  | order | Both | asc \| desc (must be consistent across pages in cursor mode) |

Example — first page (cursor mode):
curl -H "Authorization: Bearer <token>" \
"http://localhost:3001/api/marketplace?sortBy=yield_bps&order=desc&limit=10"
# Response meta: { total, limit, hasMore, nextCursor }

Example — next page:
curl -H "Authorization: Bearer <token>" \
"http://localhost:3001/api/marketplace?sortBy=yield_bps&order=desc&limit=10&cursor=<nextCursor>"

Example — with filters (offset mode):
curl -H "Authorization: Bearer <token>" \
"http://localhost:3001/api/marketplace?yieldBpsMin=500&sortBy=yield_bps&order=desc&page=2&limit=10"

The src/middleware/smeAuth.js middleware binds Stellar wallet authorization strictly to the authenticated principal.
- authorizeSmeWallet resolves the wallet address only from req.user.walletAddress.
- The x-stellar-address header is not accepted as a wallet source. Any such header is silently ignored.
- Requests with no verified wallet bound to the account are rejected with an RFC 7807 403 Forbidden.
All wallet addresses are validated against ^G[A-Z2-7]{55}$ (Stellar Ed25519 public key format). Invalid formats yield a 400 before any capital-movement logic runs.
| Condition | Status | type URI |
|---|---|---|
| No req.user (unauthenticated) | 401 | .../probs/unauthorized |
| No wallet bound to account | 403 | .../probs/forbidden |
| Invalid address format | 400 | .../probs/validation-error |
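The decision table above can be expressed as a small pure function. This is a sketch of the logic only, not the smeAuth.js middleware: resolveSmeWallet and its return shape are hypothetical, and the regular expression checks the address format only, not the StrKey checksum.

```javascript
// Stellar Ed25519 public key format: "G" followed by 55 base32 characters.
const STELLAR_PUBLIC_KEY = /^G[A-Z2-7]{55}$/;

// Returns { ok: true, walletAddress } or { ok: false, status } per the table.
function resolveSmeWallet(req) {
  if (!req.user) return { ok: false, status: 401 };
  // The wallet comes only from the authenticated principal; request headers
  // such as x-stellar-address are never read.
  const wallet = req.user.walletAddress;
  if (!wallet) return { ok: false, status: 403 };
  if (typeof wallet !== 'string' || !STELLAR_PUBLIC_KEY.test(wallet)) {
    return { ok: false, status: 400 };
  }
  return { ok: true, walletAddress: wallet };
}
```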
- Header spoofing is eliminated: a caller cannot assert a wallet they do not control by supplying x-stellar-address.
- Wallet identity is derived from the JWT principal (req.user), which is set by the auth middleware before authorizeSmeWallet runs.
- req.user shape is unchanged; downstream KYC/tenant resolution is unaffected.
- verifyInvoiceOwner accepts ownership by either req.user.id matching invoice.ownerId or req.walletAddress matching invoice.smeWallet, preventing privilege escalation via wallet substitution.
Contract tests validate route responses against the generated OpenAPI schemas.
Coverage includes:
- Success response envelopes
- RFC 7807 problem responses
- Missing required fields
- Undocumented response fields
The generated OpenAPI specification remains the single source of truth.
LiquiFact uses tenant-scoped object storage and strict validation controls for invoice uploads.
Invoice files are stored using tenant and invoice scoped object keys:
tenants/{tenantId}/invoices/{invoiceId}/{uuid}-{filename}
Example:
tenants/tenant-123/invoices/inv-456/550e8400-e29b-41d4-a716-446655440000-invoice.pdf
Security benefits:
- Tenant isolation
- Invoice isolation
- UUID-based object naming
- Protection against object enumeration
- Prevention of cross-tenant object access
Uploaded filenames are sanitized before storage.
The storage layer:
- Rejects path traversal attempts (../)
- Rejects invalid filenames
- Removes null bytes
- Sanitizes special characters
- Truncates excessively long filenames
Examples:
../../etc/passwd -> rejected
..\..\windows\system32 -> rejected
invoice.pdf -> accepted
Tenant IDs are required for all SME upload operations, provided either via:
- X-Tenant-Id header (service-to-service), or
- tenantId JWT claim (authenticated users).
Tenant IDs and invoice IDs are validated before key generation.
Allowed characters:
a-z
A-Z
0-9
_
-
Rejected examples:
../../admin
tenant/admin
inv/123
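Putting the ID validation, filename sanitization, and key layout together, object-key generation could be sketched as below. The helper names (assertValidId, sanitizeFilename, buildObjectKey), the 128-character filename cap, and the choice of "_" as the replacement character are assumptions for illustration; only the key layout and the allowed ID characters come from the text above.

```javascript
const crypto = require('node:crypto');

// Allowed characters for tenant and invoice IDs: a-z, A-Z, 0-9, "_" and "-".
const ID_PATTERN = /^[A-Za-z0-9_-]+$/;

function assertValidId(label, id) {
  if (typeof id !== 'string' || !ID_PATTERN.test(id)) {
    throw new Error(`Invalid ${label}`);
  }
}

// Reject traversal outright, then normalise whatever remains.
function sanitizeFilename(filename, maxLength = 128) {
  const name = String(filename).replace(/\0/g, ''); // remove null bytes
  if (name.includes('..') || /[\\/]/.test(name)) {
    throw new Error('Invalid filename');
  }
  const safe = name.replace(/[^A-Za-z0-9._-]/g, '_').slice(0, maxLength);
  if (safe === '' || safe === '.') throw new Error('Invalid filename');
  return safe;
}

// tenants/{tenantId}/invoices/{invoiceId}/{uuid}-{filename}
function buildObjectKey(tenantId, invoiceId, filename) {
  assertValidId('tenant id', tenantId);
  assertValidId('invoice id', invoiceId);
  const safeName = sanitizeFilename(filename);
  return `tenants/${tenantId}/invoices/${invoiceId}/${crypto.randomUUID()}-${safeName}`;
}
```

Validating both IDs before interpolation is what keeps a value such as `../../admin` from ever appearing in a key.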
The POST /api/sme/invoice/presigned-url endpoint requires an Idempotency-Key header to prevent duplicate invoiceId creation and ensure retries don't generate new presigned URLs.
- Valid key format: 8–128 URL-safe characters (a-z, A-Z, 0-9, ., _, :, -)
- Same key + same body → returns cached response (same invoiceId and presigned URL)
- Same key + different body → returns 409 Conflict
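The three rules above can be sketched with an in-memory map keyed by the Idempotency-Key. This is an illustration only: createIdempotencyGuard, the outcome labels, and hashing the JSON-serialised body are assumptions, and a real deployment would need a shared, expiring store rather than a process-local Map.

```javascript
const crypto = require('node:crypto');

// 8 to 128 URL-safe characters: a-z, A-Z, 0-9, ".", "_", ":", "-".
const IDEMPOTENCY_KEY = /^[A-Za-z0-9._:-]{8,128}$/;

function createIdempotencyGuard() {
  const seen = new Map(); // key -> { bodyHash, response }

  // `produce` is called only for a first-time key and builds the response.
  return function handle(key, body, produce) {
    if (typeof key !== 'string' || !IDEMPOTENCY_KEY.test(key)) {
      return { outcome: 'invalid-key' };
    }
    // JSON.stringify is key-order sensitive; canonicalise first if clients
    // may reorder fields between retries.
    const bodyHash = crypto.createHash('sha256').update(JSON.stringify(body)).digest('hex');
    const prior = seen.get(key);
    if (prior) {
      return prior.bodyHash === bodyHash
        ? { outcome: 'replayed', response: prior.response }
        : { outcome: 'conflict', status: 409 };
    }
    const response = produce();
    seen.set(key, { bodyHash, response });
    return { outcome: 'created', response };
  };
}
```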
Supported invoice file types:
- application/pdf
- image/jpeg
- image/png
- image/tiff
Validation occurs during:
- Direct uploads
- Presigned URL generation
Unsupported MIME types are rejected before any storage operation occurs.
Invoice uploads are limited by:
BODY_LIMIT_INVOICE
Default: 512 KB
Files exceeding the configured limit are rejected.
Upload URLs:
15 minutes
Download URLs:
Default: 1 hour
Maximum: 24 hours
Requests outside the allowed expiry range are rejected.
The invoice upload subsystem includes:
- Path traversal protection
- MIME type allow-listing
- File size enforcement
- Tenant isolation
- Invoice isolation
- UUID object naming
- Presigned URL expiry limits
- AWS credential non-disclosure
- Server-side validation before S3 operations
- Prototype pollution prevention — sanitizeValue in src/utils/sanitization.js recursively strips __proto__, constructor, and prototype keys from every object and array in request body, query, and params before any downstream handler or Knex query sees the data. Depth and string-length caps bound processing cost for adversarially deep payloads.
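As an illustration of the last item, a recursive key-stripping pass might look like the sketch below. It is not the sanitizeValue implementation from src/utils/sanitization.js: the name sanitizeSketch, the default caps (depth 10, 10,000 characters), and the choice to drop over-deep branches are assumptions, and it only targets JSON-like data (plain objects, arrays, primitives).

```javascript
const FORBIDDEN_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

// Returns a cleaned copy; the input is never mutated.
function sanitizeSketch(value, depth = 0, maxDepth = 10, maxString = 10000) {
  if (depth > maxDepth) return undefined; // drop adversarially deep branches
  if (typeof value === 'string') return value.slice(0, maxString);
  if (Array.isArray(value)) {
    return value.map((item) => sanitizeSketch(item, depth + 1, maxDepth, maxString));
  }
  if (value !== null && typeof value === 'object') {
    const clean = {};
    for (const [key, child] of Object.entries(value)) {
      if (FORBIDDEN_KEYS.has(key)) continue; // strip pollution vectors
      clean[key] = sanitizeSketch(child, depth + 1, maxDepth, maxString);
    }
    return clean;
  }
  return value; // numbers, booleans, null, undefined pass through
}
```

JSON.parse creates `__proto__` as an ordinary own property, which is why the key has to be filtered explicitly rather than relying on the parser.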
Core routes currently covered:
- Health: GET /health
- API Info: GET /api
- Invoices: GET /api/invoices (with optional status filter), GET /api/invoices/:id, POST /api/invoices
- Escrow: GET /api/escrow/:invoiceId, POST /api/escrow
- Investment: GET /api/invest/opportunities
- SME Metrics: GET /api/sme/metrics
liquifact-backend/
├── src/
│ └── index.js
├── tests/
│ └── load/
│ ├── config.js
│ ├── reporter.js
│ ├── run-baselines.js
│ └── *.test.js
├── .env.example
├── eslint.config.js
└── package.json
For the full end-to-end model (indexer → projection → GET /api/escrow, funding via escrowSubmit, reconciliation, signing modes, and env contracts), see docs/escrow-integration-overview.md.
Nightly reconciliation compares the DB-side fundedTotal against the on-chain funded_amount for every invoice in linked_escrow, funded, or partially_funded state. Each run is persisted to the reconciliation_runs table and metrics are emitted to Prometheus.
| Component | Location | Notes |
|---|---|---|
| Job | src/jobs/reconcileEscrow.js | Injectable dbClient and escrowAdapter for testability |
| Metrics | src/metrics.js | Four new Prometheus instruments (see below) |
| History API | GET /api/admin/reconciliation/runs | Admin-only, tenant-scoped, paginated |
| Migration | migrations/20260429000000_create_reconciliation_runs.js | One row per run; reconciled_at indexed |
| Ops guide | docs/ops-reconcile.md | Full architecture, alerting rules, troubleshooting |
| Metric | Type | Description |
|---|---|---|
| escrow_reconciliation_mismatches_total | Counter | Cumulative per-invoice mismatch count (use with increase() in alerts) |
| escrow_reconciliation_mismatched_invoices | Gauge | Mismatch count in the most recent run |
| escrow_reconciliation_drift_magnitude | Gauge | Sum of … |
| escrow_reconciliation_drift_alerts_total | Counter | Runs that breached RECONCILIATION_DRIFT_THRESHOLD |
Suggested Prometheus alert rules:
# Alert on any drift detected in the last 26 hours:
increase(escrow_reconciliation_mismatches_total[26h]) > 0
# Alert when a run explicitly exceeded the configured threshold:
increase(escrow_reconciliation_drift_alerts_total[26h]) > 0
Set RECONCILIATION_DRIFT_THRESHOLD (integer, default 1) to control when a run is treated as a threshold breach. When the number of mismatches in a single run meets or exceeds this value:
- escrow_reconciliation_drift_alerts_total is incremented.
- An error-level structured log is emitted with mismatches, threshold, totalDrift, and reconciledAt fields.
| Variable | Default | Description |
|---|---|---|
| RECONCILIATION_DRIFT_THRESHOLD | 1 | Mismatch count that triggers a drift alert log and counter increment |
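The threshold rule can be illustrated with a small summary function. This is a sketch, not the logic in src/jobs/reconcileEscrow.js: summarizeRun, the onChainFunded field name, and computing totalDrift as the sum of absolute differences are assumptions. BigInt is used so large token amounts passed as strings do not lose precision.

```javascript
// Compare DB-side and on-chain amounts for each invoice and decide whether
// the run breaches the mismatch threshold (RECONCILIATION_DRIFT_THRESHOLD).
function summarizeRun(invoices, threshold = 1) {
  let mismatches = 0;
  let totalDrift = 0n;
  for (const { fundedTotal, onChainFunded } of invoices) {
    const drift = BigInt(fundedTotal) - BigInt(onChainFunded);
    if (drift !== 0n) {
      mismatches += 1;
      totalDrift += drift < 0n ? -drift : drift; // accumulate absolute drift
    }
  }
  return {
    total: invoices.length,
    matches: invoices.length - mismatches,
    mismatches,
    totalDrift: totalDrift.toString(),
    // "meets or exceeds" the configured threshold
    thresholdBreached: mismatches >= threshold,
  };
}
```

With the default threshold of 1, any single mismatch counts as a breach; raising the threshold tolerates a known number of in-flight discrepancies.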
GET /api/admin/reconciliation/runs
Authorization: Bearer <admin-token> (or X-API-Key)
X-Tenant-Id: <tenant-id>
Returns a paginated list of recent reconciliation run summaries, newest first. Per-invoice results are excluded from list rows — raw on-chain values are never leaked here.
Query parameters
| Param | Default | Range | Description |
|---|---|---|---|
| limit | 20 | 1–100 | Rows per page |
| page | 1 | ≥ 1 | 1-based page number |
Example response
{
"data": [
{
"id": "b1c2d3e4-...",
"total": 150,
"matches": 148,
"mismatches": 2,
"errors": 0,
"reconciled_at": "2026-06-25T02:00:00.000Z",
"created_at": "2026-06-25T02:00:01.000Z"
}
],
"meta": {
"total": 42,
"page": 1,
"limit": 20,
"totalPages": 3,
"hasMore": true,
"timestamp": "...",
"version": "0.1.0"
},
"message": "Reconciliation runs retrieved successfully."
}

Security
- Admin-only: requires a valid JWT bearer token or API key.
- Tenant-scoped: x-tenant-id header or JWT tenantId claim required.
- No raw on-chain values (contract addresses, XDR, ledger keys) are surfaced in any response or error.
node -e "require('./src/jobs/reconcileEscrow').performReconciliation().then(s => console.log(s))"

For full ops guidance (cron scheduling, Soroban RPC config, error handling, troubleshooting), see docs/ops-reconcile.md.
The API supports invoice-to-escrow contract address resolution using environment-based configuration for early phases. This allows mapping invoice IDs to their corresponding Stellar escrow contract addresses without requiring on-chain registry lookups.
Configure escrow mappings using the ESCROW_ADDR_BY_INVOICE environment variable:
ESCROW_ADDR_BY_INVOICE='{"mappings":[{"invoiceId":"inv_demo_001","escrowAddress":"GABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890ABCDEFGHIJKLM","environment":"development","isActive":true}],"defaultEnvironment":"development","allowlistEnabled":true,"cacheEnabled":true,"cacheTtlSeconds":300}'

- Allowlist Validation: Only mapped invoices can be resolved
- Environment Separation: Different mappings for development, staging, production
- Address Validation: Ensures Stellar addresses are properly formatted
- Caching: In-memory caching with configurable TTL
- Input Validation: Strict validation of invoice IDs and addresses
The mapping system is automatically used by escrow endpoints. When resolving /api/escrow/:invoiceId, the system:
- Validates the invoice ID format
- Checks if the invoice is in the allowlist for the current environment
- Returns the corresponding Stellar escrow contract address
- Caches the result for subsequent requests
For production deployments:
- Environment Separation: Use different mappings per environment
- Key Rotation: Update mappings by modifying the environment variable
- Monitoring: Use health checks to validate mapping configuration
- Security: Only map invoices you own or have explicit permission to map
{
"mappings": [
{
"invoiceId": "inv_123",
"escrowAddress": "GABC...123",
"environment": "development",
"isActive": true
}
],
"defaultEnvironment": "development",
"allowlistEnabled": true,
"cacheEnabled": true,
"cacheTtlSeconds": 300
}

The repo includes a focused load baseline suite for representative core endpoint reads:
- GET /health — health check (critical path)
- GET /api/invoices — invoice list
- GET /api/escrow/:invoiceId — escrow state read
- GET /api/marketplace — marketplace search (hot endpoint)
- GET /api/invest/opportunities — investment opportunities list (hot endpoint)
The suite uses autocannon and captures:
- total requests
- throughput in requests per second
- average latency
- p50 latency
- p95 latency
- p99 latency
- error count
- non-2xx count
- timeout count
These are the canonical health, invoices, escrow, marketplace, and invest endpoints currently exposed by the backend. Marketplace and invest/opportunities are prioritized as hot read endpoints with strict latency and error-rate assertions.
The load suite is intentionally safe by default:
- it targets http://127.0.0.1:3001
- it blocks remote targets unless ALLOW_REMOTE_LOAD_BASELINES=true
ALLOW_REMOTE_LOAD_BASELINES=true - it does not hardcode tokens or credentials
- it uses a placeholder escrow invoice id unless a fixture id is provided
Do not run the suite against production without explicit approval.
The suite includes strict load assertions for hot endpoints (marketplace, invest-opportunities):
| Endpoint | p99 Latency Ceiling | Max Error Rate |
|---|---|---|
| marketplace | 1000 ms | 1% |
| invest-opportunities | 1000 ms | 1% |
| invoices-list | 500 ms | 1% |
| escrow-read | 500 ms | 1% |
| health | 50 ms | 0% |
Assertions are gated behind ENABLE_LOAD_BASELINES=true to prevent automatic execution during the default test suite. Run them explicitly when validating performance targets.
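As an illustration of how the ceilings in the table could be checked against a run summary, here is a minimal sketch. The result shape (`totalRequests`, `errors`, `non2xx`, `timeouts`, `p99Ms`) and the way the error rate is derived are assumptions; the real assertions live in the test suite.

```js
// Thresholds copied from the table above.
const THRESHOLDS = {
  marketplace: { p99Ms: 1000, maxErrorRate: 0.01 },
  'invest-opportunities': { p99Ms: 1000, maxErrorRate: 0.01 },
  'invoices-list': { p99Ms: 500, maxErrorRate: 0.01 },
  'escrow-read': { p99Ms: 500, maxErrorRate: 0.01 },
  health: { p99Ms: 50, maxErrorRate: 0 },
};

// Hypothetical checker: `result` is an assumed summary object.
function checkBaseline(name, result) {
  const limit = THRESHOLDS[name];
  if (!limit) throw new Error(`No threshold defined for "${name}"`);
  // Assumption: errors, non-2xx responses, and timeouts all count as failures.
  const failed = result.errors + result.non2xx + result.timeouts;
  const errorRate = result.totalRequests > 0 ? failed / result.totalRequests : 1;
  const violations = [];
  if (result.p99Ms > limit.p99Ms) {
    violations.push(`p99 ${result.p99Ms} ms exceeds ${limit.p99Ms} ms`);
  }
  if (errorRate > limit.maxErrorRate) {
    violations.push(`error rate ${errorRate} exceeds ${limit.maxErrorRate}`);
  }
  return { ok: violations.length === 0, errorRate, violations };
}
```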
| Variable | Default | Purpose |
|---|---|---|
| `LOAD_BASE_URL` | `http://127.0.0.1:3001` | Base URL for the load target |
| `ALLOW_REMOTE_LOAD_BASELINES` | `false` | Explicit opt-in for non-local targets |
| `LOAD_DURATION_SECONDS` | `15` | Duration per endpoint scenario (seconds) |
| `LOAD_CONNECTIONS` | `10` | Concurrent connections per scenario |
| `LOAD_TIMEOUT_SECONDS` | `10` | Request timeout (seconds) |
| `LOAD_AUTH_TOKEN` | unset | Optional bearer token for protected endpoints |
| `LOAD_ESCROW_INVOICE_ID` | `placeholder-invoice` | Escrow fixture id |
| `LOAD_REPORT_DIR` | `tests/load/reports` | Directory for generated reports |
| `ENABLE_LOAD_BASELINES` | `false` | Gate for baseline assertion tests (marketplace, invest endpoints) |
1. Start the API locally: `npm run dev`
2. In another terminal, run the baseline suite: `npm run load:baseline`
3. Optional example with custom settings: `LOAD_DURATION_SECONDS=20 LOAD_CONNECTIONS=25 LOAD_ESCROW_INVOICE_ID=invoice-123 npm run load:baseline`
4. To run baseline assertion tests for hot endpoints (marketplace, invest): `ENABLE_LOAD_BASELINES=true npm test -- baselines.test.js`
The repository includes a reproducible one-command E2E smoke test script that uses Docker Compose to spin up a fully isolated environment including the API, a test Postgres database, and a mocked Soroban RPC server.
- Service health: `/health` (verifies API, DB reachability, and Soroban mock integration).
- Versioned API: `GET /v1/escrow/:invoiceId` (verifies token authentication and Soroban mock state).
- Backward compatibility: `GET /api/escrow/:invoiceId` (verifies deprecation warning headers).
Ensure you have Docker and Docker Compose installed.
npm run e2e:api

The script will:
1. Build and start the `api`, `db`, and `mock-soroban` services.
2. Wait for the API to report a healthy status.
3. Run the Jest smoke test suite against the live containers.
4. Clean up (shut down and remove) the containers and volumes.

- Isolated Environment: Uses a dedicated `docker-compose.e2e.yml` and a private network.
- Mocked Dependencies: Points `SOROBAN_RPC_URL` to a local mock server so that tests are fast, deterministic, and need no external network access.
- Fail-Fast Healthchecks: The API and DB services use Docker healthchecks so that dependent services only start when their dependencies are ready.
Each run generates:
- a JSON artifact
- a Markdown artifact
- a console summary
By default, reports are written to:
tests/load/reports/
│   ├── config/
│   │   └── cors.js              # CORS allowlist parsing and policy
│   ├── middleware/
│   │   ├── auth.js              # JWT authentication middleware
│   │   ├── audit.js             # Immutable audit logging for mutations
│   │   ├── deprecation.js       # API deprecation notices
│   │   ├── errorHandler.js      # Centralized error handling
│   │   ├── rateLimit.js         # Rate limiting enforcement
│   │   └── stacks.js            # Composed middleware stacks (authenticatedTenantStack, adminStack)
│   ├── services/
│   │   ├── invoiceService.js    # Business logic and pagination
│   │   └── soroban.js           # Contract interaction wrappers
│   ├── utils/
│   │   ├── asyncHandler.js      # Express async error wrapper
│   │   └── retry.js             # Exponential backoff utility
│   ├── app.js                   # Express app, middleware, routes
│   └── index.js                 # Runtime bootstrap
├── tests/
│   ├── setup.js                 # Test configuration
│   ├── helpers/
│   │   └── createTestApp.js     # Test app factory
│   ├── unit/
│   │   ├── asyncHandler.test.js
│   │   └── errorHandler.test.js
│   └── app.test.js
├── .env.example                 # Env template
├── eslint.config.js
└── package.json
---
## Resiliency & Retries
### Circuit Breaker (`src/utils/circuitBreaker.js`)
The project uses a circuit breaker pattern to protect against cascading failures from unstable external dependencies
(Soroban RPC, Redis, KYC provider, etc.).
#### State lifecycle
CLOSED → (failureThreshold reached) → OPEN → (recoveryTimeout elapsed) → HALF_OPEN
HALF_OPEN → (success) → CLOSED
HALF_OPEN → (failure) → OPEN
- **CLOSED** — Normal operation; requests pass through to the dependency.
- **OPEN** — Requests fail fast (or return a fallback) without calling the dependency.
- **HALF_OPEN** — A single probe request is allowed; success recovers the breaker, failure re-opens it.
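To make the three states concrete, here is a minimal state-machine sketch. It is not the code in `src/utils/circuitBreaker.js`: the class name `SketchBreaker`, its method names, and the injectable `now` clock are all assumptions made for illustration.

```js
// Illustrative only: a tiny breaker that models the transitions listed above.
class SketchBreaker {
  constructor({ failureThreshold = 5, recoveryTimeout = 10000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.recoveryTimeout = recoveryTimeout;
    this.now = now; // injectable clock (assumption, eases testing)
    this.state = 'CLOSED';
    this.failureCount = 0;
    this.openedAt = 0;
  }

  // Returns true if a request may be sent to the dependency right now.
  allowRequest() {
    if (this.state === 'OPEN' && this.now() - this.openedAt >= this.recoveryTimeout) {
      this.state = 'HALF_OPEN';
      return true; // the single probe request
    }
    return this.state === 'CLOSED';
  }

  recordSuccess() {
    this.state = 'CLOSED';
    this.failureCount = 0;
  }

  recordFailure() {
    this.failureCount += 1;
    // A failed probe re-opens immediately; otherwise trip at the threshold.
    if (this.state === 'HALF_OPEN' || this.failureCount >= this.failureThreshold) {
      this.state = 'OPEN';
      this.openedAt = this.now();
    }
  }
}
```

A caller would check `allowRequest()` before each call and report the outcome with `recordSuccess()` or `recordFailure()`; while the probe is outstanding, further requests are refused.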
#### Options
| Option | Default | Description |
|--------|---------|-------------|
| `name` | `'default'` | Unique label attached to Prometheus metrics so each dependency is distinguishable |
| `failureThreshold` | `5` | Consecutive failures before the breaker trips to OPEN |
| `recoveryTimeout` | `10000` | Milliseconds before OPEN → HALF_OPEN transition |
| `fallbackLogic` | `null` | Function returning alternative data when the circuit is OPEN |
| `onStateChange` | `null` | Callback `(oldState, newState)` fired on every state transition |
#### `reset()` method
Forces the breaker back to the **CLOSED** state and clears the failure count. Operators can call `reset()`
after deploying a fix to a dependency (e.g., restarting Soroban RPC, redeploying Redis) without waiting
for the recovery timeout.
```js
breaker.reset();
console.log(breaker.state);        // 'CLOSED'
console.log(breaker.failureCount); // 0
```

Security: `reset()` is an instance method, so it cannot be triggered by untrusted HTTP input. No external caller can force-reset a breaker that it does not hold a reference to.
Every state transition emits a Prometheus counter:
soroban_circuit_breaker_state_transitions_total{breaker_name="soroban",state="OPEN"}
Labels:
- `breaker_name`: distinguishes breakers per dependency (`soroban`, `redis`, `kyc`, …)
- `state`: the target state (`CLOSED`, `OPEN`, `HALF_OPEN`)
Cardinality is bounded: (#breaker names) × (3 states). The counter is defined in src/metrics.js and
shims gracefully when prom-client is not installed (no throws, no-ops).
Creating a named breaker:
const { CircuitBreaker } = require('./utils/circuitBreaker');

const redisBreaker = new CircuitBreaker({
  name: 'redis',
  failureThreshold: 3,
  recoveryTimeout: 5000,
  fallbackLogic: () => null,
});

Soroban RPC reads are routed through the shared breaker (`src/services/soroban.js`, name `soroban`).
Configuration is read from environment variables:
| Variable | Default | Description |
|---|---|---|
| `SOROBAN_CB_FAILURE_THRESHOLD` | `5` | Consecutive failures before tripping |
| `SOROBAN_CB_RECOVERY_TIMEOUT` | `10000` | Milliseconds before half-open probe |
- Remote load targets are blocked by default.
- Secrets and tokens must come from environment variables.
- The suite never prints auth tokens.
- If protected endpoints are added later, use least-privilege non-production credentials.
- The selected baseline endpoints are low-risk reads to avoid destructive behavior.
- missing base URL falls back to a safe local default
- remote targets require explicit opt-in
- invalid concurrency, duration, or timeout values are rejected
- missing auth token is handled gracefully
- missing escrow fixture id falls back to a placeholder
- partial endpoint failures are still captured in the report
- This suite establishes baselines, not maximum capacity.
- Results depend on local machine resources and runtime conditions.
- The invoices and escrow endpoints are currently placeholders, so these baselines should be treated as early reference points rather than production sizing data.
All API failures now return a consistent structured error payload:
{
"error": {
"code": "VALIDATION_ERROR",
"message": "Malformed JSON request body.",
"correlation_id": "req_f7d1b9f6c0f1459d8b3b7b6a",
"retryable": false,
"retry_hint": "Fix the JSON payload and try again."
}
}

- `code`: stable machine-readable error code
- `message`: safe human-readable message
- `correlation_id`: per-request identifier for debugging and support
- `retryable`: whether the caller may safely retry
- `retry_hint`: safe retry guidance

- `VALIDATION_ERROR`
- `AUTHENTICATION_REQUIRED`
- `FORBIDDEN`
- `NOT_FOUND`
- `UPSTREAM_ERROR`
- `INTERNAL_SERVER_ERROR`
- Every request receives a correlation ID.
- The API returns it in both the response body and the `X-Correlation-Id` header.
- If a client sends `X-Correlation-Id` and it matches the accepted pattern, the value is echoed back.
- Invalid client-supplied IDs are ignored and replaced with a generated ID.
- Each request also gets a per-request child logger attached to `req.log`, which carries the request ID and correlation ID in structured log fields without binding secrets or request payloads.
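The echo-or-generate behaviour can be sketched as Express-style middleware. The accepted pattern below is an assumption (the README does not state it), and the generated format simply mirrors the `req_` plus 24 hex characters seen in the example payloads; the function names are hypothetical.

```js
const crypto = require('crypto');

// Assumed acceptance pattern; the real one is defined in the backend.
const ACCEPTED_ID = /^[A-Za-z0-9_-]{8,64}$/;

// Echo a well-formed client ID, otherwise generate a fresh one.
function resolveCorrelationId(headerValue) {
  if (typeof headerValue === 'string' && ACCEPTED_ID.test(headerValue)) {
    return headerValue;
  }
  return `req_${crypto.randomBytes(12).toString('hex')}`;
}

// Hypothetical middleware wiring for an Express app.
function correlationIdMiddleware(req, res, next) {
  const id = resolveCorrelationId(req.headers['x-correlation-id']);
  req.correlationId = id;
  res.setHeader('X-Correlation-Id', id);
  next();
}
```

Validating before echoing keeps attacker-controlled header values out of logs and response headers.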
The centralized mapper covers:
- malformed JSON
- validation failures
- authorization and authentication failures
- not found responses
- upstream connection failures
- unexpected thrown errors
- non-`Error` thrown values
- Internal stack traces and raw exception details are never returned to clients.
- Correlation IDs are sanitized and do not expose internal state.
- Retry hints are generic and do not leak infrastructure details.
- Server-side logs include correlation context without returning sensitive internals in responses.
Validation error:
{
"error": {
"code": "VALIDATION_ERROR",
"message": "Invoice payload must be a JSON object.",
"correlation_id": "req_d3b92b4d2d554f33b8d8b089",
"retryable": false,
"retry_hint": "Send a valid JSON object in the request body and try again."
}
}

Unexpected error:
{
"error": {
"code": "INTERNAL_SERVER_ERROR",
"message": "An internal server error occurred.",
"correlation_id": "req_3d5d8c9e4ff34dd9aa73b946",
"retryable": false,
"retry_hint": "Do not retry until the issue is resolved or support is contacted."
}
}

The backend supports durable idempotency keys for funding operations, so clients can safely retry requests without risking double-funding.
- Send an `Idempotency-Key` header with each distinct funding request. The key must be an 8-128 character URL-safe string.
- First use: The backend processes the request and persists the key along with a SHA-256 hash of the payload and the resulting response.
- Identical retries: Resending the same key with the same payload short-circuits and replays the cached response.
- Conflicting payload: Resending the same key with a different payload body results in a `409 Conflict` containing an RFC 7807 `application/problem+json` error envelope.
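The first-use, replay, and conflict rules above can be sketched with an in-memory `Map` standing in for the durable table. Everything named here (`handleFunding`, `hashPayload`, the key pattern, the problem body fields) is an assumption for illustration, not the backend's implementation.

```js
const crypto = require('crypto');

// "URL-safe" is interpreted here as this character set (an assumption).
const KEY_PATTERN = /^[A-Za-z0-9_-]{8,128}$/;

// Hash of the serialized payload. JSON.stringify is key-order sensitive;
// a real implementation may canonicalize the body first.
function hashPayload(body) {
  return crypto.createHash('sha256').update(JSON.stringify(body)).digest('hex');
}

// Hypothetical handler: `store` is a Map, `processFn` performs the funding.
function handleFunding(store, key, body, processFn) {
  if (typeof key !== 'string' || !KEY_PATTERN.test(key)) {
    return { status: 400, replayed: false, body: { title: 'Invalid Idempotency-Key' } };
  }
  const payloadHash = hashPayload(body);
  const existing = store.get(key);
  if (existing) {
    if (existing.payloadHash !== payloadHash) {
      // Same key, different payload: RFC 7807-style problem body.
      return {
        status: 409,
        replayed: false,
        body: { type: 'about:blank', title: 'Conflict', status: 409 },
      };
    }
    return { ...existing.response, replayed: true }; // replay stored response
  }
  const response = processFn(body);
  store.set(key, { payloadHash, response });
  return { ...response, replayed: false };
}
```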
- Keys expire after a configurable TTL (default 24 hours, overridable via `IDEMPOTENCY_KEY_TTL_HOURS`).
- Expired keys are automatically purged to save database space, governed by the `expires_at` index.
src/services/investorCommitment.js persists funding intents from the POST /api/invest/fund-invoice flow and exposes an in-memory lock store for the GET /api/investor/locks routes.
amountStroops is the on-chain principal unit. The service enforces strict format rules before any database write:
| Rule | Detail |
|---|---|
| Type | Must be a string — numeric types are rejected, never coerced |
| Format | Digits only — no decimal point, no sign, no scientific notation |
| Leading zeros | Rejected (e.g. "007") |
| Range | Must be > 0 and ≤ 10^18 stroops (= 100 billion XLM at 10^7 stroops per XLM) |
Any violation throws a CommitmentValidationError with a typed .code:
| Code | Trigger |
|---|---|
| `INVALID_AMOUNT_TYPE` | Value is not a string |
| `INVALID_AMOUNT_FORMAT` | Contains non-digit characters or leading zeros |
| `INVALID_AMOUNT_RANGE` | Value is zero |
| `INVALID_AMOUNT_OVERFLOW` | Value exceeds the upper bound |
| `INVALID_INVESTOR_ADDRESS` | Investor address fails Stellar address validation |
| `AMOUNT_IMMUTABLE` | Caller attempted to update `amount_stroops` via `updateCommitment` |
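The amount rules and their error codes can be sketched as follows. This is an illustration under assumptions: the error class here is a local stand-in for the service's `CommitmentValidationError`, and `validateAmountStroops` is a hypothetical name.

```js
// Upper bound from the rules table: 10^18 stroops.
const MAX_STROOPS = 10n ** 18n;

// Local stand-in for the service's error type (assumption).
class CommitmentValidationError extends Error {
  constructor(code, message) {
    super(message);
    this.name = 'CommitmentValidationError';
    this.code = code;
  }
}

// Hypothetical validator: returns the amount as a BigInt or throws.
function validateAmountStroops(value) {
  if (typeof value !== 'string') {
    throw new CommitmentValidationError('INVALID_AMOUNT_TYPE', 'amountStroops must be a string');
  }
  // Digits only, and no leading zeros (a lone "0" is caught by the range check).
  if (!/^[0-9]+$/.test(value) || (value.length > 1 && value.startsWith('0'))) {
    throw new CommitmentValidationError('INVALID_AMOUNT_FORMAT', 'amountStroops must be plain digits');
  }
  const amount = BigInt(value); // BigInt avoids precision loss above 2^53
  if (amount === 0n) {
    throw new CommitmentValidationError('INVALID_AMOUNT_RANGE', 'amountStroops must be > 0');
  }
  if (amount > MAX_STROOPS) {
    throw new CommitmentValidationError('INVALID_AMOUNT_OVERFLOW', 'amountStroops exceeds the upper bound');
  }
  return amount;
}
```

Rejecting numeric types outright, rather than coercing them, matters because JavaScript numbers lose integer precision well below the upper bound.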
validateAddress(address) checks that the investor address is a valid Stellar public key (G… or C… prefix, 56 base-32 characters). It returns { valid, reason } and is also called from the GET /api/investor/locks routes to validate the funderAddress query parameter.
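A format-only version of that check might look like the sketch below. It covers only what the paragraph describes (prefix, length, base-32 alphabet); Stellar addresses also embed a checksum, which this sketch does not verify. The name `validateAddressFormat` and the reason strings are assumptions.

```js
// G or C prefix followed by 55 more base-32 characters (A-Z, 2-7): 56 total.
const STELLAR_ADDRESS = /^[GC][A-Z2-7]{55}$/;

// Hypothetical format check returning the { valid, reason } shape.
function validateAddressFormat(address) {
  if (typeof address !== 'string') {
    return { valid: false, reason: 'address must be a string' };
  }
  if (address.length !== 56) {
    return { valid: false, reason: 'address must be 56 characters' };
  }
  if (!STELLAR_ADDRESS.test(address)) {
    return { valid: false, reason: 'address must start with G or C and use base-32 characters' };
  }
  return { valid: true, reason: null };
}
```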
persistCommitment accepts an optional idempotencyKey. When a row with that key already exists the function returns it immediately — no second insert is made. updateCommitment refuses to modify amount_stroops to prevent silent corruption of commitment records.
The service maintains a Map-backed lock cache (claimNotBefore, investorEffectiveYieldBps) mirrored from the DB. All cached entries carry stale: true because they are not read live from the chain.
| Function | Purpose |
|---|---|
| `seedInvestorLocks()` | Populate representative data (used in tests) |
| `clearInvestorLocks()` | Wipe the cache (used between test suites) |
| `setInvestorLock(params)` | Upsert a lock record |
| `getInvestorLock(invoiceId, funderAddress)` | Look up a single lock |
| `getInvestorLocksByAddress(funderAddress, opts)` | Filter by funder address |
| `getAllInvestorLocks(opts)` | List all locks |
| Method | Path | Description |
|---|---|---|
| `GET` | `/api/investor/locks` | List locks, optional `funderAddress` / `invoiceId` filters |
| `GET` | `/api/investor/locks/:invoiceId` | Single lock for a specific invoice and funder |
Both routes require a valid JWT (Authorization: Bearer <token>). An invalid funderAddress returns 400 with { error: "invalid Stellar address: …" }.
The backend now supports a database-backed append-only audit log for:
- admin actions (for example, KYC state transitions or key-rotation operations)
- webhook dispatch outcomes (success/failure with redacted payload fields)
- retention policy and legal-hold mutations (create/update/release with before/after snapshots)
Run SQL migrations in order:
1. `migrations/202604260001_create_audit_log_events.sql`
2. `migrations/202604260002_enforce_audit_log_append_only.sql`
audit_log_events is enforced as append-only at the database layer via triggers that reject UPDATE and DELETE.
- `src/middleware/auditLog.js` attaches `req.audit` helpers:
  - `req.audit.logAdminAction(...)`
  - `req.audit.logWebhookDelivery(...)`
  - `req.audit.logRetentionMutation(...)` / `emitRetentionAuditSafely(req, ...)` for retention routes
- successful `POST|PUT|PATCH|DELETE` requests under `/api/admin/*` are auto-logged
- retention routes (`POST/PUT /api/retention/policies`, `POST /api/retention/legal-holds`, `POST .../release`) emit `retention_mutation` events with tenant-scoped metadata
- sensitive fields are redacted before persistence (`password`, `token`, `secret`, `apiKey`, `privateKey`, etc.)
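Redaction before persistence can be sketched as a recursive walk over the payload. The key list below contains only the names given above (the source's "etc." is not expanded), and the function name, case-insensitive matching, and `[REDACTED]` placeholder are assumptions.

```js
// Key names from the list above, lower-cased for case-insensitive matching.
const SENSITIVE_KEYS = new Set(['password', 'token', 'secret', 'apikey', 'privatekey']);

// Hypothetical helper: returns a copy with sensitive values replaced.
function redactSensitive(value) {
  if (Array.isArray(value)) {
    return value.map(redactSensitive);
  }
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) => [
        key,
        SENSITIVE_KEYS.has(key.toLowerCase()) ? '[REDACTED]' : redactSensitive(inner),
      ])
    );
  }
  return value; // primitives pass through unchanged
}
```

Because the function returns a copy, the original request body is left untouched for the handler that still needs it.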
GET /api/admin/audit/invoices/:invoiceId/export accepts a format query parameter:
| `format` | Behaviour |
|---|---|
| `json` (default) | Returns a paginated JSON array. The `limit` query param (default 50, max 500) controls the page size. |
| `csv` | Streaming: rows are emitted directly from the database cursor and piped to the HTTP response. The full result set is never buffered in memory, making this safe for arbitrarily large audit trails. |
PostgreSQL cursor (Knex .stream())
→ createCsvTransform() ← object-mode Transform, writes header on first row
→ res (HTTP response)
Both ends of the pipeline attach error listeners. If the database stream or the transform errors after headers have been flushed, the socket is destroyed cleanly to avoid a hanging client connection.
Every CSV field is processed by escapeCsvField() in src/services/auditLogStore.js:
- Leading-whitespace normalisation: the field is checked after stripping leading whitespace (`trimStart()`), so values like ` =HYPERLINK(...)` or `\t=cmd` are caught even when the dangerous character is not in position 0.
- Leading-character neutralisation: cells whose first non-whitespace character is `=`, `+`, `-`, `@`, `|`, TAB, or CR are prefixed with a single quote (`'`). This covers the full OWASP CSV Injection list and prevents spreadsheet software (Excel, LibreOffice Calc, Google Sheets) from interpreting the cell as a formula or DDE command.
- RFC 4180 quoting: fields containing commas, double-quotes, or newlines are wrapped in double-quotes; embedded double-quotes are doubled (`"` → `""`).
Tenant scoping is enforced at the database level using a whereRaw filter on the JSONB metadata column:
WHERE metadata->>'tenantId' = ?

No cross-tenant row is ever loaded into application memory.
Content-Type: text/csv
Content-Disposition: attachment; filename="audit-<invoiceId>.csv"
id, timestamp, actor, action, resourceType, resourceId, statusCode, ipAddress, userAgent
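The CSV escaping rules described earlier for `escapeCsvField()` can be approximated with the sketch below. It is not the code in `src/services/auditLogStore.js`; in particular, how TAB and CR interact with `trimStart()` is an assumption (here they are checked on the raw first character).

```js
// Characters that can start a spreadsheet formula or DDE command.
const FORMULA_TRIGGERS = new Set(['=', '+', '-', '@', '|']);

// Hypothetical re-implementation for illustration only.
function escapeCsvFieldSketch(value) {
  let text = value === null || value === undefined ? '' : String(value);
  const firstVisible = text.trimStart().charAt(0);
  if (FORMULA_TRIGGERS.has(firstVisible) || text.startsWith('\t') || text.startsWith('\r')) {
    text = `'${text}`; // neutralise: spreadsheets treat the cell as text
  }
  if (/[",\r\n]/.test(text)) {
    text = `"${text.replace(/"/g, '""')}"`; // RFC 4180 quoting
  }
  return text;
}
```

Note that under these rules a legitimate negative number such as `-5` is also prefixed, which is the usual trade-off of this defence.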
Admin action logging:
curl -X POST http://localhost:3001/api/admin/kyc/cus_42/approve \
-H "Authorization: Bearer <admin-jwt>" \
-H "x-admin-action: kyc.approve" \
-H "x-audit-target-type: kyc_profile" \
-H "x-audit-target-id: cus_42" \
-H "Content-Type: application/json" \
-d '{"reason":"manual review","privateKey":"redacted-at-write-time"}'

Streaming CSV export:
curl -H "Authorization: Bearer <admin-jwt>" \
-H "x-tenant-id: tenant-alpha" \
"http://localhost:3001/api/admin/audit/invoices/inv-001/export?format=csv" \
-o audit-inv-001.csv

JSON export (paginated):
curl -H "Authorization: Bearer <admin-jwt>" \
-H "x-tenant-id: tenant-alpha" \
"http://localhost:3001/api/admin/audit/invoices/inv-001/export?format=json&limit=100"

All API failures return a structured error payload:
{
"error": {
"code": "VALIDATION_ERROR",
"message": "Malformed JSON request body.",
"correlation_id": "req_f7d1b9f6c0f1459d8b3b7b6a",
"retryable": false,
"retry_hint": "Fix the JSON payload and try again."
}
}

- `VALIDATION_ERROR`
- `AUTHENTICATION_REQUIRED`
- `INVALID_TOKEN`
- `FORBIDDEN`
- `NOT_FOUND`
- `RATE_LIMITED`
- `UPSTREAM_ERROR`
- `INTERNAL_SERVER_ERROR`

- Internal stack traces and raw exception details are never returned to clients.
- Correlation IDs are sanitized.
- Retry hints are generic and do not leak infrastructure details.
The repo includes a focused negative security test suite for middleware hardening.
- unauthorized requests with no `Authorization` header
- malformed `Authorization` header formats
- invalid or tampered Bearer tokens
- rate-limited abuse against a representative protected endpoint
- non-leakage checks for error bodies and headers
- public-route behavior when malformed auth headers are present
GitHub Actions runs on push and pull requests to main:
- Lint: `npm run lint`
- Build check: `node --check src/index.js`
1. Fork the repo and clone your fork.
2. Create a branch from `main`.
3. Run `npm install`.
4. Make focused changes and keep style consistent.
5. Run `npm run lint`, `npm test`, and any relevant local checks.
6. Push your branch and open a pull request.
We welcome docs improvements, bug fixes, and new API endpoints aligned with LiquiFact product goals.
See CONTRIBUTING.md for branch naming, local checks, testing expectations, CI behavior, and pull request guidance.
The backend sends maturity reminders to relevant parties before invoices reach their settlement date. Email delivery includes built-in resiliency:
- Exponential backoff: Transient SMTP failures (4xx, network errors) are automatically retried with configurable backoff (default: 3 attempts, ~1s base delay, doubling each attempt)
- Error classification: Permanent SMTP failures (5xx, invalid recipient) fail immediately without retry to avoid wasting resources
- Dead-lettering: Emails that fail after all retries are recorded as sanitized rows in `maturity_reminder_dead_letters` for durable inspection
- Observability: Prometheus counters track delivery attempts, successes, and dead-lettered messages with fine-grained failure reasons
| Variable | Default | Description |
|---|---|---|
| `SMTP_HOST` | - | SMTP server hostname |
| `SMTP_PORT` | 587 | SMTP server port |
| `SMTP_USER` | - | SMTP authenticated username |
| `SMTP_PASS` | - | SMTP authenticated password |
| `SMTP_FROM` | `noreply@liquifact.com` | Sender email address |
| `SMTP_MAX_RETRIES` | 3 | Maximum retry attempts for transient failures |
When SMTP_HOST is unset, the system runs in dry-run mode (logs to console instead of sending real emails), which is ideal for local development and CI testing.
Three Prometheus counters track reminder delivery:
maturity_reminder_delivery_attempts_total{job_type="maturity_reminder"} # Each attempt (including retries)
maturity_reminder_delivery_success_total{job_type="maturity_reminder"} # Successful deliveries
maturity_reminder_dead_letter_total{job_type,reason} # Dead-lettered reminders
├─ reason="permanent_error" # Permanent SMTP failures (5xx)
└─ reason="max_retries_exceeded" # Exhausted all transient retries
See docs/email-ops.md for full technical details on retry logic, error classification, and dead-letter queue management.
LiquiFact delivers signed webhook callbacks to tenant-configured endpoints whenever an invoice transitions between states (e.g. pending → approved, approved → linked_escrow).
1. State transition: `invoiceStateMachine.executeTransition` completes successfully.
2. Job enqueue: `enqueueWebhookDelivery` looks up the tenant's `webhook_url` / `webhook_secret` from the database and enqueues a `webhook_delivery` job via the shared `BackgroundWorker`.
3. Signed delivery: the `webhookDelivery` job handler constructs a deterministically-sorted JSON payload, signs it with HMAC-SHA256 (`v1` scheme), and POSTs it with an `X-Signature` header.
4. Retry: transient failures (network errors, HTTP 5xx) are retried with bounded exponential backoff. Non-retriable failures (HTTP 4xx) are not retried.
5. Dead-letter: after exhausting all retry attempts the delivery is written to `webhook_dead_letters` and a Prometheus counter is incremented.
Every webhook request carries an X-Signature header in the format:
t=<unix_timestamp>,v1=<hmac_sha256_hex>
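For orientation, the sending side of this scheme could be sketched as below. The helper names `sortKeys` and `signWebhook` are hypothetical, and recursive key sorting is an assumption inferred from the sorted nested keys in the example payload; the actual job handler may differ.

```js
const crypto = require('crypto');

// Recursively sort object keys so serialization is deterministic.
function sortKeys(value) {
  if (Array.isArray(value)) return value.map(sortKeys);
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.keys(value).sort().map((key) => [key, sortKeys(value[key])])
    );
  }
  return value;
}

// Hypothetical signer: returns the raw body to POST and the header value.
function signWebhook(secret, payload, nowMs = Date.now()) {
  const t = Math.floor(nowMs / 1000); // seconds since epoch
  const rawBody = JSON.stringify(sortKeys(payload));
  const v1 = crypto
    .createHmac('sha256', secret)
    .update(`${t}.${rawBody}`)
    .digest('hex');
  return { rawBody, header: `t=${t},v1=${v1}` };
}
```

The receiver must verify against the exact bytes it received, so the signed string is built from `rawBody` rather than from a re-serialized object.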
To verify on the receiving end:
1. Extract `t` (timestamp, seconds since epoch) and `v1` (hex signature) from the header.
2. Reject if `|now_ms − t × 1000| > 300000` (5-minute tolerance window).
3. Compute the expected signature: `HMAC-SHA256(secret, "<t>.<raw_body>")`
4. Compare using a constant-time function (e.g. `crypto.timingSafeEqual`) to prevent timing attacks.
5. Reject if the signatures do not match.
Example (Node.js receiver):
const crypto = require('crypto');

function verifyWebhook(secret, rawBody, signatureHeader) {
  // Parse "t=<unix_timestamp>,v1=<hex>" into { t, v1 }.
  const parts = Object.fromEntries(
    String(signatureHeader).split(',').map((p) => p.trim().split('='))
  );
  const ts = Number.parseInt(parts.t, 10);
  if (!Number.isFinite(ts) || typeof parts.v1 !== 'string') {
    return false; // malformed header
  }
  if (Math.abs(Date.now() - ts * 1000) > 5 * 60 * 1000) {
    return false; // replay / clock-skew rejected
  }
  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${ts}.${rawBody}`)
    .digest();
  const received = Buffer.from(parts.v1, 'hex');
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return received.length === expected.length &&
    crypto.timingSafeEqual(received, expected);
}

Keys are always sorted alphabetically (deterministic) to simplify signature verification on any platform.
| Variable | Default | Description |
|---|---|---|
| `WEBHOOK_MAX_RETRIES` | `3` | Max retry attempts after the first failure |
| `WEBHOOK_BASE_DELAY` | `500` | Base exponential-backoff delay (ms) |
| `WEBHOOK_MAX_DELAY` | `10000` | Maximum backoff delay cap (ms) |
| `WEBHOOK_TIMEOUT_MS` | `5000` | Per-request HTTP timeout (ms) |
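To show how the base delay and cap interact, here is a minimal backoff sketch using the defaults from the table. The doubling formula and the absence of jitter are assumptions; the worker's actual schedule may differ.

```js
// Hypothetical helper: delay before the given retry (1-based), in ms.
function backoffDelayMs(retryNumber, { baseDelay = 500, maxDelay = 10000 } = {}) {
  if (!Number.isInteger(retryNumber) || retryNumber < 1) {
    throw new RangeError('retryNumber must be a positive integer');
  }
  // Double the delay on each retry, then clamp to the configured cap.
  return Math.min(maxDelay, baseDelay * 2 ** (retryNumber - 1));
}

// Delays for the default of 3 retries.
const schedule = [1, 2, 3].map((n) => backoffDelayMs(n));
console.log(schedule); // [ 500, 1000, 2000 ]
```

With these defaults the cap only takes effect from the sixth retry onward, so it matters mainly when `WEBHOOK_MAX_RETRIES` is raised.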
Configure per-tenant webhook delivery by storing webhook_url and webhook_secret in the tenants.settings JSONB column:
UPDATE tenants
SET settings = settings || '{"webhook_url":"https://your.endpoint/cb","webhook_secret":"<strong-random-secret>"}'
WHERE id = 'your-tenant-id';

Security: Generate `webhook_secret` with at least 32 bytes of cryptographic randomness (e.g. `openssl rand -hex 32`). Rotate secrets by updating the column. In-flight jobs will fail safe and dead-letter, then delivery resumes automatically on the next enqueue.
- Secrets and full target URLs are never logged at `info` level.
- Signature comparison uses `crypto.timingSafeEqual`, which avoids timing side-channels.
- The 5-minute timestamp tolerance limits the window for replay attacks.
- Webhook delivery failures never affect the outcome of a state transition.
- Dead-lettered deliveries are stored in `webhook_dead_letters` for ops inspection.
The background job queue (src/workers/jobQueue.js) is in-memory by default. An optional
durable backing persists every job transition to the background_jobs table so that queued
and in-flight jobs survive a process crash.
# .env
JOB_QUEUE_PERSISTENCE_ENABLED=true
DATABASE_URL=postgresql://user:pass@localhost:5432/liquifact

Run the migration before starting the server:
npm run db:migrate   # applies migrations/20260625000000_create_background_jobs.sql

| Event | In-memory | With persistence |
|---|---|---|
| `enqueue` | adds to queue | + INSERT into `background_jobs` |
| `dequeue` | marks PROCESSING | + UPDATE status → processing |
| `ack` | marks COMPLETED | + UPDATE status + stamps `acked_at` |
| `retry` | marks RETRYING / FAILED | + UPDATE status |
| restart | queue is empty | unacked rows are requeued automatically |
On startup, worker.start() calls JobQueue.restoreFromPersistence() which SELECTs rows
where status IN ('pending','processing','retrying') AND acked_at IS NULL, then requeues
them into the in-memory structures before the poll loop begins.
- At-least-once delivery: jobs that were dequeued but never acked (in-flight at crash time) are requeued on the next startup.
- No double-run of acked jobs: `acked_at` is stamped atomically with `status=completed`; recovery unconditionally skips any row with `acked_at IS NOT NULL`.
- Bounded recovery: at most `JOB_QUEUE_MAX_RECOVERY_ROWS` (default 1000) rows are fetched per startup, preventing an unbounded DB scan from blocking the process.
- Hard retry cap preserved: `maxRetries` (default 3, hard max 10) is enforced identically whether persistence is on or off.
- Payload validation on restore: each recovered payload is round-tripped through `JSON.parse(JSON.stringify())` before re-enqueueing; rows with corrupt payloads are skipped and logged.
- DB failures are non-fatal: all persistence calls are fire-and-forget with internal error logging. A DB outage degrades to the in-memory path; it never crashes the worker.
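The recovery guarantees can be illustrated with an in-memory filter over already-fetched rows. In the real adapter the status and `acked_at` conditions are part of the SQL query; the function name and the `id` / `payload` field names here are assumptions.

```js
// Statuses eligible for recovery, as listed in the restore query.
const RECOVERABLE = new Set(['pending', 'processing', 'retrying']);

// Hypothetical helper: pick the rows that should be requeued on startup.
function selectRecoverableJobs(rows, maxRows = 1000) {
  const recovered = [];
  for (const row of rows) {
    if (recovered.length >= maxRows) break; // bounded recovery
    // Acked rows are never re-run, whatever their status says.
    if (!RECOVERABLE.has(row.status) || row.acked_at != null) continue;
    try {
      // Round-trip to reject payloads that are not plain JSON data.
      const payload = JSON.parse(JSON.stringify(row.payload));
      recovered.push({ id: row.id, payload });
    } catch (err) {
      // Corrupt payload: skip the row (a real worker would also log it).
    }
  }
  return recovered;
}
```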
JobQueue (options.persistence = adapter)
│
├── enqueue() ──► persistJob(job)
├── dequeue() ──► updateJobStatus(job) status → processing
├── ack() ──► ackJob(jobId) status → completed, acked_at = now
├── retry() ──► updateJobStatus(job) status → retrying | failed
└── restoreFromPersistence()
└──► recoverUnackedJobs() SELECT … WHERE acked_at IS NULL
BackgroundWorker.start()
└── if queue._persistence → restoreFromPersistence() → then _poll()
The persistence adapter is created by createJobPersistence(db, options) from
src/workers/jobPersistence.js and injected into JobQueue via options.persistence.
The feature flag wiring lives in whichever bootstrap file initialises the queue
(e.g. src/jobs/webhookDelivery.js, src/jobs/retentionPurge.js).
MIT (see root LiquiFact project for full license).
{
  "event": "invoice.pending_to_approved",
  "invoiceId": "inv_abc123",
  "tenantId": "tenant_xyz",
  "timestamp": "2025-01-15T12:00:00.000Z",
  "transition": {
    "actor": "usr_admin",
    "from": "pending",
    "reason": null,
    "to": "approved",
    "transitionedAt": "2025-01-15T12:00:00.000Z"
  }
}