fix: replace string-based SQL injection in inject_block_filter with AST manipulation#75
Open
fix: replace string-based SQL injection in inject_block_filter with AST manipulation#75
Conversation
…ST manipulation The previous implementation used string position matching (finding WHERE, ORDER BY, LIMIT keywords) and format! interpolation to splice block_num filters into user-provided SQL. This was vulnerable to structural SQL injection where crafted queries could exploit the naive keyword matching (e.g. WHERE inside string literals, UNION bypasses). Replace with sqlparser AST parsing and manipulation: - Parse user SQL into AST, requiring a single simple SELECT statement - Determine filter column from the FROM table (num for blocks, block_num for others) - Safely AND the block filter into the existing WHERE clause (or add one) - Serialize modified AST back to SQL Also: - Reject UNION/INTERSECT/set operations in live mode (ambiguous filtering) - Return Result<String, ApiError> instead of String for proper error handling - Add Display impl for ApiError - Add tests for UNION rejection, non-SELECT rejection, and WHERE keyword in string literals Amp-Thread-ID: https://ampcode.com/threads/T-019c3272-f632-763c-8078-504a90852a67 Co-authored-by: Amp <amp@ampcode.com>
Replace the blocklist-only approach with a table allowlist so API users can only query: blocks, txs, logs, receipts, token_holders, token_balances, and CTE-defined tables. Previously, users could query sync_state (internal), pg_tables (schema enumeration), or any other table accessible to the tidx DB user. Changes: - Allowlist in validator: only permitted tables + CTE-defined names pass - Block dblink function family (cross-database access) - Add db/api_role.sql migration creating a tidx_api read-only role with SELECT-only grants on indexed tables (defense-in-depth) - Thread CTE names through all validate_* functions - 6 new tests: sync_state rejected, pg_tables rejected, unknown table rejected, CTE tables allowed, dblink blocked, analytics tables allowed Amp-Thread-ID: https://ampcode.com/threads/T-019c3272-f632-763c-8078-504a90852a67 Co-authored-by: Amp <amp@ampcode.com>
Block three categories of attacks:
1. DoS via resource exhaustion:
- Reject WITH RECURSIVE (endless loop CTEs)
- Block generate_series() (billion-row generation)
- Block SELECT INTO (object creation)
2. Privilege escalation / validator bypass:
- Validate expressions inside VALUES rows (previously VALUES(pg_sleep(10))
bypassed the entire function blocklist)
- Reject TABLE statement (TABLE pg_shadow bypassed table allowlist)
- Validate GROUP BY, HAVING, JOIN ON expressions (could hide function calls)
- Walk IsNull/IsNotNull/IsTrue/IsFalse/Like expressions recursively
3. File read hardening:
- Block lo_get/lo_open/lo_close/loread/lo_creat/lo_create/lo_unlink/lo_put
- Block pg_file_read/pg_file_write/pg_file_rename/pg_file_unlink/pg_logdir_ls
- VALUES bypass closure prevents pg_read_file via VALUES(...)
11 new tests covering all vectors.
Amp-Thread-ID: https://ampcode.com/threads/T-019c3272-f632-763c-8078-504a90852a67
Co-authored-by: Amp <amp@ampcode.com>
Validator hardening: - Add 16KB query size limit to prevent parser DoS - Validate ORDER BY, LIMIT, OFFSET expressions (previously unvalidated, could hide dangerous function calls) - Block FOR UPDATE/FOR SHARE locking clauses - Fail-closed expression validation: replace catch-all Ok(()) with explicit allowlist of safe leaf expression types, rejecting unknown Expr variants - Block ALL table functions in FROM clause (generate_series, etc.) API role hardening (golden-axe model): - REVOKE ALL ON ALL TABLES/FUNCTIONS before granting (broad revoke-first) - Add resource limits: statement_timeout=30s, work_mem=256MB, temp_file_limit=512MB - Add CONNECTION LIMIT 64 to prevent connection flooding 7 new tests covering all hardening paths. 149 total lib tests pass. Amp-Thread-ID: https://ampcode.com/threads/T-019c3272-f632-763c-8078-504a90852a67 Co-authored-by: Amp <amp@ampcode.com>
These are ClickHouse-only analytics tables that go through a separate code path (clickhouse.query) which doesn't use the postgres validator. They were never real postgres tables. Amp-Thread-ID: https://ampcode.com/threads/T-019c3272-f632-763c-8078-504a90852a67 Co-authored-by: Amp <amp@ampcode.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Comprehensive security hardening for the tidx SQL query API, fixing SQL injection, DoS, privilege escalation, and file read vulnerabilities identified during a security audit. Inspired by golden-axe's security model.
Thread: https://tempoxyz.slack.com/archives/C0A87C21805/p1770372917363549
Changes
1. SQL injection in
inject_block_filter(commit 1)Replaced string-based SQL splicing with
sqlparserAST manipulation. The old code usedformat!()with keyword position matching to inject block filters into user SQL — vulnerable to structural injection via UNION, string literal confusion, and subquery exfiltration.2. Table allowlist + API role (commit 2)
blocks,txs,logs,receipts,token_holders,token_balances+ CTE-defined names. Previouslysync_state,pg_tables, and any other table were queryable.tidx_apirole: Read-only PostgreSQL role with SELECT-only grants.dblinkfamily blocked: Prevents cross-database access.3. DoS, privilege escalation, file read bypass fixes (commit 3)
WITH RECURSIVEblocked: Prevents endless loop CTEsVALUES()expressions now validated: PreviouslyVALUES(pg_read_file('/etc/passwd'))bypassed the entire function blocklistTABLEstatement blocked:TABLE pg_shadowbypassed the table allowlistSELECT INTOblocked: Prevents object creationgenerate_series()blocked: Prevents billion-row generationlo_get,lo_open, etc.pg_file_read,pg_file_write, etc.4. Final hardening pass — golden-axe model (commit 4)
_ => Ok(())catch-all with explicit allowlist of safeExprvariants — unknown expression types are rejectedgenerate_series()in FROM, etc.REVOKE EXECUTE ON ALL FUNCTIONS,statement_timeout=30s,work_mem=256MB,temp_file_limit=512MB,CONNECTION LIMIT 64Testing
inject_block_filter)