-
Notifications
You must be signed in to change notification settings - Fork 803
Fix an issue on type casting of default string function/procedure arguments in debugger tool. #9036 #9397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
WalkthroughModified SQL logic in the function debug info query to clean and normalize Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Comment |
…uments in debugger tool. pgadmin-org#9036
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
web/pgadmin/tools/debugger/templates/debugger/sql/get_function_debug_info.sql(1 hunks)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: run-feature-tests-pg (18)
- GitHub Check: run-feature-tests-pg (16)
- GitHub Check: run-feature-tests-pg (17)
- GitHub Check: run-feature-tests-pg (15)
- GitHub Check: run-feature-tests-pg (13)
- GitHub Check: run-feature-tests-pg (14)
🔇 Additional comments (1)
web/pgadmin/tools/debugger/templates/debugger/sql/get_function_debug_info.sql (1)
51-73: Request verification of the fix scope against issue #9036.The changes address type-cast cleanup, but without seeing the original issue or test cases, it's unclear whether the current implementation handles all necessary scenarios correctly. Specifically:
- What types of default argument values were causing the original bug?
- Are there existing tests that validate this fix against regression?
- Does the affected PostgreSQL version(s) produce any unexpected output formats from
pg_get_expr?Please confirm: (1) the root cause of #9036, (2) which PostgreSQL versions are in scope, and (3) whether test coverage exists for edge cases like quoted strings or complex types.
| WHEN regexp_replace(val, E'::[a-zA-Z_]+$', '') ~ E'^''.*''$' | ||
| THEN substring(regexp_replace(val, E'::[a-zA-Z_]+$', '') FROM E'^''(.*)''$') | ||
| ELSE regexp_replace(val, E'::[a-zA-Z_]+$', '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Major: The type-cast removal regex is incomplete for complex PostgreSQL types.
The pattern E'::[a-zA-Z_]+$' only matches simple type names. It fails for parameterized and array types:
::numeric(10,2)→ cast remains (parentheses not matched)::varchar(255)→ cast remains (parentheses not matched)::int[]→ cast remains (brackets not matched)::character varying→ cast remains (space not matched)
This means type casts are not actually removed for many real-world defaults, defeating the fix's purpose.
Can you confirm: what default types does the debugger tool need to support? If complex types like numeric(10,2) or array types are used, the regex needs to be extended. A more complete pattern might look like E'::[^,)]+$' or similar, but ideally verify what PostgreSQL's pg_get_expr actually outputs for your test cases.
I'll verify this review comment by examining the codebase context and researching PostgreSQL type formats.
🏁 Script executed:
#!/bin/bash
# Examine the SQL file to understand context
cat web/pgadmin/tools/debugger/templates/debugger/sql/get_function_debug_info.sql | head -70 | tail -20Length of output: 859
🏁 Script executed:
#!/bin/bash
# Search for test files related to debugger
fd -e "test" -e "spec" --type f | grep -i debugger | head -20Length of output: 0
🏁 Script executed:
#!/bin/bash
# Look for documentation or comments about type handling
rg -n "numeric|varchar|parameterized|array.*type|pg_get_expr" web/pgadmin/tools/debugger/ -C 3Length of output: 7967
🏁 Script executed:
#!/bin/bash
# Search for any existing type-related patterns or regex in debugger
rg -n "E'::.*\$" web/pgadmin/tools/debugger/ -A 2 -B 2Length of output: 995
🏁 Script executed:
#!/bin/bash
# Find related SQL files that might handle defaults
fd "\.sql$" web/pgadmin/tools/debugger/ --exec grep -l "default" {} \;Length of output: 227
🌐 Web query:
PostgreSQL pg_get_expr output format parameterized types numeric varchar array
💡 Result:
Short answer: pg_get_expr decompiles an expression (e.g. a column default) but it does not itself "pretty-print" raw typmods — when reconstructing type names it relies on format_type (or the catalog machinery that calls it). That means parameterized types appear as their SQL names with their modifiers when available (e.g. numeric(10,2), character varying(255)), and array types appear with the usual [] suffix (e.g. integer[], text[]). See examples below.
Why/how
- pg_get_expr decompiles pg_node_tree expressions stored in the catalogs (use relation OID as second arg if Vars may appear). [1]
- The SQL name and any typmod (length/precision) for a type are produced by format_type(type_oid, typemod); typmod encodes things like varchar length or numeric precision/scale. That is how parameterized names like varchar(255) or numeric(10,2) are reconstructed. [2]
Examples
- To get a column's type string and default: select a.attname, format_type(a.atttypid, a.atttypmod), pg_get_expr(d.adbin,d.adrelid) ... from pg_attribute a left join pg_attrdef d ...; — format_type returns "character varying(255)", "numeric(10,2)", etc., and pg_get_expr returns the default expression text. [3]
Behavior notes
- If typmod is zero or unknown, format_type omits modifiers (you’ll see plain numeric or character varying).
- Array types are shown as element_type[] (e.g. integer[], text[]) by the type-formatting machinery.
- pg_get_expr only reconstructs the expression text; type name formatting is handled separately by the catalog formatting functions (format_type) used when producing catalog displays or queries like the examples above. [1][2]
Sources
- pg_get_expr / system info functions (PostgreSQL docs). [1]
- format_type description (type name + typmod) in PostgreSQL docs. [2]
- Common example showing format_type + pg_get_expr to list defaults/types. [3]
🌐 Web query:
PostgreSQL type names with parentheses spaces characters allowed in type definitions
💡 Result:
Short answer — PostgreSQL type names follow the general identifier rules. Unquoted type names: start with a letter (or underscore for general identifiers) and may contain letters, digits and underscores; they are subject to the identifier length limit (NAMEDATALEN, 63 by default). Quoted type names (double-quoted) may contain spaces, parentheses, and virtually any character except the zero byte; double quotes inside the name are written as two double quotes. Note: type names beginning with an underscore are reserved for internally-created array types, and historically a type name may have a slightly shorter usable length (see CREATE TYPE notes). [1][2][3]
References
- PostgreSQL lexical structure (delimited/quoted identifiers can contain any character except code zero; length limit NAMEDATALEN). [1]
- CREATE TYPE notes (type names beginning with “_” are reserved; type-name length limitation). [2]
- Current PostgreSQL docs on quoted identifiers and quoting rules. [3]
Sources:
[1] PostgreSQL: SQL Syntax — Lexical Structure (quoted identifiers).
[2] PostgreSQL: CREATE TYPE — notes about type-name underscore reservation and length.
[3] PostgreSQL docs (current) — delimited identifier quoting and allowed characters.
Fix regex pattern to handle parameterized, array, and spaced type names.
The regex E'::[a-zA-Z_]+$' is incomplete. PostgreSQL's pg_get_expr outputs parameterized types as their SQL names with modifiers (e.g., numeric(10,2), character varying(255)) and array types with brackets (e.g., integer[], text[]). The current pattern only matches simple type names and will fail to remove casts for:
::numeric(10,2)→ parentheses not matched::varchar(255)→ parentheses not matched::int[]→ brackets not matched::character varying→ space not matched
Since the debugger handles these types (numeric, varchar, double precision), defaults with complex types will retain their casts. Replace the regex with a pattern that matches to the end of the type name, such as E'::[^,)]*$' (matches :: through end, excluding commas/closing parens that separate expressions).
Apply this fix to lines 57-59 in web/pgadmin/tools/debugger/templates/debugger/sql/get_function_debug_info.sql.
🤖 Prompt for AI Agents
In web/pgadmin/tools/debugger/templates/debugger/sql/get_function_debug_info.sql
around lines 57 to 59, the regex E'::[a-zA-Z_]+$' only matches simple type names
and fails for parameterized, spaced or array types; replace that pattern with
E'::[^,)]*$' wherever it appears so the cast removal covers types with
parentheses, spaces and brackets (e.g., ::numeric(10,2), ::character varying,
::int[]) — update the three occurrences in those lines to use E'::[^,)]*$'.
| FROM unnest( | ||
| regexp_split_to_array( | ||
| pg_catalog.pg_get_expr(p.proargdefaults, 'pg_catalog.pg_class'::regclass, false), | ||
| ',\s*' | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Critical: The comma split doesn't respect string literal boundaries.
The regex pattern ,\s* splits on ANY comma, including those inside string literals. If any default value is a string containing a comma (e.g., 'hello, world'), the split will fragment it incorrectly.
For example, a function argument with default 'hello, world'::text would be split into:
'helloworld'::text
This produces mangled output. A robust solution would need context-aware parsing that understands string literal boundaries, or an alternative approach (e.g., using pg_get_expr with additional metadata).
Can you verify whether the original issue (#9036) anticipated comma-containing defaults? If this is a realistic scenario for your debugger users, this needs a fix before merging.
I'll verify the concerns raised by examining the actual file and gathering context about the issue being fixed.
---
🏁 Script executed:
```shell
#!/bin/bash
# Examine the full context of the SQL file around the flagged lines
cat "web/pgadmin/tools/debugger/templates/debugger/sql/get_function_debug_info.sql" | head -80 | tail -50
Length of output: 2150
🏁 Script executed:
#!/bin/bash
# Search for any related tests or documentation about this fix
fd -t f -e sql -e py | xargs grep -l "proargdefaults\|get_function_debug_info" | head -20Length of output: 2001
🏁 Script executed:
#!/bin/bash
# Check git history for recent changes to this file
git log --oneline -20 "web/pgadmin/tools/debugger/templates/debugger/sql/get_function_debug_info.sql"Length of output: 195
🏁 Script executed:
#!/bin/bash
# Look for issue #9036 references in the codebase
rg -n "9036|proargdefaults" --type=md --type=py --type=sql -C3 | head -50Length of output: 4581
Let me examine the parse_defaults function and understand how it differs from the SQL approach:
---
🏁 Script executed:
```shell
#!/bin/bash
# Examine the parse_defaults function in detail
sed -n '29,80p' "web/pgadmin/utils/sqlautocomplete/parseutils/meta.py"
Length of output: 1554
🏁 Script executed:
#!/bin/bash
# Check git commit details for #9036
git show 9bf97110 --statLength of output: 370
🏁 Script executed:
#!/bin/bash
# Test what PostgreSQL pg_get_expr actually returns
# Create a test to understand the format
python3 << 'EOF'
import re
# Test the regex patterns with realistic PostgreSQL function defaults
test_cases = [
"1", # Simple integer
"'hello'::text", # String with type cast
"'hello, world'::text", # String with comma inside (CRITICAL TEST)
"123::numeric", # Numeric with type cast
"'a'::numeric(10,2)", # Parameterized type (MAY FAIL)
"'{1,2,3}'::int[]", # Array type (MAY FAIL)
"'one'::text, 'two, three'::text", # Multiple with comma inside second
]
# The regex patterns from the SQL
type_cast_pattern = r'::[a-zA-Z_]+$'
quote_pattern_check = r"^'.*'$"
quote_extraction = r"^'(.*)'$"
split_pattern = r',\s*'
print("Testing SPLIT PATTERN: ', \\s*'")
print("-" * 60)
for case in test_cases:
parts = re.split(split_pattern, case)
print(f"Input: {case!r}")
print(f"Split result: {parts}")
print()
print("\nTesting TYPE CAST REMOVAL: '::[a-zA-Z_]+$'")
print("-" * 60)
for case in test_cases:
cleaned = re.sub(type_cast_pattern, '', case)
print(f"Input: {case!r} -> {cleaned!r}")
print("\nTesting REALISTIC SCENARIO - comma in string:")
print("-" * 60)
problematic = "'hello, world'::text"
parts = re.split(split_pattern, problematic)
print(f"Input: {problematic!r}")
print(f"Split into {len(parts)} parts: {parts}")
print(f"Expected: 1 part, got {len(parts)}")
if len(parts) > 1:
print("❌ ISSUE CONFIRMED: Comma inside string literal is split!")
EOFLength of output: 1216
🏁 Script executed:
#!/bin/bash
# Check what pg_get_expr with false (last param) returns
# Look for PostgreSQL documentation or examples
rg -n "pg_get_expr.*proargdefaults" --type=sql -C5Length of output: 61951
Critical issue confirmed: The comma split doesn't respect string literal boundaries.
Testing reveals that regexp_split_to_array with pattern ,\s* splits on ANY comma, including those inside string literals. A function argument with default 'hello, world'::text produces two fragments instead of one, creating mangled output.
Major issue confirmed: The type cast regex is incomplete.
The pattern ::[a-zA-Z_]+$ only matches simple types and fails on PostgreSQL's parameterized and array types:
::numeric(10,2)— NOT matched (contains parentheses)::varchar(255)— NOT matched (contains parentheses)::int[]— NOT matched (contains brackets)
These type casts remain in output, defeating the fix's intent.
Recommendation: Consider using a state-machine approach (like the existing parse_defaults function in web/pgadmin/utils/sqlautocomplete/parseutils/meta.py lines 29–55) that properly tracks quote boundaries, or adopt an alternative parsing strategy that handles parameterized types.
🤖 Prompt for AI Agents
In web/pgadmin/tools/debugger/templates/debugger/sql/get_function_debug_info.sql
around lines 61-65, the current regexp_split_to_array(..., ',\s*') wrongly
splits commas inside string literals and the following cast-stripping uses an
inadequate pattern (::[a-zA-Z_]+$) that misses parameterized and array types;
replace the simple regex split with a parser that tracks quote/escape state
(same approach as parse_defaults in
web/pgadmin/utils/sqlautocomplete/parseutils/meta.py lines 29–55) so commas
inside quotes are ignored, and replace the cast-removal with either a proper
parser step that strips a trailing ::<type> where <type> may include letters,
digits, underscores, parentheses, commas, spaces and square brackets, or use a
robust regex/logic that matches ::.*$ (or equivalent parsing) to remove
parameterized and array typecasts reliably.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.