Merged
29 changes: 29 additions & 0 deletions README.md
@@ -13,6 +13,7 @@ This extension provides a set of utility functions to work with JSON data, focus

- **`json_flatten(json[, separator])`**: Recursively flattens nested JSON objects and arrays into a single-level object with path keys (default separator: `.`).
- **`json_add_prefix(json, text)`**: Adds a string prefix to every top-level key in a JSON object.
- **`json_extract_columns(json, columns[, separator])`**: Pulls selected root keys into a struct of `VARCHAR` fields using regex patterns.
- **`json_group_merge(json [ORDER BY ...])`**: Streams JSON patches with RFC 7396 merge semantics without materializing intermediate lists.

## Quick Start
@@ -124,6 +125,33 @@ SELECT json_add_prefix('{"user": {"name": "Alice"}, "count": 5}', 'data_');

**Note:** This function requires the input to be a JSON object. It will raise an error if given a JSON array or primitive value.

### `json_extract_columns(json, columns[, separator]) -> struct`

Extracts selected root-level fields into a struct of `VARCHAR` columns. The first argument must be a JSON object (not an array or primitive). `columns` must be a constant JSON object that maps output column names to RE2 regex patterns. Each pattern is evaluated against every top-level key and matches partially by default; add `^` and `$` anchors to require a full match. Patterns are case-sensitive unless you supply inline flags such as `(?i)`. Output columns follow the mapping order.

When several keys match the same column, their values are joined with `separator` (default `''`) in the order the keys appear in the input object. The separator may be empty but must not be `NULL`, even when the JSON input is `NULL`. Columns with no matches return `NULL`.

Values are stringified: strings pass through unquoted; arrays, objects, numbers, booleans, and `null` become their JSON text.

**Examples:**
```sql
SELECT (json_extract_columns('{"id": 5, "name": "duck"}',
                             '{"id":"^id$","name":"^name$"}', ',')).id AS id;
-- Result: 5

SELECT (json_extract_columns('{"a":1,"a2":2,"b":3}',
                             '{"a":"^a","b":"^b$"}', '|')).a AS a_values;
-- Result: 1|2

SELECT (json_extract_columns('{"Key": "Value"}',
                             '{"k":"(?i)^key$"}', ',')).k AS case_insensitive;
-- Result: Value

SELECT (json_extract_columns('{"x":"a","xx":"b"}',
                             '{"col":"x"}')).col AS default_separator;
-- Result: ab
```
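
For readers who want to reason about the matching and joining rules outside DuckDB, the behaviour described above can be approximated in a few lines of Python. This is only a rough model under stated assumptions, not the extension's implementation: `extract_columns` and `stringify` are hypothetical names, Python's `re` stands in for RE2 (the two differ on some syntax), and the compact `json.dumps` formatting for non-string values is a guess at what "JSON text" looks like.

```python
import json
import re


def stringify(value):
    """Strings pass through unquoted; every other value becomes JSON text.

    Assumption: compact separators; the extension's exact formatting may differ.
    """
    if isinstance(value, str):
        return value
    return json.dumps(value, separators=(",", ":"))


def extract_columns(doc, columns, separator=""):
    """Hypothetical model of json_extract_columns on already-parsed input."""
    if separator is None:
        raise ValueError("separator cannot be NULL")
    if not isinstance(doc, dict):
        raise ValueError("input must be a JSON object")

    result = {}
    # Output columns follow the order of the mapping.
    for column, pattern in columns.items():
        regex = re.compile(pattern)
        # search() gives partial matching; anchors in the pattern tighten it.
        matches = [stringify(v) for k, v in doc.items() if regex.search(k)]
        # Columns with no matching key yield None (standing in for SQL NULL).
        result[column] = separator.join(matches) if matches else None
    return result


print(extract_columns({"a": 1, "a2": 2, "b": 3}, {"a": "^a", "b": "^b$"}, "|"))
```

In this model the unanchored pattern `x` matches both `x` and `xx`, which is why the last SQL example above concatenates two values.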

### `json_group_merge(json_expr [, treat_null_values] [ORDER BY ...]) -> json`

Applies a sequence of JSON patches using [RFC 7396](https://datatracker.ietf.org/doc/html/rfc7396) merge semantics. Inputs can be `JSON` values or `VARCHAR` text that parses as JSON. SQL `NULL` rows are skipped, and the aggregate returns `'{}'::json` when no non-null inputs are provided.
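
As a reading aid, the fold that the aggregate performs can be sketched in Python from the RFC 7396 merge-patch algorithm. This is a minimal model, not the extension's code: `merge_patch` and `group_merge` are hypothetical names, the rows are assumed to arrive already ordered, and the optional `treat_null_values` argument is not modeled, so a JSON `null` member always deletes the key as in plain RFC 7396.

```python
import json


def merge_patch(target, patch):
    """Apply one RFC 7396 merge patch to target and return the result."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return patch
    # A non-object target is treated as an empty object.
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # JSON null removes the member if it exists.
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


def group_merge(rows):
    """Fold JSON-text patches in order; None stands in for a SQL NULL row."""
    merged = {}
    for row in rows:
        if row is None:
            continue  # SQL NULL rows are skipped
        merged = merge_patch(merged, json.loads(row))
    return json.dumps(merged, separators=(",", ":"))


print(group_merge(['{"a":1,"b":{"c":2}}', None, '{"b":{"c":null,"d":3}}']))
```

With no non-null rows the fold never runs and the sketch returns `{}`, mirroring the `'{}'::json` result described above.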
@@ -175,6 +203,7 @@ FROM (VALUES ('{"keep":1}'::json, 1), ('{"keep":null}'::json, 2)) AS t(patch, ts

- `json_flatten()` returns an error for malformed JSON
- `json_add_prefix()` requires a JSON object (not array or primitive value)
- `json_extract_columns()` requires a JSON object input and a constant JSON object of string regex patterns; it raises on invalid regexes, NULL separators, non-string object keys, or mismatched input shapes
- `json_group_merge()` surfaces DuckDB JSON parse errors for invalid text and raises on merge buffers that exceed DuckDB limits
- Maximum nesting depth: 1000 levels
- Empty objects (`{}`) and arrays (`[]`) are omitted from flattened output