EQL supports encrypting, decrypting, and searching JSON and JSONB objects.
- Configuring the index
- Querying JSONB data with EQL
- EQL functions for JSONB and
ste_vec
- EJSON paths
- Native PostgreSQL JSON(B) compared to EQL
json ->> text
→text
andjson -> text
→jsonb
/json
json ->> int
→text
andjson -> int
→jsonb
/json
json #>> text[]
→text
andjson #> text[]
→jsonb
/json
@>
and<@
json_array_elements
,jsonb_array_elements
,json_array_elements_text
, andjsonb_array_elements_text
json_array_length
andjsonb_array_length
Similar to how you configure indexes for text data, you can configure indexes for JSON and JSONB data.
The only difference is that you need to specify the cast_as
parameter as json
or jsonb
.
SELECT cs_add_index_v1(
'users',
'encrypted_json',
'ste_vec',
'jsonb',
'{"prefix": "users/encrypted_json"}' -- The prefix is in the form of "table/column"
);
You can read more about the index configuration options here.
When inserting JSON data, this works the same as inserting text data. You need to wrap the JSON data in the appropriate EQL payload. CipherStash Proxy will encrypt the data automatically.
Example:
Assuming you want to store the following JSON data:
{
"name": "John Doe",
"metadata": {
"age": 42,
}
}
The EQL payload would be:
INSERT INTO users (encrypted_json) VALUES (
'{"v":1,"k":"pt","p":"{\"name\":\"John Doe\",\"metadata\":{\"age\":42}}","i":{"t":"users","c":"encrypted_json"}}'
);
Data is stored in the database as:
{
"i": {
"c": "encrypted_json",
"t": "users"
},
"k": "sv",
"v": 1,
"sv": [
["ciphertext"]
]
}
When querying data, select the encrypted column. CipherStash Proxy will decrypt the data automatically.
Example:
SELECT encrypted_json FROM users;
Data is returned as:
{
"k": "pt",
"p": "{\"metadata\":{\"age\":42},\"name\":\"John Doe\"}",
"i": {
"t": "users",
"c": "encrypted_json"
},
"v": 1,
"q": null
}
EQL provides specialized functions to interact with encrypted JSONB data, supporting operations like containment queries, field extraction, and comparisons.
Retrieve the Structured Encryption Vector for JSONB containment queries.
Example: Containment query
Suppose we have the following encrypted JSONB data:
{
"top": {
"nested": ["a", "b", "c"]
}
}
We can query records that contain a specific structure.
SQL query:
SELECT * FROM examples
WHERE cs_ste_vec_v1(encrypted_json) @> cs_ste_vec_v1(
'{
"v":1,
"k":"pt",
"p":{"top":{"nested":["a"]}},
"i":{"t":"examples","c":"encrypted_json"},
"q":"ste_vec"
}'
);
Equivalent plaintext query:
SELECT * FROM examples
WHERE jsonb_column @> '{"top":{"nested":["a"]}}';
Note: The @>
operator checks if the left JSONB value contains the right JSONB value.
Negative example:
If we query for a value that does not exist in the data:
SQL query:
SELECT * FROM examples
WHERE cs_ste_vec_v1(encrypted_json) @> cs_ste_vec_v1(
'{
"v":1,
"k":"pt",
"p":{"top":{"nested":["d"]}},
"i":{"t":"examples","c":"encrypted_json"},
"q":"ste_vec"
}'
);
This query would return no results, as the value "d"
is not present in the "nested"
array.
Extract a field from an encrypted JSONB object.
Example:
Suppose we have the following encrypted JSONB data:
{
"top": {
"nested": ["a", "b", "c"]
}
}
We can extract the value of the "top"
key.
SQL query:
SELECT cs_ste_vec_value_v1(encrypted_json,
'{
"v":1,
"k":"pt",
"p":"$.top",
"i":{"t":"examples","c":"encrypted_json"},
"q":"ejson_path"
}'
) AS value
FROM examples;
Equivalent plaintext query:
SELECT jsonb_column->'top' AS value
FROM examples;
Result:
{
"nested": ["a", "b", "c"]
}
Select rows based on a field value in an encrypted JSONB object.
Example:
Suppose we have encrypted JSONB data with a numeric field:
{
"num": 3
}
We can query records where the "num"
field is greater than 2
.
SQL query:
SELECT * FROM examples
WHERE cs_ste_vec_term_v1(encrypted_json,
'{
"v":1,
"k":"pt",
"p":"$.num",
"i":{"t":"examples","c":"encrypted_json"},
"q":"ejson_path"
}'
) > cs_ste_vec_term_v1(
'{
"v":1,
"k":"pt",
"p":"2",
"i":{"t":"examples","c":"encrypted_json"},
"q":"ste_vec"
}'
);
Equivalent plaintext query:
SELECT * FROM examples
WHERE (jsonb_column->>'num')::int > 2;
Use cs_ste_vec_term_v1
along with cs_grouped_value_v1
to group by a field in an encrypted JSONB column.
Example:
Suppose we have records with a "color"
field:
{"color": "blue"}
{"color": "blue"}
{"color": "green"}
{"color": "blue"}
{"color": "red"}
{"color": "green"}
We can group the data by the "color"
field and count occurrences.
SQL query:
SELECT cs_grouped_value_v1(cs_ste_vec_value_v1(encrypted_json,
'{
"v":1,
"k":"pt",
"p":"$.color",
"i":{"t":"examples","c":"encrypted_json"},
"q":"ejson_path"
}'
)) AS color, COUNT(*)
FROM examples
GROUP BY cs_ste_vec_term_v1(encrypted_json,
'{
"v":1,
"k":"pt",
"p":"$.color",
"i":{"t":"examples","c":"encrypted_json"},
"q":"ejson_path"
}'
);
Equivalent plaintext query:
SELECT jsonb_column->>'color' AS color, COUNT(*)
FROM examples
GROUP BY jsonb_column->>'color';
Result:
color | count |
---|---|
blue | 3 |
green | 2 |
red | 1 |
-
Index management
cs_add_index_v1(table_name text, column_name text, 'ste_vec', 'jsonb', opts jsonb)
: Adds anste_vec
index configuration.opts
must include the"context"
key.
-
Query functions
cs_ste_vec_v1(val jsonb)
: Retrieves the STE vector for JSONB containment queries.cs_ste_vec_term_v1(val jsonb, epath jsonb)
: Retrieves the encrypted term associated with an encrypted JSON path.cs_ste_vec_value_v1(val jsonb, epath jsonb)
: Retrieves the decrypted value associated with an encrypted JSON path.cs_ste_vec_terms_v1(val jsonb, epath jsonb)
: Retrieves an array of encrypted terms for elements in an array at the given JSON path (used for comparisons).cs_grouped_value_v1(val jsonb)
: Used withste_vec
indexes for grouping.
EQL uses an extended JSONPath syntax called EJSONPath for specifying paths in JSONB data.
- Root selector:
$
- Dot notation for keys:
$.key
- Bracket notation for keys with special characters:
$['key.with.special*chars']
- Wildcards are supported:
$.some_array_field[*]
- Array indexing is not supported:
$.some_array_field[0]
Example paths:
$.top.nested
selects the"nested"
key within the"top"
object.$.array[*]
selects all elements in the"array"
array.
EQL supports a subset of functionality supported by the native PostgreSQL JSON(B) functions and operators. The following examples compare native PostgreSQL JSON(B) functions and operators to the related functionality in EQL.
Native PostgreSQL JSON(B) example
-- `->` (returns JSON(B))
SELECT plaintext_jsonb->'field_a' FROM examples;
-- `->>` (returns text)
SELECT plaintext_jsonb->>'field_a' FROM examples;
EQL example
EQL JSONB functions accept an eJSONPath as an argument (instead of using ->
/->>
) for lookups.
cs_ste_vec_value_v1
returns the plaintext EQL payload to the client.
SELECT cs_ste_vec_value_v1(encrypted_json, $1) FROM examples;
// Assume that examples.encrypted_json has JSON objects with the shape:
{
"field_a": 100
}
// `$1` is the EQL plaintext payload for the eJSONPath `$.field_a`:
{
"k": "pt",
"p": "$.field_a",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ejson_path"
}
cs_ste_vec_term_v1
returns an ORE term for comparison.
SELECT * FROM examples
WHERE cs_ste_vec_term_v1(examples.encrypted_json, $1) > cs_ste_vec_term_v1($2)
// Assume that examples.encrypted_json has JSON objects with the shape:
{
"field_a": 100
}
// `$1` is the EQL plaintext payload for the eJSONPath `$.field_a`:
{
"k": "pt",
"p": "$.field_a",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ejson_path"
}
// `$2` is the EQL plaintext payload for the ORE term to compare against (in this case, the ORE term for the integer `123`):
{
"k": "pt",
"p": "123",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ste_vec"
}
Native PostgreSQL JSON(B) example
-- `->` (returns JSON(B))
SELECT plaintext_jsonb->0 FROM examples;
-- `->>` (returns text)
SELECT plaintext_jsonb->>0 FROM examples;
EQL example
EQL JSONB functions accept an eJSONPath as an argument (instead of using ->
/->>
) for lookups.
EQL currently doesn't support returning a specific array element for decryption, but cs_ste_vec_value_v1
can be used to return an array to the client to process.
The query:
SELECT cs_ste_vec_value_v1(encrypted_json, $1) AS val FROM examples;
With the params:
// Assume that examples.encrypted_json has JSON objects with the shape:
{
"field_a": [1, 2, 3]
}
// `$1` is the EQL plaintext payload for the eJSONPath `$.field_a`:
{
"k": "pt",
"p": "$.field_a",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ejson_path"
}
Would return the EQL plaintext payload with an array ([1, 2, 3]
for example):
// Example result for a single row
{
"k": "pt",
"p": "[1, 2, 3]",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": null
}
cs_ste_vec_terms_v1
can be used with the native PostgreSQL array access operator to get a term for comparison by array index.
The eJSONPath used with cs_ste_vec_terms_v1
needs to end with [*]
($.some_array_field[*]
for example).
Important
Array access with cs_ste_vec_terms_v1
only works when the given eJSONPath only matches a single array.
Accessing array elements from cs_ste_vec_terms_v1
when the eJSONPath matches multiple arrays (for example, when there are nested arrays or multiple arrays at the same depth) can return unexpected results.
The following query compares the first item in the array at the eJSONPath in $1
to the value in $2
.
SELECT * FROM examples
WHERE (cs_ste_vec_terms_v1(examples.encrypted_json, $1))[1] > cs_ste_vec_term_v1($2)
// Assume that examples.encrypted_json has JSON objects with the shape:
{
"field_a": [4, 5, 6]
}
// `$1` is the EQL plaintext payload for the eJSONPath `$.field_a[*]`:
{
"k": "pt",
"p": "$.field_a[*]",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ejson_path"
}
// `$2` is the EQL plaintext payload for the ORE term to compare against (in this case, the ORE term for the integer `3`):
{
"k": "pt",
"p": "3",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ste_vec"
}
Native PostgreSQL JSON(B) example
-- `#>` (returns JSON(B))
SELECT plaintext_jsonb#>'{field_a,field_b}' FROM examples;
-- `#>>` (returns text)
SELECT plaintext_jsonb#>>'{field_a,field_b}' FROM examples;
EQL example
EQL JSONB functions accept an eJSONPath as an argument (instead of using #>
/#>>
) for lookups.
Note that these are similar to the examples for ->
/->>
.
The difference in these examples is that the path does a lookup multiple levels deep.
cs_ste_vec_value_v1
returns the plaintext EQL payload to the client.
SELECT cs_ste_vec_value_v1(encrypted_json, $1) FROM examples;
// Assume that examples.encrypted_json has JSON objects with the shape:
{
"field_a": {
"field_b": 100
}
}
// `$1` is the EQL plaintext payload for the eJSONPath `$.field_a.field_b`:
{
"k": "pt",
"p": "$.field_a.field_b",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ejson_path"
}
cs_ste_vec_term_v1
returns an ORE term for comparison.
SELECT * FROM examples
WHERE cs_ste_vec_term_v1(examples.encrypted_json, $1) > cs_ste_vec_term_v1($2)
// Assume that examples.encrypted_json has JSON objects with the shape:
{
"field_a": {
"field_b": 100
}
}
// `$1` is the EQL plaintext payload for the eJSONPath `$.field_a.field_b`:
{
"k": "pt",
"p": "$.field_a.field_b",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ejson_path"
}
// `$2` is the EQL plaintext payload for the ORE term to compare against (in this case, the ORE term for the integer `123`):
{
"k": "pt",
"p": "123",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ste_vec"
}
Native PostgreSQL JSON(B) example
-- Checks if the left arg contains the right arg (returns `true` in this example).
SELECT '{"a":1, "b":2}'::jsonb @> '{"b":2}'::jsonb;
-- Checks if the right arg contains the left arg (returns `true` in this example).
SELECT '{"b":2}'::jsonb <@ '{"a":1, "b":2}'::jsonb;
EQL example
EQL uses the same operators for containment (@>
and <@
) queries, but the args need to be wrapped in cs_ste_vec_v1
.
Example query:
-- Checks if the left arg (the `examples.encrypted_json` column) contains the right arg ($1).
-- Would return `true` for the example data and param below.
SELECT * WHERE cs_ste_vec_v1(encrypted_json) @> cs_ste_vec_v1($1) FROM examples;
-- Checks if the the right arg ($1) contains left arg (the `examples.encrypted_json` column).
-- Would return `false` for the example data and param below.
SELECT * WHERE cs_ste_vec_v1(encrypted_json) <@ cs_ste_vec_v1($1) FROM examples;
Example params:
// Assume that examples.encrypted_json has JSON objects with the shape:
{
"field_a": {
"field_b": [1, 2, 3]
}
}
// `$1` is the EQL plaintext payload for the JSON object `{"field_b": [1, 2, 3]}`:
{
"k": "pt",
"p": "{\"field_b\": [1, 2, 3]}",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ste_vec"
}
Native PostgreSQL JSON(B) example
-- Each returns the results...
--
-- Value
-- _____
-- a
-- b
--
-- The only difference is that the input is either json or jsonb (depending
-- on the prefix of the function name) and the output is either json,
-- jsonb, or text (depending on both the prefix and the suffix).
SELECT * from json_array_elements('["a", "b"]');
SELECT * from jsonb_array_elements('["a", "b"]');
SELECT * from json_array_elements_text('["a", "b"]');
SELECT * from jsonb_array_elements_text('["a", "b"]');
EQL example
EQL currently doesn't support returning a SETOF
values for decryption (for returning a row per item in an array), but cs_ste_vec_value_v1
can be used to return an array to the client to process.
The query:
SELECT cs_ste_vec_value_v1(encrypted_json, $1) AS val FROM examples;
With the params:
// Assume that examples.encrypted_json has JSON objects with the shape:
{
"field_a": [1, 2, 3]
}
// `$1` is the EQL plaintext payload for the eJSONPath `$.field_a`:
{
"k": "pt",
"p": "$.field_a",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ejson_path"
}
Would return the EQL plaintext payload with an array ([1, 2, 3]
for example):
// Example result for a single row
{
"k": "pt",
"p": "[1, 2, 3]",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": null
}
cs_ste_vec_terms_v1
(note that terms is plural) can be used to return an array of ORE terms for comparison.
The array can be unnest
ed to work with a SETOF
ORE terms for comparison.
The eJSONPath used with cs_ste_vec_terms_v1
needs to end with [*]
($.some_array_field[*]
for example).
Example query:
SELECT id FROM examples e
WHERE EXISTS (
SELECT 1
FROM unnest(cs_ste_vec_terms_v1(e.encrypted_json, $1)) AS term
WHERE term > cs_ste_vec_term_v1($2)
);
// Assume that examples.encrypted_json has JSON objects with the shape:
{
"field_a": [1, 2, 3]
}
// `$1` is the EQL plaintext payload for the eJSONPath `$.field_a[*]`:
{
"k": "pt",
"p": "$.field_a[*]",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ejson_path"
}
// `$2` is the EQL plaintext payload for the ORE term to compare against (in this case, the ORE term for the integer `2`):
{
"k": "pt",
"p": "2",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ste_vec"
}
Native PostgreSQL JSON(B) example
-- Both of these examples return the int `3`.
-- The only difference is the input type.
SELECT json_array_length('[1, 2, 3]');
SELECT jsonb_array_length('[1, 2, 3]');
EQL example
The PostgreSQL array_length
function can be used with cs_ste_vec_terms_v1
to find the length of an array.
The eJSONPath used with cs_ste_vec_terms_v1
needs to end with [*]
($.some_array_field[*]
for example).
Important
Determining array length with cs_ste_vec_terms_v1
only works when the given eJSONPath only matches a single array.
Attempting to determine array length using cs_ste_vec_terms_v1
when the eJSONPath matches multiple arrays (for example, when there are nested arrays or multiple arrays at the same depth) can return unexpected results.
Example query:
SELECT COALESCE( -- We `COALESCE` because cs_ste_vec_terms_v1 will return `NULL` for empty arrays.
array_length( -- `cs_ste_vec_terms_v1` returns an array type (not JSON(B)), so we use `array_length`.
cs_ste_vec_terms_v1(encrypted_json, $1), -- Pluck out the array of terms at the path in $1.
1 -- The array dimension to find the length of (term array are flat, so this should always be 1).
),
0 -- Assume a length of `0` when `cs_ste_vec_terms_v1` returns `NULL`.
) AS len FROM examples;
Example data and params:
// Assume that examples.encrypted_json has JSON objects with the shape:
{
"val": [1, 2, 3]
}
// `$1` is the EQL plaintext payload for the eJSONPath `$.val[*]`:
{
"k": "pt",
"p": "$.val[*]",
"i": {
"t": "examples",
"c": "encrypted_json"
},
"v": 1,
"q": "ejson_path"
}