Add media type for explain #2368

steve-chavez · 2022-07-14T14:03:22Z

TODO

steve-chavez · 2022-07-20T04:31:17Z

Some observations.

The query identifier is included with the VERBOSE option

curl 'localhost:3000/projects?id=lt.5' -H "Accept: application/vnd.pgrst.plan; options=verbose"

[
  {
    "Plan": {
      "Node Type": "Aggregate",
      "Strategy": "Plain",
      "Partial Mode": "Simple",
      "Parallel Aware": false,
      "Async Capable": false,
      "Startup Cost": 24.25,
      "Total Cost": 24.28,
      "Plan Rows": 1,
      "Plan Width": 144,
      "Output": ["NULL::bigint", "count(ROW(projects.id, projects.name, projects.client_id))", "'{}'::text[]", "(COALESCE(json_agg(ROW(projects.id, projects.name, projects.client_id)), '[]'::j
son))::character varying", "NULLIF(current_setting('response.headers'::text, true), ''::text)", "NULLIF(current_setting('response.status'::text, true), ''::text)"],
      "Plans": [
        {
          "Node Type": "Bitmap Heap Scan",
          "Parent Relationship": "Outer",
          "Parallel Aware": false,
          "Async Capable": false,
          "Relation Name": "projects",
          "Schema": "test",
          "Alias": "projects",
          "Startup Cost": 7.25,
          "Total Cost": 22.25,
          "Plan Rows": 400,
          "Plan Width": 40,
          "Output": ["projects.id", "projects.name", "projects.client_id"],
          "Recheck Cond": "(projects.id < 5)",
          "Plans": [
            {
              "Node Type": "Bitmap Index Scan",
              "Parent Relationship": "Outer",
              "Parallel Aware": false,
              "Async Capable": false,
              "Index Name": "projects_pkey",
              "Startup Cost": 0.00,
              "Total Cost": 7.15,
              "Plan Rows": 400,
              "Plan Width": 0,
              "Index Cond": "(projects.id < 5)"
            }
          ]
        }
      ]
    },
    "Query Identifier": 8960408738112779943
  }
]

Somehow the query identifier doesn't change unless the filter is changed. So doing an lt.3 or lt.4 give the same identifier, eq.1 or eq.5 give a different identifier. This is more or less explained in the pg_stat_statements docs:

When a constant's value has been ignored for purposes of matching the query to other queries, the constant is replaced by a parameter symbol, such as $1

However, it is important to understand that there are only limited guarantees around the stability of the queryid hash value. Since the identifier is derived from the post-parse-analysis tree, its value is a function of, among other things, the internal object identifiers appearing in this representation.

So looks more like a plan identifier than a query identifier. Not really reliable to trace a request back to a query.

The search_path is included with the SETTINGS option

curl 'localhost:3000/projects?id=lt.5' -H "Accept: application/vnd.pgrst.plan; options=settings"

[
  {
    "Plan": {
      "Node Type": "Aggregate",
      "Strategy": "Plain",
      "Partial Mode": "Simple",
      "Parallel Aware": false,
      "Async Capable": false,
      "Startup Cost": 24.25,
      "Total Cost": 24.28,
      "Plan Rows": 1,
      "Plan Width": 144,
      "Plans": [
        {
          "Node Type": "Bitmap Heap Scan",
          "Parent Relationship": "Outer",
          "Parallel Aware": false,
          "Async Capable": false,
          "Relation Name": "projects",
          "Alias": "projects",
          "Startup Cost": 7.25,
          "Total Cost": 22.25,
          "Plan Rows": 400,
          "Plan Width": 40,
          "Recheck Cond": "(id < 5)",
          "Plans": [
            {
              "Node Type": "Bitmap Index Scan",
              "Parent Relationship": "Outer",
              "Parallel Aware": false,
              "Async Capable": false,
              "Index Name": "projects_pkey",
              "Startup Cost": 0.00,
              "Total Cost": 7.15,
              "Plan Rows": 400,
              "Plan Width": 0,
              "Index Cond": "(id < 5)"
            }
          ]
        }
      ]
    },
    "Settings": {
      "search_path": "\"test\", \"public\""
    }
  }
]

SETTINGS
Include information on configuration parameters. Specifically, include options affecting query planning with value different from the built-in default value. This parameter defaults to FALSE.

This seems really useful for debugging as we have had questions about search_path in the past. However, I'm worried that this might leak our in-db config settings, specifically the jwt-secret. For now I haven't been able to expose settings other than search_path but this seems like it could break in the future. If an anonymous user is able to specify the plan media type, that would be disastrous.

I guess this concern only applies if a production instance gets our EXPLAIN config enabled, which is likely. Perhaps we can recommend an additional measure: a db-pre-request that only enables the plan media type from certain IPs, otherwise rejects it.

It might also be worth considering tying the two together, i.e. enabling the plan media type like db-plan-enabled = "172.168.1.x, 172.168.1.y".

Specifying TIMING or WAL without ANALYZE gets a nice error message

curl 'localhost:3000/projects?id=lt.5' -H "Accept: application/vnd.pgrst.plan; options=wal|timing"

HTTP/1.1 400 Bad Request
{"code":"22023","details":null,"hint":null,"message":"EXPLAIN option TIMING requires ANALYZE"}

Overall all the options seem pretty useful. I don't see why we should not expose all of them if we can ensure the callers are restricted.

Working without options

Also refactor Statements module types

steve-chavez · 2022-07-23T01:27:07Z

Exposing COSTS is unnecessary because is already TRUE by default for all.

COSTS
Include information on the estimated startup and total cost of each plan node, as well as the estimated number of rows and the estimated width of each row. This parameter defaults to TRUE.

TIMING as well because requires ANALYZE and it's TRUE by default here.

TIMING
Include actual startup time and time spent in each node in the output. The overhead of repeatedly reading the system clock can slow down the query significantly on some systems, so it may be useful to set this parameter to FALSE when only actual row counts, and not exact times, are needed. Run time of the entire statement is always measured, even when node-level timing is turned off with this option. This parameter may only be used when ANALYZE is also enabled. It defaults to TRUE.

SUMMARY ditto. It doesn't add any info on non-ANALYZE and it's already included for ANALYZE.

SUMMARY
Include summary information (e.g., totaled timing information) after the query plan. Summary information is included by default when ANALYZE is used but otherwise is not included by default, but can be enabled using this option. Planning time in EXPLAIN EXECUTE includes the time required to fetch the plan from the cache and the time required for re-planning, if necessary.

steve-chavez · 2022-07-23T05:43:49Z

Just noted that with this feature we can remove the QueryCost test and instead include the cost tests on the PlanSpec; which is good because QueryCost was always brittle(added on #1385).

steve-chavez · 2022-07-27T21:21:16Z

Added a plain text output to be compatible with https://explain.depesz.com/. The text output is really cool, same one as psql:

curl 'localhost:3000/projects?id=in.(1,2,3)' -H "Accept: application/vnd.pgrst.plan+text;options=analyze"

Aggregate  (cost=15.61..15.63 rows=1 width=112) (actual time=0.024..0.025 rows=1 loops=1)
  ->  Bitmap Heap Scan on projects  (cost=8.48..15.59 rows=3 width=40) (actual time=0.008..0.009 rows=3 loops=1)
        Recheck Cond: (id = ANY ('{1,2,3}'::integer[]))
        Heap Blocks: exact=1
        ->  Bitmap Index Scan on projects_pkey  (cost=0.00..8.48 rows=3 width=0) (actual time=0.005..0.005 rows=3 loops=1)
              Index Cond: (id = ANY ('{1,2,3}'::integer[]))
Planning Time: 0.093 ms
Execution Time: 0.077 ms

src/PostgREST/MediaType.hs

wolfgangwalther · 2022-07-28T05:59:24Z

I just skimmed the tests quickly, but couldn't find it: Do we already have the for= parameter implemented as mentioned, here? Or did you conclude that it's not actually necessary?

steve-chavez · 2022-07-28T14:00:19Z

@wolfgangwalther Ah, no. I just left a note in the CHANGELOG(docs would be the same) that for is a TODO(next version).

Limited to generating the plan of a json representation(application/json) but can be extended later to allow other representations.

steve-chavez · 2022-08-01T19:11:02Z

Adding for in #2400

steve-chavez force-pushed the explain branch 3 times, most recently from 5bcb9c6 to ea3de6a Compare July 20, 2022 03:24

steve-chavez added 4 commits July 21, 2022 14:26

refactor: proc call reuses ResultsWithCount type

6b71a70

add media type for explain

10141b5

Working without options

add pg_stat_statements temporarily

b51cfea

add EXPLAIN options

9cad94e

steve-chavez force-pushed the explain branch from ea3de6a to c46473d Compare July 22, 2022 06:08

add explain for rpc and writes

92355d0

Also refactor Statements module types

steve-chavez force-pushed the explain branch from c46473d to 92355d0 Compare July 22, 2022 06:15

steve-chavez added 2 commits July 22, 2022 20:28

add option tests

b89b20f

remove unnecessary options

bd2c830

steve-chavez force-pushed the explain branch from f0e3f43 to bd2c830 Compare July 23, 2022 01:31

steve-chavez added 2 commits July 22, 2022 20:44

add writes tests

cf116d7

add rpc test

5f3c6ab

steve-chavez force-pushed the explain branch from aa46b3f to 5f3c6ab Compare July 23, 2022 01:56

steve-chavez added 2 commits July 23, 2022 00:26

dry liftIO on tests

9a56772

add test for single upsert

ebf5059

rename spec to just PlanSpec

57313d8

steve-chavez force-pushed the explain branch from 5f16e9a to 57313d8 Compare July 23, 2022 05:44

steve-chavez added 2 commits July 26, 2022 23:16

add db-plan-enabled config

58ca15a

add CHANGELOG

3fcc641

steve-chavez force-pushed the explain branch from 401758b to 3fcc641 Compare July 27, 2022 04:24

steve-chavez added 2 commits July 26, 2022 23:44

delete unused

66eab22

add explain text output

7f3c638

steve-chavez force-pushed the explain branch from 0d41d30 to 7f3c638 Compare July 27, 2022 21:22

steve-chavez marked this pull request as ready for review July 27, 2022 21:22

steve-chavez added 2 commits July 27, 2022 16:25

remove pg_stat_statements

39f1ac6

add missing CLI entry

e6bc5bb

steve-chavez force-pushed the explain branch from 68c7cfc to e6bc5bb Compare July 27, 2022 21:35

add missing test for PATCH

6920be6

steve-chavez commented Jul 27, 2022

View reviewed changes

src/PostgREST/MediaType.hs Outdated Show resolved Hide resolved

add MTPlanAttrs type

22700d9

laurenceisla approved these changes Jul 28, 2022

View reviewed changes

steve-chavez merged commit 8911afd into PostgREST:main Jul 28, 2022

steve-chavez mentioned this pull request Aug 1, 2022

Add "for" param for the vnd.pgrst.plan media type #2400

Merged

steve-chavez mentioned this pull request Aug 3, 2022

Adds explain() and geojson() transforms supabase/postgrest-js#293

Merged

laurenceisla mentioned this pull request Aug 15, 2022

Add missing changes/fixes/features for the new version PostgREST/postgrest-docs#563

Merged

16 tasks

steve-chavez mentioned this pull request Jan 29, 2023

Add db-plan-enabled recommendation PostgREST/postgrest-docs#588

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add media type for explain #2368

Add media type for explain #2368

steve-chavez commented Jul 14, 2022 •

edited

Loading

steve-chavez commented Jul 20, 2022 •

edited

Loading

steve-chavez commented Jul 23, 2022

steve-chavez commented Jul 23, 2022

steve-chavez commented Jul 27, 2022

wolfgangwalther commented Jul 28, 2022

steve-chavez commented Jul 28, 2022

steve-chavez commented Aug 1, 2022

Add media type for explain #2368

Add media type for explain #2368

Conversation

steve-chavez commented Jul 14, 2022 • edited Loading

TODO

steve-chavez commented Jul 20, 2022 • edited Loading

The query identifier is included with the VERBOSE option

The search_path is included with the SETTINGS option

Specifying TIMING or WAL without ANALYZE gets a nice error message

steve-chavez commented Jul 23, 2022

steve-chavez commented Jul 23, 2022

steve-chavez commented Jul 27, 2022

wolfgangwalther commented Jul 28, 2022

steve-chavez commented Jul 28, 2022

steve-chavez commented Aug 1, 2022

steve-chavez commented Jul 14, 2022 •

edited

Loading

steve-chavez commented Jul 20, 2022 •

edited

Loading