|
| 1 | +Originally from: [tweet](https://twitter.com/samokhvalov/status/1711982079838507225), [LinkedIn post](...). |
| 2 | + |
| 3 | +--- |
| 4 | + |
| 5 | +# How to monitor CREATE INDEX / REINDEX progress in Postgres 12+ |
| 6 | + |
| 7 | +> I post a new PostgreSQL "howto" article every day. Join me in this |
| 8 | +> journey – [subscribe](https://twitter.com/samokhvalov/), provide feedback, share! |
| 9 | +
|
| 10 | +To monitor the progress of long-running index building or rebuilding, you can use this query: |
| 11 | + |
| 12 | +```sql |
| 13 | +select |
| 14 | + now(), |
| 15 | + query_start as started_at, |
| 16 | + now() - query_start as query_duration, |
| 17 | + format('[%s] %s', a.pid, a.query) as pid_and_query, |
| 18 | + index_relid::regclass as index_name, |
| 19 | + relid::regclass as table_name, |
| 20 | + (pg_size_pretty(pg_relation_size(relid))) as table_size, |
| 21 | + nullif(wait_event_type, '') || ': ' || wait_event as wait_type_and_event, |
| 22 | + phase, |
| 23 | + format( |
| 24 | + '%s (%s of %s)', |
| 25 | + coalesce((round(100 * blocks_done::numeric / nullif(blocks_total, 0), 2))::text || '%', 'N/A'), |
| 26 | + coalesce(blocks_done::text, '?'), |
| 27 | + coalesce(blocks_total::text, '?') |
| 28 | + ) as blocks_progress, |
| 29 | + format( |
| 30 | + '%s (%s of %s)', |
| 31 | + coalesce((round(100 * tuples_done::numeric / nullif(tuples_total, 0), 2))::text || '%', 'N/A'), |
| 32 | + coalesce(tuples_done::text, '?'), |
| 33 | + coalesce(tuples_total::text, '?') |
| 34 | + ) as tuples_progress, |
| 35 | + current_locker_pid, |
| 36 | + (select nullif(left(query, 150), '') || '...' from pg_stat_activity a where a.pid = current_locker_pid) as current_locker_query, |
| 37 | + format( |
| 38 | + '%s (%s of %s)', |
| 39 | + coalesce((round(100 * lockers_done::numeric / nullif(lockers_total, 0), 2))::text || '%', 'N/A'), |
| 40 | + coalesce(lockers_done::text, '?'), |
| 41 | + coalesce(lockers_total::text, '?') |
| 42 | + ) as lockers_progress, |
| 43 | + format( |
| 44 | + '%s (%s of %s)', |
| 45 | + coalesce((round(100 * partitions_done::numeric / nullif(partitions_total, 0), 2))::text || '%', 'N/A'), |
| 46 | + coalesce(partitions_done::text, '?'), |
| 47 | + coalesce(partitions_total::text, '?') |
| 48 | + ) as partitions_progress, |
| 49 | + ( |
| 50 | + select |
| 51 | + format( |
| 52 | + '%s (%s of %s)', |
| 53 | + coalesce((round(100 * n_dead_tup::numeric / nullif(reltuples::numeric, 0), 2))::text || '%', 'N/A'), |
| 54 | + coalesce(n_dead_tup::text, '?'), |
| 55 | + coalesce(reltuples::int8::text, '?') |
| 56 | + ) |
| 57 | + from pg_stat_all_tables t, pg_class tc |
| 58 | + where t.relid = p.relid and tc.oid = p.relid |
| 59 | + ) as table_dead_tuples |
| 60 | +from pg_stat_progress_create_index p |
| 61 | + left join pg_stat_activity a on a.pid = p.pid |
| 62 | +order by p.index_relid |
| 63 | +; -- in psql, use "\watch 5" instead of semicolon |
| 64 | +``` |
| 65 | + |
| 66 | +The same query, in [a better formatted form](https://gitlab.com/-/snippets/2138417). |
| 67 | + |
| 68 | +How this query works: |
| 69 | + |
| 70 | +1. The basis of it |
| 71 | + is `pg_stat_progress_create_index` [added in Postgres 12](https://postgresql.org/docs/current/progress-reporting.html#CREATE-INDEX-PROGRESS-REPORTING). |
| 72 | + |
| 73 | +2. The documentation also has a list of `CREATE INDEX` phases. As you can see from the table provided in the docs, |
| 74 | + advanced variant, `CREATE INDEX CONCURRENTLY` / `REINDEX CONCURRENTLY` (a.k.a. CIC and RC), which takes longer but |
| 75 | + acts in a non-blocking fashion suitable for loaded production systems, has more phases. The current phase is |
| 76 | + presented in the column "phase" of the output. |
| 77 | + |
| 78 | +3. Index name (a temporary one in case of CIC/RC), table name are presented (using the useful trick to convert OIDs to |
| 79 | + names – note, e.g., `index_relid::regclass as index_name`). Additionally, the table size which is essential to form |
| 80 | + expectations of overall duration – the bigger the table is, the longer the index creation is going to take. |
| 81 | + |
| 82 | +4. `pg_stat_activity` (`pgsa`) provides a lot of additional useful information: |
| 83 | + - PID of Postgres backend |
| 84 | + - actual SQL query used |
| 85 | + - the moment when the work has started (`query_start`), allowing us to understand the elapsed |
| 86 | + time (`query_duration`) |
| 87 | + - `wait_event_type` & `wait_event` to understand what the process is currently waiting on |
| 88 | + - it also used (in a separate sub-query) to get the information of the session that blocks our process, when such an |
| 89 | + event occurs (`current_locker_pid`, `current_locker_query`) |
| 90 | + |
| 91 | +5. Function `format(...)` is very useful to consolidate data in convenient form without having to worry about `NULL`s, |
| 92 | + that |
| 93 | + would be a problem if we used a regular concatenation without `coalesce(...)`. |
| 94 | + |
| 95 | +6. However, in certain cases we use `coalesce(...)` just to put special symbols if the value is missing (`IS NULL`) – |
| 96 | + e.g., "?" or "N/A". |
| 97 | + |
| 98 | +7. Another interesting trick is combination of `coalesce(...)` and `nullif(...)`. The latter allows us to avoid division |
| 99 | + by |
| 100 | + zero errors (replacing `0` with `NULL`, making the division result also `NULL`), while the former, again, is used to |
| 101 | + substitute resulting `NULL` with some non-empty value (in this case, 'N/A'). |
| 102 | + |
| 103 | +When executed in `psql`, it can be convenient to use `\watch [seconds]` to run this report in loop and observe progress |
| 104 | +live: |
| 105 | + |
| 106 | + |
0 commit comments