-
Notifications
You must be signed in to change notification settings - Fork 87
Open
Labels
Description
Столкнулся с такой проблемой:
заполнился полностью раздел для бэкапов (NFS), очистил часть места, но процесс архивации WAL продолжал висеть в прежнем состоянии (на одном и том же WAL)
$ pg_probackup -V
pg_probackup 2.4.15 (Postgres Pro 9.6.18.1 enterprise)
archive_command = '/usr/bin/pg_probackup archive-push -B /pgdump/backup --instance=dummy --compress-level=1 -j 20 --batch-size=200 --wal-file-path %p --wal-file-name %f --overwrite'
В списке процессов было
postgres 123710 0.0 0.0 816616 4160 ? Sl Apr28 0:15 /usr/bin/pg_probackup archive-push -B /pgdump/backup --instance=dummy --compress-level=1 -j 10 --batch-size=100 --wal-file-path pg_xlog/000000A1000287D300000087 --wal-file-name 000000A1000287D300000087 --overwrite
[postgres@somehost ~]$ ps auxwww | grep archiver
postgres 38529 0.7 0.0 112848 4588 ? Ss Apr18 116:05 postgres: archiver process archiving 000000A1000287D300000087
В логе pg_probackup новых записей не было, лог остановился на моменте заполнения раздела.
Прибил процесс зависшей команды из archive_command
[postgres@somehost ~]$ kill -9 123710
Запустилась новая команда pid=101930
[postgres@somehost ~]$ ps auxwww | grep pg_probackup
postgres 101930 237 0.0 1567064 6044 ? Sl 02:40 2:03 /usr/bin/pg_probackup archive-push -B /pgdump/backup --instance=dummy --compress-level=1 -j 20 --batch-size=200 --wal-file-path pg_xlog/000000A1000287D300000087 --wal-file-name 000000A1000287D300000087 --overwrite
postgres 103193 0.0 0.0 8244 796 pts/3 S+ 02:40 0:00 grep --color=auto pg_probackup
В лог pg_probackup начали добавляться новые записи
2021-04-29 02:40:00 MSK [101930]: INFO: pg_probackup archive-push WAL file: 000000A1000287D300000087, threads: 20/20, batch: 200/200, compression: zlib
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000E0"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000DF"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000DE"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D300000087"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000E1"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000E5"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000E7"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000E8"
2021-04-29 02:40:00 MSK [101930]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000DF.gz.part"
2021-04-29 02:40:00 MSK [101930]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E0.gz.part"
2021-04-29 02:40:00 MSK [101930]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000DE.gz.part"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000E9"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000EA"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000EB"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000EC"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000EE"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000ED"
2021-04-29 02:40:00 MSK [101930]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E1.gz.part"
2021-04-29 02:40:00 MSK [101930]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E5.gz.part"
2021-04-29 02:40:00 MSK [101930]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E7.gz.part"
2021-04-29 02:40:00 MSK [101930]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E8.gz.part"
2021-04-29 02:40:00 MSK [101930]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E9.gz.part"
2021-04-29 02:40:00 MSK [101930]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000EA.gz.part"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000F0"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000EF"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000F1"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000F2"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000F3"
2021-04-29 02:40:00 MSK [101930]: LOG: pushing file "000000A1000287D3000000F4"
2021-04-29 02:40:01 MSK [101930]: LOG: WAL file already exists in archive with the same checksum, skip pushing: "/pgdata/9.6/data/pg_xlog/000000A1000287D300000087"
2021-04-29 02:40:01 MSK [101930]: LOG: pushing file "000000A1000287D3000000F5"
2021-04-29 02:40:02 MSK [101930]: LOG: pushing file "000000A1000287D3000000F6"
Но процесс архивации снова висел на том же WAL, что и до kill -9 старого процесса.
Cнова прибиваю
[postgres@somehost ~]$ kill -9 101930
В логе новые записи, но по сути - тоже самое
2021-04-29 02:40:20 MSK [101930]: LOG: pushing file "000000A1000287D4000000A5"
2021-04-29 02:40:20 MSK [101930]: LOG: pushing file "000000A1000287D4000000A6"
2021-04-29 02:40:20 MSK [101930]: LOG: pushing file "000000A1000287D4000000A7"
2021-04-29 02:40:20 MSK [101930]: LOG: pushing file "000000A1000287D4000000A8"
2021-04-29 02:49:05 MSK [113985]: INFO: pg_probackup archive-push WAL file: 000000A1000287D300000087, threads: 20/20, batch: 200/200, compression: zlib
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D3000000DE"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D300000087"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D3000000DF"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D3000000E1"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D3000000E0"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D3000000E5"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D3000000E7"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D3000000E8"
2021-04-29 02:49:05 MSK [113985]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000DE.gz.part"
2021-04-29 02:49:05 MSK [113985]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000DF.gz.part"
2021-04-29 02:49:05 MSK [113985]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E1.gz.part"
2021-04-29 02:49:05 MSK [113985]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E0.gz.part"
2021-04-29 02:49:05 MSK [113985]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E5.gz.part"
2021-04-29 02:49:05 MSK [113985]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E7.gz.part"
2021-04-29 02:49:05 MSK [113985]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E8.gz.part"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D3000000EA"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D3000000E9"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D4000000A9"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D4000000AA"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D4000000AB"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D4000000AC"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D4000000AD"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D4000000AE"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D4000000AF"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D4000000B0"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D4000000B1"
2021-04-29 02:49:05 MSK [113985]: LOG: pushing file "000000A1000287D4000000B2"
2021-04-29 02:49:05 MSK [113985]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000E9.gz.part"
2021-04-29 02:49:05 MSK [113985]: LOG: Temp WAL file already exists, waiting on it 24000 seconds: "/pgdump/backup/wal/dummy/000000A1000287D3000000EA.gz.part"
2021-04-29 02:49:06 MSK [113985]: LOG: pushing file "000000A1000287D4000000B3"
2021-04-29 02:49:06 MSK [113985]: LOG: pushing file "000000A1000287D4000000B4"
[postgres@somehost ~]$ ls -lh /pgdump/backup/wal/dummy/000000A1000287D3*.gz.part
-rw------- 1 postgres postgres 3.1M Apr 28 21:51 /pgdump/backup/wal/dummy/000000A1000287D3000000DE.gz.part
-rw------- 1 postgres postgres 3.1M Apr 28 21:51 /pgdump/backup/wal/dummy/000000A1000287D3000000DF.gz.part
-rw------- 1 postgres postgres 3.2M Apr 28 21:51 /pgdump/backup/wal/dummy/000000A1000287D3000000E0.gz.part
-rw------- 1 postgres postgres 1.2M Apr 28 21:51 /pgdump/backup/wal/dummy/000000A1000287D3000000E1.gz.part
-rw------- 1 postgres postgres 0 Apr 28 21:51 /pgdump/backup/wal/dummy/000000A1000287D3000000E5.gz.part
-rw------- 1 postgres postgres 0 Apr 28 21:51 /pgdump/backup/wal/dummy/000000A1000287D3000000E7.gz.part
-rw------- 1 postgres postgres 0 Apr 28 21:51 /pgdump/backup/wal/dummy/000000A1000287D3000000E8.gz.part
-rw------- 1 postgres postgres 0 Apr 28 21:51 /pgdump/backup/wal/dummy/000000A1000287D3000000E9.gz.part
-rw------- 1 postgres postgres 0 Apr 28 21:51 /pgdump/backup/wal/dummy/000000A1000287D3000000EA.gz.part
Удалил эти файлы .gz.part
[postgres@somehost ~]$ rm -f /pgdump/backup/wal/dummy/000000A1000287D3*.gz.part
После этого архивация заработала.