pg_rewind is slated for inclusion in PostgreSQL 9.5:
http://hlinnaka.iki.fi/2015/03/23/pg_rewind-in-postgresql-9-5/
https://github.com/vmware/pg_rewind
When a database is large (say, in the TB range) and you run a PostgreSQL primary-standby setup with streaming replication or log-shipping HA, the old primary may need to be resynchronized after a failover.
Over a WAN, network problems can likewise cause a standby to fall so far behind the primary that it has to be resynchronized.
Resynchronizing a small database is easy: either a full copy or an incremental rsync will do.
With a large database, however, rsync becomes very slow and consumes a lot of I/O on the host.
There is a proposal in the PostgreSQL community to add incremental resynchronization to core by parsing WAL files; it is still under development.
In the meantime there is already a tool called pg_rewind that does incremental resynchronization: it parses the WAL and copies only the data blocks that changed. It supports only 9.3 and later, because it relies on data page checksums.
How it works (from the pg_rewind README):
Theory of operation
-------------------
The basic idea is to copy everything from the new cluster to old, except
for the blocks that we know to be the same.
1. Scan the WAL log of the old cluster, starting from the point where
the new cluster's timeline history forked off from the old cluster. For
each WAL record, make a note of the data blocks that are touched. This
yields a list of all the data blocks that were changed in the old
cluster, after the new cluster forked off.
2. Copy all those changed blocks from the new master to the old master.
3. Copy all other files like clog, conf files etc. from the new cluster
to old. Everything except the relation files.
4. Apply the WAL from the new master, starting from the checkpoint
created at failover. (pg_rewind doesn't actually apply the WAL, it just
creates a backup label file indicating that when PostgreSQL is started,
it will start replay from that checkpoint and apply all the required WAL)
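The four steps above can be modeled with a small in-memory sketch. All class and function names here are invented for illustration; this is not pg_rewind's actual code:

```python
from collections import namedtuple

WalRecord = namedtuple("WalRecord", "lsn relfile blockno")

class FakeCluster:
    """In-memory stand-in for a data directory (illustrative only)."""
    def __init__(self, blocks, files):
        self.blocks = dict(blocks)   # (relfile, blockno) -> bytes
        self.files = dict(files)     # non-relation path -> bytes
        self.backup_label_lsn = None
    def read_block(self, rel, blk): return self.blocks[(rel, blk)]
    def write_block(self, rel, blk, data): self.blocks[(rel, blk)] = data
    def non_relation_files(self): return list(self.files)
    def copy_file(self, src, path): self.files[path] = src.files[path]
    def write_backup_label(self, lsn): self.backup_label_lsn = lsn

def rewind(target, source, target_wal, fork_lsn):
    # 1. Scan the old cluster's WAL from the fork point and collect
    #    every data block touched after the timelines diverged.
    changed = {(r.relfile, r.blockno) for r in target_wal if r.lsn >= fork_lsn}
    # 2. Copy only those blocks back from the new master.
    for rel, blk in changed:
        target.write_block(rel, blk, source.read_block(rel, blk))
    # 3. Copy all non-relation files (clog, conf files, ...) wholesale.
    for path in source.non_relation_files():
        target.copy_file(source, path)
    # 4. Leave a backup label; WAL replay from the failover checkpoint
    #    happens when PostgreSQL is next started, not inside pg_rewind.
    target.write_backup_label(fork_lsn)
```

The key saving is in step 2: only blocks the old cluster touched after the fork are transferred, instead of the whole data directory.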
For a detailed discussion, see:
http://www.postgresql.org/message-id/flat/519DF910.4020609@vmware.com#519DF910.4020609@vmware.com
https://github.com/vmware/pg_rewind
Below is a test of pg_rewind. It covers the case where, after the standby is promoted, changes occur on both the old primary and the new primary. Naturally, the WAL generated by those changes must still be available: if the old primary's changes have already been moved to the archive directory, you must first copy those segments back into pg_xlog by hand. Changes on the promoted standby are no problem, because they can be fetched via recovery.conf.
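Copying archived segments back into pg_xlog could be scripted as the following sketch (the function name is a hypothetical helper; the paths in the example call are assumptions matching the archive_command used later in this post):

```shell
# Copy archived WAL segments back into pg_xlog so pg_rewind can read
# the old primary's history past the fork point.
copy_archived_wal() {   # $1 = archive dir, $2 = pg_xlog dir
    for seg in "$1"/0000*; do
        [ -e "$seg" ] || continue                 # no archived segments
        dst="$2/$(basename "$seg")"
        [ -e "$dst" ] || cp "$seg" "$dst"         # don't clobber existing files
    done
}

# Example (paths are assumptions):
# copy_archived_wal /pgdata/digoal/1921/data03/pg93/pg_arch "$PGDATA/pg_xlog"
```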
Download PostgreSQL 9.3:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=shortlog;h=refs/heads/REL9_3_STABLE
wget "http://git.postgresql.org/gitweb/?p=postgresql.git;a=snapshot;h=b5a20ab3e0310103ff11337faeed3c521f5eb917;sf=tgz"
Build and install PostgreSQL 9.3:
tar -zxvf postgresql-b5a20ab.tar.gz
cd postgresql-b5a20ab
./configure --prefix=/opt/pgsql9.3beta2 --with-pgport=1999 --with-perl --with-tcl --with-python --with-openssl --with-pam --without-ldap --with-libxml --with-libxslt --enable-thread-safety --with-wal-blocksize=16 && gmake && gmake install
cd contrib/
gmake && gmake install
Download pg_rewind:
https://github.com/vmware/pg_rewind/archive/master.zip
Build and install pg_rewind:
unzip pg_rewind-master.zip
mv pg_rewind-master postgresql-b5a20ab/contrib/
cd postgresql-b5a20ab/contrib/pg_rewind-master
[root@db-172-16-3-33 pg_rewind-master]# export PATH=/opt/pgsql9.3beta2/bin:$PATH
[root@db-172-16-3-33 pg_rewind-master]# which pg_config
/opt/pgsql9.3beta2/bin/pg_config
[root@db-172-16-3-33 pg_rewind-master]# gmake clean
[root@db-172-16-3-33 pg_rewind-master]# gmake
[root@db-172-16-3-33 pg_rewind-master]# gmake install
[root@db-172-16-3-33 pg_rewind-master]# which pg_rewind
/opt/pgsql9.3beta2/bin/pg_rewind
[root@db-172-16-3-33 pg_rewind-master]# pg_rewind --help
pg_rewind resynchronizes a cluster with another copy of the cluster.
Usage:
pg_rewind [OPTION]...
Options:
-D, --target-pgdata=DIRECTORY
existing data directory to modify
--source-pgdata=DIRECTORY
source data directory to sync with
--source-server=CONNSTR
source server to sync with
-v write a lot of progress messages
-n, --dry-run stop before modifying anything
-V, --version output version information, then exit
-?, --help show this help, then exit
Report bugs to <xxx>.
Initialize the cluster with data page checksums enabled (-k):
pg93@db-172-16-3-33-> initdb -D $PGDATA -E UTF8 --locale=C -W -U postgres -k
Configure the primary:
vi pg_hba.conf
host replication postgres 172.16.3.0/24 md5
host all all 0.0.0.0/0 md5
vi postgresql.conf
listen_addresses = '0.0.0.0' # what IP address(es) to listen on;
port = 1999 # (change requires restart)
max_connections = 100 # (change requires restart)
superuser_reserved_connections = 3 # (change requires restart)
unix_socket_directories = '.' # comma-separated list of directories
unix_socket_permissions = 0700 # begin with 0 to use octal notation
shared_buffers = 1024MB # min 128kB
maintenance_work_mem = 512MB # min 1MB
shared_preload_libraries = 'pg_stat_statements' # (change requires restart)
vacuum_cost_delay = 10 # 0-100 milliseconds
vacuum_cost_limit = 10000 # 1-10000 credits
bgwriter_delay = 10ms # 10-10000ms between rounds
wal_level = hot_standby # minimal, archive, or hot_standby
synchronous_commit = off # synchronization level;
wal_sync_method = fdatasync # the default is the first option
wal_buffers = 16384kB # min 32kB, -1 sets based on shared_buffers
wal_writer_delay = 10ms # 1-10000 milliseconds
checkpoint_segments = 32 # in logfile segments, min 1, 16MB each
archive_mode = on # allows archiving to be done
archive_command = 'test ! -f /pgdata/digoal/1921/data03/pg93/pg_arch/%f && cp %p /pgdata/digoal/1921/data03/pg93/pg_arch/%f'# command to use to archive a logfile segment
max_wal_senders = 32 # max number of walsender processes
wal_keep_segments = 128 # in logfile segments, 16MB each; 0 disables
hot_standby = on # "on" allows queries during recovery
max_standby_archive_delay = 300s # max delay before canceling queries
max_standby_streaming_delay = 300s # max delay before canceling queries
wal_receiver_status_interval = 1s # send replies at least this often
hot_standby_feedback = on # send info from standby to prevent
random_page_cost = 1.5 # same scale as above
effective_cache_size = 128000MB
log_destination = 'csvlog' # Valid values are combinations of
logging_collector = on # Enable capturing of stderr and csvlog
log_directory = 'pg_log' # directory where log files are written,
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
log_file_mode = 0600 # creation mode for log files,
log_truncate_on_rotation = on # If on, an existing log file with the
log_rotation_age = 1d # Automatic rotation of logfiles will
log_rotation_size = 10MB # Automatic rotation of logfiles will
log_checkpoints = on
log_connections = on
log_disconnections = on
log_error_verbosity = verbose # terse, default, or verbose messages
log_lock_waits = on # log lock waits >= deadlock_timeout
log_statement = 'ddl' # none, ddl, mod, all
log_timezone = 'PRC'
autovacuum = on # Enable autovacuum subprocess? 'on'
log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
datestyle = 'iso, mdy'
timezone = 'PRC'
lc_messages = 'C' # locale for system error message
lc_monetary = 'C' # locale for monetary formatting
lc_numeric = 'C' # locale for number formatting
lc_time = 'C' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
pg_stat_statements.max = 1000
pg_stat_statements.track = all
pg93@db-172-16-3-33-> cp $PGHOME/share/recovery.conf.sample $PGDATA/
pg93@db-172-16-3-33-> mv $PGDATA/recovery.conf.sample $PGDATA/recovery.done
vi recovery.done
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.16.3.39 port=1999 user=postgres keepalives_idle=60'
Start the primary:
pg93@db-172-16-3-33-> pg_ctl start
server starting
pg93@db-172-16-3-33-> LOG: 00000: loaded library "pg_stat_statements"
LOCATION: load_libraries, miscinit.c:1296
Set up the standby (172.16.3.39)
Install PostgreSQL (same as on the primary, omitted).
Install pg_rewind (same as on the primary, omitted).
Create the $PGDATA directory:
mkdir -p $PGDATA
chown pg93:pg93 $PGDATA
chmod 700 $PGDATA
su - pg93
Clone the primary's database:
root@db-172-16-3-39-> su - pg93
pg93@db-172-16-3-39-> vi ~/.pgpass
172.16.3.33:1999:*:postgres:postgres
pg93@db-172-16-3-39-> chmod 400 ~/.pgpass
pg93@db-172-16-3-39-> pg_basebackup -D $PGDATA -F p -x -P -v -h 172.16.3.33 -p 1999 -U postgres
WARNING: skipping special file "./.s.PGSQL.1999"
transaction log start point: 0/2000028 on timeline 1
WARNING: skipping special file "./.s.PGSQL.1999"
36575/36575 kB (100%), 1/1 tablespace
transaction log end point: 0/20000F0
pg_basebackup: base backup completed
pg93@db-172-16-3-39-> cd $PGDATA
pg93@db-172-16-3-39-> mv recovery.done recovery.conf
pg93@db-172-16-3-39-> vi recovery.conf
recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.16.3.33 port=1999 user=postgres keepalives_idle=60' # e.g. 'host=localhost port=5432'
Start the standby:
pg93@db-172-16-3-39-> pg_ctl start
server starting
pg93@db-172-16-3-39-> LOG: 00000: loaded library "pg_stat_statements"
LOCATION: load_libraries, miscinit.c:1296
Control file info on the primary:
pg93@db-172-16-3-33-> pg_controldata
pg_control version number: 937
Catalog version number: 201306121
Database system identifier: 5908450106616519131
Database cluster state: in production
pg_control last modified: Mon 05 Aug 2013 10:31:56 AM CST
Latest checkpoint location: 0/30007E8
Prior checkpoint location: 0/3000710
Latest checkpoint's REDO location: 0/30007A8
Latest checkpoint's REDO WAL file: 000000010000000000000003
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0/1685
Latest checkpoint's NextOID: 24576
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 1674
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 1684
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Time of latest checkpoint: Mon 05 Aug 2013 10:31:56 AM CST
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
Current wal_level setting: hot_standby
Current max_connections setting: 100
Current max_prepared_xacts setting: 0
Current max_locks_per_xact setting: 64
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 16384
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 1
Control file info on the standby:
pg93@db-172-16-3-39-> pg_controldata
pg_control version number: 937
Catalog version number: 201306121
Database system identifier: 5908450106616519131
Database cluster state: in archive recovery
pg_control last modified: Mon 05 Aug 2013 10:31:39 AM CST
Latest checkpoint location: 0/2000060
Prior checkpoint location: 0/2000060
Latest checkpoint's REDO location: 0/2000028
Latest checkpoint's REDO WAL file: 000000010000000000000002
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0/1684
Latest checkpoint's NextOID: 12815
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 1674
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 1684
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Time of latest checkpoint: Mon 05 Aug 2013 10:28:59 AM CST
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/20000F0
Min recovery ending loc's timeline: 1
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
Current wal_level setting: hot_standby
Current max_connections setting: 100
Current max_prepared_xacts setting: 0
Current max_locks_per_xact setting: 64
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 16384
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 1
Test:
1. Primary
pg93@db-172-16-3-33-> psql
psql (9.3beta2)
Type "help" for help.
digoal=# create table test (id int primary key, info text, crt_time timestamp);
CREATE TABLE
digoal=# create or replace function func() returns void as $$
digoal$# declare
digoal$# v_id int;
digoal$# begin
digoal$# v_id := round(5000000*random());
digoal$# update test set info=md5(random()::text),crt_time=clock_timestamp() where id=v_id;
digoal$# if found then
digoal$# return;
digoal$# else
digoal$# insert into test values (v_id,md5(random()::text),clock_timestamp());
digoal$# end if;
digoal$# return;
digoal$# end;
digoal$# $$ language plpgsql strict;
CREATE FUNCTION
digoal=# select func();
func
------
(1 row)
digoal=# select * from test ;
id | info | crt_time
---------+----------------------------------+----------------------------
3554644 | c5aabfa68774a7bd9a623819537475c6 | 2013-08-05 10:39:49.304063
(1 row)
digoal=# select func();
func
------
(1 row)
digoal=# select * from test ;
id | info | crt_time
---------+----------------------------------+----------------------------
3554644 | c5aabfa68774a7bd9a623819537475c6 | 2013-08-05 10:39:49.304063
2856072 | ec17bc98163a1ac0cbcdeadd0b151607 | 2013-08-05 10:39:54.324455
(2 rows)
Data-change workload:
vi test.sql
select func();
pg93@db-172-16-3-33-> pgbench -M prepared -f ./test.sql -r -n -h $PGDATA -p 1999 -U postgres -c 16 -j 4 -T 10 digoal
Client 13 aborted in state 0: ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (id)=(3717357) already exists.
CONTEXT: SQL statement "insert into test values (v_id,md5(random()::text),clock_timestamp())"
PL/pgSQL function func() line 10 at SQL statement
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 16
number of threads: 4
duration: 10 s
number of transactions actually processed: 419517
tps = 41926.489602 (including connections establishing)
tps = 42031.118850 (excluding connections establishing)
statement latencies in milliseconds:
0.376348 select func();
Consistency check
Primary:
pg93@db-172-16-3-33-> psql
psql (9.3beta2)
Type "help" for help.
digoal=# select sum(hashtext(test.*::text)) from test;
sum
---------------
-215513112678
(1 row)
digoal=# select count(*) from test ;
count
--------
402434
(1 row)
Standby:
pg93@db-172-16-3-39-> psql
psql (9.3beta2)
Type "help" for help.
digoal=# select sum(hashtext(test.*::text)) from test;
sum
---------------
-215513112678
(1 row)
digoal=# select count(*) from test ;
count
--------
402434
(1 row)
2. Standby
Promote it:
pg93@db-172-16-3-39-> pg_ctl promote
server promoting
Data-change workload:
vi test.sql
select func();
pg93@db-172-16-3-39-> pgbench -M prepared -f ./test.sql -r -n -h $PGDATA -p 1999 -U postgres -c 16 -j 4 -T 10 digoal
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 16
number of threads: 4
duration: 10 s
number of transactions actually processed: 378395
tps = 37814.175846 (including connections establishing)
tps = 37866.507340 (excluding connections establishing)
statement latencies in milliseconds:
0.419977 select func();
Data on the standby:
pg93@db-172-16-3-39-> psql
psql (9.3beta2)
Type "help" for help.
digoal=# select sum(hashtext(test.*::text)) from test;
sum
--------------
380706298298
(1 row)
digoal=# select count(*) from test ;
count
--------
737925
(1 row)
3. Primary
Keep changing data on the old primary as well:
pg93@db-172-16-3-33-> pgbench -M prepared -f ./test.sql -r -n -h $PGDATA -p 1999 -U postgres -c 16 -j 4 -T 10 digoal
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 16
number of threads: 4
duration: 10 s
number of transactions actually processed: 399093
tps = 39862.553184 (including connections establishing)
tps = 39960.089273 (excluding connections establishing)
statement latencies in milliseconds:
0.398488 select func();
Data on the primary:
pg93@db-172-16-3-33-> psql
psql (9.3beta2)
Type "help" for help.
digoal=# select sum(hashtext(test.*::text)) from test;
sum
--------------
127807805610
(1 row)
digoal=# select count(*) from test ;
count
--------
755238
(1 row)
4. Turn the old primary into a standby. Normally you would rsync $PGDATA from the new primary (172.16.3.39), or take a fresh pg_basebackup.
Here we use pg_rewind instead.
Stop the old primary (172.16.3.33):
pg93@db-172-16-3-33-> pg_ctl stop -m fast
waiting for server to shut down..... done
server stopped
Configure .pgpass:
su - pg93
pg93@db-172-16-3-33-> vi .pgpass
172.16.3.39:1999:*:postgres:postgres
pg93@db-172-16-3-33-> chmod 400 .pgpass
Resynchronize:
pg93@db-172-16-3-33-> pg_rewind -D $PGDATA --source-server='host=172.16.3.39 port=1999 dbname=digoal' -v
connected to remote server
unexpected control file size 0, expected 8192
After switching timelines several times, a different error appeared:
pg93@db-172-16-3-33-> pg_rewind -D $PGDATA --source-server='host=172.16.3.39 port=1999 dbname=digoal' -v
connected to remote server
fetched file "global/pg_control", length 8192
fetched file "pg_xlog/00000006.history", length 213
could not find common ancestor of the source and target cluster's timelines
I added a line to pg_rewind.c to print the timeline info:
/*
* Trace the history backwards, until we hit the target timeline.
*
* TODO: This assumes that there are no timeline switches on the target
* cluster after the fork.
*/
for (i = nentries - 1; i >= 0; i--)
{
TimeLineHistoryEntry *entry = &sourceHistory[i];
fprintf(stdout, "nen:%d,srctli:%d,targettli:%d\n", i, entry->tli, targettli);
Rebuilding pg_rewind with this change shows that the code in timeline.c is buggy: the history file is not parsed correctly. For example:
pg93@db-172-16-3-39-> pg_rewind -D $PGDATA --source-server='host=172.16.3.33 port=1999 dbname=digoal' -v
connected to remote server
fetched file "global/pg_control", length 8192
fetched file "pg_xlog/0000000B.history", length 419
nen:1,srctli:11,targettli:10
nen:0,srctli:1,targettli:10
could not find common ancestor of the source and target cluster's timelines
The problem can be worked around as follows.
Edit the history file on the new primary, moving the last line to the top. (Right after this, the earlier error came back: unexpected control file size 0, expected 8192.) The edited file:
pg93@db-172-16-3-33-> cd $PGDATA/pg_xlog
pg93@db-172-16-3-33-> vi 0000000B.history
10 1/6000000 no recovery target specified
1 0/92DCDD8 no recovery target specified
2 0/1CB86338 no recovery target specified
3 0/36E68A20 no recovery target specified
4 0/569ADB88 no recovery target specified
5 0/762CF5D8 no recovery target specified
6 0/9F67C920 no recovery target specified
7 0/A0000090 no recovery target specified
8 1/3F535A0 no recovery target specified
9 1/4000090 no recovery target specified
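The manual edit above (moving the history file's last line to the top) can be scripted as a sketch like this. fix_history is a hypothetical helper, and this is only a workaround for the timeline.c parsing bug, not a general fix:

```shell
# Move the last entry of a .history file to the top, leaving the
# remaining entries in their original order.
fix_history() {   # $1 = path to NNNNNNNN.history
    last=$(tail -n 1 "$1")
    { printf '%s\n' "$last"; sed '$d' "$1"; } > "$1.tmp" && mv "$1.tmp" "$1"
}

# Example (path is an assumption):
# fix_history "$PGDATA/pg_xlog/0000000B.history"
```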
Run pg_rewind again:
pg93@db-172-16-3-39-> pg_rewind -D $PGDATA --source-server='host=172.16.3.33 port=1999 dbname=digoal' -v
connected to remote server
fetched file "global/pg_control", length 8192
fetched file "pg_xlog/0000000B.history", length 419
nen:1,srctli:11,targettli:10
nen:0,srctli:10,targettli:10
Last common WAL position: 1/6000000 on timeline 10
Last common checkpoint at 1/6000000 on timeline 10
error reading xlog record: record with zero length at 1/6000090
.s.PGSQL.1999 (COPY)
backup_label.old (COPY)
recovery.done (COPY)
pg_ident.conf (COPY)
postmaster.opts (COPY)
postgresql.conf (COPY)
pg_hba.conf (COPY)
.s.PGSQL.1999.lock (COPY)
pg_log/postgresql-2013-08-05_112157.csv (COPY)
pg_log/postgresql-2013-08-05_123414.csv (COPY)
pg_log/postgresql-2013-08-05_101818.log (COPY)
pg_log/postgresql-2013-08-05_134452.log (COPY)
pg_log/postgresql-2013-08-05_112001.csv (COPY)
pg_log/postgresql-2013-08-05_134452.csv (COPY)
pg_log/postgresql-2013-08-05_111642.csv (COPY)
pg_log/postgresql-2013-08-05_110518.csv (COPY)
pg_log/postgresql-2013-08-05_134655.csv (COPY)
pg_log/postgresql-2013-08-05_131517.csv (COPY)
pg_log/postgresql-2013-08-05_103139.csv (COPY)
pg_log/postgresql-2013-08-05_110518.log (COPY)
pg_log/postgresql-2013-08-05_112902.csv (COPY)
pg_log/postgresql-2013-08-05_112001.log (COPY)
pg_log/postgresql-2013-08-05_134523.csv (COPY)
pg_log/postgresql-2013-08-05_134523.log (COPY)
pg_log/postgresql-2013-08-05_104358.csv (COPY)
pg_log/postgresql-2013-08-05_112902.log (COPY)
pg_log/postgresql-2013-08-05_131517.log (COPY)
pg_log/postgresql-2013-08-05_130021.csv (COPY)
pg_log/postgresql-2013-08-05_104358.log (COPY)
pg_log/postgresql-2013-08-05_103139.log (COPY)
pg_log/postgresql-2013-08-05_101818.csv (COPY)
pg_log/postgresql-2013-08-05_113036.csv (COPY)
pg_log/postgresql-2013-08-05_123414.log (COPY)
pg_log/postgresql-2013-08-05_123855.csv (COPY)
pg_log/postgresql-2013-08-05_112157.log (COPY)
pg_log/postgresql-2013-08-05_134655.log (COPY)
pg_log/postgresql-2013-08-05_130021.log (COPY)
pg_log/postgresql-2013-08-05_113036.log (COPY)
pg_log/postgresql-2013-08-05_131316.csv (COPY)
pg_log/postgresql-2013-08-05_111642.log (COPY)
pg_log/postgresql-2013-08-05_131316.log (COPY)
pg_log/postgresql-2013-08-05_123855.log (COPY)
pg_log/postgresql-2013-08-05_134444.csv (COPY)
pg_log/postgresql-2013-08-05_134444.log (COPY)
pg_subtrans/0047 (COPY)
pg_notify/0000 (COPY)
global/12696_vm (COPY)
global/12700_vm (COPY)
global/12707_fsm (COPY)
global/12557_fsm (COPY)
global/pg_internal.init (COPY)
global/12700_fsm (COPY)
global/pg_control (COPY)
global/12711_fsm (COPY)
global/12696_fsm (COPY)
global/12557_vm (COPY)
global/pg_filenode.map (COPY)
global/12707_vm (COPY)
global/12711_vm (COPY)
pg_xlog/00000008.history (COPY)
pg_xlog/0000000B0000000100000006 (COPY)
pg_xlog/00000009.history (COPY)
pg_xlog/0000000A.history (COPY)
pg_xlog/0000000B0000000100000005 (COPY)
pg_xlog/0000000A0000000100000005 (COPY)
pg_xlog/0000000B.history (COPY)
pg_clog/0001 (COPY)
pg_clog/0000 (COPY)
pg_clog/0002 (COPY)
pg_clog/0003 (COPY)
pg_clog/0004 (COPY)
pg_stat_tmp/db_16384.stat (COPY)
pg_stat_tmp/global.stat (COPY)
pg_stat_tmp/db_0.stat (COPY)
pg_multixact/members/0000 (COPY)
pg_multixact/offsets/0000 (COPY)
base/12814/12641_vm (COPY)
base/12814/12639_fsm (COPY)
base/12814/12547_fsm (COPY)
base/12814/12620_fsm (COPY)
.......... (output omitted)
base/12809/12625_fsm (COPY)
base/12809/12569_fsm (COPY)
base/12809/12639_vm (COPY)
base/12809/12779_vm (COPY)
base/12809/12717_fsm (COPY)
base/12809/12799_vm (COPY)
base/12809/12600_vm (COPY)
base/12809/12612_fsm (COPY)
base/12809/12616_fsm (COPY)
base/12809/12553_fsm (COPY)
base/12809/12608_vm (COPY)
base/12809/12784_fsm (COPY)
base/12809/12784_vm (COPY)
base/12809/12768_vm (COPY)
base/12809/12549_vm (COPY)
base/12809/12673_fsm (COPY)
base/12809/12732_fsm (COPY)
base/12809/12794_fsm (COPY)
base/12809/12547_vm (COPY)
base/12809/12682_vm (COPY)
base/12809/12673_vm (COPY)
base/12809/12774_vm (COPY)
base/12809/12721_fsm (COPY)
base/12809/12587_fsm (COPY)
base/12809/12608_fsm (COPY)
base/12809/12717_vm (COPY)
base/12809/12636_fsm (COPY)
base/12809/12553_vm (COPY)
base/12809/12604_vm (COPY)
base/12809/12728_vm (COPY)
base/12809/12629_vm (COPY)
base/12809/12587_vm (COPY)
base/12809/12569_vm (COPY)
base/12809/12724_fsm (COPY)
base/12809/12768_fsm (COPY)
base/12809/12612_vm (COPY)
base/12809/12616_vm (COPY)
base/12809/12799_fsm (COPY)
base/12809/12629_fsm (COPY)
base/12809/12732_vm (COPY)
base/12809/12641_fsm (COPY)
base/12809/12764_fsm (COPY)
base/12809/12736_vm (COPY)
base/12809/12664_fsm (COPY)
base/12809/12658_fsm (COPY)
base/12809/12625_vm (COPY)
base/12809/12620_vm (COPY)
base/12809/12794_vm (COPY)
base/12809/pg_filenode.map (COPY)
base/12809/12604_fsm (COPY)
base/12809/12600_fsm (COPY)
base/12809/12774_fsm (COPY)
base/12809/12779_fsm (COPY)
base/12809/12789_fsm (COPY)
base/12809/12576_fsm (COPY)
base/12809/12789_vm (COPY)
base/12809/12736_fsm (COPY)
base/12809/12664_vm (COPY)
pg_xlog/archive_status/0000000A0000000100000005.done (COPY)
pg_xlog/archive_status/0000000B.history.done (COPY)
pg_stat/db_16384.stat (REMOVE)
pg_stat/global.stat (REMOVE)
pg_stat/db_0.stat (REMOVE)
global/pg_stat_statements.stat (REMOVE)
pg_xlog/0000000800000000000000F6 (REMOVE)
pg_xlog/000000080000000100000003 (REMOVE)
pg_xlog/0000000800000000000000F7 (REMOVE)
pg_xlog/0000000800000000000000F8 (REMOVE)
pg_xlog/0000000A0000000100000006 (REMOVE)
pg_xlog/0000000800000000000000F4 (REMOVE)
pg_xlog/0000000800000000000000FA (REMOVE)
pg_xlog/0000000800000000000000F2 (REMOVE)
pg_xlog/000000080000000100000002 (REMOVE)
pg_xlog/000000090000000100000003 (REMOVE)
pg_xlog/0000000800000000000000EF (REMOVE)
pg_xlog/0000000A0000000100000005.00000028.backup (REMOVE)
pg_xlog/0000000800000000000000F9 (REMOVE)
pg_xlog/0000000800000000000000FD (REMOVE)
pg_xlog/0000000A0000000100000004 (REMOVE)
pg_xlog/0000000800000000000000FC (REMOVE)
pg_xlog/0000000800000000000000EE (REMOVE)
pg_xlog/000000080000000100000000 (REMOVE)
pg_xlog/0000000800000000000000FB (REMOVE)
pg_xlog/0000000800000000000000F0 (REMOVE)
pg_xlog/000000090000000100000004 (REMOVE)
pg_xlog/000000080000000100000001 (REMOVE)
pg_xlog/0000000800000000000000F3 (REMOVE)
pg_xlog/archive_status/0000000800000000000000EE.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000FF.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000F4.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000FA.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000FD.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000F1.done (REMOVE)
pg_xlog/archive_status/00000009.history.ready (REMOVE)
pg_xlog/archive_status/000000090000000100000004.ready (REMOVE)
pg_xlog/archive_status/0000000800000000000000F2.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000EF.done (REMOVE)
pg_xlog/archive_status/0000000A0000000100000005.ready (REMOVE)
pg_xlog/archive_status/0000000A0000000100000004.ready (REMOVE)
pg_xlog/archive_status/0000000800000000000000FB.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000F3.done (REMOVE)
pg_xlog/archive_status/0000000A.history.ready (REMOVE)
pg_xlog/archive_status/0000000800000000000000FE.done (REMOVE)
pg_xlog/archive_status/000000080000000100000002.done (REMOVE)
pg_xlog/archive_status/000000090000000100000003.ready (REMOVE)
pg_xlog/archive_status/0000000A0000000100000005.00000028.backup.ready (REMOVE)
pg_xlog/archive_status/0000000800000000000000F6.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000F9.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000F5.done (REMOVE)
pg_xlog/archive_status/000000080000000100000000.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000FC.done (REMOVE)
pg_xlog/archive_status/000000080000000100000003.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000F8.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000F7.done (REMOVE)
pg_xlog/archive_status/000000080000000100000001.done (REMOVE)
pg_xlog/archive_status/0000000800000000000000F0.done (REMOVE)
pg_xlog/0000000800000000000000F1 (REMOVE)
pg_xlog/0000000800000000000000FE (REMOVE)
pg_xlog/0000000800000000000000FF (REMOVE)
pg_xlog/0000000800000000000000F5 (REMOVE)
getting chunks: -- fetch all the blocks listed in the temp table.
select path, begin,
pg_read_binary_file(path, begin, len) as chunk
from fetchchunks
sent query
received chunk for file "backup_label.old", off 0, len 206
received chunk for file "recovery.done", off 0, len 4759
received chunk for file "pg_ident.conf", off 0, len 1636
received chunk for file "postmaster.opts", off 0, len 32
received chunk for file "postgresql.conf", off 0, len 20431
received chunk for file "pg_hba.conf", off 0, len 4547
received chunk for file ".s.PGSQL.1999.lock", off 0, len 64
received chunk for file "pg_log/postgresql-2013-08-05_112157.csv", off 0, len 48110
received chunk for file "pg_log/postgresql-2013-08-05_123414.csv", off 0, len 10613
received chunk for file "pg_log/postgresql-2013-08-05_112001.csv", off 0, len 8450
received chunk for file "pg_log/postgresql-2013-08-05_134452.csv", off 0, len 968
received chunk for file "pg_log/postgresql-2013-08-05_111642.csv", off 0, len 22888
received chunk for file "pg_log/postgresql-2013-08-05_110518.csv", off 0, len 34844
received chunk for file "pg_log/postgresql-2013-08-05_134655.csv", off 0, len 4932
received chunk for file "pg_log/postgresql-2013-08-05_131517.csv", off 0, len 70200
received chunk for file "pg_log/postgresql-2013-08-05_103139.csv", off 0, len 52611
received chunk for file "pg_log/postgresql-2013-08-05_112902.csv", off 0, len 2009
received chunk for file "pg_log/postgresql-2013-08-05_134523.csv", off 0, len 12060
received chunk for file "pg_log/postgresql-2013-08-05_104358.csv", off 0, len 61220
received chunk for file "pg_log/postgresql-2013-08-05_130021.csv", off 0, len 13541
received chunk for file "pg_log/postgresql-2013-08-05_104358.log", off 0, len 7125
received chunk for file "pg_log/postgresql-2013-08-05_101818.csv", off 0, len 2719
received chunk for file "pg_log/postgresql-2013-08-05_113036.csv", off 0, len 15990
received chunk for file "pg_log/postgresql-2013-08-05_123855.csv", off 0, len 36541
received chunk for file "pg_log/postgresql-2013-08-05_131316.csv", off 0, len 3686
received chunk for file "pg_log/postgresql-2013-08-05_134444.csv", off 0, len 968
received chunk for file "pg_subtrans/0047", off 0, len 114688
received chunk for file "pg_notify/0000", off 0, len 8192
received chunk for file "global/12696_vm", off 0, len 8192
received chunk for file "global/12700_vm", off 0, len 8192
received chunk for file "global/12707_fsm", off 0, len 24576
received chunk for file "global/12557_fsm", off 0, len 24576
received chunk for file "global/pg_internal.init", off 0, len 12784
received chunk for file "global/12700_fsm", off 0, len 24576
received chunk for file "global/pg_control", off 0, len 8192
received chunk for file "global/12711_fsm", off 0, len 24576
received chunk for file "global/12696_fsm", off 0, len 24576
received chunk for file "global/12557_vm", off 0, len 8192
received chunk for file "global/pg_filenode.map", off 0, len 512
received chunk for file "global/12707_vm", off 0, len 8192
received chunk for file "global/12711_vm", off 0, len 8192
received chunk for file "pg_xlog/00000008.history", off 0, len 299
received chunk for file "pg_xlog/0000000B0000000100000006", off 0, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 1000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 2000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 3000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 4000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 5000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 6000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 7000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 8000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 9000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 10000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 11000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 12000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 13000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 14000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 15000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000006", off 16000000, len 777216
received chunk for file "pg_xlog/00000009.history", off 0, len 334
received chunk for file "pg_xlog/0000000A.history", off 0, len 376
received chunk for file "pg_xlog/0000000B0000000100000005", off 0, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 1000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 2000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 3000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 4000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 5000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 6000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 7000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 8000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 9000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 10000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 11000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 12000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 13000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 14000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 15000000, len 1000000
received chunk for file "pg_xlog/0000000B0000000100000005", off 16000000, len 777216
received chunk for file "pg_xlog/0000000A0000000100000005", off 0, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 1000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 2000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 3000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 4000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 5000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 6000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 7000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 8000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 9000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 10000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 11000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 12000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 13000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 14000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 15000000, len 1000000
received chunk for file "pg_xlog/0000000A0000000100000005", off 16000000, len 777216
received chunk for file "pg_xlog/0000000B.history", off 0, len 419
received chunk for file "pg_clog/0001", off 0, len 262144
received chunk for file "pg_clog/0000", off 0, len 262144
received chunk for file "pg_clog/0002", off 0, len 262144
received chunk for file "pg_clog/0003", off 0, len 262144
received chunk for file "pg_clog/0004", off 0, len 122880
received chunk for file "pg_stat_tmp/db_16384.stat", off 0, len 3216
received chunk for file "pg_stat_tmp/global.stat", off 0, len 471
received chunk for file "pg_stat_tmp/db_0.stat", off 0, len 1188
received chunk for file "pg_multixact/members/0000", off 0, len 8192
received chunk for file "pg_multixact/offsets/0000", off 0, len 8192
received chunk for file "base/12814/12641_vm", off 0, len 8192
received chunk for file "base/12814/12639_fsm", off 0, len 24576
received chunk for file "base/12814/12547_fsm", off 0, len 24576
received chunk for file "base/12814/12620_fsm", off 0, len 24576
received chunk for file "base/12814/12549_fsm", off 0, len 24576
................(output omitted)
received chunk for file "base/12809/12576_fsm", off 0, len 24576
received chunk for file "base/12809/12789_vm", off 0, len 8192
received chunk for file "base/12809/12736_fsm", off 0, len 24576
received chunk for file "base/12809/12664_vm", off 0, len 8192
Done!
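As a sanity check on the transfer log above: each WAL segment is fetched in 1,000,000-byte chunks, and a default 16 MB (16777216-byte) segment works out to exactly 16 full chunks plus the 777216-byte tail that appears as the last chunk of every segment:

```python
# Default WAL segment size is 16 MB; pg_rewind's log above shows it
# being fetched in 1,000,000-byte chunks.
segment_size = 16 * 1024 * 1024   # 16777216 bytes
chunk = 1000000

full, tail = divmod(segment_size, chunk)
print(full, tail)  # 16 777216, matching the off/len values in the log
```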
Attempting to start the database fails :
2013-08-05 13:52:04.424 CST,,,29312,,51ff3d84.7280,1,,2013-08-05 13:52:04 CST,,0,LOG,00000,"database system was interrupted; last known up at 2013-08-05 13:47:00 CST",,,,,,,,"StartupXLOG, xlog.c:4915",""
2013-08-05 13:52:04.424 CST,,,29312,,51ff3d84.7280,2,,2013-08-05 13:52:04 CST,,0,LOG,00000,"entering standby mode",,,,,,,,"StartupXLOG, xlog.c:4968",""
2013-08-05 13:52:04.424 CST,,,29312,,51ff3d84.7280,3,,2013-08-05 13:52:04 CST,,0,LOG,00000,"invalid checkpoint link in backup_label file",,,,,,,,"ReadCheckpointRecord, xlog.c:6364",""
2013-08-05 13:52:04.424 CST,,,29312,,51ff3d84.7280,4,,2013-08-05 13:52:04 CST,,0,FATAL,XX000,"could not locate required checkpoint record",,"If you are not restoring from a backup, try removing the file ""/pgdata/digoal/1921/data03/pg93/pg_root/backup_label"".",,,,,,"StartupXLOG, xlog.c:5047",""
2013-08-05 13:52:04.425 CST,,,29310,,51ff3d84.727e,1,,2013-08-05 13:52:04 CST,,0,LOG,00000,"startup process (PID 29312) exited with exit code 1",,,,,,,,"LogChildExit, postmaster.c:3211",""
2013-08-05 13:52:04.425 CST,,,29310,,51ff3d84.727e,2,,2013-08-05 13:52:04 CST,,0,LOG,00000,"aborting startup due to startup process failure",,,,,,,,"reaper, postmaster.c:2536",""
To resolve this error, remove the backup_label file on the new standby (the old primary) :
rm -f $PGDATA/backup_label*
Starting the database again still fails :
pg93@db-172-16-3-39-> cat postgresql-2013-08-05_135236.csv
2013-08-05 13:52:36.846 CST,,,29333,,51ff3da4.7295,1,,2013-08-05 13:52:36 CST,,0,LOG,00000,"database system was interrupted; last known up at 2013-08-05 13:47:00 CST",,,,,,,,"StartupXLOG, xlog.c:4915",""
2013-08-05 13:52:36.846 CST,,,29333,,51ff3da4.7295,2,,2013-08-05 13:52:36 CST,,0,LOG,00000,"entering standby mode",,,,,,,,"StartupXLOG, xlog.c:4968",""
2013-08-05 13:52:36.846 CST,,,29333,,51ff3da4.7295,3,,2013-08-05 13:52:36 CST,,0,FATAL,XX000,"invalid data in history file: 1 0/92DCDD8 no recovery target specified
",,"Timeline IDs must be in increasing sequence.",,,,,,"readTimeLineHistory, timeline.c:158",""
2013-08-05 13:52:36.846 CST,,,29331,,51ff3da4.7293,1,,2013-08-05 13:52:36 CST,,0,LOG,00000,"startup process (PID 29333) exited with exit code 1",,,,,,,,"LogChildExit, postmaster.c:3211",""
2013-08-05 13:52:36.846 CST,,,29331,,51ff3da4.7293,2,,2013-08-05 13:52:36 CST,,0,LOG,00000,"aborting startup due to startup process failure",,,,,,,,"reaper, postmaster.c:2536",""
This happens because the history file was modified earlier; simply change it back. (Do this on both hosts to avoid further problems later.)
pg93@db-172-16-3-33-> vi 0000000B.history
pg93@db-172-16-3-39-> vi 0000000B.history
1 0/92DCDD8 no recovery target specified
2 0/1CB86338 no recovery target specified
3 0/36E68A20 no recovery target specified
4 0/569ADB88 no recovery target specified
5 0/762CF5D8 no recovery target specified
6 0/9F67C920 no recovery target specified
7 0/A0000090 no recovery target specified
8 1/3F535A0 no recovery target specified
9 1/4000090 no recovery target specified
10 1/6000000 no recovery target specified
Start the database; replication now works normally.
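A timeline history file has one line per timeline switch: the timeline ID, the switch-point LSN, and a reason. The check that failed above ("Timeline IDs must be in increasing sequence.") can be mirrored with a small hypothetical helper (not part of pg_rewind or PostgreSQL) to sanity-check a hand-edited file such as 0000000B.history before restarting:

```python
# Hypothetical helper: verify that the timeline IDs in a history file's
# lines form a strictly increasing sequence, mirroring the check in
# PostgreSQL's readTimeLineHistory() (timeline.c).
def history_tlis_increasing(lines):
    prev = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue                    # skip blank lines
        tli = int(fields[0])            # first field is the timeline ID
        if tli <= prev:
            return False                # "Timeline IDs must be in increasing sequence."
        prev = tli
    return True

good = ["1 0/92DCDD8 no recovery target specified",
        "2 0/1CB86338 no recovery target specified"]
bad = ["2 0/1CB86338 no recovery target specified",   # order swapped,
       "1 0/92DCDD8 no recovery target specified"]    # as after a bad edit

print(history_tlis_increasing(good), history_tlis_increasing(bad))  # True False
```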
[Other issues]
1. You may also run into an integer value overflow :
pg_xlog/0000000B000000010000001E (REMOVE)
unexpected result while sending file list: ERROR: value "2148254528" is out of range for type integer
CONTEXT: COPY fetchchunks, line 28557, column begin: "2148254528"
The error comes from this function :
pg_rewind_master/libpq_fetch.c
/*
 * Fetch all changed blocks from remote source data directory.
 */
void
libpq_executeFileMap(filemap_t *map)
{
	file_entry_t *entry;
	const char *sql;
	PGresult   *res;

	/*
	 * First create a temporary table, and load it with the blocks that
	 * we need to fetch.
	 */
	sql = "create temporary table fetchchunks(path text, begin int4, len int4);";
	res = PQexec(conn, sql);

	if (PQresultStatus(res) != PGRES_COMMAND_OK)
	{
		fprintf(stderr, "error creating temporary table: %s\n",
				PQresultErrorMessage(res));
		exit(1);
	}

	sql = "copy fetchchunks from stdin";
	res = PQexec(conn, sql);

	if (PQresultStatus(res) != PGRES_COPY_IN)
	{
		fprintf(stderr, "unexpected result while sending file list: %s\n",
				PQresultErrorMessage(res));
		exit(1);
	}
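The COPY above loads each chunk's begin offset into the "begin int4" column, and int4 tops out at 2147483647; the offending offset 2148254528 from the error message is just past that limit, so any file offset beyond 2 GB fails. A quick arithmetic check (the implied workaround, not an official patch, would be declaring the column as int8/bigint in the temporary-table DDL):

```python
# The offset that broke COPY, taken from the error message above.
offending = 2148254528

int4_max = 2**31 - 1          # PostgreSQL int4 maximum: 2147483647
int8_max = 2**63 - 1          # PostgreSQL int8 (bigint) maximum

print(offending > int4_max)   # True  -> overflows "begin int4"
print(offending <= int8_max)  # True  -> a bigint column would hold it
```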
[References]
1. http://www.postgresql.org/message-id/flat/519DF910.4020609@vmware.com#519DF910.4020609@vmware.com
2. https://github.com/vmware/pg_rewind