背景
PostgreSQL 逻辑备份, 指在线备份数据库数据, DDL以SQL语句形式输出, 数据则可以以SQL语句或者固定分隔符(row格式)的形式输出.
备份时不影响其他用户对备份对象的DML操作.
本文主要介绍一下PostgreSQL提供的逻辑备份工具pg_dump, pg_dumpall, 以及数据库的COPY命令的备份方法.
pg_dump
使用pg_dump进行备份时, 其他用户可以同时进行DML(SELECT, UPDATE, DELETE, INSERT)操作, 相互之间没有干扰.
一、pg_dump备份程序的逻辑,源码分析.
1. pg_dump的一次完整的备份是在一个事务中完成的, 事务隔离级别为serializable 或者 repeatable read. 代码如下 :
ExecuteSqlStatement(fout, "BEGIN");
if (fout->remoteVersion >= 90100)
{
if (serializable_deferrable)
ExecuteSqlStatement(fout,
"SET TRANSACTION ISOLATION LEVEL "
"SERIALIZABLE, READ ONLY, DEFERRABLE");
else
ExecuteSqlStatement(fout,
"SET TRANSACTION ISOLATION LEVEL "
"REPEATABLE READ");
}
else
ExecuteSqlStatement(fout,
"SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");
2. pg_dump在备份数据开始前, 需要对进行备份的对象加ACCESS SHARE锁, 代码如下 :
if (tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)
{
resetPQExpBuffer(query);
appendPQExpBuffer(query,
"LOCK TABLE %s IN ACCESS SHARE MODE",
fmtQualifiedId(fout,
tblinfo[i].dobj.namespace->dobj.name,
tblinfo[i].dobj.name));
ExecuteSqlStatement(fout, query->data);
}
pg_dump加的锁与DDL冲突, 例如TRUNCATE, DROP, ALTER, VACUUM FULL, 以及以及unqualified LOCK TABLE冲突, 所以备份开始后是不能进行这些操作的. 可以防止备份过程中数据结构改变, 或者数据被物理的删除掉了.
正因为pg_dump在备份数据前要对备份对象加锁, 所以为了防止pg_dump无休止的的锁等待, pg_dump支持锁超时.
if (lockWaitTimeout && fout->remoteVersion >= 70300)
{
/*
* Arrange to fail instead of waiting forever for a table lock.
*
* NB: this coding assumes that the only queries issued within the
* following loop are LOCK TABLEs; else the timeout may be undesirably
* applied to other things too.
*/
resetPQExpBuffer(query);
appendPQExpBuffer(query, "SET statement_timeout = ");
appendStringLiteralConn(query, lockWaitTimeout, GetConnection(fout));
ExecuteSqlStatement(fout, query->data);
}
例如 :
SESSION A :
pg93@db-172-16-3-33-> psql
psql (9.3devel)
Type "help" for help.
digoal=# begin;
BEGIN
digoal=# truncate table test;
TRUNCATE TABLE
-- 不结束A事务.
SESSION B :
pg93@db-172-16-3-33-> pg_dump -f ./test.dmp
会一直等SESSION A是否test的锁.
从第三个会话可以看出这个等待.
SESSION C :
digoal=# select query,waiting from pg_stat_activity;
query | waiting
---------------------------------------------+---------
LOCK TABLE public.test IN ACCESS SHARE MODE | t -- 这条就是pg_dump发起的.
truncate table test; | f
select query,waiting from pg_stat_activity; | f
(3 rows)
如果不想让pg_dump一直等待下去, 那么可以使用--lock-wait-timeout参数.
例如以下命令, 等待5秒未成功获得锁则退出pg_dump.
pg93@db-172-16-3-33-> pg_dump -f ./test.dmp --lock-wait-timeout=5s
pg_dump: [archiver (db)] query failed: ERROR: canceling statement due to statement timeout
pg_dump: [archiver (db)] query was: LOCK TABLE public.test IN ACCESS SHARE MODE
3. 一切准备就绪后, pg_dump将开始备份数据.
二、备份的内容格式 :
以PostgreSQL 9.3 为例, pg_dump 支持4种格式 :
p, plain
默认格式, 备份输出为可读的text文本. 还原时在数据库中直接执行备份文本的SQL即可.
c, custom
可自定义的归档格式, 同时默认开启了数据压缩, 还原时可以调整备份对象的还原顺序, 同时支持选择还原的对象.
备份写入到一个文件中. 需要注意文件系统支持的单个文件大小.
这个格式必须使用pg_restore命令进行还原.
d, directory
目录归档格式, 与custom格式类似, 需要使用pg_restore还原. 但是目录归档格式下会创建一个目录, 然后每个表或者每个大对象对应一个备份输出文件.
加上TOC文件名描述备份的详细信息, 这个格式默认支持压缩, 同时支持并行导出.
t, tar
tar归档格式, 不支持压缩, 同时限制每个表最大不能超过8GB, 同样需要使用pg_restore还原.
三、全库一致性备份举例 :
注意全库一致性不是指集群一致性, 一个PostgreSQL 集群中可以创建多个数据库.
pg_dump的全库一致性备份指的是集群中的单个数据库的一致性备份,因为备份不同的数据库需要切换连接,无法在不同的数据库之间共享snapshot,因此只能单库一致.
排他选项 :
使用多次 --exclude-table-data=TABLE 排除不需要备份的表.
使用多次 --exclude-schema=SCHEMA 排除不需要备份的schema.
备份前查看一下备份数据的hash值. 方便还原后对照 :
pg93@db-172-16-3-33-> psql
psql (9.3devel)
Type "help" for help.
digoal=# \dt
List of relations
Schema | Name | Type | Owner
--------+----------------+-------+----------
public | pwd_dictionary | table | postgres
public | tbl_user | table | postgres
public | test | table | postgres
(3 rows)
digoal=# select sum(hashtext(t.*::text)) from pwd_dictionary t;
sum
------------
-719496483
(1 row)
digoal=# select sum(hashtext(t.*::text)) from tbl_user t;
sum
---------------
-131178135551
(1 row)
digoal=# select sum(hashtext(t.*::text)) from test t;
sum
-----
(1 row)
备份命令 :
备份digoal库, DDL中不包含表空间. 所以恢复时不需要提前创建对应的表空间.
pg93@db-172-16-3-33-> pg_dump -f ./digoal.dmp -F p -C -E UTF8 --no-tablespaces -h 127.0.0.1 -p 1999 -U postgres digoal
删除digoal库.
digoal=# \c postgres
You are now connected to database "postgres" as user "postgres".
postgres=# drop database digoal;
DROP DATABASE
还原, 直接执行备份SQL即可 :
pg93@db-172-16-3-33-> psql postgres postgres -f ./digoal.dmp
SET
SET
SET
SET
SET
SET
CREATE DATABASE
ALTER DATABASE
You are now connected to database "digoal" as user "postgres".
SET
SET
SET
SET
SET
SET
CREATE SCHEMA
ALTER SCHEMA
CREATE EXTENSION
COMMENT
SET
CREATE FUNCTION
ALTER FUNCTION
CREATE FUNCTION
ALTER FUNCTION
SET
CREATE TABLE
ALTER TABLE
CREATE TABLE
ALTER TABLE
CREATE SEQUENCE
ALTER TABLE
ALTER SEQUENCE
CREATE TABLE
ALTER TABLE
SET
CREATE TABLE
ALTER TABLE
SET
ALTER TABLE
setval
--------
1
(1 row)
SET
SET
ALTER TABLE
ALTER TABLE
REVOKE
REVOKE
GRANT
GRANT
REVOKE
REVOKE
GRANT
GRANT
检查还原后的hash值, 与备份前一致.
pg93@db-172-16-3-33-> psql
psql (9.3devel)
Type "help" for help.
digoal=# select sum(hashtext(t.*::text)) from pwd_dictionary t;
sum
------------
-719496483
(1 row)
digoal=# select sum(hashtext(t.*::text)) from tbl_user t;
sum
---------------
-131178135551
(1 row)
digoal=# select sum(hashtext(t.*::text)) from test t;
sum
-----
(1 row)
四、非全库一致性备份举例 :
知道pg_dump备份流程后, 可以想象如果数据库庞大, pg_dump备份的时间就会越长, 持锁就会越久. 这对有DDL需求的数据库来说, 可能是无法忍受的.
因此您可能需要将pg_dump的粒度弄小一点, 不要一次备份整库, 例如同时备份有一致性需求或者依赖关系的数据表.
举个例子 :
#!/bin/bash
# 环境变量
PATH=$PATH:$HOME/bin
export PATH
export LANG=en_US.utf8
export PGHOME=/opt/pgsql
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib
export DATE=`date +"%Y%m%d%H%M"`
export PATH=$PGHOME/bin:$PATH:.
# 程序变量
TODAY=`date +%Y%m%d`
EMAIL="digoal@126.com"
BAKBASEDIR="/database/pgbak"
RESERVE_DAY=4
HOST="10.10.10.10"
PORT="1921"
ROLE="postgres"
# 不一致备份, 按单表进行.
for DB in `psql -A -q -t -h $HOST -p $PORT -U $ROLE postgres -c "select datname from pg_database where datname not in ('postgres','template0','template1')"`
do
echo -e "------`date +%F\ %T`----Start Backup----IP:$HOST PORT:$PORT DBNAME:$DB TYPE:$BAKTYPE TO:$BAKBASEDIR------"
for TABLE in `psql -A -q -t -h $HOST -p $PORT -U $ROLE $DB -c "select schemaname||'.'||tablename from pg_tables where schemaname !~ '^pg_' and schemaname <>'information_schema'"`
do
pg_dump -f ${BAKBASEDIR}/${DB}-${TABLE}-${TODAY}.dmp.ing -F c -t $TABLE --lock-wait-timeout=6000 -E UTF8 -h ${HOST} -p ${PORT} -U ${ROLE} -w ${DB}
if [ $? -ne 0 ]; then
echo -e "backup $HOST $PORT $DB $BAKBASEDIR error \n `date +%F%T` \n"|mutt -s "ERROR : PostgreSQL_backup " ${EMAIL}
echo -e "------`date +%F\ %T`----Error Backup----IP:$HOST PORT:$PORT DBNAME:$DB TABLE:$TABLE TO:$BAKBASEDIR------"
rm -f ${BAKBASEDIR}/${DB}-${TABLE}-${TODAY}.dmp.ing
break
fi
mv ${BAKBASEDIR}/${DB}-${TABLE}-${TODAY}.dmp.ing ${BAKBASEDIR}/${DB}-${TABLE}-${TODAY}.dmp
echo -e "------`date +%F\ %T`----Success Backup----IP:$HOST PORT:$PORT DBNAME:$DB TABLE:$TABLE TO:$BAKBASEDIR------"
done
done
echo -e "find ${BAKBASEDIR}/${DB}_${TABLE}_${TODAY}.dmp* -daystart -mtime +${RESERVE_DAY} -delete"
find ${BAKBASEDIR}/${DB}_${TABLE}_${TODAY}.dmp* -daystart -mtime +${RESERVE_DAY} -delete
# 这只是个简单的例子, 每次只备份1张表. 多次调用pg_dump.
# 实际使用时需要考虑业务逻辑, 确保业务数据一致性. 例如某一些有关联的表确保放在同一个pg_dump中导出. 多个-t talbename参数即可.
# 同时需要注意在设计时需要考虑到单表数据量不要太大, 应该考虑分区表. 否则该单表的DDL操作也会加大和pg_dump发生冲突的概率.
# 对于较大的数据库还是建议使用PITR物理增量备份方式. 这个后面会讲到.
五、并行备份举例 :
并行备份首先需要9.3的pg_dump, 服务端需要9.2以及以上支持pg_export_snapshot的版本. 因此并行备份也是一致性备份.
使用并行备份需要n+1个连接, n指-j n指定的并行度, 1 是主连接也就是到处事务状态的连接, 其他连接导入这个事务状态, 进行到处.
对于9.2以前的版本, 如果要并行备份来提高备份速度, 同时又要数据库一致性, 那么请在备份期间不要对备份对象执行dml操作.
首先创建1000个测试表, 插入测试数据 :
digoal=# do language plpgsql $$
digoal$# declare
digoal$# v_sql text;
digoal$# begin
digoal$# for i in 1..1000 loop
digoal$# v_sql := 'create table test_'||i||'(id int, info text)';
digoal$# execute v_sql;
digoal$# v_sql := 'insert into test_'||i||'(id,info) select generate_series(1,1000),''test''';
digoal$# execute v_sql;
digoal$# end loop;
digoal$# end;
digoal$# $$;
备份, 并行度为10, 备份到./paralleldmp目录, 这个目录会自动创建 :
pg93@db-172-16-3-33-> pg_dump -f ./paralleldmp -F d -C -E UTF8 --no-tablespaces -j 10 -h 127.0.0.1 -p 1999 -U postgres digoal
输出每个表一个文件, 加上一个toc文件.
还原测试 :
首先删除digoal库.
pg93@db-172-16-3-33-> psql
psql (9.3devel)
Type "help" for help.
digoal=# \c postgres postgres
You are now connected to database "postgres" as user "postgres".
postgres=# drop database digoal;
DROP DATABASE
还原 :
pg93@db-172-16-3-33-> pg_restore -C -h 127.0.0.1 -p 1999 -U postgres -d postgres -j 10 ~/paralleldmp
检查是否还原 :
pg93@db-172-16-3-33-> psql
psql (9.3devel)
Type "help" for help.
digoal=# \dt
List of relations
Schema | Name | Type | Owner
--------+----------------+-------+----------
public | pwd_dictionary | table | postgres
public | tbl_user | table | postgres
public | test | table | postgres
public | test_1 | table | postgres
public | test_10 | table | postgres
public | test_100 | table | postgres
public | test_1000 | table | postgres
public | test_101 | table | postgres
public | test_102 | table | postgres
public | test_103 | table | postgres
public | test_104 | table | postgres
public | test_105 | table | postgres
public | test_106 | table | postgres
public | test_107 | table | postgres
public | test_108 | table | postgres
public | test_109 | table | postgres
public | test_11 | table | postgres
public | test_110 | table | postgres
public | test_111 | table | postgres
public | test_112 | table | postgres
public | test_113 | table | postgres
public | test_114 | table | postgres
public | test_115 | table | postgres
public | test_116 | table | postgres
public | test_117 | table | postgres
public | test_118 | table | postgres
public | test_119 | table | postgres
public | test_12 | table | postgres
public | test_120 | table | postgres
public | test_121 | table | postgres
public | test_122 | table | postgres
public | test_123 | table | postgres
public | test_124 | table | postgres
public | test_125 | table | postgres
... 略
六、TOC文件定制举例 :
定制TOC文件可以达到调整还原顺序, 开关还原对象的目的.
首先要创建list. 使用pg_restore的 -l 参数. 从directory或dmp文件中创建list. 下面以目录归档为例, 创建list文件 :
pg93@db-172-16-3-33-> pg_restore ~/paralleldmp -l >./toc.list
以下为部分list文件截取 :
分号为注释;
pg93@db-172-16-3-33-> less toc.list
;
; Archive created at Mon May 27 08:58:40 2013
; dbname: digoal
; TOC Entries: 2026
; Compression: -1
; Dump Version: 1.12-0
; Format: UNKNOWN
; Integer: 4 bytes
; Offset: 8 bytes
; Dumped from database version: 9.3devel
; Dumped by pg_dump version: 9.3devel
;
;
; Selected TOC Entries:
;
8744; 1262 26431 DATABASE - digoal postgres
6; 2615 2200 SCHEMA - public postgres
8745; 0 0 COMMENT - SCHEMA public postgres
8746; 0 0 ACL - public postgres
7; 2615 26432 SCHEMA - test postgres
8747; 0 0 ACL - test postgres
1176; 3079 12536 EXTENSION - plpgsql
8748; 0 0 COMMENT - EXTENSION plpgsql
1189; 1255 26433 FUNCTION public alter_role_pwd(name, text) postgres
1190; 1255 26434 FUNCTION public create_role(name, text) postgres
171; 1259 26435 TABLE public pwd_dictionary postgres
172; 1259 26441 TABLE public tbl_user postgres
173; 1259 26444 SEQUENCE public tbl_user_id_seq postgres
8749; 0 0 SEQUENCE OWNED BY public tbl_user_id_seq postgres
174; 1259 26446 TABLE public test postgres
176; 1259 26457 TABLE public test_1 postgres
185; 1259 26511 TABLE public test_10 postgres
275; 1259 27051 TABLE public test_100 postgres
1175; 1259 32451 TABLE public test_1000 postgres
276; 1259 27057 TABLE public test_101 postgres
277; 1259 27063 TABLE public test_102 postgres
278; 1259 27069 TABLE public test_103 postgres
截取一行解释一下 :
8744; 1262 26431 DATABASE - digoal postgres
8744 对应 dumpId
1262 对应 catalogId.tableoid
26431 对应 catalogId.oid
DATABASE 对应 desc
- 对应 te->namespace ? te->namespace : "-"
digoal 对应 tag
postgres 对应 owner
以上TOC文件中entry的意思截取自代码如下 :
src/bin/pg_dump/pg_backup_archiver.c
void
PrintTOCSummary(Archive *AHX, RestoreOptions *ropt)
{
ArchiveHandle *AH = (ArchiveHandle *) AHX;
TocEntry *te;
OutputContext sav;
char *fmtName;
sav = SaveOutput(AH);
if (ropt->filename)
SetOutput(AH, ropt->filename, 0 /* no compression */ );
ahprintf(AH, ";\n; Archive created at %s", ctime(&AH->createDate));
ahprintf(AH, "; dbname: %s\n; TOC Entries: %d\n; Compression: %d\n",
AH->archdbname, AH->tocCount, AH->compression);
switch (AH->format)
{
case archFiles:
fmtName = "FILES";
break;
case archCustom:
fmtName = "CUSTOM";
break;
case archTar:
fmtName = "TAR";
break;
default:
fmtName = "UNKNOWN";
}
ahprintf(AH, "; Dump Version: %d.%d-%d\n", AH->vmaj, AH->vmin, AH->vrev);
ahprintf(AH, "; Format: %s\n", fmtName);
ahprintf(AH, "; Integer: %d bytes\n", (int) AH->intSize);
ahprintf(AH, "; Offset: %d bytes\n", (int) AH->offSize);
if (AH->archiveRemoteVersion)
ahprintf(AH, "; Dumped from database version: %s\n",
AH->archiveRemoteVersion);
if (AH->archiveDumpVersion)
ahprintf(AH, "; Dumped by pg_dump version: %s\n",
AH->archiveDumpVersion);
ahprintf(AH, ";\n;\n; Selected TOC Entries:\n;\n");
/* We should print DATABASE entries whether or not -C was specified */
ropt->createDB = 1;
for (te = AH->toc->next; te != AH->toc; te = te->next)
{
if (ropt->verbose || _tocEntryRequired(te, ropt, true) != 0)
ahprintf(AH, "%d; %u %u %s %s %s %s\n", te->dumpId,
te->catalogId.tableoid, te->catalogId.oid,
te->desc, te->namespace ? te->namespace : "-",
te->tag, te->owner);
if (ropt->verbose && te->nDeps > 0)
{
int i;
ahprintf(AH, ";\tdepends on:");
for (i = 0; i < te->nDeps; i++)
ahprintf(AH, " %d", te->dependencies[i]);
ahprintf(AH, "\n");
}
}
if (ropt->filename)
RestoreOutput(AH, sav);
}
下面调整一下list文件, 根据list文件还原 :
例如注释test_1的表创建和数据还原.
;176; 1259 26457 TABLE public test_1 postgres
;7740; 0 26457 TABLE DATA public test_1 postgres
然后调整顺序
275; 1259 27051 TABLE public test_100 postgres
1175; 1259 32451 TABLE public test_1000 postgres
调整为
1175; 1259 32451 TABLE public test_1000 postgres
275; 1259 27051 TABLE public test_100 postgres
保存toc.list
删除数据库digoal:
digoal=# \c postgres
You are now connected to database "postgres" as user "postgres".
postgres=# drop database digoal;
DROP DATABASE
还原 :
pg93@db-172-16-3-33-> pg_restore -h 127.0.0.1 -p 1999 -U postgres -C -d postgres -j 10 -L ./toc.list -v ~/paralleldmp >./restore.log 2>&1
查看restore.log日志, 注意到顺序调整生效 :
pg_restore: processing item 1175 TABLE test_1000
pg_restore: creating TABLE test_1000
pg_restore: processing item 275 TABLE test_100
pg_restore: creating TABLE test_100
同时进入数据库查看test_1表没有被还原.
digoal=# \d test_1
Did not find any relation named "test_1".
pg_dumpall :
pg_dumpall最主要的是用于备份全局数据, 例如表空间的DDL, 创建用户的DDL.
导出创建用户, 创建表空间的脚本 :
pg93@db-172-16-3-33-> pg_dumpall -g -h 127.0.0.1 -p 1999 -U postgres -f ./global.dmp
内容如下 :
pg93@db-172-16-3-33-> cat global.dmp
--
-- PostgreSQL database cluster dump
--
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
--
-- Roles
--
CREATE ROLE client1;
ALTER ROLE client1 WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB LOGIN NOREPLICATION PASSWORD 'md596bdd340a56d9ab240581edede7a13c6';
CREATE ROLE digoal;
ALTER ROLE digoal WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB LOGIN NOREPLICATION PASSWORD 'md5c08bdd942d14da5ede9d9cef2b17ef9c';
CREATE ROLE gp1;
ALTER ROLE gp1 WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB NOLOGIN NOREPLICATION;
CREATE ROLE new;
ALTER ROLE new WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB NOLOGIN NOREPLICATION PASSWORD 'md54a5ca2a5e9aaed4c781e7d72d7fe945f';
CREATE ROLE postgres;
ALTER ROLE postgres WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN REPLICATION PASSWORD 'md53175bce1d3201d16594cebf9d7eb3f9d';
CREATE ROLE sslcertgroup;
ALTER ROLE sslcertgroup WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB NOLOGIN NOREPLICATION;
CREATE ROLE u4;
ALTER ROLE u4 WITH NOSUPERUSER INHERIT NOCREATEROLE NOCREATEDB NOLOGIN NOREPLICATION PASSWORD 'md5bbc9d1a9f9e201c9c9d3c153f85771cc';
--
-- Role memberships
--
GRANT gp1 TO client1 GRANTED BY postgres;
GRANT sslcertgroup TO client1 GRANTED BY postgres;
GRANT sslcertgroup TO digoal GRANTED BY postgres;
GRANT sslcertgroup TO postgres GRANTED BY postgres;
--
-- Tablespaces
--
CREATE TABLESPACE tbs_digoal OWNER postgres LOCATION '/pgdata/digoal/1921/data03/pg93/1999/tbs_digoal';
--
-- Per-Database Role Settings
--
ALTER ROLE postgres IN DATABASE postgres SET work_mem TO '10240MB';
--
-- PostgreSQL database cluster dump complete
--
COPY :
除了使用以上命令导出数据库之外, 还可以使用SQL语句导出数据, 亦可结合管道使用.
例如 :
导出 :
digoal=# copy test_10 to '/home/pg93/test_10.dmp' with csv header;
COPY 1000
digoal=# select sum(hashtext(t.*::text)) from test_10 t;
sum
------------
-432745392
(1 row)
digoal=# truncate test_10;
TRUNCATE TABLE
导入 :
digoal=# copy test_10 from '/home/pg93/test_10.dmp' with csv header;
COPY 1000
digoal=# select sum(hashtext(t.*::text)) from test_10 t;
sum
------------
-432745392
(1 row)
小结
1.
如果数据库非常庞大, 在做pg_dump备份时, 与ACCESS SHARE锁冲突的SQL将会处于等待状态.
等待直到pg_dump结束(锁的释放需要等待顶级事务块结束).
这个是需要特别注意的.