Greenplum: Expanding the Number of Segments

Expanding the number of Greenplum segments takes three steps in total.
1. Add the new hosts to the cluster (skip this step if you are expanding on existing hosts)
This step mainly involves:
environment configuration, e.g. OS kernel parameters;
creating the gp admin user;
exchanging ssh keys (gpssh-exkeys -e exist_hosts -x new_hosts);
copying the Greenplum bin software;
planning the segment data directories;
checking the hosts with gpcheck (gpcheck -f new_hosts);
checking performance with gpcheckperf (gpcheckperf -f new_hosts_file -d /data1 -d /data2 -v)

2. Initialize the segments and add them to the cluster
This step mainly involves:
generating the configuration file (gpexpand -f new_hosts_file), or writing it by hand;
initializing the segment databases in the specified directories (gpexpand -i cnf -D dbname);
adding the new segment information to the master catalog;
handling the case where the expansion fails.

3. Redistribute the tables
planning the redistribution priority of the tables;
redistributing the table data across the new set of segments;
analyzing the tables.

The examples below cover two scenarios:
one adds segments on hosts already in the cluster;
the other adds a new host and then adds segments.

Test environment
5 virtual machines, each with 16GB of RAM, running 3 segments apiece.
digoal=# select * from gp_segment_configuration ;
 dbid | content | role | preferred_role | mode | status | port  |     hostname      |      address      | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+-------------------+-------------------+------------------+------------
    1 |      -1 | p    | p              | s    | u      |  1921 | digoal193096.zmf | digoal193096.zmf |                  |
    2 |       0 | p    | p              | s    | u      | 40000 | digoal193096.zmf | digoal193096.zmf |                  |
    5 |       3 | p    | p              | s    | u      | 40000 | digoal199092.zmf | digoal199092.zmf |                  |
    8 |       6 | p    | p              | s    | u      | 40000 | digoal200164.zmf | digoal200164.zmf |                  |
   11 |       9 | p    | p              | s    | u      | 40000 | digoal204016.zmf | digoal204016.zmf |                  |
   14 |      12 | p    | p              | s    | u      | 40000 | digoal204063.zmf | digoal204063.zmf |                  |
    3 |       1 | p    | p              | s    | u      | 40001 | digoal193096.zmf | digoal193096.zmf |                  |
    6 |       4 | p    | p              | s    | u      | 40001 | digoal199092.zmf | digoal199092.zmf |                  |
    9 |       7 | p    | p              | s    | u      | 40001 | digoal200164.zmf | digoal200164.zmf |                  |
   12 |      10 | p    | p              | s    | u      | 40001 | digoal204016.zmf | digoal204016.zmf |                  |
   15 |      13 | p    | p              | s    | u      | 40001 | digoal204063.zmf | digoal204063.zmf |                  |
    4 |       2 | p    | p              | s    | u      | 40002 | digoal193096.zmf | digoal193096.zmf |                  |
    7 |       5 | p    | p              | s    | u      | 40002 | digoal199092.zmf | digoal199092.zmf |                  |
   10 |       8 | p    | p              | s    | u      | 40002 | digoal200164.zmf | digoal200164.zmf |                  |
   13 |      11 | p    | p              | s    | u      | 40002 | digoal204016.zmf | digoal204016.zmf |                  |
   16 |      14 | p    | p              | s    | u      | 40002 | digoal204063.zmf | digoal204063.zmf |                  |
(16 rows)

Suppose I want to add 3 segments in place, using only 3 of these hosts.
Since no new hosts are involved, we can go straight to step 2.
Create a host file listing the hosts that will receive the new segments
$vi host
digoal193096.zmf
digoal199092.zmf
digoal200164.zmf

Generate the configuration file
$gpexpand -f ./host
Would you like to initiate a new System Expansion Yy|Nn (default=N):
> y

How many new primary segments per host do you want to add? (default=0):  -- how many segments to add per host
> 1

Enter new primary data directory 1:  -- segment data directory
> /data01/gpdata

Input configuration files were written to 'gpexpand_inputfile_20151217_162323' and 'None'.
Please review the file and make sure that it is correct then re-run
with: gpexpand -i gpexpand_inputfile_20151217_162323 -D digoal

The generated configuration file contains:
$cat gpexpand_inputfile_20151217_162323
digoal193096.zmf:digoal193096.zmf:40003:/data01/gpdata/gpseg15:17:15:p
digoal199092.zmf:digoal199092.zmf:40003:/data01/gpdata/gpseg16:18:16:p
digoal200164.zmf:digoal200164.zmf:40003:/data01/gpdata/gpseg17:19:17:p

Each line contains the following fields:
hostname             host name
address              similar to the hostname (the address used to reach the segment)
port                 segment listen port
fselocation          segment data directory; note this must be a full path
dbid                 unique ID within the GP cluster; look it up in gp_segment_configuration; must increase sequentially
content              also obtained from gp_segment_configuration; must increase sequentially
preferred_role       role, p or m (primary, mirror)
replication_port     port used for replication; not needed when there is no mirror.

If anything above looks wrong, you can edit the file by hand.
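
Before running gpexpand, the field rules above can be verified mechanically. This is a hypothetical helper (not part of the Greenplum tooling), written for the 7-field, no-mirror format used here; it checks the field count and that dbid and content each increase by exactly 1:

```shell
#!/bin/sh
# Hypothetical validator for a gpexpand input file (not a Greenplum tool).
# Checks that every line has 7 colon-separated fields and that dbid
# (field 5) and content (field 6) each increase by exactly 1 per line.
validate_inputfile() {
  awk -F: '
    NF != 7 { print "line " NR ": expected 7 fields, got " NF; bad = 1 }
    prev_dbid != "" && $5 != prev_dbid + 1 { print "line " NR ": dbid " $5 " is not sequential"; bad = 1 }
    prev_cont != "" && $6 != prev_cont + 1 { print "line " NR ": content " $6 " is not sequential"; bad = 1 }
    { prev_dbid = $5; prev_cont = $6 }
    END { exit bad }
  ' "$1"
}
```

For example, `validate_inputfile gpexpand_inputfile_20151217_162323 && echo OK` prints OK when the file passes.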

Next, adjust the permissions on the Greenplum bin directory; gpexpand needs to write some files there.
chmod -R 700 /opt/gpdb

Run gpexpand to perform the expansion.
$gpexpand -i ./gpexpand_inputfile_20151217_162323 -D digoal -S -V -v -n 1 -B 1 -t /tmp
Explanation of the options:
-B <batch_size>
 Batch size of remote commands to send to a given host before
 making a one-second pause. Default is 16. Valid values are 1-128.
 The gpexpand utility issues a number of setup commands that may exceed
 the host's maximum threshold for authenticated connections as defined
 by MaxStartups in the SSH daemon configuration. The one-second pause
 allows authentications to be completed before gpexpand issues any
 more commands. The default value does not normally need to be changed.
 However, it may be necessary to reduce the maximum number of commands
 if gpexpand fails with connection errors such as
 'ssh_exchange_identification: Connection closed by remote host.'

-D <database_name>
 Specifies the database in which to create the expansion schema
 and tables. If this option is not given, the setting for the
 environment variable PGDATABASE is used. The database templates
 template1 and template0 cannot be used.

-i | --input <input_file>
 Specifies the name of the expansion configuration file, which contains
 one line for each segment to be added in the format of:

  <hostname>:<address>:<port>:<fselocation>:<dbid>:<content>:<preferred_role>:<replication_port>

-n <parallel_processes>
 The number of tables to redistribute simultaneously. Valid values
 are 1 - 16. Each table redistribution process requires two database
 connections: one to alter the table, and another to update the table's
 status in the expansion schema. Before increasing -n, check the current
 value of the server configuration parameter max_connections and make
 sure the maximum connection limit is not exceeded.

-S | --simple_progress
 Show simple progress view.

-t | --tardir <directory>
 Specify the temporary directory on segment hosts to put tar file.

-v | --verbose
 Verbose debugging output. With this option, the utility will output
 all DDL and DML used to expand the database.

-V | --novacuum
 Do not vacuum catalog tables before creating schema copy.
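
The -n guidance above ("each table redistribution process requires two database connections") can be turned into a quick arithmetic check. A minimal sketch, assuming example values for max_connections and the number of other active sessions (this is a hypothetical helper, not a Greenplum tool):

```shell
#!/bin/sh
# Hypothetical back-of-the-envelope check: do N parallel redistribution
# processes fit under max_connections? Each process needs 2 connections
# (one to alter the table, one to update its status row).
fits_max_connections() {
  n=$1; max_connections=$2; other_sessions=$3
  needed=$((2 * n + other_sessions))
  [ "$needed" -le "$max_connections" ]
}
```

For example, `fits_max_connections 16 250 50 && echo "ok to use -n 16"` (250 and 50 are assumed example values, not from this cluster).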

Output on success looks like this
....
20151217:16:50:16:012900 gpexpand:digoal193096:digoal-[DEBUG]:-Transitioning from PREPARE_EXPANSION_SCHEMA_STARTED to PREPARE_EXPANSION_SCHEMA_DONE
20151217:16:50:16:012900 gpexpand:digoal193096:digoal-[DEBUG]:-Transitioning from PREPARE_EXPANSION_SCHEMA_DONE to EXPANSION_PREPARE_DONE
20151217:16:50:16:012900 gpexpand:digoal193096:digoal-[DEBUG]:-Removing segment configuration backup file
20151217:16:50:16:012900 gpexpand:digoal193096:digoal-[INFO]:-Stopping Greenplum Database
20151217:16:50:33:012900 gpexpand:digoal193096:digoal-[INFO]:-Starting Greenplum Database
20151217:16:50:44:012900 gpexpand:digoal193096:digoal-[INFO]:-************************************************
20151217:16:50:44:012900 gpexpand:digoal193096:digoal-[INFO]:-Initialization of the system expansion complete.
20151217:16:50:44:012900 gpexpand:digoal193096:digoal-[INFO]:-To begin table expansion onto the new segments
20151217:16:50:44:012900 gpexpand:digoal193096:digoal-[INFO]:-rerun gpexpand
20151217:16:50:44:012900 gpexpand:digoal193096:digoal-[INFO]:-************************************************
20151217:16:50:44:012900 gpexpand:digoal193096:digoal-[INFO]:-Exiting...

What if the previous step fails?
Start the database in restricted mode and roll back.
$ gpstart -R
$ gpexpand --rollback -D digoal
$ gpstart -a
Then find and fix the problem, and retry the previous step until it succeeds.

After it succeeds, you can see that the cluster has more segments
digoal=# select * from gp_segment_configuration ;
 dbid | content | role | preferred_role | mode | status | port  |     hostname      |      address      | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+-------------------+-------------------+------------------+------------
    1 |      -1 | p    | p              | s    | u      |  1921 | digoal193096.zmf | digoal193096.zmf |                  |
    2 |       0 | p    | p              | s    | u      | 40000 | digoal193096.zmf | digoal193096.zmf |                  |
    5 |       3 | p    | p              | s    | u      | 40000 | digoal199092.zmf | digoal199092.zmf |                  |
    8 |       6 | p    | p              | s    | u      | 40000 | digoal200164.zmf | digoal200164.zmf |                  |
   11 |       9 | p    | p              | s    | u      | 40000 | digoal204016.zmf | digoal204016.zmf |                  |
   14 |      12 | p    | p              | s    | u      | 40000 | digoal204063.zmf | digoal204063.zmf |                  |
    3 |       1 | p    | p              | s    | u      | 40001 | digoal193096.zmf | digoal193096.zmf |                  |
    6 |       4 | p    | p              | s    | u      | 40001 | digoal199092.zmf | digoal199092.zmf |                  |
    9 |       7 | p    | p              | s    | u      | 40001 | digoal200164.zmf | digoal200164.zmf |                  |
   12 |      10 | p    | p              | s    | u      | 40001 | digoal204016.zmf | digoal204016.zmf |                  |
   15 |      13 | p    | p              | s    | u      | 40001 | digoal204063.zmf | digoal204063.zmf |                  |
    4 |       2 | p    | p              | s    | u      | 40002 | digoal193096.zmf | digoal193096.zmf |                  |
    7 |       5 | p    | p              | s    | u      | 40002 | digoal199092.zmf | digoal199092.zmf |                  |
   10 |       8 | p    | p              | s    | u      | 40002 | digoal200164.zmf | digoal200164.zmf |                  |
   13 |      11 | p    | p              | s    | u      | 40002 | digoal204016.zmf | digoal204016.zmf |                  |
   16 |      14 | p    | p              | s    | u      | 40002 | digoal204063.zmf | digoal204063.zmf |                  |
   17 |      15 | p    | p              | s    | u      | 40003 | digoal193096.zmf | digoal193096.zmf |                  |
   18 |      16 | p    | p              | s    | u      | 40003 | digoal199092.zmf | digoal199092.zmf |                  |
   19 |      17 | p    | p              | s    | u      | 40003 | digoal200164.zmf | digoal200164.zmf |                  |
(19 rows)
Until the data has been redistributed, the new segments do not serve the existing data.

Next, plan the order in which tables are scheduled for redistribution
digoal=# select * from gpexpand.
gpexpand.status              gpexpand.status_detail       gpexpand.expansion_progress
digoal=# select * from gpexpand.status;
   status   |          updated
------------+----------------------------
 SETUP      | 2015-12-17 16:50:07.15973
 SETUP DONE | 2015-12-17 16:50:16.427367
(2 rows)
View the upcoming tasks; to change their order, just update the rank column.
digoal=# select * from gpexpand.status_detail ;
 dbname |   fq_name    | schema_oid | table_oid | distribution_policy | distribution_policy_names | distribution_policy_coloids | storage_options | rank |   status    | expansion_started | expansion_finished | source_bytes
--------+--------------+------------+-----------+---------------------+---------------------------+-----------------------------+-----------------+------+-------------+-------------------+--------------------+--------------
 digoal | public.test  |       2200 |     17156 | {1}                 | id                        | 17156                       |                 |    2 | NOT STARTED |                   |                    |            0
 digoal | public.test1 |       2200 |     17182 | {1}                 | id                        | 17182                       |                 |    2 | NOT STARTED |                   |                    |            0
(2 rows)
For example:
=> UPDATE gpexpand.status_detail SET rank=10;
=> UPDATE gpexpand.status_detail SET rank=1 WHERE fq_name = 'public.lineitem';
=> UPDATE gpexpand.status_detail SET rank=2 WHERE fq_name = 'public.orders';
These commands lower the priority of all tables to 10 and then assign a rank of 1 to lineitem and a rank of 2 to orders.
When table redistribution begins, lineitem is redistributed first, followed by orders and all other tables in gpexpand.status_detail.
To exclude a table from redistribution, remove the table from gpexpand.status_detail.

See how many tables have not yet been redistributed
digoal=# select * from gpexpand.expansion_progress ;
    name     | value
-------------+-------
 Tables Left | 2
(1 row)

Run the redistribution command. You must specify either how long it may run or when it must finish, and the utility schedules the redistribution automatically.
$ gpexpand -a -d 1:00:00 -D digoal -S -t /tmp -v -n 1

Explanation of the command:
To begin the redistribution phase, you must run gpexpand with either
the -d (duration) or -e (end time) options. Until the specified end
time or duration is reached, the utility will redistribute tables in
the expansion schema. Each table is reorganized using ALTER TABLE
commands to rebalance the tables across new segments, and to set
tables to their original distribution policy. If gpexpand completes
the reorganization of all tables before the specified duration,
it displays a success message and ends. 

NOTE: Data redistribution should be performed during low-use hours.
Redistribution can be divided into batches over an extended period. 

-a | --analyze
 Run ANALYZE to update the table statistics after expansion.
 The default is to not run ANALYZE.

-d | --duration <hh:mm:ss>
 Duration of the expansion session from beginning to end.

-D <database_name>
 Specifies the database in which to create the expansion schema
 and tables. If this option is not given, the setting for the
 environment variable PGDATABASE is used. The database templates
 template1 and template0 cannot be used.

-e | --end '<YYYY-MM-DD hh:mm:ss>'
 Ending date and time for the expansion session.

-S | --simple_progress
 Show simple progress view.

-t | --tardir <directory>
 Specify the temporary directory on segment hosts to put tar file.

-v | --verbose
 Verbose debugging output. With this option, the utility will output
 all DDL and DML used to expand the database.

-n <parallel_processes>
 The number of tables to redistribute simultaneously. Valid values
 are 1 - 16. Each table redistribution process requires two database
 connections: one to alter the table, and another to update the table's
 status in the expansion schema. Before increasing -n, check the current
 value of the server configuration parameter max_connections and make
 sure the maximum connection limit is not exceeded.

During the redistribution you can watch the progress.
digoal=# select * from gpexpand.expansion_progress ;
      name       | value
-----------------+-------
 Tables Expanded | 1
 Tables Left     | 1
(2 rows)

digoal=# select * from gpexpand.status_detail ;
 dbname |   fq_name    | schema_oid | table_oid | distribution_policy | distribution_policy_names | distribution_policy_coloids | storage_options | rank |   status    |     expansion_started     |     expansion_finished     | source_byte
s
--------+--------------+------------+-----------+---------------------+---------------------------+-----------------------------+-----------------+------+-------------+---------------------------+----------------------------+------------
--
 digoal | public.test  |       2200 |     17156 | {1}                 | id                        | 17156                       |                 |    2 | NOT STARTED |                           |                            |
0
 digoal | public.test1 |       2200 |     17182 | {1}                 | id                        | 17182                       |                 |    2 | COMPLETED   | 2015-12-17 17:12:12.43088 | 2015-12-17 17:13:27.335207 |
0
(2 rows)

Or watch the progress on the command line
20151217:17:12:11:020043 gpexpand:digoal193096:digoal-[DEBUG]:-['digoal', 'public.test1', 2200L, 17182L, '{1}', 'id', '17182', None, 2, 'NOT STARTED', None, None, Decimal('0')]
20151217:17:12:11:020043 gpexpand:digoal193096:digoal-[DEBUG]:-Adding cmd to work_queue: None
20151217:17:12:11:020043 gpexpand:digoal193096:digoal-[DEBUG]:-['digoal', 'public.test', 2200L, 17156L, '{1}', 'id', '17156', None, 2, 'NOT STARTED', None, None, Decimal('0')]
20151217:17:12:11:020043 gpexpand:digoal193096:digoal-[DEBUG]:-Adding cmd to work_queue: None
20151217:17:12:11:020043 gpexpand:digoal193096:digoal-[DEBUG]:-woke up.  queue: 2 finished 0
....
20151217:17:14:36:020043 gpexpand:digoal193096:digoal-[DEBUG]:-woke up.  queue: 2 finished 1
20151217:17:14:40:020043 gpexpand:digoal193096:digoal-[INFO]:-Analyzing public.test
20151217:17:14:41:020043 gpexpand:digoal193096:digoal-[DEBUG]:-woke up.  queue: 2 finished 1
20151217:17:14:43:020043 gpexpand:digoal193096:digoal-[INFO]:-Finished expanding digoal.public.test
20151217:17:14:43:020043 gpexpand:digoal193096:digoal-[DEBUG]:-UPDATE gpexpand.status_detail
                  SET status = 'COMPLETED', expansion_started='2015-12-17 17:13:29.258085', expansion_finished='2015-12-17 17:14:43.552232'
                  WHERE dbname = 'digoal' AND schema_oid = 2200
                  AND table_oid = 17156
20151217:17:14:44:020043 gpexpand:digoal193096:digoal-[DEBUG]:-[worker0] finished cmd: name cmdStr='None'
20151217:17:14:46:020043 gpexpand:digoal193096:digoal-[DEBUG]:-WorkerPool haltWork()
20151217:17:14:46:020043 gpexpand:digoal193096:digoal-[DEBUG]:-[worker0] haltWork
...
20151217:17:14:54:020043 gpexpand:digoal193096:digoal-[INFO]:-EXPANSION COMPLETED SUCCESSFULLY
20151217:17:14:54:020043 gpexpand:digoal193096:digoal-[INFO]:-Exiting...

The last step: clean up the gpexpand schema created for the redistribution.
$ gpexpand -c -D digoal

It asks whether you want to dump the status information to a file before the gpexpand schema is removed.
Do you want to dump the gpexpand.status_detail table to file? Yy|Nn (default=Y):
> y

Example 2: expand again, adding 6 segments and one new host, so that every host ends up with 4 segments.
The difference from example 1 is the new host, which requires some extra preparation.

Process outline
environment configuration, e.g. OS kernel parameters;
creating the gp admin user;
exchanging ssh keys (gpssh-exkeys -e exist_hosts -x new_hosts);
copying the Greenplum bin software;
planning the segment data directories;
checking the hosts with gpcheck (gpcheck -f new_hosts);
checking performance with gpcheckperf (gpcheckperf -f new_hosts_file -d /data1 -d /data2 -v)

Detailed procedure:
On the new host, do the following
# yum -y install rsync coreutils glib2 lrzsz sysstat e4fsprogs xfsprogs ntp readline-devel zlib zlib-devel openssl openssl-devel pam-devel libxml2-devel libxslt-devel python-devel tcl-devel gcc make smartmontools flex bison perl perl-devel perl-ExtUtils* OpenIPMI-tools openldap openldap-devel logrotate

# vi /etc/sysctl.conf
kernel.shmmax = 68719476736
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 50100 64128000 50100 1280
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.ip_forward = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 1025 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.overcommit_memory = 2
fs.file-max = 7672460
net.ipv4.netfilter.ip_conntrack_max = 655360
fs.aio-max-nr = 1048576
net.ipv4.tcp_keepalive_time = 72
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_intvl = 7

# sysctl -p

# vi /etc/security/limits.conf
* soft nofile 131072
* hard nofile 131072
* soft nproc 131072
* hard nproc 131072
* soft    memlock unlimited
* hard    memlock unlimited

# rm -f /etc/security/limits.d/90-nproc.conf

List the IPs and hostnames of every host in /etc/hosts, using the real hostnames. (Run this on all nodes.)
# vi /etc/hosts
127.0.0.1  localhost
10.99.193.96  digoal193096.zmf
10.99.199.92  digoal199092.zmf
10.99.200.164  digoal200164.zmf
10.99.204.16  digoal204016.zmf
10.99.204.63  digoal204063.zmf
10.99.209.198  digoal209198.zmf
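
Since a missing /etc/hosts entry is a common reason for the later checks to fail, it can be worth confirming that every host in the host file has an entry. A hypothetical helper (not part of the Greenplum tooling), parameterized on both files:

```shell
#!/bin/sh
# Hypothetical helper: verify every hostname in a host file appears in an
# /etc/hosts-style file. Prints the missing ones; returns non-zero if any.
check_hosts() {
  hostfile=$1; etc_hosts=$2; missing=0
  while read -r h; do
    [ -z "$h" ] && continue
    # -F: fixed-string match, -w: match whole words only
    grep -Fqw "$h" "$etc_hosts" || { echo "MISSING: $h"; missing=1; }
  done < "$hostfile"
  return $missing
}
```

For example, `check_hosts host_new /etc/hosts || echo "fix /etc/hosts first"`.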

# Filesystem
ext4 mount options
noatime,nodiratime,nobarrier,discard,nodelalloc,data=writeback

Set the block-device readahead
# /sbin/blockdev --setra 16384 /dev/xvda1

Create a user to manage Greenplum; here we use digoal.

Create a directory for the GP software and give the Greenplum admin user write permission; you can also simply use the user's HOME directory, e.g. /home/digoal/greenplum-db-4.3.6.1.

Create a directory for the database data and give the Greenplum admin user write permission
# mkdir -p /data01/gpdata
# chown -R digoal /data01/gpdata
# chmod -R 700 /data01/gpdata

Run the following on the master node.
Create the host files, covering all existing nodes (including the master itself) and the new host
$ vi host_exist
digoal193096.zmf
digoal199092.zmf
digoal200164.zmf
digoal204016.zmf
digoal204063.zmf

$ vi host_new
digoal209198.zmf

Exchange keys so that the master, as the gp admin user (digoal), can reach every segment without a password; the master's public key is appended to each segment's authorized_keys.
$ gpssh-exkeys -e host_exist -x host_new

Install the software on the new segment host
$gpseginstall -f ./host_new -u digoal

Check the host with gpcheck
$ gpcheck -f host_new

Check performance with gpcheckperf
$ gpcheckperf -f host_new -d /data01/gpdata -v

The steps from here on are much the same as before:
$vi host
digoal204016.zmf
digoal204063.zmf
digoal209198.zmf

Generate the configuration file
$gpexpand -f ./host

The generated configuration file contains:
$cat gpexpand_inputfile_20151217_173855
digoal209198.zmf:digoal209198.zmf:40000:/data01/gpdata/gpseg20:22:20:p
digoal209198.zmf:digoal209198.zmf:40001:/data01/gpdata/gpseg21:23:21:p
digoal209198.zmf:digoal209198.zmf:40002:/data01/gpdata/gpseg22:24:22:p
digoal209198.zmf:digoal209198.zmf:40003:/data01/gpdata/gpseg23:25:23:p
digoal193096.zmf:digoal193096.zmf:40004:/data01/gpdata/gpseg24:26:24:p
digoal199092.zmf:digoal199092.zmf:40004:/data01/gpdata/gpseg25:27:25:p
digoal200164.zmf:digoal200164.zmf:40004:/data01/gpdata/gpseg26:28:26:p
digoal204016.zmf:digoal204016.zmf:40004:/data01/gpdata/gpseg27:29:27:p
digoal204063.zmf:digoal204063.zmf:40004:/data01/gpdata/gpseg28:30:28:p
digoal209198.zmf:digoal209198.zmf:40004:/data01/gpdata/gpseg29:31:29:p

This needs manual adjustment (dbid and contentid must both be contiguous; check gp_segment_configuration) (ports must not conflict on the same host):
digoal204016.zmf:digoal204016.zmf:40004:/data01/gpdata/gpseg20:22:20:p
digoal204063.zmf:digoal204063.zmf:40004:/data01/gpdata/gpseg21:23:21:p
digoal209198.zmf:digoal209198.zmf:40000:/data01/gpdata/gpseg22:24:22:p
digoal209198.zmf:digoal209198.zmf:40001:/data01/gpdata/gpseg23:25:23:p
digoal209198.zmf:digoal209198.zmf:40002:/data01/gpdata/gpseg24:26:24:p
digoal209198.zmf:digoal209198.zmf:40003:/data01/gpdata/gpseg25:27:25:p
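
The same-host port rule is also easy to check mechanically. A hypothetical helper (not a Greenplum tool) that prints any host:port pair used more than once within the input file (it does not check against ports already in use by existing segments):

```shell
#!/bin/sh
# Hypothetical helper: detect host:port conflicts in a gpexpand input file.
# Prints each duplicated host:port pair; prints nothing when there is none.
port_conflicts() {
  awk -F: '{ print $1 ":" $3 }' "$1" | sort | uniq -d
}
```

For example, `[ -z "$(port_conflicts gpexpand_inputfile_20151217_173855)" ] && echo "no conflicts"`.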

Next, adjust the permissions on the Greenplum bin directory; gpexpand needs to write some files there.
chmod -R 700 /opt/gpdb

Run gpexpand to perform the expansion.
$ gpexpand -i ./gpexpand_inputfile_20151217_173855 -D digoal -S -V -v -n 1 -B 1 -t /tmp

Run the redistribution command, specifying either how long it may run or when it must finish; the utility schedules the redistribution automatically.
$ gpexpand -a -d 1:00:00 -D digoal -S -t /tmp -v -n 1

