本文仅简略介绍Deepgreen数据库在TPC-H测试中需要注意的几个步骤,为朋友们以后做测试防雷,详细步骤参见德哥帖子。
1.上传文件并解压缩:
ChrisdeMacBook-Pro:tpch chris$ scp master.zip tpc-h-tool.zip dgadmin@192.168.100.107:/home/dgadmin
dgadmin@192.168.100.107's password:
master.zip 100% 21KB 3.6MB/s 00:00
tpc-h-tool.zip
unzip xxx
2.设置Makefile参数:
[dgadmin@linux1 ~]$ cd tpch_2_17_0/
[dgadmin@linux1 tpch_2_17_0]$ ls
dbgen dev-tools ref_data
[dgadmin@linux1 tpch_2_17_0]$ cd dbgen/
[dgadmin@linux1 dbgen]$ cp makefile.suite Makefile
[dgadmin@linux1 dbgen]$ vim Makefile
主要修改CC、Database、Machine、Workload四行:
################
## CHANGE NAME OF ANSI COMPILER HERE
################
CC = gcc
# Current values for DATABASE are: INFORMIX, DB2, TDAT (Teradata)
# SQLSERVER, SYBASE, ORACLE, VECTORWISE
# Current values for MACHINE are: ATT, DOS, HP, IBM, ICL, MVS,
# SGI, SUN, U2200, VMS, LINUX, WIN32
# Current values for WORKLOAD are: TPCH
DATABASE= ORACLE
MACHINE = LINUX
WORKLOAD = TPCH
3.编译:make
4.生成测试数据:
[dgadmin@linux1 dbgen]$ ./dbgen -s 1 -f
TPC-H Population Generator (Version 2.17.0)
Copyright Transaction Processing Performance Council 1994 - 2010
5.将测试数据转换为postgresql识别的格式,删除末尾的分隔符|
[dgadmin@linux1 dbgen]$ for i in `ls *.tbl`; do sed 's/|$//' $i > ${i/tbl/csv}; done
6.把包含csv文件的目录,软链接到/tmp/dss-data。tpch-pg脚本中一会要用到这个目录
[dgadmin@linux1 dbgen]$ pwd
/home/dgadmin/tpch_2_17_0/dbgen
[dgadmin@linux1 dbgen]$ ln -s /home/dgadmin/tpch_2_17_0/dbgen /tmp/dss-data
7.切换到脚本文件master目录下:
[dgadmin@linux1 ~]$ cd pg_tpch-master/
[dgadmin@linux1 pg_tpch-master]$ ls
drop-caches.sh dss LICENSE process.php README.md tpch.sh
[dgadmin@linux1 pg_tpch-master]$ cd dss/
[dgadmin@linux1 dss]$ ls
templates tpch-alter.sql tpch-create.sql tpch-index.sql tpch-load.sql tpch-pkeys.sql
8.修改tpch-load.sql文件,适应dg语法:
##########COPY命令格式有问题,为了获得更好的效果,使用列存储,修改如下举例##########
) with (APPENDONLY=true,BLOCKSIZE=2097152,ORIENTATION=COLUMN,COMPRESSTYPE=lz4,CHECKSUM=true,OIDS=false);
备注:greenplum的压缩类型为:QuickLZ
单独用一个小文件测试数据是否可用:
psql -d tech
COPY region FROM '/tmp/dss-data/region.csv' WITH csv DELIMITER '|’;
9.把master下的所有文件复制到tpch的dbgen目录下:
[dgadmin@linux1 ~]$ cp -r pg_tpch-master/* tpch_2_17_0/dbgen/
10.在dbgen文件夹下的dss目录下创建文件夹:
[dgadmin@linux1 dbgen]$ mkdir dss/queries
10.生成测试sql:
[dgadmin@linux1 dbgen]$
for q in `seq 1 22`
do
DSS_QUERY=dss/templates ./qgen $q >> dss/queries/$q.sql
sed 's/^select/explain select/' dss/queries/$q.sql > dss/queries/$q.explain.sql
cat dss/queries/$q.sql >> dss/queries/$q.explain.sql;
done
11.连接并创建测试数据库tpch
[dgadmin@linux1 dbgen]$ psql -d postgres -p 15432
psql (8.2.15)
Type "help" for help.
postgres=# create database tpch owner=dgadmin;
CREATE DATABASE
12.设置几个数据库参数:
##########设置几个参数##########
gpconfig -c enable_nestloop -v off
gpconfig -c work_mem -v 256MB
gpstop -u
13.执行测试:
nohup ./tpch.sh ./results tpch dgadmin &
备注:如果报错误或者bench.log报timeout,调整tpch.sh文件中的时间为半小时或者更长即可。
时间: 2024-11-02 08:06:41