GreenPlum测试环境: 16台主机, 64个segment.
主要对比的是前几天MonetDB vs PostgreSQL vs PostgreSQL+cstore_fdw的场景.
MonetDB的测试环境和测试数据, 测试结果参考
http://blog.163.com/digoal@126/blog/static/163877040201471691955155/
GreenPlum采用列存储, 随机分布, 开启压缩, 压缩级别9.
以下是GreenPlum的测试结果 :
copy方式导入时间:Time: 366566.754 ms , -- 比MonetDB略慢.
(insert into t2 select * from t1)方式导入时间:20922 ms , -- 比monetdb快10秒.
MonetDB :
sql>delete from tt;
50000000 affected row (5.290ms)
sql>insert into tt select * from t1;
50000000 affected row (30.9s)
采用copy方式导入表存储消耗(列存储, 压缩级别9):10265MB --, 比MonetDB小 1.xG
测试SQL:
1. select count(distinct c1) from t1;
Time: 18442.063 ms , 比MonetDB慢10倍
2. select count(distinct c1) from t2;
Time: 18165.555 ms, 比MonetDB慢10倍
3. select count(*) from (select c1,c11,c21,c31,c41,c51,c60 from t1 group by c1,c11,c21,c31,c41,c51,c60) as t;
Time: 4302.028 ms , 比MonetDB慢2秒
4. select count(*) from (select t1.c1 from t1 join t2 on t1.c2=t2.c3) as t;
Time: 3559.810 ms , 比MonetDB快24.5秒
5. select count(*) from (select t1.c1 from t1 join t2 on t1.c2=t2.c3 join t3 on t1.c3=t3.c5) as t;
Time: 46933.672 ms , 比MonetDB快27秒
6. select count(*) from (select t1.c1 from t1 join t2 on t1.c2=t2.c3 join t3 on t1.c3=t3.c5 join t4 on t3.c5=t4.c6) as t;
Time: 78507.539 ms , 比MonetDB快25秒
7. select count(*) from (select t1.c1 from t1 join t2 on t1.c2=t2.c3 join t3 on t2.c3=t3.c4 join t4 on t3.c4=t4.c5 join t5 on t4.c5=t5.c6) as t;
Time: 125883.171 ms , 速度一致
8. select count(*) from ( select t1.c1 from t1 join t2 on t1.c2=t2.c3 join t3 on t2.c3=t3.c4 join t4 on t3.c4=t4.c5 join t5 on t4.c5=t5.c6 where t1.c60 < 100000 ) as t;
Time: 5640.577 ms , 比MonetDB快7秒
9. select count(*) from (
select t1.c1 from t1
join t2 on t1.c2=t2.c3
join t3 on t2.c3=t3.c4
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
join t6 on t5.c5=t6.c6
join t7 on t6.c5=t7.c6
join t8 on t7.c5=t8.c6
join t9 on t8.c5=t9.c6
join t10 on t9.c5=t10.c6
join t11 on t10.c5=t11.c6
join t12 on t11.c5=t12.c6
join t13 on t12.c5=t13.c6
join t14 on t13.c5=t14.c6
join t15 on t14.c5=t15.c6
join t16 on t15.c5=t16.c6
join t17 on t16.c5=t17.c6
) as t;
Time: 658719.790 ms , 比MonetDB慢348秒
10. select count(*) from (
select t1.c1 from t1
join t2 on t1.c2=t2.c3
join t3 on t2.c3=t3.c4
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
join t6 on t5.c5=t6.c6
join t7 on t6.c5=t7.c6
join t8 on t7.c5=t8.c6
join t9 on t8.c5=t9.c6
join t10 on t9.c5=t10.c6
join t11 on t10.c5=t11.c6
join t12 on t11.c5=t12.c6
join t13 on t12.c5=t13.c6
join t14 on t13.c5=t14.c6
join t15 on t14.c5=t15.c6
join t16 on t15.c5=t16.c6
join t17 on t16.c5=t17.c6
where t1.c60 < 100000
) as t;
Time: 25521.288 ms , 比MonetDB快11秒
11. select count(*) from (
select t1.c1 from t1
join t2 on t1.c1=t2.c3
join t3 on t2.c3=t3.c4
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
join t6 on t5.c5=t6.c6
join t7 on t6.c5=t7.c6
join t8 on t7.c5=t8.c6
join t9 on t8.c5=t9.c6
join t10 on t9.c5=t10.c6
join t11 on t10.c5=t11.c6
join t12 on t11.c5=t12.c6
join t13 on t12.c5=t13.c6
join t14 on t13.c5=t14.c6
join t15 on t14.c5=t15.c6
join t16 on t15.c5=t16.c6
join t17 on t16.c5=t17.c6
) as t;
Time: 692865.584 ms , 比MonetDB慢430秒
12. select count(*) from (
select t1.c1 from t1
join t2 on t1.c2=t2.c3
join t3 on t2.c3=t3.c4
join t4 on t3.c4=t4.c5
join t5 on t4.c5=t5.c6
join t6 on t5.c6=t6.c7
join t7 on t6.c7=t7.c8
join t8 on t7.c8=t8.c9
join t9 on t8.c9=t9.c10
join t10 on t9.c10=t10.c11
join t11 on t10.c11=t11.c12
join t12 on t11.c12=t12.c13
join t13 on t12.c13=t13.c14
join t14 on t13.c14=t14.c15
join t15 on t14.c15=t15.c16
join t16 on t15.c16=t16.c17
join t17 on t16.c17=t17.c18
where t1.c1<10
) as t;
1小时未跑出。
GP做多表关联, 关联结果集比较大时, 性能不如MonetDB.
少量的表关联, 因为机器数量占优, 比MonetDB性能要好.
体积较小的单表查询效率不如MonetDB.
体积较大的单表查询效率可能超过MonetDB, 同样依赖于机器的数量.
什么情况下MonetDB性能会下降明显, 当查询的数据扫描量超过内存时, 例如多列的or条件查询, 当多列的存储空间超过内存时.
或者单列的存储空间超过内存大小时. MonetDB性能会有所下降. 例如这个表364亿条记录, 单列达到了135G, 超过内存(96G).
sql>select count(*) from bt5;
+-------------+
| L1 |
+=============+
| 36400000000 |
+-------------+
1 tuple (41.206ms)
sql>select count(*) from bt5 where id=1;
+------+
| L1 |
+======+
| 728 |
+------+
1 tuple (5m 36s)
对于单列小于内存的场景, MonetDB优势很明显. 以96G的内存为例, 存INT列的话, 可以存256亿条记录.
[参考]
1. http://blog.163.com/digoal@126/blog/static/163877040201471691955155/