PostgreSQL 空间、多维序列生成方法

背景

数据库的一维序列是很好理解的东西，就是在一个维度上自增。

那么二维、多维序列怎么理解呢？显然就是在多个维度上齐头并进的自增咯。

二维序列

以二维序列为例，应该是这样增长的：

0,0
0,1
1,0
1,1
1,2
2,1
2,2
...

那么如何生成以上二维序列呢？实际上可以利用数据库的多个一维序列来生成。

create sequence seq1;
create sequence seq2;  

create or replace function seq_2d() returns point[] as $$
declare
  res point[];
begin
  select array_cat(res, array[point(nextval('seq1'), nextval('seq2'))]) into res;
  select array_cat(res, array[point(currval('seq1')+1, currval('seq2'))]) into res;
  select array_cat(res, array[point(currval('seq1'), currval('seq2')+1)]) into res;
  return res;
end;
$$ language plpgsql strict;

测试

test=# select seq_2d();
          seq_2d
---------------------------
 {"(1,1)","(2,1)","(1,2)"}
(1 row)  

test=# select seq_2d();
          seq_2d
---------------------------
 {"(2,2)","(3,2)","(2,3)"}
(1 row)  

test=# select seq_2d();
          seq_2d
---------------------------
 {"(3,3)","(4,3)","(3,4)"}
(1 row)

三维序列

三维序列的生成方法类似：

0,0,0
1,0,0
0,1,0
0,0,1
1,1,0
0,1,1
1,0,1  

1,1,1
2,1,1
1,2,1
1,1,2
2,2,1
1,2,2
2,1,2  

2,2,2
......

create sequence seq1;
create sequence seq2;
create sequence seq3;
create extension cube;  

create or replace function seq_3d() returns cube[] as $$
declare
  res cube[];
begin
  select array_cat(res, array[cube(array[nextval('seq1'), nextval('seq2'), nextval('seq3')])]) into res;
  select array_cat(res, array[cube(array[currval('seq1')+1, currval('seq2'), currval('seq3')])]) into res;
  select array_cat(res, array[cube(array[currval('seq1'), currval('seq2')+1, currval('seq3')])]) into res;
  select array_cat(res, array[cube(array[currval('seq1'), currval('seq2'), currval('seq3')+1])]) into res;
  select array_cat(res, array[cube(array[currval('seq1')+1, currval('seq2')+1, currval('seq3')])]) into res;
  select array_cat(res, array[cube(array[currval('seq1'), currval('seq2')+1, currval('seq3')+1])]) into res;
  select array_cat(res, array[cube(array[currval('seq1')+1, currval('seq2'), currval('seq3')+1])]) into res;
  return res;
end;
$$ language plpgsql strict;

例子

test=# select seq_3d();
                                        seq_3d
---------------------------------------------------------------------------------------
 {"(1, 1, 1)","(2, 1, 1)","(1, 2, 1)","(1, 1, 2)","(2, 2, 1)","(1, 2, 2)","(2, 1, 2)"}
(1 row)  

test=# select seq_3d();
                                        seq_3d
---------------------------------------------------------------------------------------
 {"(2, 2, 2)","(3, 2, 2)","(2, 3, 2)","(2, 2, 3)","(3, 3, 2)","(2, 3, 3)","(3, 2, 3)"}
(1 row)

多维序列

以此类推，可以得到多维序列。

多维数据的空间存放和BRIN块级索引

《PostgreSQL 黑科技 - 空间聚集存储》

前面讲到了空间聚集存储，如果数据按空间顺序存放，使用BRIN块级索引，可以在任意维度上得到最好的查询效率，真正做到一个块级索引支持任意列的高效过滤。

例子

create sequence seq1;
create sequence seq2;
create sequence seq3;  

create table tbl(c1 int, c2 int, c3 int);  

create or replace function cluster_insert() returns void as $$
declare
begin
  insert into tbl values (nextval('seq1'), nextval('seq2'), nextval('seq3'));
  insert into tbl values (currval('seq1')+1, currval('seq2'), currval('seq3'));
  insert into tbl values (currval('seq1'), currval('seq2')+1, currval('seq3'));
  insert into tbl values (currval('seq1'), currval('seq2'), currval('seq3')+1);
  insert into tbl values (currval('seq1')+1, currval('seq2')+1, currval('seq3'));
  insert into tbl values (currval('seq1'), currval('seq2')+1, currval('seq3')+1);
  insert into tbl values (currval('seq1')+1, currval('seq2'), currval('seq3')+1);
end;
$$ language plpgsql strict;

压测，写入大量数据

vi test.sql
select count(*) from (select cluster_insert() from generate_series(1,100)) t;  

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 32 -j 32 -T 1200

检查多维聚集性

test=# select * from tbl limit 10;
   c1    |   c2    |   c3
---------+---------+---------
 1992652 | 1992653 | 1992652
 1992573 | 1992574 | 1992578
 1992574 | 1992574 | 1992578
 1992573 | 1992575 | 1992578
 1992573 | 1992574 | 1992579
 1992574 | 1992575 | 1992578
 1992573 | 1992575 | 1992579
 1992574 | 1992574 | 1992579
 1992658 | 1992658 | 1992658
 1992659 | 1992658 | 1992658
(10 rows)

创建BRIN块级索引。

create index idx on tbl using brin (c1,c2,c3);

test=# \dt+ tbl
                             List of relations
 Schema |        Name         | Type  |  Owner   |    Size    | Description
--------+---------------------+-------+----------+------------+-------------
 public | tbl                 | table | postgres | 97 GB      |
(1 row)  

test=# \di+ idx
                        List of relations
 Schema | Name | Type  |  Owner   | Table |  Size  | Description
--------+------+-------+----------+-------+--------+-------------
 public | idx  | index | postgres | tbl   | 456 kB |
(1 row)

看看456KB的索引，在97 GB的数据层面，查询效率如何。

任意列、组合查询过滤性。

explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000;
explain (analyze,verbose,timing,costs,buffers) select * from tbl where c2 between 1 and 1000;
explain (analyze,verbose,timing,costs,buffers) select * from tbl where c3 between 1 and 1000;
explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c2 between 100 and 2000;
explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c3 between 100 and 2000;
explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c2 between 100 and 2000 and c3 between 1 and 2000;

test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000;

------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31623.80 rows=1 width=12) (actual time=27.302..50.284 rows=6997 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000))
   Rows Removed by Index Recheck: 229803
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=26.881..26.881 rows=12800 loops=1)
         Index Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000))
         Buffers: shared hit=662
 Planning time: 0.095 ms
 Execution time: 50.636 ms
(11 rows)

test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c2 between 1 and 1000;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31623.80 rows=1 width=12) (actual time=27.886..49.011 rows=6997 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c2 >= 1) AND (tbl.c2 <= 1000))
   Rows Removed by Index Recheck: 229803
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=27.512..27.512 rows=12800 loops=1)
         Index Cond: ((tbl.c2 >= 1) AND (tbl.c2 <= 1000))
         Buffers: shared hit=662
 Planning time: 0.040 ms
 Execution time: 49.348 ms
(11 rows)

test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c3 between 1 and 1000;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31623.80 rows=1 width=12) (actual time=25.238..46.292 rows=6997 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c3 >= 1) AND (tbl.c3 <= 1000))
   Rows Removed by Index Recheck: 229803
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=24.875..24.875 rows=12800 loops=1)
         Index Cond: ((tbl.c3 >= 1) AND (tbl.c3 <= 1000))
         Buffers: shared hit=662
 Planning time: 0.044 ms
 Execution time: 46.631 ms
(11 rows)

test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c2 between 100 and 2000;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31742.85 rows=1 width=12) (actual time=30.018..48.522 rows=6307 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c2 >= 100) AND (tbl.c2 <= 2000))
   Rows Removed by Index Recheck: 230493
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=27.273..27.273 rows=12800 loops=1)
         Index Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c2 >= 100) AND (tbl.c2 <= 2000))
         Buffers: shared hit=662
 Planning time: 0.049 ms
 Execution time: 48.829 ms
(11 rows)

test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c3 between 100 and 2000;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31742.85 rows=1 width=12) (actual time=27.565..46.347 rows=6307 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c3 >= 100) AND (tbl.c3 <= 2000))
   Rows Removed by Index Recheck: 230493
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=24.799..24.799 rows=12800 loops=1)
         Index Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c3 >= 100) AND (tbl.c3 <= 2000))
         Buffers: shared hit=662
 Planning time: 0.055 ms
 Execution time: 46.656 ms
(11 rows)

test=# explain (analyze,verbose,timing,costs,buffers) select * from tbl where c1 between 1 and 1000 and c2 between 100 and 2000 and c3 between 1 and 2000;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.tbl  (cost=650.23..31861.90 rows=1 width=12) (actual time=28.703..49.599 rows=6307 loops=1)
   Output: c1, c2, c3
   Recheck Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c2 >= 100) AND (tbl.c2 <= 2000) AND (tbl.c3 >= 1) AND (tbl.c3 <= 2000))
   Rows Removed by Index Recheck: 230493
   Heap Blocks: lossy=1280
   Buffers: shared hit=1942
   ->  Bitmap Index Scan on idx  (cost=0.00..650.23 rows=23810 width=0) (actual time=25.590..25.590 rows=12800 loops=1)
         Index Cond: ((tbl.c1 >= 1) AND (tbl.c1 <= 1000) AND (tbl.c2 >= 100) AND (tbl.c2 <= 2000) AND (tbl.c3 >= 1) AND (tbl.c3 <= 2000))
         Buffers: shared hit=662
 Planning time: 0.114 ms
 Execution time: 49.919 ms
(11 rows)

小结

本文介绍了如何创建、生成多维序列。

本文验证了数据如果按照多维序列聚集存放，可以达到块级索引最强过滤性，任意字段都能实现高效率过滤。

如果数据的多列本身不存在相关性，可以参考这篇文档，对数据进行空间重分布存储。得到最强过滤性。

《PostgreSQL 黑科技 - 空间聚集存储》

时间： 2024-12-22 02:39:14

PostgreSQL 空间、多维序列生成方法的相关文章

SQL SERVER2005的序列生成方法

此外,真正要解决这个问题,还源于我做的试剂耗材的管理系统.以往生成条码为了保证唯一性,使用日期时间+序列的方式,为了保证它不重复,时间就取的很细,到了毫秒级别.虽然理论上来讲这仍然有重复的可能性. 其实重复毕竟是小概率,但是条码却太长了,比如一个试剂的条码是20140801162430234001,这么一大串,一来不容易被条码枪识别,二来即便识别也容易造成识别前半部分.如果实在没有识别要手输的话,就是要费很大的工夫了. 条码本来就是类似数据库主键的意味,只要保证不重复,就该越短越好. 既然sql

两种PHP生成二维码的方法

PHP生成二维码,个人认为最常用的有两种,1.使用google的api生成,2.使用PHP QR Code生成,两种方法生成的二维码都是很清淅的,效果不错.下面来分别说明这两种方法如何实现. 一.PHP使用google api在线生成二维码: 关于这个API的介绍就不说了吧,你可以去Google的官方网站查询,相信大家关注的是具体代码,如下: 1 2$urlToEncode="http://www.csdn.net";//要生成二维码的网址 3generateQRfromGoogl

Symfony生成二维码的方法_php实例

本文实例讲述了Symfony生成二维码的方法.分享给大家供大家参考,具体如下: 现在网上能搜到很多关于使用PHP生成二维码的例子,主要是两种方法: 第一种:google开放api,如下: $urlToEncode="http://blog.it985.com"; generateQRfromGoogle($urlToEncode); function generateQRfromGoogle($chl, $widhtHeight = '150', $EC_level = 'L', $m

ASP.NET生成二维码的方法总结_实用技巧

本文实例总结了ASP.NET生成二维码的方法.分享给大家供大家参考,具体如下: 分享一例c#生成二维码的代码,直接引用ThoughtWorks.QRCode.dll 类生成二维码,有需要的朋友参考下. 方法1.直接引用ThoughtWorks.QRCode.dll 类,生成二维码. 代码示例: ThoughtWorks.QRCode.Codec.QRCodeEncoder encoder = new QRCodeEncoder(); encoder.QRCodeEncodeMode = QRCo

PHP QRCODE生成彩色二维码的方法_php技巧

本文实例讲述了PHP QRCODE生成彩色二维码的方法.分享给大家供大家参考,具体如下: 这里重写了下PHPQRCODE,精简了部分代码,合并PNG GIF JPEG的输出. 参数说明: 调用方式: 复制代码代码如下: QRcode::IMGout($text, $outfile = false, $level = QR_ECLEVEL_L, $size = 3, $col=array(array(255,255,255),array(0,0,0)),$margin =2, $saveandp

使用PHP生成二维码的方法汇总_php技巧

随着科技的进步,二维码应用领域越来越广泛,本站之前已有文章介绍通过使用jQuery插件来生成二维码,今天我给大家分享下如何使用PHP生成二维码,以及如何生成中间带LOGO图像的二维码. 利用Google API生成二维码 Google提供了较为完善的二维码生成接口,调用API接口很简单,以下是调用代码: $urlToEncode="http://www.jb51.net"; generateQRfromGoogle($urlToEncode); /** * google api 二维码

php生成二维码图片方法汇总_php实例

第一种方法: 1,第一种方法:利用使用最广泛,最方便的Google api技术实现: 2 ,<?php //封装生成二维码图片的函数(方法) /* 利用google api生成二维码图片 $content:二维码内容参数 $size:生成二维码的尺寸,宽度和高度的值 $lev:可选参数,纠错等级 $margin:生成的二维码离边框的距离*/ function create_erweima($content, $size = '100', $lev = 'L', $margin= '0') { $

java实现二维码生成的几个方法（推荐）_java

java实现二维码生成的几个方法,具体如下: 1: 使用SwetakeQRCode在Java项目中生成二维码 http://swetake.com/qr/ 下载地址或着http://sourceforge.jp/projects/qrcode/downloads/28391/qrcode.zip 这个是日本人写的,生成的是我们常见的方形的二维码可以用中文如:5677777ghjjjjj 2: 使用BarCode4j生成条形码和二维码 BarCode4j网址:http://sourcefor

Android中二维码的生成方法(普通二维码、中心Logo 二维码、及扫描解析二维码)

首先声明我们通篇用的都是Google开源框架Zxing,要实现的功能有三个 ,生成普通二维码.生成带有中心图片Logo 的二维码,扫描解析二维码,直接上效果图吧首先我们需要一个这样的 Zxing 的包类似于这样接下来需要引入资源 1.drawable 中引入图片 navbar.png 2.layout中引入camera.xml.main.xml.qrcode_capture_page.xml 3.创建raw文件夹并添加beep.ogg 扫描声音 4.合并color.xml,copy ids.

PostgreSQL 空间、多维 序列 生成方法

标签

背景