采用 部分索引、表达式索引 提高搜索效率

标签

PostgreSQL , partial index , 部分索引 , 表达式索引 , 复合索引 , gist_btree混合索引 , 空间索引


背景

在现实场景中,经常有搜索的需求,例如搜索附近的店铺,搜索通常会有一些搜索的附带条件,例如搜索附近的美食类店铺,加油站等。

这里实际上涉及两类搜索需求,一类是距离,一类是属性。

如果将属性枚举掉,那么搜索时可以变成只按距离搜索。建立空间索引即可。

而如果属性无法枚举,那么需要同时搜索空间和属性,可以建立 “空间+属性” 的“复合索引”,或者建立“多索引”,PG内部会使用bitmap自动将多个索引的过滤结果进行合并。

以上的做法都挺好理解,但是在PG里面还有更丰富的玩法。

例如部分索引、表达式索引。下面举例说明具体的场景和玩法。

问题1,搜索离我最近的女性用户

使用部分索引,只对女性用户建立索引。

postgres=# create table user_pos(
id int primary key,  -- 主键
pos point,  -- 位置
sex char(1)  -- 性别
);
CREATE TABLE  

postgres=# insert into user_pos select generate_series(1,10000000), point(random()*10000,random()*10000), (random())::int::text;
INSERT 0 10000000
postgres=# select * from user_pos limit 10;
 id |                 pos                 | sex
----+-------------------------------------+-----
  1 | (2447.73048441857,5153.31742353737) | 0
  2 | (6969.8447175324,5497.46428150684)  | 1
  3 | (4143.54857057333,9740.06621632725) | 1
  4 | (6990.53473770618,5271.83207217604) | 0
  5 | (5196.75491377711,5041.81199707091) | 1
  6 | (1515.07906615734,1538.82524929941) | 0
  7 | (1805.89218158275,7099.36406929046) | 1
  8 | (383.995678275824,1186.2367298454)  | 1
  9 | (9107.82004706562,9367.1752139926)  | 0
 10 | (9713.49926199764,9380.74112869799) | 1
(10 rows)

女性=0

男性=1

建立部分索引的方法如下,

postgres=# create index idx_user_pos_0 on user_pos using gist(pos) where sex='0';  -- 只对女性建立索引
CREATE INDEX
postgres=# create index idx_user_pos_1 on user_pos using gist(pos) where sex='1';  -- 只对男性建立索引
CREATE INDEX

表和索引的大小如下

postgres=# \dt+
                      List of relations
 Schema |   Name   | Type  |  Owner   |  Size  | Description
--------+----------+-------+----------+--------+-------------
 public | user_pos | table | postgres | 575 MB |
(1 row)  

postgres=# \di+
                              List of relations
 Schema |      Name      | Type  |  Owner   |  Table   |  Size  | Description
--------+----------------+-------+----------+----------+--------+-------------
 public | idx_user_pos_0 | index | postgres | user_pos | 354 MB |
 public | idx_user_pos_1 | index | postgres | user_pos | 354 MB |
 public | user_pos_pkey  | index | postgres | user_pos | 214 MB |
(3 rows)

搜索里某个位置最近的女性用户,男性用户。

使用到了部分索引,0.3x毫秒

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos where sex='0' order by pos <-> point(5000,5000) limit 100;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.41..5.99 rows=100 width=30) (actual time=0.067..0.290 rows=100 loops=1)
   Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
   Buffers: shared hit=106
   ->  Index Scan using idx_user_pos_0 on public.user_pos  (cost=0.41..277775.58 rows=4983333 width=30) (actual time=0.066..0.277 rows=100 loops=1)
         Output: id, pos, sex, (pos <-> '(5000,5000)'::point)
         Order By: (user_pos.pos <-> '(5000,5000)'::point)
         Buffers: shared hit=106
 Planning time: 0.116 ms
 Execution time: 0.324 ms
(9 rows)  

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos where sex='1' order by pos <-> point(5000,5000) limit 100;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.41..5.97 rows=100 width=30) (actual time=0.118..0.342 rows=100 loops=1)
   Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
   Buffers: shared hit=107
   ->  Index Scan using idx_user_pos_1 on public.user_pos  (cost=0.41..278530.69 rows=5016667 width=30) (actual time=0.117..0.330 rows=100 loops=1)
         Output: id, pos, sex, (pos <-> '(5000,5000)'::point)
         Order By: (user_pos.pos <-> '(5000,5000)'::point)
         Buffers: shared hit=107
 Planning time: 0.122 ms
 Execution time: 0.377 ms
(9 rows)

当不带部分索引所示条件(性别条件)进行搜索时,无法使用部分索引

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos order by pos <-> point(5000,5000) limit 100;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=580722.81..580723.06 rows=100 width=30) (actual time=3995.670..3995.692 rows=100 loops=1)
   Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
   Buffers: shared hit=73530
   ->  Sort  (cost=580722.81..605722.81 rows=10000000 width=30) (actual time=3995.668..3995.679 rows=100 loops=1)
         Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
         Sort Key: ((user_pos.pos <-> '(5000,5000)'::point))
         Sort Method: top-N heapsort  Memory: 38kB
         Buffers: shared hit=73530
         ->  Seq Scan on public.user_pos  (cost=0.00..198530.00 rows=10000000 width=30) (actual time=0.013..1962.988 rows=10000000 loops=1)
               Output: id, pos, sex, (pos <-> '(5000,5000)'::point)
               Buffers: shared hit=73530
 Planning time: 0.091 ms
 Execution time: 3995.733 ms
(13 rows)  

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos where sex in ('0', '1') order by pos <-> point(5000,5000) limit 100;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=605722.81..605723.06 rows=100 width=30) (actual time=5044.095..5044.119 rows=100 loops=1)
   Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
   Buffers: shared hit=73530
   ->  Sort  (cost=605722.81..630722.81 rows=10000000 width=30) (actual time=5044.093..5044.106 rows=100 loops=1)
         Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
         Sort Key: ((user_pos.pos <-> '(5000,5000)'::point))
         Sort Method: top-N heapsort  Memory: 38kB
         Buffers: shared hit=73530
         ->  Seq Scan on public.user_pos  (cost=0.00..223530.00 rows=10000000 width=30) (actual time=0.013..2978.314 rows=10000000 loops=1)
               Output: id, pos, sex, (pos <-> '(5000,5000)'::point)
               Filter: (user_pos.sex = ANY ('{0,1}'::bpchar[]))
               Buffers: shared hit=73530
 Planning time: 0.170 ms
 Execution time: 5044.160 ms
(14 rows)

对于这种情况,只能使用UNION ALL变通

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from ((select * from user_pos where sex='0' order by pos <-> point(5000,5000) limit 100) union all (select * from user_pos where sex='1' order by pos <-> point(5000,5000) limit 100)) t order by pos <-> point(5000,5000) limit 100;
                                                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=24.10..24.35 rows=100 width=30) (actual time=0.784..0.800 rows=100 loops=1)
   Output: "*SELECT* 1".id, "*SELECT* 1".pos, "*SELECT* 1".sex, (("*SELECT* 1".pos <-> '(5000,5000)'::point))
   Buffers: shared hit=213
   ->  Sort  (cost=24.10..24.60 rows=200 width=30) (actual time=0.783..0.789 rows=100 loops=1)
         Output: "*SELECT* 1".id, "*SELECT* 1".pos, "*SELECT* 1".sex, (("*SELECT* 1".pos <-> '(5000,5000)'::point))
         Sort Key: (("*SELECT* 1".pos <-> '(5000,5000)'::point))
         Sort Method: quicksort  Memory: 40kB
         Buffers: shared hit=213
         ->  Result  (cost=0.41..16.46 rows=200 width=30) (actual time=0.073..0.693 rows=200 loops=1)
               Output: "*SELECT* 1".id, "*SELECT* 1".pos, "*SELECT* 1".sex, ("*SELECT* 1".pos <-> '(5000,5000)'::point)
               Buffers: shared hit=213
               ->  Append  (cost=0.41..13.96 rows=200 width=22) (actual time=0.072..0.650 rows=200 loops=1)
                     Buffers: shared hit=213
                     ->  Subquery Scan on "*SELECT* 1"  (cost=0.41..6.99 rows=100 width=22) (actual time=0.071..0.302 rows=100 loops=1)
                           Output: "*SELECT* 1".id, "*SELECT* 1".pos, "*SELECT* 1".sex
                           Buffers: shared hit=106
                           ->  Limit  (cost=0.41..5.99 rows=100 width=30) (actual time=0.071..0.289 rows=100 loops=1)
                                 Output: user_pos.id, user_pos.pos, user_pos.sex, ((user_pos.pos <-> '(5000,5000)'::point))
                                 Buffers: shared hit=106
                                 ->  Index Scan using idx_user_pos_0 on public.user_pos  (cost=0.41..277775.58 rows=4983333 width=30) (actual time=0.070..0.278 rows=100 loops=1)
                                       Output: user_pos.id, user_pos.pos, user_pos.sex, (user_pos.pos <-> '(5000,5000)'::point)
                                       Order By: (user_pos.pos <-> '(5000,5000)'::point)
                                       Buffers: shared hit=106
                     ->  Subquery Scan on "*SELECT* 2"  (cost=0.41..6.97 rows=100 width=22) (actual time=0.110..0.327 rows=100 loops=1)
                           Output: "*SELECT* 2".id, "*SELECT* 2".pos, "*SELECT* 2".sex
                           Buffers: shared hit=107
                           ->  Limit  (cost=0.41..5.97 rows=100 width=30) (actual time=0.110..0.314 rows=100 loops=1)
                                 Output: user_pos_1.id, user_pos_1.pos, user_pos_1.sex, ((user_pos_1.pos <-> '(5000,5000)'::point))
                                 Buffers: shared hit=107
                                 ->  Index Scan using idx_user_pos_1 on public.user_pos user_pos_1  (cost=0.41..278530.69 rows=5016667 width=30) (actual time=0.109..0.303 rows=100 loops=1)
                                       Output: user_pos_1.id, user_pos_1.pos, user_pos_1.sex, (user_pos_1.pos <-> '(5000,5000)'::point)
                                       Order By: (user_pos_1.pos <-> '(5000,5000)'::point)
                                       Buffers: shared hit=107
 Planning time: 0.235 ms
 Execution time: 0.861 ms
(35 rows)

将前面的部分索引,换成全索引

postgres=# drop index idx_user_pos_0;
DROP INDEX
postgres=# drop index idx_user_pos_1;
DROP INDEX
postgres=# create index idx_user_pos_0 on user_pos using gist(pos);
CREATE INDEX
postgres=# \di+
                              List of relations
 Schema |      Name      | Type  |  Owner   |  Table   |  Size  | Description
--------+----------------+-------+----------+----------+--------+-------------
 public | idx_user_pos_0 | index | postgres | user_pos | 711 MB |
 public | user_pos_pkey  | index | postgres | user_pos | 214 MB |
(2 rows)

使用同样的查询,相比部分索引,性能下降一半,因为索引数据包含了所有性别,需要通过FILTER来处理。

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos where sex='0' order by pos <-> point(5000,5000) limit 100;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..9.63 rows=100 width=30) (actual time=0.163..0.688 rows=100 loops=1)
   Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
   Buffers: shared hit=205
   ->  Index Scan using idx_user_pos_0 on public.user_pos  (cost=0.42..459271.75 rows=4983333 width=30) (actual time=0.161..0.674 rows=100 loops=1)
         Output: id, pos, sex, (pos <-> '(5000,5000)'::point)
         Order By: (user_pos.pos <-> '(5000,5000)'::point)
         Filter: (user_pos.sex = '0'::bpchar)
         Rows Removed by Filter: 94
         Buffers: shared hit=205
 Planning time: 0.142 ms
 Execution time: 0.736 ms
(11 rows)  

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos where sex='1' order by pos <-> point(5000,5000) limit 100;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..9.57 rows=100 width=30) (actual time=0.110..0.578 rows=100 loops=1)
   Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
   Buffers: shared hit=218
   ->  Index Scan using idx_user_pos_0 on public.user_pos  (cost=0.42..459355.08 rows=5016667 width=30) (actual time=0.109..0.566 rows=100 loops=1)
         Output: id, pos, sex, (pos <-> '(5000,5000)'::point)
         Order By: (user_pos.pos <-> '(5000,5000)'::point)
         Filter: (user_pos.sex = '1'::bpchar)
         Rows Removed by Filter: 107
         Buffers: shared hit=218
 Planning time: 0.105 ms
 Execution time: 0.614 ms
(11 rows)

当不输入性别条件时,依旧可以使用全索引,性能0.3x毫秒,意料之中

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos order by pos <-> point(5000,5000) limit 100;
                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..4.89 rows=100 width=30) (actual time=0.095..0.318 rows=100 loops=1)
   Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
   Buffers: shared hit=107
   ->  Index Scan using idx_user_pos_0 on public.user_pos  (cost=0.42..446813.42 rows=10000000 width=30) (actual time=0.093..0.307 rows=100 loops=1)
         Output: id, pos, sex, (pos <-> '(5000,5000)'::point)
         Order By: (user_pos.pos <-> '(5000,5000)'::point)
         Buffers: shared hit=107
 Planning time: 0.088 ms
 Execution time: 0.353 ms
(9 rows)

将索引换成 空间+属性 复合全索引

postgres=# drop index idx_user_pos_0 ;
DROP INDEX
postgres=# create extension btree_gist;
CREATE EXTENSION
postgres=# create index idx_user_pos_0 on user_pos using gist(sex,pos);
CREATE INDEX
postgres=# \di+
                              List of relations
 Schema |      Name      | Type  |  Owner   |  Table   |  Size  | Description
--------+----------------+-------+----------+----------+--------+-------------
 public | idx_user_pos_0 | index | postgres | user_pos | 843 MB |
 public | user_pos_pkey  | index | postgres | user_pos | 214 MB |
(2 rows)

使用同样的查询,相比部分索引,性能下降一些。

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos where sex='0' order by pos <-> point(5000,5000) limit 100;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..6.50 rows=100 width=30) (actual time=0.174..0.434 rows=100 loops=1)
   Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
   Buffers: shared hit=106
   ->  Index Scan using idx_user_pos_0 on public.user_pos  (cost=0.42..303079.91 rows=4983333 width=30) (actual time=0.173..0.422 rows=100 loops=1)
         Output: id, pos, sex, (pos <-> '(5000,5000)'::point)
         Index Cond: (user_pos.sex = '0'::bpchar)
         Order By: (user_pos.pos <-> '(5000,5000)'::point)
         Buffers: shared hit=106
 Planning time: 0.194 ms
 Execution time: 0.471 ms
(10 rows)  

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos where sex='1' order by pos <-> point(5000,5000) limit 100;
                                                                     QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..6.48 rows=100 width=30) (actual time=0.205..0.457 rows=100 loops=1)
   Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
   Buffers: shared hit=107
   ->  Index Scan using idx_user_pos_0 on public.user_pos  (cost=0.42..304368.42 rows=5016667 width=30) (actual time=0.203..0.445 rows=100 loops=1)
         Output: id, pos, sex, (pos <-> '(5000,5000)'::point)
         Index Cond: (user_pos.sex = '1'::bpchar)
         Order By: (user_pos.pos <-> '(5000,5000)'::point)
         Buffers: shared hit=107
 Planning time: 0.105 ms
 Execution time: 0.494 ms
(10 rows)  

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos order by pos <-> point(5000,5000) limit 100;
                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..5.14 rows=100 width=30) (actual time=0.142..0.363 rows=100 loops=1)
   Output: id, pos, sex, ((pos <-> '(5000,5000)'::point))
   Buffers: shared hit=109
   ->  Index Scan using idx_user_pos_0 on public.user_pos  (cost=0.42..472151.42 rows=10000000 width=30) (actual time=0.141..0.350 rows=100 loops=1)
         Output: id, pos, sex, (pos <-> '(5000,5000)'::point)
         Order By: (user_pos.pos <-> '(5000,5000)'::point)
         Buffers: shared hit=109
 Planning time: 0.087 ms
 Execution time: 0.399 ms
(9 rows)

问题2,搜索激活的用户,附加其他搜索条件。

假设业务一定不会对未激活的用户进行检索。那么激活就是必选条件。

可以使用 激活 作为部分索引的条件,避免对不必要的数据进行索引。

create table test(id int, actived boolean, colx int);  

create index idx_test_1 on test(id) where actived;  

insert into test select generate_series(1,1000000), random()::int::boolean, 1 ;  

postgres=# explain (analyze,verbose,timing,costs,buffers) select id from test where actived limit 10;
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..0.67 rows=10 width=4) (actual time=0.033..0.036 rows=10 loops=1)
   Output: id
   Buffers: shared hit=4
   ->  Index Only Scan using idx_test_1 on public.test  (cost=0.42..12341.42 rows=503700 width=4) (actual time=0.032..0.034 rows=10 loops=1)
         Output: id
         Heap Fetches: 10
         Buffers: shared hit=4
 Planning time: 0.075 ms
 Execution time: 0.054 ms
(9 rows)  

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test where actived limit 10;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.42..0.67 rows=10 width=9) (actual time=0.027..0.033 rows=10 loops=1)
   Output: id, actived, colx
   Buffers: shared hit=4
   ->  Index Scan using idx_test_1 on public.test  (cost=0.42..12341.42 rows=503700 width=9) (actual time=0.025..0.029 rows=10 loops=1)
         Output: id, actived, colx
         Buffers: shared hit=4
 Planning time: 0.101 ms
 Execution time: 0.058 ms
(8 rows)

问题3,根据纠偏坐标搜索近距离用户

例如在某些时候,可能由于设备的问题,或者各国地理位置编码的问题使得经纬度需要纠偏,因此查询时都是通过纠偏后的数据进行查询。

这个需求,需要用到表达式索引,因为纠偏是需要计算的,我们需要将纠偏这个计算作为索引表达式。

纠偏函数例子

postgres=# create or replace function fix_point(point) returns point as $$
  select $1 + point(0.9, 12);
$$ language sql strict immutable;
CREATE FUNCTION  

postgres=# select fix_point(point(1,1));
 fix_point
-----------
 (1.9,13)
(1 row)

测试数据

create table test(id int, pos point, info text);
insert into test select generate_series(1,100000), point(random()*10000, random()*10000), 'test';

纠偏表达式索引

create index idx_test_1 on test using gist( fix_point(pos) );

纠偏查询

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from test order by fix_point(pos) <-> point (1,10000) limit 10;;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.28..0.75 rows=10 width=33) (actual time=0.062..0.078 rows=10 loops=1)
   Output: id, pos, info, (((pos + '(0.9,12)'::point) <-> '(1,10000)'::point))
   Buffers: shared hit=13
   ->  Index Scan using idx_test_1 on public.test  (cost=0.28..4717.78 rows=100000 width=33) (actual time=0.061..0.076 rows=10 loops=1)
         Output: id, pos, info, ((pos + '(0.9,12)'::point) <-> '(1,10000)'::point)
         Order By: ((test.pos + '(0.9,12)'::point) <-> '(1,10000)'::point)
         Buffers: shared hit=13
 Planning time: 0.262 ms
 Execution time: 0.113 ms
(9 rows)

问题4,条件表达式索引 - 全量部分索引

通常我们在建立索引时,需要制定在哪个列、哪些列、哪个表达式上面。

还有一种方法,实际上也是表达式索引,只是表达式看起来比较奇特,例如建立女性索引。

postgres=# create table user_pos(
id int primary key,  -- 主键
pos point,  -- 位置
sex char(1)  -- 性别
);
CREATE TABLE  

postgres=# insert into user_pos select generate_series(1,10000000), point(random()*10000,random()*10000), (random())::int::text;
INSERT 0 10000000

条件表达式索引

create index idx1 on user_pos ( (sex='0') );

这个索引的内容是: sex='0'的结果 + 行号

测试

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos where sex='0';
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on public.user_pos  (cost=46663.43..182485.09 rows=4983333 width=22) (actual time=212.844..1420.697 rows=5000542 loops=1)
   Output: id, pos, sex
   Recheck Cond: (user_pos.sex = '0'::bpchar)
   Rows Removed by Index Recheck: 2244498
   Heap Blocks: exact=40505 lossy=33025
   Buffers: shared hit=87195
   ->  Bitmap Index Scan on idx_user_pos_0  (cost=0.00..45417.60 rows=4983333 width=0) (actual time=204.879..204.879 rows=5000542 loops=1)
         Buffers: shared hit=13665
 Planning time: 0.395 ms
 Execution time: 1688.563 ms
(10 rows)  

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from user_pos where sex='0' limit 10;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..0.90 rows=10 width=22) (actual time=0.058..0.063 rows=10 loops=1)
   Output: id, pos, sex
   Buffers: shared hit=1 read=3
   ->  Index Scan using idx1 on public.user_pos  (cost=0.43..230861.93 rows=4983333 width=22) (actual time=0.057..0.060 rows=10 loops=1)
         Output: id, pos, sex
         Index Cond: ((user_pos.sex = '0'::bpchar) = true)
         Filter: (user_pos.sex = '0'::bpchar)
         Buffers: shared hit=1 read=3
 Planning time: 0.146 ms
 Execution time: 0.085 ms
(10 rows)  
时间: 2024-11-10 07:02:27

采用 部分索引、表达式索引 提高搜索效率的相关文章

mysql全文搜索索引的字段提高搜索效率

一个SELECT查询中的LIKE语句来执行这种查询,尽管这种方法可行,但对于全文查找而言,这是一种效率极端低下的方法,尤其在处理大量数据的时候. 开发者只需要简单地标记出需要全文查找的字段,然后使用特殊的MySQL方法在那些字段运行搜索,这不仅仅提高了性能和效率(因为MySQL对这些字段做了索引来优化搜索),而且实现了更高质量的搜索,因为MySQL使用自然语言来智能地对结果评级,以去掉不相关的项目. 建表: ALTER TABLE 表名 ADD FULLTEXT INDEX (表字段); 使用S

提高Windows7搜索效率的方法

搜索是我们最常用的操作之一,虽然Windows7提供了强大的搜索功能,但是不熟悉的win7的同志们还是不能够更快更加准确的搜到你想要的文件.今天笔者和大家一起分享一下一些老鸟准确搜索的方法. 用内容视图提高搜索效率 大家都知道就是在Windows7开始菜单搜索条中打出你要查找的文件关键字,系统会把所有有关键字的文件全部呈现在你的面前.下面就需要一些技巧了,这时我们可以单机"文档",进入资源管理器窗口,在文件窗口右上角有个"更改您的视图"的按钮,点击按钮出现下拉菜单,

提高Windows 7系统搜索效率的方法

windows 7操作系统相对 Vista.windows xp 系统增加了不少个性化功能,比如增加了可以控制桌面窗口的显示.增加电脑资源搜索功能,本文主要围绕windows 7的搜索功能来展开介绍,搜索对我们已经并不陌生,象我们经常使用百度.google搜索网络资源,并且需要一定的技巧,电脑高手搜索到的东西总是比新手搜索到的快,并且准确,所以我们在使用windows 7 xitongtiandi.com,只要懂的了技巧也一样可以早日进军高手行列了. 用内容视图提高搜索效率 自从Windows7

Oracle查询指定索引提高查询效率

Oracle查询指定索引提高查询效率 一个1600万数据表--短信上行表TBL_SMS_MO结构:CREATE TABLE TBL_SMS_MO( SMS_ID NUMBER, MO_ID VARCHAR2(50), MOBILE VARCHAR2(11), SPNUMBER VARCHAR2(20), MESSAGE VARCHAR2(150), TRADE_CODE VARCHAR2(20), LINK_ID VARCHAR2(50), GATEWAY_ID NUMBER, GATEWAY_

PostgreSQL 索引虚拟列 - 表达式索引 - JOIN提速

标签 PostgreSQL , join , 表达式索引 , 虚拟列索引 , 静态数据 , immutable函数 背景 CASE: 使用虚拟索引,响应时间从2.3秒下降到0.3毫秒 业务系统在设计时,为了减少数据冗余,提升可读性,通常需要将不同的数据放到不同的表. 在查询时,通过多表JOIN来补齐需要查询或在过滤的内容. 比如这样的例子: 有两张表,分别有1千万和100万数据,当用户查询时,需要补齐那100万表中的某个字段进行过滤. create table a (id int, bid in

CentOS系统里如何提高文本搜索效率?

CentOS系统里如何提高文本搜索效率?   对于系统管理员或程序员来说,当需要在复杂配置的目录中或者在大型源码树中搜寻特定的文本或模式时,grep类型的工具大概是最受欢迎的. 如果grep是你最喜欢的工具之一,那么你可能会更喜欢ack.ack是一个基于Perl的类似于grep的命令行工具,但是搜索速度更快,能力比grep更强.尤其是当你是程序员时,我强烈推荐你使用ack来取代grep. ack的用法非常适用与代码搜索,因此程序员可以在源码树中进行复杂的查询,而只需要更少的按键. ack的特性

PPAS 兼容oracle部分函数表达式索引

oracle迁移过来的函数表达式索引出现下列问题,可以尝试用下列办法解决 迁移时需要保证创建的函数本身的属性,输入固定的值返回值不能变. demo: create table TABLE1(id int); CREATE  INDEX X_IDX ON TABLE1 ((TO_NUMBER(ID))); ERROR:  functions in index expression must be marked IMMUTABLE 解决办法: CREATE OR REPLACE FUNCTION m

SQL提高查询效率一些方法总结

  ◆尽量不要在where中包含子查询; 关于时间的查询,尽量不要写成:where to_char(dif_date,'yyyy-mm-dd')=to_char('2007-07-01′,'yyyy-mm-dd'); ◆在过滤条件中,可以过滤掉最大数量记录的条件必须放在where子句的末尾; FROM子句中写在最后的表(基础表,driving table)将被最先处理,在FROM子句中包含多个表的情况下,你必须选择记录条数最少的表作为基础表.如果有三个以上的连接查询,那就需要选择交叉表(inte

SEO优化需要制定合理的外链优化流程才能提高工作效率

网站优化中最让SEO头疼的就是网站外链建设这个环节,这种"慢舞天际"式的工作就算是"神"级别的人也会感到心里纠结.要说到什么程度才算是做好了外链工作?只能说是到了首页还得继续维持排名不往下掉,这种工作永远也不会有个尽头.但是没有到达首页的网站最起码站长心里面还有个盼头,那些已经到达首页的网站已经尝到了甜头或许已经弥补了那种苦逼滋味.网站外链建设和排名效果是一种因果循环的关系,只有广泛性的编织起来一张足够巨大的外链网状,网站排名关键词才能有质地性的飞跃.有很多SEO都