PostgreSQL Aggregate Functions Explained - 1: Common Aggregate Functions

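PostgreSQL supports a wide range of aggregate functions. Taking PostgreSQL 9.4 as an example, these include general-purpose aggregates, statistical aggregates, ordered-set aggregates, and hypothetical-set aggregates.
This article walks through the general-purpose aggregate functions, with examples of what they do and how to use them.
For the full list of aggregate functions, see: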
http://www.postgresql.org/docs/9.4/static/functions-aggregate.html

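All of the aggregate functions listed on that page return NULL when there are no input rows, except count, which returns 0.
With sum or array_agg, getting NULL for empty input can be awkward. You can use coalesce to substitute a default for NULL, e.g. coalesce(sum(x), 0) or
coalesce(array_agg(x), '{}'::int[])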
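The empty-input behavior described above can be checked with a minimal sketch like the following. The inline table t(x) and the `where false` filter are assumptions made for illustration; they simply guarantee that no rows reach the aggregates.

```sql
-- Hypothetical empty input: "where false" removes every row before aggregation.
select count(x)                            as cnt,        -- 0
       sum(x)                              as raw_sum,    -- NULL
       coalesce(sum(x), 0)                 as safe_sum,   -- 0
       coalesce(array_agg(x), '{}'::int[]) as safe_array  -- {}
from (values (1), (2)) as t(x)
where false;
```

The same coalesce wrapper works for the other aggregates that return NULL on empty input, as long as the default has a compatible type.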
Examples:
array_agg aggregates the input into an array; NULLs are included as array elements.
postgres=# select array_agg(id) from (values(null),(1),(2)) as t(id);
 array_agg
------------
 {NULL,1,2}
(1 row)

avg ignores NULLs when computing the mean.
postgres=# select avg(id) from (values(null),(1),(2)) as t(id);
        avg
--------------------
 1.5000000000000000
(1 row)

bit_and and bit_or also ignore NULLs.
postgres=# select bit_and(id) from (values(null),(1),(2)) as t(id);
 bit_and
---------
       0
(1 row)
postgres=# select bit_or(id) from (values(null),(1),(2)) as t(id);
 bit_or
--------
      3
(1 row)
The boolean aggregates also ignore NULLs.
postgres=# select bool_and(id) from (values(null),(true),(false)) as t(id);
 bool_and
----------
 f
(1 row)
every is an alias for bool_and; it is the name defined in the SQL standard.
postgres=# select every(id) from (values(null),(true),(false)) as t(id);
 every
-------
 f
(1 row)
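The SQL standard also defines any and some as aliases for bool_or, but because any and some can also be parsed as subquery syntax, they are not available as boolean aggregates in PostgreSQL.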
postgres=# select any(id) from (values(null),(true),(false)) as t(id);
ERROR:  syntax error at or near "any"
LINE 1: select any(id) from (values(null),(true),(false)) as t(id);
               ^
postgres=# select some(id) from (values(null),(true),(false)) as t(id);
ERROR:  syntax error at or near "some"
LINE 1: select some(id) from (values(null),(true),(false)) as t(id);
               ^
A bool_or example:
postgres=# select bool_or(id) from (values(null),(true),(false)) as t(id);
 bool_or
---------
 t
(1 row)

Counting non-null values: count with an expression argument does not count NULLs.
postgres=# select count(id) from (values(null),(1),(2)) as t(id);
 count
-------
     2
(1 row)

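Counting rows: count(*) counts every input row, including rows whose values are NULL. Note that count(*) is a separate aggregate function; do not confuse it with count(expression).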
postgres=# select count(*) from (values(null),(1),(2)) as t(id);
 count
-------
     3
(1 row)
postgres=# select count(*) from (values(null),(null),(1),(2)) as t(id);
 count
-------
     4
(1 row)
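To see the two forms of count side by side, a small sketch over the same kind of inline data might look like this (the column aliases are chosen here for illustration only):

```sql
-- count(*) counts rows; count(id) counts only rows where id is not NULL.
select count(*)  as all_rows,     -- 4
       count(id) as non_null_ids  -- 2
from (values (null), (null), (1), (2)) as t(id);
```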

Aggregating into JSON without keys: json_agg builds a JSON array.
postgres=# select json_agg(id) from (values(null),(true),(false)) as t(id);
      json_agg
---------------------
 [null, true, false]
(1 row)
Aggregating into JSON with keys: json_object_agg builds a JSON object. Note that the key must not be NULL, otherwise an error is raised.
postgres=# select json_object_agg(c1,c2) from (values('a',null),('b',true),('c',false)) as t(c1,c2);
             json_object_agg
-----------------------------------------
 { "a" : null, "b" : true, "c" : false }
(1 row)
postgres=# select json_object_agg(c1,c2) from (values(null,null),('b',true),('c',false)) as t(c1,c2);
ERROR:  22023: field name must not be null
LOCATION:  json_object_agg_transfn, json.c:1959

Maximum and minimum: max and min both ignore NULLs.
postgres=# select max(id) from (values(null),(1),(2)) as t(id);
 max
-----
   2
(1 row)
postgres=# select min(id) from (values(null),(1),(2)) as t(id);
 min
-----
   1
(1 row)

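Aggregating into a string: string_agg concatenates the input values with a delimiter. As the second example shows, NULL values are skipped.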
postgres=# select string_agg(c1,'***') from (values('a',null),('b',true),('c',false)) as t(c1,c2);
 string_agg
------------
 a***b***c
(1 row)
postgres=# select string_agg(id,'***') from (values(null),('digoal'),('zhou')) as t(id);
  string_agg
---------------
 digoal***zhou
(1 row)

Computing a total: sum ignores NULLs. When every input value is NULL, nothing is summed (the same as having no input rows), so the result is NULL.
postgres=# select sum(id) from (values(null),(1),(2)) as t(id);
 sum
-----
   3
(1 row)
postgres=# select sum(id::int) from (values(null),(null),(null)) as t(id);
 sum
-----

(1 row)

Aggregating into XML: xmlagg.
postgres=# select xmlagg(id::xml) from (values(null),('<foo>digoal</foo>'),('<bar/>')) as t(id);
         xmlagg
-------------------------
 <foo>digoal</foo><bar/>
(1 row)

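The result of some aggregate functions can depend on the order of the input rows, for example array_agg, json_agg, json_object_agg, string_agg, and xmlagg, as well as some user-defined aggregates. How can that order be controlled?
On PostgreSQL versions that support ORDER BY inside an aggregate call (9.0 and later), use the following syntax: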
postgres=# select string_agg(id,'***' order by id) from (values(null),('digoal'),('zhou')) as t(id);
  string_agg
---------------
 digoal***zhou
(1 row)
postgres=# select string_agg(id,'***' order by id desc) from (values(null),('digoal'),('zhou')) as t(id);
  string_agg
---------------
 zhou***digoal
(1 row)
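The same ORDER BY clause works with the other order-sensitive aggregates, and each aggregate call can carry its own ordering. A minimal sketch, assuming a hypothetical inline table t(label, pos):

```sql
-- Hypothetical data: (label, pos) pairs listed in no particular order.
select array_agg(pos order by pos desc)     as pos_desc,      -- {3,2,1}
       string_agg(label, ',' order by pos)  as labels_by_pos  -- a,b,c
from (values ('c', 3), ('a', 1), ('b', 2)) as t(label, pos);
```

Here the sort key does not have to be the aggregated column: string_agg concatenates label while ordering by pos.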
On PostgreSQL versions that do not support ORDER BY inside an aggregate call, feed the aggregate from a sorted subquery instead:
SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
postgres=# select string_agg(id,'***') from (select id from (values(null),('digoal'),('zhou')) as t(id) order by id desc) t;
  string_agg
---------------
 zhou***digoal
(1 row)

[References]
1. http://www.postgresql.org/docs/9.4/static/functions-aggregate.html
2. http://www.postgresql.org/docs/9.4/static/functions-xml.html
3. src/backend/utils/adt
The source code for these functions can be found under src/backend/utils/adt, organized by data type.

Date: 2024-10-24 17:42:07
