postgresql - count distinct combinations of three columns, order doesn't matter

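I am trying to count the distinct combinations of three columns, where the order of the values across the columns doesn't matter.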

Example:

    a a a 
    a a b 
    a b a 
    b b a 
    b a b

What I get:

    a a a 1
    a a b 1
    a b a 1
    b b a 1
    b a b 1

What I want:

    a a a 1
    a a b 2
    b b a 2

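You can use a sorted array. Sorting the three values of each row puts every permutation of the same values into the same form, so they all fall into one group: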

select  v[1], v[2], v[3], count(*) n
from tbl t
cross join lateral (
  select array_agg(col order by col) v
  from (
    values (c1),(c2),(c3)
  ) t(col)
) s
group by v[1], v[2], v[3];

db<>fiddle
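If you would rather see each combination as a single value, a minimal sketch is to group by the whole sorted array instead of its three elements. This assumes the same table tbl with text columns c1, c2 and c3 as in the query above; the aliases combo, n, x and s are only illustrative.

```sql
-- Sketch: one row per combination, with the sorted values kept as an array.
select s.combo, count(*) as n
from tbl
cross join lateral (
  -- sort the three values of the current row
  select array_agg(x.col order by x.col) as combo
  from (values (tbl.c1), (tbl.c2), (tbl.c3)) as x(col)
) s
group by s.combo
order by s.combo;
```

With the sample rows from the question this should return {a,a,a} with 1, {a,a,b} with 2 and {a,b,b} with 2. Each combination is shown in sorted order, so b b a appears as {a,b,b}.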

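You could perhaps use a checksum to get the result you want. For example, if all you are dealing with is combinations of 'a' and 'b', you can convert the letters to integers (by calling the ASCII() function) and add them up to get a checksum. Note that this only works while the sum identifies the combination uniquely: with more than two distinct values, different combinations can produce the same sum, and ASCII() looks only at the first character of a string.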

TABLE

create table t (c1, c2, c3 ) as
select 'a', 'a', 'a' union all 
select 'a', 'a', 'b' union all 
select 'a', 'b', 'a' union all 
select 'b', 'b', 'a' union all 
select 'b', 'a', 'b' ;

Checksum

select c1, c2, c3, ascii( c1 ) + ascii( c2 ) + ascii( c3 ) as checksum 
from t ;

-- output
c1  c2  c3  checksum
a   a   a   291
a   a   b   292
a   b   a   292
b   b   a   293
b   a   b   293
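
The limitation of a plain sum shows up as soon as a third letter is involved. The query below is only an illustration (the letter 'c' does not occur in the sample table): two different combinations end up with the same checksum.

```sql
-- 'a' = 97, 'b' = 98, 'c' = 99
select ascii( 'a' ) + ascii( 'c' ) + ascii( 'c' ) as a_c_c   -- 97 + 99 + 99 = 295
     , ascii( 'b' ) + ascii( 'b' ) + ascii( 'c' ) as b_b_c ; -- 98 + 98 + 99 = 295
```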

If that works for you, you can then use window functions, e.g.

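select c1, c2, c3, rc_ as rowcount
from (
  select c1, c2, c3
  -- rc_: number of rows sharing the checksum ("order by 1" sorts by a constant, so all rows are peers)
  , count(*) over ( partition by ascii( c1 ) + ascii( c2 ) + ascii( c3 ) order by 1 ) rc_
  -- rn_: position of the row within its checksum group
  , row_number() over ( partition by ascii( c1 ) + ascii( c2 ) + ascii( c3 ) order by 1 ) rn_
  from t
) sq
-- keep one row per group: the one whose position equals the group size
where rc_ = rn_ ;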

-- output
c1  c2  c3  rowcount
a   a   a   1
a   b   a   2
b   a   b   2

See dbfiddle
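
If you only need the count per checksum, and not a representative row, a plain GROUP BY is a shorter sketch of the same idea. It assumes the table t created above.

```sql
select ascii( c1 ) + ascii( c2 ) + ascii( c3 ) as checksum
, count(*) as rowcount
from t
group by ascii( c1 ) + ascii( c2 ) + ascii( c3 )
order by checksum ;

-- expected for the sample rows
-- checksum  rowcount
-- 291       1
-- 292       2
-- 293       2
```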

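If you are dealing with strings that cannot easily be converted to integers, you can create a mapping between strings and integers, and implement map_ as a view (so that it is easy to use in subsequent queries), e.g.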

Map

-- {1} find all distinct elements
-- {2} map each element to an integer
create view map_
as
select val_, rank() over ( order by val_ ) weight_
from (
  select distinct val_
  from (
    select distinct c1 val_ from t union all
    select distinct c2 from t union all
    select distinct c3 from t
  ) all_elements 
) unique_elements ;

Once you have this map, you can use its values to create the checksums (also in a view) ...

Checksum

create view t_checksums_
as
select c1, c2, c3, c1weight + c2weight + c3weight as checksum
from (
  select
    c1, ( select weight_ from map_ where c1 = map_.val_ ) c1weight
  , c2, ( select weight_ from map_ where c2 = map_.val_ ) c2weight
  , c3, ( select weight_ from map_ where c3 = map_.val_ ) c3weight
  from t
) valandweight ;

... and then you can use the same query as before to get the final result - see dbfiddle
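
One caveat with rank() weights: once there are three or more distinct values, sums can collide (weights 1 + 3 + 3 and 2 + 2 + 3 both give 7). A possible workaround, sketched below under the assumption that the number of distinct values is small (roughly 30 or fewer, so that the weights fit into a bigint), is to use powers of 4 as weights. Each value can occur at most three times in a row, so the sum behaves like a base-4 number and identifies the combination uniquely. The view name map_pow4_ is hypothetical.

```sql
-- hypothetical replacement for map_: weights 1, 4, 16, 64, ...
create view map_pow4_
as
select val_
, power( 4, rank() over ( order by val_ ) - 1 )::bigint as weight_
from (
  -- union (without "all") removes duplicates
  select c1 as val_ from t union
  select c2 from t union
  select c3 from t
) unique_elements ;
```

Substituting map_pow4_ for map_ in t_checksums_ should leave the rest of the queries unchanged, apart from the checksum becoming a bigint.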