如何使用 PL/pgSQL 构造具有动态列的 table

How to use PL/pgSQL to construct a table with dynamic columns

我有一个名为 locations 的 Postgres tabled。它有几百万行数据,格式如下

 id |  location_a  |  location_b
----+--------------+--------------
 36 | Sydney       | London
 37 | Atlanta      | London
 38 | New York     | Tokyo
 39 | Tokyo        | Sydney
 40 | Tokyo        | Sydney
.....

我希望能够生成以下形式的枢轴 table / 计数 -

问题是列数是可变的,因此必须以编程方式/动态确定,而不是使用静态 SELECT 查询。

我理解 PL/pgSQL 的基本概念,因为它是一种脚本语言,可以让我做这样的动态事情。

但我在开始时遇到了很多麻烦。有没有简单的方法来计算以上内容?

您可以动态创建视图。描述了比您的案例更简单的想法和解决方案 请在继续之前阅读它。

我们将使用以下查询来创建视图:

with all_locations(location) as (
    select distinct location_a
    from locations
    union
    select distinct location_b
    from locations
)

select location_a as location, json_object_agg(location_b, count order by location_b) as data
from (
    select a.location as location_a, b.location as location_b, count(l.*)
    from all_locations a
    cross join all_locations b
    left join locations l on location_a = a.location and location_b = b.location
    group by 1, 2
    ) s
group by 1
order by 1;

结果:

 location |                                    data                                    
----------+----------------------------------------------------------------------------
 Atlanta  | { "Atlanta" : 0, "London" : 1, "New York" : 0, "Sydney" : 0, "Tokyo" : 0 }
 London   | { "Atlanta" : 0, "London" : 0, "New York" : 0, "Sydney" : 0, "Tokyo" : 0 }
 New York | { "Atlanta" : 0, "London" : 0, "New York" : 0, "Sydney" : 0, "Tokyo" : 1 }
 Sydney   | { "Atlanta" : 0, "London" : 1, "New York" : 0, "Sydney" : 0, "Tokyo" : 0 }
 Tokyo    | { "Atlanta" : 0, "London" : 0, "New York" : 0, "Sydney" : 2, "Tokyo" : 0 }
(5 rows)

城市列表将在函数中使用两次,因此存储在数组cities中。请注意,您可以将函数中的第一个查询替换为更简单的查询(它只是不同城市的有序列表)。

create or replace function create_locations_view()
returns void language plpgsql as $$
declare
    cities text[];
    list text;
begin
--  fill array with all cities in alphabetical order
    select array_agg(location_a order by location_a)
    from (
        select distinct location_a
        from locations
        union
        select distinct location_b
        from locations
        ) s
    into cities;

--  construct list of columns to use in select list
    select string_agg(format($s$data->>'%1$s' "%1$s"$s$, city), ', ')
    from unnest(cities) city
    into list;

--  create view from select based on the above list
    execute format($ex$
        drop view if exists locations_view;
        create view locations_view as 
        select location, %1$s
        from (
            select location_a as location, json_object_agg(location_b, count order by location_b) as data
            from (
                select a.location as location_a, b.location as location_b, count(l.*)
                from unnest(%2$L::text[]) a(location)
                cross join unnest(%2$L::text[]) b(location)
                left join locations l on location_a = a.location and location_b = b.location
                group by 1, 2
                ) s
            group by 1
        ) s
        order by 1
        $ex$, list, cities);
end $$;

使用创建的视图中的函数和select数据:

select create_locations_view();
select * from locations_view;

 location | Atlanta | London | New York | Sydney | Tokyo 
----------+---------+--------+----------+--------+-------
 Atlanta  | 0       | 1      | 0        | 0      | 0
 London   | 0       | 0      | 0        | 0      | 0
 New York | 0       | 0      | 0        | 0      | 1
 Sydney   | 0       | 1      | 0        | 0      | 0
 Tokyo    | 0       | 0      | 0        | 2      | 0
(5 rows)

这个方法我用过几次,但是我没有处理过大数据,所以我不能保证它是有效的。