Postgres query filtered several different ways

Question

I have a large Postgres query with a lot of joins that I want to filter in several different ways. I have a central function that performs the joins:

create function my_big_function(p_limit int, p_offset int, p_filter_ids int[])
returns setof my_type
language sql
stable  -- reads tables, so stable rather than immutable
returns null on null input
as
$$
select my_column_list
from
(
    select my_filter_id
    from unnest(p_filter_ids) as u(my_filter_id)
    order by my_filter_id
    limit p_limit offset p_offset
) f(my_filter_id)
    inner join... several other tables (using indexed columns)
$$;

Then I have a series of short functions that build the list of filter IDs, for example:

create or replace function my_filter_id_function(p_some_id int)
returns int[]
language sql
stable  -- reads my_table, so stable rather than immutable
returns null on null input
as
$$
select array_agg(my_filter_id) from my_table where some_id = p_some_id
$$;

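This lets me quickly add several filter functions and pass the resulting array as a parameter to the big query, without having to repeat the big query in many places.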

select * from my_big_function(1000, 0, my_filter_function1(p_some_id));
select * from my_big_function(1000, 0, my_filter_function2(p_some_other_id));
select * from my_big_function(1000, 0, my_filter_function3(p_yet_another_id));

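The problem is that my query is slow when the array of values returned from the filter function gets large (around 1,000 rows). I assume this is because the big query has to sort and then join using un-indexed results? Is there a better way to create a generic query that I can feed IDs into so that it filters in different ways?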

Answer 1

I would avoid large arrays, because packing and unpacking them is expensive.
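As a rough sketch of what avoiding the array could look like, the filter can stay a plain set of rows instead of being aggregated with array_agg and unnested again. The names my_table, my_filter_id and some_id come from the question; the literal 42 and the limit/offset values are only placeholders.

```sql
-- Array round trip, as in the question: rows are packed into an int[]
-- and immediately unpacked again.
select u.my_filter_id
from unnest(
       (select array_agg(my_filter_id) from my_table where some_id = 42)
     ) as u(my_filter_id)
order by u.my_filter_id
limit 1000 offset 0;

-- Set-based form of the same filter: no array is built at all.
select t.my_filter_id
from my_table as t
where t.some_id = 42
order by t.my_filter_id
limit 1000 offset 0;
```

Both statements return the same IDs; the second one simply never materializes the intermediate array.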

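But I would say that the main problem here is that you split the query into separate functions, which prevents the optimizer from treating the whole query at once and coming up with an efficient execution plan.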

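If you want to avoid repeating parts of the query over and over, the right tool is not a function but a view. When the query is executed, the view is replaced by its definition, so the optimizer can find a good plan for the whole query.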

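Don't fall into the trap of defining a "world view" that joins all your tables. The view should contain only the tables you actually need in the query.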
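A minimal sketch of the view-based approach might look like the following. It is only an illustration under assumptions: my_big_view, other_table and some_column are hypothetical names that do not appear in the question, and the real view would list whatever joins and columns the big query actually needs.

```sql
-- Hypothetical view: joins only the tables the query really needs.
-- other_table and some_column are made-up names for illustration.
create view my_big_view as
select t.my_filter_id,
       t.some_id,
       o.some_column
from my_table as t
inner join other_table as o
        on o.my_filter_id = t.my_filter_id;

-- Each "filter" becomes an ordinary WHERE clause on the view, so the
-- planner sees the joins and the filter together in one query.
select *
from my_big_view
where some_id = 42          -- placeholder value
order by my_filter_id
limit 1000 offset 0;
```

Other filters would be further WHERE conditions on columns the view exposes, so the join logic is still written only once.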

arrays

postgresql

performance

function