How to shuffle an array in PostgreSQL 9.6 and also lower versions?

The following custom stored function

CREATE OR REPLACE FUNCTION words_shuffle(in_array varchar[])
        RETURNS varchar[] AS
$func$
        SELECT array_agg(letters.x) FROM 
        (SELECT UNNEST(in_array) x ORDER BY RANDOM()) letters;
$func$ LANGUAGE sql STABLE;

used to shuffle a character array in PostgreSQL 9.5.3:

words=> select words_shuffle(ARRAY['a','b','c','d','e','f']);
 words_shuffle 
---------------
 {c,d,b,a,e,f}
(1 row)

But after I switched to PostgreSQL 9.6.2, the function stopped working:

words=> select words_shuffle(ARRAY['a','b','c','d','e','f']);
 words_shuffle 
---------------
 {a,b,c,d,e,f}
(1 row)

Probably because ORDER BY RANDOM() stopped working:

words=> select unnest(ARRAY['a','b','c','d','e','f']) order by random();
 unnest 
--------
 a
 b
 c
 d
 e
 f
(6 rows)

I am looking for a better way to shuffle an array of characters that works in both PostgreSQL 9.5 and the new 9.6.

I need it for my word game in development, which uses PL/pgSQL functions.

Update:

Tom Lane replied:

Expansion of SRFs in the targetlist now happens after ORDER BY. So the ORDER BY is sorting a single dummy row, and then the unnest happens after that. See

https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=9118d03a8

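No doubt this is a change due to some "improvement" in the optimizer. That is frustrating, given that the documentation more or less says this works.

However, I would suggest that you not depend on the subquery: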

SELECT array_agg(l.x ORDER BY random())
FROM UNNEST(in_array) l(x);

This also works in earlier versions of Postgres.

The documentation says:
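For reference, here is a minimal sketch of how the question's words_shuffle function might look with the ordering moved inside the aggregate. The alias l(x) is illustrative, and the function is marked VOLATILE on the assumption that a function calling random() should not be declared STABLE; this is one possible rewrite, not necessarily what the asker ended up using.

```sql
-- Sketch: same signature as the function in the question,
-- but the ordering is done inside array_agg instead of in a sorted subquery.
CREATE OR REPLACE FUNCTION words_shuffle(in_array varchar[])
        RETURNS varchar[] AS
$func$
        -- unnest() sits in FROM, so the rows exist before any sorting happens;
        -- ORDER BY random() inside the aggregate decides the output order.
        SELECT array_agg(l.x ORDER BY random())
        FROM unnest(in_array) AS l(x);
$func$ LANGUAGE sql VOLATILE;  -- VOLATILE because the result changes on every call

-- Usage: the element order should differ from call to call.
SELECT words_shuffle(ARRAY['a','b','c','d','e','f']);
```

Note that array_agg returns NULL rather than an empty array when in_array has no elements, so callers may want to wrap the result in COALESCE if that matters.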

Alternatively, supplying the input values from a sorted subquery will usually work. For example:

SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;

But this syntax is not allowed in the SQL standard, and is not portable to other database systems.

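(I admit that "will usually work" is not a guarantee. But it really is misleading for the documentation to show a code example that does not conform to the standard. Why doesn't it show the correct example, with the ORDER BY clause inside the aggregate function?)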

https://www.postgresql.org/docs/9.5/static/functions-aggregate.html

In general, set-returning functions should be placed in the FROM clause:

select array_agg(u order by random())
from unnest(array['a','b','c','d','e','f']) u

   array_agg   
---------------
 {d,f,b,e,c,a}
(1 row) 

Per the documentation (emphasis added):

Currently, functions returning sets can also be called in the select list of a query. For each row that the query generates by itself, the function returning set is invoked, and an output row is generated for each element of the function's result set. Note, however, that this capability is deprecated and might be removed in future releases.
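The "for each row" case described in that passage can also be moved into the FROM clause by using LATERAL (available since PostgreSQL 9.3). The sketch below shuffles one array per row; the games relation and its id and letters columns are hypothetical, built inline with VALUES only so the query is self-contained.

```sql
-- Hypothetical inline data: one array of letters per game.
WITH games(id, letters) AS (
    VALUES (1, ARRAY['a','b','c','d']),
           (2, ARRAY['x','y','z'])
)
SELECT g.id,
       -- shuffle each row's array independently
       array_agg(u.x ORDER BY random()) AS shuffled
FROM games AS g
CROSS JOIN LATERAL unnest(g.letters) AS u(x)  -- SRF in FROM, evaluated once per row of games
GROUP BY g.id
ORDER BY g.id;
```

Because unnest is in FROM, the result does not depend on when select-list set-returning functions are expanded relative to ORDER BY.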