Wrong order of elements after GROUP BY using ST_MakeLine
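I have a table (well, it's a CTE) containing paths, each stored as an array of node IDs, and a table of nodes with their geometries. I'm trying to SELECT the paths with their start and end nodes and their geometries, like this: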
SELECT *
FROM (
  SELECT t.path_id, t.segment_num, t.start_node, t.end_node,
         ST_MakeLine(n.geom) AS geom
  FROM (
    SELECT path_id, segment_num,
           nodes[1] AS start_node,
           nodes[array_upper(nodes, 1)] AS end_node,
           unnest(nodes) AS node_id
    FROM paths
  ) t
  JOIN nodes n ON n.id = t.node_id
  GROUP BY path_id, segment_num, start_node, end_node
) rs
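This seems to work fine when I try it on a single sample path, but when I run it on a large dataset, a small number of the resulting geometries are wrong: apparently ST_MakeLine received the points in the wrong order. I suspect parallel aggregation is causing the wrong order, but maybe I'm missing something else here?
How can I make sure the points passed to ST_MakeLine are in the correct order?
If I'm right about parallel aggregation, the Postgres docs say that "Scans of common table expressions (CTEs) are always parallel restricted", but does that mean I have to build a CTE with the unnested arrays and mark it AS MATERIALIZED so that it isn't optimized back into the query?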
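For ST_MakeLine to create the LineString in the right order, you have to state that order explicitly with ORDER BY. The following example shows how the order of the points makes a huge difference in the output:
Without ordering: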
WITH j (id,geom) AS (
VALUES
(3,'SRID=4326;POINT(1 2)'::geometry),
(1,'SRID=4326;POINT(3 4)'::geometry),
(0,'SRID=4326;POINT(1 9)'::geometry),
(2,'SRID=4326;POINT(8 3)'::geometry)
)
SELECT ST_MakeLine(geom) FROM j;
Ordered by id:
WITH j (id,geom) AS (
VALUES
(3,'SRID=4326;POINT(1 2)'::geometry),
(1,'SRID=4326;POINT(3 4)'::geometry),
(0,'SRID=4326;POINT(1 9)'::geometry),
(2,'SRID=4326;POINT(8 3)'::geometry)
)
SELECT ST_MakeLine(geom ORDER BY id) FROM j;
Demo: db<>fiddle
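To make the difference easier to read, both aggregates can be run side by side and printed as WKT with ST_AsText. This is only a sketch that reuses the sample data above; the column aliases `unordered` and `ordered_by_id` are made up for illustration. Without ORDER BY the input order is whatever the executor happens to deliver, so no particular result should be expected in the first column.

```sql
-- Same sample points as above; ids are deliberately out of order.
WITH j (id, geom) AS (
  VALUES
    (3, 'SRID=4326;POINT(1 2)'::geometry),
    (1, 'SRID=4326;POINT(3 4)'::geometry),
    (0, 'SRID=4326;POINT(1 9)'::geometry),
    (2, 'SRID=4326;POINT(8 3)'::geometry)
)
SELECT
  -- No ORDER BY: vertex order is unspecified and may change with the plan.
  ST_AsText(ST_MakeLine(geom))             AS unordered,
  -- ORDER BY inside the aggregate call: vertices follow id 0, 1, 2, 3,
  -- i.e. LINESTRING(1 9,3 4,8 3,1 2).
  ST_AsText(ST_MakeLine(geom ORDER BY id)) AS ordered_by_id
FROM j;
```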
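Thanks for reminding me of the ST_MakeLine(geom ORDER BY something) option; ST_MakeLine is an aggregate function after all. I didn't have any explicit ordering column available (the order lives in the nodes array, but a node can appear more than once). Fortunately, unnest can be used in the FROM clause with WITH ORDINALITY, which gives me an ordering column. Working solution: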
SELECT *
FROM (
  SELECT t.path_id, t.segment_num, t.start_node, t.end_node,
         ST_MakeLine(n.geom ORDER BY node_order) AS geom
  FROM (
    SELECT path_id, segment_num,
           nodes[1] AS start_node,
           nodes[array_upper(nodes, 1)] AS end_node,
           a.elem AS node_id, a.nr AS node_order
    FROM paths, unnest(nodes) WITH ORDINALITY a(elem, nr)
  ) t
  JOIN nodes n ON n.id = t.node_id
  GROUP BY path_id, segment_num, start_node, end_node
) rs
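For anyone who wants to try the WITH ORDINALITY approach without the real tables, here is a small self-contained sketch. The two CTEs and their contents are invented sample data standing in for the actual paths and nodes tables, and the path deliberately visits node 10 twice, which is the case where ordering by node id would not work.

```sql
-- Hypothetical sample data: one path that revisits node 10.
WITH paths (path_id, nodes) AS (
  VALUES (1, ARRAY[10, 20, 10, 30])
),
nodes (id, geom) AS (
  VALUES
    (10, 'SRID=4326;POINT(0 0)'::geometry),
    (20, 'SRID=4326;POINT(1 1)'::geometry),
    (30, 'SRID=4326;POINT(2 0)'::geometry)
)
SELECT p.path_id,
       -- a.nr is the 1-based position of the element in the array,
       -- so the line follows the array order even for repeated nodes.
       ST_AsText(ST_MakeLine(n.geom ORDER BY a.nr)) AS wkt
FROM paths p
CROSS JOIN LATERAL unnest(p.nodes) WITH ORDINALITY AS a(elem, nr)
JOIN nodes n ON n.id = a.elem
GROUP BY p.path_id;
```

The explicit CROSS JOIN LATERAL is equivalent to the comma join used in the solution above; it is spelled out here only to make the dependency on each path row visible.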