ST_GeogFromGeoJSON fails in BigQuery but succeeds in Postgres

We have GeoJSON polygons that we want to convert to geography objects in BigQuery using ST_GeogFromGeoJSON. The conversion fails in BigQuery, while it succeeds in Postgres with the equivalent function ST_GeomFromGeoJSON.

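I am familiar with the SAFE prefix that can be added to the BigQuery call, but we want to use the object rather than ignore it when the conversion fails. I tried converting the object with ST_CONVEXHULL but could not get it to work.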
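For reference, a minimal sketch of the SAFE variant mentioned above, using the polygon from the example in this question. It illustrates why the prefix does not help here: the error is replaced by NULL, so the polygon is dropped rather than repaired.

```sql
-- SAFE. suppresses the "Invalid polygon loop" error and yields NULL instead,
-- which means the geometry is lost rather than fixed.
SELECT SAFE.ST_GEOGFROMGEOJSON(
  '{"type":"Polygon","coordinates":[[[-82.022982,26.69785],[-81.606813,26.710698],[-81.999574,26.109253],[-81.615053,26.105558],[-82.022982,26.69785]]]}'
) AS geo;
```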

Is there a workaround in BigQuery?

Example:

Running the following command in BigQuery:

select ST_GeogFromGeoJSON('{"type":"Polygon","coordinates":[[[-82.022982,26.69785],[-81.606813,26.710698],[-81.999574,26.109253],[-81.615053,26.105558],[-82.022982,26.69785]]]}')

returns

Query failed: ST_GeogFromGeoJSON failed: Invalid polygon loop: Edge 4 crosses edge 9

while running the equivalent in Postgres succeeds:

select ST_GeomFromGeoJSON('{"type":"Polygon","coordinates":[[[-82.022982,26.69785],[-81.606813,26.710698],[-81.999574,26.109253],[-81.615053,26.105558],[-82.022982,26.69785]]]}')
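A likely explanation for the difference is that PostGIS parses the GeoJSON without validating it, while BigQuery rejects invalid rings on import. The ring above is a "bowtie": its two diagonal edges cross each other. The following PostGIS sketch checks this with ST_IsValid and ST_IsValidReason; the aliases g, is_valid and reason are introduced here for illustration.

```sql
-- PostGIS: parsing succeeds, but the resulting geometry is not valid.
SELECT ST_IsValid(g)       AS is_valid,  -- expected to be false for a self-intersecting ring
       ST_IsValidReason(g) AS reason     -- describes where the ring intersects itself
FROM (
  SELECT ST_GeomFromGeoJSON('{"type":"Polygon","coordinates":[[[-82.022982,26.69785],[-81.606813,26.710698],[-81.999574,26.109253],[-81.615053,26.105558],[-82.022982,26.69785]]]}') AS g
) AS t;
```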

The following is for BigQuery Standard SQL.

Query failed: ST_GeogFromGeoJSON failed: Invalid polygon loop: Edge 4 crosses edge 9
... Is there some work around in bigquery? ...

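The proposed workaround is obviously a naive and simple way of addressing this specific issue, but it can easily be extended to more general cases. The idea is to extract the coordinates and reorder them so that the edges no longer cross: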

WITH test AS (
  SELECT '{"type":"Polygon","coordinates":[[[-82.022982,26.69785],[-81.606813,26.710698],[-81.999574,26.109253],[-81.615053,26.105558],[-82.022982,26.69785]]]}' AS geojson
)
SELECT ST_GEOGFROMGEOJSON('{"type":"Polygon","coordinates":' || fixed_coordinates || '}') AS geo
FROM (
  SELECT '[[[' || STRING_AGG(lat_lon, '],[') || '],[' || ANY_VALUE(ordered_coordinates[OFFSET(0)]) || ']]]' fixed_coordinates
  FROM (
    SELECT
      ARRAY( SELECT lon_lat
        FROM UNNEST(REGEXP_EXTRACT_ALL(JSON_EXTRACT(geojson, '$.coordinates'), r'\[+(.*?)\]+')) lon_lat
        ORDER BY CAST( SPLIT(lon_lat)[OFFSET(0)] AS FLOAT64), CAST(SPLIT(lon_lat)[OFFSET(1)] AS FLOAT64)
      ) ordered_coordinates
    FROM test
    ) t, t.ordered_coordinates lat_lon
)

This produces the correct output:

POLYGON((-82.022982 26.69785, -81.999574 26.109253, -81.8073135 26.1074055, -81.615053 26.105558, -81.606813 26.710698, -81.8148975 26.704274, -82.022982 26.69785))    

[Figure: visualization of the resulting polygon]
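The extraction step used in the query above can be examined on its own. This small sketch runs the same JSON_EXTRACT and REGEXP_EXTRACT_ALL combination on a toy ring (the toy coordinates are made up for illustration) and returns one "lon,lat" string per vertex.

```sql
-- JSON_EXTRACT returns the coordinates array as a string, e.g. [[[1,2],[3,4],[1,2]]].
-- The regex strips the surrounding brackets and captures each "lon,lat" pair.
SELECT lon_lat
FROM UNNEST(REGEXP_EXTRACT_ALL(
  JSON_EXTRACT('{"type":"Polygon","coordinates":[[[1,2],[3,4],[1,2]]]}', '$.coordinates'),
  r'\[+(.*?)\]+'
)) AS lon_lat;
```

Each returned string can then be split on the comma and cast to FLOAT64, which is what the ORDER BY clause in the workaround does.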

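The following is for BigQuery Standard SQL.

My previous answer was based on an overly simplistic logic of re-ordering coordinates. Obviously it will not work in more complex cases like the one below: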

{"type":"Polygon","coordinates":[[[-0.49044,51.4737],[-0.4907,51.4737],[-0.49075,51.46989],[-0.48664,51.46987],[-0.48664,51.47341],[-0.48923,51.47336],[-0.48921,51.4737],[-0.49072,51.47462],[-0.49114,51.47446],[-0.49044,51.4737]]]}

Is there some more advanced sorting logic that can be applied?

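So more sophisticated logic can be used to address this. The query below orders the points by their angle around the centroid: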

#standardSQL
WITH test AS (
  SELECT '{"type":"Polygon","coordinates":[[[-0.49044,51.4737],[-0.4907,51.4737],[-0.49075,51.46989],[-0.48664,51.46987],[-0.48664,51.47341],[-0.48923,51.47336],[-0.48921,51.4737],[-0.49072,51.47462],[-0.49114,51.47446],[-0.49044,51.4737]]]}' geojson
), coordinates AS (
  SELECT CAST(SPLIT(lon_lat)[OFFSET(0)] AS FLOAT64) lon, CAST(SPLIT(lon_lat)[OFFSET(1)] AS FLOAT64) lat
  FROM test, UNNEST(REGEXP_EXTRACT_ALL(JSON_EXTRACT(geojson, '$.coordinates'), r'\[+(.*?)\]+')) lon_lat), stats AS (
  SELECT ST_CENTROID(ST_UNION_AGG(ST_GEOGPOINT(lon, lat))) centroid FROM coordinates
) 
SELECT ST_MAKEPOLYGON(ST_MAKELINE(ARRAY_AGG(point ORDER BY sequence))) AS polygon
FROM (
  SELECT point, 
    CASE 
      WHEN ST_X(point) > ST_X(centroid) AND ST_Y(point) > ST_Y(centroid) THEN 3.14 - angle
      WHEN ST_X(point) > ST_X(centroid) AND ST_Y(point) < ST_Y(centroid) THEN 3.14 + angle
      WHEN ST_X(point) < ST_X(centroid) AND ST_Y(point) < ST_Y(centroid) THEN 6.28 - angle
      ELSE angle
    END sequence
  FROM (
    SELECT point, centroid, 
      ACOS(ST_DISTANCE(centroid, anchor) / ST_DISTANCE(centroid, point)) angle
    FROM (
      SELECT centroid, 
        ST_GEOGPOINT(lon, lat) point, 
        ST_GEOGPOINT(lon, ST_Y(centroid)) anchor
      FROM coordinates, stats
    )
  )
) 

This approach produces the correct output:

POLYGON((-0.49075 51.46989, -0.48664 51.46987, -0.48664 51.47341, -0.48923 51.47336, -0.48921 51.4737, -0.49072 51.47462, -0.49114 51.47446, -0.49044 51.4737, -0.4907 51.4737, -0.49075 51.46989))

[Figure: visualization of the resulting polygon]
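As a hedged alternative sketch (not the method used in this answer), the quadrant CASE logic can be collapsed by sorting on ATAN2, which returns the full signed angle directly and avoids the 3.14 and 6.28 approximations. It assumes that plain longitude/latitude differences around an averaged center are a good enough approximation for a small polygon; the names c_lon and c_lat are introduced here for illustration. Like the query above, it rebuilds the ring in angular order, so a concave shape may not keep its original outline.

```sql
#standardSQL
WITH test AS (
  SELECT '{"type":"Polygon","coordinates":[[[-0.49044,51.4737],[-0.4907,51.4737],[-0.49075,51.46989],[-0.48664,51.46987],[-0.48664,51.47341],[-0.48923,51.47336],[-0.48921,51.4737],[-0.49072,51.47462],[-0.49114,51.47446],[-0.49044,51.4737]]]}' AS geojson
), coordinates AS (
  -- Same extraction as in the answer: one (lon, lat) row per vertex.
  SELECT CAST(SPLIT(lon_lat)[OFFSET(0)] AS FLOAT64) AS lon,
         CAST(SPLIT(lon_lat)[OFFSET(1)] AS FLOAT64) AS lat
  FROM test, UNNEST(REGEXP_EXTRACT_ALL(JSON_EXTRACT(geojson, '$.coordinates'), r'\[+(.*?)\]+')) AS lon_lat
), stats AS (
  -- Assumption: a planar average of the vertices stands in for the centroid.
  SELECT AVG(lon) AS c_lon, AVG(lat) AS c_lat FROM coordinates
)
SELECT ST_MAKEPOLYGON(ST_MAKELINE(
  -- ATAN2(y, x) gives the angle in (-pi, pi], so no per-quadrant correction is needed.
  ARRAY_AGG(ST_GEOGPOINT(lon, lat) ORDER BY ATAN2(lat - c_lat, lon - c_lon))
)) AS polygon
FROM coordinates, stats
```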

Update (October 2020)

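No tricks are needed anymore: the ST_GEOGFROMGEOJSON and ST_GEOGFROMTEXT geography functions now support a new make_valid parameter. If set to TRUE, the function attempts to correct polygon issues when importing geography data.

So the simple statement below now works: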

select ST_GeogFromGeoJSON(
  '{"type":"Polygon","coordinates":[[[-0.49044,51.4737],[-0.4907,51.4737],[-0.49075,51.46989],[-0.48664,51.46987],[-0.48664,51.47341],[-0.48923,51.47336],[-0.48921,51.4737],[-0.49072,51.47462],[-0.49114,51.47446],[-0.49044,51.4737]]]}' 
  , make_valid => true
) 

and returns the expected output.
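Since the update also names ST_GEOGFROMTEXT, here is a minimal sketch of the same parameter applied to WKT input. The WKT string is simply the self-intersecting polygon from the original question rewritten by hand, not output taken from the thread.

```sql
-- Same make_valid parameter, applied to a WKT polygon instead of GeoJSON.
SELECT ST_GEOGFROMTEXT(
  'POLYGON((-82.022982 26.69785, -81.606813 26.710698, -81.999574 26.109253, -81.615053 26.105558, -82.022982 26.69785))',
  make_valid => TRUE
) AS geo;
```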