How can I efficiently export a large PostGIS (PostgreSQL) table to a GeoJSON file?

Question

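I have built a fairly large PostgreSQL database that mostly contains geospatial data. I am currently trying to export some of that data to GeoJSON so that I can tile it and use it with some maps (Mapbox).

For the first dataset I worked with, I wrote a quick script that uses the following ogr2ogr command to export the data to GeoJSON.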

ogr2ogr -f GeoJSON \
  -progress \
  nhd_.json \
  "PG:dbname=$PG_DB host=$PG_HOST port=$PG_PORT user=$PG_USERNAME password=$PG_PASSWORD" \
  -sql "select resolution, geom from nhd_hr_"

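Because the exported GeoJSON file was large, I then used geojsplit to break it into smaller sub-files. After that I was able to use the Mapbox tiling tool to create my tiles and then the layers for my map.

However, now that I have moved on to larger databases, I keep running into a problem: my connection to the database times out after downloading only 12-15 GB of data.

My first thought was to split my query into subqueries, but I am not entirely sure how to do that. Is there a way to change how I export the data? Or is there a way to break this query into more manageable chunks?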

Answer 1

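Use seq to generate numbers, then change your query to select a range of id values.

In this example the table and file name suffix is written as $suffix. In real life, once you are happy with the output, you would remove the echo to actually run the commands (or you could copy/paste the output into another terminal). You will need to change 2000 to a value safely above the maximum ogc_fid (select ogc_fid from whatever order by 1 desc limit 1 will find it), and you may want to change 1000 to some other chunk size, depending on how big each row is. If you do, don't forget to change the 1000 in the WHERE clause as well.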

$ seq 0 500 1500
0
500
1000
1500
$ for suffix in foo bar; do
> for each in $(seq 0 1000 2000); do
> echo ogr2ogr -f GeoJSON -progress nhd_${suffix}_${each}.json "PG:dbname=$PG_DB host=$PG_HOST port=$PG_PORT user=$PG_USERNAME password=$PG_PASSWORD" \
>  -sql "select resolution, geom from nhd_hr_$suffix where ogc_fid>=$each and ogc_fid < ($each + 1000)"
>  done
> done
ogr2ogr -f GeoJSON -progress nhd_foo_0.json PG:dbname= host= port= user= password= -sql select resolution, geom from nhd_hr_foo where ogc_fid>=0 and ogc_fid < (0 + 1000)
ogr2ogr -f GeoJSON -progress nhd_foo_1000.json PG:dbname= host= port= user= password= -sql select resolution, geom from nhd_hr_foo where ogc_fid>=1000 and ogc_fid < (1000 + 1000)
ogr2ogr -f GeoJSON -progress nhd_foo_2000.json PG:dbname= host= port= user= password= -sql select resolution, geom from nhd_hr_foo where ogc_fid>=2000 and ogc_fid < (2000 + 1000)
ogr2ogr -f GeoJSON -progress nhd_bar_0.json PG:dbname= host= port= user= password= -sql select resolution, geom from nhd_hr_bar where ogc_fid>=0 and ogc_fid < (0 + 1000)
ogr2ogr -f GeoJSON -progress nhd_bar_1000.json PG:dbname= host= port= user= password= -sql select resolution, geom from nhd_hr_bar where ogc_fid>=1000 and ogc_fid < (1000 + 1000)
ogr2ogr -f GeoJSON -progress nhd_bar_2000.json PG:dbname= host= port= user= password= -sql select resolution, geom from nhd_hr_bar where ogc_fid>=2000 and ogc_fid < (2000 + 1000)

Each chunk is written to its own output file (nhd_foo_0.json, nhd_foo_1000.json, and so on) so that a later chunk does not collide with an earlier one.

To make it easier to copy/paste into your shell, here it is as a one-liner:

for suffix in foo bar; do for each in $(seq 0 1000 2000); do echo ogr2ogr -f GeoJSON -progress nhd_${suffix}_${each}.json "PG:dbname=$PG_DB host=$PG_HOST port=$PG_PORT user=$PG_USERNAME password=$PG_PASSWORD" -sql "select resolution, geom from nhd_hr_$suffix where ogc_fid>=$each and ogc_fid < ($each + 1000)"; done; done
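
The same loop could also be wrapped in a small script so that the chunk size is set in one place instead of in both the seq call and the WHERE clause. The following is only a sketch and not part of the original answer: the function names chunk_starts, export_chunk and export_table, the DRY_RUN switch, and the per-chunk output file name are assumptions made for illustration. It keeps the table and column names from the question and defaults to printing the commands, just like the echo above.

```shell
# Print the first id of every chunk needed to cover ids 0..max_id.
# $1 = highest ogc_fid in the table, $2 = chunk size
chunk_starts() {
  seq 0 "$2" "$1"
}

# Export (or, in dry-run mode, just print) one chunk.
# $1 = table/file suffix, $2 = first id of the chunk, $3 = chunk size
export_chunk() {
  end=$(( $2 + $3 ))
  out="nhd_${1}_${2}.json"   # hypothetical naming: one file per chunk
  sql="select resolution, geom from nhd_hr_${1} where ogc_fid >= ${2} and ogc_fid < ${end}"
  conn="PG:dbname=$PG_DB host=$PG_HOST port=$PG_PORT user=$PG_USERNAME password=$PG_PASSWORD"
  if [ "${DRY_RUN:-1}" = 1 ]; then
    # Dry run (the default): show the command, hiding the connection string.
    echo "ogr2ogr -f GeoJSON -progress $out \"PG:...\" -sql \"$sql\""
  else
    ogr2ogr -f GeoJSON -progress "$out" "$conn" -sql "$sql"
  fi
}

# Export a whole table chunk by chunk.
# $1 = table/file suffix, $2 = highest ogc_fid, $3 = chunk size
export_table() {
  for start in $(chunk_starts "$2" "$3"); do
    export_chunk "$1" "$start" "$3"
  done
}

# Example usage (commented out; assumes psql can reach the same database):
#   max_id=$(psql -d "dbname=$PG_DB host=$PG_HOST port=$PG_PORT user=$PG_USERNAME" \
#                 -At -c "select max(ogc_fid) from nhd_hr_foo")
#   DRY_RUN=0 export_table foo "$max_id" 1000
```

Because seq stops at the last multiple of the chunk size that does not exceed the highest id, and every chunk covers a full chunk-size range from its start, the final chunk always includes the row with the highest ogc_fid. Run it with DRY_RUN left unset first to inspect the generated commands, then set DRY_RUN=0 once they look right.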

postgresql

postgis

gdal

bigdata

mapbox