Appending GeoDataFrames does not return expected dataframe

Question

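I run into the following problem when trying to append dataframes that contain a geometry type. The pandas dataframe I am looking at looks like this: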

name     x_zone     y_zone
0  A1  65.422080  48.147850
1  A1  46.635708  51.165745
2  A1  46.597984  47.657444
3  A1  68.477700  44.073700
4  A3  46.635708  54.108190
5  A3  46.635708  51.844770
6  A3  63.309560  48.826878
7  A3  62.215572  54.108190

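As you can see, there are four rows per name, because they represent the corners of a polygon. I need this in the form of a polygon as defined in geopandas, i.e. I need a GeoDataFrame. To do that, I use the following code for just one of the names (only to check that it works):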

name = 'A1'
f = df[df['name'] == name]

x = f['x_zone'].to_list()
y = f['y_zone'].to_list()
polygon_geom = Polygon(zip(x, y))
crs = {'init': "EPSG:4326"}
polygon = gpd.GeoDataFrame(index=[name], crs=crs, geometry=[polygon_geom])
print(polygon)

which returns:

                                             geometry
A1  POLYGON ((65.42208 48.14785, 46.63571 51.16575...

polygon.info()

<class 'geopandas.geodataframe.GeoDataFrame'>
Index: 1 entries, A1 to A1
Data columns (total 1 columns):
 #   Column    Non-Null Count  Dtype   
---  ------    --------------  -----   
 0   geometry  1 non-null      geometry
dtypes: geometry(1)
memory usage: 16.0+ bytes

Great, so far so good. So, for more names, I thought the following would work:

unique_place = list(df['name'].unique())

GE = []
for name in unique_place:
    f = df[df['name']==name]
    x = f['x_zone'].to_list()
    y = f['y_zone'].to_list()
    polygon_geom = Polygon(zip(x, y))
    crs = {'init': "EPSG:4326"}
    polygon = gpd.GeoDataFrame(index=[name], crs=crs, geometry=[polygon_geom])
    print(polygon.info())
    GE.append(polygon)

But what it returns is a list, not a dataframe.

[                                             geometry
 A1  POLYGON ((65.42208 48.14785, 46.63571 51.16575...,
                                              geometry
 A3  POLYGON ((46.63571 54.10819, 46.63571 51.84477...]

This is strange, because *.append(**) works just fine when the things being appended are pandas dataframes.

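What am I missing? Also, even in the first case I am left with only the geometry column, but that is not a problem, because I can write the file to shp and read it back in to get the second column (name).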

Thanks for any solution that gets me moving forward!

Answer 1

I guess you need example code that uses groupby on your data. Let me know if that is not the case.

from io import StringIO
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon
import numpy as np

dats_str = """index  id     x_zone     y_zone
0  A1  65.422080  48.147850
1  A1  46.635708  51.165745
2  A1  46.597984  47.657444
3  A1  68.477700  44.073700
4  A3  46.635708  54.108190
5  A3  46.635708  51.844770
6  A3  63.309560  48.826878
7  A3  62.215572  54.108190"""

# read the string and convert it to a dataframe
df1 = pd.read_csv(StringIO(dats_str), sep=r'\s+', index_col='index')

# Use groupby as an iterator to:
# - collect the items of interest
# - process some data: mean, create Polygon, maybe others
# - collect/append everything into lists
ids = []
counts = []
meanx = []
meany = []
list_x = []
list_y = []
polygon = []
for label, group in df1.groupby('id'):
    # label: 'A1', 'A3'
    # group: the sub-dataframe for 'A1', then for 'A3'
    ids.append(label)   
    counts.append(len(group))         #number of rows
    meanx.append(group.x_zone.mean())
    meany.append(group.y_zone.mean())
    # process x,y data of this group -> for polygon
    xs = group.x_zone.values
    ys = group.y_zone.values
    list_x.append(xs)
    list_y.append(ys)
    polygon.append(Polygon(zip(xs, ys))) # make/collect polygon

# the items above are used to create a dataframe here
df_from_groupby = pd.DataFrame({'id': ids, 'counts': counts,
                                'meanx': meanx, 'meany': meany,
                                'list_x': list_x, 'list_y': list_y,
                                'polygon': polygon
                               })
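
Note that the frame built above is still a plain pandas DataFrame whose polygon column holds shapely objects. A minimal sketch of turning such a frame into a GeoDataFrame follows. It assumes a reasonably recent geopandas, and that EPSG:4326 (taken from the question) is the right CRS. The name df_polys is a made-up stand-in for df_from_groupby so that the snippet runs on its own.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Hypothetical stand-in for df_from_groupby: one row per id,
# with a column of shapely polygons built from the question's corners.
df_polys = pd.DataFrame({
    "id": ["A1", "A3"],
    "polygon": [
        Polygon([(65.422080, 48.147850), (46.635708, 51.165745),
                 (46.597984, 47.657444), (68.477700, 44.073700)]),
        Polygon([(46.635708, 54.108190), (46.635708, 51.844770),
                 (63.309560, 48.826878), (62.215572, 54.108190)]),
    ],
})

# Tell geopandas which column is the geometry column.
# The CRS is an assumption carried over from the question.
gdf = gpd.GeoDataFrame(df_polys, geometry="polygon", crs="EPSG:4326")

print(type(gdf).__name__)
print(gdf.geometry.name)
```

The id column is kept next to the geometry, so there should be no need to round-trip through a shapefile to get the names back.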

If you print the dataframe df_from_groupby, you get:

   id  counts      meanx      meany  \
0  A1       4  56.783368  47.761185   
1  A3       4  54.699137  52.222007   

                                        list_x  \
0    [65.42208, 46.635708, 46.597984, 68.4777]   
1  [46.635708, 46.635708, 63.30956, 62.215572]   

                                      list_y  \
0  [48.14785, 51.165745, 47.657444, 44.0737]   
1  [54.10819, 51.84477, 48.826878, 54.10819]   

                                             polygon  
0  POLYGON ((65.42207999999999 48.14785, 46.63570...  
1  POLYGON ((46.635708 54.10819, 46.635708 51.844...
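
As for the list in the question: GE is a plain Python list, so GE.append(polygon) is list.append, which only collects the one-row GeoDataFrames and does not stack their rows. A minimal sketch of combining them with pd.concat follows. It assumes the column names from the question's table, and the names frames and combined are only illustrative.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# The corner points from the question, four rows per name.
df = pd.DataFrame({
    "name": ["A1"] * 4 + ["A3"] * 4,
    "x_zone": [65.422080, 46.635708, 46.597984, 68.477700,
               46.635708, 46.635708, 63.309560, 62.215572],
    "y_zone": [48.147850, 51.165745, 47.657444, 44.073700,
               54.108190, 51.844770, 48.826878, 54.108190],
})

frames = []  # a plain Python list; list.append only collects objects
for name, group in df.groupby("name"):
    poly = Polygon(list(zip(group["x_zone"], group["y_zone"])))
    # Keep the name as a regular column next to the geometry.
    frames.append(
        gpd.GeoDataFrame({"name": [name]}, geometry=[poly], crs="EPSG:4326")
    )

# pd.concat stacks the one-row GeoDataFrames into a single GeoDataFrame.
combined = pd.concat(frames, ignore_index=True)

print(type(combined).__name__)
print(combined["name"].tolist())
```

Concatenating once after the loop also avoids repeatedly copying a growing frame inside the loop.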
