How can I loop through a nested list to store the values in a data frame?
Given a nested dictionary neighborhood_data, its first item, neighborhood_data[0], shows:
{'type': 'Feature',
'geometry': {'type': 'MultiPolygon',
'coordinates': [[[[28.073783, -26.343133],
[28.071239, -26.351536],
[28.068717, -26.350644],
[28.06663, -26.351362],
[28.065161, -26.352135],
[28.064671, -26.35399]]]],
'properties': {'cartodb_id': 1,
'subplace_c': 761001001,
'province': 'Gauteng',
'wardid': '74202012',
'district_m': 'Sedibeng',
'local_muni': 'Midvaal',
'main_place': 'Alberton',
'mp_class': 'Settlement',
'sp_name': 'Brenkondown',
'suburb_nam': 'Brenkondown',
'metro': 'Johannesburg',
'african': 330,
'white': 24,
'asian': 0,
'coloured': 2,
'other': 0,
'totalpop': 356}}}
I then created an empty data frame, neighborhoods:
# define the dataframe columns
column_names = ['Province', 'District', 'Local_municipality','Main Place', 'Suburb','Metro','Latitude','Longitude']
# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)
However, when I loop through neighborhood_data to store the relevant data in the neighborhoods data frame, I get the following error:
for data in neighborhood_data:
    province = data['properties']['province']
    district = data['properties']['district_m']
    local_muni_name = suburb_name = data['properties']['local_muni']
    suburb_name = data['properties']['suburb_nam']
    metro = data['properties']['metro']
    suburb_latlon = data['geometry']['coordinates']
    subur_lat = suburb_latlon[[[[1]]]]
    suburb_lon = suburb_latlon[[[[0]]]]
    neighborhoods = neighborhoods.append({'Province': province,
                                          'District': district,
                                          'Local_municipality': local_muni_name,
                                          'Main place': main_place,
                                          'Suburb': suburb_name,
                                          'Metro': metro,
                                          'Latitude': suburb_lat,
                                          'Longitude': suburb_lon}, ignore_index=True)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-17-a5dc74ed4207> in <module>
7
8 suburb_latlon = data['geometry']['coordinates']
----> 9 subur_lat = suburb_latlon[[[[1]]]]
10 suburb_lon = suburb_latlon[[[[0]]]]
11
TypeError: list indices must be integers or slices, not list
So how can I store the latitude and longitude coordinates in the 'Latitude' and 'Longitude' columns of the empty data frame?
Your dictionary looks malformed: a closing delimiter seems to be missing after the coordinates key. Let's assume this is the correct dictionary:
{'geometry': {'coordinates': [[[[28.073783, -26.343133],
[28.071239, -26.351536],
[28.068717, -26.350644],
[28.06663, -26.351362],
[28.065161, -26.352135],
[28.064671, -26.35399]]]],
'properties': {'african': 330,
'asian': 0,
'cartodb_id': 1,
'coloured': 2,
'district_m': 'Sedibeng',
'local_muni': 'Midvaal',
'main_place': 'Alberton',
'metro': 'Johannesburg',
'mp_class': 'Settlement',
'other': 0,
'province': 'Gauteng',
'sp_name': 'Brenkondown',
'subplace_c': 761001001,
'suburb_nam': 'Brenkondown',
'totalpop': 356,
'wardid': '74202012',
'white': 24},
'type': 'MultiPolygon'},
'type': 'Feature'}
Then, rather than accessing the coordinates like this:
suburb_latlon = data['geometry']['coordinates']
subur_lat = suburb_latlon[[[[1]]]] # <--- Indexing error here
suburb_lon = suburb_latlon[[[[0]]]] # <--- Indexing error here
we want to do the following (indexing one list level at a time until we reach the coordinates):
suburb_latlon = data['geometry']['coordinates']
suburb_lat = suburb_latlon[0][0][0][1] # <--- Not sure what your logic is here, or why you would pick the first point, but I'll assume you can customize the indices once the procedure is clear.
suburb_lon = suburb_latlon[0][0][0][0] # <--- Same here
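To make the difference concrete, here is a minimal sketch that uses the first two points from the question's sample. It shows that a list literal used as an index raises the TypeError, while each [0] steps one level deeper. The names coordinates, error_message and first_point are illustrative only and do not appear in the original code.

```python
# Sample MultiPolygon coordinates from the question (shortened to two points).
coordinates = [[[[28.073783, -26.343133],
                 [28.071239, -26.351536]]]]

# [[[1]]] is itself a list, so using it as an index raises a TypeError.
try:
    coordinates[[[[1]]]]
except TypeError as exc:
    error_message = str(exc)

# Each [0] steps one level deeper: polygon -> ring -> point.
first_point = coordinates[0][0][0]

# GeoJSON stores each point as [longitude, latitude].
suburb_lon, suburb_lat = first_point

print(error_message)
print(suburb_lat, suburb_lon)
```

Unpacking the point into two names avoids repeating the index chain and makes the longitude-first order explicit.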
I don't think your coordinates are formatted correctly. You currently have three extra square brackets opened, but none closed:
'coordinates': [[[[28.073783, -26.343133]... [28.064671, -26.35399],
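If you want to keep the current format, you need to add the three missing square brackets ']]]' at the end of the data. Alternatively, remove two of the opening square brackets and format your coordinates as follows: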
'coordinates' : [[28.073783, -26.343133], [28.071239, -26.351536]...]
With that flatter format, you can then access them with:
suburb_latlon = data['geometry']['coordinates']
suburb_lat = suburb_latlon[0][1]
suburb_lon = suburb_latlon[0][0]
This accesses the first list item, [28.073783, -26.343133], then assigns lat to the second element of that list and lon to the first.
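Putting the pieces together, the loop from the question could be sketched as follows. This is one possible approach, not the only one: it collects one dictionary per feature in a list and builds the data frame once at the end, since DataFrame.append was removed in pandas 2.0. The sample neighborhood_data below is a shortened, hypothetical stand-in that assumes 'properties' is a top-level key of each feature (as the question's loop expects) and that the coordinates keep their four-level nesting.

```python
import pandas as pd

# Hypothetical stand-in for neighborhood_data: a list holding one feature.
neighborhood_data = [{
    'type': 'Feature',
    'geometry': {'type': 'MultiPolygon',
                 'coordinates': [[[[28.073783, -26.343133],
                                   [28.071239, -26.351536]]]]},
    'properties': {'province': 'Gauteng', 'district_m': 'Sedibeng',
                   'local_muni': 'Midvaal', 'main_place': 'Alberton',
                   'suburb_nam': 'Brenkondown', 'metro': 'Johannesburg'},
}]

column_names = ['Province', 'District', 'Local_municipality', 'Main Place',
                'Suburb', 'Metro', 'Latitude', 'Longitude']

rows = []
for data in neighborhood_data:
    props = data['properties']
    # First point of the first ring of the first polygon, stored as [lon, lat].
    lon, lat = data['geometry']['coordinates'][0][0][0]
    rows.append({'Province': props['province'],
                 'District': props['district_m'],
                 'Local_municipality': props['local_muni'],
                 'Main Place': props['main_place'],
                 'Suburb': props['suburb_nam'],
                 'Metro': props['metro'],
                 'Latitude': lat,
                 'Longitude': lon})

# Build the data frame once, in the column order defined above.
neighborhoods = pd.DataFrame(rows, columns=column_names)
print(neighborhoods)
```

Note that the dictionary keys must match column_names exactly: the question's loop uses 'Main place' while the column is named 'Main Place', which would produce a mismatched column.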