Extract values from a JSON nested list and string array with Python
I am trying to extract coordinates from a JSON dataset covering multiple neighborhoods in Boston, MA, but I am stuck trying to get only the first coordinate pair for each neighborhood. Below is a shortened version of the data for Roslindale.
"features": [{
    "type": "Feature",
    "properties": {
        "Name": "Roslindale",
        "Acres": 1605.5682375,
        "SqMiles": 2.51
    },
    "geometry": {
        "type": "MultiPolygon",
        "coordinates": [
            [
                [
                    [
                        -71.125927174853857,
                        42.272013107957406
                    ],
                    [
                        -71.125927174853857,
                        42.272013107957406
                    ]
                ]
            ],
            [
                [
                    [
                        -71.125830766767592,
                        42.272212845889705
                    ],
                    [
                        -71.125830766767592,
                        42.272212845889705
                    ]
                ]
            ],
            [
                [
                    [
                        -71.125767203228904,
                        42.272315958536389
                    ],
                    [
                        -71.125767203228904,
                        42.272315958536389
                    ]
                ]
            ]
        ]
    }
},
So far I have extracted the data I want using:
for data in boston_neighborhoods:
    neighborhood_name = data['properties']['Name']
    neighborhood_id = data['properties']['Neighborhood_ID']
    neighborhood_size = data['properties']['SqMiles']
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon
    neighborhood_lon = neighborhood_latlon
    neighborhoods = neighborhoods.append({'Neighborhood': neighborhood_name,
                                          'Neighborhood_ID': neighborhood_id,
                                          'SqMiles': neighborhood_size,
                                          'Latitude': neighborhood_lat,
                                          'Longitude': neighborhood_lon}, ignore_index=True)
This returns multiple coordinate pairs, but I only want the first pair. Below is a sample of the output I currently get:
Latitude | Longitude
--------------------------------------------------------
[[[[-71.12592717485386, | [[[[-71.12592717485386,
42.272013107957406], [... | 42.272013107957406], [...
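It may be a bit overkill, but JMESPath makes querying nested JSON structures very easy.
Walking down the document, you first need to get every element in the array ([*]), then for each element you select items into an object (a Python dict). You select the neighborhood name by going under properties and then Name (properties.Name). You do the same for the other nested properties.
The coordinates are under geometry.coordinates, which for a MultiPolygon is nested several levels deep: polygons, then rings, then coordinate pairs.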
import jmespath
import pandas as pd

# GeoJSON coordinate pairs are ordered [longitude, latitude],
# so index 1 is the latitude and index 0 is the longitude.
query = """
[*].{
    Neighborhood: properties.Name,
    Neighborhood_ID: properties.Neighborhood_ID,
    SqMiles: properties.SqMiles,
    Latitude: geometry.coordinates[0][0][0][1],
    Longitude: geometry.coordinates[0][0][0][0]
}
"""
compiled = jmespath.compile(query)
# boston_neighborhoods is the list stored under "features"
result = compiled.search(boston_neighborhoods)
df = pd.DataFrame.from_records(result)
#   Neighborhood Neighborhood_ID  SqMiles   Latitude  Longitude
# 0   Roslindale            None     2.51  42.272013 -71.125927
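If you would rather not add a dependency, plain indexing gets the same first pair. The sketch below is only an illustration under a few assumptions: `boston_neighborhoods` is the list stored under "features", every geometry is a MultiPolygon like the sample, and the helper name `first_pair_rows` is made up for this example. It relies on GeoJSON storing each pair as [longitude, latitude], and it uses `.get()` for Neighborhood_ID because the shortened sample does not include that property.

```python
# Sample shaped like the shortened data in the question (assumed structure).
boston_neighborhoods = [
    {
        "type": "Feature",
        "properties": {"Name": "Roslindale", "Acres": 1605.5682375, "SqMiles": 2.51},
        "geometry": {
            "type": "MultiPolygon",
            "coordinates": [
                [[[-71.125927174853857, 42.272013107957406],
                  [-71.125927174853857, 42.272013107957406]]],
                [[[-71.125830766767592, 42.272212845889705],
                  [-71.125830766767592, 42.272212845889705]]],
            ],
        },
    }
]


def first_pair_rows(features):
    """Hypothetical helper: one dict per feature, keeping only the first pair."""
    rows = []
    for data in features:
        props = data["properties"]
        # MultiPolygon nesting: first polygon -> first ring -> first pair.
        # Each pair is [longitude, latitude].
        lon, lat = data["geometry"]["coordinates"][0][0][0]
        rows.append({
            "Neighborhood": props["Name"],
            # .get() returns None when the property is missing
            "Neighborhood_ID": props.get("Neighborhood_ID"),
            "SqMiles": props["SqMiles"],
            "Latitude": lat,
            "Longitude": lon,
        })
    return rows


rows = first_pair_rows(boston_neighborhoods)
print(rows)
```

Passing `rows` to `pd.DataFrame(rows)` builds the frame in one step, which also avoids `DataFrame.append`; that method was removed in pandas 2.0.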