Remove older versions from a GeoJSON array
I have a large GeoJSON file with the following general structure:
{
  "features": [
    {
      "geometry": {
        "coordinates": [
          [
            [-12.345, 26.006],
            [-78.56, 24.944],
            [-76.44, 24.99],
            [-76.456, 26.567],
            [-78.345, 26.23456]
          ]
        ],
        "type": "Polygon"
      },
      "id": "Some_ID_01",
      "properties": {
        "parameters": "elevation"
      },
      "type": "Feature"
    },
    {
      "geometry": {
        "coordinates": [
          [
            [139.345, 39.2345],
            [139.23456, 37.3465],
            [141.678, 37.7896],
            [141.2345, 39.6543],
            [139.7856, 39.2345]
          ]
        ],
        "type": "Polygon"
      },
      "id": "Some_OtherID_01",
      "properties": {
        "parameters": "elevation"
      },
      "type": "Feature"
    },
    {
      "geometry": {
        "coordinates": [
          [
            [143.8796, -30.243],
            [143.456, -32.764],
            [145.3452, -32.76],
            [145.134, -30.87],
            [143.123, -30.765]
          ]
        ],
        "type": "Polygon"
      },
      "id": "Some_ID_02",
      "properties": {
        "parameters": "elevation"
      },
      "type": "Feature"
    }
  ],
  "type": "FeatureCollection"
}
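I am trying to remove duplicate IDs and keep only the latest version (i.e. Some_ID_01 and Some_ID_02 count as duplicates for my purposes, and I want to keep Some_ID_02). These "duplicates" appear in no particular order (though it would be nice if I could sort them along the way, perhaps alphabetically), and they do not necessarily contain the same coordinate values (they are newer versions of the same point).
So far I have read several threads on removing duplicate JSON entries (in particular, I tried to modify the code in this guide here), but I don't know enough JS to adapt it to my specific needs. I am reading up on underscore.js to see whether it helps (based on suggestions in other threads), and I will also look at Python or Excel (with the data as a CSV file) to see whether either of those simplifies things.
Would it be possible to feed the GeoJSON into a program and get a file back (even if it is only a text file), or would it be simpler to input it inline?
I chose Python because I am more proficient in that language. I am posting the code below for reference, but you can also find another post I made that goes into more detail about the problem I had with removing keys from a dictionary using a list.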
import json

# Load the source FeatureCollection
with open('features.json') as json_file:
    json_data = json.load(json_file)

dictionaryOfJsonId = {}  # maps an ID prefix (e.g. 'Some_ID_') to its list of version suffixes
IDList = []              # starts with every ID; ends up holding only the outdated ones

for feature in json_data['features']:
    stringToSplit = feature['id']
    IDList.append(stringToSplit)
    newKey = stringToSplit[:-2]    # everything except the last two characters
    newValue = stringToSplit[-2:]  # last two characters: the version, assumed zero-padded
    dictionaryOfJsonId.setdefault(newKey, []).append(newValue)

# For each prefix, the highest version is the one to keep, so take it off the removal list
for key, versions in dictionaryOfJsonId.items():
    IDList.remove(key + max(versions))

# Keep every feature whose ID is not on the removal list
good_features = [f for f in json_data['features'] if f['id'] not in IDList]

# Write a FeatureCollection (not a bare list) so the output is still valid GeoJSON
with open('features.geojson', 'w') as outfile:
    json.dump({'type': 'FeatureCollection', 'features': good_features}, outfile)

print("Removed", len(json_data['features']) - len(good_features), "entries from JSON")
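For comparison, here is a minimal single-pass sketch of the same idea that also sorts the surviving features alphabetically by ID. It is not the code I actually ran. The names `ID_PATTERN`, `split_id` and `keep_latest_versions` are made up for illustration, and it assumes every ID has the form `<base>_<version>` with a purely numeric version. Versions are compared as integers, so a suffix like `10` ranks above `9` even without zero padding.

```python
import re

# Assumption: an ID is "<base>_<version>", where <version> is a run of digits.
ID_PATTERN = re.compile(r"^(?P<base>.+)_(?P<version>\d+)$")


def split_id(feature_id):
    """Split an ID such as 'Some_ID_02' into ('Some_ID', 2)."""
    match = ID_PATTERN.match(feature_id)
    if match is None:
        # IDs without a numeric suffix are treated as their own base, version 0.
        return feature_id, 0
    return match.group("base"), int(match.group("version"))


def keep_latest_versions(collection):
    """Return a new FeatureCollection with only the newest feature per base ID."""
    latest = {}  # base -> (version, feature)
    for feature in collection["features"]:
        base, version = split_id(feature["id"])
        if base not in latest or version > latest[base][0]:
            latest[base] = (version, feature)
    # Sort the surviving features alphabetically by their full ID.
    features = sorted((f for _, f in latest.values()), key=lambda f: f["id"])
    return {"type": "FeatureCollection", "features": features}
```

To go from file to file, load the input with `json.load`, pass the resulting dict to `keep_latest_versions`, and write the return value with `json.dump`.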