PyMongo: how to query a series and find the closest match
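This is a simplified example of how my data is stored in MongoDB for a single athlete:
{ "_id" : ObjectId('5bd6eab25f74b70e5abb3326'),
  "Result" : 12,
  "Race" : [0.170, 4.234, 9.170],
  "Painscore" : 68
}
Now, when this athlete has run a race, I want to search for the stored race that is most similar to the current one, so that I can compare the two pain scores.
In order to get the best 'match' I tried this: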
query = [0.165, 4.031, 9.234]
closestBelow = db[athlete].find({'Race' : {"$lte": query}}, {"_id": 1, "Race": 1}).sort("Race", -1).limit(2)
for i in closestBelow:
    print(i)
closestAbove = db[athlete].find({'Race' : {"$gte": query}}, {"_id": 1, "Race": 1}).sort("Race", 1).limit(2)
for i in closestAbove:
    print(i)
This doesn't seem to work.
Question 1: How should I write this query in order to find the race in Mongo that matches best/closest, taking into account that a race is almost never exactly the same?
Question 2: How can I see a match percentage per document, so that an athlete knows how seriously he should interpret the pain score?
Thanks.
Thanks to this site I found a solution: http://dataaspirant.com/2015/04/11/five-most-popular-similarity-measures-implementation-in-python/
Step 1: Find your query;
Step 2: Make a first selection based on the query (for example on the average) and append the results to a list;
Step 3: Use a for loop to compare each item in the list with your query. Use the Euclidean distance for this;
Step 4: Once the matching has been processed, store the best match in a variable.
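import numpy
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')
Database = 'Firstclass'

def euclidean_distance(a, b):
    # Straight-line distance between two equal-length sequences of numbers
    return float(numpy.linalg.norm(numpy.array(a) - numpy.array(b)))

def newSearch(Athlete):
    # STEP 1: use the most recent race as the query, then pre-select
    # races whose 'Average' lies within 10% of the query's 'Average'
    db = client[Database]
    lastDoc = list(db[Athlete].find({}, {'_id': 1, 'Race': 1, 'Average': 1}).sort('_id', -1).limit(1))
    average = lastDoc[0].get('Average')
    query = {'Average': {'$gte': average * 0.9, '$lte': average * 1.1}}
    funnel = list(db[Athlete].find(query, {'_id': 1, 'Race': 1}).sort('_id', -1).limit(15))
    # STEP 2: collect the candidates, skipping the query race itself
    compareListID = []
    compareListRace = []
    for x in funnel:
        if lastDoc[0].get('_id') != x.get('_id'):
            compareListID.append(x.get('_id'))
            compareListRace.append(x.get('Race'))
    if not compareListRace:
        return None  # no other race falls inside the pre-selection
    # STEP 3: Euclidean distance from the query race to every candidate
    ESlist = []
    for y in compareListRace:
        ED = euclidean_distance(lastDoc[0].get('Race'), y)
        ESlist.append(ED)
    # STEP 4: the best match is the candidate with the smallest distance
    best = numpy.argmin(ESlist)
    matchObjID = compareListID[best]
    matchRace = compareListRace[best]
    return matchObjID, matchRace

newSearch('Jim')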
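Question 2 (a match percentage per document) is not covered by the steps above. One possible approach, sketched here as an assumption rather than as part of the original solution, is to map the Euclidean distance d to a score with 100 / (1 + d): identical races give 100, and the score falls toward 0 as the distance grows. The name match_percentage is hypothetical, and the result depends on the units of the Race values, so it is a relative indicator rather than a statistical probability.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length sequences of numbers."""
    if len(a) != len(b):
        raise ValueError("races must have the same number of points")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_percentage(a, b):
    """Hypothetical helper: map a distance to a score in (0, 100].

    Assumption: similarity = 1 / (1 + distance), scaled to a percentage.
    """
    return 100.0 / (1.0 + euclidean_distance(a, b))

query = [0.165, 4.031, 9.234]   # current race, taken from the question
stored = [0.170, 4.234, 9.170]  # stored race, taken from the question
print(round(match_percentage(query, stored), 1))
```

The same function could be applied to each candidate collected in step 2, so that every document gets its own score next to its Painscore.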