Insert many based on upserting condition with Pymongo

I am trying to find an efficient way to upload a Pandas DataFrame to a MongoDB collection, with the following constraints:

I tried:

from pymongo import UpdateOne

upserts = [
    UpdateOne(
        {"$and": [
            {'business_id': x['business_id']},
            {"document_key": x["document_key"]}
        ]},
        {'$setOnInsert': x},
        upsert=True
    )
    for x in dd.to_dict("records")
]

result = collection.bulk_write(upserts)

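But it does not seem to update the documents, and it does not follow the overwrite/new-document creation strategy described above.

How can I insert according to the two points described above?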

I suspect you want $set rather than $setOnInsert. From the documentation:

If an update operation with upsert: true results in an insert of a document, then $setOnInsert assigns the specified values to the fields in the document. If the update operation does not result in an insert, $setOnInsert does nothing.

https://docs.mongodb.com/manual/reference/operator/update/setOnInsert/
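To make the quoted behaviour concrete, here is a toy in-memory model of a single upsert. It is only a sketch, not how MongoDB is implemented: apply_upsert is a hypothetical helper written for this illustration, it handles plain equality filters only, and it knows just the two operators discussed here.

```python
def apply_upsert(collection, flt, update):
    """Toy model of update_one(flt, update, upsert=True) on a list of dicts."""
    for doc in collection:
        if all(doc.get(k) == v for k, v in flt.items()):
            # A document matched: only $set is applied, $setOnInsert is ignored.
            doc.update(update.get("$set", {}))
            return "matched"
    # Nothing matched: insert a new document seeded from the equality filter,
    # then apply both $set and $setOnInsert.
    new_doc = dict(flt)
    new_doc.update(update.get("$set", {}))
    new_doc.update(update.get("$setOnInsert", {}))
    collection.append(new_doc)
    return "upserted"


existing = [{"business_id": 3, "document_key": 3, "Existing": True}]
key = {"business_id": 3, "document_key": 3}
record = {"business_id": 3, "document_key": 3, "Updated": True}

# Same matching document, two different operators.
with_set_on_insert = [dict(d) for d in existing]
outcome_a = apply_upsert(with_set_on_insert, key, {"$setOnInsert": record})

with_set = [dict(d) for d in existing]
outcome_b = apply_upsert(with_set, key, {"$set": record})

# A key that does not exist yet is inserted by either operator.
new_key = {"business_id": 42, "document_key": 42}
outcome_c = apply_upsert(with_set, new_key, {"$set": {"Updated": True}})

print(outcome_a, with_set_on_insert)
print(outcome_b, outcome_c, with_set)
```

With $setOnInsert the matched document is left untouched, which is the symptom described in the question; with $set the matched document gains the new fields.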

A working example using $set:

import pandas as pd
from pymongo import MongoClient, UpdateOne

db = MongoClient()['mydatabase']
collection = db['mycollection']

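# Seed the collection with 10 existing documents (business_id/document_key 0..9).
collection.insert_many([{'business_id': x, 'document_key': x, 'Existing': True} for x in range(10)])
# DataFrame with 3 records whose keys (3, 4, 5) already exist in the collection.
df = pd.DataFrame([{'business_id': x, 'document_key': x, 'Updated': True} for x in range(3, 6)])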

upserts = [
    UpdateOne(
        {'business_id': x['business_id'],
         "document_key": x["document_key"]},
        {'$set': x},
        upsert=True
    )
    for x in df.to_dict("records")
]

result = collection.bulk_write(upserts)

print(f'Matched: {result.matched_count}, Upserted: {result.upserted_count}, Modified: {result.modified_count}')

for document in collection.find({}, {'_id': 0}):
    print(document)

Prints:

Matched: 3, Upserted: 0, Modified: 3
{'business_id': 0, 'document_key': 0, 'Existing': True}
{'business_id': 1, 'document_key': 1, 'Existing': True}
{'business_id': 2, 'document_key': 2, 'Existing': True}
{'business_id': 3, 'document_key': 3, 'Existing': True, 'Updated': True}
{'business_id': 4, 'document_key': 4, 'Existing': True, 'Updated': True}
{'business_id': 5, 'document_key': 5, 'Existing': True, 'Updated': True}
{'business_id': 6, 'document_key': 6, 'Existing': True}
{'business_id': 7, 'document_key': 7, 'Existing': True}
{'business_id': 8, 'document_key': 8, 'Existing': True}
{'business_id': 9, 'document_key': 9, 'Existing': True}
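If some fields should be written only when a document is first created, the two operators can be combined in one update, as long as no field appears in both (MongoDB rejects an update that sets the same path twice). The sketch below is one possible way to split a record. build_upsert_spec, to_operations and the created_by field are hypothetical names made up for this illustration; only UpdateOne comes from PyMongo.

```python
def build_upsert_spec(record, key_fields, insert_only_fields=()):
    """Split one record into a (filter, update) pair for an upsert."""
    flt = {k: record[k] for k in key_fields}
    insert_only = {k: record[k] for k in insert_only_fields if k in record}
    # Everything that is neither a key nor insert-only is overwritten via $set.
    to_set = {k: v for k, v in record.items()
              if k not in flt and k not in insert_only}
    update = {}
    if to_set:
        update["$set"] = to_set
    if insert_only:
        update["$setOnInsert"] = insert_only
    if not update:
        # Record holds only key fields: keep the update non-empty.
        update["$setOnInsert"] = dict(flt)
    return flt, update


def to_operations(records, key_fields, insert_only_fields=()):
    """Turn records into UpdateOne operations for collection.bulk_write."""
    from pymongo import UpdateOne  # imported here so the helper above needs no PyMongo
    specs = (build_upsert_spec(r, key_fields, insert_only_fields) for r in records)
    return [UpdateOne(flt, update, upsert=True) for flt, update in specs]


record = {"business_id": 3, "document_key": 3, "Updated": True, "created_by": "loader"}
flt, update = build_upsert_spec(
    record,
    key_fields=("business_id", "document_key"),
    insert_only_fields=("created_by",),
)
print(flt)
print(update)
```

The resulting list from to_operations(df.to_dict("records"), ...) could then be passed to collection.bulk_write in the same way as the upserts list above.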