mongo-conector 未与 solr 连接 - 收集转储期间出现异常

mongo-conector not connecting with solr - Exception during collection dump

我正在用 solr 连接 MongoDB,

按照此文档进行集成: https://blog.toadworld.com/2017/02/03/indexing-mongodb-data-in-apache-solr

DB.Collection: solr.wlslog

D:\路径 solr\bin>

mongo-connector --unique-key=id -n solr.wlslog -m localhost:27017 -t http://localhost:8983/solr/wlslog -d solr_doc_manager

我收到以下响应和错误:

2020-06-15 12:15:45,744 [ALWAYS] mongo_connector.connector:50 - Starting mongo-connector version: 3.1.1
2020-06-15 12:15:45,744 [ALWAYS] mongo_connector.connector:50 - Python version: 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)]
2020-06-15 12:15:45,745 [ALWAYS] mongo_connector.connector:50 - Platform: Windows-10-10.0.18362-SP0
2020-06-15 12:15:45,745 [ALWAYS] mongo_connector.connector:50 - pymongo version: 3.10.1
2020-06-15 12:15:45,755 [ALWAYS] mongo_connector.connector:50 - Source MongoDB version: 4.2.2
2020-06-15 12:15:45,755 [ALWAYS] mongo_connector.connector:50 - Target DocManager: mongo_connector.doc_managers.solr_doc_manager version: 0.1.0
2020-06-15 12:15:45,787 [CRITICAL] mongo_connector.oplog_manager:713 - Exception during collection dump
Traceback (most recent call last):
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\doc_managers\solr_doc_manager.py", line 292, in
batch = list(next(cleaned) for i in range(self.chunk_size))
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\oplog_manager.py", line 668, in do_dump
upsert_all(dm)
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\oplog_manager.py", line 651, in upsert_all
dm.bulk_upsert(docs_to_dump(from_coll), mapped_ns, long_ts)
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\util.py", line 33, in wrapped
return f(*args, **kwargs)
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\doc_managers\solr_doc_manager.py", line 292, in bulk_upsert
batch = list(next(cleaned) for i in range(self.chunk_size))
RuntimeError: generator raised StopIteration
2020-06-15 12:15:45,801 [ERROR] mongo_connector.oplog_manager:723 - OplogThread: Failed during dump collection cannot recover! Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, replicaset='rs0'), 'local'), 'oplog.rs')
2020-06-15 12:15:46,782 [ERROR] mongo_connector.connector:408 - MongoConnector: OplogThread <OplogThread(Thread-2, started 4936)> unexpectedly stopped! Shutting down

我搜索了 mongo-connector 的 GitHub 个问题,但没有得到任何解决方案:

Github-issue-870

Github-issue-898

问题终于解决了:)

我的系统 OS 是 windows 并且我已经在 C:\Program Files\MongoDB\(系统的驱动器)中安装了 mongodb,

在此 mongo 连接器连接之前,我已根据此 blog:

使用以下命令为 mongodb 启动了副本集
mongod --port 27017 --dbpath ../data/db --replSet rs0

问题:

问题在--dbpath ../data/db目录,这个目录位于C:\Program Files\MongoDB\Server.2\data\db这个目录有所有权限但是父目录C:\Program Files没有所有权限,因为它的系统目录和保护目录.

实际问题是: (收集转储期间出现异常)

2020-06-15 12:15:45,787 [CRITICAL] mongo_connector.oplog_manager:713 - Exception during collection dump

解决方案:

我刚刚将 --dbpath 更改为另一个路径,该目录位于系统受保护目录之外,如下所示:

mongod --port 27017 --dbpath C:/data/db --replSet rs0

之后我执行了以下连接命令,正如我在问题中发布的那样:

mongo-connector --unique-key=id -n solr.wlslog -m localhost:27017 -t http://localhost:8983/solr/wlslog -d solr_doc_manager

成功mongo 连接器日志结果:

2020-06-17 12:08:52,292 [ALWAYS] mongo_connector.connector:50 - Starting mongo-connector version: 3.1.1
2020-06-17 12:08:52,292 [ALWAYS] mongo_connector.connector:50 - Python version: 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)]
2020-06-17 12:08:52,293 [ALWAYS] mongo_connector.connector:50 - Platform: Windows-10-10.0.18362-SP0
2020-06-17 12:08:52,293 [ALWAYS] mongo_connector.connector:50 - pymongo version: 3.10.1
2020-06-17 12:08:52,310 [ALWAYS] mongo_connector.connector:50 - Source MongoDB version: 4.2.2
2020-06-17 12:08:52,311 [ALWAYS] mongo_connector.connector:50 - Target DocManager: mongo_connector.doc_managers.solr_doc_manager version: 0.1.0

希望这个回答对大家有帮助:)

就我而言,这并没有解决问题。 我正在使用 python 3.8,所以对我来说这实际上是由于 https://docs.python.org/3/whatsnew/3.7.html#changes-in-python-behavior

PEP 479 is enabled for all code in Python 3.7, meaning that StopIteration exceptions raised directly or indirectly in coroutines and generators are transformed into RuntimeError exceptions. (Contributed by Yury Selivanov in bpo-32670.)

阅读 How yield catches StopIteration exception? 让我开始认为它与 yield doc 语句有关,但实际上问题是在第 292 行和稍后的几行 solr_doc_manager.py 中有 2 行调用 next() :

batch = list(next(cleaned) for i in range(self.chunk_size))

更改为:

        batch = []
        for i in range(self.chunk_size):
            for x in cleaned:
                batch.append(x)