Eventually consistent document store database similar to Cassandra
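I'm looking for an open-source data store that scales as easily as Cassandra, but whose data can be queried through documents, as in MongoDB.
Is there any database that currently does this?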
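The website http://nosql-database.org lists many NoSQL databases sorted by datastore type; you should check the document stores there.
I'm not naming any specific database, to avoid a biased/opinion-based answer. But if you're interested in a data store as scalable as Cassandra, you probably want to check those that use a master-master/multi-master/masterless architecture (call it what you like, the idea is the same), where both writes and reads can be split among all the nodes in the cluster.
I know Cassandra is optimized for writes rather than reads, but without more details in the question I can't refine the answer any further.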
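To make the masterless idea a bit more concrete, here is a minimal sketch. It is not how Cassandra or any particular database implements it. It assumes a toy in-memory cluster, and the names `Node`, `Cluster`, and `replication_factor` are made up for illustration. Each key hashes to a starting node and is stored on the next `replication_factor` nodes, so every node takes its share of both writes and reads.

```python
import hashlib


class Node:
    """A toy in-memory node (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Toy masterless cluster: every node accepts reads and writes."""

    def __init__(self, names, replication_factor=2):
        self.nodes = [Node(n) for n in names]
        self.replication_factor = replication_factor

    def owners(self, key):
        # Hash the key to pick a starting node, then take the next
        # replication_factor nodes around the ring.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        start = int(digest, 16) % len(self.nodes)
        return [
            self.nodes[(start + i) % len(self.nodes)]
            for i in range(self.replication_factor)
        ]

    def write(self, key, doc):
        # Store the document on every owner of the key.
        for node in self.owners(key):
            node.data[key] = doc

    def read(self, key):
        # Any owner can answer; return the first copy found.
        for node in self.owners(key):
            if key in node.data:
                return node.data[key]
        return None


cluster = Cluster(["n1", "n2", "n3"], replication_factor=2)
for i in range(100):
    cluster.write(f"user:{i}", {"id": i})

# Number of keys stored on each node.
load = {node.name: len(node.data) for node in cluster.nodes}
```

Printing `load` shows how the 100 keys, two copies each, are spread over the three nodes. A real system also has to handle node failures and conflicting writes, which this sketch leaves out.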
Update:
Disclaimer: I haven't used CouchDB at all, and haven't tested its performance either.
Now that you've found CouchDB, I'll add what I found in the official documentation, in the distributed database and replication section.
CouchDB is a peer-based distributed database system. It allows users
and servers to access and update the same shared data while
disconnected. Those changes can then be replicated bi-directionally
later.
The CouchDB document storage, view and security models are designed to
work together to make true bi-directional replication efficient and
reliable. Both documents and designs can replicate, allowing full
database applications (including application design, logic and data)
to be replicated to laptops for offline use, or replicated to servers
in remote offices where slow or unreliable connections make sharing
data difficult.
The replication process is incremental. At the database level,
replication only examines documents updated since the last
replication. Then for each updated document, only fields and blobs
that have changed are replicated across the network. If replication
fails at any step, due to network problems or crash for example, the
next replication restarts at the same document where it left off.
Partial replicas can be created and maintained. Replication can be
filtered by a javascript function, so that only particular documents
or those meeting specific criteria are replicated. This can allow
users to take subsets of a large shared database application offline
for their own use, while maintaining normal interaction with the
application and that subset of data.
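This looks pretty scalable to me, since it seems you can add new nodes to the cluster and all the data then gets replicated to them.
Partial replicas also look like an interesting option for really big data sets, though I'd configure them very carefully to prevent situations where a given query might not yield valid results, for example during a network partition when only a partial set is reachable.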
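As a rough illustration of the incremental and filtered replication described in the quote, here is a toy in-memory model. It is not CouchDB's implementation or API: `Database`, `replicate`, and `only_sales` are hypothetical names, a Python function stands in for the JavaScript filter, and whole documents are copied rather than only the changed fields. The last lines show the caveat about partial replicas: a query run against the subset can give a different answer than the same query on the full data set.

```python
class Database:
    """Toy document store with an update sequence (hypothetical model)."""

    def __init__(self):
        self.docs = {}      # doc id -> document body
        self.changes = []   # ordered (seq, doc_id), one entry per document
        self.seq = 0

    def put(self, doc_id, doc):
        self.seq += 1
        self.docs[doc_id] = doc
        # Keep only the latest change entry for each document.
        self.changes = [(s, d) for s, d in self.changes if d != doc_id]
        self.changes.append((self.seq, doc_id))


def replicate(source, target, checkpoint=0, doc_filter=None):
    """Copy documents changed after `checkpoint`.

    Returns the new checkpoint and the number of documents copied.
    """
    copied = 0
    for seq, doc_id in source.changes:
        if seq <= checkpoint:
            continue  # already examined in an earlier run
        doc = source.docs[doc_id]
        if doc_filter is None or doc_filter(doc):
            target.put(doc_id, dict(doc))
            copied += 1
        checkpoint = seq
    return checkpoint, copied


def only_sales(doc):
    # Stand-in for a replication filter: keep one team's documents.
    return doc["team"] == "sales"


server = Database()
server.put("a", {"team": "sales", "total": 10})
server.put("b", {"team": "support", "total": 7})

laptop = Database()

# First run examines both documents but copies only "a".
checkpoint, copied = replicate(server, laptop, doc_filter=only_sales)

# Second run resumes from the checkpoint and examines only "c".
server.put("c", {"team": "sales", "total": 3})
checkpoint, copied_again = replicate(server, laptop, checkpoint, only_sales)

# The partial replica never received "b", so the totals differ.
laptop_total = sum(d["total"] for d in laptop.docs.values())
server_total = sum(d["total"] for d in server.docs.values())
```

Here `laptop_total` is 13 while `server_total` is 20, which is exactly the kind of mismatch to watch for when queries run against a partial replica.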