How to set up an Elasticsearch cluster for a huge amount of data?
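I have been asked to set up an Elasticsearch cluster for about 100 TB of text data!
I already know how to search and aggregate in Elasticsearch, but I really don't know how to set up a multi-node cluster for that much data.
I mean, how many masters, ZooKeepers, CDs, ...? Or do I need a dedicated server for ActiveMQ? ...
Is there any documentation that explains this?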
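ES is a distributed system, so creating a cluster with 1 node is not much different from creating one with 1000 nodes.
In your case, you can build a large cluster from a few master nodes and a larger number of data nodes.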
The master node is responsible for lightweight cluster-wide actions
such as creating or deleting an index, tracking which nodes are part
of the cluster, and deciding which shards to allocate to which nodes.
Data nodes hold the shards that contain the documents you have
indexed. Data nodes handle data related operations like CRUD, search,
and aggregations. These operations are I/O-, memory-, and
CPU-intensive. It is important to monitor these resources and to add
more data nodes if they are overloaded.
You can use smaller machines for the master nodes (if they do not hold data) and larger machines for the data nodes.
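To get a first feel for how many data nodes 100 TB might need, here is a rough back-of-the-envelope sketch in Python. Every default in it (one replica, 10% index overhead, 8 TB of disk per node, filling disks to 75%) is an assumption chosen for illustration, not a recommendation from the Elasticsearch documentation; measure the real on-disk size of a sample index before trusting any of these numbers.

```python
import math


def estimate_data_nodes(raw_tb, replicas=1, index_overhead=1.1,
                        disk_per_node_tb=8.0, max_fill=0.75):
    """Rough count of data nodes needed to store raw_tb of source data.

    All defaults are illustrative assumptions:
      replicas         - replica copies kept per primary shard
      index_overhead   - on-disk index size relative to the raw data
      disk_per_node_tb - disk capacity of one data node, in TB
      max_fill         - fraction of each disk you are willing to fill
    """
    total_tb = raw_tb * (1 + replicas) * index_overhead
    usable_per_node_tb = disk_per_node_tb * max_fill
    return math.ceil(total_tb / usable_per_node_tb)


if __name__ == "__main__":
    print(estimate_data_nodes(100))
```

This only accounts for disk space. Search and aggregation load, heap size, and shard count can push the real number higher.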
Here is the configuration for a master node.
http.port: 9200
discovery.zen.ping.unicast.hosts: ["127.0.0.1"]
cluster.name: elasticsearch_hobbes # the cluster name must be the same for all ES nodes in the same cluster
node.name: "elasticsearch_001_master" # use 002 for the next master node
node.master: true
node.data: false # this master node will not hold data
path.data: /usr/local/var/elasticsearch/
path.logs: /usr/local/var/log/elasticsearch/
discovery.zen.ping.multicast.enabled: false
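With several master-eligible nodes and zen discovery (the `discovery.zen.*` settings used above), it is usual to also set `discovery.zen.minimum_master_nodes` to a quorum so that a network split cannot elect two masters. That setting is not part of the configuration above; the small Python helper below, whose name is made up for this example, just computes the usual quorum value of (master-eligible nodes / 2) + 1.

```python
def minimum_master_nodes(master_eligible):
    """Quorum of master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, "master-eligible ->", minimum_master_nodes(n))
```

Note that two master-eligible nodes need a quorum of two, so losing either one stops the cluster from electing a master. That is why three dedicated masters is a common choice.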
Here is the configuration for a data node.
cluster.name: elasticsearch_hobbes
node.name: "node2"
node.master: false
node.data: true
http.port: 9201
discovery.zen.ping.multicast.enabled: false
script.engine.groovy.inline.aggs: on
discovery.zen.ping.unicast.hosts: ["127.0.0.1"]
You can then open the KOPF plugin on the master node at http://localhost:9200/_plugin/kopf/#!/cluster to see a screen that shows all three nodes in the cluster.
Note: install the KOPF plugin by following https://github.com/lmenezes/elasticsearch-kopf. Let me know if you run into any problems setting up the cluster.
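If you would rather check from a script than from the KOPF screen, the `_cluster/health` endpoint reports the cluster status and node count. The following is a minimal Python sketch, assuming the master node answers on http://localhost:9200 as in the configuration above; the function names and the expected node count of three are illustrative.

```python
import json
from urllib.request import urlopen


def fetch_cluster_health(base_url="http://localhost:9200", timeout=5):
    """Fetch the JSON body of GET /_cluster/health from one node."""
    with urlopen(base_url + "/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def check_cluster(health, expected_nodes):
    """Return a list of human-readable problems; empty means all is well."""
    problems = []
    nodes = health.get("number_of_nodes")
    if nodes != expected_nodes:
        problems.append(
            "expected %d nodes, cluster reports %s" % (expected_nodes, nodes))
    if health.get("status") == "red":
        problems.append("cluster status is red")
    return problems


if __name__ == "__main__":
    # Requires a running cluster; 3 = two masters plus one data node here.
    for problem in check_cluster(fetch_cluster_health(), expected_nodes=3):
        print(problem)
```

If a node is missing, first check that `cluster.name` matches on every node and that the unicast host list points at a reachable master.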