Elasticsearch: assigning primary and replica nodes for indices
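I have set up an Elasticsearch cluster with 1 master node, 1 client node, and 2 data nodes. The client node and the 2 data nodes run on one machine, and the master node runs on another.
The addresses are as follows: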
192.168.1.3 - master
192.168.1.2:9201 - client
192.168.1.2:9200 - data1
192.168.1.2:9202 - data2
I have data belonging to two indices (movie-ame and movie-eur) and want to store it on the nodes as shown below. I use Logstash to import the data into the data nodes.
movie-ame
primary shards in data1
1 replica in data2
logstash.conf
input {
  file {
    path => "C:\Users\azinneera\Desktop\logstash-5.1.1\bin\data.csv"
    start_position => "beginning"
  }
}
filter {
  csv {
    separator => ","
    columns => ["ID","MovieName","ReleaseYear","Country","Genres"]
  }
}
output {
  elasticsearch {
    action => "index"
    hosts => ["192.168.1.2:9200"]
    index => "movie-ame"
  }
  stdout { codec => rubydebug }
}
movie-eur
primary shards in data2
1 replica in data1
logstash.conf
input {
  file {
    path => "C:\Users\azinneera\Desktop\logstash-5.1.1\bin\movieeur.csv"
    start_position => "beginning"
  }
}
filter {
  csv {
    separator => ","
    columns => ["ID","MovieName","ReleaseYear","Country","Genres"]
  }
}
output {
  elasticsearch {
    action => "index"
    hosts => ["192.168.1.2:9202"]
    index => "movie-eur"
  }
  stdout { codec => rubydebug }
}
However, data1 seems to hold the primary shards of both indices, and the replicas of both indices are on data2.
This is what the cluster state shows.
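There is nothing wrong here. Elasticsearch places the primary and the replica of a shard on different nodes and keeps them in sync on a per-shard basis.
When you query data from Elasticsearch, the request is served by either the primary or one of its replicas, because they are treated as identical copies. Overall, Elasticsearch takes care of the load balancing, so you do not need to worry about it.
If you are really concerned, there are some settings you can adjust.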
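To see for yourself where each copy ended up, you can list the shards and group them by role. The following is only a minimal sketch: it assumes the `_cat/shards` API accepts `format=json` and `h=...` (as it does in 5.x, to my knowledge), that the client node at 192.168.1.2:9201 from the question answers HTTP requests, and that the nodes are named data1 and data2. The function names and the sample rows are made up for illustration; the rows only mirror the situation described in the question.

```python
import json
from collections import defaultdict
from urllib.request import urlopen

# Assumption: the client node from the question serves HTTP on this address.
CAT_SHARDS_URL = (
    "http://192.168.1.2:9201/_cat/shards/movie-*"
    "?format=json&h=index,shard,prirep,state,node"
)

# Hypothetical rows shaped like the JSON returned by _cat/shards.
SAMPLE_ROWS = [
    {"index": "movie-ame", "shard": "0", "prirep": "p", "state": "STARTED", "node": "data1"},
    {"index": "movie-ame", "shard": "0", "prirep": "r", "state": "STARTED", "node": "data2"},
    {"index": "movie-eur", "shard": "0", "prirep": "p", "state": "STARTED", "node": "data1"},
    {"index": "movie-eur", "shard": "0", "prirep": "r", "state": "STARTED", "node": "data2"},
]


def fetch_shard_rows(url=CAT_SHARDS_URL):
    """Fetch the shard listing as a list of dicts (needs a running cluster)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_placement(rows):
    """Map each index to the nodes holding its primary and replica shards."""
    placement = defaultdict(lambda: {"primary": set(), "replica": set()})
    for row in rows:
        if not row.get("node"):
            continue  # unassigned shards have no node
        role = "primary" if row["prirep"] == "p" else "replica"
        placement[row["index"]][role].add(row["node"])
    return dict(placement)


def colocated_copies(rows):
    """Return (index, shard) pairs whose replica sits on the primary's node."""
    primary_node = {}
    for row in rows:
        if row["prirep"] == "p" and row.get("node"):
            primary_node[(row["index"], row["shard"])] = row["node"]
    clashes = set()
    for row in rows:
        key = (row["index"], row["shard"])
        if row["prirep"] == "r" and row.get("node") == primary_node.get(key):
            clashes.add(key)
    return sorted(clashes)


if __name__ == "__main__":
    rows = SAMPLE_ROWS  # swap in fetch_shard_rows() against a live cluster
    for index, roles in sorted(summarize_placement(rows).items()):
        print(index, "primary:", sorted(roles["primary"]),
              "replica:", sorted(roles["replica"]))
    print("copies sharing a node:", colocated_copies(rows))
```

With the sample rows, both indices report their primary on data1 and their replica on data2, and the list of copies sharing a node is empty. The second result is the part that matters: each replica lives on a different node from its primary, whichever node happens to hold the primary.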