Why is swapping not a good idea in ZooKeeper and Kafka?

Question

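I have read the advice "do not use swap" for ZooKeeper and Kafka. I know that Kafka relies on the page cache to keep parts of its sequential logs cached in memory, even after they have been written to disk.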

But I cannot understand how swapping harms ZooKeeper and Kafka.

Answer 1

Swapping can cause performance and stability problems; in your case, you do not want the Linux kernel to "mistakenly/accidentally" swap out your Kafka or ZooKeeper processes.
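If you want to check whether the kernel has already swapped out part of a running Kafka or ZooKeeper process, one option is to read the VmSwap field from /proc/<pid>/status on Linux. The following is only a minimal sketch under that assumption; the function names parse_vm_swap_kb and swapped_kb are made up for this illustration and are not part of Kafka or ZooKeeper.

```python
from pathlib import Path
from typing import Optional


def parse_vm_swap_kb(status_text: str) -> Optional[int]:
    """Extract the VmSwap value (in kB) from the text of /proc/<pid>/status.

    Returns None if the field is missing (for example on older kernels).
    """
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            # The line looks like "VmSwap:      128 kB"
            return int(line.split()[1])
    return None


def swapped_kb(pid: int) -> Optional[int]:
    """Return how many kB of the given process are swapped out, if known.

    Hypothetical helper: returns None when /proc/<pid>/status does not exist
    (non-Linux system or no such process).
    """
    status = Path(f"/proc/{pid}/status")
    if not status.exists():
        return None
    return parse_vm_swap_kb(status.read_text())
```

For example, swapped_kb(12345) would report the swapped-out size for process 12345 (a placeholder PID); a value that keeps growing for a broker's JVM suggests the kernel is swapping out heap pages.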

Also, swapping may be particularly bad for JVM processes such as Kafka and ZooKeeper. To quote:

[The] JVM generally won't do a full GC cycle until it has run out of its allowed heap, so most of your heap is likely occupied by not-yet-collected garbage. Since these pages aren't being touched (because they are garbage and thus unreferenced), the OS happily swaps them out. When GC finally runs, you have a ridiculous swap storm, pulling in all these pages only to then discover that they are in fact filled with garbage and should be discarded; this can easily make your GC cycle take many minutes!

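It is therefore recommended to minimize swapping by setting vm.swappiness to 0, although on some operating systems (such as RHEL 6.5) the value should be 1 instead, because the behavior of the value 0 changed on those systems. Note that some swapping may still occur.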
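To verify what a machine is currently using, you can read the vm.swappiness value from /proc/sys/vm/swappiness. Here is a small sketch, assuming a Linux host; the helper names parse_swappiness and is_low_swappiness and the default threshold of 1 are choices made for this example only.

```python
from pathlib import Path

# Standard procfs location of the vm.swappiness setting on Linux.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def parse_swappiness(text: str) -> int:
    """Parse the contents of /proc/sys/vm/swappiness into an integer."""
    return int(text.strip())


def is_low_swappiness(value: int, threshold: int = 1) -> bool:
    """Return True if the value is between 0 and the threshold, inclusive.

    The threshold of 1 is an assumption for this example; it accepts
    both 0 and 1.
    """
    return 0 <= value <= threshold


if __name__ == "__main__":
    if SWAPPINESS_PATH.exists():
        current = parse_swappiness(SWAPPINESS_PATH.read_text())
        verdict = "ok" if is_low_swappiness(current) else "higher than recommended"
        print(f"vm.swappiness = {current} ({verdict})")
    else:
        print("vm.swappiness is not available on this system")
```

The value itself is usually changed with sysctl, for example by running sysctl -w vm.swappiness=1 as root, and made persistent by adding vm.swappiness=1 to /etc/sysctl.conf.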

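The following links may shed further light on your question. They explain why swapping should be disabled for Hadoop and Elasticsearch, respectively, and the same reasons apply to Kafka and ZooKeeper: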

Hadoop: Two memory-related issues on the Apache Hadoop cluster (memory swapping and the OOM killer), by Adam Kawa; at the time of writing, he worked on the Hadoop infrastructure team at Spotify.
Elasticsearch: Why to disable swapping for machines running Elasticsearch.

apache-kafka

apache-zookeeper