What is the difference between prefetch and max_batch_size when consuming events from eventhub using the python eventhub sdk?

Question

I am using the azure-eventhub Python library provided by Microsoft to connect to an Event Hub and receive events asynchronously.

Below is the code I use to connect:

client = EventHubClientAsync(ADDRESS, debug=False, username=USER, password=KEY, http_proxy=self.proxy_settings)
receiver = client.add_async_receiver(CONSUMER_GROUP, str(i), OFFSET, prefetch=self.azuremonitor_config.PREFETCH_SIZE)

batch = await receiver.receive(max_batch_size=azuremonitor.azuremonitor_config.MAX_BATCH_SIZE, timeout=azuremonitor.azuremonitor_config.TIMEOUT)

I am not sure what the prefetch parameter of add_async_receiver is for, or how it differs from the max_batch_size parameter of the receive function.

Answer 1

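According to the documentation:

max_batch_size (int)

Receive a batch of events. The batch size will be up to the maximum specified, but the call will return as soon as the service returns no new events. If combined with a timeout and no events are retrieved before the time, the result will be empty. If no batch size is supplied, the prefetch size will be the maximum.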

The method signature is:

receive(max_batch_size=None, timeout=None)
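The rules quoted above can be sketched in plain Python. This is only a toy model, not the SDK's implementation: the `receive` function below and its `events` and `prefetch` arguments are hypothetical, and a standard-library `queue.Queue` stands in for the events the service has delivered so far.

```python
import queue
import time


def receive(events, prefetch, max_batch_size=None, timeout=None):
    """Toy model of the documented semantics (hypothetical helper, not the SDK)."""
    # If no batch size is supplied, the prefetch size is the maximum.
    limit = max_batch_size if max_batch_size is not None else prefetch
    deadline = None if timeout is None else time.monotonic() + timeout
    batch = []
    while len(batch) < limit:
        try:
            if batch:
                # We already hold events: return as soon as nothing new is available.
                batch.append(events.get_nowait())
            else:
                # Nothing yet: wait for the first event until the timeout expires.
                remaining = None if deadline is None else max(0.0, deadline - time.monotonic())
                batch.append(events.get(timeout=remaining))
        except queue.Empty:
            break
    return batch


delivered = queue.Queue()
for n in range(5):
    delivered.put(n)

capped_by_prefetch = receive(delivered, prefetch=3)                 # no batch size given
short_batch = receive(delivered, prefetch=3, max_batch_size=10)     # fewer events than asked for
empty_batch = receive(delivered, prefetch=3, timeout=0.05)          # times out with no events
```

In this model the first call is capped at the prefetch size, the second returns a short batch because the source runs dry, and the third returns an empty list after the timeout.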

You can read more about it here:

https://docs.microsoft.com/en-us/python/api/azure-eventhub/azure.eventhub.receiver.receiver?view=azure-python

Hope this helps.

Answer 2

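prefetch and max_batch_size serve different purposes.

prefetch is for performance tuning. The client tries to fetch up to the prefetch count of messages in a single request and caches them locally.
max_batch_size controls the maximum number of messages a single receive call returns.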

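A simple example:

Suppose prefetch is 300 and max_batch_size is 10.
When receive is called, the client tells the service to send up to 300 messages in one call. Assuming the Event Hub holds more than 300 messages, the client receives 300 messages.
Because receive asked for only 10 messages, the method returns 10 and the remaining 290 stay in the local cache.
On subsequent receive calls, the SDK checks the local cache first; if the cache holds enough messages, it returns the cached messages without making a request to the service.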
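The walkthrough above can be reproduced with a small simulation. This is a minimal sketch under stated assumptions: `FakeService` and `PrefetchingReceiver` are hypothetical classes written for illustration, and the real SDK's transport works differently from the discrete "requests" counted here.

```python
from collections import deque


class FakeService:
    """Hypothetical stand-in for an Event Hub partition; counts round trips."""

    def __init__(self, total_events):
        self._events = deque(range(total_events))
        self.requests = 0

    def fetch(self, count):
        # One simulated round trip returning up to `count` events.
        self.requests += 1
        n = min(count, len(self._events))
        return [self._events.popleft() for _ in range(n)]


class PrefetchingReceiver:
    """Hypothetical receiver that keeps a local cache of up to `prefetch` events."""

    def __init__(self, service, prefetch):
        self._service = service
        self._prefetch = prefetch
        self._cache = deque()

    @property
    def cached(self):
        return len(self._cache)

    def receive(self, max_batch_size):
        # Go to the service only when the cache cannot satisfy this call.
        if len(self._cache) < max_batch_size:
            self._cache.extend(self._service.fetch(self._prefetch - len(self._cache)))
        n = min(max_batch_size, len(self._cache))
        return [self._cache.popleft() for _ in range(n)]


service = FakeService(total_events=1000)
receiver = PrefetchingReceiver(service, prefetch=300)

first = receiver.receive(max_batch_size=10)
print(len(first), receiver.cached, service.requests)   # 10 290 1

second = receiver.receive(max_batch_size=10)
print(len(second), receiver.cached, service.requests)  # 10 280 1
```

The second call is served entirely from the cache, so the request counter stays at 1: prefetch decides how much is pulled per round trip, while max_batch_size only decides how much each receive call hands back.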

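It is worth noting that azure-eventhub v5 reached general availability in January 2020; the latest version at the time of writing is v5.2.0.

It is available on PyPI: https://pypi.org/project/azure-eventhub/

Please follow the migration guide from v1 to v5 to migrate your program.

We also provide samples to help you get started.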
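For orientation, here is a rough sketch of how the same two settings might be passed in v5, assuming a v5 release whose `EventHubConsumerClient` provides `receive_batch`. The function name `run_consumer`, the values 10 and 300, and the connection string and hub name arguments are placeholders, not taken from the question; check the migration guide and samples for the authoritative usage.

```python
def run_consumer(conn_str, eventhub_name, consumer_group="$Default"):
    """Sketch only: wires prefetch and max_batch_size into a v5 consumer."""
    # Imported lazily so this module loads even without azure-eventhub installed.
    from azure.eventhub import EventHubConsumerClient

    def on_event_batch(partition_context, events):
        # Each callback receives at most max_batch_size events.
        print(partition_context.partition_id, len(events))

    client = EventHubConsumerClient.from_connection_string(
        conn_str,
        consumer_group=consumer_group,
        eventhub_name=eventhub_name,
    )
    with client:
        # Blocks and keeps invoking the callback until interrupted.
        client.receive_batch(
            on_event_batch=on_event_batch,
            max_batch_size=10,       # upper bound per callback
            prefetch=300,            # how many events to cache locally
            starting_position="-1",  # "-1" means from the start of the partition
        )
```

Calling `run_consumer("<connection string>", "<event hub name>")` would start receiving; both arguments must come from your own Event Hubs namespace.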

python

azure-eventhub