Azure Stream Analytics - CosmosDB Output contains multiple rows and just one row per partition key

I have an Azure Stream Analytics job with IoT Hub as the input and DocumentDB as the output, and I frequently get the following warning:

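Warning: CosmosDB Output contains multiple rows and just one row per partition key. If the output latency is higher than expected, consider choosing a partition key that contains at least several hundred records per partition key. For best performance, consider choosing the same partition key column for input and output.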

I am using a partition key, and IoT Hub receives a lot of data per second for the same partition key.

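First, this is only a warning; it will not cause your Stream Analytics job to fail.

It suggests that you consider changing your Cosmos DB design.

A well-chosen partition key improves performance when you work with Cosmos DB.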

From this article:

It is important to choose a partition key property that has a number of distinct values, and lets you distribute your workload evenly across these values. As a natural artifact of partitioning, requests involving the same partition key are limited by the maximum throughput of a single partition. Additionally, the storage size for documents belonging to the same partition key is limited to 10GB. An ideal partition key is one that appears frequently as a filter in your queries and has sufficient cardinality to ensure your solution is scalable.

A partition key is also the boundary for transactions in DocumentDB's stored procedures and triggers. You should choose the partition key so that documents that occur together in transactions share the same partition key value.
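To make the criteria in the quote concrete, here is a minimal Python sketch that compares two candidate partition keys over a sample of documents. The field names (deviceId, messageId), the sample data, and the helper partition_key_stats are hypothetical and only for illustration; this is not how Stream Analytics itself evaluates the output. A key that is unique per document yields one row per partition key value, which appears to be the pattern the warning describes.

```python
from collections import Counter

def partition_key_stats(docs, key):
    """Summarise how documents spread over the values of a candidate partition key."""
    if not docs:
        raise ValueError("need at least one document")
    # Count how many documents share each value of the candidate key.
    counts = Counter(doc[key] for doc in docs)
    total = sum(counts.values())
    return {
        "distinct_values": len(counts),                 # cardinality of the key
        "avg_docs_per_value": total / len(counts),      # rows per partition key value
        "largest_share": max(counts.values()) / total,  # skew: 1.0 means one hot key
    }

# Hypothetical telemetry sample: 4 devices, 3 readings each.
docs = [
    {"deviceId": f"dev-{i % 4}", "messageId": f"msg-{i}", "temp": 20 + i}
    for i in range(12)
]

by_device = partition_key_stats(docs, "deviceId")
by_message = partition_key_stats(docs, "messageId")
print(by_device)
print(by_message)
```

In this sample, deviceId groups several documents under each value and spreads them evenly, while messageId puts every document under its own value. Running the same check on a sample of your real output documents may help show which property is actually being used as the partition key value.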

For guidance on how to design your partitioning, you can refer to this article.