Removing duplicates in Power Query ignores that I've sorted the data (so it removes the wrong duplicate)

Question

I have this (sample) data in Excel, in a table named Table1:

| PC Item  | Priority |
|----------|----------|
| AN123169 | P3       |
| AN123169 | P1       |
| AN123169 | P1       |
| AN123169 | P1       |
| AN123169 | P3       |

After sorting by Priority and removing duplicates based on PC Item, I expect the P1 record to remain. Instead, I get P3.

The help file states:

Removes all rows from a Power Query table, in the Query Editor, where the values in the selected columns duplicate earlier values.

My understanding is that this keeps the first record and removes the subsequent ones?

The M code I am using is:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"PC Item", type text}, {"Priority", type text}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"PC Item", Order.Ascending}, {"Priority", Order.Ascending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"PC Item"})
in
    #"Removed Duplicates"

After #"Sorted Rows" I have this table, which is sorted correctly:

| PC Item  | Priority |
|----------|----------|
| AN123169 | P1       |
| AN123169 | P1       |
| AN123169 | P1       |
| AN123169 | P3       |
| AN123169 | P3       |

After #"Removed Duplicates" I have:

| PC Item  | Priority |
|----------|----------|
| AN123169 | P3       |

Surely I should have the P1 record?

My first thought was to reverse the sort order, but with this raw data I get P3 returned as the unique value:

| PC Item | Priority |
|---------|----------|
| AN310C4 | P3       |
| AN310C4 | P1       |
| AN310C4 | P1       |

and with this raw data I get P1 returned:

| PC Item | Priority |
|---------|----------|
| AN310C4 | P1       |
| AN310C4 | P1       |
| AN310C4 | P3       |

So I guess my question is: how does the sort work, given that the subsequent M code appears to ignore the fact that I've sorted the data?

Answer 1

Use Table.Buffer to cache the intermediate (sorted) query result, so that the sort is not folded away when the duplicates are removed.

Updated M:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"PC Item", type text}, {"Priority", type text}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"PC Item", Order.Ascending}, {"Priority", Order.Ascending}}),
    #"Buffer Table" = Table.Buffer(#"Sorted Rows"),
    #"Removed Duplicates" = Table.Distinct(#"Buffer Table", {"PC Item"})
in
    #"Removed Duplicates"

Result: the P1 record is now the one that is kept.

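I found the (rather sparse) documentation here, and an informative video here.

I found other ways to "break" query folding, such as removing error rows from the table or adding an index column (and then removing it). However, the buffer approach seems neat to me.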
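For comparison, the index-column alternative mentioned above might look roughly like the sketch below. It is only an illustration: the inline `#table` stands in for Table1 so the query can be pasted into a blank query, and the step names `#"Added Index"` and `#"Removed Index"` and the column name "Index" are my own choices, not from the original query.

```m
let
    // Inline copy of the question's sample rows (assumed stand-in for Table1)
    Source = #table(
        type table [#"PC Item" = text, Priority = text],
        {
            {"AN123169", "P3"},
            {"AN123169", "P1"},
            {"AN123169", "P1"},
            {"AN123169", "P1"},
            {"AN123169", "P3"}
        }
    ),
    // Same sort as in the question
    #"Sorted Rows" = Table.Sort(Source, {{"PC Item", Order.Ascending}, {"Priority", Order.Ascending}}),
    // Temporary index column added after the sort (used here instead of Table.Buffer)
    #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1),
    // Remove duplicates on PC Item only
    #"Removed Duplicates" = Table.Distinct(#"Added Index", {"PC Item"}),
    // Drop the helper column again
    #"Removed Index" = Table.RemoveColumns(#"Removed Duplicates", {"Index"})
in
    #"Removed Index"
```

If this behaves the way the buffer version does, the P1 row should be the one that survives, but I would still check the output against your own data before relying on it.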
