Power Query - Filter by specific individual Key per row

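Hi experts! I am using Power Query to reduce the amount of data that is later loaded into the model. I have a large transactions table with multiple rows per order:

The objective is to get the maximum (last) Value for each order, based on this filter:

For example, we can see that the max value for Type 1 = A1 is based on Key 1 = 50 and Key 2 = 20, regardless of Type 2. And so on.

Is there any chance of doing this in Power Query? The result would look like this: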

This is more complicated because of the wildcards.

If you set up a rules table as shown here (and name the query Rules),

then this code should work:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // Flag each transaction row (r) that matches at least one row of Rules.
    // A rule value of "*" matches anything in that column.
    #"Added Custom" = Table.AddColumn(Source, "Custom", each
        let r = _ in Table.MatchesAnyRows(
            Rules,
            each List.Contains({r[Type 1], "*"}, [Type 1])
            and List.Contains({r[Type 2], "*"}, [Type 2])
            and List.Contains({r[Key 1], "*"}, [Key 1])
            and List.Contains({r[Key 2], "*"}, [Key 2])
    )),
    // Keep only the matching rows, then take the max Value per Order
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = true)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"Order"}, {{"Value", each List.Max([Value]), type number}})
in
    #"Grouped Rows"

This is the intermediate step, before the grouping and the max are applied:
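
To try the wildcard matching without a workbook, here is a minimal self-contained sketch. Both tables are hypothetical inline stand-ins: only the first rule (Type 1 = A1, any Type 2, Key 1 = 50, Key 2 = 20) comes from the question, and the second rule and all transaction rows are made up for illustration. It also folds the flag column and the filter into a single Table.SelectRows, which should behave the same as the two steps above.

```powerquery
let
    // Hypothetical rules table; "*" is the wildcard.
    // Only the first row is taken from the question.
    Rules = #table(
        {"Type 1", "Type 2", "Key 1", "Key 2"},
        {
            {"A1", "*", 50, 20},
            {"A2", "B1", "*", 10}    // made-up rule
        }
    ),
    // Hypothetical transactions
    Source = #table(
        {"Order", "Type 1", "Type 2", "Key 1", "Key 2", "Value"},
        {
            {1, "A1", "B1", 50, 20, 100},
            {1, "A1", "B2", 50, 20, 300},
            {1, "A1", "B1", 40, 20, 900},   // Key 1 does not match any rule
            {2, "A2", "B1", 70, 10, 250},
            {2, "A2", "B2", 70, 10, 800}    // Type 2 does not match any rule
        }
    ),
    // Keep a transaction row r if any rule matches it on all four columns
    Matched = Table.SelectRows(Source, (r) =>
        Table.MatchesAnyRows(
            Rules,
            each List.Contains({r[Type 1], "*"}, [Type 1])
            and List.Contains({r[Type 2], "*"}, [Type 2])
            and List.Contains({r[Key 1], "*"}, [Key 1])
            and List.Contains({r[Key 2], "*"}, [Key 2])
        )),
    // Max Value per Order among the matching rows
    Result = Table.Group(Matched, {"Order"}, {{"Value", each List.Max([Value]), type number}})
in
    Result
```

With this sample data the result has two rows: Order 1 with Value 300 and Order 2 with Value 250. The rows with Value 900 and 800 are dropped because no rule matches them.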

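Assuming a discrete Type 1 takes precedence over a "wildcard" Type 1, you can handle this with Table.FuzzyNestedJoin, although depending on your actual data you may need to adjust the threshold from its default of 0.8.

Also, depending on your actual data, you may want to change some of the data types (integer vs. decimal, etc.).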

Then

  • Group by Order
  • Create a filter using comparisons of the Type and Key columns
  • Return the max of the Value column

let
    Source = Excel.CurrentWorkbook(){[Name="Filter"]}[Content],
    filter = Table.TransformColumnTypes(Source,{
        {"Type 1", type text}, {"Type 2", type text}, {"Key 1", Int64.Type}, {"Key 2", Int64.Type}}),
    
    Source2 = Excel.CurrentWorkbook(){[Name="Transactions"]}[Content],
    transactions = Table.TransformColumnTypes(Source2,{
        {"Order", Int64.Type},
        {"Type 1", type text},
        {"Type 2", type text},
        {"Key 1", Int64.Type},
        {"Key 2", Int64.Type},
        {"Value", Int64.Type}
    }),

    // Fuzzy-match each transaction to the filter rows on Type 1 (default threshold 0.8)
    join = Table.FuzzyNestedJoin(transactions,"Type 1", filter,"Type 1","joined",JoinKind.LeftOuter),
    #"Expanded joined" = Table.ExpandTableColumn(join, "joined", 
        {"Type 1", "Type 2", "Key 1", "Key 2"}, {"joined.Type 1", "joined.Type 2", "joined.Key 1", "joined.Key 2"}),

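    // Per Order: keep rows where Type 2 matches (a null filter Type 2 matches anything)
    // and both keys match, then return the max Value
    #"Grouped Rows" = Table.Group(#"Expanded joined", {"Order"}, {
        {"Value", (t)=> List.Max(Table.SelectRows(t, each (([joined.Type 2]=null) or ([Type 2]=[joined.Type 2])) and 
                [Key 1]=[joined.Key 1] and [Key 2]=[joined.Key 2])[Value]), Int64.Type}
        })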
in
    #"Grouped Rows"
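
If the default threshold does not suit your data, the join step accepts an options record as its last argument. The sketch below only shows where the Threshold field goes. The two inline tables are hypothetical stand-ins for the typed filter and transactions queries, and 0.9 is an arbitrary illustrative value, not a recommendation.

```powerquery
let
    // Hypothetical stand-ins for the typed tables built in the query above
    filter = #table(
        type table [#"Type 1" = text, #"Type 2" = text, #"Key 1" = Int64.Type, #"Key 2" = Int64.Type],
        {{"A1", null, 50, 20}}
    ),
    transactions = #table(
        type table [Order = Int64.Type, #"Type 1" = text, #"Type 2" = text,
                    #"Key 1" = Int64.Type, #"Key 2" = Int64.Type, Value = Int64.Type],
        {{1, "A1", "B1", 50, 20, 100}}
    ),
    // Same join as above, with an explicit similarity threshold (between 0 and 1)
    join = Table.FuzzyNestedJoin(
        transactions, "Type 1",
        filter, "Type 1",
        "joined",
        JoinKind.LeftOuter,
        [Threshold = 0.9]    // illustrative value; tune against your own data
    )
in
    join
```

A higher threshold requires closer matches on Type 1, and a lower one allows looser matches, so it is worth checking the expanded join output before grouping.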