Power Query - Filter by specific individual Key per row
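Hi experts!
I am using Power Query to reduce the amount of data that is later loaded into the model.
I have a large transactions table with multiple rows per order:
The objective is to get the maximum of the Value (last) column for each order, according to this filter:
For example, for Type 1 = A1 the maximum value is based on Key 1 = 50 and Key 2 = 20, regardless of Type 2, and so on.
Is it possible to do this in Power Query?
The result would look like this: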
This is more complicated because of the wildcards.
If you set up the rules table as shown here (and name the query Rules),
then this code should work:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // Flag each transaction row (r) that matches at least one row of the Rules query;
    // a "*" in a rule column matches any value in that column
    #"Added Custom" = Table.AddColumn(Source, "Custom", each
        let r = _ in Table.MatchesAnyRows(
            Rules,
            each List.Contains({r[Type 1], "*"}, [Type 1])
                and List.Contains({r[Type 2], "*"}, [Type 2])
                and List.Contains({r[Key 1], "*"}, [Key 1])
                and List.Contains({r[Key 2], "*"}, [Key 2])
        )),
    // Keep only the matching rows, then take the maximum Value per Order
    #"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Custom] = true)),
    #"Grouped Rows" = Table.Group(#"Filtered Rows", {"Order"}, {{"Value", each List.Max([Value]), type number}})
in
    #"Grouped Rows"
This is the intermediate step created before the grouping and maximum:
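To try the wildcard matching without a workbook, here is a minimal sketch that swaps the Excel tables for inline ones. The rows in Transactions and Rules are hypothetical sample data (only the rule Type 1 = A1, Key 1 = 50, Key 2 = 20 with any Type 2 comes from the question), and the step names Flagged, Filtered and Result, as well as the IsMatch column, are made up for illustration.

```powerquery
let
    // Hypothetical sample data; the tables from the question are not reproduced here
    Transactions = #table(
        {"Order", "Type 1", "Type 2", "Key 1", "Key 2", "Value"},
        {
            {1, "A1", "X", 50, 20, 100},
            {1, "A1", "Y", 50, 20, 300},
            {1, "A1", "X", 40, 20, 900},
            {2, "B1", "X", 10, 10, 70}
        }),
    // One rule: Type 1 = A1, any Type 2, Key 1 = 50, Key 2 = 20
    Rules = #table(
        {"Type 1", "Type 2", "Key 1", "Key 2"},
        {
            {"A1", "*", 50, 20}
        }),
    // A rule value matches when it equals the transaction value or is "*"
    Flagged = Table.AddColumn(Transactions, "IsMatch", (r) =>
        Table.MatchesAnyRows(Rules, (rule) =>
            List.Contains({r[Type 1], "*"}, rule[Type 1])
                and List.Contains({r[Type 2], "*"}, rule[Type 2])
                and List.Contains({r[Key 1], "*"}, rule[Key 1])
                and List.Contains({r[Key 2], "*"}, rule[Key 2])),
        type logical),
    Filtered = Table.SelectRows(Flagged, each [IsMatch] = true),
    // Maximum Value per Order among the matching rows
    Result = Table.Group(Filtered, {"Order"}, {{"Value", each List.Max([Value]), type number}})
in
    Result
```

With this sample data only the first two rows satisfy the rule, so the result is a single row with Order = 1 and Value = 300. Order 2 has no matching rows and therefore does not appear in the output.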
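Assuming that a discrete Type 1 takes precedence over a "wildcard" Type 1, you can handle this with Table.FuzzyNestedJoin, although depending on your actual data you may need to adjust the threshold from its default of 0.8.
Also, depending on your actual data, you may want to change some of the data types (whole number vs. decimal, etc.).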
Then:
- Group by Order
- Create a filter by comparing the Type and Key columns
- Return the maximum of the Value column
let
    // Filter (rules) table
    Source = Excel.CurrentWorkbook(){[Name="Filter"]}[Content],
    filter = Table.TransformColumnTypes(Source,{
        {"Type 1", type text}, {"Type 2", type text}, {"Key 1", Int64.Type}, {"Key 2", Int64.Type}}),

    // Transactions table
    Source2 = Excel.CurrentWorkbook(){[Name="Transactions"]}[Content],
    transactions = Table.TransformColumnTypes(Source2,{
        {"Order", Int64.Type},
        {"Type 1", type text},
        {"Type 2", type text},
        {"Key 1", Int64.Type},
        {"Key 2", Int64.Type},
        {"Value", Int64.Type}
        }),

    // Fuzzy-match each transaction to the filter rows on Type 1
    join = Table.FuzzyNestedJoin(transactions,"Type 1", filter,"Type 1","joined",JoinKind.LeftOuter),
    #"Expanded joined" = Table.ExpandTableColumn(join, "joined",
        {"Type 1", "Type 2", "Key 1", "Key 2"}, {"joined.Type 1", "joined.Type 2", "joined.Key 1", "joined.Key 2"}),

    // Per Order: keep rows whose keys match and whose Type 2 matches
    // (a null Type 2 in the filter matches anything), then return the maximum Value
    #"Grouped Rows" = Table.Group(#"Expanded joined", {"Order"}, {
        {"Value", (t)=> List.Max(Table.SelectRows(t, each (([joined.Type 2]=null) or ([Type 2]=[joined.Type 2])) and
            [Key 1]=[joined.Key 1] and [Key 2]=[joined.Key 2])[Value]), Int64.Type}
        })
in
    #"Grouped Rows"
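The answer above mentions adjusting the threshold away from its default of 0.8. As a rough sketch, Table.FuzzyNestedJoin accepts an options record as its last argument, and the Threshold field (a number between 0.00 and 1.00) sets the similarity score required for a match. The inline tables and the value 0.9 below are hypothetical, chosen only to show where the option goes; the right threshold depends on your actual data.

```powerquery
let
    // Hypothetical sample data standing in for the Transactions and Filter tables
    transactions = #table(
        type table [Order = Int64.Type, #"Type 1" = text],
        {{1, "A1"}, {2, "B7"}}),
    filter = #table(
        type table [#"Type 1" = text, #"Key 1" = Int64.Type],
        {{"A1", 50}, {"B*", 10}}),
    // Same join as above, with an options record added as the seventh argument.
    // 0.9 is an arbitrary illustrative value, stricter than the default 0.80.
    join = Table.FuzzyNestedJoin(
        transactions, "Type 1",
        filter, "Type 1",
        "joined", JoinKind.LeftOuter,
        [Threshold = 0.9])
in
    join
```

A higher threshold makes the join stricter, so fewer near-matches are paired. It is worth inspecting the expanded "joined" column on real data before relying on a particular value.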