Power BI, DAX: concatenate strings, split, and keep each substring only once
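I am trying to do the following:
- I have a column of strings; each string can contain several substrings separated by the delimiter ":"
- I need to concatenate the strings in the column (I apply a filter here to keep only the rows of interest)
- then split the result on the delimiter ":"
- and, if a substring is repeated, keep it only once.
Example: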
ColumnHeader
AA:BB:CC
BB:DD
DD:AA:EE
EE:AA:DD:BB
BB:EE
...
The expected result would be a single string:
"AA:BB:CC:DD:EE"
How would you do this in DAX to populate a new column?
I was hoping to find for/while loops in DAX, as in Python, but had no luck.
I tried this:
List =
VAR SIn = ""
VAR SOut = ""
VAR Cursor = 0
VAR SList =
    CONCATENATEX(
        FILTER(ATable, ATable[Name] = CTable[Name]),
        [ColumnHeader],
        ":")
VAR pos1 = FIND(":", SList, Cursor, len(SList))
VAR pos2 = FIND(":", SList, pos1, len(SList))
VAR elem = TRIM(MID(SList, pos1+1, pos2-pos1))
// following is not good but is what I would like to do:
VAR SOut = CONCATENATE(SOut, elem)
VAR SList = MID(SList, pos2, len(SList)-pos2)
VAR Cursor = pos2
// I need to loop ... but how? ... since no for/while loops are possible?
Thanks for your help.
=====================================
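Thanks to the answers below, I managed to solve the problem.
I will still give a larger data set so the overall problem is easier to understand:
I have 2 tables: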
TABLE_BY_ELEMENT
KEY GROUP LIST KEY_DATA
1 G1 AA:BB:FF 11
2 G1 CC:AA 22
3 G1 FF:DD:AA 33
4 G1 CC:DD:AA 44
5 G2 CC:FF:GG 55
6 G2 BB:AA 66
TABLE_BY_GROUP
GROUP GROUP_DATA
G1 1111
G2 2222
I would like to see the data like this:
RESULT_BY_GROUP
GROUP GROUP_DATA NewList
G1 1111 AA:BB:FF:CC:DD
G2 2222 CC:FF:GG:BB:AA
And also:
RESULT_ELEMENT
KEY LIST KEY_DATA
1 AA:BB:FF 11
2 CC:AA 22
3 FF:DD:AA 33
4 CC:DD:AA 44
5 CC:FF:GG 55
6 BB:AA 66
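I hope this makes it easier to understand.
This is not something DAX is well suited to. If you need this as a dynamic measure in DAX, you will probably want to reshape your data into a more usable form. For example,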
ID ColumnHeader
1 AA
1 BB
1 CC
2 BB
2 DD
3 DD
3 AA
3 EE
...
You can do this split in the Query Editor with the Split Column > By Delimiter tool: choose to split on the colon and expand into rows.
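For reference, the same split-into-rows step can also be written by hand in M. This is only a minimal sketch under assumptions: the inline sample table and the step names (SplitToLists, ExpandedRows) are made up for illustration, and the code the Split Column tool generates will look somewhat different.

```powerquery
let
    // Hypothetical sample table mirroring the question's ColumnHeader column
    Source = Table.FromRecords({
        [ID = 1, ColumnHeader = "AA:BB:CC"],
        [ID = 2, ColumnHeader = "BB:DD"],
        [ID = 3, ColumnHeader = "DD:AA:EE"]
    }),
    // Turn each delimited string into a list of substrings
    SplitToLists = Table.TransformColumns(
        Source,
        {{"ColumnHeader", each Text.Split(_, ":")}}
    ),
    // Expand each list so that every substring gets its own row
    ExpandedRows = Table.ExpandListColumn(SplitToLists, "ColumnHeader")
in
    ExpandedRows
```

Each ID then appears once per substring, which is the long format shown above.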
Once the data is in this more usable format, you can use it in DAX like this:
List = CONCATENATEX( VALUES('Table'[ColumnHeader]), 'Table'[ColumnHeader], ":" )
Borrowing the logic from here, it is possible to do this entirely in DAX, but I don't recommend this approach.
List =
VAR LongString =
    CONCATENATEX ( VALUES ( 'Table1'[ColumnHeader] ), Table1[ColumnHeader], ":" )
VAR StringToPath =
    SUBSTITUTE ( LongString, ":", "|" )
VAR PathToTable =
    ADDCOLUMNS (
        GENERATESERIES ( 1, LEN ( StringToPath ) ),
        "Item", PATHITEM ( StringToPath, [Value] )
    )
VAR GroupItems =
    FILTER (
        SUMMARIZE ( PathToTable, [Item] ),
        NOT ISBLANK ( [Item] )
    )
RETURN
    CONCATENATEX ( GroupItems, [Item], ":" )
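Start from a table with a single ColumnHeader column, as in the question's example.
Now try the following code in the Advanced Editor of the Power Query Editor: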
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcnS0cnKycnZWitWJVgKyXFzALBcXK6CMqyuY4+oK4gCFnJxgykAysQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ColumnHeader = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ColumnHeader", type text}}),
    //--NEW STEPS START FROM HERE
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1, Int64.Type),
    #"Reordered Columns" = Table.ReorderColumns(#"Added Index",{"Index", "ColumnHeader"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Reordered Columns", "ColumnHeader", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), {"ColumnHeader.1", "ColumnHeader.2", "ColumnHeader.3", "ColumnHeader.4"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"ColumnHeader.1", type text}, {"ColumnHeader.2", type text}, {"ColumnHeader.3", type text}, {"ColumnHeader.4", type text}}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type1", {"Index"}, "Attribute", "Value"),
    #"Removed Columns" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Attribute", "Index"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns"),
    #"Sorted Rows" = Table.Sort(#"Removed Duplicates",{{"Value", Order.Ascending}}),
    #"Added Index1" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1, Int64.Type),
    #"Reordered Columns1" = Table.ReorderColumns(#"Added Index1",{"Index", "Value"}),
    #"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Reordered Columns1", {{"Index", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Reordered Columns1", {{"Index", type text}}, "en-US")[Index]), "Index", "Value", List.Max),
    #"Merged Columns" = Table.CombineColumns(#"Pivoted Column",{"1", "2", "3", "4", "5"},Combiner.CombineTextByDelimiter(":", QuoteStyle.None),"Merged")
in
    #"Merged Columns"
This gives the final output, in a single Merged column.
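The query above hard-codes four split columns and five pivoted columns, so it has to be edited when the data changes. As a hedged alternative sketch (not the approach used in the query above), the same string can be built from plain lists with Text.Split, List.Distinct and Text.Combine. The inline sample rows and the step names are assumptions for illustration.

```powerquery
let
    // Hypothetical inline copy of the question's sample column
    Source = Table.FromRecords({
        [ColumnHeader = "AA:BB:CC"],
        [ColumnHeader = "BB:DD"],
        [ColumnHeader = "DD:AA:EE"],
        [ColumnHeader = "EE:AA:DD:BB"],
        [ColumnHeader = "BB:EE"]
    }),
    // Split every row on ":" and flatten the pieces into one list
    AllParts = List.Combine(
        List.Transform(Source[ColumnHeader], each Text.Split(_, ":"))
    ),
    // Drop repeated substrings, then sort ascending as the query above does
    UniqueSorted = List.Sort(List.Distinct(AllParts)),
    // Join the remaining substrings back into one delimited string
    Merged = Text.Combine(UniqueSorted, ":")
in
    Merged
```

With these five sample rows the query evaluates to "AA:BB:CC:DD:EE", and it does not depend on how many substrings each row holds.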
Here is the Power Query Editor code that takes GROUP BY into account.
Create a new table RESULT_BY_GROUP using the following code:
let
    Source = TABLE_BY_ELEMENT,
    #"Removed Columns" = Table.RemoveColumns(Source,{"KEY", "KEY_DATA"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns", "LIST", Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv), {"LIST.1", "LIST.2", "LIST.3"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"LIST.1", type text}, {"LIST.2", type text}, {"LIST.3", type text}}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"GROUP"}, "Attribute", "Value"),
    #"Removed Columns1" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Attribute"}),
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns1"),
    #"Sorted Rows" = Table.Sort(#"Removed Duplicates",{{"GROUP", Order.Ascending}, {"Value", Order.Ascending}}),
    #"Grouped Rows" = Table.Group(#"Sorted Rows", {"GROUP"}, {{"all", each _, type table [GROUP=nullable text, Value=text]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "NewList", each [all][Value]),
    #"Extracted Values" = Table.TransformColumns(#"Added Custom", {"NewList", each Text.Combine(List.Transform(List.Sort(_), Text.From), ":"), type text}),
    #"Removed Columns2" = Table.RemoveColumns(#"Extracted Values",{"all"}),
    #"Merged Queries" = Table.NestedJoin(#"Removed Columns2", {"GROUP"}, TABLE_BY_GROUP, {"GROUP"}, "TABLE_BY_GROUP", JoinKind.LeftOuter),
    #"Expanded TABLE_BY_GROUP" = Table.ExpandTableColumn(#"Merged Queries", "TABLE_BY_GROUP", {"GROUP_DATA "}, {"TABLE_BY_GROUP.GROUP_DATA "}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded TABLE_BY_GROUP",{{"TABLE_BY_GROUP.GROUP_DATA ", "GROUP_DATA"}}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns",{{"GROUP", type text}, {"NewList", type text}, {"GROUP_DATA", Int64.Type}})
in
    #"Changed Type1"
This gives the final output.
You can easily work out your second requirement, the table RESULT_ELEMENT, from your base table.
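The grouped query sorts the substrings alphabetically, whereas the question's RESULT_BY_GROUP lists them in order of first appearance. If that order matters, a possible sketch is to deduplicate inside Table.Group with List.Distinct and skip the sort. The inline table, the helper name MergeUnique and the step names are assumptions for illustration, and the join to TABLE_BY_GROUP is left out for brevity.

```powerquery
let
    // Hypothetical inline copy of TABLE_BY_ELEMENT from the question
    Source = Table.FromRecords({
        [KEY = 1, GROUP = "G1", LIST = "AA:BB:FF", KEY_DATA = 11],
        [KEY = 2, GROUP = "G1", LIST = "CC:AA", KEY_DATA = 22],
        [KEY = 3, GROUP = "G1", LIST = "FF:DD:AA", KEY_DATA = 33],
        [KEY = 4, GROUP = "G1", LIST = "CC:DD:AA", KEY_DATA = 44],
        [KEY = 5, GROUP = "G2", LIST = "CC:FF:GG", KEY_DATA = 55],
        [KEY = 6, GROUP = "G2", LIST = "BB:AA", KEY_DATA = 66]
    }),
    // Helper (hypothetical name): merge delimited strings into one string
    // that keeps each substring once, in order of first appearance
    MergeUnique = (strings as list) as text =>
        Text.Combine(
            List.Distinct(
                List.Combine(List.Transform(strings, (s) => Text.Split(s, ":")))
            ),
            ":"
        ),
    // One row per GROUP, with the merged list built from that group's LIST values
    Grouped = Table.Group(
        Source,
        {"GROUP"},
        {{"NewList", (groupRows) => MergeUnique(groupRows[LIST]), type text}}
    )
in
    Grouped
```

On the sample data this yields AA:BB:FF:CC:DD for G1 and CC:FF:GG:BB:AA for G2, matching the NewList column the question asks for.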