Power Query - Variable Column and Header location from 30+ workbooks
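I am trying to combine a number of workbooks, each with multiple sheets. The problem is that sheet 1 has a lot of header information ahead of the data I need to extract. There are also many merged cells, which return a lot of null values and push the data into variable columns depending on the date and version of the source workbook.
Currently, sorting and promoting headers lets me match the first two columns of the information I need, but the subsequent information is pushed out into other fields.
Is there a way to remove the null values and shift the data set left to match the fields? Or, better yet, to identify the dynamic header changes and return the data matching the selected headers?
Below is an overview of the issue. Unfortunately, given the number of sheets and workbooks, cleaning the source data is not really an option. I am new to Power Query and can't seem to get my head around this.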
c1 c2 c3 c4 c5 c6 c7
A B Null C D Null E
a b c D Null E Null
A B C Null D G E
I only need A-B-C-D-E.
= () => let
Source = Folder.Files("C:\Users\XXXXXXXX\Desktop\Log"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File from Log", each #"Transform File from Log"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File from Log"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File from Log", Table.ColumnNames(#"Transform File from Log"(#"Sample File"))),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"Name", type text}, {"Data", type any}, {"Item", type text}, {"Kind", type text}, {"Hidden", type logical}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"Data", "Name", "Source.Name"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each ([Name] = "page 1" or [Name] = "page 2" or [Name] = "page 2 +" or [Name] = "page 3 +" or [Name] = "page 4 +" or [Name] = "page 5 +" or [Name] = "page 6 +" or [Name] = "page 7 +" or [Name] = "page 8 +")),
#"Reordered Columns" = Table.ReorderColumns(#"Filtered Rows",{"Source.Name", "Name", "Data"}),
#"Expanded Data" = Table.ExpandTableColumn(#"Reordered Columns", "Data", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column22", "Column23", "Column24", "Column25", "Column26", "Column27", "Column28", "Column29", "Column30", "Column31", "Column32", "Column33", "Column34", "Column35", "Column36", "Column37", "Column38", "Column39", "Column40", "Column41", "Column42", "Column43", "Column44", "Column45", "Column46", "Column47", "Column48", "Column49", "Column50", "Column51", "Column52", "Column53", "Column54", "Column55", "Column56", "Column57", "Column58", "Column59", "Column60", "Column61", "Column62", "Column63", "Column64", "Column65", "Column66", "Column67", "Column68", "Column69", "Column70", "Column71", "Column72", "Column73", "Column74", "Column75", "Column76", "Column77", "Column78", "Column79", "Column80", "Column81", "Column82", "Column83", "Column84"}, {"Data.Column1", "Data.Column2", "Data.Column3", "Data.Column4", "Data.Column5", "Data.Column6", "Data.Column7", "Data.Column8", "Data.Column9", "Data.Column10", "Data.Column11", "Data.Column12", "Data.Column13", "Data.Column14", "Data.Column15", "Data.Column16", "Data.Column17", "Data.Column18", "Data.Column19", "Data.Column20", "Data.Column21", "Data.Column22", "Data.Column23", "Data.Column24", "Data.Column25", "Data.Column26", "Data.Column27", "Data.Column28", "Data.Column29", "Data.Column30", "Data.Column31", "Data.Column32", "Data.Column33", "Data.Column34", "Data.Column35", "Data.Column36", "Data.Column37", "Data.Column38", "Data.Column39", "Data.Column40", "Data.Column41", "Data.Column42", "Data.Column43", "Data.Column44", "Data.Column45", "Data.Column46", "Data.Column47", "Data.Column48", "Data.Column49", "Data.Column50", "Data.Column51", "Data.Column52", "Data.Column53", "Data.Column54", 
"Data.Column55", "Data.Column56", "Data.Column57", "Data.Column58", "Data.Column59", "Data.Column60", "Data.Column61", "Data.Column62", "Data.Column63", "Data.Column64", "Data.Column65", "Data.Column66", "Data.Column67", "Data.Column68", "Data.Column69", "Data.Column70", "Data.Column71", "Data.Column72", "Data.Column73", "Data.Column74", "Data.Column75", "Data.Column76", "Data.Column77", "Data.Column78", "Data.Column79", "Data.Column80", "Data.Column81", "Data.Column82", "Data.Column83", "Data.Column84"}),
#"Filtered Rows1" = Table.SelectRows(#"Expanded Data", each ([Data.Column2] <> null and [Data.Column2] <> 16 and [Data.Column2] <> "16" and [Data.Column2] <> "LOCATION")),
#"Promoted Headers" = Table.PromoteHeaders(#"Filtered Rows1", [PromoteAllScalars=true])
in
#"Promoted Headers"
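To remove the nulls and slide everything to the left:
Add Column .. Index Column
Right-click the Index column and choose Unpivot Other Columns
Right-click the Attribute column and remove it
Group on Index, then modify the generated code so that it ends with the following, which adds another index within each group:
each Table.AddIndexColumn(_, "Index2", 1, 1), type table}})
Use the arrows at the top of the column to expand the [x]Value and [x]Index2 fields
Click the Index2 column and choose Transform .. Pivot Column, with Value as the values column and, under Advanced, Don't Aggregate
This converts the "before" table above into the "after" table.
Sample code: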
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
#"Removed Columns" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Attribute"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Index"}, {{"GRP", each Table.AddIndexColumn(_, "Index2", 1, 1), type table}}),
#"Expanded GRP" = Table.ExpandTableColumn(#"Grouped Rows", "GRP", {"Value", "Index2"}, {"Value", "Index2"}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Expanded GRP", {{"Index2", type text}}, "en-US"), List.Distinct(Table.TransformColumnTypes(#"Expanded GRP", {{"Index2", type text}}, "en-US")[Index2]), "Index2", "Value"),
#"Removed Columns1" = Table.RemoveColumns(#"Pivoted Column",{"Index"})
in #"Removed Columns1"
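As an alternative to the unpivot/group/pivot route above, the same left shift can be sketched row by row. This is only a minimal sketch under a few assumptions: the sample table is hypothetical and mirrors the one in the question, the empty cells are real nulls rather than the text "Null", and the step names (ShiftedRows, Width, PaddedRows, ColumnNames) are made up for illustration.

```powerquery
let
    // Hypothetical sample mirroring the table in the question; null marks an empty cell
    Source = #table(
        {"c1", "c2", "c3", "c4", "c5", "c6", "c7"},
        {
            {"A", "B", null, "C", "D", null, "E"},
            {"a", "b", "c", "D", null, "E", null},
            {"A", "B", "C", null, "D", "G", "E"}
        }
    ),
    // Drop the nulls in each row, which slides the remaining values to the left
    ShiftedRows = List.Transform(Table.ToRows(Source), List.RemoveNulls),
    // The widest remaining row decides how many output columns are needed
    // (assumes the table has at least one row)
    Width = List.Max(List.Transform(ShiftedRows, List.Count)),
    // Pad shorter rows with nulls so that every row has the same length
    PaddedRows = List.Transform(
        ShiftedRows,
        each _ & List.Repeat({null}, Width - List.Count(_))
    ),
    // Generic names Column1..ColumnN; headers can be promoted afterwards if needed
    ColumnNames = List.Transform({1..Width}, each "Column" & Text.From(_)),
    Result = Table.FromRows(PaddedRows, ColumnNames)
in
    Result
```

Like the unpivot approach, this only removes nulls. An extra value such as the G in the third sample row stays in place and shifts the values after it, so it would still need to be filtered out separately. To apply the idea to each sheet in the combined query, the body could be wrapped in a function and applied to the nested Data column with Table.TransformColumns before the "Expanded Data" step, though that has not been tested against the original workbooks.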