Power Query: appending files gives an error
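I'm trying to append nearly 10,000 Excel files (each 50-100 KB). Partway through the process I hit a PQ error. Because the error occurs halfway through the append, I can't tell which .xlsx file is causing the problem.
PQ's Queries & Connections pane shows the following error at the same time:
Other than manually loading the files into PQ one at a time until I find the one with the error, how can I troubleshoot this? Thanks for reading!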
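I regularly run into PQ failing outright when it hits an "error" cell in an Excel workbook, even if you have tried to remove errors in earlier steps. I'm not clear on the criteria that trigger this, but I wonder whether that is what's happening here, since the message mentions a "#VALUE!" error. While PQ should probably handle this more gracefully, I made a couple of queries that let me enter a directory and get back the workbook, sheet, and row of every error cell in every file in that directory. I've never tried them with 10k Excel files, but they might be fast enough if my code were cleaned up for efficiency.
The query that gets all of the raw Excel file data looks like this: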
    let
        // Replace the placeholder with the folder path, as text
        Source = Folder.Files("YOUR DIRECTORY HERE"),
        // Skip temporary lock files (names starting with "~") and keep only .xlsx/.xlsm workbooks
        #"Filtered Rows1" = Table.SelectRows(Source, each not Text.StartsWith([Name], "~")),
        #"Filtered Rows" = Table.SelectRows(#"Filtered Rows1", each Text.EndsWith([Extension], ".xlsx") or Text.EndsWith([Extension], ".xlsm")),
        // Open each workbook and keep only its sheets
        #"Added Custom" = Table.AddColumn(#"Filtered Rows", "WorkbookData", each Excel.Workbook([Content])),
        #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Folder Path", "Name", "WorkbookData"}),
        #"Expanded WorkbookData" = Table.ExpandTableColumn(#"Removed Other Columns", "WorkbookData", {"Data", "Hidden", "Item", "Kind", "Name"}, {"WorkbookData.Data", "WorkbookData.Hidden", "WorkbookData.Item", "WorkbookData.Kind", "WorkbookData.Name"}),
        #"Filtered Rows2" = Table.SelectRows(#"Expanded WorkbookData", each ([WorkbookData.Kind] = "Sheet")),
        #"Removed Other Columns1" = Table.SelectColumns(#"Filtered Rows2",{"Folder Path", "Name", "WorkbookData.Name", "WorkbookData.Data"}),
        // Expand every sheet's data, using the union of all column names
        ExpandedData = Table.ExpandTableColumn(#"Removed Other Columns1", "WorkbookData.Data", Table.ColumnNames(Table.Combine(#"Removed Other Columns1"[WorkbookData.Data]))),
        // Build a per-sheet row number (PerSheetRow) and join it back on a running index
        IdentifySheets = Table.AddColumn(ExpandedData, "UniqueSheet", each [Folder Path]&[Name]&[WorkbookData.Name]),
        SheetRowCounts = Table.Group(IdentifySheets, {"UniqueSheet"}, {{"Count", each Table.RowCount(_), type number}}),
        #"Added Custom2" = Table.AddColumn(SheetRowCounts, "PerSheetRow", each List.Numbers(1, [Count], 1)),
        #"Expanded PerSheetIndex" = Table.ExpandListColumn(#"Added Custom2", "PerSheetRow"),
        IndexBase = Table.AddIndexColumn(#"Expanded PerSheetIndex", "Index", 0, 1),
        #"Added Index" = Table.AddIndexColumn(IdentifySheets, "Index", 0, 1),
        #"Merged Queries" = Table.NestedJoin(#"Added Index",{"Index"},IndexBase,{"Index"},"NewColumn",JoinKind.LeftOuter),
        #"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"PerSheetRow"}, {"PerSheetRow"}),
        #"Removed Columns" = Table.RemoveColumns(#"Expanded NewColumn",{"UniqueSheet", "Index"}),
        #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns", List.Combine({{"Folder Path", "Name", "WorkbookData.Name", "PerSheetRow"}, List.RemoveMatchingItems(Table.ColumnNames(ExpandedData), {"Folder Path", "Name", "WorkbookData.Name"})}))
    in
        #"Reordered Columns"
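That query is set to connection only, because I don't want to load the data from every sheet of every workbook I'm checking.
The query I use to load the rows that contain errors looks like this: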
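The per-sheet row numbering in the query above (the Table.Group, List.Numbers and Table.NestedJoin steps) could probably be simplified by numbering the rows of each sheet before expanding it. The following is only a sketch under that assumption, not the query actually used above: the `Sheets` table is a hypothetical stand-in for the #"Removed Other Columns1" step, and the file and column names are made up for illustration.

```powerquery
let
    // Hypothetical stand-in for #"Removed Other Columns1":
    // one row per sheet, with the sheet's data as a nested table
    Sheets = #table(
        {"Name", "WorkbookData.Name", "WorkbookData.Data"},
        {
            {"a.xlsx", "Sheet1", #table({"Column1"}, {{"x"}, {"y"}})},
            {"b.xlsx", "Sheet1", #table({"Column1"}, {{"z"}})}
        }),
    // Number the rows inside each nested sheet table, starting at 1
    Numbered = Table.TransformColumns(
        Sheets,
        {"WorkbookData.Data", each Table.AddIndexColumn(_, "PerSheetRow", 1, 1)}),
    // Expand as before; PerSheetRow now travels with the data columns
    Expanded = Table.ExpandTableColumn(
        Numbered, "WorkbookData.Data", {"PerSheetRow", "Column1"})
in
    Expanded
```

With real data, the column list passed to Table.ExpandTableColumn would still come from Table.ColumnNames(Table.Combine(...)) over the numbered tables, so it would include PerSheetRow automatically. Whether this is noticeably faster on thousands of files has not been measured here.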
    let
        // Replace the placeholder with the name of the query above (use #"..." if the name contains spaces)
        Source = NAME OF THE QUERY ABOVE,
        // Keep only rows that have an error in at least one column
        #"Kept Errors" = Table.SelectRowsWithErrors(Source, Table.ColumnNames(Source)),
        // Build a list of {column name, "ERROR"} pairs, one per column
        ColumnList = Table.FromList(Table.ColumnNames(#"Kept Errors")),
        #"Added Custom" = Table.AddColumn(ColumnList, "Custom", each "ERROR"),
        #"Added Custom1" = Table.AddColumn(#"Added Custom", "Replacements", each Record.FieldValues(_)),
        ErrorReplacements = Table.SelectColumns(#"Added Custom1",{"Replacements"}),
        // Replace every error cell with the text "ERROR"
        #"Replaced Errors" = Table.ReplaceErrorValues(#"Kept Errors", ErrorReplacements[Replacements]),
        #"Renamed Columns" = Table.RenameColumns(#"Replaced Errors",{{"PerSheetRow", "SheetRow"}, {"Name", "Workbook"}, {"WorkbookData.Name", "Sheet"}})
    in
        #"Renamed Columns"
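I couldn't find a way to make PQ convert an "error" cell into a string describing the specific error (it may be possible, I just don't know how), so I simply have it replace every error cell with "ERROR" and set up conditional formatting on my sheet to highlight those cells.
I can't say how much these queries will help in your case, but they have helped me countless times to find error cells in sets of Excel files.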
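One possible way to surface the specific error text is M's `try` expression, which returns a record with `HasError` and, on failure, an `Error` record that includes a `Message` field. The following is a sketch and not part of the queries above: the `Sample` table is a hypothetical stand-in for the #"Kept Errors" step, and the message text used for the simulated error cell is an assumption about what Excel error cells produce.

```powerquery
let
    // Hypothetical sample standing in for the #"Kept Errors" step;
    // the second row simulates an error cell (message text is an assumption)
    Sample = Table.FromRecords({
        [Workbook = "a.xlsx", SheetRow = 1, Column1 = 10],
        [Workbook = "b.xlsx", SheetRow = 3,
         Column1 = error Error.Record("DataFormat.Error", "Invalid cell value '#VALUE!'.")]
    }),
    // Columns that may hold error cells (chosen for this sample only)
    DataColumns = {"Column1"},
    // Try each data column in the row; keep "column: message" for every error found
    WithMessages = Table.AddColumn(Sample, "ErrorText", (row) =>
        Text.Combine(
            List.RemoveNulls(
                List.Transform(DataColumns, (col) =>
                    let
                        attempt = try Record.Field(row, col)
                    in
                        if attempt[HasError]
                            then col & ": " & attempt[Error][Message]
                            else null)),
            "; ")),
    // Drop the data columns so no error cells remain in the output
    Result = Table.RemoveColumns(WithMessages, DataColumns)
in
    Result
```

In a real query, `DataColumns` would be the sheet data columns (everything except the folder, workbook, sheet, and row columns), and the new `ErrorText` column could replace the blanket "ERROR" substitution.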