Power Query: merge consecutive duplicate rows, collapsing their values

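I have a bus timetable with stops and their Time_in and Time_out. Sometimes a stop is repeated in my data (in consecutive rows), and I need to merge those rows, keeping only the first Time_in and the last Time_out.

Here is an example: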

Stop       Time_in  Time_out
23 Street  15:23    15:27
42 Street  15:35    15:40
42 Street  15:42    15:48
47 Street  15:56    16:10
42 Street  16:14    16:19

Desired result:

Stop       Time_in  Time_out
23 Street  15:23    15:27
42 Street  15:35    15:48
47 Street  15:56    16:10
42 Street  16:14    16:19

Any help is much appreciated. Thanks in advance.

Power Query

    let
        Source = Web.BrowserContents("
        #"Extracted Table From Html" = Html.Table(
            Source,
            {
                {"Column1", "DIV.s-table-container:nth-child(3) > TABLE.s-table > * > TR > :nth-child(1)"},
                {"Column2", "DIV.s-table-container:nth-child(3) > TABLE.s-table > * > TR > :nth-child(2)"},
                {"Column3", "DIV.s-table-container:nth-child(3) > TABLE.s-table > * > TR > :nth-child(3)"}
            },
            [RowSelector="DIV.s-table-container:nth-child(3) > TABLE.s-table > * > TR"]
        ),
        #"Promoted Headers" = Table.PromoteHeaders(#"Extracted Table From Html", [PromoteAllScalars=true]),
        #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Stop", type text}, {"Time_in", type time}, {"Time_out", type time}}),
        #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Time_out"}),
        #"Grouped Rows" = Table.Group(#"Removed Columns", {"Stop"}, {{"ad_1", each _, type table [Stop=nullable text, Time_in=nullable time]}}),
        #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each
            let
                x = [ad_1],
                #"Removed Columns1" = Table.RemoveColumns(x,{"Stop"}),
                #"Sorted Rows" = Table.Sort(#"Removed Columns1",{{"Time_in", Order.Ascending}}),
                #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1, Int64.Type),
                #"Filtered Rows" = Table.SelectRows(#"Added Index", each ([Index] = 1)),
                #"Removed Columns2" = Table.RemoveColumns(#"Filtered Rows",{"Index"})
            in
                #"Removed Columns2"),
        #"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"ad_1"}),
        #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Columns1", "Custom", {"Time_in"}, {"Time_in"}),
        Custom1 = Table.RemoveColumns(#"Changed Type",{"Time_in"}),
        #"Grouped Rows1" = Table.Group(Custom1, {"Stop"}, {{"ad_2", each _, type table [Stop=nullable text, Time_out=nullable time]}}),
        Custom2 = Table.AddColumn(#"Grouped Rows1", "Custom", each
            let
                x = [ad_2],
                #"Removed Columns1" = Table.RemoveColumns(x,{"Stop"}),
                #"Sorted Rows" = Table.Sort(#"Removed Columns1",{{"Time_out", Order.Descending}}),
                #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1, Int64.Type),
                #"Filtered Rows" = Table.SelectRows(#"Added Index", each ([Index] = 1)),
                #"Removed Columns2" = Table.RemoveColumns(#"Filtered Rows",{"Index"})
            in
                #"Removed Columns2"),
        #"Removed Columns2" = Table.RemoveColumns(Custom2,{"ad_2"}),
        #"Expanded Custom1" = Table.ExpandTableColumn(#"Removed Columns2", "Custom", {"Time_out"}, {"Time_out"}),
        #"Merged Queries" = Table.NestedJoin(#"Expanded Custom", {"Stop"}, #"Expanded Custom1", {"Stop"}, "Expanded Custom1", JoinKind.LeftOuter),
        #"Expanded Expanded Custom1" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Custom1", {"Time_out"}, {"Time_out"})
    in
        #"Expanded Expanded Custom1"

DAX

min:= MIN('Table 1'[Time_in])
max:= MAX('Table 1'[Time_out])

DAX result

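In Power Query, right-click the Stop column and choose Group By...

Choose Add aggregation (under Advanced)

For the first row, choose the operation Min on column Time_in

For the second row, choose the operation Max on column Time_out

If needed, change type number to type time in the formula bar or in Home > Advanced Editor.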

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Stop", type text}, {"Time_in", type time}, {"Time_out", type time}}),
        #"Grouped Rows" = Table.Group(#"Changed Type", {"Stop"}, {{"Time_in", each List.Min([Time_in]), type time}, {"Time_out", each List.Max([Time_out]), type time}})
    in
        #"Grouped Rows"

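For the new requirement that Stops can repeat, we first create a group number, so that Stops are combined only when they sit in adjacent rows

Add Column > Index Column

Add Column > Custom Column, using the formula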

= try if #"Added Index"{[Index]}[Stop] = #"Added Index"{[Index]-1}[Stop] then null else [Index] otherwise [Index]

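Right-click the new column and choose Fill > Down

Select both the Stop and Custom columns and Group By them

Choose Add aggregation

For the first row, choose the operation Min on column Time_in

For the second row, choose the operation Max on column Time_out

Sample code: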

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Stop", type text}, {"Time_in", type time}, {"Time_out", type time}}),
        #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
        #"Added Custom" = Table.AddColumn(#"Added Index", "Custom", each try if #"Added Index"{[Index]}[Stop] = #"Added Index"{[Index]-1}[Stop] then null else [Index] otherwise [Index]),
        #"Filled Down" = Table.FillDown(#"Added Custom",{"Custom"}),
        #"Grouped Rows" = Table.Group(#"Filled Down", {"Stop", "Custom"}, {{"Time_in", each List.Min([Time_in]), type time}, {"Time_out", each List.Max([Time_out]), type time}}),
        #"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Custom"})
    in
        #"Removed Columns"
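
As a possible shortcut for the index and fill-down steps, Table.Group takes an optional fourth argument, and GroupKind.Local should start a new group each time the key changes from the previous row instead of pooling all rows with the same key. The sketch below is not part of the steps above. It assumes the same Table1 source, that the rows are already in timetable order, and that times increase within a run of repeated Stops, so that List.Min and List.Max coincide with the first Time_in and the last Time_out.

```powerquery
let
    // Assumption: an Excel table named "Table1" with columns Stop, Time_in, Time_out
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Stop", type text}, {"Time_in", type time}, {"Time_out", type time}}),
    // GroupKind.Local groups only consecutive rows that share the same Stop
    #"Grouped Rows" = Table.Group(
        #"Changed Type",
        {"Stop"},
        {{"Time_in", each List.Min([Time_in]), type time}, {"Time_out", each List.Max([Time_out]), type time}},
        GroupKind.Local
    )
in
    #"Grouped Rows"
```

With the example data in the question, the two adjacent 42 Street rows would collapse into one row, while the later 42 Street row after 47 Street stays separate.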