合并具有连接日期的行

Merge rows with connecting dates

我有大量 Excel sheet 客户和订阅数据。从这个 table 我想合并 records/rows 与 connection stop_ 和 start_dates 并在新作品 sheet 中显示结果。数据的简化版本如下所示。

Customer_id subscription_id   start_date    stop_date
1034        RV4               30-4-2012     30-1-2015
1035        AB7               30-1-2014     30-3-2014
1035        AB6               30-1-2014     30-3-2014
1035        AB7               30-12-2013    30-1-2014
1035        AB7               12-12-2012    30-12-2013
1035        AB7               12-9-2010     14-1-2011

因此,公式必须检查 customer_id 和 subscription_id。当sheet中有两行或多行匹配,其中一行的stop_date与另一行的start_date重叠,则提取合并后,新行必须显示第一行的 start_date 和另一行的 stop_date。如果有多行具有连接日期,这也必须起作用。不符合这些条件的所有行在提取后保持不变。所以结果会是这样的:

Customer_id subscription_id start_date  stop_date
1034        RV4             30-4-2012   30-1-2015
1035        AB6             30-1-2014   30-3-2014
1035        AB7             12-12-2012  30-3-2014
1035        AB7             12-9-2010   14-1-2011

动态解决方案是理想的,同时新数据将添加到原始数据 sheet。虽然我知道当您确定要查找的行始终在彼此下方时这是可能的,但这里不是这种情况,它不会为您提供非常动态的解决方案。

所以我猜 Excel 中需要某种数组函数,但经过大量搜索后我找不到 suitable 解决方案。我也有 MATLAB 可用,但不知道从哪里开始该程序有这样的问题。

动态解决方案可能是可能的,但如果数据集很大,它可能会使事情陷入困境,因为你每次都需要它运行更改了一个单元格。

基本上我能看到的最好的方法是在你的 customer_id 和 subscription_id 中创建唯一的键,然后收集该键下的所有日期范围并合并它们。

类似这样的内容应该可以帮助您入门(需要参考 Microsoft Scripting Runtime):

Public Sub LinkSubscriptionDates()

    Dim data As Dictionary, source As Worksheet, target As Worksheet

    Set source = ActiveSheet
    Set data = GetSubscriptions(source)
    Set target = source.Parent.Worksheets.Add

    'Copy headers
    target.Range(target.Cells(1, 1), target.Cells(1, 4)).Value = _
           source.Range(source.Cells(1, 1), source.Cells(1, 4)).Value

    Dim row As Long
    row = 2

    Dim key As Variant, item As Variant
    For Each key In data.Keys
        For Each item In data(key)
            target.Cells(row, 1) = Split(key, "|")(0)
            target.Cells(row, 2) = Split(key, "|")(1)
            target.Cells(row, 3) = Split(item, "|")(0)
            target.Cells(row, 4) = Split(item, "|")(1)
            row = row + 1
        Next item
    Next key

End Sub

Private Function GetSubscriptions(source As Worksheet) As Dictionary

    Dim subscrips As Dictionary
    Set subscrips = New Dictionary

    Dim row As Long
    Dim cust As String, subs As String, starting As String, ending As String

    'Gather all the data as pairs of customer|subscription, starting|ending
    For row = 2 To source.UsedRange.Rows.Count
        Dim dates() As String
        cust = source.Cells(row, 1).Value
        subs = source.Cells(row, 2).Value
        'Valid customer/subscription?
        If cust <> vbNullString And subs <> vbNullString Then
            starting = source.Cells(row, 3).Value
            ending = source.Cells(row, 4).Value
            'Has an ending and starting date?
            If starting <> vbNullString And ending <> vbNullString Then
                Dim key As String
                key = cust & "|" & subs
                'New combo?
                If Not subscrips.Exists(key) Then
                    subscrips.Add key, New Collection
                    subscrips(key).Add starting & "|" & ending
                Else
                    subscrips(key).Add starting & "|" & ending
                    Set subscrips(key) = MergeDates(subscrips(key))
                End If
            End If
        End If
    Next row

    Set GetSubscriptions = subscrips

End Function

Private Function MergeDates(dates As Collection) As Collection

    Dim candidate As Long, index As Long
    Dim values() As String, test() As String
    Dim merge As Boolean

    For index = 1 To dates.Count
        values = Split(dates(index), "|")
        'Check to see if it can be merged with any other row.
        For candidate = index + 1 To dates.Count
            test = Split(dates(candidate), "|")
            If CDate(test(0)) >= CDate(values(0)) And _
               CDate(test(0)) <= CDate(values(1)) Or _
               CDate(test(1)) >= CDate(values(0)) And _
               CDate(test(1)) <= CDate(values(1)) Then
                dates.Remove candidate
                merge = True
                Exit For
            End If
        Next candidate
        If merge Then Exit For
    Next index

    If merge Then
        'Pull both rows out of the collection.
        dates.Remove index
        values(0) = IIf(CDate(test(0)) < CDate(values(0)), _
                        CDate(test(0)), CDate(values(0)))
        values(1) = IIf(CDate(test(1)) > CDate(values(1)), _
                        CDate(test(1)), CDate(values(1)))
        'Put the merged date range back in.
        dates.Add values(0) & "|" & values(1)
        'Recurse.
        Set MergeDates = MergeDates(dates)
    End If

    Set MergeDates = dates

End Function

它确实需要通过数据验证、错误捕获等来充实,目前它只是将结果数据放在新的工作表上。所有工作都在 GetSubscriptions 函数中完成,因此您只需从中获取返回的 Dictionary 并使用其中的数据做任何您需要做的事情。