How to find the longest consecutive non-zero subset of a list?

I have a list of floats that looks something like this:

[
 163.33333333333334,
 0.0,
 0.0,
 154.73684210526315,
 172.94117647058823,
 155.8303886925795,
 0.0,
 156.93950177935943,
 0.0,
 0.0,
 0.0,
 151.5463917525773,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 165.1685393258427,
 156.93950177935943,
 169.6153846153846,
 159.7826086956522,
 167.04545454545453,
 158.06451612903226,
 168.9655172413793,
 157.5,
 0.0,
 159.7826086956522,
 0.0,
 163.94052044609666,
 166.41509433962264,
 0.0,
 0.0,
 0.0,
]

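The actual list is much larger than this, but it contains similar values. From this list, I want to find the longest run of consecutive non-zero values. In this case, it would be: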


 [165.1685393258427,
 156.93950177935943,
 169.6153846153846,
 159.7826086956522,
 167.04545454545453,
 158.06451612903226,
 168.9655172413793,
 157.5]

I am new to Python, and to coding in general, so any help is greatly appreciated.

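def max_non_zero_subset(arr):
    max_non_zero = []   # longest run found so far
    curr_non_zero = []  # run currently being built
    for n in arr:
        if n == 0:
            # A zero ends the current run; keep it if it is the longest so far.
            if len(curr_non_zero) > len(max_non_zero):
                max_non_zero = curr_non_zero
            curr_non_zero = []
        else:
            curr_non_zero.append(n)

    # The list may end in the middle of a run, so compare one last time.
    return max_non_zero if len(max_non_zero) >= len(curr_non_zero) else curr_non_zero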

You can use itertools.groupby to group the values by whether they are zero, then select the groups of non-zero values and take the longest one:

from itertools import groupby

# l is the list from the question
g = groupby(l, key=lambda x: x != 0.0)
m = max((list(s) for v, s in g if v), key=len)
print(m)

Output (for your sample data):

[
 165.1685393258427,
 156.93950177935943,
 169.6153846153846,
 159.7826086956522,
 167.04545454545453,
 158.06451612903226,
 168.9655172413793,
 157.5
]

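Note that since you only need to compare against 0, you can use bool as the groupby key function (i.e. g = groupby(l, bool)). This should be faster than an explicit comparison with 0.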
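One edge case is worth a note: max raises a ValueError when it receives an empty sequence, which happens here if the list is empty or contains only zeros. A minimal sketch that guards against this with the default argument of max (the function name longest_nonzero_run is made up for illustration):

```python
from itertools import groupby


def longest_nonzero_run(values):
    """Return the longest run of consecutive non-zero values, or [] if there is none."""
    # groupby(values, bool) yields (True, run) for non-zero runs and (False, run) for zero runs.
    runs = (list(run) for is_nonzero, run in groupby(values, bool) if is_nonzero)
    # default=[] avoids a ValueError when there are no non-zero runs at all.
    return max(runs, key=len, default=[])


print(longest_nonzero_run([0.0, 1.5, 2.5, 0.0, 3.5]))  # [1.5, 2.5]
print(longest_nonzero_run([0.0, 0.0]))  # []
```

When two runs have the same length, max returns the first one it encounters.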

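You can try using a few if statements. Since you said you are new to Python, I have kept the code as simple as possible, but optimizing it would be good practice.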

fulllist = [163.33333333333334, 0.0, 0.0, 154.73684210526315, 172.94117647058823, 155.8303886925795, 0.0, 156.93950177935943, 0.0, 0.0, 0.0, 151.5463917525773, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 165.1685393258427, 156.93950177935943, 169.6153846153846, 159.7826086956522, 167.04545454545453, 158.06451612903226, 168.9655172413793, 157.5, 0.0, 159.7826086956522, 0.0, 163.94052044609666, 166.41509433962264, 0.0, 0.0, 0.0,]

longest = []
new_try = []

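for element in fulllist:
    if element != 0:
        # Extend the current run.
        new_try.append(element)
        # Compare by length; `new_try > longest` would compare the lists element by element.
        if len(new_try) > len(longest):
            longest = new_try.copy()
    else:
        # A zero ends the current run.
        new_try = []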

print(longest)

Output:

[165.1685393258427, 156.93950177935943, 169.6153846153846, 159.7826086956522, 167.04545454545453, 158.06451612903226, 168.9655172413793, 157.5]

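You can take advantage of the way groupby() handles unsorted data: it starts a new group every time the key changes, so each run of consecutive non-zero values becomes its own group: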

from itertools import groupby

lst = [163.33333333333334, 0.0, 0.0, 154.73684210526315, 172.94117647058823, 155.8303886925795, 0.0, 156.93950177935943, 0.0, 0.0, 0.0, 151.5463917525773, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 165.1685393258427, 156.93950177935943, 169.6153846153846, 159.7826086956522, 167.04545454545453, 158.06451612903226, 168.9655172413793, 157.5, 0.0, 159.7826086956522, 0.0, 163.94052044609666, 166.41509433962264, 0.0, 0.0, 0.0]
result = max((list(g) for k, g in groupby(lst, bool) if k), key=len)
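
To see why this works, here is a small sketch, using a made-up list named sample, that prints each group groupby() produces. Because a new group starts whenever the key changes, separate runs with the same key are never merged:

```python
from itertools import groupby

sample = [1.0, 2.0, 0.0, 0.0, 3.0, 4.0, 5.0, 0.0]  # made-up data for illustration

# Materialize each group so it can be printed and reused.
groups = [(key, list(group)) for key, group in groupby(sample, bool)]
for key, group in groups:
    print(key, group)
# True [1.0, 2.0]
# False [0.0, 0.0]
# True [3.0, 4.0, 5.0]
# False [0.0]

# Keep only the non-zero groups (key is True) and take the longest.
longest = max((group for key, group in groups if key), key=len)
print(longest)  # [3.0, 4.0, 5.0]
```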

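You can use a simple algorithm with a buffer.

Loop over the list while accumulating the current run. Whenever a run ends and it is longer than the longest one seen so far, it becomes the new longest.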

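def get_longest_consecutive_non_zero_subset(input_list: list) -> list:

    max_subset = []
    current_max_subset = []

    for number in input_list:
        if number != 0:
            current_max_subset.append(number)
        else:
            # A zero ends the current run; keep it if it is the longest so far.
            if len(current_max_subset) > len(max_subset):
                max_subset = current_max_subset
            current_max_subset = []

    # Handle a run that extends to the end of the list.
    if len(current_max_subset) > len(max_subset):
        max_subset = current_max_subset

    return max_subset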


test_list = [0, 1, 2, 3, 0, 0, 1, 2, 3, 4, 0]
result = get_longest_consecutive_non_zero_subset(test_list)

print(result)
assert result == [1, 2, 3, 4]
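
Since the question mentions that the real list is much larger, one possible variant tracks only indices and slices once at the end, so it does not build a temporary list for every run. This is a sketch under that assumption, not a benchmarked optimization, and longest_nonzero_slice is a hypothetical name:

```python
def longest_nonzero_slice(values):
    """Return the longest run of consecutive non-zero values using index bookkeeping."""
    best_start, best_len = 0, 0  # start index and length of the longest run so far
    run_start = None  # start index of the current run, or None when not inside a run

    for i, value in enumerate(values):
        if value != 0:
            if run_start is None:
                run_start = i  # a new run begins here
            run_len = i - run_start + 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_start = None  # a zero ends the current run

    return values[best_start:best_start + best_len]


# The list deliberately ends inside a run to show that the trailing run is handled.
print(longest_nonzero_slice([0, 1, 2, 3, 0, 0, 1, 2, 3, 4]))  # [1, 2, 3, 4]
```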