快速计算由 connectedComponents 生成的所有簇的平均颜色

Quickly compute average color of all the clusters generated by connectedComponents

我用 cv2.connectedComponentWithStats 将图像分成几组。现在我想观察每个簇的颜色变化。

目前我的代码是这样的:

#...Calculation of the mask

res,labels,stats,centroids = cv2.connectedComponentsWithStats(im_mask)

def compute_average(frame,i):
   data=frame[labels==i].mean(axis=0)
   return (data[2]-data[1]) # difference between red and green channel is meaningful for me

while True:
   frame=capture.read()
   if(not frame[0]):
      break
   start_time=time.time()
   measurements = [ compute_average(frame,i) for i in range(1,len(centroids))]
   print("Computing took",time.time()-start_time

似乎计算每个帧的测量值花费了将近 1.5 秒(我有大约 300 个簇,每个簇有 200-600 像素)。这是不可接受的。

看来,通过巧妙地选择 numpy 算法来计算平均值,我可以获得更好的性能。特别是应该可以同时计算所有集群的平均值。但是,我被困在这里了。

有没有办法根据标签对图像的像素进行分组?

似乎是利用 np.bincount 的完美用例,这是一种使用可选的第二个参数计算分箱求和和加权求和的非常有效的方法。在我们这里的例子中,我们将使用标签作为 bin 并将总和作为计数,然后 frame 作为权重作为可选权重 arg。

因此,我们将有一个向量化的并且希望更有效的方式,就像这样 -

def bincount_method(frame, labels):
    f = frame.reshape(-1,3)
    cn1 = np.bincount(labels.ravel(), f[:,1])
    cn2 = np.bincount(labels.ravel(), f[:,2])
    return (cn2[1:]-cn1[1:])/stats[1:,-1]

基准测试

对于计时,我们将重新使用。我们正在使用 Python 3.6.8.

import numpy as np
import cv2
import urllib.request as ur

# Setup
url = 'https://i.stack.imgur.com/puxMo.png'
s = ur.urlopen(url)
url_response = ur.urlopen(url)
img_array = np.array(bytearray(url_response.read()), dtype=np.uint8)
img = cv2.imdecode(img_array, -1)
im_mask = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

res,labels,stats,centroids = cv2.connectedComponentsWithStats(im_mask)

np.random.seed(0)
frame = np.random.rand(im_mask.shape[0],im_mask.shape[1],3)

计时 -

# Original soln by OP
In [5]: %timeit [compute_average(frame,i) for i in range(1,len(centroids))]
2.38 s ± 116 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# @Dan Mašek's soln
In [3]: %timeit rg_mean_diff_per_label(frame, labels)
92.5 ms ± 1.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

# Solution with bincount
In [4]: %timeit bincount_method(frame, labels)
30.1 ms ± 82.7 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

允许对标签进行预计算

那么 bincount 有一个小的修改:

L = (labels.ravel()[:,None]+np.arange(2)*res).ravel()

def bincount_method_with_precompute(frame, L):
    p = np.bincount(L,frame[...,1:].ravel())
    return (p[res+1:]-p[1:res])/stats[1:,-1]

与@Dan Mašek 的预计算解决方案在同一设置上进行比较:

In [4]: %timeit bincount_method_with_precompute(frame, L)
25.1 ms ± 326 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [5]: %timeit rg_mean_diff_per_label(frame, label_indices)
20.3 ms ± 432 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

想到以下算法。这个 运行s 大约用了 0.115 秒,而原来的是 2.65 秒。 (为了比较@Divakar 的解决方案 运行s 在大约 0.058 秒内)

  • 将标签展平成线性阵列。 (注意:最好使用 ravel(),因为显然 flatten() 总是复制一份)
  • 对标签执行间接排序以获得索引列表。
  • 使用它对标签进行排序。
  • 在排序列表中找到每个标签的第一次出现(两个相邻标签不同)
  • 然后对于每个标签
    • 获取该标签对应的索引范围
    • Select对应的像素,计算第1列和第2列的平均值
    • 存储差值

代码:

import numpy as np

def rg_mean_diff_per_label(frame, labels):
    flat_labels = labels.ravel()
    order = flat_labels.argsort()
    sorted_labels = flat_labels[order]
    # Find the position of last occurence of each label
    boundaries = np.where(sorted_labels[:-1] != sorted_labels[1:])[0]
    # And turn it into position of the first occurence each label
    # (except 0, which we want to ignore anyway, as it represents the background)
    boundaries += 1
    # Add position of end of array, so we can simply make ranges using zip(...)
    boundaries = np.append(boundaries, len(flat_labels))

    flat_pixels = frame.reshape(-1, 3) # One pixel per row, 3 columns, 1 per colour channel

    measurements = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        indices = order[start:end]
        # NB: We don't care about blue, skip it
        data = flat_pixels[indices,1:]
        means = data.mean(axis=0)
        measurements.append(means[1] - means[0])

    return measurements

测试图片:

颜色数据是随机的,其中的值对性能评估无关紧要。


由于您对所有帧使用相同的掩码,即 labels 保持不变,我们可以通过稍微重构并缓存保持不变的部分来避免大量重复计算。

我们来看看函数的简介:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    88                                           def rg_mean_diff_per_label(frame, labels):
    89         1        109.0    109.0      0.0      flat_labels = labels.ravel()
    90         1     592417.0 592417.0     35.1      order = flat_labels.argsort()
    91         1     107003.0 107003.0      6.3      sorted_labels = flat_labels[order]
    93         1      38591.0  38591.0      2.3      boundaries = np.where(sorted_labels[:-1] != sorted_labels[1:])[0]
    96         1        364.0    364.0      0.0      boundaries += 1
    98         1        666.0    666.0      0.0      boundaries = np.append(boundaries, len(flat_labels))
   100         1         61.0     61.0      0.0      flat_pixels = frame.reshape(-1, 3) # One pixel per row, 3 columns, 1 per colour channel
   102         1         25.0     25.0      0.0      measurements = []
   103       459      11182.0     24.4      0.7      for start, end in zip(boundaries[:-1], boundaries[1:]):
   104       458      17117.0     37.4      1.0          indices = order[start:end]
   106       458     314348.0    686.3     18.6          data = flat_pixels[indices,1:]
   107       458     579712.0   1265.7     34.4          means = data.mean(axis=0)
   108       458      24001.0     52.4      1.4          measurements.append(means[1] - means[0])
   110         1         21.0     21.0      0.0      return measurements

每次迭代唯一改变的是像素数据,所以如果我们预先计算属于每个标签的所有像素的索引数组,我们应该能够将每次迭代时间再减少 40 % 或者。 (这个 运行s 每次迭代大约 0.057 秒,但竞争已经改善了:))

代码:

def precalc_label_indices(labels):
    flat_labels = labels.ravel()
    order = flat_labels.argsort()
    sorted_labels = flat_labels[order]
    # Find the position of last occurence of each label
    boundaries = np.where(sorted_labels[:-1] != sorted_labels[1:])[0]
    # And turn it into position of the first occurence each label
    # (except 0, which we want to ignore anyway, as it represents the background)
    boundaries += 1
    # Add position of end of array, so we can simply make ranges using zip(...)
    boundaries = np.append(boundaries, len(flat_labels))

    label_indices = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        indices = order[start:end]
        indices = np.sort(indices)) # Access in order can't hurt
        label_indices.append(indices)
    return label_indices

label_indices = precalc_label_indices(labels)

def rg_mean_diff_per_label(frame, label_indices):
    flat_pixels = frame.reshape(-1, 3) # One pixel per row, 3 columns, 1 per colour channel

    measurements = []
    for indices in label_indices:
        # NB: We don't care about blue, skip it
        data = flat_pixels[indices,1:]
        means = data.mean(axis=0)
        measurements.append(means[1] - means[0])

    return measurements

现在,既然您已经在使用 OpenCV,为什么不在这里利用它呢?均值的计算似乎是现在最大的瓶颈。事实证明,OpenCV mean() 在这里要快得多。

def rg_mean_diff_per_label(frame, label_indices):
    flat_pixels = frame.reshape(-1, 3) # One pixel per row, 3 columns, 1 per colour channel

    measurements = []
    for indices in label_indices:
        # NB: We don't care about blue, skip it
        data = flat_pixels[indices,1:]
        means = cv2.mean(data.reshape(-1,1,2)) # This one works per-channel
        measurements.append(means[1] - means[0])

    return measurements

此算法导致输出略有不同(差异最大为 1e-14)。但是,它现在每次迭代 运行s 在 0.021 秒内(不包括在 long-运行 中无关紧要的预计算)。


让我们看看新功能的配置文件。

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    32                                           def rg_mean_diff_per_label(frame, label_indices):
    33        16       3314.0    207.1      0.1      flat_pixels = frame[...,1:].reshape(-1, 2)
    35        16        464.0     29.0      0.0      measurements = []
    36      7344     151147.0     20.6      2.7      for indices in label_indices:
    38      7328    4574161.0    624.2     81.6          data = flat_pixels[indices]
    39      7328     640719.0     87.4     11.4          means = cv2.mean(data.reshape(-1,1,2))
    40      7328     237797.0     32.5      4.2          measurements.append(means[1] - means[0])
    42        16        413.0     25.8      0.0      return measurements

现在最大的瓶颈是属于给定标签的像素的提取。

cv2.mean 支持掩码参数这一事实给了我另一个想法。让我们利用 stats,因为它们包含每个标签的边界框和像素数。我们可以为每个标签预先生成 ROI 大小的蒙版图像。 然后对于每个标签,我们提取 ROI(基于边界框),并使用适当的掩码计算它的平均值。这次我们计算所有三个通道的平均值,以避免像素数据的副本。

label_data = []
for label in range(res):
    mask = cv2.inRange(labels, label, label)
    x,y,w,h,n = stats[label]
    roi = mask[y:y+h,x:x+w]
    label_data.append((x,y,x+w,y+h,roi))

def test_v4(frame, label_data):
    measurements = []
    for x1,y1,x2,y2,mask in label_data[1:]:
        roi = frame[y1:y2,x1:x2,:]
        means = cv2.mean(roi, mask)
        measurements.append(means[2] - means[1])
    return measurements

每次迭代需要 0.007 秒 运行s,同样不包括预计算时间。

这是使用我创建的单个测试图像。

注意:我有一个坏情况(3 像素宽对角线条纹)场景的想法,其中掩码对于包含的像素数量来说非常大。我会尽快提供更新。