How to group RGB or HEX color codes to bigger sets of color groups?

I am analyzing a large number of images and extracting the dominant color codes from them.

I want to group them into ranges of generic color names, like green, dark green, light green, blue, dark blue, light blue, and so on.

I am looking for a language-agnostic way to implement something myself, and I would appreciate any examples I could study to achieve this.

In the machine learning field, what you want to do is called classification, in which the goal is to assign the label of one of the classes (colors) to each observation (image). To do this, the classes must be pre-defined. Suppose these are the colors we want to assign to images:

To determine the dominant color of an image, the distance between each of its pixels and all the colors in the table must be calculated. Note that this distance is calculated in RGB color space. To calculate the distance between the ij-th pixel of the image and the k-th color of the table, the following equation can be used:

d_ijk = sqrt((r_ij - r_k)^2 + (g_ij - g_k)^2 + (b_ij - b_k)^2)
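Translated directly into code, that equation might look like this (a minimal sketch; the function and argument names are illustrative):

```python
import math

def color_distance(pixel, palette_color):
    # Both arguments are (r, g, b) triples with values in the 0-255 range.
    r_ij, g_ij, b_ij = pixel
    r_k, g_k, b_k = palette_color
    return math.sqrt((r_ij - r_k) ** 2 + (g_ij - g_k) ** 2 + (b_ij - b_k) ** 2)
```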

In the next step, for each pixel, the closest color in the table is chosen. This is the concept used to compress images with indexed colors (except that here the palette is the same for all images, and is not calculated per image to minimize the difference between the original and the indexed image). Now, as @jairoar pointed out, we can obtain the histogram of the image (not to be confused with an RGB histogram or an intensity histogram) and determine the color that is repeated the most.
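The indexing-plus-histogram steps above can be sketched like this, assuming the palette is given as a (K, 3) array (a sketch, not the answer's original code; names are illustrative):

```python
import numpy as np

def dominant_color_index(image, palette):
    """image: (H, W, 3) array of 0-255 colors; palette: (K, 3) array of
    reference colors. Returns the index of the palette color that occurs
    most often after nearest-color indexing."""
    pixels = image.reshape(-1, 3).astype(np.float64)                   # (H*W, 3)
    # Squared Euclidean distance from every pixel to every palette color
    # (the square root is monotonic, so it can be skipped for argmin).
    d = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)  # (H*W, K)
    nearest = d.argmin(axis=1)                # closest palette color per pixel
    counts = np.bincount(nearest, minlength=len(palette))  # palette histogram
    return int(counts.argmax())
```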
To show the results of these steps, I used a random crop of this artwork of mine:


This is what the image looks like before and after indexing (left: original, right: indexed). And these are the most repeated colors (left: indexed, right: dominant color):

But since you said the number of images is large, you should know that these calculations are relatively time-consuming. The good news is that there are ways to increase performance. For example, instead of using the Euclidean distance (formula above) on the original images, you can use the City Block or Chebyshev distance. You can also calculate the distance for only a fraction of the pixels instead of calculating it for all the pixels in an image. For this purpose, you can first scale the image down to a much smaller size (for example, 32 by 32) and perform the calculations on the pixels of this reduced image. If you decide to resize images, don't bother with bilinear or bicubic interpolation; it isn't worth the extra computation. Instead, go for nearest neighbor, which actually performs a rectangular-lattice sampling of the original image.
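Both speed-ups can be sketched in a few lines (assumed helper names; the lattice downscale here is plain array striding rather than a call into an image library):

```python
import numpy as np

# City Block (L1) and Chebyshev (L-infinity) distances avoid the square root
# and the squaring of the Euclidean formula entirely.
def city_block(p, q):
    return int(np.abs(p - q).sum())

def chebyshev(p, q):
    return int(np.abs(p - q).max())

# Nearest-neighbor downscaling amounts to sampling a rectangular lattice of
# the original pixels; no interpolation is performed.
def lattice_downscale(image, size=32):
    h, w = image.shape[:2]
    rows = np.linspace(0, h - 1, size).astype(int)
    cols = np.linspace(0, w - 1, size).astype(int)
    return image[np.ix_(rows, cols)]
```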

Although the above changes will greatly increase the calculation speed, there is no free lunch: it is a trade-off between performance and accuracy. For example, in the two previous images, we saw that the image that was initially classified as orange (color code 20) was classified as pink (color code 26) after resizing.
To determine the parameters of the algorithm (distance measure, reduced image size, and scaling algorithm), you must first perform the classification on a large number of images with the highest possible accuracy and keep the results as ground truth. Then, through several experiments, obtain a combination of parameters that does not make the classification error exceed a maximum tolerable value.
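The measurement driving those experiments is just the fraction of disagreements with the ground truth; a minimal sketch (the function name is illustrative):

```python
import numpy as np

def classification_error(predicted, ground_truth):
    """Fraction of images whose assigned color label disagrees with the
    ground-truth label; compare this against the maximum tolerable value."""
    predicted = np.asarray(predicted)
    ground_truth = np.asarray(ground_truth)
    return float((predicted != ground_truth).mean())
```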

@saastn's fantastic answer assumes you have a set of pre-defined colors that you want to sort your images into. It is easier to implement if you just want to classify each image into one color out of some set of X equidistant colors (a histogram).

To sum up, round the color of each pixel in the image to the nearest color in a set of equidistant color bins. This reduces the precision of your colors down to whatever number of colors you desire. Then count all the colors in the image and select the most common one as that image's classification.

Here is my implementation of this in Python:

import cv2
import numpy as np

#Set this to the number of colors that you want to classify the images to
number_of_colors = 8

#Verify that the number of colors chosen is between the minimum possible and maximum possible for an RGB image.
assert 8 <= number_of_colors <= 16777216

#Get the cube root of the number of colors to determine how many bins to split each channel into.
#Rounding guards against floating-point error (e.g. 8 ** (1 / 3) evaluating to 1.999...);
#number_of_colors should be a perfect cube for the bins to come out exactly.
number_of_values_per_channel = round(number_of_colors ** ( 1 / 3 ))

#Dividing a channel value by this divisor maps the 0-255 range onto the bin indices
#0..(number_of_values_per_channel - 1), so that both 0 and 255 land exactly on a bin.
divisor = 255 / (number_of_values_per_channel - 1)

#Load the image and convert it to float32 for greater precision. cv2 loads the image in BGR (as opposed to RGB) order.
image = cv2.imread("image.png", cv2.IMREAD_COLOR).astype(np.float32)

#Divide each pixel by the divisor defined above, round to the nearest bin, then convert float32 back to uint8.
image = np.round(image / divisor).astype(np.uint8)

#Flatten the columns and rows into just one column per channel so that it will be easier to compare the columns across the channels.
image = image.reshape(-1, image.shape[2])

#Find and count matching rows (pixels), where each row consists of three values spread across three channels (Blue column, Green column, Red column).
uniques = np.unique(image, axis=0, return_counts=True)

#The first of the two arrays returned by np.unique is an array comprising all of the unique colors.
colors = uniques[0]

#The second of the two arrays returned by np.unique is an array comprising the counts of all of the unique colors.
color_counts = uniques[1]

#Get the index of the color with the greatest frequency
most_common_color_index = np.argmax(color_counts)

#Get the color that was the most common
most_common_color = colors[most_common_color_index]

#Multiply the channel values by the divisor to return the values to a range between 0 and 255
most_common_color = most_common_color * divisor

#If you want to name each color, you could also provide a list, sorted from lowest to highest BGR values, comprising
#the name of each possible color, and then use the binned color to retrieve the name.
print(most_common_color)
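For example, with number_of_colors = 8 there are two bins per channel, so the eight possible (B, G, R) bin triples enumerate naturally to names. The naming list below is one possible choice, not part of the code above:

```python
#With 2 bins per channel, a (B, G, R) triple of bin indices in {0, 1} enumerates
#as b * 4 + g * 2 + r; one illustrative naming for the 8 combinations:
color_names = [
    "black",    # (0, 0, 0)
    "red",      # (0, 0, 1)
    "green",    # (0, 1, 0)
    "yellow",   # (0, 1, 1)
    "blue",     # (1, 0, 0)
    "magenta",  # (1, 0, 1)
    "cyan",     # (1, 1, 0)
    "white",    # (1, 1, 1)
]

def name_of(binned_bgr):
    """binned_bgr: a (B, G, R) triple of bin indices, each 0 or 1 -- i.e. the
    binned color before it is multiplied back up by the divisor."""
    b, g, r = binned_bgr
    return color_names[b * 4 + g * 2 + r]
```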