Why does my code not segment the characters of the input image?

Question

I am trying to do character segmentation on an image, but the output does not get segmented.

This is my new_input image:

new input image

This is my code to segment the characters:

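import cv2
import matplotlib.pyplot as plt
from matplotlib import patches
from skimage.measure import regionprops
from skimage.transform import resize

word = cv2.imread('../input/word-image/word.PNG',0)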

fig, ax1 = plt.subplots(1)
ax1.imshow(word, cmap="gray")
# the next two lines are based on the assumption that the width of
# a character should be between 5% and 15% of the image width,
# and its height between 35% and 60% of the image height;
# this filters out regions outside those bounds
character_dimensions = (0.35*word.shape[0], 0.60*word.shape[0], 0.05*word.shape[1], 0.15*word.shape[1])
min_height, max_height, min_width, max_width = character_dimensions

characters = []
counter=0
column_list = []
for regions in regionprops(word):
    y0, x0, y1, x1 = regions.bbox
    region_height = y1 - y0
    region_width = x1 - x0

    if region_height > min_height and region_height < max_height and region_width > min_width and region_width < max_width:
        roi = word[y0:y1, x0:x1]

        # draw a red bordered rectangle over the character.
        rect_border = patches.Rectangle((x0, y0), x1 - x0, y1 - y0, edgecolor="red",
                                       linewidth=2, fill=False)
        ax1.add_patch(rect_border)

        # resize the characters to 20X20 and then append each character into the characters list
        resized_char = resize(roi, (20, 20))
        characters.append(resized_char)

        # this is just to keep track of the arrangement of the characters
        column_list.append(x0)
# print(characters)
plt.show()

This is the current output:

What could be the problem?

Answer 1

From the documentation on skimage.measure.regionprops:

Measure properties of labeled image regions.

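You are not passing a properly labeled image, but your input is interpreted as one: because of anti-aliasing it contains multiple gray values, and each distinct value is treated as a label. If you debug regions inside your loop, you will see that a lot of regions are detected. Because of your assumptions about width and height, all of them are discarded.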
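To make that concrete, here is a minimal sketch on a made-up 5x7 array (not the image from the question), assuming a white background, one dark stroke, and a gray fringe standing in for anti-aliasing. It shows how regionprops reads raw gray values as labels:

```python
import numpy as np
from skimage.measure import regionprops

# Synthetic stand-in for a grayscale scan (assumption, not the real image):
# white background (255), dark stroke (0), gray anti-aliased fringe (128).
img = np.full((5, 7), 255, dtype=np.uint8)
img[1:4, 2] = 128   # fringe left of the stroke
img[1:4, 3] = 0     # stroke core
img[1:4, 4] = 128   # fringe right of the stroke

# Every distinct non-zero value is taken as one label; value 0 is ignored.
regions = regionprops(img)
labels_seen = [r.label for r in regions]
bboxes = {r.label: r.bbox for r in regions}

print(labels_seen)  # the gray values that were treated as labels
print(bboxes)       # bounding box (y0, x0, y1, x1) per "label"
```

In this toy case the white background becomes one region spanning the whole image, the two fringe columns are merged into a single region because they share a gray value, and the stroke itself (value 0) is not reported at all. With boxes like these, a filter on width and height can easily reject everything.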

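So the first step is to generate a proper labeled image, e.g. using cv2.connectedComponents. For that, you need to (inverse) binarize your input image beforehand. With the labels image, you can proceed directly with your loop. Still, I would drop all assumptions about width and height here.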

That's the modified code:

import cv2
import matplotlib.pyplot as plt
from matplotlib import patches
from skimage.measure import regionprops
from skimage.transform import resize

# Read image as grayscale
word = cv2.imread('pFLpN.png', cv2.IMREAD_GRAYSCALE)

# Inverse binarize image, and find connected components
thr = cv2.threshold(word, 254, 255, cv2.THRESH_BINARY_INV)[1]
labels = cv2.connectedComponents(thr)[1]

# Maybe leave out any assumptions on the width and height...

# Prepare outputs
plt.figure(figsize=(18, 9))
plt.subplot(2, 2, 1), plt.imshow(word, cmap='gray'), plt.title('Original image')
plt.subplot(2, 2, 2), plt.imshow(thr, cmap='gray'), plt.title('Binarized image')
plt.subplot(2, 2, 3), plt.imshow(labels), plt.title('Connected components')
# Note: this assigns a tuple (Axes, AxesImage), hence ax[0] in the loop below
ax = plt.subplot(2, 2, 4), plt.imshow(word, cmap='gray')

# Iterate found connected components as before
# (Without checking width and height...)
characters = []
counter = 0
column_list = []
for regions in regionprops(labels):
    y0, x0, y1, x1 = regions.bbox
    region_height = y1 - y0
    region_width = x1 - x0
    roi = word[y0:y1, x0:x1]
    rect_border = patches.Rectangle((x0, y0), x1 - x0, y1 - y0, edgecolor='red',
                                    linewidth=2, fill=False)
    ax[0].add_patch(rect_border)

    resized_char = resize(roi, (20, 20))
    characters.append(resized_char)

    column_list.append(x0)

plt.title('Segmented characters')
plt.tight_layout(), plt.show()

And this is the resulting output:

----------------------------------------
System information
----------------------------------------
Platform:      Windows-10-10.0.19041-SP0
Python:        3.9.1
PyCharm:       2021.1.2
Matplotlib:    3.4.2
OpenCV:        4.5.2
scikit-image:  0.18.1
----------------------------------------
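One follow-up on column_list, which the loop fills "to keep track of the arrangement of the characters": regions come back in label order, and that order is not guaranteed to be left to right (for example, a taller character further right may be labeled first). A minimal sketch of reordering by x0, using hypothetical stand-in data instead of the real lists:

```python
import numpy as np

# Hypothetical stand-ins for the lists built in the loop:
# three 20x20 "characters" and the x0 of each bounding box.
characters = [np.full((20, 20), v, dtype=float) for v in (0.3, 0.1, 0.2)]
column_list = [57, 4, 30]

# Indices that sort the characters by their left edge
order = np.argsort(column_list)
ordered_characters = [characters[i] for i in order]
ordered_columns = [column_list[i] for i in order]

print(ordered_columns)
```

This assumes a single line of text; for several lines, the y0 coordinate would have to be taken into account as well.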

python

ocr

image-processing

image-segmentation

deep-learning