Remove noisy lines from an image

Question

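My images contain some random lines, like the following one:

I want to apply some preprocessing to them to remove the unwanted noise (the lines that distort the writing) so that I can use them with OCR (Tesseract).
The idea that came to my mind is to use dilation to remove the noise and then, in a second step, erosion to fix the missing parts of the writing.
For that, I used this code: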

import cv2
import numpy as np

img = cv2.imread('linee.png', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((5, 5), np.uint8)
img = cv2.dilate(img, kernel, iterations=1)
img = cv2.erode(img, kernel, iterations=1)
cv2.imwrite('delatedtest.png', img)

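Unfortunately, the dilation didn't work well: the noise lines are still there.

I tried changing the kernel shape, but it got worse: the writing was partially or completely deleted.
I also found an answer saying that it is possible to remove the lines by

turning all black pixels with two or less adjacent black pixels to white.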

That seems a bit complicated to me, since I am a beginner in computer vision and OpenCV.
Any help would be appreciated, thank you.
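
For reference, the quoted rule can be sketched with NumPy alone. This is a minimal sketch and not the code from the linked answer: the function name remove_sparse_black_pixels, the threshold of 128 used to decide what counts as black, and the use of all 8 surrounding pixels as "adjacent" are assumptions.

```python
import numpy as np

def remove_sparse_black_pixels(gray, threshold=128, max_neighbors=2):
    """Turn black pixels with at most `max_neighbors` black neighbours white.

    `gray` is a 2-D uint8 image with dark writing on a light background.
    The threshold and the 8-neighbourhood are assumptions of this sketch.
    """
    black = gray < threshold
    h, w = black.shape
    # Pad with "white" so that border pixels simply have fewer neighbours.
    padded = np.pad(black, 1, mode="constant", constant_values=False).astype(np.uint8)

    # Count black pixels among the 8 neighbours of every pixel.
    neighbors = np.zeros((h, w), dtype=np.uint8)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbors += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]

    out = gray.copy()
    out[black & (neighbors <= max_neighbors)] = 255
    return out
```

On a one-pixel-wide line every pixel has at most two black neighbours, so the line disappears, while strokes two or more pixels thick mostly survive. Thicker noise lines, and line pixels that touch the writing, would not be removed by this rule.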

Answer 1

Detecting lines like these is what the path opening was invented for. DIPlib has an implementation (disclosure: I implemented it there). As an alternative, you can try using the implementation by the authors of the paper that I linked above. That implementation does not have the "constrained" mode that I use below.

Here is a quick demo of how to use it:

import diplib as dip
import matplotlib.pyplot as pp

img = 1 - pp.imread('/home/cris/tmp/DWRTF.png')
lines = dip.PathOpening(img, length=300, mode={'constrained'})

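Here we first invert the image, because that makes the later steps easier. If you do not invert, use a path closing instead. The lines image:

Next we subtract the lines. A small area opening removes the few isolated line pixels that were filtered out by the path opening: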

text = img - lines
text = dip.AreaOpening(text, filterSize=5)

However, we have now made gaps in the text. Filling these in is not trivial. Here is a quick-and-dirty attempt, which you can use as a starting point:

lines = lines > 0.5
text = text > 0.5
lines -= dip.BinaryPropagation(text, lines, connectivity=-1, iterations=3)
img[lines] = 0

Answer 2

You can do that using createLineSegmentDetector(), a function from OpenCV:

import cv2

#Read gray image
img = cv2.imread("lines.png",0)

#Create default parametrization LSD
lsd = cv2.createLineSegmentDetector(0)

#Detect lines in the image
lines = lsd.detect(img)[0] #Position 0 of the returned tuple are the detected lines

#Draw the detected lines
drawn_img = lsd.drawSegments(img,lines)

#Save the image with the detected lines
cv2.imwrite('lsdsaved.png', drawn_img)

The next part of the code removes only the segments that span more than 50 pixels horizontally or vertically:

for element in lines:

  #If the segment spans more than 50 pixels horizontally or vertically, draw a white line over it
  if (abs(int(element[0][0]) - int(element[0][2])) > 50 or abs(int(element[0][1]) - int(element[0][3])) > 50): 

    #Draw the white line
    cv2.line(img, (int(element[0][0]), int(element[0][1])), (int(element[0][2]), int(element[0][3])), (255, 255, 255), 12)

#Save the final image
cv2.imwrite('removedzz.png', img)

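Well, it did not work perfectly with the current image, but it may give better results with different images. You can adjust the length of the lines to remove and the thickness of the white lines drawn in place of the removed lines.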
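
Note that the test in the loop above compares the horizontal and vertical extents separately, so a diagonal segment from (0, 0) to (40, 40) is skipped even though it is about 56.6 pixels long. A possible variant, sketched here with NumPy, filters on the Euclidean length instead. The helper name select_long_segments is hypothetical, and the sketch assumes lines has the (N, 1, 4) layout of x1, y1, x2, y2 values that lsd.detect() returned above.

```python
import numpy as np

def select_long_segments(lines, min_length=50.0):
    """Return the segments longer than `min_length`, plus all lengths.

    `lines` is expected in the (N, 1, 4) layout [x1, y1, x2, y2];
    None (no detections) yields empty results.
    """
    if lines is None:
        return np.empty((0, 4)), np.empty((0,))
    segs = np.asarray(lines, dtype=np.float64).reshape(-1, 4)
    # Euclidean length of each segment.
    lengths = np.hypot(segs[:, 2] - segs[:, 0], segs[:, 3] - segs[:, 1])
    return segs[lengths > min_length], lengths
```

Each row of the first return value can then be passed to cv2.line() in the same way as in the loop above, after converting the coordinates to int.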
I hope it helps.

Tags: python, opencv, image-processing, noise-reduction