How to read these captchas in Python?

Question

I have the following problem. I want to read captchas like these in Python:

The best code I have come up with so far is below, but it cannot solve all of these captchas:

import pytesseract
import cv2
import numpy as np
import re

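def odstran_sum(img, threshold):
    """Remove noise: keep only connected components larger than `threshold` pixels."""
    filtered_img = np.zeros_like(img)
    # connectedComponentsWithStats returns (count, labels, stats, centroids)
    labels, stats = cv2.connectedComponentsWithStats(img.astype(np.uint8), connectivity=8)[1:3]
    # Row 0 of stats is the background component, so skip it
    label_areas = stats[1:, cv2.CC_STAT_AREA]
    for i, label_area in enumerate(label_areas):
        if label_area > threshold:
            filtered_img[labels == i + 1] = 1
    return filtered_img


def preprocess(img_path):
    """Convert the image to a cleaned-up binary image."""
    img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
    blur = cv2.GaussianBlur(img, (3, 3), 0)
    # Inverted threshold: dark text becomes white (255) on a black background
    thresh = cv2.threshold(blur, 150, 255, cv2.THRESH_BINARY_INV)[1]
    # Drop small specks, then flip back to black text on a white background
    filtered_img = 255 - odstran_sum(thresh, 20) * 255
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    # Eroding the white background thickens the black strokes
    erosion = cv2.erode(filtered_img, kernel, iterations=1)
    return erosion


def captcha_to_string(obrazek):
    """Return the text read from the captcha image."""
    text = pytesseract.image_to_string(obrazek)
    # Replace any non-ASCII output with a space
    return re.sub(r'[^\x00-\x7F]+', ' ', text).strip()


# CAPTCHA_NAME must be set to the path of the captcha image file
img = preprocess(CAPTCHA_NAME)
text = captcha_to_string(img)
print(text)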

Is it possible to improve my code so that it reads all five of these examples? Thank you very much.

Answer 1

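I don't think there is much to improve here, short of writing your own image-recognition neural network trained on similar captchas. Captchas are designed to be hard for computers to decode, so I don't think you will get perfect results.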
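
If you want to squeeze a little more out of Tesseract before going the neural-network route, one thing that may be worth trying is telling it that the image is a single line of text and restricting the characters it may output. This is only a minimal sketch: it assumes the captchas contain nothing but ASCII letters and digits (the example images are not shown here), `build_tesseract_config` and `read_captcha` are hypothetical helper names, and whether the whitelist is honored depends on the Tesseract version and engine in use. It is unlikely to make every example readable.

```python
import string


def build_tesseract_config(psm=7, whitelist=string.ascii_letters + string.digits):
    """Build a Tesseract config string (hypothetical helper).

    --psm 7 asks Tesseract to treat the image as a single text line;
    tessedit_char_whitelist limits the characters it may return.
    The default whitelist is an assumption about the captcha alphabet.
    """
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


def read_captcha(image, config=None):
    """Run pytesseract on an already preprocessed image (hypothetical helper)."""
    import pytesseract  # imported here so the config helper works without it

    if config is None:
        config = build_tesseract_config()
    return pytesseract.image_to_string(image, config=config).strip()
```

With the code from the question, this would be called as `read_captcha(preprocess(CAPTCHA_NAME))` in place of `captcha_to_string`.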

python

ocr

opencv

text-recognition