InvalidArgumentError: Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [4], [batch]: [5] [Op:IteratorGetNext]
Task:
Training a Keras CAPTCHA OCR model
Problem:
I am trying to print CAPTCHAs from my validation set, but doing so raises the following error:
InvalidArgumentError Traceback (most recent call last)
<ipython-input-6-df1fce607804> in <module>()
1
2 #_, ax = plt.subplots(1, 4, figsize=(10, 5))
----> 3 for batch in validation_dataset.take(1):
4 images = batch["image"]
5 labels = batch["label"]
3 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
7105 def raise_from_not_ok_status(e, name):
7106 e.message += (" name: " + name if name is not None else "")
-> 7107 raise core._status_to_exception(e) from None # pylint: disable=protected-access
7108
7109
InvalidArgumentError: Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [4], [batch]: [5] [Op:IteratorGetNext]
This is the code I tried for printing the output:
#_, ax = plt.subplots(1, 4, figsize=(10, 5))
for batch in validation_dataset.take(1):
    images = batch["image"]
    labels = batch["label"]
    for i in range(batch_size):
        img = (images[i] * 255).numpy().astype("uint8")
        label = tf.strings.reduce_join(num_to_char(labels[i])).numpy().decode("utf-8")
        plt.title(label)
        plt.imshow(img[:, :, 0].T, cmap="gray")
        plt.show()
For this task I tried setting the batch size to 1, which works, but I want to train my model with a larger batch size.
My batch size is 16.
# Mapping integers back to original characters
num_to_char = layers.StringLookup(
    vocabulary=char_to_num.get_vocabulary(), mask_token=None, invert=True
)
This code, taken from the TensorFlow dataset documentation, converts the data into tf.data datasets.
Creating the dataset objects:
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = (
    train_dataset.map(
        encode_single_sample, num_parallel_calls=tf.data.AUTOTUNE
    )
    .batch(batch_size)
    .prefetch(buffer_size=tf.data.AUTOTUNE).repeat(10)
)
validation_dataset = tf.data.Dataset.from_tensor_slices((x_valid, y_valid))
validation_dataset = (
    validation_dataset.map(
        encode_single_sample, num_parallel_calls=tf.data.AUTOTUNE
    )
    .batch(batch_size)
    .prefetch(buffer_size=tf.data.AUTOTUNE)
)
This code reads the images and preprocesses them so that all images have a uniform shape.
The function that encodes a single sample:
def encode_single_sample(img_path, label):
    # 1. Read image
    img = tf.io.read_file(img_path)
    # 2. Decode and convert to grayscale
    img = tf.io.decode_png(img, channels=3)
    # 3. Convert to float32 in [0, 1] range
    img = tf.image.convert_image_dtype(img, tf.float32)
    # 4. Resize to the desired size
    img = tf.image.resize(img, [img_height, img_width])
    # 5. Transpose the image because we want the time
    # dimension to correspond to the width of the image.
    img = tf.transpose(img, perm=[1, 0, 2])
    # 6. Map the characters in label to numbers
    label = char_to_num(tf.strings.unicode_split(label, input_encoding="UTF-8"))
    # 7. Return a dict as our model is expecting two inputs
    return {"image": img, "label": label}
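Edit:
About the data:
This is a sample from the data. It is similar to the Keras OCR example dataset. Although the image sizes differ, the CAPTCHA patterns vary very little. Each CAPTCHA is a word of length 5 with one or two digits in the middle. The words are always uppercase.
Where am I going wrong?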
Here is a complete working example, run in Google Colab on your dataset:
!pip install unrar
!unrar x /content/captcha_caps.rar
import os
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path
from collections import Counter
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
data_dir = Path("/content/captcha_caps/")
images = sorted(list(map(str, list(data_dir.glob("*.PNG")))))
labels = [img.split(os.path.sep)[-1].split(".PNG")[0] for img in images]
characters = set(char for label in labels for char in label)
print("Number of images found: ", len(images))
print("Number of labels found: ", len(labels))
print("Number of unique characters: ", len(characters))
print("Characters present: ", characters)
batch_size = 16
img_width = 200
img_height = 50
downsample_factor = 4
max_length = max([len(label) for label in labels])
char_to_num = layers.StringLookup(
    vocabulary=list(characters), mask_token=None
)
num_to_char = layers.StringLookup(
    vocabulary=char_to_num.get_vocabulary(), mask_token=None, invert=True
)
def split_data(images, labels, train_size=0.9, shuffle=True):
    size = len(images)
    indices = np.arange(size)
    if shuffle:
        np.random.shuffle(indices)
    train_samples = int(size * train_size)
    x_train, y_train = images[indices[:train_samples]], labels[indices[:train_samples]]
    x_valid, y_valid = images[indices[train_samples:]], labels[indices[train_samples:]]
    return x_train, x_valid, y_train, y_valid
x_train, x_valid, y_train, y_valid = split_data(np.array(images), np.array(labels))
def encode_single_sample(img_path, label):
    img = tf.io.read_file(img_path)
    img = tf.io.decode_png(img, channels=1)
    img = tf.image.convert_image_dtype(img, tf.float32)
    img = tf.image.resize(img, [img_height, img_width])
    img = tf.transpose(img, perm=[1, 0, 2])
    label = char_to_num(tf.strings.unicode_split(label, input_encoding="UTF-8"))
    return {"image": img, "label": label}
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = (
    train_dataset.map(
        encode_single_sample, num_parallel_calls=tf.data.AUTOTUNE
    )
    .batch(batch_size)
    .prefetch(buffer_size=tf.data.AUTOTUNE).repeat(10)
)
validation_dataset = tf.data.Dataset.from_tensor_slices((x_valid, y_valid))
validation_dataset = (
    validation_dataset.map(
        encode_single_sample, num_parallel_calls=tf.data.AUTOTUNE
    )
    .batch(batch_size)
    .prefetch(buffer_size=tf.data.AUTOTUNE)
)
_, ax = plt.subplots(4, 4, figsize=(10, 5))
for batch in validation_dataset.take(1):
    images = batch["image"]
    labels = batch["label"]
    for i in range(16):
        img = (images[i] * 255).numpy().astype("uint8")
        label = tf.strings.reduce_join(num_to_char(labels[i])).numpy().decode("utf-8")
        ax[i // 4, i % 4].imshow(img[:, :, 0].T, cmap="gray")
        ax[i // 4, i % 4].set_title(label)
        ax[i // 4, i % 4].axis("off")
plt.show()
Requirement already satisfied: unrar in /usr/local/lib/python3.7/dist-packages (0.4)
UNRAR 5.50 freeware Copyright (c) 1993-2017 Alexander Roshal
Extracting from /content/captcha_caps.rar
Creating captcha caps OK
Extracting captcha caps/24VCZ.PNG OK
Extracting captcha caps/26SGX.PNG OK
Extracting captcha caps/2HC5E.PNG OK
Extracting captcha caps/2NDXL.PNG OK
Extracting captcha caps/2NUEH.PNG OK
Extracting captcha caps/2QX4B.PNG OK
Extracting captcha caps/2V78Y.PNG OK
Extracting captcha caps/2Z45Y.PNG OK
Extracting captcha caps/2Z9R2.PNG OK
Extracting captcha caps/32HZA.PNG OK
Extracting captcha caps/38JKT.PNG OK
Extracting captcha caps/39EZ4.PNG OK
Extracting captcha caps/3GJ85.PNG OK
Extracting captcha caps/3R2JE.PNG OK
Extracting captcha caps/3RU4C.PNG OK
Extracting captcha caps/3TPFA.PNG OK
Extracting captcha caps/3TVAC.PNG OK
Extracting captcha caps/44U8C.PNG OK
Extracting captcha caps/452LV.PNG OK
Extracting captcha caps/4E4P8.PNG OK
Extracting captcha caps/4E5HX.PNG OK
Extracting captcha caps/4FVS7.PNG OK
Extracting captcha caps/4GJCC.PNG OK
Extracting captcha caps/4QQJD.PNG OK
Extracting captcha caps/4TH2K.PNG OK
Extracting captcha caps/4TN2L.PNG OK
Extracting captcha caps/4YBT5.PNG OK
Extracting captcha caps/4ZLHE.PNG OK
Extracting captcha caps/556F5.PNG OK
Extracting captcha caps/55DT5.PNG OK
Extracting captcha caps/5CEZD.PNG OK
Extracting captcha caps/5CQ39.PNG OK
Extracting captcha caps/5FZUR.PNG OK
Extracting captcha caps/5H7F4.PNG OK
Extracting captcha caps/5K4TY.PNG OK
Extracting captcha caps/5N2KC.PNG OK
Extracting captcha caps/5P6B4.PNG OK
Extracting captcha caps/5R728.PNG OK
Extracting captcha caps/5S9E7.PNG OK
Extracting captcha caps/5VRRV.PNG OK
Extracting captcha caps/5VZHL.PNG OK
Extracting captcha caps/5YVYG.PNG OK
Extracting captcha caps/63P4N.PNG OK
Extracting captcha caps/65DQ7.PNG OK
Extracting captcha caps/66JUU.PNG OK
Extracting captcha caps/69ZQ3.PNG OK
Extracting captcha caps/6B655.PNG OK
Extracting captcha caps/6GBFG.PNG OK
Extracting captcha caps/6K27H.PNG OK
Extracting captcha caps/6R7G5.PNG OK
Extracting captcha caps/6VFYG.PNG OK
Extracting captcha caps/6X8AJ.PNG OK
Extracting captcha caps/6ZNJP.PNG OK
Extracting captcha caps/73ZK2.PNG OK
Extracting captcha caps/74FPR.PNG OK
Extracting captcha caps/7C46N.PNG OK
Extracting captcha caps/7C48B.PNG OK
Extracting captcha caps/7JVBT.PNG OK
Extracting captcha caps/7NVS8.PNG OK
Extracting captcha caps/7REZP.PNG OK
Extracting captcha caps/7RHSQ.PNG OK
Extracting captcha caps/7RTT2.PNG OK
Extracting captcha caps/7VV9J.PNG OK
Extracting captcha caps/82JNK.PNG OK
Extracting captcha caps/83JKQ.PNG OK
Extracting captcha caps/89RGK.PNG OK
Extracting captcha caps/8A2D7.PNG OK
Extracting captcha caps/8ENGQ.PNG OK
Extracting captcha caps/8K5KS.PNG OK
Extracting captcha caps/95BDX.PNG OK
Extracting captcha caps/963D9.PNG OK
Extracting captcha caps/9878H.PNG OK
Extracting captcha caps/99G9R.PNG OK
Extracting captcha caps/99RJ8.PNG OK
Extracting captcha caps/9CKGT.PNG OK
Extracting captcha caps/9DK36.PNG OK
Extracting captcha caps/9E3FU.PNG OK
Extracting captcha caps/9EZCJ.PNG OK
Extracting captcha caps/9HS3T.PNG OK
Extracting captcha caps/9J59G.PNG OK
Extracting captcha caps/9JXEJ.PNG OK
Extracting captcha caps/9TBBF.PNG OK
Extracting captcha caps/9TYDP.PNG OK
Extracting captcha caps/9YEY2.PNG OK
Extracting captcha caps/A6TC6.PNG OK
Extracting captcha caps/ADB8Y.PNG OK
Extracting captcha caps/AERBR.PNG OK
Extracting captcha caps/AG43G.PNG OK
Extracting captcha caps/ALX5Q.PNG OK
Extracting captcha caps/AP6EJ.PNG OK
Extracting captcha caps/AUFH4.PNG OK
Extracting captcha caps/AVAYP.PNG OK
Extracting captcha caps/AX2QR.PNG OK
Extracting captcha caps/AZS3U.PNG OK
Extracting captcha caps/B6ZYP.PNG OK
Extracting captcha caps/B8YTF.PNG OK
Extracting captcha caps/BEGC2.PNG OK
Extracting captcha caps/BQFXZ.PNG OK
Extracting captcha caps/BQSB2.PNG OK
Extracting captcha caps/BT5CN.PNG OK
Extracting captcha caps/BYJL9.PNG OK
Extracting captcha caps/BZYB7.PNG OK
Extracting captcha caps/C2EFS.PNG OK
Extracting captcha caps/C3T9L.PNG OK
Extracting captcha caps/C8C26.PNG OK
Extracting captcha caps/CACQC.PNG OK
Extracting captcha caps/CBXJY.PNG OK
Extracting captcha caps/CE6S8.PNG OK
Extracting captcha caps/CEFCR.PNG OK
Extracting captcha caps/CEPQV.PNG OK
Extracting captcha caps/CF3V8.PNG OK
Extracting captcha caps/CFR3R.PNG OK
Extracting captcha caps/CKEQK.PNG OK
Extracting captcha caps/CUD8R.PNG OK
Extracting captcha caps/D2ZSU.PNG OK
Extracting captcha caps/D56EX.PNG OK
Extracting captcha caps/DBAAX.PNG OK
Extracting captcha caps/DC2AV.PNG OK
Extracting captcha caps/DDZRZ.PNG OK
Extracting captcha caps/DF266.PNG OK
Extracting captcha caps/DGLYX.PNG OK
Extracting captcha caps/DNQ8C.PNG OK
Extracting captcha caps/DPQCC.PNG OK
Extracting captcha caps/DUU3R.PNG OK
Extracting captcha caps/DY935.PNG OK
Extracting captcha caps/DYE9U.PNG OK
Extracting captcha caps/E6RVE.PNG OK
Extracting captcha caps/E7B47.PNG OK
Extracting captcha caps/EB975.PNG OK
Extracting captcha caps/EHQVT.PNG OK
Extracting captcha caps/EJB7K.PNG OK
Extracting captcha caps/EJEUJ.PNG OK
Extracting captcha caps/EN3SG.PNG OK
Extracting captcha caps/EQP2Q.PNG OK
Extracting captcha caps/ESLUT.PNG OK
Extracting captcha caps/ET497.PNG OK
Extracting captcha caps/F2GTJ.PNG OK
Extracting captcha caps/F32UK.PNG OK
Extracting captcha caps/F8B56.PNG OK
Extracting captcha caps/FEQRA.PNG OK
Extracting captcha caps/FF5AZ.png OK
Extracting captcha caps/FGBBV.PNG OK
Extracting captcha caps/FN4XQ.PNG OK
Extracting captcha caps/FUHZJ.PNG OK
Extracting captcha caps/FZBZB.PNG OK
Extracting captcha caps/G3C7R.PNG OK
Extracting captcha caps/G3H2V.PNG OK
Extracting captcha caps/G5A9V.PNG OK
Extracting captcha caps/G9CLN.PNG OK
Extracting captcha caps/GAGZG.PNG OK
Extracting captcha caps/GCRFA.PNG OK
Extracting captcha caps/GF59Q.PNG OK
Extracting captcha caps/GGFJH.PNG OK
Extracting captcha caps/GHNPE.PNG OK
Extracting captcha caps/GKGQ5.PNG OK
Extracting captcha caps/GLNPR.PNG OK
Extracting captcha caps/GRKU3.PNG OK
Extracting captcha caps/GXGEA.PNG OK
Extracting captcha caps/H2JTQ.PNG OK
Extracting captcha caps/HAF8J.PNG OK
Extracting captcha caps/HDK86.PNG OK
Extracting captcha caps/HLG6G.PNG OK
Extracting captcha caps/HQEFG.PNG OK
Extracting captcha caps/HUG3K.PNG OK
Extracting captcha caps/HZBJ9.PNG OK
Extracting captcha caps/J38T7.PNG OK
Extracting captcha caps/J3DG2.PNG OK
Extracting captcha caps/J5PE7.PNG OK
Extracting captcha caps/J72PD.PNG OK
Extracting captcha caps/J8E5Z.PNG OK
Extracting captcha caps/JC5YF.PNG OK
Extracting captcha caps/JFEJ5.PNG OK
Extracting captcha caps/JGXV6.PNG OK
Extracting captcha caps/K2K9D.PNG OK
Extracting captcha caps/K4GBD.PNG OK
Extracting captcha caps/K5QCD.PNG OK
Extracting captcha caps/K6U7U.PNG OK
Extracting captcha caps/KCXAC.PNG OK
Extracting captcha caps/KFBTD.PNG OK
Extracting captcha caps/KG4HH.PNG OK
Extracting captcha caps/KHZE4.PNG OK
Extracting captcha caps/KSJ4A.PNG OK
Extracting captcha caps/KZ62P.PNG OK
Extracting captcha caps/L345J.PNG OK
Extracting captcha caps/L36HP.PNG OK
Extracting captcha caps/L37HK.PNG OK
Extracting captcha caps/L9EG2.PNG OK
Extracting captcha caps/L9H4D.PNG OK
Extracting captcha caps/LA4P7.PNG OK
Extracting captcha caps/LBNXN.PNG OK
Extracting captcha caps/LJ6PC.PNG OK
Extracting captcha caps/LJ7DT.PNG OK
Extracting captcha caps/LVRKL.PNG OK
Extracting captcha caps/LXFBT.PNG OK
Extracting captcha caps/N52PE.PNG OK
Extracting captcha caps/N8ZFS.PNG OK
Extracting captcha caps/NBHG7.PNG OK
Extracting captcha caps/NGGQ.PNG OK
Extracting captcha caps/NHRGB.PNG OK
Extracting captcha caps/NKJX4.PNG OK
Extracting captcha caps/NPPPD.PNG OK
Extracting captcha caps/NU4KH.PNG OK
Extracting captcha caps/NX7BX.PNG OK
Extracting captcha caps/P2K69.PNG OK
Extracting captcha caps/PFHLR.PNG OK
Extracting captcha caps/PTFPU.PNG OK
Extracting captcha caps/PV6YU.PNG OK
Extracting captcha caps/Q4KXA.PNG OK
Extracting captcha caps/Q6L72.PNG OK
Extracting captcha caps/Q7HNQ.PNG OK
Extracting captcha caps/Q8A2V.PNG OK
Extracting captcha caps/QC7TD.PNG OK
Extracting captcha caps/QEYHL.PNG OK
Extracting captcha caps/QHHKX.PNG OK
Extracting captcha caps/QRRE2.PNG OK
Extracting captcha caps/QU3JN.PNG OK
Extracting captcha caps/QURPC.PNG OK
Extracting captcha caps/QZR3Z.PNG OK
Extracting captcha caps/R27UF.PNG OK
Extracting captcha caps/R38UE.PNG OK
Extracting captcha caps/R3UVK.PNG OK
Extracting captcha caps/RDE9N.PNG OK
Extracting captcha caps/RJUNG.PNG OK
Extracting captcha caps/RJYRY.PNG OK
Extracting captcha caps/RLKT2.PNG OK
Extracting captcha caps/RLL8T.PNG OK
Extracting captcha caps/RLTV6.PNG OK
Extracting captcha caps/RQ2QT.PNG OK
Extracting captcha caps/RQZST.PNG OK
Extracting captcha caps/RU5HH.PNG OK
Extracting captcha caps/RZFFE.PNG OK
Extracting captcha caps/S2Y25.PNG OK
Extracting captcha caps/SG9EL.PNG OK
Extracting captcha caps/SHRTS.PNG OK
Extracting captcha caps/SJ6KJ.PNG OK
Extracting captcha caps/SPHR4.PNG OK
Extracting captcha caps/SQ7S2.PNG OK
Extracting captcha caps/SQDUP.PNG OK
Extracting captcha caps/SVYGE.PNG OK
Extracting captcha caps/T3HA6.PNG OK
Extracting captcha caps/T49RZ.PNG OK
Extracting captcha caps/TJ6DE.PNG OK
Extracting captcha caps/TNVRU.PNG OK
Extracting captcha caps/TR9L8.PNG OK
Extracting captcha caps/TTBYT.PNG OK
Extracting captcha caps/TUJZK.PNG OK
Extracting captcha caps/TVAKL.PNG OK
Extracting captcha caps/U554X.PNG OK
Extracting captcha caps/U7SKY.PNG OK
Extracting captcha caps/U8U4N.PNG OK
Extracting captcha caps/UDB3Z.PNG OK
Extracting captcha caps/UDD86.PNG OK
Extracting captcha caps/UDTQS.PNG OK
Extracting captcha caps/UHB6B.PNG OK
Extracting captcha caps/UKSL7.PNG OK
Extracting captcha caps/UP88V.PNG OK
Extracting captcha caps/URCF5.PNG OK
Extracting captcha caps/URKGY.PNG OK
Extracting captcha caps/URQ7D.PNG OK
Extracting captcha caps/UTYPV.PNG OK
Extracting captcha caps/UYY55.PNG OK
Extracting captcha caps/V99ZG.PNG OK
Extracting captcha caps/VB93G.PNG OK
Extracting captcha caps/VX43Z.PNG OK
Extracting captcha caps/VYLHE.PNG OK
Extracting captcha caps/X9PJE.PNG OK
Extracting captcha caps/XC93Q.PNG OK
Extracting captcha caps/XETJ6.PNG OK
Extracting captcha caps/XGHYS.PNG OK
Extracting captcha caps/XGSHP.PNG OK
Extracting captcha caps/XH7NL.PNG OK
Extracting captcha caps/XHYGL.PNG OK
Extracting captcha caps/XPQ6V.PNG OK
Extracting captcha caps/XTC2K.PNG OK
Extracting captcha caps/XVFC7.PNG OK
Extracting captcha caps/XY364.PNG OK
Extracting captcha caps/Y2KFQ.PNG OK
Extracting captcha caps/Y5HRR.PNG OK
Extracting captcha caps/Y9R52.PNG OK
Extracting captcha caps/YC6TY.PNG OK
Extracting captcha caps/YE6KN.PNG OK
Extracting captcha caps/YGNG.PNG OK
Extracting captcha caps/YJ433.PNG OK
Extracting captcha caps/YJPSF.PNG OK
Extracting captcha caps/YL647.PNG OK
Extracting captcha caps/YNDAP.PNG OK
Extracting captcha caps/YPJAT.PNG OK
Extracting captcha caps/YTZLF.PNG OK
Extracting captcha caps/YU63S.PNG OK
Extracting captcha caps/YVX5X.PNG OK
Extracting captcha caps/YXK3G.PNG OK
Extracting captcha caps/YYGQ2.PNG OK
Extracting captcha caps/Z3EQP.PNG OK
Extracting captcha caps/Z79JR.PNG OK
Extracting captcha caps/ZBJHX.PNG OK
Extracting captcha caps/ZCUJB.PNG OK
Extracting captcha caps/ZEEBR.PNG OK
Extracting captcha caps/ZENDD.PNG OK
Extracting captcha caps/ZQPTE.PNG OK
Extracting captcha caps/ZYHPX.PNG OK
Extracting captcha caps/ZYJ6X.PNG OK
All OK
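Two files in the extraction log above, NGGQ.PNG and YGNG.PNG, have four-character names, while the others shown have five. That matches the shapes in the error message ([4] versus [5]), so a label-length check may be a useful first diagnostic. The following is a minimal sketch using only the standard library; `sample_files` and `find_odd_length_labels` are illustrative names that are not part of the original code, and the list holds only a handful of names copied from the log.

```python
from collections import Counter

# A few file names copied from the extraction log above. In the notebook,
# pass the full `labels` list built from the image paths instead.
sample_files = [
    "24VCZ.PNG", "N8ZFS.PNG", "NBHG7.PNG", "NGGQ.PNG",
    "NHRGB.PNG", "YE6KN.PNG", "YGNG.PNG", "YJ433.PNG",
]


def find_odd_length_labels(labels):
    """Return the most common label length and the labels that differ from it."""
    length_counts = Counter(len(label) for label in labels)
    expected_length = length_counts.most_common(1)[0][0]
    odd_labels = [label for label in labels if len(label) != expected_length]
    return expected_length, odd_labels


# Strip the extension to get the label, as the notebook does.
sample_labels = [name.rsplit(".", 1)[0] for name in sample_files]
expected_length, odd_labels = find_odd_length_labels(sample_labels)

print("Most common label length:", expected_length)
print("Labels with a different length:", odd_labels)
```

If the second list is non-empty for the full dataset, any batch that mixes one of those samples with five-character samples cannot be stacked by a plain .batch() call.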
Number of images found: 300
Number of labels found: 300
Number of unique characters: 30
Characters present: {'9', 'A', '6', 'Y', '2', 'J', '3', 'F', 'C', 'X', 'D', 'N', 'V', '4', '5', 'K', 'Z', '7', 'P', 'L', 'H', 'T', 'Q', 'E', 'U', '8', 'S', 'G', 'R', 'B'}
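Because .batch() needs every label in a batch to have the same length, labels of a different length (the log lists NGGQ.PNG and YGNG.PNG, which have four characters) have to be dropped or padded before the dataset is built. The sketch below shows both options in plain Python. The helper names and the pad character "#" are assumptions made for illustration; "#" was chosen only because it does not appear in the printed character set, and how the lookup layer and the loss should treat it is left open here.

```python
def keep_labels_of_length(image_paths, labels, length):
    """Option 1: drop every (path, label) pair whose label has another length."""
    pairs = [(path, label) for path, label in zip(image_paths, labels)
             if len(label) == length]
    kept_paths = [path for path, _ in pairs]
    kept_labels = [label for _, label in pairs]
    return kept_paths, kept_labels


def pad_labels(labels, length, pad_char="#"):
    """Option 2: right-pad shorter labels with a hypothetical pad character."""
    return [label.ljust(length, pad_char) for label in labels]


# Small demonstration with names taken from the extraction log.
demo_paths = ["NBHG7.PNG", "NGGQ.PNG", "NHRGB.PNG"]
demo_labels = ["NBHG7", "NGGQ", "NHRGB"]

kept_paths, kept_labels = keep_labels_of_length(demo_paths, demo_labels, 5)
padded_labels = pad_labels(demo_labels, 5)

print(kept_paths, kept_labels)
print(padded_labels)
```

Filtering is the simpler choice when only a few samples are affected. Padding keeps all samples, but the pad character then has to be accounted for in the vocabulary and in decoding.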