Invalid argument: Input to reshape is a tensor with x values, but requested shape requires a multiple of y. {node Reshape_13}
I'm using TensorFlow's object detection API with faster_rcnn_resnet101 and get the following error when trying to train:
tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Input to reshape is a tensor with 36 values, but the requested shape requires a multiple of 16
[[{{node Reshape_13}}]]
[[IteratorGetNext]]
[[IteratorGetNext/_7243]]
(1) Invalid argument: Input to reshape is a tensor with 36 values, but the requested shape requires a multiple of 16
[[{{node Reshape_13}}]]
[[IteratorGetNext]]
0 successful operations.
0 derived errors ignored.
I run training with a slightly modified pets-train.sh (only the paths changed). I'm trying to train on tf.record files containing jpg images of size (1280, 720), with no changes to the network architecture (I've confirmed that all images in the records are this size).
Strangely, I can successfully run inference on these same images when I do the equivalent of what's in the tutorial file detect_pets.py. That makes me think there's something wrong with how I create the tf.record files (code below) rather than with the shapes of the images, even though the error is about a reshape. However, I have previously trained successfully on tf.records created the same way (from images of size (600, 600), (1024, 1024), and (720, 480), all with the same network). I've also hit a similar error before on a different dataset of (600, 600) images (just with different numbers, but still at node Reshape_13).
I'm using Python 3.7, TF 1.14.0, CUDA 10.2, Ubuntu 18.04.
I've looked extensively at various other related posts but haven't been able to make any progress.
I've tried adjusting the keep_aspect_ratio_resizer parameters (originally min_dimension=600, max_dimension=1024, but I've also tried min, max = (720, 1280)), and I've tried pad_to_max_dimension: true with both of those min/max choices.
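For reference, this is the resizer block in the pipeline config that those parameters live in (shown here with the (720, 1280) values I tried; pad_to_max_dimension is the optional line):

```
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 720
    max_dimension: 1280
    pad_to_max_dimension: true
  }
}
```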
Here is the code I use to create the tf.record files:
def make_example(imfile, boxes):
    with tf.gfile.GFile(imfile, "rb") as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = PIL.Image.open(encoded_jpg_io)
    if image.format != "JPEG":
        raise Exception("Images need to be in JPG format")
    height = image.height
    width = image.width
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    for box in boxes:
        xc, yc, w, h = box
        xmin = xc - w / 2
        xmax = xc + w / 2
        ymin = yc - h / 2
        ymax = yc + h / 2
        new_xmin = np.clip(xmin, 0, width - 1)
        new_xmax = np.clip(xmax, 0, width - 1)
        new_ymin = np.clip(ymin, 0, height - 1)
        new_ymax = np.clip(ymax, 0, height - 1)
        area = (ymax - ymin) * (xmax - xmin)
        new_area = (new_ymax - new_ymin) * (new_xmax - new_xmin)
        if new_area > 0.3 * area:
            xmins.append(new_xmin / width)
            xmaxs.append(new_xmax / width)
            ymins.append(new_ymin / height)
            ymaxs.append(new_ymax / height)
    classes_text = ["vehicle".encode("utf8")] * len(boxes)
    classes = [1] * len(boxes)
    abs_imfile = os.path.abspath(imfile)
    difficult = [0] * len(boxes)
    example = tf.train.Example(
        features=tf.train.Features(
            feature={
                "image/height": int64_feature(height),
                "image/width": int64_feature(width),
                "image/filename": bytes_feature(imfile.encode("utf8")),
                "image/source_id": bytes_feature(abs_imfile.encode("utf8")),
                "image/encoded": bytes_feature(encoded_jpg),
                "image/format": bytes_feature("jpeg".encode("utf8")),
                "image/object/bbox/xmin": float_list_feature(xmins),
                "image/object/bbox/xmax": float_list_feature(xmaxs),
                "image/object/bbox/ymin": float_list_feature(ymins),
                "image/object/bbox/ymax": float_list_feature(ymaxs),
                "image/object/class/text": bytes_list_feature(classes_text),
                "image/object/class/label": int64_list_feature(classes),
                "image/object/difficult": int64_list_feature(difficult),
            }
        )
    )
    return example


def make_tfrecord(outfile, imfiles, truthfiles):
    writer = tf.python_io.TFRecordWriter(outfile)
    for imfile, truthfile in zip(imfiles, truthfiles):
        print(imfile)
        boxes = pd.read_csv(truthfile)
        if boxes.empty:
            boxes = []
        else:
            boxes = [
                (box.Xc, box.Yc, box.Width, box.Height) for box in boxes.itertuples()
            ]
        example = make_example(imfile, boxes)
        writer.write(example.SerializeToString())
    writer.close()


def make_combined_train_dset(names):
    imfiles = []
    truthfiles = []
    traindir = os.path.join(tf_datadir, "train")
    valdir = os.path.join(tf_datadir, "val")
    for name in names:
        imdir = os.path.join(processed_datadir, name, "images")
        truthdir = os.path.join(processed_datadir, name, "truth")
        imfiles.extend(sorted(glob.glob(os.path.join(imdir, "*.jpg"))))
        truthfiles.extend(sorted(glob.glob(os.path.join(truthdir, "*.csv"))))
    inds = list(range(len(imfiles)))
    np.random.shuffle(inds)
    imfiles = [imfiles[i] for i in inds]
    truthfiles = [truthfiles[i] for i in inds]
    ntrain = round(0.9 * len(imfiles))
    train_imfiles = imfiles[:ntrain]
    train_truthfiles = truthfiles[:ntrain]
    val_imfiles = imfiles[ntrain:]
    val_truthfiles = truthfiles[ntrain:]
    chunksize = 1500
    for d in [traindir, valdir]:
        if not os.path.exists(d):
            os.mkdir(d)
    for i in range(0, len(train_imfiles), chunksize):
        print(f"{i} / {len(train_imfiles)}", end="\r")
        cur_imfiles = train_imfiles[i : i + chunksize]
        cur_truthfiles = train_truthfiles[i : i + chunksize]
        testfile = os.path.join(traindir, f"{i}.tfrecord")
        make_tfrecord(testfile, cur_imfiles, cur_truthfiles)
    for i in range(0, len(val_imfiles), chunksize):
        print(f"{i} / {len(val_imfiles)}", end="\r")
        cur_imfiles = val_imfiles[i : i + chunksize]
        cur_truthfiles = val_truthfiles[i : i + chunksize]
        testfile = os.path.join(valdir, f"{i}.tfrecord")
        make_tfrecord(testfile, cur_imfiles, cur_truthfiles)


def make_train_dset(name, train_inc=1, val_inc=1, test_inc=1):
    trainfile = os.path.join(tf_datadir, name + "-train.tfrecord")
    valfile = os.path.join(tf_datadir, name + "-val.tfrecord")
    imdir = os.path.join(processed_datadir, name, "images")
    truthdir = os.path.join(processed_datadir, name, "truth")
    imfiles = sorted(glob.glob(os.path.join(imdir, "*.jpg")))
    truthfiles = sorted(glob.glob(os.path.join(truthdir, "*.csv")))
    n = len(imfiles)
    ntrain = round(0.9 * n)
    print(trainfile)
    make_tfrecord(trainfile, imfiles[:ntrain:train_inc], truthfiles[:ntrain:train_inc])
    print(valfile)
    make_tfrecord(valfile, imfiles[ntrain::val_inc], truthfiles[ntrain::val_inc])
For other datasets, I've been able to create tf.records with the functions defined above (make_combined_train_dset or make_train_dset), supply the paths to those records in the faster_rcnn_resnet101.config file, and training proceeds normally (just as in the tutorial example). With this new dataset (and at least one other), I hit the reshape error above. Still, I can run inference on the images in this dataset, which makes me think the problem is in the tf records or how they are read in, rather than anything intrinsic to the images or their sizes.
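In hindsight, a cheap sanity check at record-creation time would have caught this class of problem immediately. A minimal sketch (the helper name is mine, not part of any API): every per-object feature list in a detection example must have one common length.

```python
def check_object_features(features):
    """Verify that all per-object feature lists share one length.

    `features` maps feature names (bbox coords, class text, labels,
    difficult flags) to their lists. If the lengths disagree, the
    input pipeline's reshape will see the wrong number of values.
    """
    lengths = {name: len(vals) for name, vals in features.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError("per-object feature lengths disagree: %s" % lengths)
    # Return the common length (0 if there are no features at all).
    return next(iter(lengths.values()), 0)
```

Calling this on the dict of lists just before building tf.train.Example would turn a cryptic training-time Reshape_13 failure into an immediate, named error while writing the record.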
I'd appreciate any help anyone can offer, as I've been struggling with this for days.
I'm an idiot: confirmed.
The problem was that classes_text, classes, and difficult had the wrong length.
I replaced
classes_text = ["vehicle".encode("utf8")] * len(boxes)
classes = [1] * len(boxes)
difficult = [0] * len(boxes)
with
classes_text = ["vehicle".encode("utf8")] * len(xmins)
classes = [1] * len(xmins)
difficult = [0] * len(xmins)
and it runs fine now. Posting this in case anyone else runs into a similar problem.
Thanks to everyone who spent time or thought on my question. Hopefully this saves someone some wasted time.
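To make the failure mode concrete, here is a toy version of the list-building logic (the values and the helper are hypothetical; the boolean mask stands in for the `new_area > 0.3 * area` clip test in make_example). The buggy version sizes the label lists from the raw boxes; the fix sizes them from the kept coordinates:

```python
def build_label_lists(boxes, keep):
    # keep[i] mimics the area-filter decision for boxes[i]
    xmins = [b[0] for b, k in zip(boxes, keep) if k]
    buggy_classes = [1] * len(boxes)   # wrong: counts boxes that were dropped
    fixed_classes = [1] * len(xmins)   # right: matches the kept coordinates
    return xmins, buggy_classes, fixed_classes

# three boxes, the second rejected by the area filter
xmins, buggy, fixed = build_label_lists(
    [(0.1, 0.1, 0.2, 0.2), (0.9, 0.9, 0.1, 0.1), (0.4, 0.4, 0.3, 0.3)],
    [True, False, True],
)
# buggy has 3 entries for only 2 kept boxes; fixed has the matching 2
```

Whenever the filter drops a box, the coordinate lists shrink but the buggy label lists don't, and the mismatch only surfaces later as the reshape error during training.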