What is the ksize parameter of tf.nn.max_pool used for?

Question

In the definition of tf.nn.max_pool, what is ksize used for?

tf.nn.max_pool(value, ksize, strides, padding, data_format='NHWC', name=None)

Performs the max pooling on the input.

Args:

value: A 4-D Tensor with shape [batch, height, width, channels] and type tf.float32.
ksize: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor.

For example, if the input value is a tensor of shape [1, 64, 64, 3] and ksize=3, what does that mean?

Answer 1

The documentation states:

ksize: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor.

In general, for images, your input has shape [batch_size, 64, 64, 3] for an RGB image of 64x64 pixels.

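The kernel size ksize has one entry per input dimension, in the order [batch, height, width, channels]. It will typically be [1, 2, 2, 1] if you have a 2x2 window over which you take the maximum. On the batch size dimension and the channels dimension, ksize is 1 because we don't want to take the maximum over multiple examples or over multiple channels.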
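To make the window semantics concrete, here is a minimal pure-Python sketch of what a [1, 2, 2, 1] window does to an NHWC input. The helper name max_pool_nhwc is hypothetical and not part of TensorFlow, and the strides of [1, 2, 2, 1] and the 'VALID' (no padding) behavior are assumptions chosen for illustration, since neither is specified above. It should roughly correspond to calling tf.nn.max_pool(value, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID').

```python
def max_pool_nhwc(x, ksize, strides):
    """Max pooling over a nested list in NHWC layout, without padding.

    Illustrative helper (not a TensorFlow function). ksize and strides are
    4-element lists with one entry per input dimension:
    [batch, height, width, channels].
    """
    kb, kh, kw, kc = ksize
    sb, sh, sw, sc = strides
    n, h, w, c = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])

    out = []
    for b in range(0, n - kb + 1, sb):
        rows = []
        for i in range(0, h - kh + 1, sh):
            cols = []
            for j in range(0, w - kw + 1, sw):
                chans = []
                for ch in range(0, c - kc + 1, sc):
                    # Maximum over the window spanned by ksize in all four dims.
                    chans.append(max(
                        x[b + db][i + di][j + dj][ch + dc]
                        for db in range(kb)
                        for di in range(kh)
                        for dj in range(kw)
                        for dc in range(kc)
                    ))
                cols.append(chans)
            rows.append(cols)
        out.append(rows)
    return out


# One 4x4 image with a single channel, i.e. shape [1, 4, 4, 1].
# Pixel at (row, col) holds the value 4 * row + col.
image = [[[[float(4 * r + col)] for col in range(4)] for r in range(4)]]

# ksize is 1 on batch and channels, so only the 2x2 spatial window is reduced.
pooled = max_pool_nhwc(image, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1])
print(pooled)
```

Under these assumptions, a [1, 64, 64, 3] input would come out with shape [1, 32, 32, 3]: the batch and channel dimensions are untouched, and each spatial dimension is halved.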

computer-vision

tensorflow