Understanding the load_image() method in pycaffe

Question

The description in the source reads:

Load an image converting from grayscale or alpha as needed.

Parameters
----------
filename : string
color : boolean
    flag for color format. True (default) loads as RGB while False
    loads as intensity (if image is already grayscale).

Returns
-------
image : an image with type np.float32 in range [0, 1]
    of size (H x W x 3) in RGB or
    of size (H x W x 1) in grayscale.

Here is an example of how it is used:

input_image = 255 * caffe.io.load_image(IMAGE_FILE)

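My question is: if IMAGE_FILE is an RGB image with values 0-255 in each channel, and the value returned by caffe.io.load_image(IMAGE_FILE) is in the range [0, 1], then after multiplying by 255 each channel is back in the range 0-255.

So what is the point of this step?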

Answer 1

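There are several reasons for reading the image as a floating-point type in the range [0..1]:

Some models do not scale the input back to [0..255] and instead work directly on inputs in the range [0..1].
When converting image data from an unsigned integer type to floating point, it is common to rescale the pixel values to [0..1] (see, for example, Matlab's im2double and im2single).
Some image formats store data in the range [0..65535] (2 bytes/pixel); in that case it is convenient to keep the output range fixed at [0..1] and change only the scale factor.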
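
The points above can be illustrated with a small NumPy-only sketch. It is not pycaffe's implementation: to_unit_float is a hypothetical helper written for this example, and the two arrays are synthetic stand-ins for an 8-bit and a 16-bit image. It shows that once both images are normalized to [0, 1], the same multiplication by 255 produces the input range for a model that expects [0, 255], regardless of the original bit depth.

```python
import numpy as np


def to_unit_float(img):
    """Scale an unsigned-integer image to float32 in [0, 1].

    Hypothetical helper for illustration only; not part of pycaffe.
    """
    if not np.issubdtype(img.dtype, np.unsignedinteger):
        raise TypeError("expected an unsigned integer image")
    # Largest representable value: 255 for uint8, 65535 for uint16.
    max_value = np.iinfo(img.dtype).max
    return img.astype(np.float32) / np.float32(max_value)


# Synthetic stand-ins for an 8-bit and a 16-bit image (assumed data).
img8 = np.array([[0, 128, 255]], dtype=np.uint8)
img16 = np.array([[0, 32768, 65535]], dtype=np.uint16)

# After normalization both images share the same [0, 1] range.
unit8 = to_unit_float(img8)
unit16 = to_unit_float(img16)

# A model trained on [0, 255] inputs needs the same factor in both cases.
model_input8 = 255 * unit8
model_input16 = 255 * unit16
```

With a fixed [0, 1] output, the caller chooses the final range (255 in the question's example) without needing to know how the file stored its pixels.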

machine-learning

computer-vision

neural-network

deep-learning

caffe