How to calculate the resulting filesize of Image.resize() in PIL

Question

I have to reduce incoming files to a maximum size of 1 MB. I use PIL for image operations and Python 3.5. The file size of an image is given by:

import os
src = 'testfile.jpg'
print(os.path.getsize(src))

In my case this gives 1531494. If I open the file with PIL, I can only get the dimensions:

from PIL import Image
src = 'testfile.jpg'
image = Image.open(src)
size =  image.size
print(size)

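In my case this gives (1654, 3968).

Of course I could loop over the file with different sizes as shown below, saving the file and checking its file size each time. But there must be a simpler way, because this takes far too long (think of shrinking 1000 files of different sizes).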

def resize_image(src, reduceby=1):
    '''
    resizes image by percent given in reduceby
    '''
    print(" process_image:",src, reduceby)
    org = Image.open(src)
    real_size = org.size
    reduced_size = (int(real_size[0] * reduceby / 100),int(real_size[1] * reduceby / 100) )
    # resize() returns a new image; Image.LANCZOS is the current name for ANTIALIAS
    reduced = org.resize(reduced_size, Image.LANCZOS)
    reduced_file = src[:-4] +"_" + str(reduceby) + src[-4:]
    reduced.save(reduced_file, optimize=True)
    print(" reduced_image:", reduced_file)
    reduced_filesize = os.path.getsize(reduced_file)
    return reduced_filesize, reduced_file

def loop_image(src, target_size):
    print("loop_image    :", src, target_size)
    file_size = os.path.getsize(src)
    reduced_file =src
    print("source        :", src, file_size)
    reduce_by = 1
    while file_size > target_size:
        file_size, reduced_file = resize_image(src, reduce_by)
        print("target       :", file_size, reduced_file)
        reduce_by += 1
    return reduced_file

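The function works, but it reduces too much and takes too long. My question is: how can I calculate the resulting file size before resizing? Or is there a simpler way?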

Answer 1

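Long story short, you cannot know in advance how well an image will compress, because that depends heavily on the kind of image. That said, we can optimize your code.

Some optimizations:

Estimate the number of bytes per pixel from the file size and the image dimensions.
Update the ratio based on the new and the old memory consumption.

My coded solution applies both of the above, because applying them separately did not seem to converge very stably. The following sections explain both parts in more depth and show the test cases I considered.

Reduce image memory

The following code estimates the new image dimensions from the relation between the original file size (in bytes) and the preferred file size (in bytes). It approximates the number of bytes per pixel, and then spreads the ratio between the preferred and the original bytes per pixel over both the image width and the image height (hence the square root).

I then use opencv-python (cv2) to rescale the image, but this can be swapped for your own code.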

import os

import cv2
import numpy as np


def reduce_image_memory(path, max_file_size: int = 2 ** 20):
    """
        Reduce the image memory by downscaling the image.

        :param path: (str) Path to the image
        :param max_file_size: (int) Maximum size of the file in bytes
        :return: (np.ndarray) downscaled version of the image
    """
    image = cv2.imread(path)
    height, width = image.shape[:2]

    original_memory = os.stat(path).st_size
    original_bytes_per_pixel = original_memory / np.prod(image.shape[:2])

    # perform resizing calculation
    new_bytes_per_pixel = original_bytes_per_pixel * (max_file_size / original_memory)
    new_bytes_ratio = np.sqrt(new_bytes_per_pixel / original_bytes_per_pixel)
    new_width, new_height = int(new_bytes_ratio * width), int(new_bytes_ratio * height)

    new_image = cv2.resize(image, (new_width, new_height), interpolation=cv2.INTER_LINEAR_EXACT)
    return new_image
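
As a worked example of the calculation above: the bytes-per-pixel terms cancel, so the scale factor is simply the square root of the target size over the original size. The sketch below plugs in the numbers from the question (1531494 bytes, 1654 x 3968 pixels, 1 MB target). The helper name estimate_dimensions is made up for illustration, and the result is only a first guess, since real compression does not scale exactly with pixel count.

```python
import math


def estimate_dimensions(width, height, original_bytes, target_bytes):
    """First-guess dimensions, assuming file size is proportional to pixel count."""
    # The pixel count must shrink by target/original, so each side shrinks by its square root.
    scale = math.sqrt(target_bytes / original_bytes)
    return scale, int(scale * width), int(scale * height)


scale, new_width, new_height = estimate_dimensions(1654, 3968, 1531494, 2 ** 20)
print(f"scale={scale:.3f}, new size=({new_width}, {new_height})")
# scale=0.827, new size=(1368, 3283)
```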

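Applying the ratio

Most of the magic happens in ratio *= max_file_size / new_memory, where we compute our error relative to the preferred size and correct our ratio with that value.

The program keeps adjusting the ratio while the following condition holds, and stops as soon as it no longer does:

abs(1 - max_file_size / new_memory) > max_deviation_percentage

This means that the new file size has to be relatively close to the preferred file size. You control this closeness with delta. The higher the delta, the smaller your file can be (below max_file_size). The lower the delta, the closer the new file size will be to max_file_size, but it will never be larger.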
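
The "never larger" guarantee can be checked directly. Write M for the hard limit (2 ** 20 in the script below), T = M(1 - delta) for the reduced target the script actually aims at, and n for new_memory. These symbols are shorthand introduced here, not names from the code. The loop stops once |1 - T/n| <= delta, which gives:

```latex
\[
1-\delta \;\le\; \frac{T}{n} \;\le\; 1+\delta
\quad\Longrightarrow\quad
\frac{T}{1+\delta} \;\le\; n \;\le\; \frac{T}{1-\delta} = M .
\]
```

So a run that ends through this condition leaves the file between roughly M(1 - 2 delta) and M.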

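The trade-off is time: the lower the delta, the longer it takes to find a ratio that satisfies the condition. Empirical testing shows that values between 0.01 and 0.05 work well.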

if __name__ == '__main__':
    image_location = "test img.jpg"

    # delta denotes the maximum variation allowed around the max_file_size
    # The lower the delta the more time it takes, but the closer it will be to `max_file_size`.
    delta = 0.01
    max_file_size = 2 ** 20 * (1 - delta)
    max_deviation_percentage = delta

    current_memory = new_memory = os.stat(image_location).st_size
    ratio = 1
    steps = 0

    # make sure that the comparison is within a certain deviation.
    while abs(1 - max_file_size / new_memory) > max_deviation_percentage:
        new_image = reduce_image_memory(image_location, max_file_size=max_file_size * ratio)
        cv2.imwrite(f"resize {image_location}", new_image)

        new_memory = os.stat(f"resize {image_location}").st_size
        ratio *= max_file_size / new_memory
        steps += 1

    print(f"Memory resize: {current_memory / 2 ** 20:5.2f}, {new_memory / 2 ** 20:6.4f} MB, number of steps {steps}")
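
The behaviour of this loop can be explored without touching real images. The sketch below replays the same ratio correction against a toy encoder whose output size grows as a power of the pixel count. That power law, the exponent 0.85, and the function name simulate_ratio_loop are all assumptions made for illustration; a real JPEG encoder behaves less regularly.

```python
def simulate_ratio_loop(original_size, limit, delta=0.01, exponent=0.85, step_limit=20):
    """Replay the ratio-correction loop against a toy encoder.

    Assumption (not from the answer): encoded size = original_size * pixel_fraction ** exponent.
    """
    target = limit * (1 - delta)
    new_size = float(original_size)
    ratio = 1.0
    steps = 0
    while abs(1 - target / new_size) > delta and steps < step_limit:
        # reduce_image_memory scales the pixel count by requested_size / original_size
        pixel_fraction = target * ratio / original_size
        new_size = original_size * pixel_fraction ** exponent  # hypothetical encoder
        ratio *= target / new_size  # the correction step from the loop above
        steps += 1
    return new_size, steps


size, steps = simulate_ratio_loop(1531494, 2 ** 20)
print(f"{size / 2 ** 20:.3f} MB after {steps} steps")
```

With exponent=1.0 (file size exactly proportional to pixel count) the first guess already hits the target, which is why the correction only matters when compression is not proportional.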

Test cases

For testing I used two different approaches: randomly generated images and an example from Google.

For the random images I used the following code

def generate_test_image(ratio: Tuple[int, int], file_size: int) -> np.ndarray:
    """
        Generate a test image with fixed width height ratio and an approximate size.

        :param ratio: (Tuple[int, int]) screen ratio for the image
        :param file_size: (int) Approximate size of the image, note that this may be off due to image compression.
        :return: (np.ndarray) random noise image of shape (rows, columns, 3)
    """
    height, width = ratio  # Numpy reverse values
    # np.int was removed from NumPy; the builtin int does the same job here
    scale = int(np.sqrt(file_size // (width * height)))
    # the upper bound is exclusive, so 256 covers the full uint8 range
    img = np.random.randint(0, 256, (width * scale, height * scale, 3), dtype=np.uint8)
    return img

Results

Using a randomly generated image

image_location = "test image random.jpg"
# Generate a large image with fixed ratio and a file size of ~1.7MB
image = generate_test_image(ratio=(16, 9), file_size=1531494)
cv2.imwrite(image_location, image)

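Memory resize: 1.71, 0.99 MB, number of steps 2

The original size was reduced from 1.7 MB to 0.99 MB in 2 steps.

(before)

(after)

Using an image from Google

Memory resize: 1.51, 0.996 MB, number of steps 4

The original size was reduced from 1.51 MB to 0.996 MB in 4 steps.

(before)

(after)

Bonus

It also works for .png, .jpeg, .tiff, etc.
Besides downscaling, it can also be used to upscale images to a certain memory consumption.
The image aspect ratio is preserved as well as possible.

Edit

I made the code more user friendly and added Mark Setchell's suggestion of using io.BytesIO, which speeds up the code roughly 2x. There is also a step_limit, which prevents endless looping if the delta is very small.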

import io
import os
import time
from typing import Tuple

import cv2
import numpy as np
from PIL import Image


def generate_test_image(ratio: Tuple[int, int], file_size: int) -> np.ndarray:
    """
        Generate a test image with fixed width height ratio and an approximate size.

        :param ratio: (Tuple[int, int]) screen ratio for the image
        :param file_size: (int) Approximate size of the image, note that this may be off due to image compression.
    """
    height, width = ratio  # Numpy reverse values
    scale = int(np.sqrt(file_size // (width * height)))  # np.int was removed from NumPy
    img = np.random.randint(0, 256, (width * scale, height * scale, 3), dtype=np.uint8)  # high is exclusive
    return img


def _change_image_memory(path, file_size: int = 2 ** 20):
    """
        Tries to match the image memory to a specific file size.

        :param path: (str) Path to the image
        :param file_size: (int) Size of the file in bytes
        :return: (np.ndarray) rescaled version of the image
    """
    image = cv2.imread(path)
    height, width = image.shape[:2]

    original_memory = os.stat(path).st_size
    original_bytes_per_pixel = original_memory / np.prod(image.shape[:2])

    # perform resizing calculation
    new_bytes_per_pixel = original_bytes_per_pixel * (file_size / original_memory)
    new_bytes_ratio = np.sqrt(new_bytes_per_pixel / original_bytes_per_pixel)
    new_width, new_height = int(new_bytes_ratio * width), int(new_bytes_ratio * height)

    new_image = cv2.resize(image, (new_width, new_height), interpolation=cv2.INTER_LINEAR_EXACT)
    return new_image


def _get_size_of_image(image):
    # Encode into memory and get size
    buffer = io.BytesIO()
    image = Image.fromarray(image)
    image.save(buffer, format="JPEG")
    size = buffer.getbuffer().nbytes
    return size


def limit_image_memory(path, max_file_size: int, delta: float = 0.05, step_limit=10):
    """
        Reduces an image to the required max file size.

        :param path: (str) Path to the original (unchanged) image.
        :param max_file_size: (int) maximum size of the image
        :param delta: (float) maximum allowed variation from the max file size.
            This is a value between 0 and 1, relatively to the max file size.
        :param step_limit: (int) safety cap on the number of iterations, prevents endless looping.
        :return: an image path to the limited image.
    """
    start_time = time.perf_counter()
    max_file_size = max_file_size * (1 - delta)
    max_deviation_percentage = delta
    new_image = None

    current_memory = new_memory = os.stat(path).st_size
    ratio = 1
    steps = 0

    while abs(1 - max_file_size / new_memory) > max_deviation_percentage:
        new_image = _change_image_memory(path, file_size=max_file_size * ratio)
        new_memory = _get_size_of_image(new_image)
        ratio *= max_file_size / new_memory
        steps += 1

        # prevent endless looping
        if steps > step_limit:
            break

    print(f"Stats:"
          f"\n\t- Original memory size: {current_memory / 2 ** 20:9.2f} MB"
          f"\n\t- New memory size     : {new_memory / 2 ** 20:9.2f} MB"
          f"\n\t- Number of steps {steps}"
          f"\n\t- Time taken: {time.perf_counter() - start_time:5.3f} seconds")

    if new_image is not None:
        cv2.imwrite(f"resize {path}", new_image)
        return f"resize {path}"
    return path


if __name__ == '__main__':
    image_location = "your nice image.jpg"

    # Uncomment to generate random test images
    # test_image = generate_test_image(ratio=(16, 9), file_size=1567289)
    # cv2.imwrite(image_location, test_image)

    path = limit_image_memory(image_location, max_file_size=2 ** 20, delta=0.01)
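
One thing to keep in mind with the listing above: it measures the size with Pillow's JPEG encoder but writes the final file with cv2.imwrite, and the two encoders use different default quality settings (75 for Pillow, 95 for OpenCV), so the file on disk need not match the measured size. A minimal Pillow-only sketch that avoids this is shown below. It is an alternative technique, not the method of this answer: it binary-searches the scale factor and returns the exact bytes that were measured. The names encode_jpeg and shrink_to_limit, the quality of 85, and the 8 iterations are assumptions chosen for illustration, and the search assumes file size grows with scale.

```python
import io

from PIL import Image


def encode_jpeg(image, quality=85):
    """Encode a PIL image to JPEG in memory and return the raw bytes."""
    buffer = io.BytesIO()
    image.convert("RGB").save(buffer, format="JPEG", quality=quality, optimize=True)
    return buffer.getvalue()


def shrink_to_limit(image, max_bytes, quality=85, iterations=8):
    """Binary-search the largest scale in (0, 1] whose JPEG fits in max_bytes.

    Returns the encoded bytes, or None if no tested scale fits.
    """
    data = encode_jpeg(image, quality)
    if len(data) <= max_bytes:
        return data  # already small enough, no resize needed

    low, high, best = 0.0, 1.0, None
    for _ in range(iterations):
        scale = (low + high) / 2
        size = (max(1, int(image.width * scale)), max(1, int(image.height * scale)))
        data = encode_jpeg(image.resize(size, Image.LANCZOS), quality)
        if len(data) <= max_bytes:
            best, low = data, scale  # fits: try a larger scale
        else:
            high = scale  # too big: try a smaller scale
    return best
```

The returned bytes can be written unchanged with open(target_path, "wb"), so the size that was checked is exactly the size that ends up on disk.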

