ImageGrab.grab() method is too slow

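I need to grab a bunch of screenshots every second, say 5. I'm using this to write a bot for a game. However, the ImageGrab method takes about 0.3 seconds, which is too slow for me. Even after specifying a bbox, it still takes 0.3 seconds. I should mention that I'm on a Mac. Is there a better way?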

I even tried os.system("screencapture filename.png"), which runs in 0.15-0.2 seconds. That's good, but I want to go faster.

The way I got this working for me was by using

import os
from PIL import Image

os.system("screencapture -R0,0,100,100 filename.png")
im = Image.open("filename.png")

You can replace 0,0,100,100 with whatever region you need. It runs in under 0.1 seconds, more like 0.06 seconds.
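If you would rather not go through the shell and a fixed filename, here is a rough sketch of the same idea using subprocess and a temporary file. The helper names build_screencapture_cmd and grab_region are made up for this example. As far as I know from the screencapture man page, -x suppresses the shutter sound and -R takes x,y,width,height, but check this on your macOS version. I have not timed this variant, so treat the speed as unknown.

```python
import subprocess
import tempfile
from pathlib import Path


def build_screencapture_cmd(x, y, width, height, path):
    """Return the argv list for capturing a screen rectangle on macOS."""
    # -x: no shutter sound; -R: capture the rectangle x,y,width,height.
    return ["screencapture", "-x", f"-R{x},{y},{width},{height}", str(path)]


def grab_region(x, y, width, height):
    """Capture a rectangle and return it as a PIL image (macOS only)."""
    # Imported lazily so the command builder above works without Pillow.
    from PIL import Image

    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "shot.png"
        cmd = build_screencapture_cmd(x, y, width, height, path)
        subprocess.run(cmd, check=True)  # raises if screencapture fails
        with Image.open(path) as im:
            # copy() loads the pixels before the temporary file is deleted.
            return im.copy()
```

Calling grab_region(0, 0, 100, 100) should give the same region as the os.system version, without leaving filename.png behind.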

Another solution is to use Python MSS

from mss import mss
from PIL import Image

def capture_screenshot():
    # Capture entire screen
    with mss() as sct:
        monitor = sct.monitors[1]
        sct_img = sct.grab(monitor)
        # Convert to PIL/Pillow Image
        return Image.frombytes('RGB', sct_img.size, sct_img.bgra, 'raw', 'BGRX')

img = capture_screenshot()
img.show()

This function can return screenshots at up to 27 fps on my slow laptop.
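To check a figure like that on your own machine, something along these lines should do. It is only a sketch: measure_fps is a name I made up, and fake_capture is a stand-in that sleeps for 10 ms so the example runs without a screen. Swap in your real capture function.

```python
from time import perf_counter, sleep


def measure_fps(capture, frames=20):
    """Call capture() `frames` times and return the average frames per second."""
    start = perf_counter()
    for _ in range(frames):
        capture()
    elapsed = perf_counter() - start
    return frames / elapsed


def fake_capture():
    """Stand-in for a real screenshot call; pretends a grab takes 10 ms."""
    sleep(0.01)


print(f"{measure_fps(fake_capture, frames=5):.1f} fps")
```

With a real grabber you would call measure_fps(capture_screenshot) instead. Use more frames for a steadier average.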

Question: what is better than ImageGrab.grab() on a Mac?

I compared mss, PIL, and pyscreenshot by measuring the average time each takes to grab images of different sizes, reported in milliseconds.

Answer

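It looks like mss far outperforms the others on a Mac. To capture an 800x400 region of the screen, mss takes 15 ms, while the other two take 300-400 ms. That is the difference between about 66 fps with mss and roughly 3 fps with the other two.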
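The fps figures are just 1000 divided by the milliseconds per frame. A tiny sketch of that arithmetic, using the 800x400 timings quoted above (ms_to_fps is a name chosen for this example):

```python
def ms_to_fps(ms_per_frame):
    """Convert an average capture time in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


# Timings for the 800x400 region: mss, then the 300-400 ms range of the others.
for label, ms in [("mss", 15), ("others, best case", 300), ("others, worst case", 400)]:
    print(f"{label}: {ms} ms per frame is about {ms_to_fps(ms):.1f} fps")
```

This ignores any time the bot spends processing each frame, so real throughput will be lower.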

Code

# !pip install pillow
# !pip install mss
# !pip install opencv-python
# !pip install pyscreenshot

import numpy as np
from time import time


resolutions = [
    (0, 0, 100,100),(0, 0, 200,100),
    (0, 0, 200,200),(0, 0, 400,200),
    (0, 0, 400,400),(0, 0, 800,400)
]


import pyscreenshot as ImageGrab
import cv2


def show(nparray):
    import cv2
    cv2.imshow('window',cv2.cvtColor(nparray, cv2.COLOR_BGR2RGB))
    # key controls in a displayed window
    # if cv2.waitKey(25) & 0xFF == ord('q'):
        # cv2.destroyAllWindows()


def mss_test(shape) :
    average = time()
    import mss
    sct = mss.mss()
    # shape is (left, top, right, bottom); mss wants top/left/width/height
    mon = {"top": shape[1], "left": shape[0], "width": shape[2]-shape[0], "height": shape[3]-shape[1]}
    for _ in range(5):
        printscreen =  np.asarray(sct.grab(mon))
    average_ms = int(1000*(time()-average)/5.)
    return average_ms, printscreen.shape




def pil_test(shape) :
    average = time()
    from PIL import ImageGrab
    for _ in range(5):
        printscreen =  np.array(ImageGrab.grab(bbox=shape))
    average_ms = int(1000*(time()-average)/5.)
    return average_ms, printscreen.shape




def pyscreenshot_test(shape):
    average = time()
    import pyscreenshot as ImageGrab
    for _ in range(5):
        printscreen = np.asarray( ImageGrab.grab(bbox=shape) )
    average_ms = int(1000*(time()-average)/5.)
    return average_ms, printscreen.shape


named_function_pair = zip("mss_test,pil_test,pyscreenshot_test".split(","),
    [mss_test,pil_test,pyscreenshot_test])

for name,function in named_function_pair:
    results = [ function(res) for res in resolutions ]
    print("Speed results for using",name)
    for res,result in zip(resolutions,results) :
        speed,shape = result
        print(res,"took",speed,"ms, produced shaped",shape)

Output

Speed results for using mss_test
(0, 0, 100, 100) took 7 ms, produced shaped (200, 200, 4)
(0, 0, 200, 100) took 4 ms, produced shaped (200, 400, 4)
(0, 0, 200, 200) took 5 ms, produced shaped (400, 400, 4)
(0, 0, 400, 200) took 6 ms, produced shaped (400, 800, 4)
(0, 0, 400, 400) took 9 ms, produced shaped (800, 800, 4)
(0, 0, 800, 400) took 15 ms, produced shaped (800, 1600, 4)

Speed results for using pil_test
(0, 0, 100, 100) took 313 ms, produced shaped (100, 100, 4)
(0, 0, 200, 100) took 321 ms, produced shaped (100, 200, 4)
(0, 0, 200, 200) took 334 ms, produced shaped (200, 200, 4)
(0, 0, 400, 200) took 328 ms, produced shaped (200, 400, 4)
(0, 0, 400, 400) took 321 ms, produced shaped (400, 400, 4)
(0, 0, 800, 400) took 320 ms, produced shaped (400, 800, 4)

Speed results for using pyscreenshot_test
(0, 0, 100, 100) took 85 ms, produced shaped (200, 200, 4)
(0, 0, 200, 100) took 101 ms, produced shaped (200, 400, 4)
(0, 0, 200, 200) took 122 ms, produced shaped (400, 400, 4)
(0, 0, 400, 200) took 163 ms, produced shaped (400, 800, 4)
(0, 0, 400, 400) took 236 ms, produced shaped (800, 800, 4)
(0, 0, 800, 400) took 400 ms, produced shaped (800, 1600, 4)

Further observations

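Although all three libraries were given the same screen region to grab, mss and pyscreenshot both captured the Mac screen's physical pixels, whereas PIL captured logical pixels. This only happens when you have turned the Mac's display resolution down from its maximum. In my case, I had the Retina display set to "balanced", which means each logical pixel is actually 2x2 physical pixels.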
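This explains why the mss and pyscreenshot arrays in the output are twice as tall and wide as the PIL ones. Here is a small sketch of the relationship, assuming a scale factor of 2 as on my display (yours may differ). physical_shape is a hypothetical helper, not part of any of the three libraries.

```python
def physical_shape(bbox, scale=2):
    """Return (rows, cols) in physical pixels for a logical bbox.

    bbox is (left, top, right, bottom) in logical pixels; scale is the
    number of physical pixels per logical pixel along each axis
    (assumed to be 2 here).
    """
    left, top, right, bottom = bbox
    return ((bottom - top) * scale, (right - left) * scale)


print(physical_shape((0, 0, 800, 400)))           # physical-pixel capture
print(physical_shape((0, 0, 800, 400), scale=1))  # logical-pixel capture
```

If your bot compares pixel coordinates across libraries, you will need to multiply or divide by this factor.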