How do I translate x, y coordinates with matchTemplate once I've cropped the template area?

I'm using Python 3.9.6 and OpenCV 4.5.1.

I'm trying to detect objects on a template. My template is a live feed of my monitor, and my objects are JPGs.

The problem: when I crop the template to speed up detection, my mouse starts clicking in the wrong place.

This only happens once I crop the template. I assume it's because I'm cropping the template at the wrong point in my script. My full monitor is (0, 0, 1920, 1080), but I only want to capture [220:900, 270:1590].

So far I've followed the OpenCV documentation and some online tutorials, but now I'm stuck.

How do I click on img (third code block) without the incorrect offset caused by improperly cropping my template?

I'm using win32gui to grab my template:

import numpy as np
import win32gui, win32ui, win32con


class WindowCapture:

    # properties
    w = 0
    h = 0
    hwnd = None
    cropped_x = 0
    cropped_y = 0
    offset_x = 0
    offset_y = 0

    # constructor
    def __init__(self, window_name=None):
        # find the handle for the window we want to capture.
        # if no window name is given, capture the entire screen
        if window_name is None:
            self.hwnd = win32gui.GetDesktopWindow()
        else:
            self.hwnd = win32gui.FindWindow(None, window_name)
            if not self.hwnd:
                raise Exception('Window not found: {}'.format(window_name))

        # get the window size
        window_rect = win32gui.GetWindowRect(self.hwnd)
        self.w = window_rect[2] - window_rect[0]
        self.h = window_rect[3] - window_rect[1]

        # account for the window border and titlebar and cut them off
        border_pixels = 0
        titlebar_pixels = 0
        self.w = self.w - (border_pixels * 2)
        self.h = self.h - titlebar_pixels - border_pixels
        self.cropped_x = border_pixels
        self.cropped_y = titlebar_pixels

        # set the cropped coordinates offset so we can translate screenshot
        # images into actual screen positions
        self.offset_x = window_rect[0] + self.cropped_x
        self.offset_y = window_rect[1] + self.cropped_y

    def get_screenshot(self):

        # get the window image data
        wDC = win32gui.GetWindowDC(self.hwnd)
        dcObj = win32ui.CreateDCFromHandle(wDC)
        cDC = dcObj.CreateCompatibleDC()
        dataBitMap = win32ui.CreateBitmap()
        dataBitMap.CreateCompatibleBitmap(dcObj, self.w, self.h)
        cDC.SelectObject(dataBitMap)
        cDC.BitBlt((0, 0), (self.w, self.h), dcObj, (self.cropped_x, self.cropped_y), win32con.SRCCOPY)

        # convert the raw data into a format opencv can read
        # dataBitMap.SaveBitmapFile(cDC, 'debug.bmp')
        signedIntsArray = dataBitMap.GetBitmapBits(True)
        img = np.frombuffer(signedIntsArray, dtype='uint8')  # np.fromstring is deprecated
        img.shape = (self.h, self.w, 4)

        # free resources
        dcObj.DeleteDC()
        cDC.DeleteDC()
        win32gui.ReleaseDC(self.hwnd, wDC)
        win32gui.DeleteObject(dataBitMap.GetHandle())
        img = img[...,:3]
        img = np.ascontiguousarray(img)

        return img

    @staticmethod
    def list_window_names():
        def winEnumHandler(hwnd, ctx):
            if win32gui.IsWindowVisible(hwnd):
                print(hex(hwnd), win32gui.GetWindowText(hwnd))
        win32gui.EnumWindows(winEnumHandler, None)

And OpenCV and numpy for my object detection:

import cv2 as cv
import numpy as np
    
class Vision:

    # properties
    needle_img = None
    needle_w = 0
    needle_h = 0
    method = None

    # constructor
    def __init__(self, needle_img_path, method=cv.TM_CCORR_NORMED):
        self.needle_img = cv.imread(needle_img_path, cv.IMREAD_UNCHANGED)

        # Save the dimensions of the needle image
        self.needle_w = self.needle_img.shape[1]
        self.needle_h = self.needle_img.shape[0]

        # There are 6 methods to choose from:
        # TM_CCOEFF, TM_CCOEFF_NORMED, TM_CCORR, TM_CCORR_NORMED, TM_SQDIFF, TM_SQDIFF_NORMED
        self.method = method

    def find(self, haystack_img, threshold=0.5, debug_mode=None):
        # run the OpenCV algorithm
        result = cv.matchTemplate(haystack_img, self.needle_img, self.method)

        # Get all the positions from the match result that exceed our threshold
        locations = np.where(result >= threshold)
        locations = list(zip(*locations[::-1]))
        rectangles = []
        for loc in locations:
            rect = [int(loc[0]), int(loc[1]), self.needle_w, self.needle_h]
            # Add every box to the list twice in order to retain single (non-overlapping) boxes
            rectangles.append(rect)
            rectangles.append(rect)
        # Apply group rectangles
        rectangles, weights = cv.groupRectangles(rectangles, groupThreshold=1, eps=0.5)

        points = []
        if len(rectangles):
            line_color = (0, 255, 0)
            line_type = cv.LINE_4
            marker_color = (255, 0, 255)
            marker_type = cv.MARKER_CROSS

            # Loop over all the rectangles
            for (x, y, w, h) in rectangles:

                # Determine the center position
                center_x = x + int(w/2)
                center_y = y + int(h/2)
                # Save the points
                points.append((center_x, center_y))

                if debug_mode == 'rectangles':
                    # Determine the box position
                    top_left = (x, y)
                    bottom_right = (x + w, y + h)
                    # Draw the box
                    cv.rectangle(haystack_img, top_left, bottom_right, color=line_color, 
                                lineType=line_type, thickness=2)
                elif debug_mode == 'points':
                    # Draw the center point
                    cv.drawMarker(haystack_img, (center_x, center_y), 
                                color=marker_color, markerType=marker_type, 
                                markerSize=40, thickness=2)


        ############ DISPLAYS MATCHES #############
        if debug_mode:
            cv.imshow('Matches', haystack_img)
            cv.waitKey(1)  # needed for imshow to actually render the frame

        return points

The two variables are then passed in a separate script here:

import cv2 as cv
import pyautogui as py
from windowcapture import WindowCapture
from vision import Vision
    
# initialize the WindowCapture class
# leave blank to capture the whole screen
haystack = WindowCapture()
# initialize the Vision class
needle = Vision('needle.jpg')

while True:

    # get an updated image of the game
    screenshot = haystack.get_screenshot()
    screenshotCropped = screenshot[220:900, 270:1590]

    img = needle.find(screenshotCropped, 0.85, 'points')

    if img:
        py.moveTo(img[0])

The line causing the problem is screenshotCropped = screenshot[220:900, 270:1590]; if that line is removed, I click the object correctly.

I also tried adding border_pixels and titlebar_pixels so that I could crop directly from WindowCapture, but I ran into the same problem described above.

If I understand your code correctly, when you crop the image you are not (yet) accounting for the X/Y offset introduced by that crop.

If I read your example correctly, your code

screenshotCropped = screenshot[220:900, 270:1590]

crops from 220 to 900 along the y axis (height) and from 270 to 1590 along the x axis (width), yes? If so, try

x_0, x_1 = 270, 1590
y_0, y_1 = 220, 900
screenshotCropped = screenshot[y_0:y_1, x_0:x_1]
...
if img:
    x_coord = img[0][0] + x_0
    y_coord = img[0][1] + y_0
    py.moveTo(x_coord, y_coord)

If your crop region changes, update your (x_0, x_1, y_0, y_1) values accordingly (in both the crop operation and the py.moveTo operation).
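
For completeness, here is a minimal sketch of how that translation can be wrapped up so the crop region only lives in one place. crop_to_screen and the CROP_* constants are hypothetical names, and the sketch assumes the WindowCapture and Vision classes from your question (for a full-screen capture offset_x/offset_y are 0, but they become relevant as soon as you capture a single window):

import pyautogui as py
from windowcapture import WindowCapture
from vision import Vision

# hypothetical names: define the crop region once and reuse it everywhere
CROP_X0, CROP_X1 = 270, 1590
CROP_Y0, CROP_Y1 = 220, 900

wincap = WindowCapture()
needle = Vision('needle.jpg')

def crop_to_screen(point):
    # Vision.find() returns points relative to whatever image it was given.
    # Add the crop origin to get back to full-screenshot coordinates, then
    # the capture offset to get real screen coordinates (offset_x/offset_y
    # are 0 when the whole screen is captured, but not for a single window).
    x, y = point
    return (x + CROP_X0 + wincap.offset_x, y + CROP_Y0 + wincap.offset_y)

while True:
    screenshot = wincap.get_screenshot()
    cropped = screenshot[CROP_Y0:CROP_Y1, CROP_X0:CROP_X1]

    points = needle.find(cropped, 0.85)
    if points:
        py.moveTo(crop_to_screen(points[0]))

The same idea also covers what you tried with border_pixels and titlebar_pixels: if you instead crop inside WindowCapture by moving the BitBlt source position and shrinking w and h, you also have to add the crop origin to offset_x and offset_y, otherwise you reintroduce exactly the offset problem described above.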