How to crop car number on this pic given its relative bbox coordinates?
I have this picture:
I have the following relative coordinates:
[[0.6625, 0.6035714285714285], [0.7224999999999999, 0.6035714285714285], [0.7224999999999999, 0.6571428571428571], [0.6625, 0.6571428571428571], [0.6625, 0.6035714285714285]]
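(though I don't understand why there are 5 values here instead of the usual 4, or what they mean)
My attempt with scikit-image shows the whole picture instead of a crop: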
import numpy as np
from skimage import io, draw
img = io.imread(pic)
vals = [[0.6625, 0.6035714285714285], [0.7224999999999999, 0.6035714285714285], [0.7224999999999999, 0.6571428571428571], [0.6625, 0.6571428571428571], [0.6625, 0.6035714285714285]]
vertices = np.asarray(vals)
rows, cols = draw.polygon(vertices[:, 0], vertices[:, 1])
crop = img.copy()
crop[:, :, -1] = 0
crop[rows, cols, -1] = 255
io.imshow(crop)
io.show()
# shows whole pic instead of cropping
My attempt with opencv raises an error, because the coordinates are floats:
import cv2 as cv
vals = [[0.6625, 0.6035714285714285], [0.7224999999999999, 0.6035714285714285], [0.7224999999999999, 0.6571428571428571], [0.6625, 0.6571428571428571], [0.6625, 0.6035714285714285]]
x = vals[0][0]
y = vals[0][1]
width = vals[1][0] - x
height = vals[2][1] - y
img = cv.imread(pic)
crop_img = img[y:y+height, x:x+width]
cv.imshow("cropped", crop_img)
cv.waitKey(0)
# TypeError: slice indices must be integers or None or have an __index__ method
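How can I crop the car number on this picture given its relative bbox coordinates?
I am not tied to any framework, so if you think TF or anything else might help, please suggest it.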
Inspecting
vals = [[0.6625, 0.6035714285714285], [0.7224999999999999, 0.6035714285714285], [0.7224999999999999, 0.6571428571428571], [0.6625, 0.6571428571428571], [0.6625, 0.6035714285714285]]
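shows that the first and last entries in the list are identical.
In image processing, position (0,0) is the top-left corner. Looking at the values in the list, and assuming each pair is (row, column), the coordinates appear to be given in this order: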
[top_left, bottom_left, bottom_right, top_right, top_left]
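The repeated first point is how a closed polygon ring is commonly written, which would explain the five entries. Here is a minimal sketch, assuming NumPy is available, that checks for the repeat and drops it so that only the four corners remain. The names `is_closed` and `corners` are illustrative and not part of any library.

```python
import numpy as np

vals = [[0.6625, 0.6035714285714285], [0.7224999999999999, 0.6035714285714285],
        [0.7224999999999999, 0.6571428571428571], [0.6625, 0.6571428571428571],
        [0.6625, 0.6035714285714285]]

pts = np.asarray(vals, dtype=float)

# If the ring is closed (the last vertex repeats the first), drop the repeat.
is_closed = len(pts) > 1 and np.array_equal(pts[0], pts[-1])
corners = pts[:-1] if is_closed else pts
```

After this step `corners` holds one row per distinct vertex, so the fifth entry carries no extra information.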
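All values lie between 0 and 1, which indicates that these are relative coordinates. To rescale them to image dimensions, multiply the first value of each pair by the image height and the second by the image width: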
# dummy img sizes:
image_height = 480
image_width = 640
# rescale to img dimensions, and convert to int, to allow slicing:
bbox_coordinates = [[int(a[0] * image_height), int(a[1] * image_width)] for a in vals]
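Note that `int()` truncates toward zero, so a product that lands just below a whole number because of floating-point error loses a full pixel. A possible alternative, sketched here with NumPy and the same dummy sizes and (row, column) assumption, rounds to the nearest pixel instead. The name `scaled` is illustrative.

```python
import numpy as np

vals = [[0.6625, 0.6035714285714285], [0.7224999999999999, 0.6035714285714285],
        [0.7224999999999999, 0.6571428571428571], [0.6625, 0.6571428571428571],
        [0.6625, 0.6035714285714285]]

# dummy img sizes, as above
image_height = 480
image_width = 640

# Broadcasting multiplies column 0 by the height and column 1 by the width;
# np.rint rounds to the nearest integer before the cast to int.
scaled = np.rint(np.asarray(vals) * [image_height, image_width]).astype(int)
```

Whether rounding or truncation is preferable depends on how the relative coordinates were produced, so treat this as one option rather than the required approach.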
Now the image can be cropped with array slicing:
top_left = bbox_coordinates[0]
bottom_right = bbox_coordinates[2]
bbox = img[top_left[0]:bottom_right[0], top_left[1]:bottom_right[1]]
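Putting the steps together, the following is a rough sketch of a helper that crops the axis-aligned box spanned by the relative points. The function name `crop_relative_bbox` and the `points_are_xy` flag are hypothetical. The flag exists because the order of each pair is an assumption: the default reads pairs as (row, column) as above, while the opencv attempt in the question reads them as (x, y). Taking the per-axis minimum and maximum makes the result independent of the vertex order. A zero array stands in for the real image here.

```python
import numpy as np

def crop_relative_bbox(img, rel_points, points_are_xy=False):
    """Crop the axis-aligned box spanned by relative (0..1) polygon vertices.

    By default each point is read as (row, col); pass points_are_xy=True
    if the pairs are (x, y) instead. Both readings are assumptions.
    """
    pts = np.asarray(rel_points, dtype=float)
    if points_are_xy:
        pts = pts[:, ::-1]  # reorder (x, y) -> (row, col)
    height, width = img.shape[:2]
    # Scale to pixels and round to the nearest integer so slicing works.
    px = np.rint(pts * [height, width]).astype(int)
    (r0, c0), (r1, c1) = px.min(axis=0), px.max(axis=0)
    return img[r0:r1, c0:c1]

vals = [[0.6625, 0.6035714285714285], [0.7224999999999999, 0.6035714285714285],
        [0.7224999999999999, 0.6571428571428571], [0.6625, 0.6571428571428571],
        [0.6625, 0.6035714285714285]]

img = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for the loaded image
crop = crop_relative_bbox(img, vals)
crop_xy = crop_relative_bbox(img, vals, points_are_xy=True)
```

With a real image, `img` would come from `cv.imread(pic)` or `io.imread(pic)`; if the crop does not show the plate, trying the other value of `points_are_xy` is a quick way to test which pair order the coordinates use.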