How do I stream a video with OpenCV into my PyTorch neural network?
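I wrote YOLOv3 from scratch in PyTorch. If I send an image through the model with the trained weights, it works. The next step is to use my camera so that YOLO can work its magic in real time.
I think the right pipeline is to capture a single frame of the video and feed it to the network, then draw the boxes on that same frame.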
checkpoint = torch.load("\my_checkpoint_40.pth.tar")
model = YOLOv3(in_channels = 3, num_classes = 20).to(config.DEVICE)
model.load_state_dict(checkpoint["state_dict"])
ip_camera = "http://192.168.1.70:4500/mjpegfeed?640x480"
outputFile = "yolo_out_py.avi"
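This is how I load the weights into the network. Then I wrote the function that uses my camera (it is DroidCam on my phone; my PC has no camera device, so I use the phone's IP address). That part of the code works: the video appears on screen.
outputFile is meant to be the destination path for the written video.
The problem starts when I try to load a single frame into the network and run the rest of the process.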
def streaming(model, thresh, iou_thresh, anchors, ip_camera):
    stream = cv2.VideoCapture(ip_camera)
    # Corrective actions printed in the event of a failed connection.
    if stream.isOpened() is not True:
        print('Not opened.')
        print('Please ensure the following:')
        print('1. DroidCam is not running in your browser.')
        print('2. The IP address given is correct.')
    # Resizing the image to the same dimensions as the YOLOv3 network input
    width = 416
    height = 416
    # Connection successful. Proceeding to display video stream.
    while stream.isOpened() is True:
        # Capture frame-by-frame
        ret, f = stream.read()
        dim = (width, height)
        image = cv2.resize(f, dim, interpolation = cv2.INTER_AREA)
        cv2.imshow('frame', image)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        for frame in image:
            model.eval()
            anchors = torch.tensor(anchors)
            anchors = anchors.to(config.DEVICE)
            x = torch.tensor(frame)
            x = x.to("cuda")
            # from this line to the nms_boxes, it's the same code I used for plotting a single image
            with torch.no_grad():
                out = model(x)
                bboxes = [[] for _ in range(x.shape[0])]
                for i in range(1):
                    batch_size, A, S, _, _ = out[i].shape
                    anchor = anchors[i]
                    boxes_scale_i = cells_to_bboxes(
                        out[i], anchor, S = S, is_preds = True
                    )
                    for idx, (box) in enumerate(boxes_scale_i):
                        bboxes[idx] += box
            model.train()
            for i in range(batch_size):
                nms_boxes = non_max_suppression(
                    bboxes[i], iou_threshold = iou_thresh, threshold = thresh, box_format = "midpoint",
                )
            # cells_to_bboxes and non_max_suppression are functions that return box coordinates
            # and the "better" box
            # now it's time to write things on the frame
            frame = cv2.VideoWriter(outputFile, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), 30,
                                    (round(stream.get(cv2.CAP_PROP_FRAME_WIDTH)), round(stream.get(cv2.CAP_PROP_FRAME_HEIGHT))))
            frame.write(nms_boxes)
    stream.release()
    cv2.destroyAllWindows()
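The code doesn't work, for several reasons. First, model.eval() gives me an error: missing 1 required positional argument: 'self'
Then I get several more errors that I think come from not having the right pipeline for video streaming. This is my first time using OpenCV.
If I remove model.eval(), I get another error at:
out = model(x)
Here is the traceback: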
Traceback (most recent call last):
  File "C:/Python_Project/YOLOV3/openVid.py", line 97, in <module>
    streaming(YOLOv3, 0.6, 0.6, config.ANCHORS, ip_camera)
  File "C:/Python_Project/YOLOV3/openVid.py", line 72, in streaming
    out = model(x)
  File "C:\Python_Project\YOLOV3\model.py", line 106, in __init__
    self.layers = self.create_conv_layers()
  File "C:\Python_Project\YOLOV3\model.py", line 141, in create_conv_layers
    CNNBlock(
  File "C:\Python_Project\YOLOV3\model.py", line 46, in __init__
    self.conv = nn.Conv2d(in_channels, out_channels, bias = not bn_act, **kwargs)
  File "C:\Users\Simone\anaconda3\envs\Pytorch\lib\site-packages\torch\nn\modules\conv.py", line 430, in __init__
    super(Conv2d, self).__init__(
  File "C:\Users\Simone\anaconda3\envs\Pytorch\lib\site-packages\torch\nn\modules\conv.py", line 83, in __init__
    if in_channels % groups != 0:
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
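I don't know what to do next.
I have read that I should convert the model to ONNX, but I really don't know how. I couldn't find any tutorial online, and I'm stuck.
Can you help me?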
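According to the error message, model is not a class instance. Notice that in the traceback, out = model(x) ends up calling the __init__ function. So model is the class YOLOv3 rather than an instance YOLOv3(...), and the call streaming(YOLOv3, 0.6, 0.6, config.ANCHORS, ip_camera) in the traceback shows exactly that. Given the __init__ signature, x is treated as in_channels, and since x is an image tensor, RuntimeError: Boolean value of Tensor with more than one value is ambiguous makes sense. This also explains the .eval() error: called on the class itself, eval() has no instance to bind to self. Besides that, I believe you need to add a batch dimension to your frame (e.g., x.unsqueeze(0)), otherwise you will get another error.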
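The class-versus-instance mix-up is plain Python behaviour, so it can be reproduced without PyTorch. The sketch below is only an illustration: TinyModel is a hypothetical stand-in that borrows the constructor arguments of YOLOv3 from the question, and frame_batch is a made-up placeholder for a batched frame. It shows that calling eval() on the class raises the "missing ... 'self'" error, that calling the class with x runs __init__ with x as in_channels, and that both problems go away once an instance is passed.

```python
class TinyModel:
    """Hypothetical stand-in for a network class such as YOLOv3 (no PyTorch)."""

    def __init__(self, in_channels=3, num_classes=20):
        self.in_channels = in_channels
        self.num_classes = num_classes
        self.training = True

    def eval(self):
        # Mirrors the idea of switching a model to inference mode.
        self.training = False
        return self

    def __call__(self, x):
        # Placeholder "forward pass": report how many items are in the batch.
        return len(x)


frame_batch = [[0.0, 0.1, 0.2]]  # made-up stand-in for a batch holding one frame

# Mistake: handing over the class itself, as in streaming(YOLOv3, ...).
model = TinyModel
try:
    model.eval()  # no instance, so nothing is bound to `self`
    eval_error = None
except TypeError as err:
    eval_error = str(err)

# Calling the class runs __init__, so the "input" lands in in_channels.
constructed = model(frame_batch)
received_as_in_channels = constructed.in_channels is frame_batch

# Fix: build an instance first and pass that instead.
model = TinyModel(in_channels=3, num_classes=20)
model.eval()
output = model(frame_batch)

print(eval_error)
print(received_as_in_channels)
print(output)
```

In the question's script, the equivalent change would be to pass the model object that was created and loaded with the checkpoint, rather than the name YOLOv3, as the first argument of streaming.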