How do I stream a video with OpenCV into my PyTorch neural network?

I wrote YOLOv3 from scratch in PyTorch. If I send an image through the trained model with its weights loaded, it works. The next step is to use my camera and let YOLO work its magic in real time.

I think the correct pipeline is to capture a single frame of the video and feed it to the network. Then the boxes are drawn on that same frame.

checkpoint = torch.load("\my_checkpoint_40.pth.tar")
model = YOLOv3(in_channels = 3, num_classes = 20).to(config.DEVICE)
model.load_state_dict(checkpoint["state_dict"])
ip_camera = "http://192.168.1.70:4500/mjpegfeed?640x480"
outputFile = "yolo_out_py.avi"

This is how I load the weights into the network. Then I wrote the function that uses my camera (it is DroidCam on my phone, since my PC has no camera device, so I use the mobile device's IP), and that part of the code works: the video shows up on screen. outputFile should be the destination path the video is written to. The problem appears when I try to feed a single frame into the network and run the rest of the process.
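
For reference, this is roughly how I understand VideoCapture and VideoWriter are supposed to fit together (just a sketch of the wiring I am aiming for, not my actual code; I assume the writer's frame size matches the 640x480 stream). My actual function follows below.

import cv2

stream = cv2.VideoCapture(ip_camera)
fourcc = cv2.VideoWriter_fourcc('M', 'J', 'P', 'G')
# one writer, created once, with the size of the frames that will be written to it
writer = cv2.VideoWriter(outputFile, fourcc, 30, (640, 480))

while stream.isOpened():
    ret, frame = stream.read()
    if not ret:
        break
    # ... run the detection on `frame` and draw the boxes on it here ...
    cv2.imshow('frame', frame)
    writer.write(frame)                      # write the annotated frame to outputFile
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

stream.release()
writer.release()
cv2.destroyAllWindows()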

def streaming(model, thresh, iou_thresh, anchors, ip_camera):
    stream = cv2.VideoCapture(ip_camera)

    # Corrective actions printed in the event of a failed connection.
    if stream.isOpened() is not True:
        print('Not opened.')
        print('Please ensure the following:')
        print('1. DroidCam is not running in your browser.')
        print('2. The IP address given is correct.')
    # Resizing the image to the same dimensions as the YOLOv3 network input
    width = 416
    height = 416
    # Connection successful. Proceeding to display video stream.
    while stream.isOpened() is True:
        # Capture frame-by-frame
        ret, f = stream.read()
        dim = (width, height)
        image = cv2.resize(f, dim, interpolation = cv2.INTER_AREA)
        cv2.imshow('frame', image)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        for frame in image:
            model.eval()
            anchors = torch.tensor(anchors)
            anchors = anchors.to(config.DEVICE) 
            x = torch.tensor(frame)
            x = x.to("cuda")

            # from this line to the nms_boxes call, it's the same code I used for plotting a single image
            with torch.no_grad():
                out = model(x)
                bboxes = [[] for _ in range(x.shape[0])]
                for i in range(1):
                    batch_size, A, S, _, _ = out[i].shape
                    anchor = anchors[i]
                    boxes_scale_i = cells_to_bboxes(
                        out[i], anchor, S = S, is_preds = True
                    )
                    for idx, (box) in enumerate(boxes_scale_i):
                        bboxes[idx] += box

                model.train()

            for i in range(batch_size):
                nms_boxes = non_max_suppression(
                    bboxes[i], iou_threshold = iou_thresh, threshold = thresh, box_format = "midpoint",
                )
                # cells_to_bboxes and non_max_suppression are functions that return box coordinates
                # and keep the "better" box

                #now it's time to write things on the frame
                frame = cv2.VideoWriter(outputFile, cv2.VideoWriter_fourcc('M', 'J', 'P', 'G'), 30,
                                        (round(stream.get(cv2.CAP_PROP_FRAME_WIDTH)), round(stream.get(cv2.CAP_PROP_FRAME_HEIGHT))))

                frame.write(nms_boxes)
    stream.release()
    cv2.destroyAllWindows()

The code doesn't work for several reasons: model.eval() gives me an error: missing 1 required positional argument: 'self'

Then I get several other errors, which I think are about the correct pipeline for working with the video stream. This is my first time using OpenCV.

If I remove model.eval(), I get another error at:

out = model(x)

This is the traceback:

Traceback (most recent call last):
  File "C:/Python_Project/YOLOV3/openVid.py", line 97, in <module>
    streaming(YOLOv3, 0.6, 0.6, config.ANCHORS, ip_camera)
  File "C:/Python_Project/YOLOV3/openVid.py", line 72, in streaming
    out = model(x)
  File "C:\Python_Project\YOLOV3\model.py", line 106, in __init__
    self.layers = self.create_conv_layers()
  File "C:\Python_Project\YOLOV3\model.py", line 141, in create_conv_layers
    CNNBlock(
  File "C:\Python_Project\YOLOV3\model.py", line 46, in __init__
    self.conv = nn.Conv2d(in_channels, out_channels, bias = not bn_act, **kwargs)
  File "C:\Users\Simone\anaconda3\envs\Pytorch\lib\site-packages\torch\nn\modules\conv.py", line 430, in __init__
    super(Conv2d, self).__init__(
  File "C:\Users\Simone\anaconda3\envs\Pytorch\lib\site-packages\torch\nn\modules\conv.py", line 83, in __init__
    if in_channels % groups != 0:
RuntimeError: Boolean value of Tensor with more than one value is ambiguous

I don't know what to do next. I read that I should convert the model to ONNX, but I really don't know how to do that. I can't find any tutorial on the internet and I'm stuck. Can you help me?

According to the error message, model is not a class instance. Note that in the traceback,

out = model(x)

is calling the __init__ function. So model is probably YOLOv3 rather than YOLOv3(...). Given the __init__ signature, x is being treated as in_channels, and since x is an image,

RuntimeError: Boolean value of Tensor with more than one value is ambiguous

makes sense. This also explains the .eval() error. Apart from that, I believe you need to add a batch dimension to your frame (e.g., x.unsqueeze(0)), otherwise you will get another error.
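
For example, a minimal sketch of what I mean, reusing the names from your snippets (the /255.0 normalisation is an assumption, use whatever preprocessing your training pipeline used):

# you already build an instance at the top of your script:
#   model = YOLOv3(in_channels = 3, num_classes = 20).to(config.DEVICE)
#   model.load_state_dict(checkpoint["state_dict"])
# model.eval() works on that instance, and it is the instance that has to be
# passed to your function, not the class:
streaming(model, 0.6, 0.6, config.ANCHORS, ip_camera)      # not streaming(YOLOv3, ...)

# inside the loop, turn the resized frame into a batched CHW float tensor
# before the forward pass:
x = torch.from_numpy(image).permute(2, 0, 1).float() / 255.0   # HWC uint8 -> CHW float
x = x.unsqueeze(0).to(config.DEVICE)                           # add the batch dimension
with torch.no_grad():
    out = model(x)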