Detecting and classifying text from a video

Question

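I am working with the ICDAR2015 dataset, which poses a text detection and classification problem on video files. I have worked on text detection and classification in still images before, but never on video data.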

Is there a library or tool that can help me capture images of different frames from a video? Thanks.

Answer 1

As long as the video is not encrypted, there are many ways to capture frames, depending on the platform you are using.

Given your problem domain and your experience in it, the open source computer vision library OpenCV may be a good choice:

http://opencv.org

The documentation includes an example of capturing video frames:

http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_gui/py_video_display/py_video_display.html

Following that tutorial, reading a video from a file looks like this:

import cv2

cap = cv2.VideoCapture('vtest.avi')

while cap.isOpened():
    ret, frame = cap.read()
    # ret is False once the video ends or a frame cannot be read
    if not ret:
        break

    # Do whatever work you want on the frame here. In this example
    # from the tutorial the image is converted from one colour
    # space to another.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # This displays the resulting frame. You may or may not need
    # this for your case.
    cv2.imshow('frame', gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
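
If the goal is to save still images from different frames rather than display them, the same read loop can be adapted. The following is only a minimal sketch under some assumptions: the helper names sample_frames and save_sampled_frames, the every_n parameter, and the frame_%06d.png naming scheme are all made up for illustration and are not part of OpenCV or the tutorial. It keeps every N-th frame and writes it out with cv2.imwrite.

```python
import os


def sample_frames(cap, every_n=30):
    """Yield (frame_index, frame) for every `every_n`-th frame read from `cap`.

    `cap` is any object whose read() returns (ok, frame), such as a
    cv2.VideoCapture. (Hypothetical helper, not an OpenCV function.)
    """
    if every_n < 1:
        raise ValueError("every_n must be >= 1")
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of the video, or the frame could not be read
            break
        if index % every_n == 0:
            yield index, frame
        index += 1


def save_sampled_frames(video_path, out_dir, every_n=30):
    """Write every `every_n`-th frame of `video_path` into `out_dir` as PNG.

    Returns the number of images written. (Hypothetical helper.)
    """
    import cv2  # imported here so sample_frames stays usable without OpenCV

    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError("could not open video: " + video_path)
    saved = 0
    try:
        for index, frame in sample_frames(cap, every_n):
            name = os.path.join(out_dir, "frame_%06d.png" % index)
            if cv2.imwrite(name, frame):  # imwrite returns False on failure
                saved += 1
    finally:
        cap.release()
    return saved
```

For example, save_sampled_frames('vtest.avi', 'frames', every_n=30) would write roughly one image per second for a 30 fps video. The saved images can then be fed to whatever still-image text detector you already use.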
