Merge Camera and AudioListener feeds into a video in Unity for OpenCV

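I am trying to create a simulation environment with Unity3D. My first goal is to grab the camera and audio listener feeds and merge/convert them into a readable video. Once I have that, I want to send it to OpenCV.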

To do this, I created a scene with a camera and an object that emits a constant noise. I then attached this script to the camera to capture the feeds:

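// Fragment of a MonoBehaviour attached to the camera.
// Requires: using System; using UnityEngine;
private Texture2D texture;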
private byte[] lastTexture;
private float[] lastAudio;


private void Start () 
{
    texture = new Texture2D(Screen.width, Screen.height, TextureFormat.RGB24, false);
}

/// <summary>
/// Called each time the camera has finished rendering the scene.
/// </summary>
private void OnPostRender()
{
    // Read the pixels rendered by the camera.
    texture.ReadPixels(new Rect(0.0f, 0.0f, Screen.width, Screen.height), 0, 0, true);

    // Encode it to JPG (quality 25).
    lastTexture = texture.EncodeToJPG(25);

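    // Audio sync => fetch the last recorded audio.
    // lastAudio is still null if no audio callback has run yet.
    var audio = lastAudio;
    var encodedLastAudio = audio != null ? ConvertAudio(audio) : new byte[0];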

    // TODO merge and convert to mpeg then send via UDP.
}

/// <summary>
/// Raises the audio filter read event.
/// </summary>
/// <param name="data">Data.</param>
/// <param name="channels">Channels.</param>
private void OnAudioFilterRead(float[] data, int channels)
{
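    // Copy the buffer: this callback runs on the audio thread,
    // and the array may be reused after it returns.
    lastAudio = (float[])data.Clone();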
}

/// <summary>
/// Converts the data received by the AudioFilterRead event into a byte array.
/// </summary>
/// <returns>The audio.</returns>
/// <param name="dataSource">Data source.</param>
private byte[] ConvertAudio(float[] dataSource)
{
    // Converting in 2 steps : float[] to Int16[], 
    // then Int16[] to Byte[].
    var intData = new Int16[dataSource.Length];

    // bytesData is twice the length of dataSource
    // because each Int16 sample takes 2 bytes.
    var bytesData = new Byte[dataSource.Length*2];

    // To convert float to Int16
    var rescaleFactor = 32767f; 

    for (var i = 0; i < dataSource.Length; i++)
    {
        // Clamp first so samples outside [-1, 1] do not wrap around when cast.
        intData[i] = (short)(Mathf.Clamp(dataSource[i], -1f, 1f) * rescaleFactor);
    }
    Buffer.BlockCopy(intData, 0, bytesData, 0, bytesData.Length);

    return bytesData;   
}

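Am I doing this right? If so, I have already looked at C# implementations of ffmpeg, but I think that route is somewhat complicated and would use a lot of CPU. Has anyone managed to do this, or something similar?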
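For the "send via UDP" part of the TODO, one lighter option than muxing to MPEG inside Unity might be to send each JPEG frame and each PCM chunk as its own tagged datagram and let the receiver pair them up. The sketch below is only an illustration under that assumption: `FeedSender`, the tag values, and the 5-byte header layout are hypothetical names and choices, not part of Unity or OpenCV. It swaps real muxing for a simple framing scheme.

```csharp
using System;
using System.Net.Sockets;

public static class FeedSender
{
    // Hypothetical packet-type markers chosen for this sketch.
    public const byte VideoTag = 0x01;
    public const byte AudioTag = 0x02;

    // Largest payload a single UDP datagram can carry over IPv4.
    private const int MaxDatagramSize = 65507;

    // Prefixes the payload with a 1-byte type tag and a 4-byte frame index,
    // so the receiver can tell video from audio and match them by frame.
    public static byte[] Frame(byte tag, int frameIndex, byte[] payload)
    {
        var packet = new byte[5 + payload.Length];
        packet[0] = tag;
        Buffer.BlockCopy(BitConverter.GetBytes(frameIndex), 0, packet, 1, 4);
        Buffer.BlockCopy(payload, 0, packet, 5, payload.Length);
        return packet;
    }

    // Sends one framed packet; larger frames would need to be split or
    // encoded at a lower quality or resolution.
    public static void Send(UdpClient client, string host, int port, byte[] packet)
    {
        if (packet.Length > MaxDatagramSize)
        {
            throw new ArgumentException("Packet does not fit in one UDP datagram.");
        }
        client.Send(packet, packet.Length, host, port);
    }
}
```

In `OnPostRender`, this could be called once with `lastTexture` and once with `encodedLastAudio`, using the same frame index for both. UDP gives no delivery or ordering guarantee, so the receiving side has to tolerate missing frames.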

Thanks to another thread, I already know how to add a WAV header to my audio source.
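For reference, a minimal sketch of what adding a WAV header could look like for the 16-bit PCM bytes produced by `ConvertAudio`. `WavWriter.Wrap` is a hypothetical helper, not the code from the other thread. It writes the canonical 44-byte PCM header and assumes the caller knows the sample rate and channel count (in Unity these would typically come from `AudioSettings.outputSampleRate` and the `channels` argument of `OnAudioFilterRead`).

```csharp
using System;
using System.IO;
using System.Text;

public static class WavWriter
{
    // Hypothetical helper: wraps raw 16-bit little-endian PCM samples
    // in a canonical 44-byte RIFF/WAVE header.
    public static byte[] Wrap(byte[] pcm, int sampleRate, short channels)
    {
        const short bitsPerSample = 16;
        short blockAlign = (short)(channels * bitsPerSample / 8);
        int byteRate = sampleRate * blockAlign;

        using (var stream = new MemoryStream(44 + pcm.Length))
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(Encoding.ASCII.GetBytes("RIFF"));
            writer.Write(36 + pcm.Length);   // size of everything after this field
            writer.Write(Encoding.ASCII.GetBytes("WAVE"));
            writer.Write(Encoding.ASCII.GetBytes("fmt "));
            writer.Write(16);                // fmt chunk size for PCM
            writer.Write((short)1);          // audio format 1 = uncompressed PCM
            writer.Write(channels);
            writer.Write(sampleRate);
            writer.Write(byteRate);
            writer.Write(blockAlign);
            writer.Write(bitsPerSample);
            writer.Write(Encoding.ASCII.GetBytes("data"));
            writer.Write(pcm.Length);        // size of the sample data
            writer.Write(pcm);
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

A call such as `WavWriter.Wrap(encodedLastAudio, 48000, 2)` would then yield a byte array that standard audio tools can read as a WAV file, provided the sample rate and channel count match the actual stream.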

The easiest way to capture Unity's camera and audio listener output is to stream it with VLC.

VLC has a feature that lets you stream your own desktop, encode it, and send it over UDP. The only "catch" is that you will get some latency (around 2-3 seconds).