Send audio data represented as a numpy array from Python to JavaScript

Question

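I have a TTS (text-to-speech) system that produces audio as a numpy array with dtype np.float32. The system runs in the backend, and I want to transfer the data from the backend to the frontend so it can be played when a certain event happens.

The obvious solution is to write the audio data to disk as a wav file and then pass the path to the frontend for playback. This works fine, but for administrative reasons I don't want to do that. I only want to transfer the audio data (the numpy array) to the frontend.

What I have done so far is:

Backend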

text = "Hello"
wav, sr = tts_model.synthesize(text)
data = {"snd": wav.tolist()}
flask_response = app.response_class(response=flask.json.dumps(data),
                                    status=200,
                                    mimetype='application/json' )
# then return flask_response

Frontend

// gets wav from backend
let arrayData = new Float32Array(wav);
let blob = new Blob([ arrayData ]);
let url = URL.createObjectURL(blob);
let snd = new Audio(url);
snd.play()

That is what I have so far, but JavaScript throws the following error:

Uncaught (in promise) DOMException: Failed to load because no supported source was found.

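That is the gist of what I am trying to do. I'm sorry that you can't reproduce the error since you don't have the TTS system, so here is an audio file it generated, which you can use to see what I am doing wrong.

Other things I tried:

Changing the audio dtype to np.int8 and np.int16, to be converted in JavaScript by Int8Array() and Int16Array() respectively.
Trying different types when creating the blob, such as {"type": "application/text;charset=utf-8;"} and {"type": "audio/ogg; codecs=opus;"}.

I have been struggling with this issue for a long time, so any help is appreciated!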

Answer 1

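Your sample does not work out of the box as it is (it does not play).

However, with:

StarWars3.wav: OK. Retrieved from cs.uic.edu
Your sample encoded as PCM16 instead of PCM32: OK (check the wav metadata)

Flask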
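The "check the wav metadata" step can be done in a few lines of Python. The following is a minimal sketch, not part of the original answer, that reads the fmt chunk of a WAV file held in memory and reports its sample format. The helper names `describe_wav` and `make_pcm16_wav` are hypothetical; the only assumptions are the standard RIFF/WAVE layout, in which format tag 1 means integer PCM and 3 means IEEE float.

```python
import io
import struct
import wave

# Format tags from the WAV "fmt " chunk: 1 = integer PCM, 3 = IEEE float.
FORMAT_NAMES = {1: "PCM (integer)", 3: "IEEE float"}


def describe_wav(wav_bytes):
    """Return (format_name, channels, sample_rate, bits_per_sample)."""
    if wav_bytes[:4] != b"RIFF" or wav_bytes[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    # Walk the chunk list until the "fmt " chunk is found.
    while pos + 8 <= len(wav_bytes):
        chunk_id, chunk_size = struct.unpack_from("<4sI", wav_bytes, pos)
        if chunk_id == b"fmt ":
            tag, channels, rate, _, _, bits = struct.unpack_from(
                "<HHIIHH", wav_bytes, pos + 8
            )
            return FORMAT_NAMES.get(tag, f"other ({tag})"), channels, rate, bits
        # Chunks are padded to an even number of bytes.
        pos += 8 + chunk_size + (chunk_size & 1)
    raise ValueError("no fmt chunk found")


def make_pcm16_wav(sample_rate=22050, n_samples=100):
    """Build a tiny silent 16-bit mono WAV in memory, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * n_samples)
    return buf.getvalue()


if __name__ == "__main__":
    print(describe_wav(make_pcm16_wav()))
```

To inspect a file on disk, the bytes can be read with `open(path, "rb").read()` and passed to `describe_wav`.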

from flask import Flask, render_template, json
import base64

app = Flask(__name__)

with open("sample_16.wav", "rb") as binary_file:
    # Read the whole file at once
    data = binary_file.read()
    wav_file = base64.b64encode(data).decode('UTF-8')

@app.route('/wav')
def hello_world():
    data = {"snd": wav_file}
    res = app.response_class(response=json.dumps(data),
        status=200,
        mimetype='application/json')
    return res

@app.route('/')
def stat():
    return render_template('index.html')

if __name__ == '__main__':
    app.run(debug = True)

js


  <audio controls></audio>
  <script>
    ;(async _ => {
      const res = await fetch('/wav')
      let {snd: b64buf} = await res.json()
      document.querySelector('audio').src="data:audio/wav;base64, "+b64buf;
    })()
  </script>

Edit by the original poster

So, what I did (using this solution) to solve my problem was:

First, change the data type from np.float32 to np.int16:

wav = (wav * np.iinfo(np.int16).max).astype(np.int16)
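The one-liner above assumes every sample already lies in [-1.0, 1.0]. As a hedged sketch under that same assumption about the TTS output range, the conversion can clip first so that stray out-of-range samples do not overflow int16. `float_to_pcm16` is a hypothetical helper name, not something from the original post.

```python
import numpy as np


def float_to_pcm16(wav):
    """Convert float samples nominally in [-1.0, 1.0] to 16-bit PCM."""
    wav = np.asarray(wav, dtype=np.float32)
    # Clip first: casting values outside the int16 range is not well
    # defined and typically shows up as loud clicks in the audio.
    clipped = np.clip(wav, -1.0, 1.0)
    return (clipped * np.iinfo(np.int16).max).astype(np.int16)


if __name__ == "__main__":
    print(float_to_pcm16([0.0, 1.0, -1.0, 2.0]))
```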

Write the array to a temporary wav file using scipy.io.wavfile:

from scipy.io import wavfile
wavfile.write(".tmp.wav", sr, wav)

Read the bytes from the tmp file:

# read the bytes
with open(".tmp.wav", "rb") as fin:
    wav = fin.read()

Delete the tmp file:

import os
os.remove(".tmp.wav")
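The write, read, and delete steps can also be done without touching the disk. Below is a minimal sketch using only the standard library `wave` module and an in-memory buffer. It assumes the samples are already 16-bit PCM; `pcm16_to_wav_base64` is a hypothetical helper name.

```python
import base64
import io
import wave


def pcm16_to_wav_base64(pcm_bytes, sample_rate, channels=1):
    """Wrap raw little-endian 16-bit PCM in a WAV container, then base64-encode."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit
        w.setframerate(sample_rate)
        w.writeframes(pcm_bytes)
    # getvalue() returns the whole buffer regardless of the stream position.
    return base64.b64encode(buf.getvalue()).decode("ascii")


if __name__ == "__main__":
    # Two silent samples at 16 kHz, just to show the call.
    print(pcm16_to_wav_base64(b"\x00\x00\x00\x00", 16000)[:16])
```

With the int16 array produced earlier, a call could look like `pcm16_to_wav_base64(wav.astype("<i2").tobytes(), sr)`.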

Answer 2

Convert the array of wav values to bytes

Right after synthesis, you can convert the numpy array of wav values to a bytes object and then encode it with base64.

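import base64
import io
from scipy.io.wavfile import write

# Write the WAV container into an in-memory buffer instead of a file
byte_io = io.BytesIO()
write(byte_io, sr, wav)
wav_bytes = byte_io.getvalue()

# Base64-encode the WAV bytes so they can be embedded in HTML or JSON
audio_data = base64.b64encode(wav_bytes).decode('UTF-8')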

This can be used directly as the source of an HTML audio tag (with Flask):

<audio controls src="data:audio/wav;base64, {{ audio_data }}"></audio>

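So all you need to do is convert wav and sr into audio_data, which represents the raw .wav file, and pass it as a parameter to render_template in your Flask app. (This is the solution without sending.)

Or, if you send audio_data, use it in the .js file that receives the response to construct the URL (the one that would be placed in the src attribute in HTML):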

// get audio_data from response

let snd = new Audio("data:audio/wav;base64, " + audio_data);
snd.play()

Because:

Audio(url) Return value: A new HTMLAudioElement object, configured to be used for playing back the audio from the file specified by url. The new object's preload property is set to auto and its src property is set to the specified URL or null if no URL is given. If a URL is specified, the browser begins to asynchronously load the media resource before returning the new object.
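As a side note on the two transport options seen in this thread (a JSON list of floats in the question, base64 text in the answers), the sketch below compares the payload sizes for the same audio. It is an illustration only: a synthetic 440 Hz tone stands in for real TTS output, the 44-byte WAV header is left out for simplicity, and `payload_sizes` is a hypothetical helper name.

```python
import base64
import json
import math
import struct


def payload_sizes(n_samples=16000):
    """Return (json_list_chars, base64_chars) for the same n_samples of audio."""
    # A synthetic 440 Hz tone at 16 kHz stands in for real TTS output.
    samples = [math.sin(2 * math.pi * 440 * i / 16000) for i in range(n_samples)]
    # Option 1: every float sample written out as JSON text.
    as_json_list = json.dumps({"snd": samples})
    # Option 2: 16-bit PCM bytes, base64-encoded into a single JSON string.
    pcm16 = struct.pack(f"<{n_samples}h", *(int(s * 32767) for s in samples))
    as_base64 = json.dumps({"snd": base64.b64encode(pcm16).decode("ascii")})
    return len(as_json_list), len(as_base64)


if __name__ == "__main__":
    json_chars, b64_chars = payload_sizes()
    print(f"JSON list: {json_chars} chars, base64 PCM16: {b64_chars} chars")
```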
