Convert mp4 sound to text in Python
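I want to convert a voice recording from Facebook Messenger to text.
Here is an example of an .mp4 file sent via the Facebook API:
https://cdn.fbsbx.com/v/t59.3654-21/15720510_10211855778255994_5430581267814940672_n.mp4/audioclip-1484407992000-3392.mp4?oh=a78286aa96c9dea29e5d07854194801c&oe=587C3833
The file contains only audio (no video), and I want to convert it to text.
I also want the conversion to be as fast as possible, because the resulting text will be used in a near-real-time application (i.e. the user sends an .mp4 file, and the script transcribes it and shows the text back).
I found this example: https://github.com/Uberi/speech_recognition/blob/master/examples/audio_transcribe.py
Here is the code I used: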
import requests
import speech_recognition as sr

url = 'https://cdn.fbsbx.com/v/t59.3654-21/15720510_10211855778255994_5430581267814940672_n.mp4/audioclip-1484407992000-3392.mp4?oh=a78286aa96c9dea29e5d07854194801c&oe=587C3833'

# Download the recording and save it to disk
response = requests.get(url)
with open("test.mp4", "wb") as handle:
    # iter_content() defaults to 1-byte chunks, so ask for larger ones
    for data in response.iter_content(chunk_size=8192):
        handle.write(data)

# Transcribe the saved file (opening the .mp4 is the step that fails)
r = sr.Recognizer()
with sr.AudioFile('test.mp4') as source:
    audio = r.record(source)

command = r.recognize_google(audio)
print(command)
But I get this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Asterios\Anaconda2\lib\site-packages\speech_recognition\__init__.py", line 200, in __enter__
    self.audio_reader = aifc.open(aiff_file, "rb")
  File "C:\Users\Asterios\Anaconda2\lib\aifc.py", line 952, in open
    return Aifc_read(f)
  File "C:\Users\Asterios\Anaconda2\lib\aifc.py", line 347, in __init__
    self.initfp(f)
  File "C:\Users\Asterios\Anaconda2\lib\aifc.py", line 298, in initfp
    chunk = Chunk(file)
  File "C:\Users\Asterios\Anaconda2\lib\chunk.py", line 63, in __init__
    raise EOFError
EOFError
Any ideas?
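Edit: I want to run the script on the pythonanywhere.com free plan, so I am not sure how I could install tools such as ffmpeg there.
Edit 2: If you run the script above with the URL replaced by "http://www.wavsource.com/snds_2017-01-08_2348563217987237/people/men/about_time.wav" and 'mp4' changed to 'wav', it works fine. So the problem is definitely the file format.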
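One way to confirm that the container format is the cause is to inspect the first bytes of the downloaded file. The sketch below is illustrative only: `sniff_audio_container` is a hypothetical helper, not part of speech_recognition, and it recognizes just a few well-known signatures (RIFF/WAVE, FORM/AIFF, fLaC, and the MP4 `ftyp` box).

```python
def sniff_audio_container(header):
    """Guess the container type from the first 12 bytes of a file.

    Hypothetical helper for illustration; it only checks a handful of
    common magic numbers and returns "unknown" for everything else.
    """
    # WAV: "RIFF" <4-byte size> "WAVE"
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # AIFF / AIFF-C: "FORM" <4-byte size> "AIFF" or "AIFC"
    if len(header) >= 12 and header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "aiff"
    # FLAC streams start with the marker "fLaC"
    if header[:4] == b"fLaC":
        return "flac"
    # MP4 files usually start with a box whose type field (bytes 4-8) is "ftyp"
    if len(header) >= 8 and header[4:8] == b"ftyp":
        return "mp4"
    return "unknown"


def sniff_file(path):
    """Read the first 12 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_audio_container(handle.read(12))
```

Assuming the downloaded test.mp4 begins with an `ftyp` box, `sniff_file("test.mp4")` would report an MP4 container. That is not one of the formats sr.AudioFile is documented to read (WAV, AIFF and FLAC), which is consistent with the failure inside `aifc.open` in the traceback.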
Use Python Video Converter to convert the file to WAV before passing it to speech_recognition:
https://github.com/senko/python-video-converter
import requests
import speech_recognition as sr
from converter import Converter

url = 'https://cdn.fbsbx.com/v/t59.3654-21/15720510_10211855778255994_5430581267814940672_n.mp4/audioclip-1484407992000-3392.mp4?oh=a78286aa96c9dea29e5d07854194801c&oe=587C3833'

# Download the recording and save it to disk
response = requests.get(url)
with open("/tmp/test.mp4", "wb") as handle:
    # iter_content() defaults to 1-byte chunks, so ask for larger ones
    for data in response.iter_content(chunk_size=8192):
        handle.write(data)

# Convert the mp4 container to a WAV file
c = Converter()
conv = c.convert('/tmp/test.mp4', '/tmp/test.wav', {
    'format': 'wav',
    'audio': {
        'codec': 'pcm',
        'samplerate': 44100,
        'channels': 2
    },
})

# convert() returns a generator; iterate over it to run the conversion
for timecode in conv:
    pass

# Transcribe the converted WAV file
r = sr.Recognizer()
with sr.AudioFile('/tmp/test.wav') as source:
    audio = r.record(source)

command = r.recognize_google(audio)
print(command)
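I finally found a solution. I am posting it here in case it helps someone in the future.
Fortunately, pythonanywhere.com comes with avconv pre-installed (avconv is similar to ffmpeg).
Here is some code that works: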
import urllib2
import speech_recognition as sr
import subprocess
import os

url = 'https://cdn.fbsbx.com/v/t59.3654-21/15720510_10211855778255994_5430581267814940672_n.mp4/audioclip-1484407992000-3392.mp4?oh=a78286aa96c9dea29e5d07854194801c&oe=587C3833'

# Download the recording and save it to disk
mp4file = urllib2.urlopen(url)
with open("test.mp4", "wb") as handle:
    handle.write(mp4file.read())

# Convert to WAV with avconv: -vn drops any video stream, -f wav forces WAV output
cmdline = ['avconv',
           '-i',
           'test.mp4',
           '-vn',
           '-f',
           'wav',
           'test.wav']
# check_call raises CalledProcessError if avconv exits with an error
subprocess.check_call(cmdline)

# Transcribe the converted WAV file
r = sr.Recognizer()
with sr.AudioFile('test.wav') as source:
    audio = r.record(source)

command = r.recognize_google(audio)
print(command)

# Clean up the temporary files
os.remove("test.mp4")
os.remove("test.wav")
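The conversion step can be factored out so that the same script works where only ffmpeg is installed. This is a minimal sketch, assuming Python 3 (for `shutil.which`) and assuming that avconv and ffmpeg both accept the `-y`, `-i`, `-vn` and `-f wav` options used here. The names `build_convert_cmd`, `pick_converter` and `convert_to_wav` are hypothetical helpers, not part of any library.

```python
import shutil
import subprocess


def build_convert_cmd(src, dst, tool="avconv"):
    """Build the command line that converts src to a WAV file at dst.

    -y overwrites dst if it already exists, -vn drops any video stream,
    and -f wav forces WAV output.
    """
    return [tool, "-y", "-i", src, "-vn", "-f", "wav", dst]


def pick_converter(candidates=("avconv", "ffmpeg")):
    """Return the first candidate tool found on PATH, or None."""
    for name in candidates:
        if shutil.which(name):
            return name
    return None


def convert_to_wav(src, dst):
    """Convert src to WAV using whichever converter is available."""
    tool = pick_converter()
    if tool is None:
        raise RuntimeError("neither avconv nor ffmpeg was found on PATH")
    # check_call raises CalledProcessError when the tool exits with an error
    subprocess.check_call(build_convert_cmd(src, dst, tool))
```

With these helpers, `convert_to_wav("test.mp4", "test.wav")` would stand in for the `cmdline` and `subprocess` lines above. Keeping the command builder separate from the call makes the argument list easy to inspect without running the converter.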
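On the free plan, cdn.fbsbx.com was not on pythonanywhere's site whitelist, so I could not download the content with urllib2. I contacted them, and they added the domain to the whitelist within 1-2 hours!
Many thanks and congratulations to them for the excellent service, even though I am on the free plan.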