How to use Google's Text-to-Speech API in Python
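I have my key ready to make requests and get speech from text from Google.
I tried these commands and many more.
The docs offer no straightforward way to get started with Python that I could find. I don't know where my API key goes along with the JSON and the URL.
One solution in their docs is for cURL, but it involves downloading a txt file after the request and sending it back to them to get the audio file. Is there a way to do this in Python that doesn't involve sending that txt file back?
I just want my list of strings returned as audio files.
(I put my actual key in the block above. I'm just not going to share it here.)
Found the answer, and lost the link among the 150 Google documentation pages I had open.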
# (Since I'm using a Jupyter Notebook)
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/Path/to/JSON/file/jsonfile.json"
from google.cloud import texttospeech

# Written for google-cloud-texttospeech 2.0 or later. Older releases exposed
# these classes under texttospeech.types and texttospeech.enums instead.

# Instantiates a client
client = texttospeech.TextToSpeechClient()
# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Hello, World!")
# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
    language_code='en-US',
    ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL)
# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3)
# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config)
# The response's audio_content is binary.
with open('output.mp3', 'wb') as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')
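My time-consuming pursuit was trying to send the request as JSON from Python myself, but going through Google's own module works fine.
Note that the default voice gender is 'neutral'.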
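The question asked for a list of strings to come back as audio files. A minimal sketch of that loop follows, assuming a `synthesize` callable that takes text and returns audio bytes. The names `save_all` and `fake_synthesize` are hypothetical and not part of Google's library; with the client above, the callable could wrap `client.synthesize_speech(...)` and return `response.audio_content`.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def save_all(texts: Iterable[str],
             synthesize: Callable[[str], bytes],
             out_dir: str = ".") -> List[Path]:
    """Write one MP3 per input string and return the paths written.

    `synthesize` is a hypothetical hook: any function mapping text to
    audio bytes, for example a wrapper around the Google client call.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for index, text in enumerate(texts):
        # Zero-padded index keeps the files sorted in input order.
        path = target / f"output_{index:03d}.mp3"
        path.write_bytes(synthesize(text))
        written.append(path)
    return written


def fake_synthesize(text: str) -> bytes:
    # Stand-in for the real API call so the sketch runs offline.
    return text.encode("utf-8")
```

Passing the synthesis step in as a function keeps the file handling separate from the API call, so the loop can be tried without credentials or network access.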
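Configure the Python app for the JSON file and install the client library
- Create a service account
- Create a service account key for that service account
- The JSON file downloads; save it securely
- Include the Google application credentials in your Python app
- Install the library:
pip install --upgrade google-cloud-texttospeech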
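Before calling the API, it can help to confirm that the credentials step in the list above actually took effect. The following is a rough sketch, assuming the downloaded key is a JSON object containing the fields named in `REQUIRED_FIELDS`; the helper names are hypothetical and the field list is an assumption about the key file layout, not something the client library exposes.

```python
import json
import os

# Fields assumed to be present in a service account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def missing_credential_fields(path: str) -> list:
    """Return the assumed-required fields absent from the key file."""
    with open(path, "r", encoding="utf-8") as handle:
        key_data = json.load(handle)
    return [field for field in REQUIRED_FIELDS if field not in key_data]


def check_environment() -> str:
    """Describe the state of GOOGLE_APPLICATION_CREDENTIALS."""
    path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return f"no file at {path}"
    missing = missing_credential_fields(path)
    if missing:
        return "missing fields: " + ", ".join(missing)
    return "ok"
```

Calling `check_environment()` in a notebook cell before creating the client gives a readable message instead of an authentication error later on.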
Use Google's Python examples, found at:
https://cloud.google.com/text-to-speech/docs/reference/libraries
and
https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/texttospeech/cloud-client/quickstart.py
Note: Google's example does not include the name parameter correctly.
Below is the example modified to use the Google application credentials and a female WaveNet voice.
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/home/yourproject-12345.json"
from google.cloud import texttospeech

# Written for google-cloud-texttospeech 2.0 or later. Older releases exposed
# these classes under texttospeech.types and texttospeech.enums instead.

# Instantiates a client
client = texttospeech.TextToSpeechClient()
# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Do no evil!")
# Build the voice request, select the language code ("en-US")
# ****** the NAME
# and the ssml voice gender ("female")
voice = texttospeech.VoiceSelectionParams(
    language_code='en-US',
    name='en-US-Wavenet-C',
    ssml_gender=texttospeech.SsmlVoiceGender.FEMALE)
# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3)
# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config)
# The response's audio_content is binary.
with open('output.mp3', 'wb') as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')
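Voices, name, language code, SSML gender, etc.
List of voices: https://cloud.google.com/text-to-speech/docs/voices
In the code example above, I changed the voice from Google's example code to include the name parameter, to use a WaveNet voice (much improved but more expensive, $16 per million characters), and to set the SSML gender to FEMALE.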
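# google-cloud-texttospeech 2.0 or later; older releases used
# texttospeech.types.VoiceSelectionParams and texttospeech.enums.SsmlVoiceGender.
voice = texttospeech.VoiceSelectionParams(
    language_code='en-US',
    name='en-US-Wavenet-C',
    ssml_gender=texttospeech.SsmlVoiceGender.FEMALE)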
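Since the WaveNet voice is billed per character, a quick estimate before synthesizing a long list of strings may be useful. This is only a back-of-the-envelope sketch: the function name is hypothetical, the default rate is simply the $16 per million characters quoted above, and it assumes each Python character counts as one billed character, which may not match how Google counts or any free tier.

```python
def estimate_cost_usd(texts, usd_per_million_chars=16.0):
    """Estimate the charge for synthesizing every string in `texts`.

    The default rate is the WaveNet figure quoted in this answer; treat
    it as an assumption and check current pricing before relying on it.
    """
    total_chars = sum(len(text) for text in texts)
    return total_chars * usd_per_million_chars / 1_000_000
```

For example, `estimate_cost_usd(my_strings)` gives a figure in US dollars for the whole list under those assumptions.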
If you want to avoid using the Google Python API, you can simply do this:
import requests

url = "https://texttospeech.googleapis.com/v1beta1/text:synthesize"
text = "This is a text"
data = {
    "input": {"text": text},
    "voice": {"name": "fr-FR-Wavenet-A", "languageCode": "fr-FR"},
    "audioConfig": {"audioEncoding": "MP3"},
}
headers = {"content-type": "application/json", "X-Goog-Api-Key": "YOUR_API_KEY"}
r = requests.post(url=url, json=data, headers=headers)
r.raise_for_status()  # stop on an HTTP error instead of parsing an error body
content = r.json()  # parsed JSON response body
This is similar to what you did, but you need to include the API key.
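The REST response is JSON rather than raw audio, so one more step is needed to get a playable file. A minimal sketch, assuming the parsed response carries the audio as a base64 string under the `audioContent` key; the helper name `save_audio` is hypothetical.

```python
import base64


def save_audio(response_json: dict, path: str = "output.mp3") -> int:
    """Decode the base64 `audioContent` field and write it to `path`.

    Returns the number of bytes written.
    """
    audio_bytes = base64.b64decode(response_json["audioContent"])
    with open(path, "wb") as out:
        out.write(audio_bytes)
    return len(audio_bytes)
```

With the request above, `save_audio(content)` would write `output.mp3` to the current directory. A missing `audioContent` key raises `KeyError`, which usually means the request itself failed.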