recognize_google(audio) filters out bad words

I ran into this problem with the Google speech_recognition API. It automatically filters out bad words and returns strings like "F***" or "P******".

Here is my code. It runs without errors, but please help me get the original, unfiltered transcription of my audio.

    from gtts import gTTS
    import speech_recognition as sr

    r = sr.Recognizer()

    with sr.Microphone() as source:
        print('Ready...')
        r.pause_threshold = 1
        r.adjust_for_ambient_noise(source, duration=1)
        audio = r.listen(source)

        command = r.recognize_google(audio).lower()
        print('You said: ' + command + '\n')

profanity_filter

Optional. If set to true, the server will attempt to filter out profanities, replacing all but the initial character in each filtered word with asterisks, e.g. “f***”. If set to false or omitted, profanities won’t be filtered out.
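For the `speech_recognition` wrapper used in the question, newer releases of the SpeechRecognition package appear to expose a similar switch as a `pfilter` keyword on `recognize_google` (0 for no filtering, 1 for masking). The following is a minimal sketch under that assumption, not a tested fix; `transcribe_unfiltered` and `main` are hypothetical helper names, and an older install that lacks `pfilter` would raise a `TypeError` on the keyword.

```python
def transcribe_unfiltered(recognizer, audio, language="en-US"):
    """Return the lowercased transcript with the profanity filter turned off.

    `recognizer` is expected to be an sr.Recognizer and `audio` an sr.AudioData.
    """
    # Assumption: the installed SpeechRecognition release accepts `pfilter`.
    # pfilter=0 requests no filtering; pfilter=1 masks all but the first letter.
    text = recognizer.recognize_google(audio, language=language, pfilter=0)
    return text.lower()


def main():
    # Imported here so the helper above can be used without audio hardware.
    import speech_recognition as sr

    r = sr.Recognizer()
    r.pause_threshold = 1
    with sr.Microphone() as source:
        print('Ready...')
        r.adjust_for_ambient_noise(source, duration=1)
        audio = r.listen(source)

    print('You said: ' + transcribe_unfiltered(r, audio) + '\n')
```

Calling `main()` records from the default microphone as in the question. If the transcript still comes back masked, checking the installed package version would be the first thing to try.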

Search: https://googlecloudplatform.github.io/google-cloud-python/latest/search.html?q=profanity_filter&check_keywords=yes&area=default

Example:

https://googlecloudplatform.github.io/google-cloud-python/latest/speech/index.html?highlight=profanity_filter#synchronous-recognition

Example of using the profanity filter.

>>> from google.cloud import speech
>>> client = speech.SpeechClient()
>>> results = client.recognize(
...     audio=speech.types.RecognitionAudio(
...         uri='gs://my-bucket/recording.flac',
...     ),
...     config=speech.types.RecognitionConfig(
...         encoding='LINEAR16',
...         language_code='en-US',
...         profanity_filter=True,
...         sample_rate_hertz=44100,
...     ),
... )
>>> for result in results:
...     for alternative in result.alternatives:
...         print('=' * 20)
...         print('transcript: ' + alternative.transcript)
...         print('confidence: ' + str(alternative.confidence))
====================
transcript: Hello, this is a f****** test
confidence: 0.81

Nice example ;-)

(I haven't tested it)