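Using Google Speech-to-Text, I only get a partial transcription. Input file: the sample audio file from Google.
Link to its location in the Google repo: commercial_mono.wav
Here is my code: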
    def transcribe_gcs(gcs_uri):
        from google.cloud import speech_v1p1beta1 as speech
        from google.cloud.speech_v1p1beta1 import enums
        from google.cloud.speech_v1p1beta1 import types
        client = speech.SpeechClient()
        audio = types.RecognitionAudio(uri=gcs_uri)
        config = speech.types.RecognitionConfig(language_code='en-US', enable_speaker_diarization=True, diarization_speaker_count=2)
        operation = client.long_running_recognize(config, audio)
        print('Waiting for operation to complete...')
        response = operation.result(timeout=5000)
        result = response.results[-1]
        words_info = result.alternatives[0].words
        tag = 1
        speaker = " "
        for word_info in words_info:
            if word_info.speaker_tag == tag:
                speaker = speaker + " " + word_info.word
            else:
                print("speaker {}: {}".format(tag, speaker))
                tag = word_info.speaker_tag
                speaker = " " + word_info.word

Here is how I call the script:

    transcribe_gcs('gs://mybucket0000t/commercial_mono.wav')

I only get part of the transcription of the whole audio file:
    (venv3) ➜ g-transcribe git:(master) ✗ python gtranscribeWithDiarization.py
    Waiting for operation to complete...
    speaker 1: I'm here
    speaker 2: hi I'd like to buy a Chrome Cast and I was wondering whether you could help me

That is all I get.
If I run the code several times, after 5 or 6 runs I no longer get any transcription at all.
Here is the result after a few attempts:
    (venv3) ➜ g-transcribe git:(master) ✗ python gtranscribeWithDiarization.py
    Waiting for operation to complete...
    speaker 1:
    (venv3) ➜ g-transcribe git:(master) ✗

Environment: Python 3
I am trying to get the whole transcription, with a timestamp each time the speaker changes.
Expected output:

    Speaker 1: Start Time 0.0001: Hello transcription starts
    Speaker 2: Start Time 0.0009: Here starts with the transcription of the 2nd speaker and so on to the end of file.

I hope you can help.
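One detail of the loop in the question is worth noting: it prints a segment only when the speaker tag changes, so the final run of words is never printed, and a transcript with a single speaker prints nothing. The sketch below shows one way the grouping could be written so that the last segment is flushed and each segment carries its start time. `Word` is a hypothetical stand-in for the API's word objects so the example runs offline, and `start_seconds` is assumed to be the word's `start_time` already converted to float seconds; `group_by_speaker` and `format_segments` are illustrative names, not part of the library.

```python
from collections import namedtuple

# Hypothetical stand-in for the API's per-word objects (word, speaker_tag, start time).
Word = namedtuple("Word", ["word", "speaker_tag", "start_seconds"])


def group_by_speaker(words):
    """Collapse consecutive words with the same speaker tag into segments.

    Returns a list of (speaker_tag, start_seconds, text) tuples.
    """
    segments = []
    current_tag = None
    current_start = None
    current_words = []
    for w in words:
        if w.speaker_tag != current_tag:
            # Speaker changed: close the previous segment, if there is one.
            if current_words:
                segments.append((current_tag, current_start, " ".join(current_words)))
            current_tag = w.speaker_tag
            current_start = w.start_seconds
            current_words = []
        current_words.append(w.word)
    # Flush the final segment; without this the last speaker's words are lost.
    if current_words:
        segments.append((current_tag, current_start, " ".join(current_words)))
    return segments


def format_segments(segments):
    """Render segments in the 'Speaker N: Start Time T: text' shape from the question."""
    return [
        "Speaker {}: Start Time {:.1f}: {}".format(tag, start, text)
        for tag, start, text in segments
    ]


if __name__ == "__main__":
    demo = [
        Word("hi", 1, 0.0),
        Word("there", 1, 0.4),
        Word("hello", 2, 1.2),
        Word("bye", 1, 2.0),
    ]
    for line in format_segments(group_by_speaker(demo)):
        print(line)
```

With a real response, the input would be built from `result.alternatives[0].words`, mapping each item's `word`, `speaker_tag`, and `start_time` into this shape.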
Posted on 2019-02-25 13:30:00
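I haven't had any experience with v1p1beta so far.
Suggestion 1: It may be an obvious suggestion, but does your project have "data logging" enabled? It is required in order to use the more advanced features/models. Maybe give it a try? You can turn it off again after testing if it doesn't change your results.
Data logging reference: https://cloud.google.com/speech-to-text/docs/data-logging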
Suggestion 2: Try using the following line:

    client = speech_v1p1beta1.SpeechClient()

Suggestion 3: Try adding the sample rate to the config:

    sample_rate_hertz = 44100

https://stackoverflow.com/questions/54696360
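For suggestion 3, the value should match the audio itself, and a rate that disagrees with the WAV header is likely to be rejected. A minimal sketch for checking the rate of a local copy of the file with Python's standard `wave` module follows; `wav_sample_rate` is an illustrative helper name and the local path is an assumption (the question reads the file from a `gs://` bucket, so it would need to be downloaded first).

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in a local WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()
```

The returned number is what would go into `sample_rate_hertz`, for example `wav_sample_rate("commercial_mono.wav")` for a local download of the sample file.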