
Q: How do I get the complete transcript using google.cloud.speech_v1p1beta1?

Stack Overflow user

Asked 2019-02-14 17:51:20

Answers: 1 · Views: 579 · Followers: 0 · Votes: 0

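Using Google Speech-to-Text, I only get a partial transcription. Input file: an audio file from the Google samples.

Link to the Google repo location: commercial_mono.wav

Here is my code: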

def transcribe_gcs(gcs_uri):
    from google.cloud import speech_v1p1beta1 as speech
    from google.cloud.speech_v1p1beta1 import enums
    from google.cloud.speech_v1p1beta1 import types

    client = speech.SpeechClient()
    audio = types.RecognitionAudio(uri=gcs_uri)
    config = speech.types.RecognitionConfig(
        language_code='en-US',
        enable_speaker_diarization=True,
        diarization_speaker_count=2)
    operation = client.long_running_recognize(config, audio)

    print('Waiting for operation to complete...')
    response = operation.result(timeout=5000)
    result = response.results[-1]

    words_info = result.alternatives[0].words

    tag = 1
    speaker = " "

    for word_info in words_info:
        if word_info.speaker_tag == tag:
            speaker = speaker + " " + word_info.word
        else:
            print("speaker {}: {}".format(tag, speaker))
            tag = word_info.speaker_tag
            speaker = " " + word_info.word

Here is how I call the script:

transcribe_gcs('gs://mybucket0000t/commercial_mono.wav')

I get only a partial transcription of the whole audio file:

(venv3) ➜  g-transcribe git:(master) ✗ python gtranscribeWithDiarization.py
Waiting for operation to complete...
speaker 1:   I'm here
speaker 2:  hi I'd like to buy a Chrome Cast and I was wondering whether you could help me

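That is all I get.

If I run the code several times, after 5 or 6 runs I no longer receive any transcription at all.

Here is the result after several attempts: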

(venv3) ➜  g-transcribe git:(master) ✗ python gtranscribeWithDiarization.py

Waiting for operation to complete...
speaker 1:  

(venv3) ➜  g-transcribe git:(master) ✗

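Environment: Python 3

Using a Google service account; there are no connectivity problems.
I also copied the file to Google Cloud Storage and confirmed that I can play it.
I tried converting the file from WAV to FLAC, but the result is the same.
I used ffprobe to make sure the file has only one channel.

I am trying to get the whole transcript, with a timestamp each time the speaker changes.

Expected output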

Speaker 1: Start Time 0.0001: Hello transcription starts
Speaker 2: Start Time 0.0009: Here starts with the transcription of the 2nd speaker and so on to the end of file.

I hope you can help.

google-cloud-platform

speech-to-text

google-speech-api

google-cloud-speech

python-3.x

1 Answer

Stack Overflow user

Answered 2019-02-25 13:30:00

So far, I have not had any problems with v1p1beta.

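Suggestion 1: This may be obvious, but does your project have "data logging" enabled? It is required in order to use the more advanced features and models. Maybe give it a try? If it does not change your results, you can turn it off again after testing.

Data logging reference: https://cloud.google.com/speech-to-text/docs/data-logging

Suggestion 2: Try using the following line: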

client = speech_v1p1beta1.SpeechClient()

Suggestion 3: Try adding the sample rate to the config:

sample_rate_hertz = 44100
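
The value 44100 is only an example; a rate that does not match the file may cause its own problems. As a minimal sketch, assuming a local copy of the audio is available, the rate and channel count can be read from the WAV header with Python's standard wave module instead of being guessed. The helper name wav_format is hypothetical and is not part of the Speech client library.

```python
import wave


def wav_format(path):
    """Return (sample_rate_hz, channels) read from a WAV file's header.

    Hypothetical helper for illustration; it only inspects the local
    file and does not call the Speech-to-Text API.
    """
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate(), wav_file.getnchannels()
```

With a local copy of the sample file, wav_format("commercial_mono.wav") would return the two values, and the first could then be passed as sample_rate_hertz in the RecognitionConfig from the question.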

Votes: 0
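
A further point, independent of the suggestions above: the loop in the question prints a segment only when the speaker tag changes, so the words of the last speaker are never printed, and if every word carries the same tag nothing but the initial empty string is shown. Whether this explains the whole symptom is not confirmed here. The sketch below groups consecutive words by speaker and flushes the final segment. Word is a hypothetical stand-in for the API's word objects, and start_seconds is assumed to be a float already converted from the API's start_time duration field.

```python
from collections import namedtuple

# Hypothetical stand-in for the API's word objects; only the fields used
# below are modelled. start_seconds is assumed to be a plain float.
Word = namedtuple("Word", ["word", "speaker_tag", "start_seconds"])


def group_by_speaker(words):
    """Collapse consecutive words with the same speaker tag into segments.

    Returns a list of (speaker_tag, start_seconds, text) tuples.
    """
    segments = []
    current_tag = None
    current_start = None
    current_words = []
    for w in words:
        if w.speaker_tag != current_tag:
            # Speaker changed: close the previous segment, if there is one.
            if current_words:
                segments.append(
                    (current_tag, current_start, " ".join(current_words)))
            current_tag = w.speaker_tag
            current_start = w.start_seconds
            current_words = []
        current_words.append(w.word)
    # Flush the last segment; the loop alone never emits it.
    if current_words:
        segments.append((current_tag, current_start, " ".join(current_words)))
    return segments


def format_segments(segments):
    """Render segments in the shape of the expected output in the question."""
    return ["Speaker {}: Start Time {:.1f}: {}".format(tag, start, text)
            for tag, start, text in segments]
```

To use it with a real response, each word_info from result.alternatives[0].words would first be mapped to a Word, converting its start_time to seconds in whatever way the installed library version requires.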

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/54696360
