The application can be viewed on Hugging Face: https://huggingface.co/spaces/rowel/asr
import gradio as gr
from transformers import pipeline

model = pipeline(task="automatic-speech-recognition",
                 model="facebook/s2t-medium-librispeech-asr")
gr.Interface.from_pipeline(model,
                           title="Automatic Speech Recognition (ASR)",
                           description="Using pipeline with Facebook S2T for ASR.",
                           examples=['data/ljspeech.wav',]
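).launch()

I don't know where those few lines of code store the transcribed text. I want to store the sentence text in a string.
To be honest, I only know basic Python programming. I just want to store the transcriptions in string variables and perform some operations on them.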
Posted on 2022-05-03 05:36:32
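You can unwrap the Interface.from_pipeline abstraction and define your own Gradio interface. You need to define your own inputs, outputs, and prediction function, which gives you access to the model's text prediction. Here is an example.
You can test it here: https://huggingface.co/spaces/radames/Speech-Recognition-Example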
import gradio as gr
from transformers import pipeline

model = pipeline(task="automatic-speech-recognition",
                 model="facebook/s2t-medium-librispeech-asr")

def predict_speech_to_text(audio):
    # audio is a file path because the input below uses type="filepath"
    prediction = model(audio)
    # text variable contains your voice-to-text string
    text = prediction['text']
    return text

# gr.inputs / gr.outputs are the older Gradio API in use when this was written;
# newer releases expose gr.Audio and gr.Textbox instead.
gr.Interface(fn=predict_speech_to_text,
             title="Automatic Speech Recognition (ASR)",
             inputs=gr.inputs.Audio(
                 source="microphone", type="filepath", label="Input"),
             outputs=gr.outputs.Textbox(label="Output"),
             description="Using pipeline with Facebook S2T for ASR.",
             examples=['ljspeech.wav'],
             allow_flagging='never'
             ).launch()

https://stackoverflow.com/questions/71568142
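Since the question is about getting the transcript into an ordinary string and working with it, here is a minimal sketch of that step on its own, without the Gradio UI. It assumes only what the answer above shows: that calling the pipeline returns a dict with a 'text' key. The helper names transcribe and describe_transcript are hypothetical and made up for illustration, as is the choice of "operations" (lower-casing and counting words).

```python
from typing import Callable, Dict


def transcribe(asr: Callable[[str], Dict[str, str]], audio_path: str) -> str:
    """Run an ASR callable on an audio file and return the transcript as a str.

    `asr` is assumed to behave like the `model` pipeline object above:
    called with a file path, it returns a dict such as {"text": "..."}.
    """
    prediction = asr(audio_path)
    return prediction["text"]


def describe_transcript(text: str) -> Dict[str, object]:
    """Hypothetical post-processing: normalise the string and collect simple stats."""
    cleaned = text.strip().lower()
    words = cleaned.split()
    return {
        "text": cleaned,
        "word_count": len(words),
        "first_word": words[0] if words else "",
    }


# With the real pipeline from the answer, usage would look roughly like:
#   text = transcribe(model, "ljspeech.wav")
#   info = describe_transcript(text)
```

The same post-processing could also be called inside predict_speech_to_text before the return statement, so the Gradio textbox would show the processed string instead of the raw one.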