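When I run the code below through speech-to-text (STT), it returns only the first letter/word of the entire audio file.
The audio contains "A, B, C, D, E, F".
What am I missing?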
Imports Microsoft.CognitiveServices.Speech
Imports Microsoft.CognitiveServices.Speech.SpeechConfig
Imports Microsoft.CognitiveServices.Speech.Audio

Module Module1
    Sub Main()
        Dim SpeechConfig As SpeechConfig = FromSubscription("<CHANGED>", "eastus")
        Dim audioConfig As Audio.AudioConfig = Audio.AudioConfig.FromWavFileInput("<CHANGED>.wav")
        SpeechConfig.OutputFormat = Microsoft.CognitiveServices.Speech.OutputFormat.Detailed
        Dim recognizer As New SpeechRecognizer(SpeechConfig, audioConfig)
        Dim result = recognizer.RecognizeOnceAsync().Result
        Select Case result.Reason
            Case ResultReason.RecognizedSpeech
                Console.WriteLine($"RECOGNIZED: Text={result.Text}")
                Console.WriteLine($" Intent not recognized.")
            Case ResultReason.NoMatch
                Console.WriteLine($"NOMATCH: Speech could not be recognized.")
            Case ResultReason.Canceled
                Dim cancellation = CancellationDetails.FromResult(result)
                Console.WriteLine($"CANCELED: Reason={cancellation.Reason}")
                If cancellation.Reason = CancellationReason.[Error] Then
                    Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}")
                    Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}")
                    Console.WriteLine($"CANCELED: Did you update the subscription info?")
                End If
        End Select
    End Sub
End Module

You can download the audio file from GitHub here: https://github.com/ullfindsmit/StackOverflowAssets/blob/master/abcdef.wav
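Also, if you know where I can get more detailed STT data, I would appreciate it. What I am looking for is JSON output that gives the start and end times along with the words and/or sentences.
Thank you very much for your help.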
Update: the async event handlers did not work for me for some reason, but the code below did work:
While True
    Dim result = recognizer.RecognizeOnceAsync().Result
    Select Case result.Reason
        Case ResultReason.RecognizedSpeech
            Console.WriteLine($"RECOGNIZED: Text={result.Text}")
            Console.WriteLine($" Intent not recognized.")
        Case ResultReason.NoMatch
            Console.WriteLine($"NOMATCH: Speech could not be recognized.")
        Case ResultReason.Canceled
            Dim cancellation = CancellationDetails.FromResult(result)
            Console.WriteLine($"CANCELED: Reason={cancellation.Reason}")
            If cancellation.Reason = CancellationReason.[Error] Then
                Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}")
                Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}")
                Console.WriteLine($"CANCELED: Did you update the subscription info?")
            End If
            Exit While
    End Select
End While

Posted on 2020-05-28 16:05:59
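The RecognizeOnceAsync method recognizes only "once": the first utterance/phrase contained in the audio data file. If you want to recognize more than one phrase, you can do one of two things:
1. Call RecognizeOnceAsync repeatedly. After the last phrase has been recognized, the next call to the method returns a result whose result.Reason is Canceled.
2. Switch from RecognizeOnceAsync to StartContinuousRecognitionAsync, and hook an event handler up to the Recognizing event. The event callback lets you see the result by inspecting the SpeechRecognitionEventArgs that is passed in, as in e.Result.
You can see both behaviors by running the Speech CLI like this: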
spx recognize --once+ --key YOUR-KEY --region YOUR-REGION --file "https://github.com/ullfindsmit/StackOverflowAssets/blob/master/abcdef.wav"
spx recognize --continuous --key YOUR-KEY --region YOUR-REGION --file "https://github.com/ullfindsmit/StackOverflowAssets/blob/master/abcdef.wav"

You can download the Speech CLI here: https://aka.ms/speech/spx-zips.zip
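
For the second option (continuous recognition), a minimal VB.NET sketch could look like the one below. It is an illustration under stated assumptions, not tested against the asker's project: the module name, the RecognizeFileAsync helper, the "YOUR-KEY" / "YOUR-REGION" placeholders, and the local file name abcdef.wav are all made up for the example. It subscribes to the Recognized event, which carries the final result for each phrase, whereas Recognizing reports intermediate hypotheses. It also prints the raw JSON the service returns for each phrase, which, with detailed output and word-level timestamps requested, should include Offset and Duration values and may cover the timing data the question asks about.

```vb
Imports System.Threading.Tasks
Imports Microsoft.CognitiveServices.Speech
Imports Microsoft.CognitiveServices.Speech.Audio

Module ContinuousRecognitionSketch
    Sub Main()
        ' Block the console app until recognition of the whole file has finished.
        RecognizeFileAsync().Wait()
    End Sub

    ' Hypothetical helper: recognizes every phrase in a WAV file.
    Async Function RecognizeFileAsync() As Task
        ' Placeholders: substitute your own subscription key and region.
        Dim config = SpeechConfig.FromSubscription("YOUR-KEY", "YOUR-REGION")
        config.OutputFormat = OutputFormat.Detailed
        ' Ask the service to include per-word timing in the detailed JSON.
        config.RequestWordLevelTimestamps()

        ' Assumed local copy of the audio file from the question.
        Using audioInput = AudioConfig.FromWavFileInput("abcdef.wav")
            Using recognizer As New SpeechRecognizer(config, audioInput)
                ' Completed when the session ends or is canceled (e.g. end of file).
                Dim done As New TaskCompletionSource(Of Boolean)()

                ' Fired once per recognized phrase with the final result.
                AddHandler recognizer.Recognized,
                    Sub(sender, e)
                        If e.Result.Reason = ResultReason.RecognizedSpeech Then
                            Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}")
                            ' Raw service response as a JSON string.
                            Console.WriteLine(e.Result.Properties.GetProperty(
                                PropertyId.SpeechServiceResponse_JsonResult))
                        End If
                    End Sub

                ' Fired on errors and when the end of the audio file is reached.
                AddHandler recognizer.Canceled,
                    Sub(sender, e)
                        Console.WriteLine($"CANCELED: Reason={e.Reason}")
                        done.TrySetResult(True)
                    End Sub

                AddHandler recognizer.SessionStopped,
                    Sub(sender, e) done.TrySetResult(True)

                Await recognizer.StartContinuousRecognitionAsync()
                Await done.Task
                Await recognizer.StopContinuousRecognitionAsync()
            End Using
        End Using
    End Function
End Module
```

The TaskCompletionSource is there only to keep the program alive until the file has been fully processed; without some form of waiting, Main would likely return before any event fires.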
https://stackoverflow.com/questions/62069005