Q: Long audio speech recognition on Android

Stack Overflow user

Asked 2015-10-26 10:53:52

3 answers · 5.1K views · 0 followers · 4 votes

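I want to develop a module that uses speech-to-text on Android. I found plenty of documentation and demos related to RecognizerIntent and similar APIs, but all of those demos only capture about 10 seconds of speech. I need my demo to run for 5-10 minutes or more. It does not have to work offline, because my app is always online.

I have also looked at Pocketsphinx on Android, but the results were not good. It also only supports Android Studio, not Eclipse.

I have seen many apps that convert speech to text continuously for 5-10 minutes, for example Speech To Text Notepad.

Can anyone suggest another demo codebase that achieves this? TIA.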

android

speech-recognition

speech-to-text

pocketsphinx-android

3 Answers

Stack Overflow user

Accepted answer

Posted 2018-06-12 06:20:32

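I managed to do this with the help of the Google Speech API. They have also published a demo.

Google Speech-to-Text lets developers convert audio to text by applying powerful neural network models through an easy-to-use API. The API recognizes 120 languages and variants to support a global user base. You can enable voice command-and-control, transcribe audio from call centers, and more. It can process real-time streaming audio or prerecorded audio using Google's machine learning technology.

You can transcribe what users dictate into an application's microphone, enable command-and-control through voice, or transcribe audio files, among many other use cases. It recognizes audio uploaded in the request and integrates with your audio storage on Google Cloud Storage, using the same technology Google uses to power its own products.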
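Streaming recognition generally means sending audio in small pieces rather than as one file. The following is a minimal sketch of that preparation step only, not of the Google client itself: `PcmChunker`, the 16-bit mono PCM format, and the 100 ms chunk length are all assumptions made for illustration and do not come from the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits raw PCM audio into fixed-duration chunks. */
final class PcmChunker {
    private PcmChunker() {}

    /**
     * Splits 16-bit mono PCM bytes into chunks of chunkMillis each.
     * The last chunk may be shorter than the others.
     */
    static List<byte[]> split(byte[] pcm, int sampleRateHz, int chunkMillis) {
        // 2 bytes per sample for 16-bit mono audio (an assumed format).
        int bytesPerChunk = (int) ((long) sampleRateHz * chunkMillis / 1000) * 2;
        if (bytesPerChunk <= 0) {
            throw new IllegalArgumentException("chunk must hold at least one sample");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int start = 0; start < pcm.length; start += bytesPerChunk) {
            int end = Math.min(pcm.length, start + bytesPerChunk);
            chunks.add(Arrays.copyOfRange(pcm, start, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] oneSecond = new byte[16000 * 2]; // one second of silence at 16 kHz
        List<byte[]> chunks = split(oneSecond, 16000, 100);
        // prints "10 chunks of 3200 bytes"
        System.out.println(chunks.size() + " chunks of " + chunks.get(0).length + " bytes");
    }
}
```

Each chunk would then be handed to whatever streaming client the app uses; the right chunk size depends on that client's own recommendations.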

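Votes: -1

Stack Overflow user

Posted 2017-08-23 07:38:14

Please refer to "Android Speech Recognition Without Dialog In A Custom Activity".

Try overriding the onEndOfSpeech method and restarting the service with speechRecognizer.startListening(recognizerIntent).

I got the same result as the app you mentioned, Speech To Text Notepad. Here is my activity: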

import java.util.ArrayList;
import android.speech.RecognitionListener;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.util.Log;
import android.view.View;
import android.view.WindowManager;
import android.widget.CompoundButton;
import android.widget.CompoundButton.OnCheckedChangeListener;
import android.widget.ProgressBar;
import android.widget.TextView;
import android.widget.ToggleButton;

public class VoiceRecognitionActivity extends Activity implements
        RecognitionListener {

    private TextView returnedText;
    private ToggleButton toggleButton;
    private ProgressBar progressBar;
    private SpeechRecognizer speech = null;
    private Intent recognizerIntent;
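    private static final String LOG_TAG = "VoiceRecognition";
    // Text accumulated from all final results so far.
    String speechString = "";
    // True between onBeginningOfSpeech and onEndOfSpeech.
    boolean spechStarted = false;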

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_voice_recognition);
        getWindow().addFlags(WindowManager.LayoutParams.FLAG_KEEP_SCREEN_ON);
        returnedText = (TextView) findViewById(R.id.textView1);
        progressBar = (ProgressBar) findViewById(R.id.progressBar1);
        toggleButton = (ToggleButton) findViewById(R.id.toggleButton1);

        progressBar.setVisibility(View.INVISIBLE);
        speech = SpeechRecognizer.createSpeechRecognizer(this);
        speech.setRecognitionListener(this);
        recognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_PREFERENCE,
                "en");
        recognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,
                this.getPackageName());
        recognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_WEB_SEARCH);

        recognizerIntent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS,
                true);

        toggleButton.setOnCheckedChangeListener(new OnCheckedChangeListener() {

            @Override
            public void onCheckedChanged(CompoundButton buttonView,
                                         boolean isChecked) {
                if (isChecked) {
                    speech.setRecognitionListener(VoiceRecognitionActivity.this);
                    progressBar.setVisibility(View.VISIBLE);
                    progressBar.setIndeterminate(true);
                    speech.startListening(recognizerIntent);
                } else {
                    progressBar.setIndeterminate(false);
                    progressBar.setVisibility(View.INVISIBLE);
                    speech.stopListening();
                    speech.destroy();

                }
            }
        });

    }

    @Override
    protected void onPause() {
        super.onPause();
        if (speech != null) {
            speech.destroy();
            Log.i(LOG_TAG, "destroy");
        }

    }

    @Override
    public void onBeginningOfSpeech() {
        Log.i(LOG_TAG, "onBeginningOfSpeech");
        spechStarted = true;
        progressBar.setIndeterminate(false);
        progressBar.setMax(10);
    }

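    @Override
    public void onBufferReceived(byte[] buffer) {
        // Log the buffer size; concatenating the array itself only prints its identity hash.
        Log.i(LOG_TAG, "onBufferReceived: " + (buffer == null ? 0 : buffer.length) + " bytes");
    }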

    @Override
    public void onEndOfSpeech() {
        // Restart listening as soon as an utterance ends so that dictation continues.
        spechStarted = false;
        Log.i(LOG_TAG, "onEndOfSpeech");
        speech.startListening(recognizerIntent);

    }

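    @Override
    public void onError(int errorCode) {
        // Include the error code (one of the SpeechRecognizer.ERROR_* constants) in the log.
        Log.d(LOG_TAG, "FAILED " + errorCode);
        // Restart only when no utterance is currently in progress.
        if (!spechStarted)
            speech.startListening(recognizerIntent);

    }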

    @Override
    public void onEvent(int arg0, Bundle arg1) {
        Log.i(LOG_TAG, "onEvent");
    }

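    @Override
    public void onPartialResults(Bundle arg0) {
        Log.i(LOG_TAG, "onPartialResults");

        ArrayList<String> matches = arg0
                .getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);

        // Partial results can be missing or empty, so guard before reading the first match.
        if (matches != null && !matches.isEmpty()) {
            returnedText.setText(speechString + matches.get(0));
        }
    }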

    @Override
    public void onReadyForSpeech(Bundle arg0) {
        Log.i(LOG_TAG, "onReadyForSpeech");
    }

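    @Override
    public void onResults(Bundle results) {
        Log.i(LOG_TAG, "onResults");
        ArrayList<String> matches = results
                .getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
        // Append the best match to the running transcript, if there is one.
        if (matches != null && !matches.isEmpty()) {
            speechString = speechString + ". " + matches.get(0);
        }
    }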

    @Override
    public void onRmsChanged(float rmsdB) {
        Log.i(LOG_TAG, "onRmsChanged: " + rmsdB);
        progressBar.setProgress((int) rmsdB);
    }


}
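
The way the activity combines partial and final results can be isolated from the Android classes. The sketch below is one possible arrangement, not the answer's own code: `TranscriptBuffer` and its method names are hypothetical, and it takes plain `List<String>` values such as the ones read from `SpeechRecognizer.RESULTS_RECOGNITION`.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that keeps committed sentences apart from the live hypothesis. */
final class TranscriptBuffer {
    private final StringBuilder committed = new StringBuilder();
    private String pending = "";

    /** Intended to be called from onPartialResults: replaces the current hypothesis. */
    void onPartial(List<String> matches) {
        if (matches != null && !matches.isEmpty()) {
            pending = matches.get(0);
        }
    }

    /** Intended to be called from onResults: commits the best match, if any. */
    void onFinal(List<String> matches) {
        pending = "";
        if (matches == null || matches.isEmpty() || matches.get(0).isEmpty()) {
            return;
        }
        if (committed.length() > 0) {
            committed.append(". ");
        }
        committed.append(matches.get(0));
    }

    /** Text to show on screen: committed sentences followed by the live hypothesis. */
    String display() {
        if (pending.isEmpty()) {
            return committed.toString();
        }
        if (committed.length() == 0) {
            return pending;
        }
        return committed + ". " + pending;
    }

    public static void main(String[] args) {
        TranscriptBuffer buffer = new TranscriptBuffer();
        buffer.onPartial(Arrays.asList("hello"));
        buffer.onFinal(Arrays.asList("hello world"));
        buffer.onPartial(Arrays.asList("how are"));
        System.out.println(buffer.display()); // prints "hello world. how are"
    }
}
```

Keeping the pending hypothesis separate means a restart of the recognizer between utterances never overwrites text that was already committed.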

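Votes: 4

Stack Overflow user

Posted 2015-10-26 18:39:52

In general, speech recognition on long audio is a challenging problem, so you will hardly find anything open for it. Instead, I suggest applying one of the audio segmentation algorithms and recognizing the segments separately. Also, if you already have the transcript along with the audio and only need the timings (for example, for video captioning), the task becomes much easier and you can try long audio alignment.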
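
As a rough illustration of the segmentation idea, the sketch below splits audio at pauses using frame energy. It is only one simple approach and not necessarily what the answer has in mind: `SilenceSegmenter`, the RMS threshold, and the minimum pause length are all assumptions, and real recordings usually need tuning or a proper voice activity detector.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical energy-based segmenter for 16-bit PCM samples. */
final class SilenceSegmenter {
    private SilenceSegmenter() {}

    /** Half-open sample range [start, end) that contains speech. */
    static final class Segment {
        final int start;
        final int end;

        Segment(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Marks a frame as loud when its RMS reaches rmsThreshold, and closes a
     * segment after minSilenceFrames consecutive quiet frames.
     */
    static List<Segment> segment(short[] samples, int frameSize,
                                 double rmsThreshold, int minSilenceFrames) {
        if (frameSize <= 0 || minSilenceFrames <= 0) {
            throw new IllegalArgumentException("frameSize and minSilenceFrames must be positive");
        }
        List<Segment> segments = new ArrayList<>();
        int segmentStart = -1; // -1 means no segment is currently open
        int lastLoudEnd = 0;   // end of the most recent loud frame
        int silentRun = 0;     // consecutive quiet frames inside an open segment
        for (int frameStart = 0; frameStart < samples.length; frameStart += frameSize) {
            int frameEnd = Math.min(samples.length, frameStart + frameSize);
            if (rms(samples, frameStart, frameEnd) >= rmsThreshold) {
                if (segmentStart < 0) {
                    segmentStart = frameStart;
                }
                lastLoudEnd = frameEnd;
                silentRun = 0;
            } else if (segmentStart >= 0 && ++silentRun >= minSilenceFrames) {
                segments.add(new Segment(segmentStart, lastLoudEnd));
                segmentStart = -1;
                silentRun = 0;
            }
        }
        if (segmentStart >= 0) {
            segments.add(new Segment(segmentStart, lastLoudEnd));
        }
        return segments;
    }

    private static double rms(short[] samples, int from, int to) {
        double sum = 0;
        for (int i = from; i < to; i++) {
            sum += (double) samples[i] * samples[i];
        }
        return Math.sqrt(sum / (to - from));
    }

    public static void main(String[] args) {
        short[] samples = new short[1000];
        for (int i = 200; i < 400; i++) samples[i] = 1000;
        for (int i = 800; i < 1000; i++) samples[i] = 1000;
        for (Segment s : segment(samples, 100, 500.0, 2)) {
            System.out.println(s.start + " to " + s.end);
        }
    }
}
```

Each resulting range can be cut out and sent to the recognizer on its own, which keeps every request short.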

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/33343942
