Search - Tencent Cloud Developer Community - Tencent Cloud


From the column VoiceVista语音智能
Introducing SensoryCloud.ai Part 3: Speech-to-Text & Accuracy
When considering speech-to-text (STT) solutions, businesses are faced with many different solutions … To demonstrate the performance of the SensoryCloud speech-to-text, we hired a 3rd party company to perform accuracy … and the flexibility to work with your team to build a customized solution, then SensoryCloud’s speech-to-text … invite you to subscribe to our blog and stay up to date on all the services offered by SensoryCloud: Speech-to-Text
57020 edited on 2022-04-02
From the column 蔻丁杂记
ChatGPT Real-Time Voice Conversation: speech-to-text and text-to-speech
If you want to talk with ChatGPT by voice in real time, you can simply use the ChatGPT app. End of article. 😂
81710 edited on 2024-12-25
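From the column 人工智能极简应用
[Machine Learning] Whisper: Hands-On with the Open-Source Speech-to-Text Large Model
The previous article explained the principles and practical use of the ChatTTS text-to-speech model and took first place on the trending list for the sixth time 🏆. Today's article covers the model for the reverse task (speech-to-text): Whisper. Whisper was developed and open-sourced by OpenAI; its smallest version has 39M parameters and its largest 1550M, and it supports many languages, including Chinese. Thanks to its low resource cost and high-quality output, it is widely used in speech-to-text scenarios such as music recognition, private-message chat, simultaneous interpretation, and human-computer interaction, and commercial offerings built on it are expensive. It is shared here for free, so there is no need to pay for a speech recognition service!
7.9K20 edited on 2024-08-13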
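From the column 小轻论坛
Now Localized into Chinese! Whisper, an Efficient Audio-to-Text Tool
Speech-to-text API documentation: https://platform.openai.com/docs/guides/speech-to-text First, download the Whisper model (the download link is at the end of the article
1.5K10 edited on 2024-09-30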
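From the column AI技术应用
Technical Architecture of an AI Spoken-English Practice App
III. Core AI Components. Speech recognition (Speech-to-Text, STT) engine: converts the user's recorded English speech into text. Commonly used STT engines include: Google Cloud Speech-to-Text, Amazon Transcribe, Microsoft Azure Speech to Text, and open-source engines (such as Mozilla
72610 edited on 2025-04-08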
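Controlling a Robotic Arm with Natural Language: Integrating ChatGPT and Robotics (Part 1)
Speech recognition (an essential module when working with natural language): we use Speech-to-text, a speech recognition service from Google that lets developers convert speech into text. You can try speech-to-text online: https://cloud.google.com/speech-to-text? 2. Speech-to-text: why use speech recognition to produce text? The ChatGPT API accepts only text input, so speech-to-text converts what we say into text that can be fed to the computer.
1.1K12 edited on 2024-04-12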
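From the column AI 大数据
[AI Voice] A Complete Guide to Optimizing Real-Time Voice Interaction: From RTC Technology to Double-Talk Handling
Models and services such as Google Speech-to-Text, Azure Speech Recognition, and Whisper can be used for ASR tasks. References: WebRTC official documentation: https://webrtc.org/ Google Speech-to-Text API: https://cloud.google.com/speech-to-text FastSpeech
4.6K10 edited on 2025-02-05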
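From the column ATYUN订阅号
Major Google Cloud Update: Text-to-Speech Now Supports 26 WaveNet Voices
For audio samples that were not recorded separately, Cloud Speech-to-Text offers diarization, which uses machine learning to identify the speakers and tag each word with a speaker number. Google says the accuracy of the tags will improve over time. Google Cloud's Speech-to-Text diarization feature. All of this is useful, but what if you are a developer with a large number of bilingual users?
2.8K40 published on 2018-09-26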
From the column 大数据智能实战
Multi-Task Compilation Practice with the PyTorch Version of OpenNMT
logging Source word features Pretrained Embeddings Copy and Coverage Attention Image-to-text processing Speech-to-text
1.2K10 published on 2019-05-26
From the column VoiceVista语音智能
Sensory's Take on Generative AI
This is Sensory’s domain as we can perform the speech-to-text, text-to-speech, wake words and even voice
43610 edited on 2023-03-02
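From the column AI技术应用
Development Process for an AI Vocabulary-Learning App
AI-related technologies: Speech recognition (Speech-to-Text): used for pronunciation assessment. You can choose a third-party API (such as Google Cloud Speech-to-Text, Amazon Transcribe, or 讯飞语音) or build your own model.
1.3K10 edited on 2025-04-10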
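From the column AI技术应用
Development Framework for an AI English Listening App
Technology: ASR (Automatic Speech Recognition) / STT (Speech-to-Text): converts speech into text. 1. Speech recognition (ASR/STT): cloud service APIs: AWS Transcribe, Google Cloud Speech-to-Text, Azure Speech Service, 百度语音, 讯飞语音
82210 edited on 2025-06-13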
From the column AI研习社
GitHub Project Recommendation | Cheetah - An On-Device Speech-to-Text Engine Based on Deep Learning
Cheetah - On-device speech-to-text engine powered by deep learning by Picovoice Website: https://picovoice.ai
2.5K20 published on 2019-05-08
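From the column AI技术应用
Developing a KET Speaking Practice App
Simulated examiner dialogue: use AI (Text-to-Speech + Speech-to-Text + Dialogue Management) to simulate an examiner asking questions and understanding the user's answers, with simple interactive dialogue (high technical complexity … Speech-to-text (ASR): consider integrating a third-party cloud service API, such as Google Cloud Speech-to-Text, AWS Transcribe, 百度语音, or 科大讯飞.
68200 edited on 2025-05-08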
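From the column APP开发
Technical Framework for a PET Speaking Practice App
Speech recognition (Speech-to-Text, ASR): converts the user's recorded speech into text. Third-party cloud services: Google Cloud Speech-to-Text, Microsoft Azure Speech Service, Amazon Transcribe, 科大讯飞语音听写, 百度语音识别, and others
66910 edited on 2025-05-09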
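From the column AI科技评论
Doing Two Things at Once: A High-Performance End-to-End Speech Translation Model That Recognizes Speech and Translates Simultaneously
This article introduces a paper on automatic speech translation from AAAI 2021, "COnsecutive Decoding for Speech-to-text Translation" [1], COSTT for short, by authors from the Institute of Automation, Chinese Academy of Sciences and ByteDance AI Lab … Consecutive Decoding for Speech-to-text Translation. … Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation. Arxiv, 2016.
2.6K40 published on 2021-07-02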
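From the column 人工智能前沿讲习
AAAI 2020 | Institute of Automation, Chinese Academy of Sciences: Building a Better Speech Translation Model Through Interaction Between Recognition and Translation
Regarding the oral-presentation paper accepted at AAAI-20 from the team of Chengqing Zong and Jiajun Zhang at the Chinese Academy of Sciences, "Synchronous Speech Recognition and Speech-to-Text Translation with … Long Zhou, Zhongjun He, Hua Wu, Haifeng Wang, and Chengqing Zong. Synchronous Speech Recognition and Speech-to-Text
1.2K20 published on 2020-05-13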
From the column NewBeeNLP
NLP Newsletter (Issue #9)
RONEC 1.2 A survey of few-shot learning 1.3 Scaling Laws for Neural Language Models 1.4 Calibration of pre-trained Transformers 1.5 The statistics of deep learning 1.6 Speech-to-Text … 1.6 The ImageNet moment for Speech-to-Text: in a new article published in The Gradient, Towards an ImageNet Moment for Speech-to-Text [7], Alexander Veysov explains why they believe the ImageNet moment for speech-to-text (STT) in Russian has arrived. … www.annualreviews.org/doi/abs/10.1146/annurev-conmatphys-031119-050745 [7] Towards an ImageNet Moment for Speech-to-Text
1.4K20 published on 2020-08-26
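From the column AI技术应用
Technical Framework for an AI Listening Practice App
3. AI and speech processing frameworks: for speech recognition, you can use tools such as the Google Speech-to-Text API, Amazon Transcribe, or CMU Sphinx (PocketSphinx), which provide high-accuracy speech recognition
76910 edited on 2024-12-16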
From the column AI
Developing an AI Spoken-Language Practice App
AI Technologies and Platforms: Speech recognition (ASR): Google Cloud Speech-to-Text API, Amazon Transcribe, Microsoft Azure Speech to Text, open-source options (for example, Mozilla DeepSpeech). Pronunciation Assessment: Google Cloud Speech-to-Text
1.1K10 edited on 2025-03-28


