I want to use a WaveNet voice for my Google Assistant Dialogflow agent instead of the standard robotic voice, so I came across this article.
As far as I can tell, it is possible to output the bot's answers as audio files using a WaveNet voice, but it would be great if I could hear that voice in the Dialogflow Actions console or on my Google Home. Is it possible to hear a voice other than the regular TTS voice in the console?
Posted 2018-11-06 05:37:48
You can use SSML with Dialogflow to attach your own .ogg files.
If you're interested, Google provides a sample use case on GitHub. The code is attached below:
<speak>
  The key element for layered sound mixing is <sub alias="par">&lt;par&gt;</sub>
  (as in "parallel") which inserts a mixed sound at the current point of the TTS.
  It is similar to the <sub alias="paragraph">&lt;p&gt;</sub>
  element with an important difference of not displaying
  the text content in chat bubbles on surfaces with displays.
  <par>
    <media xml:id="first_thing" begin="2.5s">
      <speak>
        This media element contains a <sub alias="speak element">&lt;speak&gt;</sub> for TTS.
        It has an <say-as interpret-as="verbatim">xml:id</say-as> attribute so that other
        <sub alias="media">&lt;media&gt;</sub> elements can refer to it.
        There is also a "begin" attribute that delays the start time by 2.5 seconds.
        Millisecond units are also supported by the
        <say-as interpret-as="letters">ms</say-as> suffix.
      </speak>
    </media>
    <media xml:id="second_thing" soundLevel="-1dB" repeatCount="3">
      <audio src="https://actions.google.com/sounds/v1/cartoon/cartoon_boing.ogg">
        The sound source for this <sub alias="audio">&lt;audio&gt;</sub> element is missing.
        Find more sounds at https://developers.google.com/actions/tools/sound-library.
      </audio>
    </media>
    <media xml:id="last_thing" begin="first_thing.end + 1234ms">
      <speak>
        This TTS starts <say-as interpret-as="units">1234 milliseconds</say-as>
        after the end of the media element with the
        <say-as interpret-as="verbatim">xml:id</say-as> equal to "first_thing".
      </speak>
    </media>
  </par>
</speak>

https://stackoverflow.com/questions/53126245
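In a fulfillment webhook you would typically build this SSML as a string and place it in the response payload. As a minimal sketch, here is a hypothetical Python helper (`layered_ssml` is my own name, not part of any Dialogflow SDK) that assembles the same `<par>`/`<media>` layering pattern for a TTS phrase plus one background audio clip:

```python
def layered_ssml(tts_text, audio_url, tts_begin="0s", sound_level="-1dB"):
    """Build SSML that mixes TTS speech with a background audio clip.

    Mirrors the <par>/<media> structure from the Google sample above:
    one <media> wraps a nested <speak> for the spoken text, a second
    <media> plays the .ogg clip at a reduced sound level.
    """
    return (
        "<speak>"
        "<par>"
        f'<media xml:id="speech" begin="{tts_begin}">'
        f"<speak>{tts_text}</speak>"
        "</media>"
        f'<media xml:id="background" soundLevel="{sound_level}">'
        f'<audio src="{audio_url}"/>'
        "</media>"
        "</par>"
        "</speak>"
    )

# Example: delay the speech 1.5 seconds so the clip is heard first.
ssml = layered_ssml(
    "Welcome back!",
    "https://actions.google.com/sounds/v1/cartoon/cartoon_boing.ogg",
    tts_begin="1.5s",
)
print(ssml)
```

The resulting string can then be returned wherever your integration expects an SSML response body; the `xml:id` values let you chain further `<media>` elements off `speech.end`, as the sample does with `first_thing.end + 1234ms`.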