#Speech
#textToSpeech
- Type: true | {voiceName?: string, lang?: string, pitch?: number, rate?: number, volume?: number}
When the chat receives a new text message - your device will automatically read it out.
voiceName is the name of the voice that will be used to read out the incoming message. Please note that different operating systems support different voices. Use the following code snippet to see the available voices for your device: window.speechSynthesis.getVoices()
lang is used to set the utterance language. See the following QA for the available options.
pitch sets the pitch at which the utterance will be spoken.
rate sets the speed at which the utterance will be spoken.
volume sets the volume at which the utterance will be spoken.
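The getVoices() snippet above can be paired with a small helper to resolve a voiceName. This is an illustrative sketch - pickVoiceName is not part of Deep Chat - that prefers an exact name match and falls back to the first voice matching a language prefix:

```javascript
// Hypothetical helper (not part of Deep Chat): choose a voiceName from the
// array returned by window.speechSynthesis.getVoices(), preferring an exact
// name match and falling back to the first voice whose lang matches a prefix.
function pickVoiceName(voices, preferredName, fallbackLang) {
  const byName = voices.find((voice) => voice.name === preferredName);
  if (byName) return byName.name;
  const byLang = voices.find((voice) => voice.lang && voice.lang.startsWith(fallbackLang));
  return byLang ? byLang.name : undefined;
}

// Browser usage (voices can load asynchronously, so call this after the
// 'voiceschanged' event has fired):
// const voices = window.speechSynthesis.getVoices();
// chatElementRef.textToSpeech = {
//   voiceName: pickVoiceName(voices, 'Google UK English Female', 'en'),
//   volume: 0.9,
// };
```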
[!INFO] Text to speech uses the SpeechSynthesis Web API, which is supported differently across devices.
[!INFO] The browser window needs to be focused for this to work.
#Example
<deep-chat textToSpeech='{"volume": 0.9}'></deep-chat>
#speechToText
- Type: true | {webSpeech?: true | WebSpeechOptions, azure?: AzureOptions, textColor?: TextColor, displayInterimResults?: boolean, translations?: {[key: string]: string}, commands?: Commands, button?: ButtonStyles, stopAfterSubmit?: boolean, submitAfterSilence?: SubmitAfterSilence} - Default: {webSpeech: true, stopAfterSubmit: true}
Transcribe your voice into text and control chat with commands.
webSpeech utilises the Web Speech API to transcribe your speech.
azure utilises the Azure Cognitive Speech Services API to transcribe your speech.
textColor is used to set the color of interim and final results text.
displayInterimResults controls whether interim results are displayed.
translations is a case-sensitive one-to-one mapping of words that will automatically be translated to others.
commands is used to set the phrases that will trigger various chat functionality.
button defines the styling used for the microphone button.
stopAfterSubmit is used to toggle whether the recording stops after a message has been submitted.
submitAfterSilence configures automated message submission when the user stops speaking.
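The properties above can also be assigned through the element reference rather than the JSON attribute. A minimal sketch, assuming chatElementRef is the `<deep-chat>` element obtained from the DOM:

```javascript
// Sketch: write the configuration as a JS object, then either assign it to
// the element property or serialize it into the attribute form.
const speechToTextConfig = {
  webSpeech: true,
  displayInterimResults: true,
  translations: {hello: 'goodbye', Hello: 'Goodbye'},
  stopAfterSubmit: false,
};

// Property assignment in the browser:
// chatElementRef.speechToText = speechToTextConfig;

// Equivalent attribute form:
const attributeValue = JSON.stringify(speechToTextConfig);
```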
#Example
<deep-chat speechToText='{ "webSpeech": true, "translations": {"hello": "goodbye", "Hello": "Goodbye"}, "commands": {"resume": "resume", "settings": {"commandMode": "hello"}}, "button": {"position": "outside-left"} }' ></deep-chat>
[!INFO] If the microphone recorder is set - this will not be enabled.
[!INFO] Speech to text functionality is provided by the Speech To Element library.
[!CAUTION] Support for webSpeech varies across different browsers - please check the Can I use Speech Recognition API section. (The yellow bars indicate that it is supported)
#Types
Object types for speechToText:
#WebSpeechOptions
- Type: {language?: string}
language is used to set the recognition language. See the following QA for the full list.
#Example
<deep-chat speechToText='{"webSpeech": {"language": "en-US"}}'></deep-chat>
[!NOTE] This service stops after a brief period of silence due to limitations in its API and not Deep Chat.
#AzureOptions
- Type: {region: string, retrieveToken?: () => Promise<string>, subscriptionKey?: string, token?: string, language?: string, stopAfterSilenceMs?: number} - Default: {stopAfterSilenceMs: 25000 (25 seconds)}
This object requires region and either the retrieveToken, subscriptionKey or token property to be defined:
region is the location/region of your Azure speech resource.
retrieveToken is a function used to retrieve a new token for the Azure speech resource. It is the recommended property to use as it can retrieve the token from a secure server that will hide your credentials. Check out the retrieval example below and starter server templates.
subscriptionKey is the subscription key for the Azure speech resource.
token is a temporary token for the Azure speech resource.
language is a BCP-47 string value to denote the recognition language. You can find the full list here.
stopAfterSilenceMs is the number of milliseconds of silence required for the microphone to automatically turn off.
[!INFO] To use the Azure Speech To Text service - please add the Speech SDK to your project. See EXAMPLES.
#Example
<deep-chat speechToText='{ "azure": { "subscriptionKey": "resource-key", "region": "resource-region", "language": "en-US", "stopAfterSilenceMs": 5000 } }' ></deep-chat>
Location of speech service credentials in Azure Portal:
[!CAUTION] The subscriptionKey and token properties should be used for local/prototyping/demo purposes ONLY. When you are ready to deploy your application, please switch to using the retrieveToken property. Check out the example below and starter server templates.
#Retrieve token example
speechToText.speechToText = {
  region: 'resource-region',
  retrieveToken: async () => {
    const result = await fetch('http://localhost:8080/token');
    return result.text();
  },
};
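On the server side, the endpoint that retrieveToken calls can exchange the subscription key for a short-lived token. A sketch assuming Node 18+ (global fetch) and the standard Azure Cognitive Services issueToken endpoint - the /token route name and environment variable are placeholders, not part of Deep Chat:

```javascript
// Build the request that exchanges an Azure speech subscription key for a
// short-lived token. The issueToken endpoint and Ocp-Apim-Subscription-Key
// header are the standard Azure Cognitive Services token exchange; run this
// on your server so the key never reaches the browser.
function buildAzureTokenRequest(region, subscriptionKey) {
  return {
    url: `https://${region}.api.cognitive.microsoft.com/sts/v1.0/issueToken`,
    options: {
      method: 'POST',
      headers: {'Ocp-Apim-Subscription-Key': subscriptionKey},
    },
  };
}

// Server-side usage (Node 18+ global fetch), e.g. inside a /token route handler:
// const {url, options} = buildAzureTokenRequest('eastus', process.env.AZURE_SPEECH_KEY);
// const token = await fetch(url, options).then((res) => res.text());
```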
#TextColor
- Type: {interim?: string, final?: string}
This object is used to set the color of interim and final results text.
#Example
<deep-chat speechToText='{"textColor": {"interim": "green", "final": "blue"}}'></deep-chat>
#Commands
- Type: {stop?: string, pause?: string, resume?: string, removeAllText?: string, submit?: string, commandMode?: string, settings?: {substrings?: boolean, caseSensitive?: boolean}} - Default: {settings: {substrings: true, caseSensitive: false}}
This object is used to set the phrases which will control chat functionality via speech.
stop is used to stop the speech service.
pause will temporarily stop the transcription and will re-enable it after the phrase for resume is spoken.
removeAllText is used to remove all input text.
submit will send the current input text.
commandMode is a phrase used to activate the command mode, which will not transcribe any text and will instead wait for a command to be executed. To leave the command mode - use the phrase for the resume command.
substrings is used to toggle whether command phrases can be part of spoken words or must be whole words. E.g. when this is set to true and your command phrase is "stop" - saying "stopping" will execute the command. However, if it is set to false - the command will only be executed if you say "stop".
caseSensitive is used to toggle whether command phrases are case sensitive. E.g. if this is set to true and your command phrase is "stop" - when the service recognizes your speech as "Stop" it will not execute the command. If it is set to false - it will.
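The interaction between these two settings can be sketched as a small matcher. This is an illustrative re-implementation of the semantics described above, not Deep Chat's internal code:

```javascript
// Illustrative matcher (not Deep Chat internals) for the substrings and
// caseSensitive settings described above.
function matchesCommand(recognizedText, phrase, {substrings = true, caseSensitive = false} = {}) {
  const text = caseSensitive ? recognizedText : recognizedText.toLowerCase();
  const target = caseSensitive ? phrase : phrase.toLowerCase();
  // substrings: true - the phrase may appear inside a longer word ("stopping")
  if (substrings) return text.includes(target);
  // substrings: false - the phrase must be a whole spoken word
  return text.split(/\s+/).includes(target);
}
```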
#Example
<deep-chat speechToText='{ "commands": { "stop": "stop", "pause": "pause", "resume": "resume", "removeAllText": "remove text", "submit": "submit", "commandMode": "command", "settings": { "substrings": true, "caseSensitive": false }}}' ></deep-chat>
#ButtonStyles
This object is used to define the styling for the microphone button.
It contains the same properties as the MicrophoneStyles object and an additional commandMode property which sets the button styling when the command mode is activated.
#Example
<deep-chat speechToText='{ "button": { "commandMode": { "svg": { "styles": { "default": { "filter": "brightness(0) saturate(100%) invert(70%) sepia(70%) saturate(4438%) hue-rotate(170deg) brightness(92%) contrast(98%)" }}}}, "active": { "svg": { "styles": { "default": { "filter": "brightness(0) saturate(100%) invert(10%) sepia(97%) saturate(7495%) hue-rotate(0deg) brightness(101%) contrast(107%)" }}}}, "default": { "svg": { "styles": { "default": { "filter": "brightness(0) saturate(100%) invert(77%) sepia(9%) saturate(7093%) hue-rotate(32deg) brightness(99%) contrast(83%)" }}}}}, "commands": { "removeAllText": "remove text", "commandMode": "command" } }' ></deep-chat>
[!TIP] You can use the
CSSFilterConverter
tool to generate filter values for the icon color.
#SubmitAfterSilence
- Type: true | number
Automatically submit the input message after a period of silence.
This property accepts a value of true or a number representing the milliseconds of silence to wait before a message is submitted. If set to true, the default is 2000 milliseconds.
#Example
<deep-chat speechToText='{"submitAfterSilence": 3000}'></deep-chat>
[!CAUTION] When using the default Web Speech API - the recording will automatically stop after 5-7 seconds of silence, so please take care when setting the milliseconds value.
#Demo
This is the example used in the demo video. When replicating - make sure to add the Speech SDK to your project and add your resource properties.
<!-- This example is for Vanilla JS and should be tailored to your framework (see Examples) -->
<div style="display: flex">
<deep-chat speechToText='{ "azure": { "subscriptionKey": "resource-key", "region": "resource-region" }, "commands": { "stop": "stop", "pause": "pause", "resume": "resume", "removeAllText": "remove text", "submit": "submit", "commandMode": "command" }}' errorMessages='{ "overrides": {"speechToText": "Azure Speech To Text can not be used in this website as you need to set your credentials."} }' style="margin-right: 30px" demo="true" ></deep-chat>
<deep-chat speechToText='{ "azure": { "subscriptionKey": "resource-key", "region": "resource-region" }, "commands": { "stop": "stop", "pause": "pause", "resume": "resume", "removeAllText": "remove text", "submit": "submit", "commandMode": "command" }}' errorMessages='{ "overrides": {"speechToText": "Azure Speech To Text can not be used in this website as you need to set your credentials."} }' demo="true" ></deep-chat>
</div>