Voice
The SDKs for the Oracle Android, Oracle iOS, and Oracle Web channels have been integrated with speech recognition to allow users to talk directly to skills and digital assistants and get the appropriate responses.
When speech recognition is enabled, a microphone button replaces the send button whenever the user input field is empty. Users tap this button to begin recording their voices. The speech is sent to the speech server for recognition, converted to text, and then sent to the skill. If the speech is only partly recognized, then the partial result is displayed in the user input field, allowing the user to clean it up before sending it to the skill.
See General Feature Support by Language for a list of the languages that are supported for voice.
Enable Voice for the Oracle Android Channel
- Create the Oracle Android Channel and enable it.
- Set the `enableSpeechRecognition` feature flag to `true`. Speech Recognition describes this and other voice-related properties and methods.
Enable Voice for the Oracle Web Channel
- Configure the Oracle Web Channel and enable it.
- Set the `enableSpeech` configuration property to `true`. Voice Recognition describes this and other voice-related properties and methods.
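As a minimal sketch, the `enableSpeech` property is set in the settings object that you pass to the Web SDK when initializing the chat widget. The host URI and channel ID below are placeholders; substitute the values from your Oracle Web Channel configuration.

```javascript
// Hypothetical Web SDK settings object; URI and channelId are placeholders
// that you replace with your own channel's values.
const chatWidgetSettings = {
  URI: 'oda-instance.example.com',       // placeholder Oracle Digital Assistant host
  channelId: '<your-web-channel-id>',    // placeholder Oracle Web Channel ID
  enableSpeech: true                     // shows the microphone button when the input field is empty
};

// The settings object is then passed to the SDK's constructor, e.g.:
// const Bots = new WebSDK(chatWidgetSettings);
```

With `enableSpeech` set to `true`, the widget replaces the send button with a microphone button whenever the user input field is empty, as described above.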
Enable Voice on the Oracle iOS Channel
- Configure the Oracle iOS Channel.
- Set the `enableSpeechRecognition` feature flag to `true`. Speech Recognition describes this and other voice-recognition properties and methods.
Improve ASR with Enhanced Speech
You can only use enhanced speech with English-language skills (with training data in English) that are intended for an English-speaking audience.
- Select Enable Enhanced Speech in Settings.
- Retrain the skill.
- Route an Oracle Web, iOS, or Android client channel to the skill.
Tip:
Enhanced speech models are only available for skills developed with Version 20.12 or later. If you want to use enhanced speech models, then you must upgrade the skill to Version 20.12 or later.
When you select this option, the speech recognition system builds an enhanced speech model that's based on the skill's intent and entity data: utterances, entity values, synonyms for both custom and dynamic entity values, and system entities that have been associated with intents. The enhanced speech model is updated each time you retrain your skill (or, as is the case in the current release, when the skill is retrained after a finalized push request from the Dynamic Entity API).
When users issue a speech request through the Oracle Web, iOS, or Android client channels, the speech runtime dynamically pulls in the custom language model for the skill that's routed to the channel. If the channel points to a digital assistant, it will pull the custom language models for each skill that has Enable Enhanced Speech enabled. You can toggle this setting on and off for the individual skills that are registered to a digital assistant.