To create voice commands from a user's speech, the speech must first be recognized automatically and converted to text. Simple intents can then be associated with the resulting text in order to trigger an action on the computer.
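As a minimal sketch of the second step, assuming the recognizer has already produced a text transcript, the mapping from text to intent to action could look like the following. The phrases, intent names, and actions are hypothetical and chosen only for illustration; a real system would typically use a more tolerant matching strategy.

```python
from typing import Callable, Dict, Optional

# Hypothetical phrase-to-intent table (illustrative values, not from any product).
INTENTS: Dict[str, str] = {
    "open report": "OPEN_REPORT",
    "next step": "NEXT_STEP",
    "stop": "STOP",
}

# Hypothetical actions triggered by each intent; here they only return a message.
ACTIONS: Dict[str, Callable[[], str]] = {
    "OPEN_REPORT": lambda: "opening report",
    "NEXT_STEP": lambda: "moving to next step",
    "STOP": lambda: "stopping",
}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so minor transcript variations still match."""
    return " ".join(text.lower().split())


def text_to_intent(text: str) -> Optional[str]:
    """Return the intent associated with a transcript, or None if there is none."""
    return INTENTS.get(normalize(text))


def handle_transcript(text: str) -> str:
    """Run the action associated with a transcript, if any."""
    intent = text_to_intent(text)
    if intent is None:
        return "unrecognized command"
    return ACTIONS[intent]()
```

Calling `handle_transcript("  Open   Report ")` runs the hypothetical `OPEN_REPORT` action, while a transcript with no associated intent falls through to the "unrecognized command" branch.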
Automatic speech recognition generally relies on a probabilistic algorithm that identifies the phonemes in an utterance. To make this algorithm more effective, three additional components are essential:
- The Acoustic Model: It defines the sound environment in which automatic speech recognition will be most effective. For example, an algorithm configured with a car-interior acoustic model will be less effective outdoors in the wind. Some algorithms ship with generic acoustic models; others can be configured for the use case.
- The Language Model: It describes the words and word sequences of a given language, which allows speech recognition to be used in different languages.
- The Lexicon (or grammar): For voice commands, the speech recognition algorithm is generally restricted to a limited vocabulary. In this case, the algorithm limits its word search to the vocabulary defined in the lexicon. The smaller the vocabulary, the more effective the algorithm, but the less satisfactory the user experience, since fewer phrasings are understood.
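The effect of a restricted lexicon can be illustrated with a small sketch. This is not how a recognizer constrains its search internally; it is a simplified stand-in that snaps an already-recognized word to the closest entry in a hypothetical command vocabulary, using string similarity from Python's standard `difflib` module. The function name `constrain_to_lexicon`, the vocabulary, and the cutoff value are assumptions made for illustration.

```python
import difflib
from typing import Iterable, Optional


def constrain_to_lexicon(
    hypothesis: str,
    lexicon: Iterable[str],
    cutoff: float = 0.6,  # assumed similarity threshold, to be tuned per use case
) -> Optional[str]:
    """Return the lexicon entry closest to the hypothesis, or None if none is close enough.

    Simplified stand-in: string similarity replaces the acoustic and
    probabilistic scoring a real recognizer would use.
    """
    matches = difflib.get_close_matches(
        hypothesis.lower(), list(lexicon), n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


# Hypothetical command vocabulary.
COMMANDS = ["start", "stop", "pause"]
```

With this small vocabulary, a slightly wrong hypothesis such as `"stopp"` is resolved to `"stop"`, which shows the accuracy benefit of a limited lexicon. A word outside the vocabulary, such as `"resume"`, returns `None`, which shows the cost to the user experience: anything not in the lexicon cannot be understood.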
SPIX industry has expertise in the configuration and implementation of automatic speech recognition in the industrial field.