Automatic Speech Recognition and Speech Recognition for Industry

Voice commands for your industrial software

Some Spix SKILLS are deliberately simple: they address the need to add voice commands to existing software.

Voice commands rely on automatic speech recognition, which turns a voice signal into text. Commands are then associated with that text to drive a software interface or trigger simple actions.

Automatic speech recognition

To create voice commands from a user's speech, the speech must first be recognized automatically and converted to text. Simple intents can then be associated with that text to trigger an action on the computer.
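
As a minimal sketch of that second step, assuming the recognizer has already produced a text transcript, the mapping from text to intent to action could look like the following. The phrase table, the intent names, and the handler functions are hypothetical and serve only as an illustration; they are not part of any Spix API.

```python
from typing import Callable, Dict, Optional

# Hypothetical phrase table: recognized text -> intent name.
PHRASE_TO_INTENT: Dict[str, str] = {
    "validate": "VALIDATE",
    "confirm": "VALIDATE",
    "next step": "NEXT_STEP",
    "go back": "PREVIOUS_STEP",
}


def normalize(text: str) -> str:
    """Lowercase the transcript and collapse repeated whitespace."""
    return " ".join(text.lower().split())


def text_to_intent(text: str) -> Optional[str]:
    """Return the intent associated with a transcript, or None if unknown."""
    return PHRASE_TO_INTENT.get(normalize(text))


def dispatch(text: str, handlers: Dict[str, Callable[[], str]]) -> str:
    """Run the action bound to the transcript's intent, if there is one."""
    intent = text_to_intent(text)
    if intent is None or intent not in handlers:
        return "ignored"
    return handlers[intent]()


# Hypothetical actions standing in for calls into the host software.
HANDLERS: Dict[str, Callable[[], str]] = {
    "VALIDATE": lambda: "form validated",
    "NEXT_STEP": lambda: "moved to next step",
    "PREVIOUS_STEP": lambda: "moved to previous step",
}

if __name__ == "__main__":
    print(dispatch("  Next   STEP ", HANDLERS))
```

Keeping the intent as an intermediate layer means several phrases ("validate", "confirm") can trigger the same action without the host software knowing which words were spoken.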

Automatic speech recognition generally relies on a probabilistic algorithm that identifies the phonemes in a sentence. Three additional components are essential to make the algorithm more effective:

  • Acoustic model: It defines the sound environment in which automatic speech recognition will be most effective. An algorithm configured with a car-interior acoustic model, for example, will be less effective outdoors in the wind. Some algorithms come with generic acoustic models; others can be configured for the use case.
  • Language model: It allows speech recognition to be used in different languages.
  • Lexicon (or grammar): For voice commands, the speech recognition algorithm is generally restricted to a limited vocabulary and searches only among the words defined in that lexicon. The smaller the vocabulary, the more efficient the algorithm, but the less satisfactory the user experience.
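
The effect of a restricted lexicon can be illustrated with a small sketch. A real recognizer applies the vocabulary constraint during decoding; the code below only mimics the outcome on text that has already been recognized, by snapping a possibly misheard word to the closest lexicon entry with Python's standard `difflib` module. The vocabulary, the function name `snap_to_lexicon`, and the 0.6 cutoff are assumptions chosen for illustration.

```python
import difflib
from typing import List, Optional

# Hypothetical command vocabulary; a real lexicon would be use-case specific.
LEXICON: List[str] = ["start", "stop", "validate", "cancel", "next", "previous"]


def snap_to_lexicon(heard: str, lexicon: List[str], cutoff: float = 0.6) -> Optional[str]:
    """Return the lexicon entry closest to `heard`, or None if nothing is close enough."""
    matches = difflib.get_close_matches(heard.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for word in ["validat", "stopp", "temperature"]:
        print(word, "->", snap_to_lexicon(word, LEXICON))
```

With only six candidates, a slightly misheard word still resolves to the intended command, while anything outside the vocabulary is rejected. That is the trade-off described above: fewer candidates make the search more reliable, but the user can say fewer things.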

SPIX industry has expertise in the configuration and implementation of automatic speech recognition in the industrial field.

Available in multiple languages

The ability to use voice commands in a given language depends solely on the effectiveness of speech recognition in that language. Intents related to commands are generic regardless of the language (“validate” expresses the same intent in all languages).
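
One way to picture language-independent intents, as a sketch under stated assumptions: each language has its own phrase table, and every table points to the same set of intent names. The table contents, the language codes, and the `resolve_intent` helper are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical per-language phrase tables; all languages share the same intent names.
PHRASES_BY_LANGUAGE: Dict[str, Dict[str, str]] = {
    "en": {"validate": "VALIDATE", "cancel": "CANCEL"},
    "fr": {"valider": "VALIDATE", "annuler": "CANCEL"},
    "es": {"validar": "VALIDATE", "cancelar": "CANCEL"},
}


def resolve_intent(text: str, language: str) -> Optional[str]:
    """Look up the intent for a transcript in the phrase table of the given language."""
    table = PHRASES_BY_LANGUAGE.get(language, {})
    return table.get(text.strip().lower())


if __name__ == "__main__":
    print(resolve_intent("Valider", "fr"))
    print(resolve_intent("validar", "es"))
```

Only the phrase tables change from one language to the next; the actions bound to `VALIDATE` or `CANCEL` stay the same.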

SPIX industry has speech recognition capabilities in nearly 40 languages. Not all are validated at the same level of reliability for industrial use.

Speech recognition validated for the industrial field

  • French: native and non-native speakers
  • English: non-native speakers
  • English: native speakers
  • Spanish: native speakers
  • Portuguese: Brazilian and native speakers
  • German: native speakers

Other speech recognition available

Italian

Dutch

Korean

Chinese

Other languages on request…

Tailored to industry operational needs

SPIX industry has expertise in the configuration and implementation of automatic speech recognition in the industrial field.