The industrial voice assistance revolution will come through blue-collar workers

Several articles predict that voice and voice assistance will arrive in the workplace in the coming months. The arguments put forward are relevant, but in the view of SPIX industry they are insufficient. In industry, voice is adopted only when the proposed functions are contextualized and deemed useful. This is what we will see!

An update on consumer voice assistants

A few months ago, PwC and Bloomberg analyzed the use of consumer voice assistants to understand user motivations and future trends. Comparing the two studies is instructive because one considers voice assistants installed on both smartphones and connected speakers, while the other considers connected speakers only. The differences are major and relevant to our industrial analysis: in one case the hardware is mobile and offers visual feedback; in the other, the interface is purely vocal.

The results of the PwC analysis in 2019 [1] do not come as a surprise. Young users are adopting the technology quickly. Older users are slower, but confident that they will use it in the future. Voice assistance is still used mainly on mobile phones (57%), with connected speakers following (27%).

The PwC study confirms the expected use of this technology in a private context for simple tasks. The top five uses, in order:

  • Do a search on a search engine
  • Ask a quick question
  • Check the weather or the latest news
  • Play music
  • Start a timer

The Bloomberg study [2], published in 2021 (carried out 1 year after that of PwC), tells us that users of Amazon’s Alexa voice assistant mainly operate devices equipped with a screen, and therefore with visual feedback. These devices are nevertheless fixed, most of the time in the home. In order, Alexa users at home use it to:

  • Play music
  • Start a timer
  • Turn lights on or off

In parallel, Amazon itself, on the Alexa promotional site [3], proposes the voice assistant for the following functions: “Alexa is always ready to answer your questions, tell you jokes, play music, find the fastest route to work, adjust the thermostat with your compatible connected devices, and much more”. Answering questions, having fun, playing music: the proposal is consistent with the use observed by PwC and Bloomberg.

So what use could a voice assistant be at work? The functions mentioned above would hardly make an industrial company want to invest in this technology.




A first general answer for the industry

A study conducted by the Manutan Group [1] in 2022 presents the possible uses of voice assistants at work, as well as the expected gains. The following functions are cited as “scopes of voice assistants in business”:

  • Voice dictation is the most widely used element;
  • Note taking (meeting minutes);
  • Teamwork;
  • Employee calendar management;
  • Customer services;
  • IT support services;
  • Document search: it is possible to ask the voice assistant to display documents on the screen during presentations (financial statements, presentation brochures, etc.).

First of all, it is interesting to note that the distinction between voice recognition and voice assistance is not very clear. The “voice dictation” and “note taking” functions rely on voice-to-text transcription (speech-to-text), but do not require advanced text comprehension or dialogue skills.

The other functions do rely on genuine voice assistance capabilities. That is, an artificial intelligence must understand the user’s intent from the text transcription of their voice request, and then respond to it either directly or through several dialogue turns that refine the request.
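
To make the distinction concrete, here is a minimal sketch of the second step: recognizing an intent in a transcribed request, then using follow-up turns to collect what is missing. It is purely illustrative and rule-based. The `Dialogue` class, the `book_room` intent, and the "room" and "time" slots are hypothetical names chosen for this example; they do not describe any particular product, and a real assistant would use far richer language understanding.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical intent: booking a room requires a room name and a time.
REQUIRED_SLOTS = ("room", "time")


@dataclass
class Dialogue:
    intent: Optional[str] = None
    slots: dict = field(default_factory=dict)

    def handle(self, transcript: str) -> str:
        """Take one transcribed utterance and return the assistant's reply."""
        text = transcript.lower()

        # Step 1: recognize the intent (here, a naive keyword test).
        if self.intent is None:
            if "book" not in text:
                return "Sorry, I did not understand."
            self.intent = "book_room"

        # Step 2: extract whatever details this turn provides.
        room = re.search(r"\broom (\w+)", text)
        time = re.search(r"\b(\d{1,2})\s?(am|pm)\b", text)
        if room:
            self.slots["room"] = room.group(1)
        if time:
            self.slots["time"] = time.group(1) + time.group(2)

        # Step 3: answer directly, or ask a question to refine the request.
        missing = [s for s in REQUIRED_SLOTS if s not in self.slots]
        if missing:
            return f"Which {missing[0]}?"
        return f"Booking room {self.slots['room']} at {self.slots['time']}."
```

With this sketch, the request "Book room alpha" leads the assistant to ask "Which time?", and the follow-up "at 3 pm" completes the request. Transcription alone, by contrast, would stop before step 1.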

Nevertheless, the list mixes external and internal functions. External functions, such as “customer service”, use technologies similar to the consumer ones mentioned in the previous studies. They raise classic questions about the management of personal data, but there is no doubt about their cost-effectiveness. It seems obvious that replacing a human operator with a call-bot for first-level support tasks is a cost-effective choice. Whether it is socially responsible is another question!

Other functions such as “calendar”, “IT services”, or “document search” are genuinely internal. The authors describe these features “as an ergonomic asset”. This would mean that voice access to these functions is neither critical nor essential. Indeed, the functions mentioned, such as “appointment booking”, “room reservation” or “scheduling a customer appointment”, are already accessible through classic office tools. The people who use them are mostly in an office, in front of a computer. Is it really necessary, or faster, to set a customer appointment by voice rather than with three clicks in a classic interface? Except for people with disabilities, will the return on investment be commensurate with the expense of implementation?

For SPIX industry [2], it is essential that the implementation of voice and voice assistance bring real added value to the user and to the company that finances it. In the same way, it is important to consider all the personnel in the company who could benefit from voice solutions or voice assistance.

Wouldn’t the added value of voice and voice assistance be found in secondary-sector activities, rather than in the tertiary sector? This is the real revolution of the arrival of voice assistants in industry: investing in the voice of blue-collar workers to maximize the competitiveness of the industry of the future.



Known limits of consumer voice assistants

The first conclusion drawn from the previous studies is clear: consumer voice assistants are used mainly for very simple tasks, and mainly on smartphones with visual feedback.

What about the adoption and recurring use of connected speakers, and the feedback from their users? The second part of the Bloomberg study [1] is much more unexpected and provides relevant insights for industrial use.

This second part of the study shows that the main functions of Amazon Alexa are understood by users within 3 hours of installation. Subsequently, 25% of them abandon these devices in the following 2 weeks, which gives pause for thought! The reasons given for this finding are as follows:

  • The functions accessible and used are considered of little use

The functions highlighted by the previous studies are not revolutionary compared with the classic touch use of a tablet or a smartphone. Their added value remains in the realm of entertainment, gadgetry, or at best convenience. The PwC study [2] shows that functions with much higher added value (reserving a restaurant or a plane ticket, for example) are not yet used (16% and 0% respectively), but are widely desired for the future (32% and 26% respectively). Users therefore want voice and voice assistance to simplify complex, high value-added tasks.

In view of the Manutan study, we can legitimately ask whether voice control of classic office functions will fall into the same category. One exception: efficient voice dictation for industry, which understands business terms and knows the names of employees.

  • Users are concerned about the privacy of their data

Several recent studies reported by the media [3] [4] demonstrate that the capabilities of consumer voice assistants such as Alexa, Google-Home or Siri are improved by analyzing the exchanges between users and their assistant. This analysis necessarily involves listening to the voice commands given by the users to obtain the requested service. Doubt about data confidentiality is therefore sown in the minds of users.

In the industrial context, the questions are the same. First-level service and support call-bots for customers will therefore need to quickly clarify their position on the data they handle. The same goes for internal solutions: do they listen all the time? Is the voice recorded?

  • Spontaneous suggestions made by Alexa annoy users

Finally, the Bloomberg report makes a clear observation: users do not accept that their voice assistant gives them orders, or makes unsolicited suggestions for actions or purchases. First, it proves that the assistant exploits the user’s previous requests, which reinforces their mistrust about their personal data. Second, users find it difficult to accept that their voice assistant initiates the exchange: the human wants to keep the initiative.

And yet, one could imagine voice functions associated with critical situations such as fire alarms, climatic risks, or imminent danger. This type of spontaneous solicitation would be much more readily accepted. We will see that for these functions to be effective, the voice assistant must be “aware”, i.e. it must know the context and the environment in which it is used.

Thus, the functions available in consumer assistants are considered of little use by their users. Indeed, to be “intelligent” and therefore relevant, a voice assistant must understand the context of its use. In the consumer domain, users (rightly) do not want to share too much context with their assistant (where I am, what I do, what I like, my mood, …). The accessible functions are therefore necessarily simple.

The revolution of voice assistance in industry consists in using it for critical tasks (where office tools cannot be used in the traditional way) and for which a context is available (a task order, a work order, a form to be completed). So it’s time to turn on the voice in the production, maintenance and quality control departments of industry.





Why are blue-collar workers essential?

For a voice assistant to be considered “useful”, it must provide a “high value-added service” to the user and bring a “return on investment” (a profit) to the buyer (the payer).

As we have already seen in a previous article [1], for consumer voice assistants the “payer” is also the “provider” of the technology. The voice assistants Alexa, Siri or Google-Home are made available to users free of charge, and the non-financial trade-offs are largely underestimated. The economic model of these assistants is therefore intrinsically responsible for part of the users’ frustration.

As stated in a Wikipedia definition[2], the intelligence of a voice assistant (or personal assistant) is based on the knowledge of the user and the history of the data, i.e. the knowledge bases. This knowledge in a general public context is available in large quantities, but necessarily imperfect. This therefore results in services rendered to the user which are simple and without real added value.

We could conclude that the situation is hopeless… Fortunately not! It is enough to find fields of application for voice assistance technologies which make it possible to fulfill the two conditions: “value” for the user and “benefit” for the payer.

The implementation of voice solutions for blue-collar workers in industry meets the criteria of value and benefit, and delivers the expected return on investment (ROI) [3].

For several years, SPIX industry [4] has been developing an intelligent voice assistance technology for industry, and more particularly for technicians and blue-collar workers. Why does this assistant fulfill the two conditions stated above?

In the current context of Industry 5.0 [5] development, manual operators are increasingly faced with digital tasks. The digitization of tasks and the deployment of complex software for managing procedures, interventions, reports, etc. require men and women working in the field to use digital applications on smartphones or tablets. The problem: it is not their job, they often wear gloves, and they dislike this type of task, on which they “waste” a lot of time.

Using a voice assistant as an interface to the software they already use on a tablet or smartphone simplifies their digital tasks. They perform the digital actions requested of them, but in a simpler way, keeping their hands and eyes free. They retain the visual support of their usual software, which lets them grow accustomed to using voice. One day, they will be able to use their software without looking at the screen!

In this case, the added value and the service provided to the user are high, because the voice assistant has all the knowledge necessary for its operation. Indeed, the procedures, task orders, work instructions and report forms, supplemented by vocabulary bases and industry-specific ontologies, constitute the body of knowledge used to define the context in which users operate their voice assistant.

Users trust this intelligent Spix voice assistant because the rules for using their voice and company data are clear. Indeed, Spix’s SKILLS [6] allow the assembly of an embedded intelligent voice assistant that has no connection to an external cloud and does not require access to real-time enterprise data. It works as an interface to the operators’ business software, and this software remains in charge of managing its own business data. On the operator side, the voice is not recorded, operators can talk to their colleagues without interfering with the assistant, and the context of use is limited to their work context. The operator always keeps the initiative in the interaction with the assistant, except in the event of a safety alert. But in this case, the added value of the assistant’s intervention is obvious.

From an industry perspective, SPIX industry’s Spix voice assistant is commercial software. The technology developed by the company respects the confidentiality rules for industrial data, interfaces with business software through libraries [7], and meets the validation constraints for operational use in an industrial environment. Industrial companies therefore acquire this voice assistant to enable their technicians and field operators to make better use of their digital applications.

In this case, the manufacturer is the payer. Its return on investment [8] comes from fewer production quality defects, less non-value-added time for its operators, and safer human interventions. Finally, the manufacturer benefits from the increase in data that it is able to collect in real time on field operations carried out by its technicians.






The adoption of voice, for both private and industrial use, will require the emergence of “high value-added” services for users. For the general public, this is complicated, because it all depends on the data each of us is willing to share.

For industry, it is important not to aim at the wrong target. Controlling desktop software by voice while the mouse and the keyboard are available and usable does not bring real added value. As we have seen, for office users, only internal and confidential voice transcription servers capable of processing business terms provide a real value-added service.

The voice revolution will therefore involve technicians and blue-collar workers who have their hands full with their tasks and are increasingly being asked to interact digitally.

At SPIX industry, we believe in developing intelligent voice assistance solutions for business verticals. In production, inspection, quality control or maintenance jobs, it is possible to assemble the components needed to ensure the usefulness, usability and acceptability of the voice services rendered to the user, while delivering the return on investment expected by the manufacturer.

Press contacts
André JOLY – Managing Director
Phone. : +33 (0)6 25 17 27 94

Legal entity
Website :
Linkedin :
Simsoft3D SAS – 1244 rue l’Occitane – 31670 Labège (France)
“SPIX” and “SPIX industry” are registered trademarks of Simsoft3D SAS.