Comment from Clive: Voice recognition comes of age

Since the dawn of mainstream personal computing in the 1980s, the mouse and keyboard have become such an integral part of our daily routines that it might seem hard to imagine how or why we would consider life without them. And yet, technologists are always seeking breakthroughs that will herald the next big thing to revolutionise our increasingly digitised world.

Delegates at January’s Consumer Electronics Show in Las Vegas think they have found it in voice recognition technology, which has matured enough in the past few years that it is now poised to bring about a new era of faceless computing.

Whereas mice and keyboards often feel like extensions of our own bodies, our voices really are part of us. The growing capabilities of voice recognition are expected to fundamentally change the way we interact with our devices. Along with everyone else, older and disabled people will face a new set of opportunities and challenges as society adapts to a reality in which routine tasks can increasingly be carried out with speech alone.

Understanding human speech

Under the bonnet of these new tools are sophisticated technologies that help machines understand human speech by picking up on patterns and signals in our voices, allowing computers to become better listeners the more we talk to them. By one estimate, the error rate for voice recognition systems has plummeted from 43% in 1995 to just 6.3% this year, which is on a par with humans. With the advent of Apple’s Siri and Microsoft’s Cortana, voice-activated digital assistants have proliferated across all mainstream operating systems, and sales of stand-alone devices such as Google Home and the Amazon Echo are expected to double to £10 million in 2017.

For many disabled people, these tools and the innovations that will inevitably follow will be immensely liberating, saving time and energy, as well as helping to eliminate the need for a range of secondary assistive technologies. However, for others, these benefits may not arrive so readily. Most voice recognition tools are primed to understand speakers whose voices fall within a certain range, one that encompasses the majority of people. They tend to perform less well for people whose voices fall beyond these boundaries, such as those with distinct regional accents and older people. There are also approximately 100 million people worldwide living with a permanent speech impairment for whom mainstream voice recognition software is unlikely to be suitable.

Ways to solve the problems

Fortunately, there are ways around this problem. For example, giving voice recognition software more information to work with by capturing the person’s facial cues can help improve the accuracy of its output. Limiting the range of words the software is designed to recognise can vastly shorten the odds that it will settle on the correct phrase. Yet another option is to program the software to ask for clarification when it does not immediately understand what the user has said.

Voice recognition’s flaws might mean some people will have to wait longer than most before they can bark orders at their digital assistants, but – with a little ingenuity – its impact on disabled and older people’s lives is likely to be far greater than current offerings suggest.

Subscribe to Clive’s monthly dispATches newsletter for free for all the latest in public policy, social affairs and technology.

We will be posting his editorial on our blog every month for you to enjoy.


