If you are wondering who Zeva is and why I am asking her that question, well, Zeva is Zensar’s own enterprise-level AI virtual assistant. Since she’s busy with enterprise tasks, I’ll answer the question on her behalf.
Until the mid to late 1970s, manufacturers didn’t really care about computer interfaces. The few people who had access to computers were professionals or academics, and since the user base was so small, there was little reason to focus on how users interacted with their machines.
With the dawn of personal computing, the field of interaction opened up. All the major players (Xerox, Apple, IBM) started designing their own versions of the keyboard and mouse. In the 1980s, the discipline of human-computer interaction was born with a single objective: making computing easier for the masses.
Whenever there is a paradigm shift, it drives humans to innovate new technologies, and that innovation is essential to the progression of society. We are facing a similar shift today: the rising use of smartphones, tablets, VR and AR devices, smartwatches and smart home gadgets has made the keyboard and mouse impractical. And just like every time before, we are innovating. A number of technologies and products are poised to replace the traditional keyboard and mouse, and each of these new ways of interacting has its pros and cons: some claim to be faster, some claim to be easier, and some work where others simply can’t. Let’s explore the future of interacting with computing devices.
Voice
Speech is widely considered the future of interacting with computers. Since the launch of Siri in 2010, the world has been enthralled with voice interfaces. How convenient it has become to tell Google to turn off all the lights when you leave a room, ask Alexa to play your favourite song when you are bored, or have Siri check the weather when you are planning a trip to Lonavla. According to some estimates, voice-first devices have already reached a total footprint of around 33 million devices in circulation.
At Zenlabs, we have developed an artificially intelligent virtual assistant called Zeva (short for Zensar’s Enterprise Virtual Assistant). It has been integrated with multiple Zensar platforms such as ZenVerse and ZenMCM, so people can query those platforms and get answers in real-time. Sample queries include the following (a rough sketch of how such a query might be routed appears after the examples):
“Hi Zeva, who has asked the highest number of questions on ZenVerse?”
“Hi Zeva, what’s my meeting schedule for today?”
“What are the action items for Ajay Bhandari?”
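To give a sense of how an assistant might turn such queries into answers, here is a minimal keyword-based intent-routing sketch in Python. The intent names, keyword lists and handler are hypothetical illustrations, not Zeva’s actual implementation.

```python
# Purely illustrative sketch of keyword-based intent routing for a virtual
# assistant. The intents and keywords below are hypothetical placeholders,
# not Zeva's actual implementation.

INTENT_KEYWORDS = {
    "meeting_schedule": ["meeting", "schedule"],
    "action_items": ["action items"],
    "platform_stats": ["highest number of questions"],
}

def classify_intent(query: str) -> str:
    """Return the first intent whose keywords all appear in the query."""
    text = query.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if all(keyword in text for keyword in keywords):
            return intent
    return "fallback"

def handle(query: str) -> str:
    intent = classify_intent(query)
    # A real assistant would call the relevant platform API (calendar,
    # ZenVerse, etc.) here; this sketch just reports the detected intent.
    return f"Detected intent: {intent}"

if __name__ == "__main__":
    print(handle("Hi Zeva, what's my meeting schedule for today?"))
    print(handle("What are the action items for Ajay Bhandari?"))
```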
Although speech can be useful and fast, it has its downsides. Suppose you have to compose a whole paragraph or write a paper: speaking it aloud starts to interfere with the part of the brain that is composing the information for you. Speech interfaces can also be slow, and embarrassing when other people are around. And they always require a wake phrase like “Okay Google”, “Alexa” or “Hey Siri”, which gets pretty annoying after a while.
Thankfully, though, talking into midair is no longer our only option.
Wearable keyboards
Tap Systems, Inc. is trying to reimagine how portable and ergonomic a keyboard can be. Tap is a smart wearable device that lets the user type without a physical keyboard. According to the company, it is designed to support strain-free input over long periods. Unlike a traditional keyboard, the user taps on any surface with a combination of fingers to enter a particular letter or symbol; for example, tapping with your thumb and index finger simultaneously inputs the letter ‘N’.
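To illustrate the chord idea, here is a small sketch that looks up a tapped finger combination in a table and returns a letter. Apart from the thumb-and-index combination for ‘N’ mentioned above, the mappings are made-up placeholders rather than Tap’s real alphabet.

```python
# Illustrative chord-to-letter lookup for a Tap-style wearable keyboard.
# Only the thumb+index -> 'N' combination comes from the text above; the
# other entries are invented placeholders, not Tap's real alphabet.

CHORD_MAP = {
    frozenset({"thumb"}): "A",            # hypothetical
    frozenset({"index"}): "E",            # hypothetical
    frozenset({"thumb", "index"}): "N",   # example given in the article
}

def decode_tap(fingers):
    """Translate the set of fingers that touched the surface into a letter."""
    return CHORD_MAP.get(frozenset(fingers), "?")

print(decode_tap({"thumb", "index"}))  # -> N
```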
The major market the company is focusing on is VR and AR, where you cannot see what you are typing on a keyboard. It is definitely an interesting piece of hardware, but there is a learning curve: the user has to remember a lot of finger combinations for different letters.
Gestures
The way computers perceive visual information is an entirely different discussion. Computer scientists have been researching computer vision for a fairly long time, and today a computer can tell whether you are stirring a cup of coffee or opening a laptop. That in itself is fascinating, because each image or frame is just a long list of binary digits encoding information about each pixel.
Many of these smart devices come with a camera, sometimes several, that can perceive depth and generate a depth map of the scene. Combined with trained neural networks, this lets computers identify the humans in a scene, how they are posing and how they are moving. This technology opens the gates to gesture-based interaction models.
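As a concrete illustration, the sketch below uses OpenCV and MediaPipe Hands (an off-the-shelf hand-tracking model, chosen here only as an example and not necessarily the stack any of these products use) to pull hand landmark coordinates out of webcam frames.

```python
# Sketch: track a hand in webcam frames and print the index fingertip position.
# Uses OpenCV for capture and MediaPipe Hands as an example tracking model.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # A frame really is just an array of pixel values (height x width x 3 channels).
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = hands.process(rgb)
        if results.multi_hand_landmarks:
            # Landmark 8 is the tip of the index finger; coordinates are
            # normalised to the [0, 1] range of the image.
            tip = results.multi_hand_landmarks[0].landmark[8]
            print(f"index fingertip at ({tip.x:.2f}, {tip.y:.2f})")
        cv2.imshow("hand tracking", frame)
        if cv2.waitKey(1) & 0xFF == 27:  # press Esc to stop
            break
cap.release()
cv2.destroyAllWindows()
```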
Leap Motion takes the idea of hand interaction a step further. The company builds sensors that detect your hands moving in space and model them inside virtual environments. The key is detecting every movement and reflecting it in VR in real-time.
We have been working on hand gesture recognition for retail systems for quite some time. We have trained our own convolutional neural network that detects hand gestures in real-time, and on top of it we have built a system that lets the user control the mouse pointer with hand movements; the user can click and scroll too.
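Here is a minimal sketch of the pointer-control idea, assuming an upstream tracker (such as the one sketched above, or a CNN like ours) already supplies a normalised fingertip position and a gesture label. The gesture names are illustrative, and pyautogui simply stands in for whatever pointer API a real system would drive.

```python
# Sketch: turn a tracked hand position and gesture label into mouse actions.
# Assumes an upstream model supplies a normalised (x, y) fingertip position
# and a gesture name; the gesture labels here are hypothetical.
import pyautogui

SCREEN_W, SCREEN_H = pyautogui.size()

def apply_hand_state(x_norm: float, y_norm: float, gesture: str) -> None:
    """Move the pointer to the hand position and act on the gesture."""
    pyautogui.moveTo(int(x_norm * SCREEN_W), int(y_norm * SCREEN_H))
    if gesture == "pinch":         # hypothetical label for a click gesture
        pyautogui.click()
    elif gesture == "fist_up":     # closed fist moving up -> scroll up
        pyautogui.scroll(120)
    elif gesture == "fist_down":   # closed fist moving down -> scroll down
        pyautogui.scroll(-120)

# Example: fingertip at the centre of the screen, user makes a pinch.
apply_hand_state(0.5, 0.5, "pinch")
```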
Speed is the major advantage of gestures: they are much quicker than speaking full sentences. For example, instead of asking the computer to “scroll 10% down”, you can simply close your fist as if grabbing the page and move it up or down to scroll.
But if you have to communicate a longer piece of information, gestures are probably not the best option.
Brain Control
How cool would it be if a real brain could communicate with an artificial one? Well, it is a reality now. We can capture data from physiological functions using devices that track the activity of different systems in the body, including headsets that can sense brain waves. Much like neurons relay signals within our body, these headsets pick up brain signals and translate them into operations.
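As a rough illustration of the “translate signals into operations” step, the sketch below computes alpha-band (8 to 12 Hz) power from a window of EEG samples with NumPy and fires a command when it crosses a threshold. The signal here is synthetic noise and the threshold is arbitrary; a real headset’s SDK would supply the samples.

```python
# Sketch: turn a window of EEG samples into a simple on/off command by
# thresholding alpha-band (8-12 Hz) power. The signal is synthetic noise;
# a real headset SDK would supply the samples.
import numpy as np

FS = 256  # sampling rate in Hz, typical for consumer EEG headsets

def band_power(samples: np.ndarray, low: float, high: float) -> float:
    """Average spectral power of `samples` between `low` and `high` Hz."""
    spectrum = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / FS)
    mask = (freqs >= low) & (freqs <= high)
    return float(spectrum[mask].mean())

window = np.random.randn(FS * 2)   # two seconds of fake EEG
alpha = band_power(window, 8.0, 12.0)
THRESHOLD = 50.0                   # arbitrary, illustrative value
print("command triggered" if alpha > THRESHOLD else "no command")
```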
At Stanford, researchers have developed a technology called Brain Gate that has enabled a woman suffering from A.L.S. to express her thoughts by typing on a screen, not with her fingers but with her brain waves.
The current generation of brain-wave-sensing headsets is not that powerful, so users need to concentrate hard to issue commands to the system.
Conclusion
All these new technologies raise the question: among sound (voice), haptics (touch), vision (gestures) and bio-feedback (brain control), which is the best one to use?
The use cases point towards the answer. I believe these technologies will not compete against each other; rather, they will work together in a multimodal interface. We will communicate through one interface and get the response through another. It will be complicated to develop and to adapt to, but that has never stopped us from innovating, has it?
References
- Adam Marchick, The 2017 Voice Report by Alpine (fka VoiceLabs): https://medium.com/@marchick/the-2017-voice-report-by-alpine-fka-voicelabs-24c5075a070f
- Wall Street Journal, The race to replace your keyboard: https://youtu.be/lmEc3QaC08E?list=LLP5mLUHyhy9VmPreVwrrHUw
- David Rose (IDEO), Why Gesture is the Next Big Thing in Design: https://www.ideo.com/blog/why-gesture-is-the-next-big-thing-in-design
- Stanford researchers harness brain waves for movement with ‘Brain Gate’: https://abc7news.com/health/stanford-researchers-harness-brain-waves-for-movement/1825212/
- Wearable device revenue worldwide (Statista): https://www.statista.com/statistics/610447/wearable-device-revenue-worldwide/