Last year, I wrote about how gesture user interfaces are becoming mainstream. Today, with the IMS Touch Gesture Motion conference coming up next week in London, I thought I'd revisit that topic from a different angle.
Among my colleagues at BDTI, there's been a healthy debate going on about whether gesture user interfaces are a gimmick - essentially a solution looking for a problem - or something that really adds value. With tens of millions of Kinect devices sold, I think that Microsoft has demonstrated the value of gesture control for video games - not necessarily for traditional serious gamers, but for the potentially much larger audience of casual gamers.
Reportedly, Microsoft now plans to position Kinect as a controller not only for video games, but also for watching television and for other living room multimedia experiences. To me, this makes very good sense. I've spent countless hours setting up new TVs and related components for family members and providing tutorials on how to use them, only to find that in many cases the user interfaces (based on a small army of infrared remote controls, or on one complex unified remote control) are too complex for casual, non-technically-minded users. For those people - arguably the majority of TV owners - wouldn't it be better to simply be able to stare at the dark screen for a few seconds and have the TV recognize that someone is staring at it, turn on, and offer a simple, gesture-based menu of basic options?
This leads me to what I think is a crucial, and often overlooked, question in debates about the value of gesture control: What is a gesture? Most discussions of gesture control assume that gesture control systems respond only to hand gestures or, as in the case of Kinect-based video games, to the overall position and movement of a user's body. But isn't the act of looking at something also a gesture? A person's glance can be used both in broad-brush ways ("turn on the display if someone looks at it for more than three seconds") and in precise ways ("move the cursor to the location on the screen where the user's eyes are currently focused").
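To make the "wake on stare" idea concrete, here is a minimal sketch of the dwell-time logic, using OpenCV's stock frontal-face detector as a crude stand-in for true gaze tracking. The camera index, the three-second threshold, and the wake_display() callback are illustrative assumptions on my part, not a description of any product mentioned above.

import time
import cv2

DWELL_SECONDS = 3.0  # how long a face must be seen before the display wakes

# Stock Haar cascade shipped with OpenCV; a detected frontal face is used
# here as a rough proxy for "someone is looking at the screen".
face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def wake_display():
    # Placeholder: in a real TV this would power up the panel and show a
    # simple gesture-based menu.
    print("Display on: showing gesture menu")

camera = cv2.VideoCapture(0)   # camera index 0 is an assumption
gaze_started = None            # time when a face first appeared

while True:
    ok, frame = camera.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    if len(faces) > 0:
        if gaze_started is None:
            gaze_started = time.time()          # start the dwell timer
        elif time.time() - gaze_started >= DWELL_SECONDS:
            wake_display()                      # sustained "stare" detected
            break
    else:
        gaze_started = None                     # face lost; reset the timer

camera.release()

The same dwell-timer pattern applies whether the trigger is a detected face, a tracked gaze point, or a held hand pose; only the detector changes.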
Similarly, aren't head nods and shakes also gestures - and ones that are very natural for users? And if head nods and shakes are gestures, then what about facial expressions? At the recent IFA consumer electronics show, consumer products behemoth Haier demonstrated a television that can be controlled by eye blinks.
Of course, there are intentional facial expressions, such as deliberate eye blinks, and then there are automatic facial expressions that reflect our emotional state. Are you delighted, bored, frustrated? If so, that information is reflected in your facial expression and can be inferred via embedded vision technology, as shown by Rosalind Picard in her research work at MIT and her start-up company, Affectiva.
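As a rough illustration of how a camera-equipped system can read a facial expression, the sketch below uses OpenCV's stock face and smile cascades to flag whether a viewer appears to be smiling in a single frame. This is a deliberately simple example of my own, not Affectiva's technology, and detecting a smile is a far cry from the richer emotion inference Picard's research addresses.

import cv2

# Stock OpenCV cascades: a face detector plus a simple smile detector.
face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
smile_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_smile.xml")

def frame_shows_smile(frame):
    """Return True if any detected face in the frame appears to be smiling."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_detector.detectMultiScale(gray, 1.1, 5):
        face_region = gray[y:y + h, x:x + w]
        # A high minNeighbors value suppresses spurious smile detections.
        smiles = smile_detector.detectMultiScale(face_region, 1.7, 20)
        if len(smiles) > 0:
            return True
    return False

camera = cv2.VideoCapture(0)   # camera index 0 is an assumption
ok, frame = camera.read()
if ok:
    print("Viewer appears pleased" if frame_shows_smile(frame) else "No smile detected")
camera.release()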
In contemplating the potential of gesture-based user interfaces, I think it's important that we consider a broad definition of what kinds of gestures may be useful. For some applications, hand gestures alone might be best, but for others, a combination drawn from face detection, face recognition, glance tracking, hand gestures, body gestures, deliberate facial expressions, and emotional-state inference may yield a much more effective solution.
Considering how primitive and frustrating today's user interfaces are for many of our increasingly complex and sophisticated systems, I'm convinced that various forms of gesture-based user interfaces will play a major role in many fields. For this reason, I'm very excited that we've been able to enlist Professor Rosalind Picard of MIT to be the morning keynote speaker at the upcoming Embedded Vision Summit in Boston on September 19th.
If you're an engineer involved in, or interested in learning about, how to incorporate embedded vision capabilities into your designs, I invite you to join us at the Summit, hosted by the Embedded Vision Alliance. The Summit will provide a technical educational forum for engineers, including how-to presentations, demonstrations, visionary keynote presentations, and opportunities to interact with Alliance member companies. Space is limited and filling up fast, so please register now. To begin the registration process for the Embedded Vision Summit, please send an email to summit@Embedded-Vision.com containing your name, job title, and company or organization name. We will respond with further details via email.
Jeff Bier is president of BDTI and founder of the Embedded Vision Alliance. Post a comment here or send him your feedback at http://www.BDTI.com/Contact.