Saurabh Gupta On Ultraleap’s Push To Bring Touchless Interaction To DOOH And Digital Signage

November 24, 2021 by Dave Haynes

If you have been in the industry for a while, you’ll maybe remember all the excitement around using gesture technology to control screens. That was followed by the letdown of how crappy and feeble these gesture-driven touchless working examples turned out to be.

Like just about everything, the technology and the ideas have got a lot better, and there is a lot of renewed discussion about how camera sensors, AI and related technologies can change up how consumers both interact … and transact.

Ultraleap is steadily developing a product that lets consumers interact with and experience digital displays using sensors and, when it makes sense, haptic feedback. The company was formed in 2019 when Ultrahaptics acquired Leap Motion, and the blended entity now operates out of both Silicon Valley and Bristol, England.

Leap Motion was known for a little USB device and a lot of code that could interpret hand gestures in front of a screen as commands, while Ultrahaptics used ultrasound to project tactile sensations directly onto a user’s hands, so you could feel a response and control that isn’t really there. Or something like that. It’s complicated stuff.

I had an interesting chat with Saurabh Gupta, who is charged with developing and driving a product aimed at the digital OOH ad market, one of many Ultraleap is chasing. We got into a bunch of things – from how the tech works, to why brands and venues would opt for touchless, when touchscreens are so commonplace, as is hand sanitizer.


Hey, Saurabh, thank you for joining me. Let’s get this out of the way. What is an Ultraleap and how did it come about? 

Saurabh Gupta: Hey, Dave, nice to be here. Thank you for having me. Ultraleap is a technology company and our mission is to deliver solutions that remove the boundaries between physical and digital worlds. We have two main technologies. We have a computer vision-based hand tracking and gesture recognition technology that we acquired, and on the other side of the equation, we have a haptic technology using ultrasound. We started out as a haptics company, and that’s what our founder and CEO, Tom Carter, built when he was in college. Being able to deliver the sense of touch in mid-air using ultrasound was a breakthrough idea, and that was how we started. To project haptic sensations in mid-air, one of the key components is that you need to understand where the hands are in space, and for that we were using computer vision technology by Leap Motion to track and locate users’ hands. Some of your listeners may already know about Leap Motion. Leap Motion has been a pioneer in gesture based hand tracking technology since 2010. They’ve got 10 plus years of pedigree in really refining gesture based hand tracking models. So we had an opportunity to purchase them, we completed the acquisition in 2019, and we rebranded ourselves to Ultraleap.

So that’s how we started. As stated in our mission, it’s all about focusing on user experience for the use cases of how users are interacting with their environment, and that environment could be a sort of a 2D screen in certain applications, the application that we’ll probably talk about today, but also other aspects of augmented reality and virtual reality, which are on the horizon and are emerging technologies that are gaining more ground. So that’s the central approach. How can we enhance the interactivity that users have with a physical environment through input and output technology offerings, with gesture as the input and haptics as the output?

The whole gesture thing through the years has been kind of an interesting journey, so to speak. I can remember some of the early iterations of Microsoft Kinect gesture sensors, and display companies and solutions providers doing demos showing that you can control a screen by waving your hand, lifting it up and down and this and that, and I thought this is not going to go anywhere. It’s just too complicated. There’s too much of a learning curve and everything else.

Now, the idea as it’s evolved and like all technology got a lot better is, it’s more intuitive, but it’s still something of a challenge, right? There’s still a bit of a curve because we’re now conditioned to touching screens.

Saurabh Gupta: Yeah, you’re right. One of the key aspects here is that gesture has been around. There’s been research that goes back to the early 90s, if not the 80s, but computer vision technology in general has come a long way. The deep learning models that are powering our hand tracking technology today are a lot more sophisticated. They are more robust, they are more adaptable and they are able to train based on a lot of real world inputs. So what that really means is that since the computing power and the technology behind recognizing gestures has improved, a lot of that has manifested itself in a more approachable user experience, and I completely accept the fact that there is a gap and we’ve got 10 plus years of learned behavior of using a touchscreen. We use a touchscreen every day, carry it in our pockets, but you also have to understand that when touch screens became prevalent, there was the type keyboard before that.

So the point that I’m making here with this is that we are pushing the envelope on new technologies and a new paradigm of interactivity. Yes, there is a learning curve, but those are the things that we are actually actively solving for:

The gesture tracking technology should be so refined that it is inclusive and is able to perform in any environment, and I think we’ve made some really good steps towards that. You may have heard of our recent announcement of our latest hand tracking offering called Gemini. The fundamental thing with Gemini is that it’s based on years and years of research and analysis on making the computer vision deep learning models that power that platform robust, low latency, high yield in terms of productivity and really high initialization, which means as part of the user experience, when you walk up to an interface, you expect to use it right away. We know we can do that with touch screens, but if you put this technology complementary to an interface, what we are solving for at Ultraleap is: when somebody walks up to a screen and they put up their hand to start to interact, the computer vision technologies should instantly recognize that there’s a person who is looking to interact. That’s number one, and I think with Gemini, with the deep model work that we’ve done, we’ve made some good progress there. Number two is, once the technology recognizes that a person wants to interact, can we make it more intuitive for the person to be as or more productive than she would be with a touchscreen interface? And that’s where I think we’ve made more progress. I will say that we need to make more progress there, but here are some of the things that we’ve done, Dave. We have a distance call to interact, which is a video tutorial attraction loop that serves as an education piece.

And I’ll give you a stat. We ran a really large public pilot in the Pacific Northwest at an airport, and the use case there was immigration check-in, so people coming off the plane, before they go talk to a border security agent, some people have to fill out their information on a kiosk. So we outfitted some kiosks with our gesture based technology and the rest were the controls, which were all touchscreen based, and over multiple weeks we ran this study with active consumers who actually had very little to no prior experience using gestures. We did this A/B test where we measured the gesture adoption rate on the kiosks without a call to interact and with a call to interact, and it increased the gesture adoption rate by 30%, which means that it certainly is helping people to understand how to use the interface.

The second stat that came from it was that at the end of the pilot, we were almost at a 65% gesture adoption rate, which means more than 6 out of 10 people who used that interface used gesture as the dominant interface for input control, and the third piece of this was how long did it take for them to finish their session? We measured that using the gesture based interaction, the time was slightly higher than for the control group that was using a touchscreen, but it wasn’t much, it was only 10% higher. Now one can look at that stat and say in a transactional setting where you know it’s going to take you 30 seconds to order a burger, adding an extra second can be a problem, but at the same time, those stats are encouraging for us when we look at that as the baseline to improve from.

So if I’m listening to this and I’m trying to wrap my head around what’s going on here, this is not a gesture where you’re standing 3 feet away from a screen and doing the Tom Cruise Minority Report thing, where you’re waving your arm and doing this and that. Can you describe it? Because you’re basically doing touch-like interactions and the ultrasonic jets or blasts of air or whatever are giving you the feedback to guide you, right?

Saurabh Gupta: So we’ve got two avenues that we are going at this from. One is for the self service type offering, so you think of check-in kiosks or ordering kiosks at restaurants or even digital wayfinding, digital directories. We are solving for those, at least in the first phase, primarily led by our gesture tracking technology. So gesture being the input modality, complementary to touch.

So, what we do is we build a touch-free application, which is a ready to use application that is available today on Windows based media players or systems to convert existing touch screen-based user interfaces to gesture. We’ve made the transition a lot more intuitive and easier because we’ve done a lot of research on this and replicated interaction methods, or gestures, you would call them. I hate to use gestures as a word, because it gets tagged with weird hand poses and things like that, people pinching and all of that. For us, it’s all about how we can replicate the same usage that a typical average consumer will have when she interacts with a touch screen based interface.

So we came up with an interaction method that we call Airpush which is basically, to explain it to your listeners, all about using your finger and moving towards an interactive element on screen. But what happens is the button gets pressed even before you reach it, based on your forward motion or interaction. Now, the smart math behind all of this is that not only do we track motion, but we also track velocity, which means that for people who are aggressive in terms of their button pressing, which means they do short jabs, we can cater for those, and for people who are more careful in their approach as they move towards the screen, the system is adaptable to cater to all interaction types. We track all the fingers, so you can use multiple fingers or different fingers as well. So these are some of the things that we’ve included in our application.
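To make the idea of a velocity-aware press concrete, here is a minimal sketch. It is not Ultraleap's Airpush algorithm, which is not described in this conversation; it only illustrates one way a button could fire before the finger reaches the screen. The names `FingerSample`, `forward_velocity` and `should_press`, and all threshold values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FingerSample:
    t: float         # timestamp in seconds
    distance: float  # fingertip distance from the screen plane, in mm


def forward_velocity(prev: FingerSample, curr: FingerSample) -> float:
    """Speed toward the screen in mm/s (positive means approaching)."""
    dt = curr.t - prev.t
    if dt <= 0:
        return 0.0
    return (prev.distance - curr.distance) / dt


def should_press(prev: FingerSample, curr: FingerSample,
                 base_trigger_mm: float = 30.0,
                 lookahead_s: float = 0.1,
                 max_trigger_mm: float = 150.0) -> bool:
    """Fire a press when the fingertip enters a trigger zone in front of the
    screen. The zone grows with forward speed (all values are assumptions),
    so a fast jab triggers farther out than a slow, careful approach."""
    v = forward_velocity(prev, curr)
    if v <= 0:
        return False  # finger is still or moving away from the screen
    trigger_mm = min(base_trigger_mm + v * lookahead_s, max_trigger_mm)
    return curr.distance <= trigger_mm
```

In this sketch a fast jab at 3,000 mm/s fires when the fingertip is still 120 mm from the screen, while a slow approach at 50 mm/s fires only once the finger is within 35 mm.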

So that’s one side. The second side is all about interactive advertising and immersion, and that’s where I think we use our haptic technology more, to engage and involve the user in the interactive experience that they’re going through. So for self service and more transactional type use cases, we’re using primarily our hand gesture technology. And for immersive experiential marketing, or even the digital out-of-home advertising type of use cases, we are leading with our haptic based technology.

And you’re involved on the digital out-of-home side, right? That’s part of your charge?

Saurabh Gupta: That’s correct. So I lead Ultraleap’s out-of-home business. So in the out-of-home business, we have both self service retail, and digital out-of-home advertising businesses that we focus on.

So how would that manifest itself in terms of, I am at a train station or I’m out somewhere and there’s a digital out-of-home display and I go up and interact with it and you’re saying it’s a more robust and rich experience than just boinking away at a touchscreen. What’s going on? What would be a good example of that?

Saurabh Gupta: So a good example of digital out of home activations is that we’ve partnered with CEN (Cinema Entertainment Network) where we’ve augmented some of their interactive in cinema displays that are being sold from a programmatic perspective. Now the interactive piece is still being worked into the programmatic side of things, but that’s one example of an interactive experience in a place based setting.

The other example is experiential marketing activations that we’ve done with Skoda in retail malls and also an activation that we did with Lego for Westfield. So these are some of the experiences that we’ve launched and released with our haptics technology and on the self service side we’ve been working with a lot of providers in the space you may have heard of. 

Our recent pilot concluded with PepsiCo where we are bringing in or trialing gestures for their ordering kiosks for their food and beverage partners. So these are some of the things that are going on on both sides in the business.

So for the Lego one or the Skoda one, what would a consumer experience?

Saurabh Gupta: So these are all interactive experiences. So for Lego, it was about building a Lego together. So basically using our haptic technology, which obviously contains gestures as the input, moving Lego blocks and making an object that was being displayed on a really large LED screen at one of the retail outlets in London. A user would walk up, use their hands in front of our haptic device to control the pieces on the screen, and then join them together and make a Lego out of it. While they’re doing that, they’re getting the tactile sensation of joining the pieces, and that all adds up to a really immersive, engaging experience within a digital out of home setting.

So you get the sensation that you’re snapping Lego pieces together? 

Saurabh Gupta: Yeah, snapping pieces together, controlling so you get the agency of control, and it’s one of those sensations that gives you a very high memorability factor.

I don’t know whether you track the news. This was in 2019. We did a really extensive activation with Warner Brothers in LA, and what we did was at one of the cinemas down there for Warner Brothers’ three upcoming movies, Shazam, The Curse of La Llorona, and Detective Pikachu, we added interactive movie posters using haptics in the cinema lobby, and this would complement the digital poster network that was already existing at that location. Over the course of the activation, which was around six weeks long, we had almost 150,000 people that went through the cinema, and in partnership with QBD, we did a lot of analytics around what the performance was of an interactive movie poster experience within a digital out-of-home setting and got some really great stats.

We measured a conversion rate between an interactive experience versus a static digital signage experience. The conversion rate was almost 2x, 33% increase in dwell time, like people were spending more time in front of an interactive sign versus a static sign. Attention span was significantly higher at 75%, 42% lift in brand favorability. So these are really interesting stats that gave us the confidence that haptic technology combined with gesture based interface has a lot of value in providing and delivering memorable experiences that people remember.

And that’s the whole point with advertising, right? That’s the whole point. You want to present experiences that provide a positive association of your branded message with your target consumer, and we feel that our technology allows that connection to be made.

One of the assumptions/expectations that happened when the pandemic broke out was that this was the end of touchscreens, nobody’s ever going to want to touch the screen again, the interactivity was dead and I made a lot of those assumptions myself and turns out the opposite has happened. The touch screen manufacturers have had a couple of pretty good years and the idea is that with a touchscreen, you can wipe it down and clean your hands and do all that stuff. But you’re at a far greater risk standing four feet away from somebody across a counter, ordering a burger or a ticket or whatever it may be. 

So when you’re speaking with solutions providers, end user customers and so on are you getting the question of, “Why do I need to be touchless?”

Saurabh Gupta: Yeah, it’s a fair point, Dave, and let me clarify that. Look, from our perspective, we are focusing on building the right technology and building the right solutions that elevate the user experience. Hygiene surely is part of that equation, but I accept your points that there are far greater risks for germ transmission than shared surfaces, I totally accept that, and yes, there is a TCO argument, the total cost of ownership argument that has to be made here also. 

The point that I will make here is that we fundamentally believe and being a scale-up organization that is focusing on new technology, we have to believe that we are pushing the technology envelope where what we are focusing on is elevating the user experience from what the current model provides. So yes, there will be some use cases where we are not a good fit, but contactless as a category or touchless as a category, maybe the pandemic catalyzed it, maybe it expedited things, but that category in itself is growing significantly. 

A couple of stats here, right? The contactless payment as a category itself, 88% of all retail transactions in 2020 were contactless. That’s a pretty big number, and assuming that retail is a $25 trillion market, that’s a huge chunk.

But that’s about speed and convenience though, right?

Saurabh Gupta: Totally. But all I’m saying is contactless as a category is preferable from a user perspective. Now, gesture based interactivity as a part of that user flow, we fundamentally believe that gesture based interactivity plays a part in the overall user journey. So let me give you an example. 

Some of the retailers that we are talking to are thinking about new and interesting ways to remove levels of friction from a user’s in-store experience. So there are multiple technologies that are being trialed at the moment. You may have heard of Amazon’s just walk out stores as an example. You don’t even have to take out your wallet and that is completely based on computer vision, as an example, but there are other retailers who are looking to use technology to better recognize who their loyal customers are. So think of how we used to all have loyalty cards for Costco or any other retailer. 

They’re removing that friction to say, when you walk through the door, you’ve done your shopping and you’re at the payment counter, we can recognize who you are. And if we recognize who you are, we can give you an offer at the last mile, and in that scenario, they are integrating gestures as part of the completely contactless flow. This is where I think we are gaining some traction. There is a product that we are a part of that hasn’t been announced yet. I can’t go into details specifically on who it is and when it’s going to be released. But we are part of a computer vision based fully automated checkout system that uses gesture as the last mile for confirmation and things of that nature. That’s where we are gaining traction.

The overall point here is that we are focusing on really showcasing and delivering value on how you can do certain things in a more natural and intuitive way. So think of digital wayfinding at malls, right? You have these giant screens that are traditionally touchscreens, right? When you think of that experience, it has a lot of friction in it, because first of all, you can’t use touch as effectively on a large screen because you can’t swipe from left to right to turn a map, as an example. We fundamentally believe that the product could be better with gesture. You can gesture to zoom in, zoom out, rotate a map, and find your direction to a store. That experience can be augmented by adding gesture capability as opposed to using a touchscreen based interface. So those are the high value use cases that we are focusing on.

So it’s not really a case where you’re saying, you don’t need a touchscreen overlay anymore for whatever you’re doing, Mr. Client, you just use this instead. It’s tuned to a particular use case and an application scenario, as opposed to this is better than a touch overlay?

Saurabh Gupta: I think that is a mission that we are driving towards, which is, we know that there is potentially a usability gap between gesture, given where it is in its evolution, and touchscreen. We are looking to bridge that gap and get to a point where we can show more productivity using gesture.

And the point is that with our technology, and this is something that you referenced a second ago, you can turn any screen into a touchscreen. So you don’t necessarily need a touchscreen and then you can convert it to gesture. You can convert any LCD screen to an interactive screen. So there is some deep argument there as well.

What’s the kit, like what are you adding?

Saurabh Gupta: Just a camera and a USB cable, and some software.

And if you’re using haptics feedback, how does that work?

Saurabh Gupta: So haptics is a commercial off-the-shelf product. So it’s another accessory that gets added to the screen. However, that contains the camera in it so you don’t need an additional camera. That also connects to external power and a USB back to the media player.

So as long as you’ve got a USB on the media player, you’re good, and right now your platform is Windows based. Do you have Android or Linux? 

Saurabh Gupta: Good question, Dave. So right now we are Windows based, but we know it’s of strategic importance for us to enable support on additional platforms. So we are starting to do some work on that front. You’ll hear some updates from us early next year on at least the hand tracking side of things being available on more platforms than just Windows. 

How do the economics work? I suspect you get this question around, “All right. If I added a touch overlay to a display, it’s going to cost me X. If I use this instead, it’s going to cost me Y.”

Is it at that kind of parity or is one a lot more than the other? 

Saurabh Gupta: It depends on screen size, Dave, to be honest. So the higher in screen size you go, the wider the gap is. I would say that for a 21 or 23 inch screen and up, the economics are in our favor for a comparable system.

And are you constrained by size? I think of all the LED video walls that are now going into retail and public spaces and so on, and those aren’t touch enabled. You really wouldn’t want to do that, and in the great majority of cases with this, in theory, you could turn a potentially fragile, please don’t touch surface like that into an interactive surface, but are you constrained to only doing things like a 55 inch canvas or something?

Saurabh Gupta: This will require a little bit of technical explanation. The Lego example that I talked about was targeted on, I would say a large outdoor LED screen. So the concept here is that if you want one-to-one interactivity. 

So what do I mean by one-to-one interactivity? One-to-one interactivity is that basically, in our interface, when the user approaches the screen, there is an onscreen cursor that shows up, and that onscreen cursor is the control point for the user. Now for us to achieve one-to-one interactivity, where the cursor is at the same height as the finger, or there’s no parallax between where the finger is and where the cursor is, the sensor has to be connected to or at the screen, and when it is connected to the screen, based on our current camera technology, we can control up to a 42 inch screen for one-to-one interactivity. But we’ve also been showing examples where if you mount the sensor slightly in front of the display, then you can cover a wider area, and we’ve been able to showcase examples of our technology being used on up to a 75 inch LCD screen in portrait mode.
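The difference between one-to-one and offset control can be sketched as a coordinate mapping. This is an illustrative assumption, not Ultraleap's implementation: the function `map_to_pixels`, its parameters and the dimensions used below are all hypothetical. For one-to-one control the tracked span equals the physical screen size, so the cursor lands directly under the finger; for a sensor placed in front of a larger display, a smaller tracked zone is stretched over the whole screen.

```python
def map_to_pixels(finger_x_mm: float, finger_y_mm: float,
                  span_w_mm: float, span_h_mm: float,
                  res_x: int, res_y: int) -> tuple[int, int]:
    """Map a fingertip position (mm from the top-left of the tracked span)
    to a cursor position in pixels, clamped to the screen bounds.

    One-to-one: span = physical screen size, so there is no parallax.
    Offset sensor: span = a smaller interaction zone scaled to the screen.
    """
    px = round(finger_x_mm / span_w_mm * res_x)
    py = round(finger_y_mm / span_h_mm * res_y)
    px = min(max(px, 0), res_x - 1)
    py = min(max(py, 0), res_y - 1)
    return px, py


# One-to-one on a hypothetical 1000 mm x 500 mm screen at 1920 x 1080:
cursor_direct = map_to_pixels(250, 125, 1000, 500, 1920, 1080)

# Same hand position inside a hypothetical 500 mm x 250 mm zone in front of
# a larger display: the cursor travels twice as far per millimetre of motion.
cursor_scaled = map_to_pixels(250, 125, 500, 250, 1920, 1080)
```

The trade-off in the scaled case is that the cursor no longer sits under the finger, which is why the on-screen cursor matters more as the display gets bigger.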

So then any larger than that, the scale gets a little wonky, right? Cause you’ve got a person standing in front of a very large display and it just starts to get a little weird.

Saurabh Gupta: Yeah. It’s like putting a large TV in a small living room. You need to be slightly further away because otherwise it gets too overwhelming, and for that, we have worked with certain partners and they’ve done some really interesting work. This company called Ideum built a pedestal that encloses our tracking device, and that can be placed several feet from a large immersive canvas, like an LED wall, as an example, in a museum type activation, and people can walk by and control the whole screen with that pedestal slightly further away from the screen.

So it’s like a Crestron controller or something except for a big LED display! 

Saurabh Gupta: Exactly. It’s like a trackpad in front of the screen, but slightly further away. 

Gotcha. All right. Time flew by, man. We’re already deep into this. You were telling me before we hit record that your company will be at NRF and you may also have people wandering around IEC but if people want to know more about your company, they go to 

Saurabh Gupta: That’s correct, we have all the information there, and David, it was great to talk to you and thank you for the opportunity.

