Reviewed by Lexie Corner, Nov 14 2024
Researchers at Carnegie Mellon University's Human-Computer Interaction Institute (HCII) recently published a study on EgoTouch, a tool that uses artificial intelligence (AI) to let users control AR/VR interfaces by touching their own skin with a finger.
The next generation of augmented and virtual reality controllers may not just fit in the palm of your hand—they might be the palm of your hand itself.
The team aimed to develop a control system that provided tactile feedback using only the sensors in a standard AR/VR headset. A previous approach, OmniTouch, developed by Chris Harrison, an associate professor at HCII and director of the Future Interfaces Group, came close to this goal.
However, OmniTouch required a specialized, bulky depth-sensing camera. Vimal Mollyn, a Ph.D. student under Harrison's guidance, proposed the idea of using machine learning to train standard cameras to recognize touch.
“Try taking your finger and see what happens when you touch your skin with it. You will notice that there are these shadows and local skin deformations that only occur when you are touching the skin. If we can see these, then we can train a machine-learning model to do the same, and that is essentially what we did.”
Vimal Mollyn, Ph.D. Student, Carnegie Mellon University
Mollyn gathered data for EgoTouch by placing a custom touch sensor on the underside of the index finger and palm. The sensor collected information on various types of contact with different levels of force, all while remaining invisible to the camera. The model then learned to associate visual cues, such as shadows and skin variations, with touch and force without human input.
To enhance the training data, the team expanded their collection to include 15 individuals with different skin tones and hair densities, accumulating hours of data from a range of scenarios, activities, and lighting conditions.
EgoTouch detects touch with over 96% accuracy and has a false positive rate of about 5%. It can identify actions like pressing, lifting, and dragging. Additionally, the model can determine whether a touch was light or firm with 98% accuracy.
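Accuracy and false positive rate measure different things, so it helps to see how each is derived from the same set of predictions. The sketch below is only an illustration of the arithmetic: the function name `touch_metrics` and the counts are hypothetical and are not taken from the study.

```python
def touch_metrics(tp, fp, tn, fn):
    """Return (accuracy, false_positive_rate) from confusion-matrix counts.

    tp: frames correctly predicted as touch
    fp: no-touch frames wrongly predicted as touch
    tn: frames correctly predicted as no touch
    fn: touch frames wrongly predicted as no touch
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total            # share of all frames classified correctly
    false_positive_rate = fp / (fp + tn)    # share of no-touch frames flagged as touch
    return accuracy, false_positive_rate


# Hypothetical counts, chosen only to illustrate the calculation.
accuracy, fpr = touch_metrics(tp=90, fp=10, tn=190, fn=10)
print(f"accuracy={accuracy:.3f}, false positive rate={fpr:.3f}")
```

Because the false positive rate is computed only over frames where no touch occurred, a detector can score high on accuracy and still trigger spurious touches often enough to matter for an interface.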
“That can be useful for having a right-click functionality on the skin,” Mollyn added.
Detecting changes in touch could enable developers to simulate touchscreen actions on human skin. For example, a smartphone could recognize when you scroll up or down a website, zoom in, swipe right, or tap and hold an icon. To achieve this on a skin-based interface, the camera must be able to detect subtle variations in touch type and force.
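One way to picture how touchscreen-style actions could be built on top of per-frame touch detection is a small state machine that turns a stream of predictions into press, drag, and lift events. This is a minimal sketch under stated assumptions, not the method used in EgoTouch: the `(touching, x, y)` frame format, the function `frames_to_events`, and the `drag_threshold` value are all hypothetical.

```python
from math import hypot


def frames_to_events(frames, drag_threshold=5.0):
    """Convert per-frame (touching, x, y) predictions into press/drag/lift events.

    The frame format and the threshold (in arbitrary position units) are
    assumptions made for illustration only.
    """
    events = []
    anchor = None  # position of the last emitted press or drag event
    prev = None    # position in the most recent touching frame; None when not touching
    for is_touching, x, y in frames:
        if is_touching:
            if prev is None:
                # Contact just started.
                events.append(("press", x, y))
                anchor = (x, y)
            elif hypot(x - anchor[0], y - anchor[1]) >= drag_threshold:
                # Finger moved far enough from the last event to count as a drag.
                events.append(("drag", x, y))
                anchor = (x, y)
            prev = (x, y)
        elif prev is not None:
            # Contact just ended; report the last position seen while touching.
            events.append(("lift", prev[0], prev[1]))
            prev = None
            anchor = None
    return events


# Hypothetical prediction stream: no touch, touch, slight jitter, movement, release.
frames = [(False, 0, 0), (True, 10, 10), (True, 11, 10), (True, 20, 10), (False, 0, 0)]
events = frames_to_events(frames)
print(events)
```

The drag threshold keeps small jitter in the predicted position from being reported as movement, which is why the second touching frame above produces no event.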
The system demonstrated similar accuracy across different skin tones and hair densities, as well as at various locations on the hand and forearm, including the front and back of the arm, the palm, and the back of the hand. However, the technology was less effective on bony surfaces, such as the knuckles.
Mollyn added, “It is probably because there was not as much skin deformation in those areas. As a user interface designer, what you can do is avoid placing elements on those regions.”
Mollyn is exploring the use of night vision cameras and infrared lighting to enable the EgoTouch system to function in the dark. He is also collaborating with researchers to adapt the touch-detection technology for use on surfaces other than the skin.
Mollyn concluded, “For the first time, we have a system that just uses a camera that is already in all the headsets. Our models are calibration-free, and they work right out of the box. Now we can build off prior work on on-skin interfaces and actually make them real.”
Journal Reference:
Mollyn, V. et al. (2024) EgoTouch: On-Body Touch Input Using AR/VR Headset Cameras. UIST '24: Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology. doi.org/10.1145/3654777.3676455