An article recently published in the journal npj Flexible Electronics introduced a novel silent speech interface (SSI) that enhances wearable communication using ultrasensitive textile strain sensors. These sensors, embedded in a smart choker, detect subtle throat movements, enabling silent speech recognition.
The researchers utilized artificial intelligence (AI) algorithms to improve accuracy while maintaining user comfort. The goal was to address the limitations of existing SSI systems by enhancing sensitivity, accuracy, and efficiency, making them practical for real-world use.
Background
SSIs are crucial in situations where verbal communication is limited. They convert non-vocal signals into speech using various sensors and AI-based methods. SSIs have been particularly beneficial for individuals with speech impairments, such as those recovering from laryngeal surgeries or dealing with conditions like Parkinson’s disease.
Traditional systems, like electroencephalography (EEG) and computer vision (CV), decode speech from brain activity or lip movements. However, these methods are often impractical because they are intrusive for the wearer or computationally demanding.
Recent developments focus on non-invasive wearable sensors, such as strain and electromyography (EMG) sensors, that detect throat movements or the muscle activity behind them. Strain sensors in particular are highly accurate and easily integrated into wearables.
About the Research
In this paper, the authors addressed key challenges in SSI design, such as balancing sensitivity, user comfort, and computational efficiency. They introduced a few-layer graphene (FLG) strain sensor, integrated into a bamboo-based textile, capable of detecting throat micromovements during silent speech. This sensor was embedded into a smart choker worn around the neck. By improving sensor sensitivity, the system captured subtle throat movements associated with different speech patterns.
A unique feature of the study was the structured graphene layer, which had ordered cracks that enhanced sensitivity by 420% compared to other textile strain sensors. These cracks formed during pre-stretching, significantly altering electrical resistance with throat movements. The sensor design was optimized through a multi-layer printing process, ensuring stability and high sensitivity.
To complement the sensor, the researchers developed a lightweight, end-to-end neural network to process and classify speech signals efficiently. Unlike traditional two-dimensional (2D) models, which are computationally intensive, they used one-dimensional (1D) convolutional layers, reducing computational load by 90% while achieving a 95.25% accuracy rate in speech decoding. Integration of the sensor and neural network provided high performance with low energy consumption, making the system suitable for wearables.
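To illustrate why 1D convolutions are cheaper than 2D ones, the short Python sketch below counts multiply-accumulate operations (MACs) for a single convolutional layer of each kind. It assumes stride 1 and "same" padding, and every layer size in it (signal length, image size, channel counts, kernel width) is hypothetical. It is not the authors' architecture and does not reproduce the reported 90% figure.

```python
def conv1d_macs(length, c_in, c_out, kernel):
    """MACs for one 1D conv layer (stride 1, 'same' padding assumed)."""
    # Each of the `length` output positions, for each output channel,
    # needs kernel * c_in multiply-accumulates.
    return length * c_in * c_out * kernel


def conv2d_macs(height, width, c_in, c_out, kernel):
    """MACs for one 2D conv layer with a square kernel (same assumptions)."""
    # Each of the height * width output positions, for each output channel,
    # needs kernel * kernel * c_in multiply-accumulates.
    return height * width * c_in * c_out * kernel * kernel


# Hypothetical sizes, chosen only for illustration:
# a 1500-point strain signal versus a 64 x 64 image-like representation.
macs_1d = conv1d_macs(length=1500, c_in=1, c_out=16, kernel=5)
macs_2d = conv2d_macs(height=64, width=64, c_in=1, c_out=16, kernel=5)

print(f"1D layer: {macs_1d:,} MACs")
print(f"2D layer: {macs_2d:,} MACs")
```

The cost of a 1D layer grows linearly with kernel width, while a 2D layer grows with its square. This is one general reason 1D models tend to suit time-series sensor data on low-power wearables.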
The machine learning model was trained on a database of common English words, including confusing pairs like “book” and “look” and “sheep” and “ship.” Data was collected from users with different accents, speaking speeds, and native languages. Three datasets were used, each with 100 samples per word, split into 80% for training and 20% for testing. A potentiostat was used for data acquisition, with a 500 Hz sampling frequency and 3-second word samples. The data collection process reflected real-world conditions, accounting for variations in choker positioning and tightness.
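The figures above translate into simple arithmetic: at 500 Hz, a 3-second recording contains 1,500 readings, and an 80/20 split of 100 samples per word leaves 80 for training and 20 for testing. The Python sketch below works through this. Splitting per word with a shuffled, seeded partition is an assumption made here for illustration, as are the function names; the article does not describe how the authors performed the split.

```python
import random

SAMPLING_HZ = 500        # sampling frequency reported in the article
SAMPLE_SECONDS = 3       # duration of each word sample
SAMPLES_PER_WORD = 100   # samples per word in each dataset
TRAIN_FRACTION = 0.8     # 80% training, 20% testing


def points_per_sample(hz=SAMPLING_HZ, seconds=SAMPLE_SECONDS):
    """Number of sensor readings in one word recording."""
    return hz * seconds


def split_per_word(sample_ids, train_fraction=TRAIN_FRACTION, seed=0):
    """Shuffle one word's sample IDs and split them into train/test lists.

    Per-word (stratified) splitting is an assumption, not the paper's
    documented procedure.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


train_ids, test_ids = split_per_word(range(SAMPLES_PER_WORD))
print(points_per_sample(), len(train_ids), len(test_ids))  # 1500 80 20
```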
Key Findings
The system demonstrated effectiveness in both controlled and real-world environments. The textile strain sensor, with its ordered cracks, exhibited excellent sensitivity, capturing detailed signals related to throat movements during speech. Its gauge factor reached 317 at strains below 5%, a significant improvement over previous textile strain sensors. The sensor also maintained stable performance through 10,000 stretch-release cycles.
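For readers unfamiliar with the term, the gauge factor of a strain sensor is conventionally defined as the relative change in resistance divided by the applied strain, GF = (ΔR / R0) / ε. The Python sketch below applies that standard definition. The resistance values are hypothetical, and treating the response as linear across the strain range is a simplifying assumption rather than something the article states.

```python
def gauge_factor(r0_ohm, r_ohm, strain):
    """Gauge factor GF = (delta R / R0) / strain, the standard definition."""
    if r0_ohm <= 0 or strain <= 0:
        raise ValueError("r0_ohm and strain must be positive")
    return ((r_ohm - r0_ohm) / r0_ohm) / strain


def relative_resistance_change(gf, strain):
    """Delta R / R0 implied by a gauge factor, assuming a linear response."""
    return gf * strain


# Hypothetical reading: resistance rising from 100 ohm to 417 ohm at 1% strain
# corresponds to a gauge factor of about 317.
gf = gauge_factor(r0_ohm=100.0, r_ohm=417.0, strain=0.01)

# With the reported gauge factor of 317, a 1% strain would change the
# resistance by roughly 3.17 times its resting value (linear assumption).
change = relative_resistance_change(gf=317, strain=0.01)

print(f"GF = {gf:.1f}, delta R / R0 = {change:.2f}")
```

A high gauge factor means that even a very small stretch of the textile produces a large, easily measured change in resistance, which is what allows the choker to register subtle throat movements.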
The 1D convolutional neural network (CNN) processed the sensor’s high-density data. Through transfer learning, it adapted to different users and speech patterns, achieving a 95.25% accuracy rate for decoding the 20 most common English words. It also performed well with challenging word pairs and varying speech rates, reaching 93% and 96% accuracy, respectively.
The system was further validated with new users and unfamiliar words. With minimal fine-tuning, the model achieved 80% accuracy with 15-20 samples per class, which increased to 90% with 30 samples per class, demonstrating the system’s adaptability to new data.
Applications
This research has significant potential, especially in areas requiring silent communication or where traditional speech is restricted. For example, it could be used in noisy environments like factories or military operations where verbal communication is difficult.
The technology also shows promise for medical applications, particularly for individuals with speech impairments caused by conditions like stroke or paralysis. Additionally, it could be integrated into consumer electronics, offering a discreet and efficient way to control devices through silent speech.
The system’s energy efficiency and wearability make it well suited to extended use in practical settings. The authors noted that it performed well even in the presence of noise, such as environmental sounds or physiological artifacts like breathing and swallowing, further enhancing its potential for everyday use.
Conclusion
The study demonstrated that combining an ultrasensitive textile strain sensor and an AI-driven neural network significantly advanced SSI technology. By improving sensitivity, user comfort, and energy efficiency, the system proved to be an efficient solution for silent communication. Its ability to decode speech accurately and adapt to new users makes it a promising tool for various applications.
Future work could focus on expanding vocabulary and enhancing real-time processing. Adding more sensors or advanced machine learning techniques could further improve scalability for both medical and consumer applications. Overall, this research sets a new standard for wearable silent speech interfaces and paves the way for future innovations in human-computer interaction.
Journal Reference
Tang, C., Xu, M., Yi, W. et al. Ultrasensitive textile strain sensors redefine wearable silent speech interfaces with high machine learning efficiency. npj Flex Electron 8, 27 (2024). DOI: 10.1038/s41528-024-00315-1, https://www.nature.com/articles/s41528-024-00315-1