An article recently published in the journal npj Flexible Electronics introduced a novel silent speech interface (SSI) that enhances wearable communication using ultrasensitive textile strain sensors. These sensors, embedded in a smart choker, detect subtle throat movements, enabling silent speech recognition.
The researchers used artificial intelligence (AI) algorithms to improve recognition accuracy while maintaining user comfort. Their goal was to address the limitations of existing SSI systems by enhancing sensitivity, accuracy, and efficiency, making the technology practical for real-world use.
Background
SSIs are crucial in situations where verbal communication is limited. They convert non-vocal signals into speech using various sensors and AI-based methods. SSIs have been particularly beneficial for individuals with speech impairments, such as those recovering from laryngeal surgeries or dealing with conditions like Parkinson’s disease.
Traditional systems, like electroencephalography (EEG) and computer vision (CV), decode speech from brain activity or lip movements. However, these methods are often impractical due to their invasive nature or high computational demands.
Recent developments focus on non-invasive mechanical sensors, such as strain and electromyography (EMG) sensors, that detect throat movements. Strain sensors are highly accurate and easily integrated into wearables.
About the Research
In this paper, the authors addressed key challenges in SSI design, such as balancing sensitivity, user comfort, and computational efficiency. They introduced a few-layer graphene (FLG) strain sensor, integrated into a bamboo-based textile, capable of detecting the throat's micromovements during silent speech. This sensor was embedded into a smart choker worn around the neck. By improving sensor sensitivity, the system captured the subtle throat movements associated with different speech patterns.
A unique feature of the study was the structured graphene layer, whose ordered cracks enhanced sensitivity by 420% compared with other textile strain sensors. These cracks formed during pre-stretching, so that throat movements produced large changes in electrical resistance. The sensor design was optimized through a multi-layer printing process, ensuring stability alongside high sensitivity.
To complement the sensor, the researchers developed a lightweight, end-to-end neural network to process and classify speech signals efficiently. Unlike traditional two-dimensional (2D) models, which are computationally intensive, they used one-dimensional (1D) convolutional layers, reducing computational load by 90% while achieving 95.25% accuracy in speech decoding. Integrating the sensor and the neural network delivered high performance with low energy consumption, making the system suitable for wearables.
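The paper's exact architecture is not reproduced here, but a minimal PyTorch sketch illustrates the general shape of such a lightweight 1D convolutional classifier, assuming single-channel strain signals of 1,500 time steps (3 seconds at 500 Hz, per the acquisition setup described below) and 20 word classes. All layer sizes are illustrative assumptions, not the authors' values:

```python
# Hypothetical sketch of a lightweight 1D CNN for silent-speech classification.
# Assumes single-channel strain signals of 1,500 samples (3 s at 500 Hz)
# and 20 output word classes; layer sizes are illustrative, not from the paper.
import torch
import torch.nn as nn

class SilentSpeech1DCNN(nn.Module):
    def __init__(self, n_classes: int = 20):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, stride=2, padding=3),  # 1500 -> 750
            nn.BatchNorm1d(16),
            nn.ReLU(),
            nn.MaxPool1d(2),                                       # 750 -> 375
            nn.Conv1d(16, 32, kernel_size=5, padding=2),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.MaxPool1d(2),                                       # 375 -> 187
            nn.Conv1d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                               # global pooling
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, 1500) strain waveform
        z = self.features(x).squeeze(-1)   # (batch, 64)
        return self.classifier(z)          # (batch, n_classes) logits

model = SilentSpeech1DCNN()
logits = model(torch.randn(8, 1, 1500))   # e.g. a batch of 8 utterances
```

Because each 1D layer's cost scales with signal length rather than with an image-like 2D feature-map area, this style of model is consistent with the reported reduction in computational load.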
The machine learning model was trained on a database of common English words, including easily confused pairs such as “book”/“look” and “sheep”/“ship.” Data were collected from users with different accents, speaking speeds, and native languages. Three datasets were used, each with 100 samples per word, split 80% for training and 20% for testing. A potentiostat handled data acquisition at a 500 Hz sampling frequency, with 3-second recordings per word. The collection process reflected real-world conditions, accounting for variations in choker positioning and tightness.
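As a rough illustration of that acquisition setup, the sketch below shapes the data as described: 500 Hz sampling over 3-second windows gives 1,500 points per utterance, split 80/20 for training and testing. The array names and file paths are hypothetical placeholders, not from the paper:

```python
# Illustrative data-shaping step matching the acquisition described above.
# Hypothetical file names; X holds strain traces, y holds integer word labels.
import numpy as np
from sklearn.model_selection import train_test_split

SAMPLING_HZ = 500
DURATION_S = 3
WINDOW = SAMPLING_HZ * DURATION_S      # 1,500 points per 3-second utterance

X = np.load("strain_traces.npy")       # (n_words * 100, 1500), hypothetical
y = np.load("word_labels.npy")         # (n_words * 100,), hypothetical
assert X.shape[1] == WINDOW

# 80/20 train/test split, stratified so each word keeps its 100-sample balance
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
```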
Key Findings
The system demonstrated effectiveness in both controlled and real-world environments. The textile strain sensor, enhanced with ordered cracks, exhibited excellent sensitivity, capturing detailed throat movement signals during speech. Its gauge factor reached 317 at less than 5% strain, a significant improvement over previous textile sensors. Moreover, the sensor maintained stable performance through 10,000 stretch-release cycles.
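The gauge factor (GF) is the standard figure of merit for strain sensors: the relative change in resistance divided by the applied strain. A quick back-of-the-envelope check of what the reported value implies:

```python
# Gauge factor: GF = (delta_R / R0) / strain.
# At the reported GF of 317 and 5% strain, the relative resistance change is:
gf = 317
strain = 0.05                      # 5% strain
delta_r_over_r0 = gf * strain
print(delta_r_over_r0)             # 15.85, i.e. a ~1,585% change in resistance
```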
The 1D convolutional neural network (CNN) efficiently processed the sensor's high-density data. Through transfer learning, the model adapted to various users and speech patterns, achieving 95.25% accuracy in decoding the 20 most common English words. The system also performed well with challenging word pairs and varying speech rates, reaching accuracy rates of 93% and 96%, respectively.
Further validation involved testing with new users and unfamiliar words. With minimal fine-tuning, the model achieved 80% accuracy from 15-20 samples per class, improving to 90% with 30 samples per class, demonstrating strong adaptability to new data and users.
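A common way to implement this kind of few-shot adaptation, and one plausible reading of the fine-tuning step, is to freeze the pretrained feature extractor and retrain only the classification head on a new user's handful of samples. A sketch, reusing the hypothetical SilentSpeech1DCNN from earlier (the authors' actual recipe may differ):

```python
# Hypothetical few-shot adaptation for a new user: freeze the pretrained
# feature extractor and fine-tune only a fresh classification head on the
# 15-30 samples per class mentioned above.
import torch
import torch.nn as nn

def fine_tune_for_new_user(model, loader, epochs: int = 20, lr: float = 1e-3):
    for p in model.features.parameters():
        p.requires_grad = False           # keep learned strain features fixed
    model.classifier = nn.Linear(64, 20)  # fresh head (64 matches the sketch)
    opt = torch.optim.Adam(model.classifier.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    model.features.eval()                 # also freeze BatchNorm statistics
    for _ in range(epochs):
        for x, y in loader:               # x: (batch, 1, 1500), y: labels
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
    return model
```

Training only the small linear head is cheap and needs little data, which is consistent with the reported jump from 80% to 90% accuracy as the per-class sample count grows.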
Applications
This research has significant potential, especially in areas requiring silent communication or where traditional speech is restricted. For example, it could be used in noisy environments like factories or military operations where verbal communication is difficult.
The technology also shows promise for medical applications, particularly for individuals with speech impairments caused by conditions like stroke or paralysis. Additionally, it could be integrated into consumer electronics, offering a discreet and efficient way to control devices through silent speech.
The system's energy efficiency and wearability make it well suited to extended use in practical settings. The authors noted that it performed well even amid noise, such as environmental sounds and physiological artifacts like breathing and swallowing, further supporting its potential for everyday use.
Conclusion
The study demonstrated that combining an ultrasensitive textile strain sensor with an AI-driven neural network significantly advances SSI technology. By improving sensitivity, user comfort, and energy efficiency, the system proved an effective solution for silent communication. Its ability to decode speech accurately and adapt to new users makes it a promising tool for a range of applications.
Journal Reference
Tang, C., Xu, M., Yi, W. et al. Ultrasensitive textile strain sensors redefine wearable silent speech interfaces with high machine learning efficiency. npj Flex Electron 8, 27 (2024). DOI: 10.1038/s41528-024-00315-1, https://www.nature.com/articles/s41528-024-00315-1