May 7, 2019
Signs of anxiety and depression in the speech patterns of small children can be detected using a machine learning algorithm, thereby providing a quick and easy way of diagnosing conditions that are hard to spot and frequently ignored in young people, according to a new study published in the Journal of Biomedical and Health Informatics.
An estimated one in five children suffers from anxiety and depression, collectively known as "internalizing disorders." But because children under the age of eight cannot reliably articulate their emotional suffering, adults need to be able to infer their mental state and recognize potential mental health problems. Waiting lists for appointments with psychologists, insurance issues, and parents' failure to recognize the symptoms all contribute to children missing out on vital treatment.
"We need quick, objective tests to catch kids when they are suffering," says Ellen McGinnis, a clinical psychologist at the University of Vermont Medical Center's Vermont Center for Children, Youth and Families and the study's lead author. "The majority of kids under eight are undiagnosed."
Early diagnosis is important because children respond well to treatment while their brains are still developing, but if left untreated they are at greater risk of drug abuse and suicide later in life. Standard diagnosis involves a 60- to 90-minute semi-structured interview with a trained clinician and the child's primary caregiver. Ellen McGinnis, together with University of Vermont biomedical engineer and study senior author Ryan McGinnis, has been looking for ways to use artificial intelligence and machine learning to make diagnosis faster and more reliable.
The scientists used a modified version of a mood induction task known as the Trier-Social Stress Task, which is intended to induce feelings of stress and anxiety in the participant. A group of 71 children between the ages of three and eight were asked to improvise a three-minute story and told that they would be judged on how interesting it was. The researcher acting as the judge remained stern throughout the speech and gave only neutral or negative feedback. After 90 seconds, and again with 30 seconds left, a buzzer would sound and the judge would tell them how much time remained.
"The task is designed to be stressful, and to put them in the mindset that someone was judging them," says Ellen McGinnis.
The children were also diagnosed using a structured clinical interview and a parent questionnaire, both well-established ways of identifying internalizing disorders in children.
The scientists used a machine learning algorithm to analyze statistical features of the audio recording of each child's story and relate them to the child's diagnosis. They found that the algorithm was highly successful at diagnosing children, and that the middle phase of the recordings, between the two buzzers, was the most predictive of a diagnosis.
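To make the three-phase structure concrete, here is a minimal Python sketch of how a recording could be split at the two buzzers and summarized per phase. The buzzer times come from the task description (90 seconds in, and 30 seconds before the end of a three-minute story); the function names and the three summary statistics are illustrative assumptions, not the feature set or code used in the study.

```python
import statistics

# Buzzer times taken from the task description: one after 90 s and one
# with 30 s left in a 180 s (three-minute) story.
BUZZER_TIMES_S = (90, 150)


def split_phases(samples, sample_rate):
    """Split a list of audio samples into the three phases bounded by the buzzers.

    sample_rate is an integer number of samples per second.
    """
    first = BUZZER_TIMES_S[0] * sample_rate
    second = BUZZER_TIMES_S[1] * sample_rate
    return {
        "before_first_buzzer": samples[:first],
        "between_buzzers": samples[first:second],
        "after_second_buzzer": samples[second:],
    }


def summary_features(segment):
    """Hypothetical per-phase statistics (an assumption, not the study's features)."""
    return {
        "mean": statistics.fmean(segment),
        "stdev": statistics.pstdev(segment),
        "peak": max(abs(x) for x in segment),
    }
```

A classifier would then be trained on such per-phase summaries together with each child's clinical diagnosis; the article does not say which model or features the researchers actually used.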
"The algorithm was able to identify children with a diagnosis of an internalizing disorder with 80% accuracy, and in most cases that compared really well to the accuracy of the parent checklist," says Ryan McGinnis. It can also deliver results much more quickly: the algorithm needs only a few seconds of processing time once the task is complete to provide a diagnosis.
The algorithm identified eight different audio features of the children's speech, but three in particular stood out as highly indicative of internalizing disorders: low-pitched voices, repeatable speech inflections and content, and a higher-pitched response to the startling buzzer. Ellen McGinnis says these features fit well with what one might expect from someone dealing with depression.
"A low-pitched voice and repeatable speech elements mirror what we think about when we think about depression: speaking in a monotone voice, repeating what you're saying."
Ellen McGinnis, lead author of the study and clinical psychologist, Vermont Center for Children, Youth and Families, University of Vermont Medical Center
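Pitch is the common thread in these features. As a rough illustration of how the pitch of a short stretch of speech can be measured, the Python sketch below estimates fundamental frequency by autocorrelation, a standard textbook technique. The article does not say how the study computed pitch, so the method, the function names and the 75 to 500 Hz search range are all assumptions made for this example.

```python
import math


def estimate_pitch_hz(frame, sample_rate, fmin=75.0, fmax=500.0):
    """Estimate the fundamental frequency of one audio frame by autocorrelation.

    fmin and fmax bound the search range; both are assumed values.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # The best-matching shift is one pitch period, expressed in samples.
    return sample_rate / best_lag


def sine_frame(freq_hz, sample_rate, n_samples):
    """Synthetic test tone, standing in for a real speech frame."""
    return [
        math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n_samples)
    ]
```

Applied frame by frame, an estimate like this would give a pitch track from which a low average pitch, or a jump in pitch just after a buzzer, could be read off. Real speech is far noisier than a synthetic tone, so a practical tool would need a more robust pitch tracker.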
The higher-pitched response to the buzzer is also similar to the response the scientists found in their earlier work, in which children with internalizing disorders showed a larger turning-away response from a fearful stimulus in a fear induction task.
The voice analysis has similar diagnostic accuracy to the motion analysis in that previous work, but Ryan McGinnis thinks it would be much easier to use in a clinical setting. The fear task requires a darkened room, a toy snake, motion sensors attached to the child and a guide, while the voice task needs only a judge, a way to record the speech and a buzzer to interrupt. "This would be more feasible to deploy," he says.
Ellen McGinnis says the next step will be to develop the speech analysis algorithm into a universal screening tool for clinical use, perhaps via a smartphone app that could record and analyze results immediately. The voice analysis could also be combined with the motion analysis into a battery of technology-assisted diagnostic tools, to help identify children at risk of anxiety and depression before even their parents suspect that anything is wrong.