Researchers at the Massachusetts Institute of Technology (MIT) recently developed a neural network that predicts, with a relatively high level of accuracy, the likelihood a person has a cognitive impairment. If you don’t mind speaking loosely, you could call it a depression detector.
It’s not, but we’ll get into that later.
The team, made up of MIT researchers Tuka Alhanai, Mohammad Ghassemi, and James Glass, is presenting its work this week at the Interspeech 2018 conference in India.
According to their paper, they’ve developed a context-free method by which a machine can break down text or audio from a human and assign a score indicating the person’s level of depression. The thing to key in on is the “context-free” aspect of this AI.
A therapist typically uses a combination of tried-and-true questions and direct observation to diagnose mental health conditions such as depression. According to the MIT team, their AI can do something similar without relying on a scripted set of questions or direct observation. It doesn’t need context – just data.
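To make the “context-free” idea concrete, here’s a minimal, hypothetical sketch of that style of model: a network that consumes embeddings of a subject’s responses (never the questions) and emits a single score. The LSTM architecture, names, and dimensions below are illustrative assumptions, not the authors’ actual implementation.

```python
# Minimal sketch of a "context-free" sequence scorer. All names,
# dimensions, and design choices here are illustrative assumptions,
# not the MIT team's actual model.
import torch
import torch.nn as nn

class SequenceScorer(nn.Module):
    def __init__(self, embed_dim=300, hidden_dim=128):
        super().__init__()
        # The model never sees the questions -- only a sequence of
        # fixed-size embeddings of the subject's responses.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, responses):
        # responses: (batch, num_responses, embed_dim)
        _, (h_n, _) = self.lstm(responses)
        # The final hidden state summarizes the interview so far.
        return torch.sigmoid(self.head(h_n[-1]))  # probability-like score

model = SequenceScorer()
# Seven response embeddings (random stand-in data, untrained model):
score = model(torch.randn(1, 7, 300))
print(f"depression score: {score.item():.2f}")
```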
Here’s a paragraph from MIT’s press release on the paper:
MIT researchers detail a neural-network model that can be unleashed on raw text and audio data from interviews to discover speech patterns indicative of depression. Given a new subject, it can accurately predict if the individual is depressed, without needing any other information about the questions and answers.
So is it detecting or predicting?
It may be a minor quibble, but detecting and predicting are two entirely different things. An algorithm that predicts whether a person is depressed is merely labeling data for further review by humans. One that detects depression would have to determine and confirm that a person was, in fact, depressed.
The researchers, of course, realize this. The lead author on the paper, Tuka Alhanai, said, “It’s not so much detecting depression, but it’s a similar concept of evaluating, from an everyday signal in speech, if someone has cognitive impairment or not.”
It’s predicting and definitely not detecting. That’s good to know. And that minor distinction is why this work is so very, very scary.
To test their AI, the researchers conducted an experiment in which 142 people being screened for depression answered a series of questions posed by a human-controlled virtual agent. The AI had no prior knowledge of the questions, and respondents were free to answer any way they wanted. There was no A, B, C, or D to choose from; the AI discerned depression from linguistic cues alone.
The study participants’ responses were recorded in a mix of text and audio. In the text version, the AI was able to predict depression after about seven question-and-answer sequences. But, interestingly enough, in the audio version it took about 30 sequences for the AI to make a determination. Its averaged accuracy, according to the researchers, was an astounding 77 percent.
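To picture what “after about seven sequences” means in practice, here’s a hypothetical extension of the sketch above: re-score the interview after each new response and stop once the model is confident either way. The 0.9 confidence threshold is invented purely for illustration.

```python
# Hypothetical illustration only; reuses the SequenceScorer sketch above.
# Score the interview incrementally and stop at the first confident call.
def predict_incrementally(model, response_embeddings, confidence=0.9):
    score = 0.5  # agnostic until at least one response has been seen
    for n in range(1, response_embeddings.shape[1] + 1):
        # Re-score using only the first n responses heard so far.
        score = model(response_embeddings[:, :n, :]).item()
        if score >= confidence or score <= 1 - confidence:
            return n, score  # confident after n Q&A sequences
    return response_embeddings.shape[1], score  # never got confident

# e.g., a 30-response interview (random stand-in data, untrained model):
n_needed, score = predict_incrementally(model, torch.randn(1, 30, 300))
```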
The problem? There’s little reason to believe this will be used by anyone who directly interfaces with healthcare patients in the physical world.
Therapists who see patients in offices tend to believe their training equips them to diagnose patients better than an algorithm can. And this AI isn’t the same thing as using image recognition to detect cancer. You can physically find and remove cancer (in many cases), but you can’t verify an AI’s depression diagnosis with a scalpel.
In a game where, theoretically, a computer and a person listen to the same conversation and end up making diametrically opposed depression diagnoses, who decides which is correct? Or, if you prefer, in a scenario in which a computer identifies potential depressives, does a human also conduct the same checks to ensure doctors aren’t treating patients the algorithm was wrong about? Because that’d be redundant and wasteful. Where’s the automation?
More importantly, what happens when someone other than a medical professional finds use for a push-button “depression detector”? Certain language in the paper seems to indicate the algorithms are meant to be developed for use outside MIT’s labs:
Individuals suffering from depression are beset by debilitating sadness for weeks to years on end. To treat depressed individuals, they must first be diagnosed. To obtain a diagnosis, depressed individuals must actively reach out to mental health professionals. In reality, it can be difficult for the depressed to attain professional attention due to constraints of mobility, cost, and motivation. Passive automated monitoring of human communication may address these constraints and provide better screening for depression.
Passive automated monitoring of human communication sounds like the dystopian future of our nightmares. There are numerous conceivable social and professional environments where a person could be subjected to a question and answer sequence without knowing their mental health would later be evaluated by a machine. Non-consensual mental health evaluations conducted inside the black box of a neural network seem like a bad idea.
Imagine losing out on a job because a company used a “depression detector” AI to decide you weren’t mentally stable enough during your interview, or an algorithm’s interpretation of your responses to a lawyer’s questions being admitted as evidence in your child custody case.
Worse, consider a police station implementing black-box AI to determine a suspect’s mental state during questioning. Let’s not forget that police units the world over have wasted taxpayer dollars and valuable time on ridiculousness like psychics; to them, an AI purported to “detect” emotional state might seem like a step up.
It’s terrifying to imagine — yet also feels like a foregone conclusion — that this type of stuff is going to show up in interrogation rooms, employment interviews, and other places it doesn’t belong. But we can’t let people get bamboozled by AI: The machines absolutely do not know if we’re gay, guilty, or depressed — they’re just guessing. And we should be very careful how we allow their guesses to be used.