We've seen some tremendous improvements recently in the ability of machines to understand and interpret human speech. There has also been considerable progress in making machines sound more human.
For this post, however, I'm going to touch on some of the projects that are advancing the ability of machines to interpret human speech. First up is a study from the start of this year that set out to develop an algorithm that could detect empathy in live therapy sessions.
Basic speech recognition allowed the system to automatically identify key phrases that signal the level of empathy in the speech. These included phrases such as "do you think" and "it sounds like" for high empathy, or "you need to" and "during the past" for low empathy.
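To make the idea of phrase spotting concrete, here is a minimal Python sketch. It uses only the four cue phrases quoted above; the function names and the scoring formula are my own illustrative assumptions and are not taken from the study, which will have used a much richer model.

```python
import re
from typing import Iterable, Optional

# Cue phrases quoted in the post; the study's full phrase list is not given.
HIGH_EMPATHY_CUES = ("do you think", "it sounds like")
LOW_EMPATHY_CUES = ("you need to", "during the past")


def normalize(text: str) -> str:
    """Lowercase the text and reduce it to words separated by single spaces."""
    return " ".join(re.findall(r"[a-z']+", text.lower()))


def count_cues(transcript: str, cues: Iterable[str]) -> int:
    """Count whole-phrase occurrences of every cue in the transcript."""
    normalized = normalize(transcript)
    return sum(
        len(re.findall(rf"\b{re.escape(cue)}\b", normalized)) for cue in cues
    )


def empathy_cue_score(transcript: str) -> Optional[float]:
    """Hypothetical score in [-1, 1]: +1 if only high-empathy cues appear,
    -1 if only low-empathy cues appear, None if no cue is found."""
    high = count_cues(transcript, HIGH_EMPATHY_CUES)
    low = count_cues(transcript, LOW_EMPATHY_CUES)
    total = high + low
    if total == 0:
        return None
    return (high - low) / total
```

Calling `empathy_cue_score` on a transcript of a therapist's turns would give a rough signal of which kind of phrasing dominates. A real system would sit on top of a speech recogniser and would need to be validated against human ratings.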
"Technological advances in human behavioral signal processing and informatics promise not only scale up and provide cost savings through automation of processes that are typically manual, but enable new insights by offering tools for discovery. This particular study gets at a hidden mental state and what this shows is that computers can be trained to detect constructs like empathy using observational data," the authors say.
Along similar lines is an MIT spinout called Cogito, which aims to improve conversations between customer support staff and their clients. The system is designed to offer real-time guidance on how those conversations can be improved.
This could be something like adjusting the speed of one's speech so that it mirrors the customer's, or changing one's tone if the customer is becoming emotional. The system even notifies supervisors if it thinks a more senior colleague would help matters.
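As a rough illustration of what mirroring speech rate could involve, the Python sketch below compares an agent's speaking rate with a customer's and suggests a pacing adjustment. This is not Cogito's method: the function names, the words-per-minute measure and the 15% tolerance are all assumptions made for the example.

```python
def words_per_minute(word_count: int, duration_seconds: float) -> float:
    """Speaking rate for a stretch of speech, in words per minute."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return word_count * 60.0 / duration_seconds


def pacing_hint(agent_wpm: float, customer_wpm: float,
                tolerance: float = 0.15) -> str:
    """Suggest how the agent might adjust pace to mirror the customer.

    The tolerance (an assumed value) is the relative difference in rate
    that is treated as close enough to need no change.
    """
    if customer_wpm <= 0:
        raise ValueError("customer_wpm must be positive")
    ratio = agent_wpm / customer_wpm
    if ratio > 1 + tolerance:
        return "slow down"
    if ratio < 1 - tolerance:
        return "speed up"
    return "ok"
```

For example, an agent who speaks 90 words in 30 seconds is talking at 180 words per minute; against a customer at 140 words per minute, `pacing_hint(180, 140)` suggests slowing down.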
Perhaps the most significant improvements are being made in healthcare, however. An MIT team have developed a wearable device that aims to help people who suffer from speech disorders.
"When a patient comes in for therapy, you might only be able to analyze their voice for 20 or 30 minutes to see what they're doing incorrectly and have them practice better techniques," the team explain. "As soon as they leave, we don't really know how well they're doing, and so it's exciting to think that we could eventually give patients wearable devices that use round-the-clock data to provide more immediate feedback."
Another fascinating project in healthcare is from a new startup called Canary Speech. The company is headed by Henry O'Connell (CEO) and Jeff Adams (CTO). Adams was previously boss at Yap, which was bought by Amazon to underpin its subsequent Echo project.
The technology is designed to help identify and diagnose a number of cognitive diseases, such as dementia, Parkinson's and Alzheimer's.
It does this by analyzing what we're saying, and indeed how we're saying it, to try to detect potential signs of these conditions. I spoke with Adams ahead of his presentation at AI Europe, where he explained how the system works.
"By examining large amounts of speech recordings from patients with a particular condition, even recording from before they were diagnosed, we use machine learning to identify patterns and markers in the words they use, their phrasing, and the quality of their speech. Our goal is to identify warning signs much earlier so patients can get treatment early enough to make a real difference."
The system has already been trialed with an American healthcare company, with data captured and then analyzed in real time as patients communicate with their clinician.
Suffice to say, the technology should only get better as more data becomes available to fine-tune its algorithms. Traditional healthcare settings offer scant grounds for optimism on that front, but with areas such as telehealth becoming more popular, it seems inevitable that data will not be an issue in future.
Whilst the technology is still in its infancy, and there are undoubtedly concerns to be addressed around both accuracy and privacy, it's an exciting glimpse into what may be possible.