Video and Slides
Outline: Voice technology for healthcare
Dr. Shona D’Arcy, CEO Kids Speech Labs
Speech recognition is often seen as the holy grail for healthcare: imagine removing all the form filling that currently takes up clinicians’ valuable time. In this talk I will try to answer a few pertinent questions:
- Why aren’t all hospitals completely voice enabled?
- What are examples of interactive voice applications that are working in healthcare and delivering value?
- What does the future hold for interactive voice applications in healthcare?
Review of Presentation
You can ask Shona questions in the comments section of this weblog, or contact Shona directly on LinkedIn.
This is an excellent introduction to the many uses of speech recognition in the healthcare industry, which accounts for about half of all voice technology investment.
Because speaking draws on neurological, cognitive, and physical capabilities, voice is a powerful tool beyond the automation of note-taking: it can improve accessibility to health services, and it can even be used diagnostically.
Shona provides some useful background on her projects in speech recognition and the importance of good training data, a point David Curran has made in several of his TADSummit presentations over the years, e.g. How to improve Natural Language Datasets.
Shona highlights the 20 years of work Nuance has done in Electronic Health Records, a major time sink for doctors. Training data containing verified medical terminology is critical to achieving a low word error rate.
For example, the medical transcription company SOPRIS Health has been able to use its verified data from over 10 years of transcription to build a competitive automated tool. This highlights why Google is so keen to offer its tools to build up verified training data for its algorithms.
BUT Shona highlights a critical point: parts of the transcription still need human verification, e.g. dosages, as people’s lives are at stake. While transcription will never be 100% automated, it can still lessen the workload.
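As a side note on word error rate: it is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recogniser’s output, divided by the reference length. Below is a minimal Python sketch of that calculation; the medical phrases are invented examples, not transcripts from the talk or from any vendor’s system.

```python
# Minimal word error rate (WER) sketch: substitutions, deletions, and
# insertions are counted via Levenshtein distance over words, then
# normalised by the reference length. Illustrative only; the medical
# phrases below are invented examples, not real transcripts.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # d[i][j] = edit distance between first i ref words and first j hyp words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution (or match)
    return d[len(ref)][len(hyp)] / len(ref)

# A verified medical vocabulary matters: one wrong word in a dosage
# is a high-stakes error even at a low overall WER.
ref = "administer 50 milligrams of atenolol twice daily"
hyp = "administer 15 milligrams of atenolol twice daily"
print(f"WER: {wer(ref, hyp):.2%}")  # 1 substitution / 7 words ≈ 14%
```

Even a single substitution in a short dosage instruction produces a meaningful error rate, which is exactly why human verification of dosages remains essential.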
On patient engagement using voice, two issues that have slowed adoption are privacy and security. Your voice can be considered a personal identifier, and using smart speakers is not necessarily private. There are also important design issues in building voice interfaces that meet the needs of patients, whether elderly or disabled.
The final point on the role voice can play diagnostically is very interesting: across cognitive decline, Parkinson’s, cardiac arrest, depression, schizophrenia, and even COVID. I have noticed that people going into a depressive episode talk and interact differently; their face can even look different. It’s still early days for the diagnostic application of speech technology, which is only now moving out of the lab and into trials. There are so many exciting applications of speech recognition!
Thank you Shona for an inspirational presentation 🙂
Thank you Shona for an excellent presentation on Voice Tech in Healthcare. I have a few questions:
1) You highlighted an important point on the design of voice interfaces for patients. What do you think are some of the key design principles?
2) You mentioned the challenge that having patients talk about their symptoms keeps them focused on those symptoms. How do you get around that issue?
3) Do you think any voice recording should be part of a patient’s EHR and subject to the same protections?
4) I’m interested in your view of the Microsoft / Nuance acquisition. Microsoft has been focused on Healthcare for over 2 decades, and also has a strong voice recognition capability. Why do you think they bought Nuance?
5) Using a smart speaker for patients / elderly / disabled is interesting for engagement, diarizing progress, and even diagnostics. They are cheap, easy to use, and always accessible in the home. Do you think the privacy / security issues are a roadblock? Or will the smart speakers become a part of everyday healthcare?
6) Where do you think diagnostic speech recognition will break through first? For example, with depression, helping people recognize that a wave could be coming on could help with self-medication before family gets involved.
Hi Alan,
Great questions there.
1. You highlighted an important point on the design of voice interfaces for patients. What do you think are some of the key design principles?
Answer: User testing is key. If you’re building voice applications for healthcare, you can be dealing with users who have additional factors influencing them: they may be stressed, vulnerable, or scared. Designers should consider the emotional state of patients during development. Patients will always want to speak to a doctor to allay their fears; the VUI designer’s job is to build interfaces that can reliably collect the data required for the clinical application while keeping patients engaged.
2. You mentioned the challenge that having patients talk about their symptoms keeps them focused on those symptoms. How do you get around that issue?
Answer: This goes back to co-design with patients: understanding their fears and their motivations for tracking symptoms, medication, etc. will help designers build better applications. Think about how often you collect data and in what situation. Asking patients to constantly rate their pain may focus them on their pain and make it appear worse; it may deepen their depression and ultimately stop them tracking their pain at all. The only way to get around this is to speak to patients and understand how they are willing to think about their pain and how this can be captured. Presenting patients with a clear value proposition is also key: what is the benefit to them of engaging with the technology? It’s not just about sharing the data with their clinician; patients need to see a reason for providing all this data, feedback to themselves that can improve their quality of life, reduce costs, etc.
3. Do you think any voice recording should be part of a patient’s EHR and subject to the same protections?
Answer: Absolutely. I believe voice can be used as a longitudinal biomarker and is an excellent ‘original source’ for records. Security is key, and voice data, particularly when captured as part of a health record, must be treated like any other personally identifiable data.
4. I’m interested in your view of the Microsoft / Nuance acquisition. Microsoft has been focused on Healthcare for over 2 decades, and also has a strong voice recognition capability. Why do you think they bought Nuance?
Answer: While Microsoft has had a healthcare presence, it hasn’t been focused on voice; it simply doesn’t have the data to integrate voice into its platform without the context-specific capabilities of Nuance. Nuance has decades of voice data from clinicians, essentially providing Microsoft with a massive training dataset covering a huge array of medical applications. It would make no sense for Microsoft to try to replicate such an extensive existing dataset. However, many of Microsoft’s healthcare solutions are primed to integrate voice technology, and Nuance is the perfect partner for them.
5. Using a smart speaker for patients / elderly / disabled is interesting for engagement, diarizing progress, and even diagnostics. They are cheap, easy to use, and always accessible in the home. Do you think the privacy / security issues are a roadblock? Or will the smart speakers become a part of everyday healthcare?
Answer: I used to think so, but I’ve changed my mind: the more these speakers are used, the less concerned people are about privacy. This could of course change with one significant data breach, but for the moment I think engagement will increase, particularly for those high-value wins that voice can deliver:
• Parents managing the healthcare of children with complex needs: being able to share important data “hands free” at the moment it’s happening is invaluable to parents who are not professional carers.
• Empowering people with disabilities to take control of their own health through more interactive technologies can be transformational for some patients.
6. Where do you think diagnostic speech recognition will break through first? For example, with depression, helping people recognize that a wave could be coming on could help with self-medication before family gets involved.
Answer: This is a fantastic question. I used to think Parkinson’s was the clear front runner, as the physiological manifestation of the disease can be clearly identified through frequency analysis. However, it hasn’t hit the mainstream, and I wonder whether that is because the researchers don’t have the inclination to spin out into the startup world, because the real-world application is too complex (I don’t think this is it), or because the market size is not attractive enough. I think depression is going to be the big win for voice diagnostics in healthcare: we’ve seen an explosion of digital health solutions for mental health in recent years, meaning that people are more willing to recognise when they need help and do something about it. You’re exactly right; I think the technology will focus on intra-speaker variability, i.e. when do my speech patterns change in a way that indicates a difference in mood or symptoms. I wouldn’t be surprised if we see a “wellness” version of this claiming to identify stress in the next year or two, as a precursor to a reliable, clinically validated technology that can robustly identify the onset of depression.
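For readers curious what this kind of frequency analysis looks like in practice, below is a minimal Python sketch that estimates the fundamental frequency (F0) contour of a sustained vowel and derives a simple jitter-like variability statistic. It assumes the open-source librosa library and a placeholder recording filename; it illustrates the general technique only, not the specific method behind any of the clinical work discussed above.

```python
import numpy as np
import librosa

# Load a recording of a sustained vowel; the filename is a placeholder.
y, sr = librosa.load("sustained_vowel.wav", sr=None)

# pYIN returns a per-frame F0 estimate, with NaN for unvoiced frames.
f0, voiced_flag, voiced_probs = librosa.pyin(
    y,
    fmin=librosa.note_to_hz("C2"),  # ~65 Hz, low end of speech F0
    fmax=librosa.note_to_hz("C7"),  # ~2093 Hz, well above speech F0
    sr=sr,
)
f0 = f0[~np.isnan(f0)]  # keep voiced frames only

# Frame-level approximation of local jitter: mean absolute change in
# pitch period between consecutive frames, normalised by the mean period.
# True jitter is measured cycle-to-cycle; this is a rough stand-in.
periods = 1.0 / f0
jitter = np.mean(np.abs(np.diff(periods))) / np.mean(periods)

print(f"mean F0: {np.mean(f0):.1f} Hz")
print(f"F0 spread: {np.std(f0):.1f} Hz")
print(f"approx. local jitter: {jitter:.2%}")
```

For intra-speaker monitoring of the kind Shona describes, statistics like these would be tracked longitudinally and compared against the person’s own baseline rather than a population norm; clinically validated pipelines use far richer feature sets.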