AI-powered transcription tool used in hospitals invents things no one ever said
Researchers found that OpenAI's Whisper AI transcription tool often generates false information, with one study finding inaccuracies in 80% of the public meeting transcriptions it examined, raising serious concerns, especially in healthcare settings.
Researchers have identified significant issues with Whisper, an AI-powered transcription tool developed by OpenAI, which is increasingly used in various sectors, including healthcare. The tool is prone to generating false information, known as hallucinations, which can include fabricated statements, racial commentary, and even non-existent medical treatments. Interviews with software engineers and researchers revealed that hallucinations were found in a substantial number of transcriptions, with one study indicating that 80% of analyzed public meeting transcriptions contained inaccuracies. The prevalence of these errors raises concerns, particularly in medical settings where accurate transcriptions are critical for patient care. Despite OpenAI's warnings against using Whisper in high-risk domains, many healthcare providers have adopted it to streamline documentation processes. Critics argue that the tool's inaccuracies could lead to misdiagnoses and other serious consequences. Additionally, the erasure of original audio recordings by some applications complicates the verification of transcriptions, further heightening the risk of errors going unnoticed. Experts are calling for regulatory oversight and improvements to the technology to mitigate these issues, emphasizing the need for a higher standard of accuracy in AI transcription tools.
- Whisper, an AI transcription tool, frequently generates false information, raising concerns in healthcare.
- In one study, hallucinations were found in up to 80% of analyzed public meeting transcriptions, including harmful content.
- Many medical centers are using Whisper despite warnings against its use in high-risk areas.
- The erasure of original audio recordings complicates error verification in transcriptions (a minimal spot-check is sketched after this list).
- Experts are advocating for regulatory oversight and improvements to AI transcription technologies.
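On the verification point: when the source audio survives, even a small human-checked sample can be scored mechanically against the machine transcript. A minimal sketch, assuming the open-source jiwer library (our choice of tool, not anything the article mentions); the example strings come from the article's quoted hallucination:

```python
# Hypothetical spot-check, not from the article: score an ASR transcript
# against a human-verified reference using word error rate (WER).
# Requires: pip install jiwer
import jiwer

reference = "two other girls and one lady"                       # what was actually said
hypothesis = "two other girls and one lady um which were black"  # Whisper's output

# WER = (substitutions + insertions + deletions) / words in reference.
print(f"word error rate: {jiwer.wer(reference, hypothesis):.0%}")  # ~67%
```

Of course, this only works if the audio (or a reference made from it) is retained, which is exactly what some of these applications discard.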
Related
ChatGPT Isn't 'Hallucinating'–It's Bullshitting – Scientific American
AI chatbots like ChatGPT can generate false information, termed "bullshitting" by the authors to clarify responsibility and prevent misconceptions. Accurate terminology is crucial for understanding AI technology's impact.
AiOla open-sources ultra-fast 'multi-head' speech recognition model
aiOla has launched Whisper-Medusa, an open-source AI model that enhances speech recognition, achieving over 50% faster performance. It supports real-time understanding of industry jargon and operates in over 100 languages.
Chatbots Are Primed to Warp Reality
The integration of AI chatbots raises concerns about misinformation and manipulation, particularly in political contexts, as they can mislead users and implant false memories despite efforts to improve accuracy.
AI solution to the cocktail party problem used in court
AI technology has improved the "cocktail party problem," enhancing audio clarity in legal contexts. Wave Sciences' algorithm effectively isolates voices, with plans for broader applications in military and consumer markets.
Whisper-Large-v3-Turbo
Whisper is an advanced ASR model by OpenAI, supporting 99 languages with features like transcription, translation, and timestamp generation. The latest version offers faster performance but with slight quality trade-offs.
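For anyone who wants to try that model directly, a minimal sketch using the Hugging Face transformers ASR pipeline with the `openai/whisper-large-v3-turbo` checkpoint (`audio.wav` is a placeholder for your own file; audio decoding requires ffmpeg):

```python
# Minimal sketch: transcription with timestamps via the Hugging Face
# transformers automatic-speech-recognition pipeline.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3-turbo",
)

result = asr("audio.wav", return_timestamps=True)
print(result["text"])            # full transcript
for chunk in result["chunks"]:   # (start, end) timestamps per segment
    print(chunk["timestamp"], chunk["text"])
```

Given the hallucination findings above, output like this is best treated as a draft to be checked against the audio, not as a record.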
In other words, LLMs are only clearly useful if the results don't really matter or if they can, and will, be externally verified.
LLMs negate a fundamental argument for computing --- instead of accurate results at low cost, we now have inaccurate results at high cost.
There is undoubtedly some utility to be had here but it is not at all clear or obvious that this will be widely transformative.
Another example from medicine: radiologists may start handling orders of magnitude more cases. But the number of scans ordered might also grow sharply as the cost per read drops.
> A machine learning engineer said he initially discovered hallucinations in about half of the over 100 hours of Whisper transcriptions he analyzed.

The '100 hours' is almost useless information. 'About half' is meaningless without knowing the sample size. Perhaps he had 5 transcripts averaging 20 hours each, and 2 of the 5 had issues. Or perhaps there were hundreds of short transcripts, where the 'about half' would imply significance. (A quick toy calculation after the quoted examples below makes this concrete.)

> But the transcription software added: “He took a big piece of a cross, a teeny, small piece ... I’m sure he didn’t have a terror knife so he killed a number of people.”
> A speaker in another recording described “two other girls and one lady.” Whisper invented extra commentary on race, adding "two other girls and one lady, um, which were Black.”
> In a third transcription, Whisper invented a non-existent medication called “hyperactivated antibiotics.”
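To make the sample-size point concrete, a toy calculation (the numbers are the commenter's hypotheticals, not data from the study):

```python
# Toy numbers only, from the comment above: "hallucinations in about half
# of 100 hours" can describe very different evidence depending on how the
# audio was split into transcripts.
scenarios = {
    "A: 5 long transcripts (~20 h each)":      {"n": 5,   "with_issues": 2},
    "B: 300 short transcripts (~20 min each)": {"n": 300, "with_issues": 150},
}

for name, s in scenarios.items():
    rate = s["with_issues"] / s["n"]
    print(f"{name}: {rate:.0%} of transcripts affected (n={s['n']})")
# Both rates land near "about half", but only scenario B's sample size
# makes that rate statistically meaningful.
```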
I didn't expect it to be this bad.
I use a digital recorder app to record audio from my clinical consultations. It's important for me, as a patient, to have a record, because I'm alone in there, and I frequently misremember or misunderstand things that were said.
My current recorder app has a transcription feature. It's fairly good at picking out words. It's supposed to recognize and label speakers as well, but that requires a lot of manual editing after the fact.
Still, it's fantastic having my own durable record of what was said to me, and by me. There are usually a few surprises in there!
Now, I've stopped asking for permission to record, because usually they become hostile to it. Nevertheless, it's legal, and it is, literally, my right to have.