December 20th, 2024

Study: Almost all leading AI chatbots show signs of cognitive decline

A study in The BMJ found leading AI chatbots show cognitive decline, with ChatGPT 4o scoring highest. Limitations in visuospatial skills and executive functions may hinder their clinical effectiveness.

A recent study published in The BMJ reports that nearly all leading AI chatbots show signs of cognitive decline, challenging the notion that AI could soon replace human doctors. The researchers assessed the cognitive abilities of prominent large language models (LLMs) such as ChatGPT and Gemini using the Montreal Cognitive Assessment (MoCA), a test typically used to detect early signs of dementia. Older versions of the chatbots scored lower, much as older patients tend to. ChatGPT 4o scored the highest with 26 out of 30, while Gemini 1.0 scored the lowest with 16. All chatbots struggled particularly with visuospatial skills and executive functions, which are critical for clinical applications. The findings suggest that although LLMs can perform well on certain medical diagnostic tasks, their limitations in visual abstraction and executive function may hinder their effectiveness in clinical settings. The authors conclude that neurologists are unlikely to be replaced by AI models in the near future and may instead find themselves treating AI systems that exhibit cognitive impairments.

- Leading AI chatbots show signs of cognitive decline, challenging their potential to replace human doctors.

- The study used the Montreal Cognitive Assessment (MoCA) to evaluate the cognitive abilities of various LLMs.

- ChatGPT 4o scored the highest, while all models struggled with visuospatial skills and executive functions.

- Findings indicate significant cognitive limitations in AI that could impede clinical applications.

- Neurologists may soon encounter AI models presenting with cognitive impairments rather than being replaced by them.

3 comments
By @emptiestplace - 4 months
Wait, are we actually applying human neurological assessment tools to transformer-based language models and conflating architectural/training differences between model iterations with biological cognitive decline?
By @chis - 4 months
Legendary shitpost from a major medical journal