Can AI Chatbots Reason Like Doctors?
OpenAI's o1-preview model outperformed physicians on several clinical reasoning tasks using real emergency room records, according to a study published in Science on 30 April.
One of the earliest stated goals for computing in medicine was to aid in clinical reasoning: the decision-making steps required to reach a diagnosis and form a treatment plan. Over the years, researchers have built many clinical decision support systems, typically purpose-built around painstakingly written rules about symptoms, test thresholds, and medication interactions. As artificial intelligence capabilities develop, clinical reasoning is a natural application. Now, a large language model (LLM) from OpenAI has outperformed physicians on several clinical reasoning tasks using real emergency room records, according to a study published 30 April in Science.
The new findings arrive amid a wave of conflicting evidence about medical information from chatbots: some studies show impressive diagnostic performance, while others document fabricated citations, flawed advice, and results that shift depending on how researchers score the systems. Despite that uncertainty, products aimed at medical professionals are already entering the market. For example, this year OpenAI introduced ChatGPT for Clinicians and ChatGPT for Healthcare. The performance of…
- Source: spectrum.ieee.org, "Can AI Chatbots Reason Like Doctors?" (primary)