Harvard Medical School recently published a study in the journal Science on the performance of large language models in medical diagnosis. Using rigorous double-blind testing and clinical reasoning evaluation, the study compared how AI systems and human physicians interpret medical records. The data show that the latest AI models have an edge in handling complex clinical information, especially in high-pressure, information-heavy emergency department settings. However, the researchers emphasize that their findings do not mean AI systems are ready to practice medicine autonomously, nor that doctors can be removed from the diagnostic process.
AI outperforms at early decision points in the ER
The research team had the LLM evaluate patients in a standard emergency setting at different stages, from early triage to later admission decisions. At each stage, the model was given only the information available at that time, drawn directly from actual electronic medical records, and was asked to produce possible diagnoses and recommend next steps in treatment. At early decision points in real-world emergency cases, the model's diagnostic accuracy was on par with, or even better than, that of the attending physicians, an outcome that surprised even the researchers.
The study stresses: AI still can’t practice medicine on its own; doctors’ role remains important
The researchers emphasized that the findings do not mean AI systems are ready to practice medicine autonomously, nor that doctors can be removed from the diagnostic process.
The report also noted that the rapid development of AI has major implications for the science and practice of clinical medicine. Although using artificial intelligence for clinical decision support is sometimes viewed as high-risk, broader use of these tools may help reduce the human and economic costs of diagnostic errors, delays, and difficulties in accessing care.
This article, "Harvard Medical School's latest study: AI diagnosis decisions in the ER are better than human doctors," first appeared on Chain News ABMedia.