Frontiers in Endocrinology: Open-source LLMs for real-world diabetes diagnosis
Mar 25, 2026
Our team published a study in Frontiers in Endocrinology evaluating how open-source large language models perform on real-world diabetes diagnosis from unstructured clinical narratives.
Using 11,329 adult diabetes cases from a large Chinese tertiary center, the study compared multiple open-source LLMs on three tasks: diabetes subtype classification, diabetic kidney disease (DKD) diagnosis, and metabolic syndrome (MetS) diagnosis. The models performed strongly on complex subtype classification but showed a clear gap on the more rule-based diagnostic tasks, which supports a clinical co-pilot role for open-source LLMs in endocrinology.
Key points:
- Evaluated multiple open-source LLMs on 11,329 real-world diabetes cases from Chinese clinical records.
- Achieved a peak F1 score of 0.951 for multi-class diabetes subtype classification.
- Identified a clear gap between strong holistic pattern recognition in subtype classification and weaker procedural diagnostic reasoning in the DKD and MetS tasks.
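For readers less familiar with the F1 metric quoted above, here is a minimal sketch of how a multi-class F1 score can be computed from true and predicted labels. It is an illustration only, not the paper's evaluation code: this post does not say which averaging scheme the study used, so the macro average shown here is an assumption, and the subtype labels and toy predictions are hypothetical.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its one-vs-rest F1 score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging, not from the paper)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical labels and predictions, for illustration only.
y_true = ["T2DM", "T2DM", "T1DM", "other", "T2DM", "T1DM"]
y_pred = ["T2DM", "T2DM", "T1DM", "T2DM", "T2DM", "other"]
print(per_class_f1(y_true, y_pred))
print(round(macro_f1(y_true, y_pred), 3))
```

Macro averaging weights every subtype equally, so rare subtypes affect the score as much as common ones. A support-weighted or micro average would give a different number on imbalanced data.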
Paper: Real-world performance of open-source large language models in diabetes diagnosis