Frontiers in Endocrinology: Open-source LLMs for real-world diabetes diagnosis

Mar 25, 2026

Our team published a study in Frontiers in Endocrinology evaluating how open-source large language models perform on real-world diabetes diagnosis from unstructured clinical narratives.

Using 11,329 adult diabetes cases from a large Chinese tertiary center, the study compared multiple open-source LLMs on three tasks: diabetes subtype classification, diabetic kidney disease (DKD) diagnosis, and metabolic syndrome (MetS) diagnosis. The models performed strongly on complex subtype classification but showed a clear gap on the more rule-based DKD and MetS tasks, which supports a clinical co-pilot role for open-source LLMs in endocrinology.

Key points:

  • Evaluated multiple open-source LLMs on 11,329 real-world diabetes cases from Chinese clinical records.
  • Achieved a peak F1 score of 0.951 for multi-class diabetes subtype classification.
  • Identified a clear difference between strong holistic pattern recognition and weaker procedural diagnostic reasoning in DKD and MetS tasks.
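
For readers unfamiliar with the metric, here is a minimal sketch of how an F1 score can be computed for a multi-class task such as subtype classification. It assumes macro averaging (the unweighted mean of per-class F1), which this post does not confirm as the paper's choice. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gets a false positive
            fn[truth] += 1  # true label gets a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging, see lead-in)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels, not data from the study.
y_true = ["T1", "T2", "T2", "other", "T2", "T1"]
y_pred = ["T1", "T2", "T1", "other", "T2", "T1"]

print(per_class_f1(y_true, y_pred))
print(round(macro_f1(y_true, y_pred), 3))
```

Macro averaging weights every subtype equally, so a rare subtype counts as much as a common one. A support-weighted average would give different numbers on imbalanced data.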

Paper: Real-world performance of open-source large language models in diabetes diagnosis