As large language models (LLMs) see wider use in medicine, their potential for patient education and clinical decision support is becoming more prominent. Because spinal cord injury (SCI) involves complex pathogenesis, diverse treatment options, and lengthy rehabilitation, patients are increasingly turning to advanced online resources for medical information. This study analyzed responses from four LLMs (ChatGPT-4o, Claude-3.5 Sonnet, Gemini-1.5 Pro, and Llama-3.1) to 37 SCI-related questions spanning pathogenesis, risk factors, clinical features, diagnostics, treatments, and prognosis. Quality and readability were assessed using the Ensuring Quality Information for Patients (EQIP) tool and Flesch-Kincaid metrics, respectively. Accuracy was independently scored by three senior spine surgeons using consensus scoring. Performance varied among the models. Gemini ranked highest in EQIP scores, suggesting superior information quality. Although readability was generally low for all four LLMs, requiring college-level reading comprehension, all were able to effectively simplify complex content. Notably, ChatGPT led in accuracy, achieving a significantly higher proportion of "Good" ratings (83.8%) than Claude (78.4%), Gemini (54.1%), and Llama (62.2%). Comprehensiveness scores were high across all models. Furthermore, the LLMs exhibited strong self-correction abilities. After being prompted for revision, the accuracy of ChatGPT's and Claude's responses improved by 100% and 50%, respectively, and that of both Gemini and Llama improved by 67%. This study represents the first systematic comparison of leading LLMs in the context of SCI. While Gemini excelled in response quality, ChatGPT provided the most accurate and comprehensive responses.
Funding:
Natural Science Foundation of Beijing Municipality
Language:
Foreign language
WOS:
PubmedID:
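Chinese Academy of Sciences (CAS) journal ranking:
Edition for year of publication [2025]:
Major category | Tier 3, Medicine
Subcategory | Tier 3, Health Care Sciences & Services; Tier 4, Medical Informatics
Latest edition [2025]:
Major category | Tier 3, Medicine
Subcategory | Tier 3, Health Care Sciences & Services; Tier 4, Medical Informatics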
JCR quartile:
Edition for year of publication [2023]:
Q1 HEALTH CARE SCIENCES & SERVICES; Q2 MEDICAL INFORMATICS
Latest edition [2023]:
Q1 HEALTH CARE SCIENCES & SERVICES; Q2 MEDICAL INFORMATICS
First author affiliations: [1] Capital Med Univ, Xuanwu Hosp, Dept Neurosurg, 45 Changchun St, Beijing 10053, Peoples R China; [2] CHINA INI, Spine Ctr, Beijing, Peoples R China
Co-first authors:
Corresponding author:
Corresponding affiliations: [1] Capital Med Univ, Xuanwu Hosp, Dept Neurosurg, 45 Changchun St, Beijing 10053, Peoples R China; [2] CHINA INI, Spine Ctr, Beijing, Peoples R China
Recommended citation (GB/T 7714):
Li Jinze, Chang Chao, Li Yanqiu, et al. Large Language Models' Responses to Spinal Cord Injury: A Comparative Study of Performance[J]. JOURNAL OF MEDICAL SYSTEMS, 2025, 49(1). doi:10.1007/s10916-025-02170-7.
APA:
Li, Jinze, Chang, Chao, Li, Yanqiu, Cui, Shengyu, Yuan, Fan, ... & Jian, Fengzeng. (2025). Large Language Models' Responses to Spinal Cord Injury: A Comparative Study of Performance. JOURNAL OF MEDICAL SYSTEMS, 49(1).
MLA:
Li, Jinze, et al. "Large Language Models' Responses to Spinal Cord Injury: A Comparative Study of Performance." JOURNAL OF MEDICAL SYSTEMS 49.1 (2025).