Large Language Models in Burns and Trauma Education: GPT-4 Passes the ABLS and ATLS Exams

Alexander J Comerci, BS; Hilary Y Liu, BS; Mario Alessandri Bonetti, MD; James Donovan, MD; Alain C Corcos, MD; Jenny Ziembicki, MD; Francesco M Egro, MD, MSc, MRCS
University of Pittsburgh School of Medicine
2024-01-15

Presenter: Alexander J Comerci

Affidavit:
J Peter Rubin

Director Name: J Peter Rubin

Author Category: Medical Student
Presentation Category: Clinical
Abstract Category: General Reconstruction

Background: The Advanced Burn Life Support (ABLS) and Advanced Trauma Life Support (ATLS) exams assess medical professionals' ability to evaluate and treat burn and trauma patients effectively. AI-powered large language models (LLMs) have recently become widespread, raising the question of how these technologies may be integrated into medical education. To determine whether GPT-3.5, Google Bard, and GPT-4 could serve as teaching tools, we assessed their performance on the ABLS and ATLS exams.

Methods: We used GPT-3.5, Google Bard, and GPT-4 to answer the 2023 ABLS exam and three ATLS 10th edition exams. Answers were compared to the answer key provided by the ACS. Average exam scores were calculated. Differences in the number of correct answers were evaluated using the chi-square test.

Results: GPT-3.5, Bard, and GPT-4 scored 86%, 70%, and 90% on the ABLS exam, respectively. GPT-3.5 and GPT-4 scored above the passing threshold. No significant difference was found between GPT-3.5 and GPT-4 (p=0.538) or between GPT-3.5 and Bard (p=0.054). GPT-4 performed significantly better than Bard (p=0.012). On the ATLS exams, GPT-3.5, Bard, and GPT-4 scored an average of 65%, 61.7%, and 83.3%, respectively. Only GPT-4 exceeded the passing threshold (75%). There was no significant difference between the average scores of GPT-3.5 and Bard (p=0.5921), but GPT-4 performed significantly better than both GPT-3.5 (p=0.0012) and Bard (p=0.0002).
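The pairwise comparisons above can be illustrated with a minimal sketch of a 2x2 Pearson chi-square test. The abstract does not state the number of exam questions or whether a continuity correction was applied, so the 50-question count, the uncorrected statistic, and the function name chi_square_2x2 are all assumptions made for illustration, not the authors' procedure. With 50 questions, 90% and 70% correspond to 45 and 35 correct answers.

```python
import math


def chi_square_2x2(correct_a, correct_b, n_questions):
    """Pearson chi-square test (no continuity correction) comparing the
    number of correct answers of two models on the same exam.

    Returns (chi2, p) for one degree of freedom.
    """
    a, b = correct_a, n_questions - correct_a  # model A: correct, incorrect
    c, d = correct_b, n_questions - correct_b  # model B: correct, incorrect
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero; the test is undefined")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# ASSUMPTION: a 50-question exam; the abstract does not report the question count.
N_QUESTIONS = 50

# Hypothetical counts derived from the reported 90% (GPT-4) and 70% (Bard).
chi2, p = chi_square_2x2(45, 35, N_QUESTIONS)
print(f"chi2={chi2:.2f}, p={p:.3f}")  # prints "chi2=6.25, p=0.012"
```

Under these assumed counts the result is in line with the reported p=0.012 for GPT-4 versus Bard, but that agreement does not confirm the actual question count or test variant used in the study.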

Conclusion: GPT-4 outperformed GPT-3.5 and Google Bard and passed both the ABLS and ATLS exams. Although LLMs demonstrate an impressive level of burn and trauma knowledge, further improvements are needed and ethical considerations must be addressed before LLMs can be used in an educational setting.

Ohio, Pennsylvania, West Virginia, Indiana, Kentucky, Pennsylvania American Society of Plastic Surgeons

OVSPS Conference