Evaluation of Artificial Intelligence (ChatGPT-5.2) in the Classification and Indication for Fixation of Posterior Malleolar Fractures: A Multicenter External Validation Study

  • Héctor Agustín Rivadeneira Jurado, Orthopedics and Traumatology Service, Hospital Sirio-Libanés, Autonomous City of Buenos Aires, Argentina
  • Elías A. Rivadeneira Jurado, Orthopedics and Traumatology Service, Hospital Sirio-Libanés, Autonomous City of Buenos Aires, Argentina https://orcid.org/0009-0006-5784-5700
  • Daniel Espinoza Freire, Orthopedics and Traumatology Service, Hospital Sirio-Libanés, Autonomous City of Buenos Aires, Argentina https://orcid.org/0009-0000-9882-6027
  • Andrés F. Samaniego, Orthopedics and Traumatology Service, Hospital Sirio-Libanés, Autonomous City of Buenos Aires, Argentina https://orcid.org/0000-0002-6616-6471
  • Ezequiel Lulkin, Orthopedics and Traumatology Service, Hospital Sirio-Libanés, Autonomous City of Buenos Aires, Argentina https://orcid.org/0000-0002-4119-0483
  • Sebastián Pereira, Orthopedics and Traumatology Service, Hospital Sirio-Libanés, Autonomous City of Buenos Aires, Argentina
  • Fernando Bidolegui, Orthopedics and Traumatology Service, Sanatorio Otamendi y Miroli, Autonomous City of Buenos Aires, Argentina
  • Tomás Macagno, Orthopedics and Traumatology Service, Sanatorio Otamendi y Miroli, Autonomous City of Buenos Aires, Argentina
Keywords: Artificial intelligence, posterior malleolus, multicenter study

Abstract

Introduction: Posterior malleolar fractures have a significant impact on ankle joint congruity. The indication for fixation no longer depends solely on fragment size but also on fracture morphology. Artificial intelligence (AI) has emerged as a tool to support clinical decision-making. The objective of this study was to evaluate the ability of AI to classify posterior malleolar fractures and determine the indication for fixation, compared with a reference standard based on expert consensus.

Materials and Methods: A retrospective diagnostic accuracy study with external validation was conducted in accordance with the STARD-AI and GAMER guidelines. A protocol based on the Bartoníček and Rammelt classification was developed using 24 cases for calibration. Subsequently, 9 cases with radiographs and computed tomography scans were analyzed by 12 experts and by the ChatGPT-5.2 model. Agreement in fracture classification was assessed using Cohen’s kappa coefficient, and sensitivity was calculated for the indication for fixation.

Results: ChatGPT-5.2 achieved 78% agreement in fracture classification, with a kappa coefficient of 0.56, indicating moderate agreement. Sensitivity for the indication for posterior malleolar fixation was 100%.

Conclusions: Artificial intelligence demonstrated performance comparable to that of experts in the classification of posterior malleolar fractures and high sensitivity in determining the indication for fixation. It proved useful as a supportive tool in medical education settings. Studies with larger sample sizes are needed to validate these findings.
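For readers less familiar with the two metrics reported in the abstract, the following is a minimal Python sketch of how Cohen's kappa and sensitivity can be computed from paired ratings. It is not the authors' analysis code. The function names (`cohen_kappa`, `sensitivity`) and all labels below are hypothetical and invented purely for illustration; they are not the study data and do not reproduce the reported values.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one label per case."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: proportion of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per label.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def sensitivity(reference, predicted):
    """True positives divided by all reference-positive cases."""
    true_pos = sum(1 for r, p in zip(reference, predicted) if r and p)
    false_neg = sum(1 for r, p in zip(reference, predicted) if r and not p)
    if true_pos + false_neg == 0:
        raise ValueError("no positive cases in the reference standard")
    return true_pos / (true_pos + false_neg)


# Invented example: fracture type per case (reference standard vs. model).
reference_type = ["1", "2", "2", "3", "3", "4", "2", "3"]
model_type = ["1", "2", "3", "3", "3", "4", "2", "2"]

# Invented example: indication for fixation (True = fixation indicated).
reference_fix = [False, True, True, True, True, True, True, True]
model_fix = [True, True, True, True, True, True, True, True]

kappa = cohen_kappa(reference_type, model_type)
sens = sensitivity(reference_fix, model_fix)
print(f"kappa = {kappa:.2f}, sensitivity = {sens:.0%}")
```

Note that sensitivity ignores false positives: in the invented example the model recommends fixation for a case the reference standard would not fix, yet sensitivity is unaffected. With small samples, both metrics carry wide uncertainty.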

References

Terstegen J, Weel H, Frosch KH, Rolvien T, Schlickewei C, Mueller E. Classifications of posterior malleolar fractures: a systematic literature review. Arch Orthop Trauma Surg 2023;143(7):4181-220. https://doi.org/10.1007/s00402-022-04643-7

Mohamed A, Fuad U, Elasad A, Shrestha S, Hagroo A, Pengas IP. Posterior malleolar fractures: From the "Forgotten Fragment" to modern concepts in management. Cureus 2025;17(10):e94681. https://doi.org/10.7759/cureus.94681

Bartoníček J, Rammelt S, Tuček M, Naňka O. Posterior malleolar fractures of the ankle. Eur J Trauma Emerg Surg 2015;41(6):587-600. https://doi.org/10.1007/s00068-015-0560-6

Verhage SM, Hoogendoorn JM, Krijnen P. When and how to operate the posterior malleolus fragment in trimalleolar fractures. Arch Orthop Trauma Surg 2018;138(9):1213-22. https://doi.org/10.1007/s00402-018-2949-2

Gale W, Oakden-Rayner L, Carneiro G, Bradley AP, Palmer LJ. Detecting hip fractures with radiologist-level performance using deep neural networks. Preprint. Digit Med 2017. https://doi.org/10.48550/arXiv.1711.06504

Lindsey R, Daluiski A, Chopra S, Lachapelle A, Mozer M, Sicular S, et al. Deep neural network improves fracture detection by clinicians. Proc Natl Acad Sci USA 2018;115(45):11591-6. https://doi.org/10.1073/pnas.1806905115

Rivadeneira Jurado HA, Rivadeneira Jurado EA, Espinoza Freire D, Samaniego AF, Lulkin E, Bidolegui F, et al. Evaluación de la clasificación de las fracturas de platillo tibial según Schatzker-Kfuri utilizando radiografías y tomografía. Comparación entre el observador experto y el modelo ChatGPT-4o. Rev Asoc Argent Ortop Traumatol 2025;90(6):556-60. https://doi.org/10.15417/issn.1852-7434.2025.90.6.2224

Husarek J, Hess S, Razaeian S, Ruder TD, Sehmisch S, Müller M, et al. Artificial intelligence in commercial fracture detection products: a systematic review and meta-analysis of diagnostic test accuracy. Sci Rep 2024;14(1):23053. https://doi.org/10.1038/s41598-024-73058-8

Mohammadi S, Parviz S, Parvaz P, Pirmoradi MM, Afzalimoghaddam M, Mirfazaelian H. Diagnostic performance of ChatGPT in tibial plateau fracture in knee X-ray. Emerg Radiol 2025;32(1):59-64. https://doi.org/10.1007/s10140-024-02298-y

Published
2026-06-30
How to Cite
Rivadeneira Jurado, H. A., Rivadeneira Jurado, E. A., Espinoza Freire, D., Samaniego, A. F., Lulkin, E., Pereira, S., Bidolegui, F., & Macagno, T. (2026). Evaluation of Artificial Intelligence (ChatGPT-5.2) in the Classification and Indication for Fixation of Posterior Malleolar Fractures: A Multicenter External Validation Study. Revista de la Asociación Argentina de Ortopedia y Traumatología, 91(3), 246-249. https://doi.org/10.15417/issn.1852-7434.2026.91.3.2348
Section
Basic Research