Expert evaluation and readability of conversational AI responses to common ophthalmic patient questions: an exploratory generational cross-sectional study
Abstract
Purpose: To evaluate expert-rated informational quality and linguistic accessibility of responses generated by a contemporary large language model to common ophthalmic patient questions, and to explore generational evolution in conversational artificial intelligence (AI) performance.
Methods: In this cross-sectional exploratory study, 12 frequently asked ophthalmology questions were used to generate patient-oriented responses from ChatGPT-5 using a standardized specialist-role prompt. Responses were evaluated by ophthalmologists using validated instruments assessing Global Quality Score (GQS), Reliability Score (RS), and Usefulness Score (US). Readability was analyzed using the Flesch–Szigriszt Index (FSI) and categorized according to the INFLESZ scale. Descriptive analyses were performed, and findings were contextually interpreted against previously reported data obtained using identical questions and evaluation methods. Domain-specific variability and correlations among evaluation constructs were explored.
Results: Responses were rated favorably across expert-assessed domains (mean GQS 4.01, RS 5.40, US 5.68), with contextual comparisons suggesting modest generational improvements. In contrast, readability showed a substantial increase (mean FSI 67.7 vs 53.9), corresponding to a shift from “somewhat difficult” to “fairly easy” patient comprehension. Performance changes were domain-dependent, with greater gains in explanatory topics than in context-sensitive counselling. Readability demonstrated minimal correlation with expert-rated quality constructs.
Conclusions: Generational development of conversational AI in ophthalmology suggests a tendency to improve linguistic accessibility more consistently than expert-perceived informational quality. Conversational AI may therefore support patient education by improving communicative clarity, which would be useful in settings with limited consultation time. Careful clinical integration and further evaluation of real-world educational impact remain necessary.
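As an illustration of the readability step, the following minimal sketch shows how an FSI value can be computed from raw counts and mapped to an INFLESZ band. It uses the standard published Flesch–Szigriszt formula for Spanish text and the usual INFLESZ cut-offs; it is not the authors' analysis code. The function names, the handling of exact boundary values, and the example counts are assumptions made for illustration, and syllable counting itself is left out.

```python
def flesch_szigriszt(syllables: int, words: int, sentences: int) -> float:
    """Flesch-Szigriszt Index from raw counts (standard formula for Spanish text)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 62.3 * (syllables / words) - (words / sentences)


def inflesz_category(score: float) -> str:
    """Map an FSI score to its INFLESZ band.

    Assumption: a score exactly on a cut-off is placed in the easier band.
    """
    if score < 40:
        return "very difficult"
    if score < 55:
        return "somewhat difficult"
    if score < 65:
        return "normal"
    if score < 80:
        return "fairly easy"
    return "very easy"


if __name__ == "__main__":
    # Mean FSI values reported in the abstract.
    for mean_fsi in (53.9, 67.7):
        print(mean_fsi, inflesz_category(mean_fsi))

    # Hypothetical counts for a short response (not study data).
    score = flesch_szigriszt(syllables=200, words=100, sentences=10)
    print(round(score, 3), inflesz_category(score))
```

Applied to the two mean values reported above, the mapping reproduces the bands named in the Results: 53.9 falls in "somewhat difficult" and 67.7 in "fairly easy".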
Citation Information
@article{javiergismerorodriguez2026,
title={Expert evaluation and readability of conversational AI responses to common ophthalmic patient questions: an exploratory generational cross-sectional study},
author={Javier Gismero Rodriguez and Carlos Ruiz Nuñez and Antonio Jose Garcia Ruiz},
journal={Research Square},
year={2026},
doi={https://doi.org/10.21203/rs.3.rs-9242640/v1}
}