Poster Presentation ESA-SRB-ANZOS 2025 in conjunction with ENSA

An evaluation of the quality of recommendations for polycystic ovarian syndrome provided by generative artificial intelligence (128529)

Aleksia Grbic 1, Harry Kirkland 1, Ngaroma Steele 1, Juliana Chen 1,2
  1. Faculty of Medicine and Health Sciences, Macquarie University, Ryde, NSW, Australia
  2. Discipline of Nutrition and Dietetics and Faculty of Medicine and Health, Charles Perkins Centre, The University of Sydney, Sydney, NSW, Australia

Polycystic Ovarian Syndrome (PCOS) is an endocrine disorder affecting 5-15% of females (1). It is a heterogeneous disease, with symptomatology varying widely between women (2-5). Consequently, diagnosis is often delayed and effective treatment is difficult to find, so many women self-manage their condition, seeking information from sources such as ChatGPT (2-3). There is, however, little data on the accuracy of the information ChatGPT provides about PCOS (6-7). This study therefore aims to assess the quality of recommendations provided by ChatGPT for PCOS management.

 

Common PCOS questions were gathered from various social media platforms and verified in meetings with PCOS and fertility experts. The questions were then categorised into groups using NVivo. Each question was typed into ChatGPT in a new conversation, using an account registered to a non-institutional email address. After 14-21 days, the questions were re-queried by a second reviewer. The accuracy of the answers was then analysed by comparing ChatGPT's recommendations with PCOS guidelines. A 6-point Likert scale was used to assess the accuracy of answers and a 3-point Likert scale to assess their completeness.
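The scoring step described above can be sketched in a few lines of Python. This is a minimal illustration only, not the authors' analysis: the `ratings` list, the question identifiers, the round labels and the `summarise` helper are all hypothetical, and the scores are made-up placeholders rather than study data. It also assumes that both scales start at 1 and that higher scores are better, which the abstract does not state.

```python
from statistics import mean

# Hypothetical ratings for illustration only; these are NOT the study's data.
# Each row: (question id, query round, accuracy 1-6, completeness 1-3).
ratings = [
    ("Q1", "initial", 4, 2),
    ("Q1", "requery", 5, 3),
    ("Q2", "initial", 5, 2),
    ("Q2", "requery", 5, 2),
    ("Q3", "initial", 3, 1),
    ("Q3", "requery", 4, 3),
]


def summarise(rows, round_name):
    """Return (mean accuracy, mean completeness) for one query round."""
    selected = [row for row in rows if row[1] == round_name]
    if not selected:
        raise ValueError(f"no ratings found for round {round_name!r}")
    for question, _, accuracy, completeness in selected:
        # Assumed scale bounds: accuracy 1-6, completeness 1-3.
        if not 1 <= accuracy <= 6:
            raise ValueError(f"{question}: accuracy {accuracy} outside 1-6")
        if not 1 <= completeness <= 3:
            raise ValueError(f"{question}: completeness {completeness} outside 1-3")
    mean_accuracy = mean(row[2] for row in selected)
    mean_completeness = mean(row[3] for row in selected)
    return round(mean_accuracy, 2), round(mean_completeness, 2)


for round_name in ("initial", "requery"):
    accuracy, completeness = summarise(ratings, round_name)
    print(f"{round_name}: accuracy {accuracy}/6, completeness {completeness}/3")
```

Keeping the two query rounds as separate labels on each row, rather than in separate tables, makes it straightforward to compare the initial and re-queried scores for the same question.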

 

Across all questions (n=36), the mean accuracy score was 4.2 out of 6 and the mean completeness score was 1.8 out of 3. When all questions were re-queried and re-graded by a second reviewer approximately 14-21 days later, the mean accuracy score was 4.4 and the mean completeness score was 2.5. The difference in scores between reviewers may reflect ChatGPT's ongoing evolution towards providing more accurate information. However, the answers had limitations, including outdated or incorrect advice and no use of clinical guidelines as an information source. Answers to many of the questions were also the same, highlighting ChatGPT's lack of creativity.

 

ChatGPT generated mostly accurate and complete answers to a range of questions regarding PCOS, although with some limitations. ChatGPT may be improved by including references in its responses. It would also benefit from more robust training that draws on credible, evidence-based sources.

  1. Rosenfield RL, Ehrmann DA. The Pathogenesis of Polycystic Ovary Syndrome (PCOS): The Hypothesis of PCOS as Functional Ovarian Hyperandrogenism Revisited. Endocrine Reviews [Internet]. 2016 Jul 26 [cited 2025 Mar 7];37(5):467–520. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5045492/table/T1/?report=objectonly
  2. Azziz R. PCOS: a diagnostic challenge. Reproductive BioMedicine Online. 2004 Jan [cited 2025 Mar 7];8(6):644–8.
  3. Emanuel RHK, Docherty PD, Lunt H, Campbell RE. What do users in a polycystic ovary syndrome (PCOS) forum think about the treatments they tried: Analysing treatment sentiment using machine learning. Physical and Engineering Sciences in Medicine. 2025 Apr 14.
  4. Walter K. What Is Polycystic Ovary Syndrome? JAMA. 2022 Jan 18;327(3):294.
  5. Joham AE, Norman RJ, Stener-Victorin E, Legro RS, Franks S, Moran LJ, et al. Polycystic ovary syndrome. The Lancet Diabetes & Endocrinology. 2022 Sep;10(9):668–80.
  6. Yanagita Y, Yokokawa D, Uchida S, Tawara J, Ikusaka M. Accuracy of ChatGPT on Medical Questions in the National Medical Licensing Examination in Japan: Evaluation Study. JMIR Formative Research. 2023 Oct 13;7:e48023.
  7. Johnson D, Goodman R, Patrinely J, Stone C, Zimmerman E, Donald R, et al. Assessing the Accuracy and Reliability of AI-Generated Medical Responses: An Evaluation of the Chat-GPT Model. Research Square [Preprint]. 2023 Feb 28. Available from: https://assets.researchsquare.com/files/rs-2566942/v1/5c64b009-ab48-47a7-bd66-afc5c46d97af.pdf?c=1677623849