journal article Dec 09, 2024

Use of ChatGPT to Explore Gender and Geographic Disparities in Scientific Peer Review

Abstract
Background
In the realm of scientific research, peer review serves as a cornerstone for ensuring the quality and integrity of scholarly papers. Recent trends in promoting transparency and accountability have led some journals to publish peer-review reports alongside papers.


Objective
ChatGPT-4 (OpenAI) was used to quantitatively assess sentiment and politeness in peer-review reports from high-impact medical journals. The objective was to explore gender and geographical disparities to enhance inclusivity within the peer-review process.


Methods
All 9 general medical journals with an impact factor >2 that publish peer-review reports were identified. A total of 12 research papers per journal were randomly selected, all published in 2023. The names of the first and last authors along with the first author’s country of affiliation were collected, and the gender of both the first and last authors was determined. For each review, ChatGPT-4 was asked to evaluate the “sentiment score,” ranging from –100 (negative) to 0 (neutral) to +100 (positive), and the “politeness score,” ranging from –100 (rude) to 0 (neutral) to +100 (polite). The measurements were repeated 5 times and the minimum and maximum values were removed. The mean sentiment and politeness scores for each review were computed and then summarized using the median and interquartile range. Statistical analyses included Wilcoxon rank-sum tests, Kruskal-Wallis rank tests, and negative binomial regressions.
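The scoring step described above (5 repeated measurements per review, minimum and maximum removed, remaining values averaged, then summarized by median and interquartile range) can be sketched as follows. This is a minimal illustration, not the author's code: the function names, the example scores, and the choice of the "inclusive" quartile method are assumptions made here for demonstration.

```python
from statistics import mean, median, quantiles


def trimmed_review_score(scores):
    """Drop one minimum and one maximum value, then average the rest.

    `scores` holds the repeated ratings for a single review
    (5 per review in the study). Hypothetical helper name.
    """
    if len(scores) < 3:
        raise ValueError("need at least 3 measurements to trim min and max")
    ordered = sorted(scores)
    return mean(ordered[1:-1])  # discard the lowest and highest value


def summarize(values):
    """Return (median, (Q1, Q3)) for a list of per-review mean scores.

    The quartile method is an assumption; the abstract does not say
    which definition was used.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), (q1, q3)


# Hypothetical sentiment ratings for one review, on the -100..+100 scale.
one_review = [40, 55, 60, 65, 90]
review_mean = trimmed_review_score(one_review)

# Hypothetical per-review means for a group of reviews.
group_median, group_iqr = summarize([10, 20, 30, 40, 50])
```

Trimming the two extremes before averaging limits the influence of a single outlying response when the same review is scored repeatedly.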


Results
Analysis of 291 peer-review reports corresponding to 108 papers unveiled notable regional disparities. Papers from the Middle East, Latin America, or Africa exhibited lower sentiment and politeness scores compared to those from North America, Europe, or Pacific and Asia (sentiment scores: 27 vs 60 and 62 respectively; politeness scores: 43.5 vs 67 and 65 respectively, adjusted P=.02). No significant differences based on authors’ gender were observed (all P>.05).


Conclusions
Notable regional disparities were found, with papers from the Middle East, Latin America, and Africa demonstrating significantly lower scores, while no discernible differences were observed based on authors’ gender. The absence of gender-based differences suggests that gender biases may not manifest as prominently as other forms of bias within the context of peer review. The study underscores the need for targeted interventions to address regional disparities in peer review and advocates for ongoing efforts to promote equity and inclusivity in scholarly communication.

Details
Published: Dec 09, 2024
Vol/Issue: 26
Pages: e57667
Cite This Article
Paul Sebo (2024). Use of ChatGPT to Explore Gender and Geographic Disparities in Scientific Peer Review. Journal of Medical Internet Research, 26, e57667. https://doi.org/10.2196/57667