Assessing ChatGPT-4 With and Without Retrieval-Augmented Generation in Anticoagulation Management for GI Procedures
Abstract
Background and Aims:
With the growing complexity of managing anticoagulation for patients undergoing gastrointestinal (GI) procedures, this study evaluates ChatGPT-4's ability to provide accurate medical guidance, comparing it with its predecessor model (ChatGPT-3.5) and with a Retrieval-Augmented Generation (RAG)-supported version (ChatGPT4-RAG).
Methods:
Thirty-six anticoagulation-related questions, based on professional guidelines, were answered by ChatGPT-4, and ten gastroenterologists assessed the responses for accuracy and relevance. ChatGPT-4's performance was also compared with that of ChatGPT-3.5 and ChatGPT4-RAG. Additionally, the gastroenterologists were surveyed about their perceptions of ChatGPT-4.
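To illustrate what a retrieval-augmented query of this kind involves, the sketch below builds a guideline-grounded prompt for a chat model. It is a minimal, hypothetical Python example; the study does not describe its RAG implementation, and the excerpt corpus, the overlap-based retrieval step, and the function names here are assumptions made purely for illustration.

    # Hypothetical sketch of a retrieval-augmented prompt builder.
    # The guideline excerpts and scoring heuristic are illustrative placeholders,
    # not the pipeline used in the study.

    GUIDELINE_EXCERPTS = [
        "For elective diagnostic EGD, warfarin may be continued if the INR is in range.",
        "For ERCP with planned sphincterotomy, hold DOACs before the procedure per guideline timing.",
        "For high-bleeding-risk polypectomy, bridge warfarin only in high-thrombotic-risk patients.",
    ]

    def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
        """Rank excerpts by simple word overlap with the question and return the top k."""
        q_words = set(question.lower().split())
        ranked = sorted(
            corpus,
            key=lambda doc: len(q_words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:k]

    def build_rag_prompt(question: str) -> str:
        """Prepend the retrieved guideline context to the clinical question."""
        context = "\n".join(retrieve(question, GUIDELINE_EXCERPTS))
        return (
            "Answer using only the guideline excerpts below.\n\n"
            f"Guideline excerpts:\n{context}\n\nQuestion: {question}"
        )

    if __name__ == "__main__":
        # The resulting prompt would then be sent to the chat model (e.g., ChatGPT-4).
        print(build_rag_prompt("Should warfarin be held before a diagnostic EGD?"))

In practice, the retrieval step would typically use embedding-based search over full guideline documents rather than the keyword overlap shown here; the sketch only conveys the overall structure of grounding the model's answer in retrieved guideline text.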
Results:
ChatGPT-4's responses showed significantly improved accuracy and coherence over ChatGPT-3.5's, with 30.5% of responses fully accurate and 47.2% generally accurate. ChatGPT4-RAG demonstrated a greater ability to integrate current information, achieving full accuracy in 75% of responses. Notably, 51.8% of responses on diagnostic and therapeutic EGD, 42.8% on ERCP with and without stent placement, and 50% on diagnostic and therapeutic colonoscopy were fully accurate.
Conclusion:
ChatGPT4-RAG significantly advances anticoagulation management in endoscopic procedures, offering reliable and precise medical guidance. The integration of AI into medical practice should continue to be evaluated to safeguard patient confidentiality and the integrity of the physician-patient relationship.