Bailicai: A Domain-Optimized Retrieval-Augmented Generation Framework for Medical Applications

Long Cui¹, Yongbin Liu¹, Chunping Ouyang¹, Ying Yu¹, Jiangtao Zhang², Yaping Wan¹, Fei Yang³

¹ School of Computer, University of South China, Hengyang 421001, China.

² 305th Hospital of the Chinese People’s Liberation Army, Beijing 100017, China.

³ School of Public Health, University of South China, Hengyang 421001, China.

Abstract

Large language models (LLMs) excel in various natural language processing tasks and are increasingly applied in specialized fields like medicine. However, their deployment in the medical domain is challenged by limited domain-specific data and the tendency to generate inaccurate information, known as “hallucinations.” While domain-specific fine-tuning has improved open-source LLMs, they still underperform compared to proprietary models like ChatGPT and PaLM. To address this gap, retrieval-augmented generation (RAG) techniques have been explored to enhance LLMs by integrating external knowledge bases. Nevertheless, the success of RAG depends on the quality of the retrieved documents, and its application within the medical field remains in the early stages. In this paper, we introduce the “Bailicai” framework as an exploratory approach to integrating RAG with LLMs in the medical field. The framework employs fine-tuning to improve the RAG process: “falsely relevant” and “completely irrelevant” interference documents are intentionally included in the training data. This enables Bailicai to develop the ability to assess the quality of retrieved documents and selectively incorporate them. The framework is organized into four modules: (1) medical knowledge injection, (2) self-knowledge boundary identification, (3) directed acyclic graph task decomposition, and (4) retrieval-augmented generation. Through the synergy of these modules, Bailicai achieves superior performance on multiple medical benchmarks, outperforming existing large models in the medical domain, RAG-based methods, and proprietary models such as GPT-3.5. Furthermore, Bailicai effectively mitigates the hallucination problem common in LLMs applied to medical tasks and enhances the robustness of RAG when dealing with irrelevant or misleading documents, enabling more accurate information retrieval and integration.
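As a rough illustration of the idea of mixing interference documents into fine-tuning data, the following Python sketch assembles one training sample from a relevant document plus “falsely relevant” and “completely irrelevant” distractors. It is a minimal sketch under stated assumptions, not the authors' pipeline: the names `Doc` and `build_training_sample`, the prompt layout, the number of distractors, and the example texts are all hypothetical and do not come from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    label: str  # "relevant", "falsely_relevant", or "irrelevant" (hypothetical labels)


def build_training_sample(question, answer, relevant, falsely_relevant,
                          irrelevant, n_distractors=2, seed=0):
    """Mix distractor documents into the context of one fine-tuning sample.

    Assumption: the model is trained to produce `answer` even though the
    context contains documents that do not support it.
    """
    rng = random.Random(seed)  # seeded so the sample is reproducible
    pool = [Doc(t, "falsely_relevant") for t in falsely_relevant]
    pool += [Doc(t, "irrelevant") for t in irrelevant]
    # Draw distractors without replacement, capped by the pool size.
    distractors = rng.sample(pool, k=min(n_distractors, len(pool)))
    docs = [Doc(t, "relevant") for t in relevant] + distractors
    rng.shuffle(docs)  # avoid leaking relevance through document position
    context = "\n".join(f"[{i + 1}] {d.text}" for i, d in enumerate(docs))
    return {
        "prompt": f"Documents:\n{context}\n\nQuestion: {question}",
        "target": answer,
        "doc_labels": [d.label for d in docs],
    }


# Usage with made-up example texts (not taken from the paper's data).
sample = build_training_sample(
    question="Which vitamin deficiency causes scurvy?",
    answer="Vitamin C",
    relevant=["Scurvy results from a lack of vitamin C (ascorbic acid)."],
    falsely_relevant=["Rickets is associated with vitamin D deficiency."],
    irrelevant=["The Eiffel Tower is located in Paris."],
)
print(sample["prompt"])
```

Keeping the per-document labels alongside the prompt is a design choice of this sketch; it would let a training or evaluation script check whether the model relied on the supporting document or on a distractor.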

Keywords

large language models (LLMs); retrieval-augmented generation (RAG); domain-specific language models

Big Data Mining and Analytics

Cite this article:

Cui L, Liu Y, Ouyang C, et al. Bailicai: A Domain-Optimized Retrieval-Augmented Generation Framework for Medical Applications. Big Data Mining and Analytics, 2025, https://doi.org/10.26599/BDMA.2024.9020097