Open Access

Large Language Models in Psychiatry: Current Applications, Limitations, and Future Scope

Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, and School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China
Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun 130024, China


Abstract

With advances in Artificial Intelligence (AI), Large Language Models (LLMs) offer outstanding capabilities for natural language understanding and generation and are enhancing a wide range of domains. In psychiatry, their strength in understanding and generating human-like text enables LLMs to support healthcare by analyzing vast amounts of medical data to improve diagnostic accuracy, enhance therapeutic communication, and personalize patient care. Developing and utilizing robust, interpretable models has been a longstanding challenge in clinical AI. This survey examines current applications of LLMs in psychiatric practice, together with a series of corpus resources that could be used to train psychiatric LLMs. We discuss limitations concerning LLM reproducibility, capabilities, usability, and interpretability in clinical settings, as well as ethical considerations. We further propose potential directions for future research, clinical application, and education involving psychiatric LLMs. Finally, we discuss the challenge of integrating LLMs into the evolving landscape of real-world healthcare.

Electronic Supplementary Material

BDMA-2024-0111-ESM.xlsx (40.4 KB)


Big Data Mining and Analytics
Pages 1148-1168
Cite this article:
Liu Z, Bao Y, Zeng S, et al. Large Language Models in Psychiatry: Current Applications, Limitations, and Future Scope. Big Data Mining and Analytics, 2024, 7(4): 1148-1168. https://doi.org/10.26599/BDMA.2024.9020046


Received: 20 February 2024
Revised: 30 May 2024
Accepted: 07 July 2024
Published: 04 December 2024
© The author(s) 2024.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
