Survey

A Communication Theory Perspective on Prompting Engineering Methods for Large Language Models

AI Group, WeBank Co., Ltd, Shenzhen 518000, China

Abstract

The springing up of large language models (LLMs) has shifted the community from single-task-oriented natural language processing (NLP) research to a holistic end-to-end multi-task learning paradigm. Along this line of research, LLM-based prompting methods have attracted much attention, partially due to the technological advantages brought by prompt engineering (PE) as well as the underlying NLP principles disclosed by various prompting methods. Traditional supervised learning usually requires training a model on labeled data and then making predictions. In contrast, PE methods directly use the powerful capabilities of existing LLMs (e.g., GPT-3 and GPT-4) by composing appropriate prompts, especially under few-shot or zero-shot scenarios. Facing the abundance of studies related to prompting and the ever-evolving nature of this field, this article aims to 1) illustrate a novel perspective for reviewing existing PE methods within the well-established communication theory framework, 2) facilitate a deeper understanding of the developing trends of existing PE methods used in three typical tasks, and 3) shed light on promising research directions for future PE methods.
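
To make the contrast in the abstract concrete, the following Python sketch shows how a prompt engineering approach composes zero-shot and few-shot prompts for an existing LLM instead of training a task-specific model. It is a minimal illustration rather than the authors' method: the sentiment task, the prompt wording, and the llm_complete stub (which stands in for whatever LLM API would actually be queried) are all assumptions made for this example.

# Minimal sketch (illustrative only): composing zero-shot and few-shot prompts
# for a sentiment-classification task. `llm_complete` is a hypothetical stub
# standing in for a real LLM API call (e.g., to GPT-3/GPT-4); only the
# prompt-construction step is shown.

def llm_complete(prompt: str) -> str:
    # Placeholder: a real implementation would send the prompt to an LLM
    # and return the generated continuation.
    raise NotImplementedError("plug in an actual LLM client here")

def zero_shot_prompt(text: str) -> str:
    # Zero-shot: the task is described purely in natural language,
    # with no labeled examples and no parameter updates.
    return ("Classify the sentiment of the following review as Positive or Negative.\n"
            f"Review: {text}\n"
            "Sentiment:")

def few_shot_prompt(text: str, examples: list[tuple[str, str]]) -> str:
    # Few-shot: a handful of labeled demonstrations are prepended to the query
    # instead of being used to train or fine-tune a model.
    demos = "\n".join(f"Review: {x}\nSentiment: {y}" for x, y in examples)
    return ("Classify the sentiment of each review as Positive or Negative.\n"
            f"{demos}\n"
            f"Review: {text}\n"
            "Sentiment:")

if __name__ == "__main__":
    demos = [("Great battery life.", "Positive"),
             ("Screen cracked in a week.", "Negative")]
    print(zero_shot_prompt("The camera is superb."))
    print(few_shot_prompt("The camera is superb.", demos))
    # llm_complete(few_shot_prompt(...)) would then return the predicted label.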

Journal of Computer Science and Technology
Pages 984-1004
Cite this article:
Song Y-F, He Y-Q, Zhao X-F, et al. A Communication Theory Perspective on Prompting Engineering Methods for Large Language Models. Journal of Computer Science and Technology, 2024, 39(4): 984-1004. https://doi.org/10.1007/s11390-024-4058-8


Received: 21 December 2023
Accepted: 12 April 2024
Published: 20 September 2024
© Institute of Computing Technology, Chinese Academy of Sciences 2024