Publishing Language: Chinese

Large language models and their application in government affairs

Yun WANG1,2, Min HU2, Na TA3 (corresponding author), Haitao SUN2, Yifeng GUO2, Wuai ZHOU2, Yu GUO2, Wanzhe ZHANG2, Jianhua FENG1
1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
2. China Mobile Information System Integration Co., Ltd., Beijing 100032, China
3. School of Journalism and Communication, Renmin University of China, Beijing 100872, China

Abstract

Significance

Since the turn of the 21st century, artificial intelligence (AI) has advanced considerably in many domains, including government affairs. Furthermore, the emergence of deep learning has taken the development of many AI fields, including natural language processing (NLP), to a new level. Language models (LMs) are a key research direction of NLP. Early LMs were statistical models used to calculate the probability of a sentence; in recent years, however, large language models (LLMs) have developed substantially. Notably, LLM products such as the generative pretrained transformer (GPT) series have driven a rapid revolution in LLM research. Chinese enterprises have also developed LLMs, for example, Huawei's Pangu and Baidu's ERNIE Bot. These models have been widely applied to machine translation, text summarization, named-entity recognition, text classification, and relation extraction, among other tasks, and in government affairs, finance, biomedicine, and other domains.
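As a concrete illustration of this statistical view (a standard textbook formulation added here for clarity, not a formula from the paper), an LM assigns a probability to a sentence $w_1, \dots, w_T$ via the chain rule, and an n-gram model approximates each conditional probability with a truncated history:

$$P(w_1, \dots, w_T) \;=\; \prod_{t=1}^{T} P(w_t \mid w_1, \dots, w_{t-1}) \;\approx\; \prod_{t=1}^{T} P(w_t \mid w_{t-n+1}, \dots, w_{t-1}).$$

Neural and large LMs keep the same objective but parameterize the conditional probabilities with neural networks rather than counted n-gram statistics.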

Progress

In this study, we observe that improving the efficiency of governance has become one of the core tasks of government in the era of big data. With the continuous accumulation of government data, traditional statistical models that rely on expert experience and local features increasingly show their limitations in practice. LLMs, by contrast, offer high flexibility, strong representation ability, and effective results, and can rapidly raise the level of intelligence of government services. First, we review the research progress on early LMs, such as statistical LMs and neural network LMs. Subsequently, we focus on the research progress on LLMs, namely the Transformer, GPT, and bidirectional encoder representations from transformers (BERT) series. Finally, we introduce the application of LLMs in government affairs, including government text classification, relation extraction, public opinion risk identification, named-entity recognition, and government question answering. Moreover, we propose that research on LLMs for government affairs must focus on multimodality, properly leverage the trend of "model as a service," ensure high data security, and clarify the boundaries of government responsibility. Additionally, we propose a technical path for studying LLMs for government affairs.
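To make the government text classification task above concrete, the following minimal sketch fine-tunes a Chinese BERT-series model with the Hugging Face Transformers library; the model name, label set, and sample work-order texts are illustrative assumptions rather than details taken from the studies surveyed here.

# Hypothetical sketch: fine-tuning a Chinese BERT-series model for
# government work-order classification with Hugging Face Transformers.
# The model name, label set, and sample texts are illustrative assumptions.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

LABELS = ["transport", "housing", "education", "healthcare"]  # assumed label set
MODEL_NAME = "bert-base-chinese"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME, num_labels=len(LABELS))

class WorkOrderDataset(torch.utils.data.Dataset):
    """Wraps (text, label) pairs as tokenized tensors for the Trainer."""
    def __init__(self, texts, labels):
        self.enc = tokenizer(texts, truncation=True, padding=True, max_length=128)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

# Toy data; a real system would load labeled hotline/work-order records.
train_ds = WorkOrderDataset(
    ["地铁站附近夜间施工噪音扰民", "老旧小区加装电梯政策咨询"], [0, 1])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train_ds,
)
trainer.train()

A real deployment would replace the toy examples with labeled hotline or work-order records and add an evaluation split; the same "pretraining + fine-tuning" pattern carries over to named-entity recognition and relation extraction by swapping the task head.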

Conclusions and Prospects

The application of LLMs in government affairs has so far mainly involved small-scale models, with few examples of large-scale models in use. Compared with smaller models, large models offer many advantages, including higher efficiency, broader application scenarios, and greater convenience. These advantages can be understood as follows. In terms of efficiency, large models are usually trained on large amounts of heterogeneous data and thus deliver better performance. In terms of application scenarios, large models increasingly support multimodal data, resulting in more diverse applications. In terms of convenience, the "pretraining + fine-tuning" mode and the interface-invocation method make LLMs easier to adopt in both research and practice. This study also analyzes the problems LLMs face from the technological and ethical perspectives, which have caused a degree of public concern. For example, ChatGPT has generated many controversies, including whether the generated content is original, whether using ChatGPT amounts to plagiarism, and who owns the intellectual property rights to the generated content. Overall, LLMs are in a stage of vigorous development. As the country promotes research on AI and its application in government affairs, LLMs will play an increasingly crucial role in this field.
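The interface-invocation ("model as a service") mode mentioned above can be sketched as follows; the endpoint URL, model identifier, and OpenAI-style response shape are assumptions made for illustration, not an interface described in the paper.

# Hypothetical sketch of the "model as a service" mode: a government Q&A
# front end calls a hosted LLM over HTTP instead of training its own model.
# The endpoint, token, model name, and response format are assumptions.
import json
import urllib.request

API_URL = "https://example.gov.cn/llm/v1/chat"   # placeholder endpoint
API_TOKEN = "REPLACE_WITH_REAL_TOKEN"

def ask_government_qa(question: str) -> str:
    payload = {
        "model": "gov-assistant",  # assumed model identifier
        "messages": [
            {"role": "system", "content": "你是政务服务问答助手。"},
            {"role": "user", "content": question},
        ],
    }
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {API_TOKEN}"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]  # assumed OpenAI-style shape

if __name__ == "__main__":
    print(ask_government_qa("办理护照需要准备哪些材料？"))

Calling a hosted interface avoids local training and serving costs, but it also underscores the data-security and responsibility-boundary concerns raised above, since citizen queries leave the government's own infrastructure.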

CLC number: G250 Document code: A Article ID: 1000-0054(2024)04-0649-10


Journal of Tsinghua University (Science and Technology)
Pages 649-658
Cite this article:
WANG Y, HU M, TA N, et al. Large language models and their application in government affairs. Journal of Tsinghua University (Science and Technology), 2024, 64(4): 649-658. https://doi.org/10.16511/j.cnki.qhdxxb.2023.26.042