Research Article | Open Access

Automatic construction accident report analysis using large language models (LLMs)

Ehsan Ahmadi, Shashank Muley, Chao Wang

Bert S. Turner Department of Construction Management, Louisiana State University, Baton Rouge 70803, USA

Abstract

Construction site safety is a paramount concern, given the high rate of accidents and fatalities in the sector. This study introduces a novel approach to analyzing construction accident reports with advanced large language models (LLMs), specifically generative pre-trained transformer (GPT)-3.5, GPT-4.0, Gemini Pro, and Large Language Model Meta AI (LLaMA) 3.1. Our research focuses on classifying five key attributes in accident reports: root cause, injury cause, affected body part, severity, and accident time. The results reveal that GPT-4.0 achieves the highest accuracy on most attributes (root cause, body part, and accident time). Gemini Pro performs best on the “injury cause” classification, while LLaMA 3.1 excels in classifying “severity” and “root cause”. GPT-3.5, although lagging behind GPT-4.0, still exhibits commendable accuracy. The insights gained from this study are vital for the construction industry, as they indicate the potential for developing more precise and effective safety measures. These findings could lead to a reduction in the frequency and severity of accidents, thereby enhancing worker safety.

Keywords

construction safety; accident report analysis; large language models; GPT; Gemini; LLaMA

Journal of Intelligent Construction

Volume 3, Issue 1, March 2025

Article number: 9180039

DOI: 10.26599/JIC.2024.9180039

Cite this article:

Ahmadi E, Muley S, Wang C. Automatic construction accident report analysis using large language models (LLMs). Journal of Intelligent Construction, 2025, 3(1): 9180039. https://doi.org/10.26599/JIC.2024.9180039

Table 1. Example annotation of an accident report

Attribute	Detail
Report	At 1:00 pm on January 25, 2017, an employee was working in a suspended ceiling grid installing fire sprinkler piping. The employee came into contact with live electrical wiring and was pulled off a ladder by his foreman, resulting in both falling to the floor. The employee injured his shoulder on impact with the concrete floor
Injury cause	Fall
Root cause	Electrocution
Body part	Shoulder
Severity	Nonfatal
Accident time	1:00 pm
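
The annotation in Table 1 can be held as a small record and compared field by field against a model's answer. The sketch below is illustrative only: the `Annotation` class, the `normalize` and `matches` helpers, and the `hypothetical` prediction are assumptions made for this example and are not taken from the study.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class Annotation:
    """One labelled accident report, using the five attributes of Table 1."""
    injury_cause: str
    root_cause: str
    body_part: str
    severity: str
    accident_time: str


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so 'Fall' and ' fall ' compare equal."""
    return " ".join(value.lower().split())


def matches(gold: Annotation, predicted: Annotation) -> dict[str, bool]:
    """Return, per attribute, whether the prediction equals the gold label."""
    gold_fields = asdict(gold)
    predicted_fields = asdict(predicted)
    return {
        name: normalize(gold_fields[name]) == normalize(predicted_fields[name])
        for name in gold_fields
    }


# Gold labels copied from Table 1.
table1 = Annotation("Fall", "Electrocution", "Shoulder", "Nonfatal", "1:00 pm")

# Hypothetical model output (not from the study) that confuses the root cause
# with the injury cause.
hypothetical = Annotation("fall", "fall", "shoulder", "nonfatal", "1:00 PM")

result = matches(table1, hypothetical)
print(result)
```

Exact matching after normalization is the simplest scoring rule. Free-text attributes such as body part may need a more forgiving comparison, which this sketch does not attempt.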

Table 2. Classification prompts used in the study

Attribute	Prompt
Injury cause	Determine the injury cause of the accident in the report. Your answer should be strictly one of the following: “electrocution”, “struck by”, “fall”, or “caught in/between” without any additional text or explanations
Root cause	Determine the root cause of the accident in the report. Your answer should be strictly one of the following: “struck by”, “caught in/between”, “fall”, “electrocution”, or “unspecified” without any additional text and explanations
Body part	Determine the main body part affected in the accident. Provide only and strictly the main body part affected without any additional text or explanations. If the information is not available, say “unspecified”
Severity	Determine the severity of the incident in the report. Your answer should be strictly one of the following: “fatal” or “nonfatal” without any additional text or explanations
Accident time	Determine the accident time of the accident in the report. The answer should strictly be in the format HH:MM am/pm. If the information is not available, say “unspecified”. Do not include the date and any other additional text and explanations
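
Closed-set prompts like those in Table 2 pair naturally with a check that the model's reply is one of the permitted labels. The following is a minimal sketch under stated assumptions: the label sets are copied from the prompts, but `build_prompt`, `parse_label`, the paraphrased prompt wording, and the way the report text is appended are hypothetical choices for illustration. No model API is called here.

```python
from typing import Optional

# Label sets as listed in the prompts for the closed-set attributes.
ALLOWED_LABELS = {
    "injury cause": ["electrocution", "struck by", "fall", "caught in/between"],
    "root cause": ["struck by", "caught in/between", "fall", "electrocution", "unspecified"],
    "severity": ["fatal", "nonfatal"],
}


def build_prompt(attribute: str, report: str) -> str:
    """Assemble a closed-set classification prompt followed by the report text."""
    options = ", ".join(f'"{label}"' for label in ALLOWED_LABELS[attribute])
    return (
        f"Determine the {attribute} of the accident in the report. "
        f"Your answer should be strictly one of the following: {options} "
        "without any additional text or explanations.\n\n"
        f"Report: {report}"
    )


def parse_label(attribute: str, response: str) -> Optional[str]:
    """Return the label if the reply is an allowed option, otherwise None."""
    # Drop surrounding whitespace, quotes, and a trailing period before matching.
    cleaned = response.strip().strip("\"'“”.").lower()
    return cleaned if cleaned in ALLOWED_LABELS[attribute] else None


prompt = build_prompt("injury cause", "An employee fell from a ladder.")
print(prompt)
print(parse_label("injury cause", ' "Fall". '))
print(parse_label("severity", "The worker died"))
```

Returning `None` for an off-list reply keeps invalid answers visible, so they can be counted as errors or retried rather than silently mapped to a label.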

Table 3. Accuracy results

Attribute	GPT-3.5 (%)	GPT-4.0 (%)	Gemini Pro (%)	LLaMA 3.1-70B (%)	BERT, RoBERTa, DeBERTa (%)
Injury cause	91.85	94.02	96.74	96.20	—
Root cause	94.57	97.83	89.67	95.65	83.30, 83.80, 83.62
Body part	79.89	99.46	88.04	66.30	—
Severity	88.04	94.57	86.96	99.46	—
Accident time	94.57	100.00	17.93	100.00	—
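
To read Table 3 at a glance, the short sketch below finds the top-scoring LLM for each attribute, keeping ties. The accuracy values are copied from the table; the `ACCURACY` dictionary and the `best_models` helper are names introduced for this example, and the encoder baseline column is left out because it reports only root cause.

```python
# Accuracy (%) per attribute and model, copied from Table 3.
ACCURACY = {
    "Injury cause": {"GPT-3.5": 91.85, "GPT-4.0": 94.02, "Gemini Pro": 96.74, "LLaMA 3.1-70B": 96.20},
    "Root cause": {"GPT-3.5": 94.57, "GPT-4.0": 97.83, "Gemini Pro": 89.67, "LLaMA 3.1-70B": 95.65},
    "Body part": {"GPT-3.5": 79.89, "GPT-4.0": 99.46, "Gemini Pro": 88.04, "LLaMA 3.1-70B": 66.30},
    "Severity": {"GPT-3.5": 88.04, "GPT-4.0": 94.57, "Gemini Pro": 86.96, "LLaMA 3.1-70B": 99.46},
    "Accident time": {"GPT-3.5": 94.57, "GPT-4.0": 100.00, "Gemini Pro": 17.93, "LLaMA 3.1-70B": 100.00},
}


def best_models(scores: dict[str, float]) -> list[str]:
    """Return every model that reaches the highest accuracy (ties included)."""
    top = max(scores.values())
    return [model for model, value in scores.items() if value == top]


best = {attribute: best_models(scores) for attribute, scores in ACCURACY.items()}

for attribute, models in best.items():
    top_score = ACCURACY[attribute][models[0]]
    print(f"{attribute}: {', '.join(models)} ({top_score:.2f}%)")
```

Keeping ties as a list matters for accident time, where two models both reach 100.00%.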