Research Article | Open Access

Automatic construction accident report analysis using large language models (LLMs)

Ehsan Ahmadi, Shashank Muley, Chao Wang
Bert S. Turner Department of Construction Management, Louisiana State University, Baton Rouge 70803, USA

Abstract

Construction site safety is a paramount concern, given the high rate of accidents and fatalities in the sector. This study introduces a novel approach to analyzing construction accident reports by employing advanced large language models (LLMs), specifically generative pre-trained transformer (GPT)-3.5, GPT-4.0, Gemini Pro, and large language model Meta artificial intelligence (AI) (LLaMA) 3.1. Our research focuses on the classification of key attributes in accident reports: root cause, injury cause, affected body part, severity, and accident time. The results reveal that GPT-4.0 achieves significantly higher accuracy across most attributes. Gemini Pro demonstrates superior performance in the “injury cause” classification, while LLaMA 3.1 excels in classifying “severity” and “root cause”. GPT-3.5, although lagging behind GPT-4.0, exhibits commendable accuracy. The insights gained from this study are vital for the construction industry, as they indicate the potential for developing more precise and effective safety measures. These findings could lead to a reduction in the frequency and severity of accidents, thereby enhancing worker safety.
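As a rough illustration of the evaluation the abstract describes, the sketch below builds a zero-shot classification prompt for one attribute and scores a model's answers per attribute against reference labels. It is a minimal sketch under stated assumptions, not the authors' pipeline: the prompt wording, the function names `build_prompt` and `per_attribute_accuracy`, the label values, and the toy data are all hypothetical, and no model is actually called.

```python
from collections import defaultdict

# The five attributes named in the abstract.
ATTRIBUTES = [
    "root cause",
    "injury cause",
    "affected body part",
    "severity",
    "accident time",
]


def build_prompt(report_text, attribute, labels):
    """Build a zero-shot classification prompt (hypothetical wording)."""
    options = ", ".join(labels)
    return (
        f"Read the construction accident report below and classify its {attribute}.\n"
        f"Answer with exactly one of: {options}.\n\n"
        f"Report: {report_text}"
    )


def per_attribute_accuracy(predictions, gold):
    """Return {attribute: fraction of reports whose prediction matches gold}.

    predictions and gold are lists of dicts (attribute -> label),
    aligned by report index. Matching ignores case and outer whitespace.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, truth in zip(predictions, gold):
        for attribute, true_label in truth.items():
            total[attribute] += 1
            if pred.get(attribute, "").strip().lower() == true_label.strip().lower():
                correct[attribute] += 1
    return {attribute: correct[attribute] / total[attribute] for attribute in total}


# Toy reference labels and toy model answers (illustrative only, not study data).
gold = [
    {"severity": "fatal", "accident time": "morning"},
    {"severity": "non-fatal", "accident time": "afternoon"},
]
model_outputs = [
    {"severity": "Fatal", "accident time": "afternoon"},
    {"severity": "non-fatal", "accident time": "afternoon"},
]

scores = per_attribute_accuracy(model_outputs, gold)
prompt = build_prompt("Worker fell from scaffold.", "severity", ["fatal", "non-fatal"])
```

Running the same scoring function over each model's outputs would give one accuracy value per model and attribute, which is the kind of comparison the abstract reports.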

Journal of Intelligent Construction
Article number: 9180039
Cite this article:
Ahmadi E, Muley S, Wang C. Automatic construction accident report analysis using large language models (LLMs). Journal of Intelligent Construction, 2025, 3(1): 9180039. https://doi.org/10.26599/JIC.2024.9180039