Shen, J; Liu, S and Zhang, J (2024) Using text mining and Bayesian network to identify key risk factors for safety accidents in metro construction. Journal of Construction Engineering and Management, 150(6), ISSN 0733-9364
Abstract
Metro construction is prone to safety accidents of many types because the underlying risk factors are complex. Accident reports record detailed information about these accidents in text form, but making effective use of such unstructured data is a significant challenge. Text mining (TM) provides a viable foundation for addressing this challenge, yet related studies are limited in risk feature extraction and lack in-depth analysis capability. To address these deficiencies and provide a feasible strategy for identifying key risk factors in the metro construction domain, this paper proposes an integrated model combining TM and machine learning-based Bayesian networks. First, the term frequency-inverse document frequency (TF-IDF) algorithm in TM was used to extract the direct and indirect cause factors separately from the accident reports, with missing factors supplemented using the TextRank algorithm. Then, depending on whether conditional independence between factors is assumed, an improved naive Bayesian network (NBN) and a tree-augmented naive Bayesian network (TAN), respectively, were built from the extracted factors and the corresponding accident types for further in-depth analysis. Finally, the data set was divided to train and test the two network models, and sensitivity analysis was used to identify the key risk factors. Using 162 accident reports from China as an application example, the results showed that TAN achieved a higher average accuracy on the test set (79.62%) than the improved NBN (71.75%), and that TAN could rank the importance of risk factors for different accident types from multiple perspectives. The analysis also yielded some new insights into metro accidents in China, which can support decision-making for accident prevention and control. In conclusion, this paper addresses the relevant limitations of accident text utilization and presents a novel approach for metro construction safety management.
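The TF-IDF step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `tfidf`, the toy `reports` data, and the plain `tf * log(N / df)` weighting are assumptions made for illustration, and the sketch assumes the reports have already been segmented into tokens.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses the plain form tf * log(N / df); the paper may use a
    different variant (this choice is an assumption).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-tokenized snippets standing in for accident reports.
reports = [
    ["scaffold", "collapse", "inadequate", "inspection"],
    ["crane", "overload", "inadequate", "training"],
    ["excavation", "collapse", "heavy", "rain"],
]

for index, doc_weights in enumerate(tfidf(reports)):
    # Show the two highest-weighted candidate factors per report.
    top = sorted(doc_weights.items(), key=lambda kv: kv[1], reverse=True)[:2]
    print(index, top)
```

Terms that appear in few reports receive higher weights than terms shared across many reports, which is why TF-IDF is a common first pass for surfacing candidate cause factors before a ranking method such as TextRank fills gaps.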
| Item Type: | Article |
|---|---|
| Date Deposited: | 11 Apr 2025 19:51 |
| Last Modified: | 11 Apr 2025 19:51 |