Contextualizing injury severity from occupational accident reports using an optimized deep learning prediction model
PeerJ Computer Science (IF 3.8), Pub Date: 2024-04-17, DOI: 10.7717/peerj-cs.1985
Mohamed Zul Fadhli Khairuddin 1 , Suresh Sankaranarayanan 2 , Khairunnisa Hasikin 3 , Nasrul Anuar Abd Razak 3 , Rosidah Omar 4

Background: This study introduces a novel approach for predicting occupational injury severity that applies deep learning-based text classification to unstructured injury narratives. Unlike conventional methods that rely on structured data, our approach draws on the information contained in injury narrative descriptions, with the aim of improving occupational injury severity assessment.

Methods: Natural language processing (NLP) techniques were used to preprocess occupational injury narratives obtained from the US Occupational Safety and Health Administration (OSHA), covering January 2015 to June 2023. Preprocessing standardized the text and removed noise; the narratives were then represented by combining Term Frequency-Inverse Document Frequency (TF-IDF) weighting with Global Vectors (GloVe) word embeddings. The proposed predictive model adopts a Bidirectional Long Short-Term Memory (Bi-LSTM) architecture and is further refined through model optimization, including random-search hyperparameter tuning and feature importance analysis. The optimized Bi-LSTM model was compared and validated against other machine learning classifiers: naïve Bayes, support vector machine, random forest, decision tree, and K-nearest neighbor.

Results: The optimized Bi-LSTM model showed superior predictive performance, with an accuracy of 0.95 for hospitalization cases and 0.98 for amputation cases, along with faster model processing times. The feature importance analysis revealed predictive keywords related to the causal factors of occupational injuries, which provides insight that enhances model interpretability.

Conclusion: The proposed optimized Bi-LSTM model offers safety and health practitioners an effective tool to support proactive workplace safety measures, thereby contributing to business productivity and sustainability. This study lays the foundation for further exploration of predictive analytics in the occupational safety and health domain.
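
The abstract names preprocessing and TF-IDF as the first steps of the text representation but does not give their details. The following is a minimal sketch of those two steps only, using the Python standard library. The stop-word list, the `preprocess` and `tf_idf` helpers, the specific TF-IDF variant, and the sample narratives are all assumptions made for illustration; they are not taken from the paper or from OSHA data, and the GloVe and Bi-LSTM stages are not shown.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "and", "of", "to", "in", "on",
              "was", "from", "after"}


def preprocess(narrative):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", narrative.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Made-up narratives, not OSHA records.
narratives = [
    "An employee caught a finger in a conveyor and the finger was amputated.",
    "An employee fell from a ladder and was hospitalized.",
    "The employee was hospitalized after a fall on the stairs.",
]
tokenized = [preprocess(n) for n in narratives]
scores = tf_idf(tokenized)
```

Under this weighting, a term that appears in every narrative (here `employee`) gets a weight of zero, while terms concentrated in a single narrative (here `finger`) score highest. This is the general property that lets TF-IDF surface outcome-specific keywords.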

Updated: 2024-04-17