A comparative study of machine learning models for respiration rate prediction in dairy cows: Exploring algorithms, feature engineering, and model interpretation
Biosystems Engineering ( IF 5.1 ) Pub Date : 2024-02-29 , DOI: 10.1016/j.biosystemseng.2024.01.010
Geqi Yan, Wanying Zhao, Chaoyuan Wang, Zhengxiang Shi, Hao Li, Zhenwei Yu, Hongchao Jiao, Hai Lin

The respiration rate (RR) of dairy cows is a crucial welfare indicator for assessing heat stress in cows exposed to high temperatures. Machine learning (ML) models can automatically identify patterns among the factors related to cow RR. This study used ML methods to build a predictive model for cow RR from variables that are easily accessible in real production settings. Using a cleaned dataset of 2977 records, 20 ML algorithms, including linear regression and neural networks, were compared to evaluate their performance in predicting cow RR and to investigate how different inputs and feature engineering techniques affect algorithm performance. The best-performing model was the CATBOOST algorithm with environmental parameters as input features under ordinal encoding, which achieved a coefficient of determination (R²) of 0.676, a mean absolute error (MAE) of 7.246, a mean absolute percentage error (MAPE) of 13.8%, and a root mean square error (RMSE) of 9.341. There was no statistically significant difference in model performance between environmental parameters, heat indices, and heat flows as input features. Polynomial features, PCA-based dimensionality reduction, and filter-based feature selection may significantly reduce the performance of the ML-based cow RR model. Additionally, the SHAP analysis of the optimal model identified air temperature, black globe temperature, and airflow speed as the top three factors contributing to the prediction of cow RR. These findings can offer valuable guidance for the design and regulation of dairy farm environmental control systems.
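
As a reading aid for the four reported metrics, the following minimal sketch shows how R², MAE, MAPE, and RMSE are conventionally computed from paired observed and predicted values. It is not the authors' code: the function name `regression_metrics` and the sample respiration rates are hypothetical, made up purely for illustration, and the abstract does not state which software was used for evaluation.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, MAPE (in percent) and RMSE for paired observations.

    Assumes y_true is not constant (otherwise R^2 is undefined) and contains
    no zeros (otherwise MAPE is undefined); both hold for respiration rates.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n,
        "rmse": math.sqrt(ss_res / n),
    }


# Hypothetical respiration rates, NOT data from the study.
observed = [40.0, 50.0, 60.0, 80.0]
predicted = [44.0, 47.0, 65.0, 74.0]
metrics = regression_metrics(observed, predicted)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

RMSE squares each error before averaging, so it weights large misses more heavily than MAE does. This is consistent with the reported RMSE (9.341) being larger than the reported MAE (7.246).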

Updated: 2024-02-29