Study on the influence of input variables on the supervised machine learning model for landslide susceptibility mapping
Environmental Earth Sciences ( IF 2.8 ) Pub Date : 2024-03-09 , DOI: 10.1007/s12665-024-11501-9
Peng Lai , Fei Guo , Xiaohu Huang , Dongwei Zhou , Li Wang , Guangfu Chen

Supervised machine learning (ML) models are currently popular in landslide susceptibility mapping (LSM). However, the input variables of these models have inherent limitations: raw input variables do not express the nonlinear relationship between environmental factors and landslides, and discrete-value and frequency-ratio inputs require continuous environmental factors to be discretized, which discards a significant amount of information. To address these issues, this paper adopts a new method, the neighborhood frequency ratio, for obtaining input variables. The study compared four types of input variables across seven supervised ML models, giving 28 conditions, and evaluated the prediction results with ROC (receiver operating characteristic) curves. The AUC (area under the curve) values ranged from 0.8223 to 0.9928, which shows that the choice of input variables strongly affects the evaluation model. The experimental results were analyzed from the perspective of algorithm principles and data characteristics. The main conclusions are as follows: (1) For non-tree models (i.e., models other than tree models), the neighborhood frequency ratio of environmental factors should be used as the model input. (2) For tree models (i.e., decision trees and decision-tree-based integrated (ensemble) models), the raw values of environmental factors can be used directly as the input of the LSM model. (3) The decision-tree-based integrated models yielded better prediction results.
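To make the discretization issue and the AUC comparison concrete, the following is a minimal sketch and not the authors' implementation. It computes the classical frequency ratio of a continuous factor after equal-width binning, then scores both the raw values and the frequency-ratio values with a rank-based AUC. The abstract does not define the neighborhood frequency ratio, so it is not reproduced here. The function names, the equal-width binning, and the toy slope data are all assumptions made for illustration.

```python
import numpy as np


def frequency_ratio(factor, landslide, n_bins=5):
    """Classical frequency ratio of a continuous factor (illustrative only).

    FR(bin) = (share of landslide cells in bin) / (share of all cells in bin).
    Equal-width binning is an assumption; the paper may bin differently.
    """
    factor = np.asarray(factor, dtype=float)
    landslide = np.asarray(landslide, dtype=bool)
    edges = np.linspace(factor.min(), factor.max(), n_bins + 1)
    # Digitizing against the interior edges yields bin indices 0..n_bins-1.
    bins = np.digitize(factor, edges[1:-1])
    cells_per_bin = np.bincount(bins, minlength=n_bins)
    slides_per_bin = np.bincount(
        bins, weights=landslide.astype(float), minlength=n_bins
    )
    share_of_slides = slides_per_bin / landslide.sum()
    share_of_cells = cells_per_bin / factor.size
    # Empty bins get FR = 0 instead of a division-by-zero warning.
    fr = np.divide(
        share_of_slides,
        share_of_cells,
        out=np.zeros(n_bins),
        where=cells_per_bin > 0,
    )
    return fr[bins], fr, edges


def auc_score(labels, scores):
    """AUC as the probability that a landslide cell outranks a stable cell."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()  # ties count as half
    return (greater + 0.5 * ties) / (pos.size * neg.size)


# Hypothetical toy data: slope angle per cell and a landslide indicator.
slope = [5, 10, 15, 20, 25, 30, 35, 40]
slide = [0, 0, 0, 0, 1, 0, 1, 1]

fr_per_cell, fr_per_bin, bin_edges = frequency_ratio(slope, slide, n_bins=2)
print("FR per bin:", fr_per_bin)
print("AUC with raw slope:", auc_score(slide, slope))
print("AUC with FR input:", auc_score(slide, fr_per_cell))
```

With only two bins, every cell in a bin receives the same frequency-ratio value, so the ordering among cells inside a bin is lost. That is the information loss from discretization that the abstract describes.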




Updated: 2024-03-11