Abstract
Supervised machine learning (ML) models are widely used in landslide susceptibility mapping (LSM). However, the input variables of these models have inherent limitations: raw input variables do not express the nonlinear relationship between environmental factors and landslides, while discrete and frequency ratio input variables require continuous environmental factors to be discretized, which discards a significant amount of information. To address these issues, this paper adopts a new method, the neighborhood frequency ratio, for deriving input variables. The study compared four input variables and seven supervised ML models under 28 conditions, using receiver operating characteristic (ROC) curves to evaluate the predictions. The area under the curve (AUC) values ranged from 0.8223 to 0.9928, which shows that the choice of input variables strongly affects model performance. The experimental results were analyzed from the perspective of algorithm principles and data characteristics. The main conclusions are as follows: (1) for non-tree models (i.e., models other than tree models), the neighborhood frequency ratio of the environmental factors should be used as the model input; (2) for tree models (i.e., decision trees and decision-tree-based ensemble models), the raw values of the environmental factors can be used directly as the input of the LSM model; (3) the decision-tree-based ensemble models yielded better prediction results.
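The contrast between raw and frequency ratio inputs can be illustrated with a small sketch. It computes the classical frequency ratio on a discretized factor, a window-based variant, and the AUC of each input on its own. The abstract does not define the neighborhood frequency ratio, so `window_frequency_ratio` and its `half_width` parameter are assumptions, one plausible reading and not the authors' definition. The slope values, labels, and class edges are invented for illustration.

```python
from bisect import bisect_right


def binned_frequency_ratio(values, labels, edges):
    """Classical frequency ratio for each class of a discretized factor.

    FR(class) = (landslide cells in class / all landslide cells)
                / (cells in class / all cells)
    """
    n_classes = len(edges) + 1
    cells = [0] * n_classes
    slides = [0] * n_classes
    for v, y in zip(values, labels):
        k = bisect_right(edges, v)  # index of the class that contains v
        cells[k] += 1
        slides[k] += y
    total_cells, total_slides = len(values), sum(labels)
    return [
        (slides[k] / total_slides) / (cells[k] / total_cells) if cells[k] else 0.0
        for k in range(n_classes)
    ]


def window_frequency_ratio(values, labels, half_width):
    """ASSUMPTION: a hypothetical neighborhood variant, not the paper's formula.

    The ratio is computed in a window centred on each cell's own value,
    so no fixed class boundaries are needed.
    """
    total_cells, total_slides = len(values), sum(labels)
    out = []
    for v in values:
        inside = [y for u, y in zip(values, labels) if abs(u - v) <= half_width]
        out.append((sum(inside) / total_slides) / (len(inside) / total_cells))
    return out


def auc(scores, labels):
    """AUC as the probability that a landslide cell outranks a
    non-landslide cell, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration: slope in degrees, 1 = landslide cell.
slope = [5, 8, 12, 18, 22, 25, 31, 35, 41, 47]
label = [0, 0, 0, 1, 1, 1, 0, 1, 0, 0]
edges = [15, 30]  # hypothetical class boundaries

fr_classes = binned_frequency_ratio(slope, label, edges)
fr_binned = [fr_classes[bisect_right(edges, s)] for s in slope]
fr_window = window_frequency_ratio(slope, label, half_width=7)

print("AUC, raw slope:", round(auc(slope, label), 4))
print("AUC, binned FR:", round(auc(fr_binned, label), 4))
print("AUC, window FR:", round(auc(fr_window, label), 4))
```

On this toy data landslides cluster at intermediate slopes, so the raw value ranks cells poorly (AUC of about 0.54) while the binned frequency ratio ranks them well (AUC of about 0.94). The ratios here are computed from the same labels they are scored against, so the numbers are in-sample and only illustrate the encoding, not a validated result.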
Data availability
Data will be made available on request.
Acknowledgements
We sincerely thank Professor Li Huizhong and Ye Shengsheng of Three Gorges Geotechnical Consultants Co., Ltd, for providing the landslide data. This work was supported by the National Natural Science Foundation of China (no. 42107489), the Open Fund of Badong National Observation and Research Station of Geohazards (no. BNORSG-202304), the Natural Science Foundation of Hubei Province (no. 2022CFB557), the Open Fund of Key Laboratory of Geological Hazards on Three Gorges Reservoir Area (China Three Gorges University) of Ministry of Education (no. 2022KDZ14) and the 111 Project of Hubei Province (grant number 2021EJD026).
Funding
Grant recipient: Guo Fei. Funding was provided by the National Natural Science Foundation of China (No. 42107489), the Open Fund of Badong National Observation and Research Station of Geohazards (No. BNORSG-202304), the Natural Science Foundation of Hubei Province (No. 2022CFB557), the Open Fund of Key Laboratory of Geological Hazards on Three Gorges Reservoir Area (China Three Gorges University) of Ministry of Education (No. 2022KDZ14) and the 111 Project of Hubei Province (Grant Number 2021EJD026).
Author information
Contributions
Conceptualization, Peng Lai, Fei Guo; Supervision, Fei Guo; Software, Peng Lai, Dongwei Zhou, Li Wang; Writing - original draft, Peng Lai, Xiaohu Huang, Guangfu Chen; Writing - review and editing, Fei Guo, Li Wang, Guangfu Chen. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lai, P., Guo, F., Huang, X. et al. Study on the influence of input variables on the supervised machine learning model for landslide susceptibility mapping. Environ Earth Sci 83, 174 (2024). https://doi.org/10.1007/s12665-024-11501-9
DOI: https://doi.org/10.1007/s12665-024-11501-9