Study on the influence of input variables on the supervised machine learning model for landslide susceptibility mapping

  • Original Article
Environmental Earth Sciences

Abstract

Supervised machine learning (ML) models are currently popular in landslide susceptibility mapping (LSM). However, the input variables of these models have two inherent limitations. First, the raw input variables lack a nonlinear relationship with landslides. Second, discrete and frequency ratio input variables require continuous environmental factors to be discretized, which loses a significant amount of information. To address these issues, this paper adopts a new method, the neighborhood frequency ratio, for obtaining input variables. The study compared four input variables and seven supervised ML models under 28 conditions, using receiver operating characteristic (ROC) curves to evaluate the prediction results. The area under the curve (AUC) values, which range from 0.8223 to 0.9928, show that the input variables are very important to the evaluation model. The experimental results were analyzed from the perspective of algorithm principles and data characteristics. The main conclusions are as follows: (1) For non-tree models (i.e., models other than tree models), the neighborhood frequency ratio of environmental factors should be used as the model input. (2) For tree models (i.e., decision trees and decision-tree-based integrated models), the raw values of environmental factors can be used directly as the inputs of the LSM model. (3) The decision-tree-based integrated models yielded better prediction results.
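As a rough illustration of two quantities named in the abstract, the sketch below computes the classical frequency ratio of a discretized factor and the AUC of the resulting scores. It is not the authors' implementation. The abstract does not define the neighborhood frequency ratio, so only the classical form is shown; the function names (`frequency_ratio`, `auc`) and the eight-cell toy data are made up for illustration.

```python
from collections import Counter


def frequency_ratio(classes, landslide):
    """Classical frequency ratio of each class of a discretized factor.

    classes   : class label of each grid cell
    landslide : 1 if the cell contains a landslide, else 0
    FR(c) = (landslide cells in c / all landslide cells)
          / (cells in c / all cells)
    """
    total_cells = len(classes)
    total_slides = sum(landslide)
    cells_per_class = Counter(classes)
    slides_per_class = Counter(c for c, y in zip(classes, landslide) if y == 1)
    return {
        c: (slides_per_class[c] / total_slides) / (n / total_cells)
        for c, n in cells_per_class.items()
    }


def auc(labels, scores):
    """Area under the ROC curve in its rank (Mann-Whitney) form.

    Equals the probability that a randomly chosen landslide cell scores
    higher than a randomly chosen non-landslide cell; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: slope class and landslide label for eight cells.
slope_class = ["low", "low", "low", "low", "mid", "mid", "high", "high"]
landslide = [0, 0, 0, 1, 0, 1, 1, 1]

fr = frequency_ratio(slope_class, landslide)
fr_input = [fr[c] for c in slope_class]  # FR value replaces the class label

print(fr)                        # {'low': 0.5, 'mid': 1.0, 'high': 2.0}
print(auc(landslide, fr_input))  # 0.8125
```

A frequency ratio above 1 marks a class in which landslides are over-represented relative to its area. Every cell in the same class receives the same value, which is the information loss from discretization that the abstract describes.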

Data availability

Data will be made available on request.

Acknowledgements

We sincerely thank Professor Li Huizhong and Ye Shengsheng of Three Gorges Geotechnical Consultants Co., Ltd, for supplying the landslide data. This work was supported by the National Natural Science Foundation of China (No. 42107489), the Open Fund of Badong National Observation and Research Station of Geohazards (No. BNORSG-202304), the Natural Science Foundation of Hubei Province (No. 2022CFB557), the Open Fund of Key Laboratory of Geological Hazards on Three Gorges Reservoir Area (China Three Gorges University) of Ministry of Education (No. 2022KDZ14) and the 111 Project of Hubei Province (Grant Number 2021EJD026).

Funding

Grant recipient: Guo Fei. Funding was provided by the National Natural Science Foundation of China (No. 42107489), the Open Fund of Badong National Observation and Research Station of Geohazards (No. BNORSG-202304), the Natural Science Foundation of Hubei Province (No. 2022CFB557), the Open Fund of Key Laboratory of Geological Hazards on Three Gorges Reservoir Area (China Three Gorges University) of Ministry of Education (No. 2022KDZ14) and the 111 Project of Hubei Province (Grant Number 2021EJD026).

Author information

Contributions

Conceptualization: Peng Lai, Fei Guo; Supervision: Fei Guo; Software: Peng Lai, Dongwei Zhou, Li Wang; Writing (original draft): Peng Lai, Xiaohu Huang, Guangfu Chen; Writing (review and editing): Fei Guo, Li Wang, Guangfu Chen. All authors reviewed the manuscript.

Corresponding author

Correspondence to Fei Guo.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lai, P., Guo, F., Huang, X. et al. Study on the influence of input variables on the supervised machine learning model for landslide susceptibility mapping. Environ Earth Sci 83, 174 (2024). https://doi.org/10.1007/s12665-024-11501-9

  • DOI: https://doi.org/10.1007/s12665-024-11501-9
