Snapshot ensemble-based residual network (SnapEnsemResNet) for remote sensing image scene classification

GeoInformatica

Abstract

Owing to their exceptional discriminative ability, convolutional neural networks (CNNs) have been the center of attention in research on scene classification in remote sensing imagery (RSI). The scarcity of large-scale remote sensing scene classification datasets has kept researchers from realizing the full potential of deep CNN models such as ResNets. Because deeper networks tend to overfit limited training data, effective techniques are needed both to counter overfitting and to address the inter-class similarity and intra-class diversity challenges in RSI. This research therefore proposes a snapshot ensemble-based residual network (SnapEnsemResNet), which consists of two sub-networks (FC-1024 and Dilated-Conv) designed to realize the full potential of ResNets. The FC-1024 architecture targets overfitting by adding an extra fully connected layer to the existing ResNet architecture so that regularization techniques can be applied effectively, which improves the generalization ability of the network. The Dilated-Conv architecture focuses on extracting more descriptive features by introducing an additional dilated convolutional layer in the final convolution block, which helps minimize inter-class similarity. To further enhance the performance of the individual sub-networks, SnapEnsemResNet is integrated with a two-tier snapshot-based ensembling strategy called ensembling the ensembled snapshots. The final class label is predicted by majority voting. Comparing the classification performance of SnapEnsemResNet with state-of-the-art methods on the challenging NWPU-RESISC45 scene classification benchmark, we obtained competitive accuracy for a training ratio of 20% and a new top performance for a training ratio of 10%.
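The majority voting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each snapshot outputs one predicted class index per image. The function names, the tie-breaking rule (the smallest class index wins), and the example predictions are hypothetical and not taken from the paper, and the sketch does not reproduce how the two ensembling tiers are arranged.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(votes: Sequence[int]) -> int:
    """Return the most frequent label; ties go to the smallest label (an assumption)."""
    counts = Counter(votes)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


def ensemble_predict(per_model_labels: Sequence[Sequence[int]]) -> List[int]:
    """Combine the label predictions of several models, image by image.

    per_model_labels[m][i] is the class index predicted by model m for image i.
    """
    # zip(*...) regroups the predictions so that each tuple holds all votes for one image.
    return [majority_vote(image_votes) for image_votes in zip(*per_model_labels)]


# Hypothetical predictions from three snapshots for four images.
snapshots = [
    [3, 7, 7, 1],
    [3, 2, 7, 0],
    [5, 2, 7, 4],
]
print(ensemble_predict(snapshots))  # [3, 2, 7, 0]
```

Voting on hard labels ignores how confident each snapshot is; averaging the softmax outputs before taking the arg max is a common alternative, but the abstract names majority voting, so the sketch follows that.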

Data availability

The dataset analysed during the current study is introduced in [49] and is available in the tensorflow_datasets repository, https://www.tensorflow.org/datasets/catalog/resisc45
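As a rough sketch of how the dataset might be loaded for the 10% and 20% training ratios mentioned in the abstract, the snippet below builds percent-slice split strings and passes them to `tfds.load`. The helper names `split_spec` and `load_resisc45` are hypothetical. The sketch assumes the catalog exposes a single `train` split for `resisc45` and that the source archive has already been obtained as described on the catalog page. Percent slicing takes a contiguous portion of the split and is not stratified by class, so it is not necessarily the partition used in the paper.

```python
from typing import Tuple


def split_spec(train_percent: int) -> Tuple[str, str]:
    """Build TFDS split strings for a train/test partition of the 'train' split."""
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 1 and 99")
    return (f"train[:{train_percent}%]", f"train[{train_percent}%:]")


def load_resisc45(train_percent: int = 10):
    """Load RESISC45 as (image, label) pairs, split by the given training ratio."""
    # Imported lazily so that split_spec can be used without TensorFlow installed.
    import tensorflow_datasets as tfds

    train_split, test_split = split_spec(train_percent)
    # Passing a list of splits returns one dataset per split, in the same order.
    return tfds.load("resisc45", split=[train_split, test_split], as_supervised=True)


print(split_spec(10))  # ('train[:10%]', 'train[10%:]')
```

Calling `load_resisc45(20)` would request the 20% training ratio instead.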

References

  1. Hu Q et al (2013) Exploring the use of Google Earth imagery and object-based methods in land use/cover mapping. Remote Sensing 5(11):6026–6042

  2. Gómez-Chova L, Tuia D, Moser G, Camps-Valls G (2015) Multimodal classification of remote sensing images: A review and future directions. Proc IEEE 103(9):1560–1584

  3. Longbotham N, Chaapel C, Bleiler L, Padwick C, Emery WJ, Pacifici F (2011) Very high resolution multiangle urban classification analysis. IEEE Trans Geosci Remote Sens 50(4):1155–1170

  4. Huang X, Wen D, Li J, Qin R (2017) Multi-level monitoring of subtle urban changes for the megacities of China using high-resolution multi-view satellite imagery. Remote Sens Environ 196:56–75

  5. Zhang T, Huang X (2018) Monitoring of urban impervious surfaces using time series of high-resolution remote sensing images in rapidly urbanized areas: A case study of Shenzhen. IEEE J Sel Top Appl Earth Observations Remote Sensing 11(8):2692–2708

  6. Li X, Shao G (2013) Object-based urban vegetation mapping with high-resolution aerial photography as a single data source. Int J Remote Sens 34(3):771–789

  7. Leitloff J, Hinz S, Stilla U (2010) Vehicle detection in very high resolution satellite images of city areas. IEEE Trans Geosci Remote Sens 48(7):2795–2806

  8. Janssen LL, Middelkoop H (1992) Knowledge-based crop classification of a Landsat Thematic Mapper image. Int J Remote Sens 13(15):2827–2837

  9. Ghamisi P, Plaza J, Chen Y, Li J, Plaza AJ (2017) Advanced spectral classifiers for hyperspectral images: A review. IEEE Geosci Remote Sens Mag 5(1):8–32

  10. He L, Li J, Liu C, Li S (2017) Recent advances on spectral–spatial hyperspectral image classification: An overview and new guidelines. IEEE Trans Geosci Remote Sens 56(3):1579–1597

  11. Yan G, Mas JF, Maathuis B, Xiangmin Z, Van Dijk P (2006) Comparison of pixel-based and object-oriented image classification approaches—a case study in a coal fire area, Wuda, Inner Mongolia, China. Int J Remote Sens 27(18):4039–4055

  12. Blaschke T (2010) Object based image analysis for remote sensing. ISPRS J Photogramm Remote Sens 65(1):2–16

  13. Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 25:1097–1105

  14. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations

  15. Szegedy C et al (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9

  16. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

  17. Goodfellow I, Bengio Y, Courville A (2016) Deep learning (Adaptive Computation and Machine Learning series). MIT Press, Cambridge, MA

  18. Molnar C, Casalicchio G, Bischl B (2021) Interpretable machine learning–a brief history, state-of-the-art and challenges. In: ECML PKDD 2020 Workshops: Workshops of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2020): SoGood 2020, PDFL 2020, MLCS 2020, NFMCP 2020, DINA 2020, EDML 2020, XKDD 2020 and INRA 2020, Ghent, Belgium, September 14–18, 2020, Proceedings, pp 417–431: Springer

  19. Schölkopf B (2022) Causality for machine learning. In: Probabilistic and Causal Inference: The Works of Judea Pearl, pp 765–804

  20. Yuan K et al (2022) Causality guided machine learning model on wetland CH4 emissions across global wetlands. Agric For Meteorol 324:109115

  21. Zhou Z-H, Wu J, Tang W (2002) Ensembling neural networks: many could be better than all. Artif Intell 137(1–2):239–263

  22. Ganaie M, Hu M (2021) Ensemble deep learning: a review. Eng Appl Artif Intell 115:105151

  23. Seijo-Pardo B, Porto-Díaz I, Bolón-Canedo V, Alonso-Betanzos A (2017) Ensemble feature selection: homogeneous and heterogeneous approaches. Knowl-Based Syst 118:124–139

  24. Huang G, Li Y, Pleiss G, Liu Z, Hopcroft JE, Weinberger KQ (2017) Snapshot ensembles: train 1, get M for free. In: International Conference on Learning Representations

  25. Dede MA, Aptoula E, Genc Y (2018) Deep network ensembles for aerial scene classification. IEEE Geosci Remote Sens Lett 16(5):732–735

  26. Birodkar V, Lu Z, Li S, Rathod V, Huang J (2021) The surprising impact of mask-head architecture on novel class segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp 7015–7025

  27. He N, Fang L, Li S, Plaza A, Plaza J (2018) Remote sensing scene classification using multilayer stacked covariance pooling. IEEE Trans Geosci Remote Sens 56(12):6899–6910

  28. Wang Q, Xie J, Zuo W, Zhang L, Li P (2020) Deep CNNs meet global covariance pooling: better representation and generalization. IEEE Trans Pattern Anal Mach Intell 43(8):2582–2597

  29. He N, Fang L, Li S, Plaza J, Plaza A (2019) Skip-connected covariance network for remote sensing scene classification. IEEE Trans Neural Netw Learn Syst 31(5):1461–1474

  30. Liu Y, Suen CY, Liu Y, Ding L (2018) Scene classification using hierarchical Wasserstein CNN. IEEE Trans Geosci Remote Sens 57(5):2494–2509

  31. Zhang Z, Sabuncu M (2018) Generalized cross entropy loss for training deep neural networks with noisy labels. Adv Neural Inf Process Syst 31

  32. Cheng G, Yang C, Yao X, Guo L, Han J (2018) When deep learning meets metric learning: Remote sensing image scene classification via learning discriminative CNNs. IEEE Trans Geosci Remote Sens 56(5):2811–2821

  33. Liu X, Zhou Y, Zhao J, Yao R, Liu B, Zheng Y (2019) Siamese convolutional neural networks for remote sensing scene classification. IEEE Geosci Remote Sens Lett 16(8):1200–1204

  34. Wang J, Liu W, Ma L, Chen H, Chen L (2018) IORN: An effective remote sensing image scene classification framework. IEEE Geosci Remote Sens Lett 15(11):1695–1699

  35. Castelluccio M, Poggi G, Sansone C, Verdoliva L (2015) Land use classification in remote sensing images by convolutional neural networks. arXiv preprint arXiv:1508.00092

  36. Xie J, He N, Fang L, Plaza A (2019) Scale-free convolutional neural network for remote sensing scene classification. IEEE Trans Geosci Remote Sens 57(9):6916–6928

  37. Guo D, Xia Y, Luo X (2020) Scene classification of remote sensing images based on saliency dual attention residual network. IEEE Access 8:6344–6357

  38. Zhang W, Tang P, Zhao L (2019) Remote sensing image scene classification using CNN-CapsNet. Remote Sensing 11(5):494

  39. Deng F, Pu S, Chen X, Shi Y, Yuan T, Pu S (2018) Hyperspectral image classification with capsule network using limited training samples. Sensors 18(9):3153

  40. Demertzis K, Iliadis L, Pimenidis E (2020) Large-scale geospatial data analysis: Geographic object-based scene classification in remote sensing images by GIS and deep residual learning. In: Proceedings of the 21st EANN (Engineering Applications of Neural Networks) 2020 Conference: Proceedings of the EANN 2020 21, pp 274–291: Springer

  41. Annavarapu CSR (2021) Deep learning-based improved snapshot ensemble technique for COVID-19 chest X-ray classification. Appl Intell 51:3104–3120

  42. Hansen LK, Salamon P (1990) Neural network ensembles. IEEE Trans Pattern Anal Mach Intell 12(10):993–1001

  43. Minetto R, Segundo MP, Sarkar S (2019) Hydra: An ensemble of convolutional neural networks for geospatial land classification. IEEE Trans Geosci Remote Sens 57(9):6530–6541

  44. Basha SS, Dubey SR, Pulabaigari V, Mukherjee S (2020) Impact of fully connected layers on performance of convolutional neural networks for image classification. Neurocomputing 378:112–119

  45. Awais M, Iqbal MTB, Bae S-H (2020) Revisiting internal covariate shift for batch normalization. IEEE Trans Neural Netw Learn Syst 32(11):5082–5092

  46. Ioffe S, Szegedy C (2015) Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp 448–456: PMLR

  47. Santurkar S, Tsipras D, Ilyas A, Madry A (2018) How does batch normalization help optimization? Adv Neural Inf Process Syst 31

  48. Dauphin Y, Cubuk ED (2021) Deconstructing the regularization of BatchNorm. In: International Conference on Learning Representations

  49. Cheng G, Han J, Lu X (2017) Remote sensing image scene classification: Benchmark and state of the art. Proc IEEE 105(10):1865–1883

  50. Deng J, Dong W, Socher R, Li L-J, Li K, Fei-Fei L (2009) Imagenet: A large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition, pp 248–255: IEEE

  51. Li X, Chen S, Hu X, Yang J (2019) Understanding the disharmony between dropout and batch normalization by variance shift. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 2682–2690

  52. Krogh A, Hertz JA (1991) A simple weight decay can improve generalization. Adv Neural Inf Process Syst 4

  53. Chollet F (2017) Xception: Deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1251–1258

  54. Shorten C, Khoshgoftaar TM (2019) A survey on image data augmentation for deep learning. J Big Data 6(1):1–48

  55. Zhong Z, Zheng L, Kang G, Li S, Yang Y (2020) Random erasing data augmentation. Proc AAAI Conf Artif Intell 34(07):13001–13008

  56. Lei X, Pan H, Huang X (2019) A dilated CNN model for image classification. IEEE Access 7:124087–124095

  57. Schaul T, Zhang S, LeCun Y (2013) No more pesky learning rates. In: International Conference on Machine Learning, pp 343–351: PMLR

  58. Grandini M, Bagli E, Visani G (2020) Metrics for multi-class classification: an overview. arXiv preprint arXiv:2008.05756

  59. Xia G-S et al (2017) AID: A benchmark data set for performance evaluation of aerial scene classification. IEEE Trans Geosci Remote Sens 55(7):3965–3981

  60. Li F et al (2020) A hierarchical temporal attention-based LSTM encoder-decoder model for individual mobility prediction. Neurocomputing 403:153–166

  61. Wang F, Jiang M, Qian C et al (2017) Residual attention network for image classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 3156–3164

  62. Roy SK, Manna S, Song T, Bruzzone L (2020) Attention-based adaptive spectral–spatial kernel ResNet for hyperspectral image classification. IEEE Trans Geosci Remote Sens 59(9):7831–7843

Author information

Corresponding author

Correspondence to Khurram Khan.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Siddiqui, M.I., Khan, K., Fazil, A. et al. Snapshot ensemble-based residual network (SnapEnsemResNet) for remote sensing image scene classification. Geoinformatica 27, 341–372 (2023). https://doi.org/10.1007/s10707-023-00492-7

  • DOI: https://doi.org/10.1007/s10707-023-00492-7
