MSCNet: Dense vehicle counting method based on multi-scale dilated convolution channel-aware deep network

Research article, published in GeoInformatica.

Abstract

Accurately counting dense objects, such as crowds or vehicles, in an image is a challenging and meaningful task widely used in public safety management and traffic flow prediction. Existing CNN-based density map estimation methods are ineffective at extracting counting features of long-distance queuing vehicles in traffic jams. In addition, these methods do not focus on counting in complex scenes, such as vehicle counting in mixed human-vehicle scenes. To tackle these issues, we propose MSCNet, a novel multi-scale dilated convolution channel-aware deep network for vehicle counting. The proposed network addresses the scale variation of long-distance queuing vehicles and improves the extraction of vehicle features in mixed human-vehicle scenes. MSCNet consists of a front-end module and three functional modules: the front-end module extracts the initial features of the counting image; the direction-based perspective coding module (DPCM) encodes the perspective information of the image from four directions to extract continuous long-distance features; the multi-scale dilated residual module (MDRM) densely extracts features with large scale variation; and the channel-aware attention module (CAM) enhances the channel features that are important for vehicle counting in mixed human-vehicle scenes. We conducted extensive comparative experiments on the TRANCOS dataset, the VisDrone2021 Vehicle&Crowd dataset, and the ShanghaiTech dataset. The experimental results show that MSCNet outperforms state-of-the-art counting networks for dense vehicle counting, especially in mixed human-vehicle scenes.
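
To make the architecture described above concrete, below is a minimal, illustrative PyTorch sketch of the four named components (front-end, DPCM, MDRM, CAM) feeding a 1x1 density head whose spatial sum gives the predicted count. The internal layouts are assumptions for illustration only: the dilation rates, the squeeze-and-excitation form of the channel attention, the strip-convolution approximation of the four-directional perspective coding, and the class names such as MSCNetSketch are not taken from the paper.

```python
# Illustrative sketch of the modules named in the abstract; not the authors' implementation.
import torch
import torch.nn as nn


class MDRM(nn.Module):
    """Multi-scale dilated residual module: parallel dilated convolutions plus a residual add."""
    def __init__(self, channels, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d) for d in dilations
        )
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        multi = torch.cat([self.relu(b(x)) for b in self.branches], dim=1)
        return self.relu(x + self.fuse(multi))  # residual connection preserves the input features


class CAM(nn.Module):
    """Channel-aware attention, sketched here as squeeze-and-excitation style channel reweighting."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # emphasize channels relevant to vehicles in mixed scenes


class DPCM(nn.Module):
    """Direction-based perspective coding, approximated with horizontal and vertical strip
    convolutions applied in both directions (left/right, up/down)."""
    def __init__(self, channels, k=9):
        super().__init__()
        self.h_conv = nn.Conv2d(channels, channels, (1, k), padding=(0, k // 2))
        self.v_conv = nn.Conv2d(channels, channels, (k, 1), padding=(k // 2, 0))
        self.fuse = nn.Conv2d(channels * 2, channels, 1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        h = self.relu(self.h_conv(x) + torch.flip(self.h_conv(torch.flip(x, dims=[3])), dims=[3]))
        v = self.relu(self.v_conv(x) + torch.flip(self.v_conv(torch.flip(x, dims=[2])), dims=[2]))
        return self.relu(self.fuse(torch.cat([h, v], dim=1)))


class MSCNetSketch(nn.Module):
    """Front-end feature extractor followed by DPCM, MDRM, CAM, and a density map head."""
    def __init__(self, channels=64):
        super().__init__()
        self.frontend = nn.Sequential(  # stand-in for a deeper VGG-style front end
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.dpcm, self.mdrm, self.cam = DPCM(channels), MDRM(channels), CAM(channels)
        self.head = nn.Conv2d(channels, 1, 1)  # per-pixel density

    def forward(self, x):
        f = self.cam(self.mdrm(self.dpcm(self.frontend(x))))
        return self.head(f)


if __name__ == "__main__":
    density = MSCNetSketch()(torch.randn(1, 3, 256, 256))
    print(density.shape, density.sum().item())  # predicted count = sum of the density map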

Data availability

All data created or used during this study are publicly available at the following websites: https://gram.web.uah.es/data/datasets/trancos/index.html, https://opendatalab.com/VisDrone, and https://github.com/desenzhou/ShanghaiTechDataset.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 62076117) and the Jiangxi Key Laboratory of Smart City, China (Grant No. 20192BCD40002).

Author information

Contributions

Qiyan Fu: Conceptualization, Methodology, Software, Validation, Investigation, Formal Analysis, Writing—Original Draft; Weidong Min (Corresponding Author): Conceptualization, Funding Acquisition, Resources, Supervision, Writing—Review & Editing; Chunbo Li: Data Curation, Visualization, Writing—Review & Editing; Haoyu Zhao: Resources, Investigation; Ye Cao: Writing—Review & Editing; Meng Zhu: Validation; All authors reviewed the manuscript.

Corresponding author

Correspondence to Weidong Min.

Ethics declarations

Competing interests

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fu, Q., Min, W., Li, C. et al. MSCNet: Dense vehicle counting method based on multi-scale dilated convolution channel-aware deep network. GeoInformatica 28, 245–269 (2024). https://doi.org/10.1007/s10707-023-00503-7
