Static video summarization with multi-objective constrained optimization

Dhanushree, M.; Priya, R.; Aruna, P.; Bhavani, R.

doi:10.1007/s12652-024-04777-z

Original Research
Published: 01 April 2024

Journal of Ambient Intelligence and Humanized Computing, Volume 15, pages 2621–2639 (2024)

M. Dhanushree¹ (ORCID: orcid.org/0000-0003-1747-7617), R. Priya¹, P. Aruna¹ & R. Bhavani¹

Abstract

Video summarization is an emerging research field. In particular, static video summarization plays a major role in the abstraction and indexing of video repositories. It extracts the vital events in a video so that the summary covers the video's entire content. Frames containing those important events are called keyframes, and they are eventually used in video indexing. A static summary also gives an abstract view of the video content, so that internet users are aware of the events present in the video before watching it completely. The proposed research work focuses on efficient static video summarization by extracting various visual features, namely color, texture and shape features. These features are aggregated and clustered using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. To produce a good video summary by clustering, the parameters of the DBSCAN algorithm are optimized using a metaheuristic, population-based optimization method called the Artificial Algae Algorithm (AAA). Experimental results on two public datasets, namely VSUMM and OVP, show that the proposed Static Video Summarization with Multi-objective Constrained Optimization (SVS_MCO) achieves better results than existing methods.
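The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional feature vectors are toy values standing in for aggregated color, texture and shape features, the rule of picking the frame closest to each cluster mean is an assumption made for illustration, and `eps` and `min_pts` are set by hand here rather than optimized by AAA. The function names `region_query`, `dbscan` and `select_keyframes` are hypothetical.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

def select_keyframes(features, labels):
    """Assumed rule: per cluster, pick the frame closest to the cluster mean."""
    clusters = {}
    for idx, lab in enumerate(labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(idx)
    keyframes = []
    for members in clusters.values():
        dim = len(features[members[0]])
        centroid = [sum(features[m][d] for m in members) / len(members)
                    for d in range(dim)]
        keyframes.append(min(members, key=lambda m: math.dist(features[m], centroid)))
    return sorted(keyframes)

# Toy per-frame feature vectors (hypothetical values, one tuple per frame).
frame_features = [
    (0.10, 0.10), (0.12, 0.11), (0.11, 0.10),   # frames 0-2: first scene
    (0.80, 0.81), (0.78, 0.80), (0.82, 0.83),   # frames 3-5: second scene
    (0.45, 0.95),                               # frame 6: isolated frame
]
labels = dbscan(frame_features, eps=0.1, min_pts=2)
keyframes = select_keyframes(frame_features, labels)
print(labels)     # [0, 0, 0, 1, 1, 1, -1]
print(keyframes)  # [2, 3]
```

In this toy run the isolated frame is labeled as noise and contributes no keyframe. The outcome depends on `eps` and `min_pts`, which is the reason the abstract gives for optimizing these parameters.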

Data availability

The datasets analyzed during the current study are available at the following link: https://sites.google.com/site/vsummsite/download

Funding

This study was funded by the Ministry of Social Justice and Empowerment of the Government of India.

Author information

R. Priya, P. Aruna and R. Bhavani contributed equally to this work.

Authors and Affiliations

Department of Computer Science and Engineering, Annamalai University, Annamalainagar, Chidambaram, Tamil Nadu, 608001, India
M. Dhanushree, R. Priya, P. Aruna & R. Bhavani

Corresponding author

Correspondence to M. Dhanushree.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dhanushree, M., Priya, R., Aruna, P. et al. Static video summarization with multi-objective constrained optimization. J Ambient Intell Human Comput 15, 2621–2639 (2024). https://doi.org/10.1007/s12652-024-04777-z

Received: 16 April 2023
Accepted: 22 February 2024
Published: 01 April 2024
Issue Date: April 2024
DOI: https://doi.org/10.1007/s12652-024-04777-z
