Abstract
Video summarization is an emerging research field. In particular, static video summarization plays a major role in the abstraction and indexing of video repositories. It extracts the vital events in a video so that together they cover the video's entire content. Frames containing these important events are called keyframes, and they are subsequently used for video indexing. A static summary also gives an abstract view of the video content, so that internet users know which events a video contains before watching it in full. The proposed work focuses on efficient static video summarization by extracting three kinds of visual features: color, texture, and shape. These features are aggregated and clustered using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. To produce a good summary from the clustering, the parameters of DBSCAN are optimized with a population-based metaheuristic called the Artificial Algae Algorithm (AAA). Experimental results on two public datasets, VSUMM and OVP, show that the proposed Static Video Summarization with Multi-objective Constrained Optimization (SVS_MCO) achieves better results than existing methods.
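The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional toy vectors stand in for the aggregated color, texture, and shape features, the values of eps and min_pts are fixed by hand rather than tuned by AAA, and choosing the frame nearest to each cluster mean as the keyframe is an assumption made for illustration. The function names are hypothetical.

```python
import math
from collections import deque

NOISE = -1  # label for frames that belong to no dense region


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, NOISE for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbours)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


def select_keyframes(features, labels):
    """Assumed rule: per cluster, pick the frame closest to the cluster mean."""
    keyframes = []
    dim = len(features[0])
    for c in sorted(set(labels) - {NOISE}):
        members = [i for i, label in enumerate(labels) if label == c]
        centroid = [sum(features[i][d] for i in members) / len(members)
                    for d in range(dim)]
        keyframes.append(min(members,
                             key=lambda i: math.dist(features[i], centroid)))
    return keyframes


# Toy per-frame feature vectors (hypothetical values, one tuple per frame).
features = [
    (0.10, 0.10), (0.11, 0.11), (0.12, 0.12),  # frames 0-2: first scene
    (0.80, 0.80), (0.81, 0.81), (0.82, 0.82),  # frames 3-5: second scene
    (0.50, 0.05),                              # frame 6: isolated frame
]
labels = dbscan(features, eps=0.05, min_pts=2)  # hand-picked parameters
keyframes = select_keyframes(features, labels)
print(labels)     # [0, 0, 0, 1, 1, 1, -1]
print(keyframes)  # [1, 4]
```

Each dense group of similar frames yields one keyframe, and the isolated frame is treated as noise and contributes none. The result depends strongly on eps and min_pts, which is why the abstract describes tuning these parameters with an optimizer instead of setting them manually.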
Data availability
The datasets analyzed during the current study are available at https://sites.google.com/site/vsummsite/download.
Funding
This study was funded by the Ministry of Social Justice and Empowerment of the Government of India.
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
About this article
Cite this article
Dhanushree, M., Priya, R., Aruna, P. et al. Static video summarization with multi-objective constrained optimization. J Ambient Intell Human Comput 15, 2621–2639 (2024). https://doi.org/10.1007/s12652-024-04777-z
DOI: https://doi.org/10.1007/s12652-024-04777-z