research-article

An Optimal Edge-weighted Graph Semantic Correlation Framework for Multi-view Feature Representation Learning

Authors:
Lei Gao, Toronto Metropolitan University, Toronto, Canada (ORCID: 0000-0001-5583-713X)
Zheng Guo, Toronto Metropolitan University, Toronto, Canada (ORCID: 0009-0005-9990-4580)
Ling Guan, Toronto Metropolitan University, Toronto, Canada (ORCID: 0000-0002-2681-2504)

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 7, Article No. 200, pp. 1–23. https://doi.org/10.1145/3649466

Published: 25 April 2024

Abstract

In this article, we present an optimal edge-weighted graph semantic correlation (EWGSC) framework for multi-view feature representation learning. Unlike most existing multi-view representation methods, the EWGSC framework jointly exploits local structural information and global correlation in multi-view feature spaces, leading to a new, high-quality multi-view feature representation. Specifically, a novel edge-weighted graph model is first conceptualized and developed to preserve local structural information in each of the multi-view feature spaces. The explored structural information is then integrated with a semantic correlation algorithm, labeled multiple canonical correlation analysis (LMCCA), to form a powerful platform for jointly exploiting local and global relations across multi-view feature spaces. We then theoretically verify the relation between the upper limit on the number of projected dimensions and the optimal solution to the multi-view feature representation problem. To validate the effectiveness and generality of the proposed framework, we conduct experiments on five datasets of different scales, including visual-based (University of California Irvine (UCI) iris database, Olivetti Research Lab (ORL) face database, and Caltech 256 database), text-image-based (Wiki database), and video-based (Ryerson Multimedia Lab (RML) audio-visual emotion database) examples. The experimental results show the superiority of the proposed framework over state-of-the-art algorithms for multi-view feature representation.
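
To make the idea of preserving local structure in each view more concrete, the following is a minimal sketch of one common way to build a per-view weighted neighborhood graph and its Laplacian. It is not the article's edge-weighting model, which the abstract does not specify: the k-nearest-neighbor construction, the heat-kernel weights, and the names `knn_heat_kernel_graph`, `k`, and `sigma` are all assumptions, borrowed from the general style of locality preserving projections.

```python
import numpy as np

def knn_heat_kernel_graph(X, k=3, sigma=1.0):
    """Build a symmetric kNN graph with heat-kernel edge weights for one view.

    ASSUMPTION: kNN + heat-kernel weighting is a generic stand-in, not the
    EWGSC edge-weighting scheme. X has shape (n_samples, n_features), k < n_samples.
    Returns (W, L): the weight matrix and the unnormalized Laplacian L = D - W.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances, clipped at zero for numerical safety.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbor

    # Indices of the k nearest neighbors of every sample.
    nbrs = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()

    # Heat-kernel weight on each selected edge; all other entries stay zero.
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / (2.0 * sigma**2))
    W = np.maximum(W, W.T)  # keep an edge if either endpoint selected it

    # Unnormalized graph Laplacian: degree matrix minus weight matrix.
    L = np.diag(W.sum(axis=1)) - W
    return W, L

# Two synthetic "views" of the same 20 samples (hypothetical data).
rng = np.random.default_rng(0)
views = [rng.normal(size=(20, 4)), rng.normal(size=(20, 6))]
laplacians = [knn_heat_kernel_graph(V, k=3)[1] for V in views]
```

Each Laplacian in `laplacians` is symmetric and positive semidefinite, so it could serve as a local-structure regularizer alongside a cross-view correlation objective; how the article actually couples the graph model with LMCCA is described in the full text, not here.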

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Knowledge representation and reasoning
  2. Machine learning
    1. Machine learning approaches
      1. Classification and regression trees
2. Mathematics of computing
  1. Probability and statistics
    1. Probabilistic algorithms

Published in

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 20, Issue 7
July 2024
463 pages
ISSN:1551-6857
EISSN:1551-6865
DOI:10.1145/3613662

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 25 April 2024
- Online AM: 27 February 2024
- Accepted: 11 February 2024
- Revised: 10 February 2024
- Received: 5 February 2023
Author Tags
Multi-view feature representation
graph model
semantic correlation
data visualization
face recognition
emotion recognition
text-image recognition
object recognition
Qualifiers
- research-article