Abstract
With the development of data collection technologies, large volumes of multiview data have appeared, and clustering such data has become a topical problem. Most multiview clustering methods assume that all views are fully observed; however, in many applications this is not the case. Several tensor-based methods have been proposed to handle incomplete multiview data, but the traditional tensor norm is computationally expensive, and such methods generally cannot handle undersampled and imbalanced views. A new method for clustering incomplete multiview data is proposed. A new tensor norm is defined to reconstruct the connectivity graphs, and the graphs are regularized toward a consistent low-dimensional representation of the samples; the view weights are then updated iteratively. Compared to existing methods, the proposed one not only captures the consistency between views but also obtains a low-dimensional representation of the samples using the resulting projection matrix. An efficient optimization algorithm based on the augmented Lagrange multiplier method is developed for the solution. Experimental results on four data sets demonstrate the effectiveness of the method.
Notes
This definition does not coincide with the generally accepted definition in mathematics, since neither a coordinate system nor a transformation law under its change is specified. However, such an understanding of the term is accepted in computer science.
REFERENCES
J. Zhao, X. Xie, X. Xu, and S. Sun, “Multi-view learning overview: Recent progress and new challenges,” Inf. Fusion 38, 43–54 (2017).
Y. Liu, L. Fan, C. Zhang, T. Zhou, Z. Xiao, L. Geng, and D. Shen, “Incomplete multi-modal representation learning for Alzheimer’s disease diagnosis,” Med. Image Anal. 69, 101953 (2021).
L. Qiao, L. Zhang, S. Chen, and D. Shen, “Data-driven graph construction and graph learning: A review,” Neurocomputing 312, 336–351 (2018).
J. Wen, Y. Xu, and H. Liu, “Incomplete multiview spectral clustering with adaptive graph learning,” IEEE Trans. Cybern. 50 (4), 1418–1429 (2020).
J. Wen, Zheng Zhang, Zhao Zhang, L. K. Fei, and M. Wang, “Generalized incomplete multiview clustering with flexible locality structure diffusion,” IEEE Trans. Cybern. 51 (1), 101–114 (2021).
N. Zhang and S. Sun, “Incomplete multiview nonnegative representation learning with multiple graphs,” Pattern Recognit. 123, 108412 (2022).
J. Wen, K. Yan, Z. Zhang, Y. Xu, J. Q. Wang, L. K. Fei, and B. Zhang, “Adaptive graph completion based incomplete multiview clustering,” IEEE Trans. Multimedia 23, 2493–2504 (2021).
J. Liu, S. Teng, W. Zhang, X. Fang, L. Fei, and Z. Zhang, “Incomplete multiview subspace clustering with low-rank tensor,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (Toronto, Canada, 2021), pp. 3180–3184.
J. Wen, Zheng Zhang, Zhao Zhang, L. Zhu, L. K. Fei, B. Zhang, and Y. Xu, “Unified tensor framework for incomplete multiview clustering and missing-view inferring,” in Proc. 35th AAAI Conf. Artificial Intelligence (AAAI Press: Palo Alto, Calif., 2021), vol. 35, pp. 10273–10281.
W. Xia, Q. Gao, Q. Wang, and X. Gao, “Tensor completion-based incomplete multiview clustering,” IEEE Trans. Cybern. 52 (12), 13635–13644 (2022).
M. B. Blaschko, C. H. Lampert, and A. Gretton, “Semi-supervised Laplacian regularization of kernel canonical correlation analysis,” in Proc. Joint Eur. Conf. Machine Learning and Knowledge Discovery in Databases (Antwerp, Belgium, 2008), pp. 133–145.
X. Chen, S. Chen, H. Xue, and X. Zhou, “A unified dimensionality reduction framework for semi-paired and semi-supervised multiview data,” Pattern Recognit. 45 (5), 2005–2018 (2012).
X. Zhou, X. Chen, and S. Chen, “Neighborhood correlation analysis for semi-paired two-view data,” Neural Processing Lett. 37 (3), 335–354 (2013).
Y. Yuan, Z. Wu, Y. Li, J. Qiang, J. Gou, and Y. Zhu, “Regularized multiset neighborhood correlation analysis for semi-paired multiview learning,” in Int. Conf. Neural Information Processing (Vancouver, Canada, 2020), pp. 616–625.
W. Yang, Y. Shi, Y. Gao, L. Wang, and M. Yang, “Incomplete data oriented multiview dimension reduction via sparse low-rank representation,” IEEE Trans. Neural Networks Learn. Syst. 29 (12), 6276–6291 (2018).
C. Zhu, C. Chen, R. Zhou, L. Wei, and X. Zhang, “A new multiview learning machine with incomplete data,” Pattern Anal. Appl. 23 (3), 1085–1116 (2020).
S. Li, Y. Jiang, and Z. Zhou, “Partial multiview clustering,” in Proc. AAAI Conf. Artificial Intelligence (Quebec City, Canada, 2014), vol. 28, no. 1.
C. Xu, D. Tao, and C. Xu, “Multiview learning with incomplete views,” IEEE Trans. Image Process. 24 (12), 5812–5825 (2015).
J. Wen, Z. Zhang, Y. Xu, and Z. Zhong, “Incomplete multiview clustering via graph regularized matrix factorization,” in Proc. Eur. Conf. Computer Vision Workshops (Munich, Germany, 2018), pp. 1–16.
M. Hu and S. Chen, “Doubly aligned incomplete multiview clustering,” in Proc. Int. Joint Conf. Artificial Intelligence (Stockholm, Sweden, 2018), pp. 2262–2268.
M. Hu and S. Chen, “One-pass incomplete multiview clustering,” in Proc. AAAI Conf. Artificial Intelligence (Honolulu, Hawaii, 2019), vol. 33, pp. 3838–3845.
J. Liu, S. Teng, L. Fei, W. Zhang, X. Fang, Z. Zhang, and N. Wu, “A novel consensus learning approach to incomplete multiview clustering,” Pattern Recognit. 115, 107890 (2021).
X. Liu, X. Zhu, M. Li, L. Wang, E. Zhu, T. Liu, M. Kloft, D. Shen, J. Yin, and W. Gao, “Multiple kernel k-means with incomplete kernels,” IEEE Trans. Pattern Anal. Mach. Intell. 42 (5), 1191–1204 (2019).
J. Wen, H. Sun, L. Fei, J. Li, Z. Zhang, and B. Zhang, “Consensus guided incomplete multiview spectral clustering,” Neural Networks 133, 207–219 (2021).
W. Zhuge, T. Luo, H. Tao, C. Hou, and D. Yi, “Multiview spectral clustering with incomplete graphs,” IEEE Access 8, 99820–99831 (2020).
X. Liu, X. Zhu, M. Li, L. Wang, C. Tang, J. Yin, D. Shen, H. Wang, and W. Gao, “Late fusion incomplete multiview clustering,” IEEE Trans. Pattern Anal. Mach. Intell. 41 (10), 2410–2423 (2018).
X. Zheng, X. Liu, J. Chen, and E. Zhu, “Adaptive partial graph learning and fusion for incomplete multiview clustering,” Int. J. Intell. Syst. 37 (1), 991–1009 (2022).
M. Xie, Z. Ye, G. Pan, and X. Liu, “Incomplete multiview subspace clustering with adaptive instance sample mapping and deep feature fusion,” Appl. Intell. 51 (8), 5584–5597 (2021).
L. Zhao, Z. Chen, Y. Yang, Z. J. Wang, and V. C. Leung, “Incomplete multiview clustering via deep semantic mapping,” Neurocomputing 275, 1053–1062 (2018).
C. Zhang, Z. Han, H. Fu, J. T. Zhou, and Q. Hu, “CPM-nets: Cross partial multiview networks,” in Advances in Neural Information Processing Systems 32 (NeurIPS, Vancouver, 2019).
Q. Wang, Z. Ding, Z. Tao, Q. Gao, and Y. Fu, “Partial multiview clustering via consistent GAN,” in Proc. IEEE Int. Conf. Data Mining (Singapore, 2018).
C. Xu, H. Liu, Z. Guan, X. Wu, J. Tan, and B. Ling, “Adversarial incomplete multiview subspace clustering networks,” IEEE Trans. Cybern. 52 (10), 10490–10503 (2022).
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial networks,” Comm. ACM 63 (11), 139–144 (2020).
Y. Lin, Y. Gou, Z. Liu, B. Li, J. Lv, and X. Peng, “Completer: Incomplete multiview clustering via contrastive prediction,” in Proc. IEEE/CVF Conf. Computer Vision and Pattern Recognition (Nashville, Tenn., 2021), pp. 11174–11183.
B. Zhang, J. Hao, G. Ma, J. Yue, and Z. Shi, “Semi-paired probabilistic canonical correlation analysis,” in Intelligent Information Processing VII. IFIP Advances in Information and Communication Technology (Springer, Berlin–Heidelberg, 2014).
T. Matsuura, K. Saito, Y. Ushiku, and T. Harada, “Generalized Bayesian canonical correlation analysis with missing modalities,” in Proc. 15th Eur. Conf. Computer Vision (ECCV) (Munich, Germany, 2018), vol. 11134, pp. 641–656.
P. Li and S. Chen, “Shared Gaussian process latent variable model for incomplete multiview clustering,” IEEE Trans. Cybern. 50 (1), 61–73 (2018).
C. Kamada, A. Kanezaki, and T. Harada, “Probabilistic semi-canonical correlation analysis,” in Proc. 23rd ACM Int. Conf. Multimedia (Brisbane, Australia, 2015), pp. 1131–1134.
C. Wang, “Variational Bayesian approach to canonical correlation analysis,” IEEE Trans. Neural Networks 18 (3), 905–910 (2007).
A. Kimura, M. Sugiyama, T. Nakano, H. Kameoka, H. Sakano, E. Maeda, and K. Ishiguro, “SemiCCA: Efficient semi-supervised learning of canonical correlations,” Inf. Media Technol. 8 (2), 311–318 (2013).
Y. Luo, D. Tao, K. Ramamohanarao, C. Xu, and Y. Wen, “Tensor canonical correlation analysis for multiview dimension reduction,” IEEE Trans. Knowl. Data Eng. 27 (11), 3111–3124 (2015).
H. Wong, L. Wang, R. Chan, and T. Zeng, “Deep tensor CCA for multiview learning,” IEEE Trans. Big Data 8, 1664–1677 (2021).
M. Cheng, L. Jing, and M. K. Ng, “Tensor-based low-dimensional representation learning for multiview clustering,” IEEE Trans. Image Process. 28 (5), 2399–2414 (2018).
C. Zhang, H. Fu, S. Liu, G. Liu, and X. Cao, “Low-rank tensor constrained multiview subspace clustering,” in Proc. IEEE Int. Conf. Computer Vision (Santiago, Chile, 2015), pp. 1582–1590.
J. Wu, Z. Lin, and H. Zha, “Essential tensor learning for multiview spectral clustering,” IEEE Trans. Image Process. 28 (12), 5910–5922 (2019).
J. Carroll, “Generalization of canonical correlation analysis to three or more sets of variables,” in Proc. 76th Annual Convention of the American Psychological Association (APA, 1968), vol. 3, pp. 227–228.
J. Chen, G. Wang, and G. B. Giannakis, “Graph multiview canonical correlation analysis,” IEEE Trans. Signal Process. 67 (11), 2826–2838 (2019).
F. Nie, J. Li, and X. Li, “Self-weighted multiview clustering with multiple graphs,” in Int. Joint Conf. Artificial Intelligence (Melbourne, Australia, 2017), pp. 2564–2570.
K. Fan, “On a theorem of Weyl concerning eigenvalues of linear transformations,” Proc. Natl. Acad. Sci. 35 (11), 652–655 (1949).
M. van Breukelen, R. P. W. Duin, D. M. J. Tax, and J. E. Hartog, “Handwritten digit recognition by combined classifiers,” Kybernetika 34 (4), 381–386 (1998).
D. Greene, 3 sources dataset. http://erdos.ucd.ie/datasets/3sources.html. Accessed January 7, 2023.
D. Greene and P. Cunningham, “Practical solutions to the problem of diagonal dominance in kernel document clustering,” in Proc. 23rd Int. Conf. Mach. Learning (Pittsburgh, Pa., 2006), pp. 377–384.
F. S. Samaria and A. C. Harter, “Parameterisation of a stochastic model for human face identification,” in Proc. IEEE Workshop on Applications of Computer Vision (Sarasota, Fla., 1994), pp. 138–142.
H. Zhao, H. Liu, and Y. Fu, “Incomplete multi-modal visual data grouping,” in Proc. Int. Joint Conf. Artificial Intell. (New York, 2016), pp. 2392–2398.
Y. Xie, S. Gu, Y. Liu, W. Zuo, W. Zhang, and L. Zhang, “Weighted Schatten p-norm minimization for image denoising and background subtraction,” IEEE Trans. Image Process. 25 (10), 4842–4857 (2016).
Funding
This study was supported in part by the Russian Foundation for Basic Research, grant no. 21-51-53019 and the National Natural Science Foundation of China, grant nos. 11971231 and 12111530001.
Ethics declarations
The authors declare that they have no conflicts of interest.
APPENDIX
This appendix contains the proofs of Theorems 1 and 2 and of Lemma 1. Before proving Theorem 1, we state Lemmas 2 and 3, which were presented in [61].
Lemma 2. Consider the optimization problem
\(\min_x \frac{1}{2}(x - \sigma)^2 + w|x|^p,\)    (A.1)
with \(p\) and \(w\) given, and define the threshold
\(\tau_p(w) = (2w(1 - p))^{1/(2 - p)} + wp(2w(1 - p))^{(p - 1)/(2 - p)}.\)    (A.2)
Then the following statements hold:
(1) if \(\sigma \leqslant {{\tau }_{p}}\left( w \right)\), then the optimal solution \({{x}_{p}}\left( {\sigma ,w} \right)\) of (A.1) is zero;
(2) if \(\sigma > {{\tau }_{p}}\left( w \right)\), then the optimal solution is \({{x}_{p}}\left( {\sigma ,w} \right) = \) \({\text{sign}}\left( \sigma \right){{S}_{p}}\left( {\sigma ,w} \right)\), where \({{S}_{p}}\left( {\sigma ,w} \right)\) can be obtained from \({{S}_{p}}\left( {\sigma ,w} \right) - \sigma + wp{{\left( {{{S}_{p}}\left( {\sigma ,w} \right)} \right)}^{{p - 1}}} = 0\).
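The scalar thresholding rule in Lemma 2 can be sketched in NumPy, assuming the closed-form threshold from the generalized soft-thresholding literature and a simple fixed-point iteration for the root in part (2); the function names `gst` and `gst_threshold` are ours, and \(w > 0\), \(0 < p \leqslant 1\) are assumed.

```python
import numpy as np

def gst_threshold(w, p):
    """Threshold tau_p(w) from Lemma 2; assumes w > 0 and 0 < p <= 1."""
    t = (2.0 * w * (1.0 - p)) ** (1.0 / (2.0 - p))
    return t + w * p * t ** (p - 1.0)

def gst(sigma, w, p, iters=50):
    """Generalized soft-thresholding: minimize 0.5*(x - sigma)^2 + w*|x|^p.

    Below the threshold the minimizer is 0; above it, the nonzero root of
    S - |sigma| + w*p*S^(p-1) = 0 is found by fixed-point iteration.
    """
    if abs(sigma) <= gst_threshold(w, p):
        return 0.0
    x = abs(sigma)
    for _ in range(iters):
        x = abs(sigma) - w * p * x ** (p - 1.0)
    return np.sign(sigma) * x
```

For \(p = 1\) the threshold reduces to \(w\) and the rule becomes ordinary soft thresholding.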
Lemma 3. Assume that \(Y = {{U}_{Y}}{{\Sigma }_{Y}}V_{Y}^{ \top }\) is the singular value decomposition of \(Y \in {{\mathbb{R}}^{{m \times n}}}\), \(\tau > 0\), \(l = \min (m,n)\), and \({{w}_{1}} \geqslant {{w}_{2}} \geqslant \; \cdots \; \geqslant {{w}_{l}} \geqslant 0\). Then the global optimum of the problem
\(\min_X \tau \sum_{i=1}^{l} w_i \sigma_i(X)^p + \frac{1}{2}\|X - Y\|_F^2\)    (A.3)
is
\(X^* = U_Y P_{\tau, w, p}(Y) V_Y^\top,\)    (A.4)
where \({{P}_{{\tau ,w,p}}}\left( Y \right) = {\text{diag}}\left( {{{\gamma }_{1}},{{\gamma }_{2}},\; \ldots ,\;{{\gamma }_{l}}} \right)\) and \({{\gamma }_{i}} = {{x}_{p}}\left( {{{\sigma }_{i}}\left( Y \right),\tau {{w}_{i}}} \right)\) can be obtained according to Lemma 2.
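Lemma 3 reduces the matrix problem to the scalar one by shrinking each singular value with weight \(\tau w_i\). A minimal sketch under the same assumptions as above (the names `gst` and `weighted_schatten_shrink` are ours):

```python
import numpy as np

def gst(sigma, w, p, iters=50):
    """Scalar generalized soft-thresholding from Lemma 2
    (assumes sigma >= 0, w > 0, 0 < p <= 1)."""
    t = (2.0 * w * (1.0 - p)) ** (1.0 / (2.0 - p))
    if sigma <= t + w * p * t ** (p - 1.0):
        return 0.0
    x = sigma
    for _ in range(iters):
        x = sigma - w * p * x ** (p - 1.0)
    return x

def weighted_schatten_shrink(Y, tau, w, p):
    """Global optimum of min_X tau*sum_i w_i*sigma_i(X)^p + 0.5*||X - Y||_F^2:
    apply the scalar rule to each singular value of Y (Lemma 3)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    gamma = np.array([gst(si, tau * wi, p) for si, wi in zip(s, w)])
    return U @ np.diag(gamma) @ Vt
```

Since numpy returns singular values in nonincreasing order, the ordering condition on the weights in Lemma 3 matches the array `w` elementwise.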
Proof of Theorem 1. For the given semiorthogonal matrix \(\Phi \in {{\mathbb{R}}^{{{{n}_{3}} \times r}}}\), there exists a semiorthogonal matrix \({{\Phi }^{c}} \in {{\mathbb{R}}^{{{{n}_{3}} \times \left( {{{n}_{3}} - r} \right)}}}\) satisfying \({{\overline \Phi }^{ \top }}\overline \Phi = I\), where \(\bar {\Phi } = \left[ {\Phi ,{{\Phi }^{c}}} \right] \in {{\mathbb{R}}^{{{{n}_{3}} \times {{n}_{3}}}}}\). By definition, we have
In (A.5), the variables \(X_{\Phi }^{{(i)}}\) and \(X_{{{{\Phi }^{c}}}}^{{(j)}}\) are independent. Thus, the problem can be divided into \({{n}_{3}}\) independent subproblems:
and
According to Lemma 3, the optimal solutions (A.6) and (A.7) are \(X{{_{\Phi }^{{(i)}}}^{*}} = {{U}_{{A_{\Phi }^{{(i)}}}}}{{P}_{{\tau ,w,p}}}\left( {A_{\Phi }^{{(i)}}} \right)V_{{A_{\Phi }^{{(i)}}}}^{ \top }\) and \(X{{_{{{{\Phi }^{c}}}}^{{(j)}}}^{*}} = A_{{{{\Phi }^{c}}}}^{{(j)}}\), respectively. Thus, we obtain the optimal solutions
Further, the optimal solution to (2.32) is \(\mathcal{X}{\text{*}} = {{\left( {\mathcal{X}_{{\overline \Phi }}^{*}} \right)}_{{{{{\overline \Phi }}^{ \top }}}}}\).
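The proof of Theorem 1 shows that the prox of the proposed tensor norm acts slice-wise after the mode-3 transform by \(\bar \Phi\): the first \(r\) transformed slices are shrunk via Lemma 3, and the remaining \(n_3 - r\) slices are kept unchanged. A hedged sketch under those assumptions (function names and the NumPy tensor layout are ours):

```python
import numpy as np

def shrink_slice(M, tau, w, p, iters=50):
    """Weighted Schatten-p shrinkage of one matrix slice (Lemma 3)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    gamma = np.zeros_like(s)
    for i, (si, wi) in enumerate(zip(s, w)):
        we = tau * wi
        t = (2.0 * we * (1.0 - p)) ** (1.0 / (2.0 - p))
        if si > t + we * p * t ** (p - 1.0):
            x = si
            for _ in range(iters):
                x = si - we * p * x ** (p - 1.0)
            gamma[i] = x
    return U @ np.diag(gamma) @ Vt

def prox_phi_norm(A, Phi_bar, r, tau, w, p):
    """Slice-wise prox after the mode-3 transform by the orthogonal Phi_bar:
    shrink the first r transformed slices, keep the rest unchanged."""
    A_bar = np.tensordot(A, Phi_bar, axes=([2], [0]))  # mode-3 product
    X_bar = A_bar.copy()
    for k in range(r):
        X_bar[:, :, k] = shrink_slice(A_bar[:, :, k], tau, w, p)
    return np.tensordot(X_bar, Phi_bar.T, axes=([2], [0]))  # inverse transform
```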
Proof of Lemma 1. From the constraint on \(\mathcal{G}\) in (2.12), it is clear that \(\left\{ {{{\mathcal{G}}^{k}}} \right\}\) is bounded. Further, according to (2.18), the sequences \(\{A^k\}\) and \(\left\{ {\left\{ {W_i^k} \right\}_{{i = 1}}^{m}} \right\}\) are also bounded. At the \(k\)th iteration of Algorithm 2, we have
From (2.34) we get
Substituting (A.10) into (A.9), we have
where J is the number of iterations in solving problem (A.1). Thus, \(\left\{ {{{\mathcal{C}}^{k}}} \right\}\) is bounded. Further, according to (2.34), we see that
It has already been mentioned that \(\left\{ {{{\mathcal{C}}^{k}}} \right\}\) and \(\left\{ {{{\mathcal{G}}^{k}}} \right\}\) are bounded; therefore, \(\left\{ {{{\mathcal{B}}^{k}}} \right\}\) is bounded, which means \(\left\{ {{{\mathcal{Y}}^{k}}} \right\}\) is also bounded.
Below we prove that the augmented Lagrangian function (2.16) is bounded. Note that boundedness of all the variables alone does not imply that (2.16) is bounded, since \(\left\{ {{{\rho }^{k}}} \right\}\) is unbounded. Let us first prove that
Recalling (2.17), let \(f\left( {A,{{W}_{i}}} \right) = \left\| {\left( {A - W_{i}^{ \top }{{X}_{i}}} \right){{P}_{i}}} \right\|_{F}^{2} + \lambda {\text{tr}}\left( {A{{L}_{{G_{i}^{k}}}}{{A}^{ \top }}} \right)\); combined with \(\delta _{i}^{k} = {{n}_{i}}/\sqrt {f\left( {{{A}^{k}},W_{i}^{k}} \right)} \), it can be deduced that
For any positive numbers a and b, the following inequality holds:
Moreover, since \(f\left( {A,{{W}_{i}}} \right) \geqslant 0\), we have
Combining (A.14) and (A.16), we obtain
Thus, Eq. (A.13) is derived. Since the subproblems for \(\mathcal{G}\) and \(\mathcal{Y}\) in Sections 2.2.2 and 2.2.3 admit optimal solutions, it is clear that
Then,
Let \({{b}_{c}}\) be an upper bound on \(\left\| {{{\mathcal{C}}^{k}} - {{\mathcal{C}}^{{k - 1}}}} \right\|_{F}^{2}\). Combining the relations in (A.18), we obtain
Since \(\rho > 1\), the following inequality holds:
Combining (A.19)–(A.21), we conclude that the augmented Lagrangian function (2.16) is bounded.
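The self-weighting rule \(\delta _{i}^{k} = {{n}_{i}}/\sqrt {f\left( {{{A}^{k}},W_{i}^{k}} \right)}\) used in the proof above can be sketched as follows; the function name and array shapes are our assumptions, since only the formulas for \(f\) and \(\delta_i\) are given here.

```python
import numpy as np

def update_view_weights(A, Ws, Xs, Ps, Ls, lam, ns):
    """delta_i = n_i / sqrt(f(A, W_i)) for each view i, where
    f(A, W_i) = ||(A - W_i^T X_i) P_i||_F^2 + lam * tr(A L_i A^T)."""
    deltas = []
    for W, X, P, L, n in zip(Ws, Xs, Ps, Ls, ns):
        R = (A - W.T @ X) @ P  # residual of the projected view on observed samples
        f = np.sum(R ** 2) + lam * np.trace(A @ L @ A.T)
        deltas.append(n / np.sqrt(f))
    return np.array(deltas)
```

Views that are fitted poorly (large \(f\)) thus receive smaller weights at the next iteration.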
Proof of Theorem 2. According to Lemma 1, the sequences \(\left\{ {{{\mathcal{C}}^{k}}} \right\}\), \(\left\{ {{{\mathcal{G}}^{k}}} \right\}\), and \(\left\{ {{{\mathcal{Y}}^{k}}} \right\}\) are all bounded. By the Bolzano–Weierstrass theorem, every bounded sequence has a convergent subsequence; therefore, each of \(\left\{ {{{\mathcal{C}}^{k}}} \right\}\), \(\left\{ {{{\mathcal{G}}^{k}}} \right\}\), and \(\left\{ {{{\mathcal{Y}}^{k}}} \right\}\) has at least one accumulation point. In particular, we obtain
As in Section 2.2.2, each tube of \({{\mathcal{G}}^{k}}\) can be analyzed independently. Recalling that \({{g}^{{k + 1}}}\) is updated by rule (2.27), the following inequality holds:
Let us analyze the first and second terms of (A.23) separately:
and
Below, we show that \({v}\) tends to 0 as \(k\) tends to infinity. According to the definition of \(u^k\), the following equation holds:
Comparing (2.27) and (2.28), (A.26) implies \(f\left( 0 \right) = 0\) as \(k\) tends to infinity; i.e., \({v}\) tends to 0. Consequently, \(\mathop {\lim }\limits_{k \to \infty } {{\left\| {{{g}^{{k + 1}}} - {{g}^{k}}} \right\|}_{2}} = 0\) and hence
For \({{\mathcal{Y}}^{k}}\), the following inequality is valid:
Thus, combining (A.22) and (A.27), we have
Zhang, H., Chen, X., Zhu, Y. et al. Improvement of Incomplete Multiview Clustering by the Tensor Reconstruction of the Connectivity Graph. J. Comput. Syst. Sci. Int. 62, 469–491 (2023). https://doi.org/10.1134/S1064230723030139