Multinomial Logistic Factor Regression for Multi-source Functional Block-wise Missing Data

Du, Xiuli; Jiang, Xiaohu; Lin, Jinguan

doi:10.1007/s11336-023-09918-5

Theory and Methods
Published: 02 June 2023

Psychometrika, Volume 88, pages 975–1001 (2023)

Xiuli Du ORCID: orcid.org/0000-0002-6798-0504¹,
Xiaohu Jiang¹,
Jinguan Lin² &
The Alzheimer’s Disease Neuroimaging Initiative

Abstract

Multi-source functional block-wise missing data have become increasingly common in medical care with the rapid development of big data and medical technology, so there is an urgent need for efficient dimension reduction methods that extract the important information for classification from such data. However, most existing methods for classification problems take high-dimensional data as covariates. In this paper, we propose a novel multinomial imputed-factor logistic regression model with multi-source functional block-wise missing data as covariates. Our main contribution is to establish two multinomial factor regression models that use the imputed multi-source functional principal component scores and the imputed canonical scores as covariates, respectively, where the missing factors are imputed by both the conditional mean imputation and the multiple block-wise imputation approaches. Specifically, univariate functional principal component analysis (FPCA) is first carried out on the observed data of each data source to obtain the univariate principal component scores and the eigenfunctions. Then, the block-wise missing univariate principal component scores, rather than the block-wise missing functional data themselves, are imputed by the conditional mean imputation method and the multiple block-wise imputation method, respectively. Next, based on the imputed univariate factors, the multi-source principal component scores are constructed from the relationship between the multi-source and the univariate principal component scores; at the same time, the canonical scores are obtained by multiple-set canonical correlation analysis. Finally, the multinomial imputed-factor logistic regression model is established with the multi-source principal component scores or the canonical scores as factors. Numerical simulations and a real data analysis on ADNI data show that the proposed method works well.
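
The steps listed in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes simulated toy curves from two sources on a common equally spaced grid, covers only the conditional mean imputation route (not multiple block-wise imputation or the canonical-score variant), and builds the multi-source scores by a PCA of the stacked univariate scores, which is one plausible reading of the "relationship" mentioned above. All names (fpca_scores, obs2, S2_full, rho) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, K = 300, 50, 3                      # subjects, grid points, classes
t = np.linspace(0.0, 1.0, T)

# Toy data (assumption): class-dependent latent traits drive two functional sources.
y = rng.integers(0, K, size=n)
centers = np.array([[-2.0, 0.0], [0.0, 2.0], [2.0, 0.0]])
z = centers[y] + 0.5 * rng.standard_normal((n, 2))
X1 = np.outer(z[:, 0], np.sin(2 * np.pi * t)) + np.outer(z[:, 1], np.cos(2 * np.pi * t))
X2 = np.outer(z[:, 0], np.sin(np.pi * t)) + np.outer(z[:, 1], np.cos(np.pi * t))
X1 += 0.1 * rng.standard_normal((n, T))
X2 += 0.1 * rng.standard_normal((n, T))
obs2 = np.arange(n) < 200                 # source 2 is missing as a block for the last 100 subjects

def fpca_scores(X, n_comp):
    """Discretized univariate FPCA: SVD of the centered curves on a common grid."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_comp].T             # principal component scores, shape (n, n_comp)

# Step 1: univariate FPCA on the observed curves of each source.
S1 = fpca_scores(X1, 2)                   # source 1 is fully observed
S2_obs = fpca_scores(X2[obs2], 2)         # source 2: observed subjects only

# Step 2: conditional mean imputation of the missing source-2 scores.
# Under joint normality E[s2 | s1] is linear in s1, so the plug-in estimate
# coincides with least squares (with intercept) fitted on the complete cases.
A = np.column_stack([np.ones(n), S1])
coef, *_ = np.linalg.lstsq(A[obs2], S2_obs, rcond=None)
S2_full = np.empty((n, 2))
S2_full[obs2] = S2_obs
S2_full[~obs2] = A[~obs2] @ coef

# Step 3: multi-source scores from a PCA of the stacked univariate scores.
Xi = np.hstack([S1, S2_full])
Xi_c = Xi - Xi.mean(axis=0)
evals, evecs = np.linalg.eigh(np.cov(Xi_c, rowvar=False))
rho = Xi_c @ evecs[:, ::-1][:, :2]        # eigh sorts ascending, so reverse and keep the top 2

# Step 4: multinomial logistic regression on the factors, fitted by gradient descent.
def softmax(L):
    L = L - L.max(axis=1, keepdims=True)  # subtract the row maximum for numerical stability
    E = np.exp(L)
    return E / E.sum(axis=1, keepdims=True)

F = np.column_stack([np.ones(n), rho / rho.std(axis=0)])
Y = np.eye(K)[y]                          # one-hot class labels
W = np.zeros((F.shape[1], K))
for _ in range(500):
    W -= 0.5 * F.T @ (softmax(F @ W) - Y) / n

P = softmax(F @ W)                        # fitted class probabilities
acc = (P.argmax(axis=1) == y).mean()
print(f"in-sample accuracy: {acc:.3f}")
```

Imputing the low-dimensional scores rather than the curves keeps the imputation model small: here it is a regression of two scores on two scores, regardless of how finely the curves are sampled.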

Acknowledgements

Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. We also thank the editors and three anonymous reviewers for their constructive comments that helped improve the quality of this article.

Funding

This research is supported by the National Social Science Foundation of China (No. 21BTJ044).

Author information

Authors and Affiliations

College of Mathematical Sciences, Nanjing Normal University, Nanjing, 210023, China
Xiuli Du & Xiaohu Jiang
Institute of Statistics and Data Science, Nanjing Audit University, Nanjing, 211815, China
Jinguan Lin

Consortia

The Alzheimer’s Disease Neuroimaging Initiative

Corresponding author

Correspondence to Xiuli Du.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wpcontent/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (zip 1421 KB)

Supplementary file 2 (pdf 524 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Du, X., Jiang, X., Lin, J. et al. Multinomial Logistic Factor Regression for Multi-source Functional Block-wise Missing Data. Psychometrika 88, 975–1001 (2023). https://doi.org/10.1007/s11336-023-09918-5

Received: 18 July 2022
Revised: 23 March 2023
Published: 02 June 2023
Issue Date: September 2023
DOI: https://doi.org/10.1007/s11336-023-09918-5
