Wavelet-based Bayesian approximate kernel method for high-dimensional data analysis

Guo, Wenxing; Zhang, Xueying; Jiang, Bei; Kong, Linglong; Hu, Yaozhong

doi:10.1007/s00180-023-01438-1

Original Paper
Published: 26 November 2023

Computational Statistics (2023)

Wenxing Guo¹,² (na1),
Xueying Zhang² (na1),
Bei Jiang²,
Linglong Kong² (ORCID: orcid.org/0000-0003-3011-9216) &
…
Yaozhong Hu²

Abstract

Kernel methods are widely used for nonlinear regression and classification in statistics and machine learning because they are computationally cheap and accurate. Wavelet kernel functions, which are built on wavelet analysis, can efficiently approximate any nonlinear function. In this article, we construct a novel wavelet kernel function in terms of random wavelet bases and define a linear vector space that captures nonlinear structures in reproducing kernel Hilbert spaces (RKHS). Using the wavelet transform, we map the data into a low-dimensional randomized feature space and convert the kernel function into operations of a linear machine. We then propose a new Bayesian approximate kernel model based on the random wavelet expansion and use the Gibbs sampler to compute the model’s parameters. Finally, simulation studies and analyses of two real datasets demonstrate that the proposed method shows good stability and prediction performance compared with some existing methods.
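
The pipeline described in the abstract (random features that turn a wavelet kernel into a linear model, followed by Gibbs sampling) can be illustrated with a minimal Python sketch. It is not the authors' construction, which the abstract does not specify. It assumes a Morlet-type product kernel k(x, y) = ∏_j cos(ω d_j / a) exp(−d_j² / (2a²)) with d = x − y, approximated by cosine random features whose frequencies are drawn from the kernel's spectral density, and a conjugate Bayesian linear model on those features. The function names, the values ω = 1.75 and a = 1, and the priors (β ~ N(0, τ²I), σ² ~ Inverse-Gamma(a0, b0)) are all illustrative assumptions.

```python
import numpy as np

def wavelet_kernel(X, Y, a=1.0, omega=1.75):
    """Exact Morlet-type product kernel (assumed form, for comparison only)."""
    D = (X[:, None, :] - Y[None, :, :]) / a
    return np.prod(np.cos(omega * D) * np.exp(-0.5 * D**2), axis=2)

def sample_random_features(p, m, a=1.0, omega=1.75, rng=None):
    """Draw m random frequencies and phases for p-dimensional inputs.

    Per coordinate, the spectral density of the assumed kernel is an equal
    mixture of N(+omega/a, 1/a^2) and N(-omega/a, 1/a^2).
    """
    rng = np.random.default_rng(rng)
    signs = rng.choice([-1.0, 1.0], size=(p, m))
    W = (signs * omega + rng.standard_normal((p, m))) / a
    b = rng.uniform(0.0, 2.0 * np.pi, size=m)
    return W, b

def feature_map(X, W, b):
    """Map X (n x p) to Z (n x m) so that Z @ Z.T approximates the kernel."""
    m = W.shape[1]
    return np.sqrt(2.0 / m) * np.cos(X @ W + b)

def gibbs_linear(Z, y, n_iter=500, burn=100, tau2=10.0, a0=1.0, b0=1.0, rng=None):
    """Gibbs sampler for y = Z beta + eps, eps ~ N(0, sigma2 I).

    Assumed priors: beta ~ N(0, tau2 I), sigma2 ~ Inverse-Gamma(a0, b0).
    Returns the posterior mean of beta over the retained draws.
    """
    rng = np.random.default_rng(rng)
    n, m = Z.shape
    ZtZ, Zty = Z.T @ Z, Z.T @ y
    sigma2 = 1.0
    draws = []
    for it in range(n_iter):
        # beta | sigma2, y ~ N(prec^{-1} Z'y / sigma2, prec^{-1})
        prec = ZtZ / sigma2 + np.eye(m) / tau2
        L = np.linalg.cholesky(prec)
        mean = np.linalg.solve(prec, Zty / sigma2)
        # L^{-T} eps has covariance (L L^T)^{-1} = prec^{-1}
        beta = mean + np.linalg.solve(L.T, rng.standard_normal(m))
        # sigma2 | beta, y ~ Inverse-Gamma(a0 + n/2, b0 + ||y - Z beta||^2 / 2)
        resid = y - Z @ beta
        sigma2 = 1.0 / rng.gamma(a0 + 0.5 * n, 1.0 / (b0 + 0.5 * resid @ resid))
        if it >= burn:
            draws.append(beta)
    return np.mean(draws, axis=0)
```

A typical use would be to call `sample_random_features` once, apply `feature_map` to both training and test inputs with the same `W` and `b`, fit with `gibbs_linear`, and predict with `Z_test @ beta`. Because the model is linear in the m features, each Gibbs sweep costs on the order of m³ rather than n³, which is the practical appeal of an approximate kernel method when m is much smaller than n.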

Acknowledgements

Bei Jiang and Linglong Kong were partially supported by grants from the Canada CIFAR AI Chairs program, the Alberta Machine Intelligence Institute (AMII), and the Natural Sciences and Engineering Research Council of Canada (NSERC). Linglong Kong was also partially supported by grants from the Canada Research Chair program from NSERC. Yaozhong Hu was supported by the NSERC discovery fund and a centennial fund of the University of Alberta. The authors would like to thank the Editor, the Associate Editor and the two anonymous referees for their critical comments and constructive suggestions, which have led to the improvement of this article.

Author information

Wenxing Guo and Xueying Zhang have contributed equally to this work.

Authors and Affiliations

School of Mathematics, Statistics and Actuarial Science, University of Essex, Colchester, CO4 3SQ, UK
Wenxing Guo
Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, T6G 2G1, Canada
Wenxing Guo, Xueying Zhang, Bei Jiang, Linglong Kong & Yaozhong Hu

Corresponding author

Correspondence to Linglong Kong.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Guo, W., Zhang, X., Jiang, B. et al. Wavelet-based Bayesian approximate kernel method for high-dimensional data analysis. Comput Stat (2023). https://doi.org/10.1007/s00180-023-01438-1

Received: 02 August 2022
Accepted: 06 November 2023
Published: 26 November 2023
DOI: https://doi.org/10.1007/s00180-023-01438-1
