Abstract
This paper considers a new two-level ensemble regression method, its modifications, and its use in applied problems. The key feature of the method is that it constructs an ensemble of predictors that approximate the target variable well while differing from one another as much as possible in their predictions. At the first stage, an ensemble with these properties is constructed by optimizing a special functional, the choice of which is theoretically substantiated in this study. At the second stage, a collective solution is computed from the forecasts produced by this ensemble. In addition, some heuristic modifications are described that improve forecast quality in applied problems. The effectiveness of the method is confirmed by results obtained for specific applied problems.
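The two-stage scheme summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state the functional that the authors optimize, so everything specific here is an assumption made for illustration only: the score `err - mu * div` (training error minus a weighted mean squared divergence from the members already selected), the greedy search over random candidates, the use of linear base predictors on bootstrap samples and random feature subsets, and all function and parameter names (`build_divergent_ensemble`, `fit_two_level`, `mu`, `n_candidates`). The second stage is sketched as a least-squares combination of the first-level predictions.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns [intercept, weights...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]

def build_divergent_ensemble(X, y, n_members=5, n_candidates=20, mu=0.5, seed=0):
    """Stage 1 (hypothetical functional): greedily pick the candidate that
    minimizes training error minus mu times its mean squared divergence
    from the predictions of the members chosen so far."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    members, preds = [], []
    for _ in range(n_members):
        best = None
        for _ in range(n_candidates):
            rows = rng.integers(0, n, size=n)  # bootstrap sample
            cols = rng.choice(d, size=max(1, d // 2), replace=False)  # random subspace
            coef = fit_linear(X[rows][:, cols], y[rows])
            p = predict_linear(coef, X[:, cols])
            err = np.mean((y - p) ** 2)
            div = np.mean([np.mean((p - q) ** 2) for q in preds]) if preds else 0.0
            score = err - mu * div
            if best is None or score < best[0]:
                best = (score, cols, coef, p)
        members.append((best[1], best[2]))
        preds.append(best[3])
    return members

def first_level_predictions(members, X):
    """One column of predictions per ensemble member."""
    return np.column_stack([predict_linear(c, X[:, cols]) for cols, c in members])

def fit_two_level(X, y, **kwargs):
    members = build_divergent_ensemble(X, y, **kwargs)
    Z = first_level_predictions(members, X)
    combiner = fit_linear(Z, y)  # Stage 2: collective solution over member forecasts
    return members, combiner

def predict_two_level(model, X):
    members, combiner = model
    return predict_linear(combiner, first_level_predictions(members, X))

# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0, 0.0]) + 0.1 * rng.normal(size=200)
model = fit_two_level(X, y, n_members=5, mu=0.5)
y_hat = predict_two_level(model, X)
print("train MSE:", np.mean((y - y_hat) ** 2))
```

In this sketch the combiner is fitted on the same data as the first-level predictors, which tends to give optimistic training error; fitting the second level on out-of-fold predictions, as in stacked generalization, would be a more careful choice.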
REFERENCES
Regulations on the “Informatics” common use center. http://www.frccsc.ru/ckp. Accessed February 14, 2023.
Z. H. Zhou, Ensemble Methods: Foundations and Algorithms (Chapman and Hall/CRC, New York, 2012).
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer, New York, 2009).
L. Breiman, “Random forests,” Mach. Learn. 45 (1), 5–32 (2001).
R. E. Schapire and Y. Freund, Boosting: Foundations and Algorithms (MIT Press, Cambridge, Mass., 2012).
T. K. Ho, “The random subspace method for constructing decision forests,” IEEE Trans. Pattern Anal. Mach. Intell. 20 (8), 832–844 (1998).
N. Garcia-Pedrajas and D. Ortiz-Boyer, “Boosting random subspace method,” Neural Networks 21 (9), 1344–1362 (2008).
Yu. I. Zhuravlev, O. V. Senko, A. A. Dokukin, N. N. Kiselyova, and I. A. Saenko, “Two-level regression method using ensembles of trees with optimal divergence,” Dokl. Math. 103, 1–4 (2021).
F. Pedregosa, G. Varoquaux, A. Gramfort, et al., “Scikit-learn: Machine learning in Python,” J. Mach. Learn. Res. 12, 2825–2830 (2011).
D. H. Wolpert, “Stacked generalization,” Neural Networks 5 (2), 241–259 (1992).
E. M. Braverman and I. B. Muchnik, Structural Methods for Processing Empirical Data (Nauka, Moscow, 1983) [in Russian].
O. V. Senko, A. A. Dokukin, N. N. Kiselyova, V. A. Dudarev, and Yu. O. Kuznetsova, “New two-level ensemble method and its application to chemical compounds properties prediction,” Lobachevskii J. Math. 44 (1), 188–197 (2023).
M. H. Rafiei and H. Adeli, “A novel machine learning model for estimation of sale prices of real estate units,” J. Constr. Eng. Manage. 142 (2) (2015).
O. V. Sen’ko, V. Ya. Chuchupal, and A. A. Dokukin, “Noninvasive blood pressure assessment with CardioQvark,” Mat. Biol. Bioinf. 2 (12), 536–546 (2017).
F. Mostofi, V. Toğan, and H. B. Basağa, “Real-estate price prediction with deep neural network and principal component analysis,” Organ. Technol. Manage. Constr. 14 (1), 2741–2759 (2022).
Funding
This work was supported by the state task (project 0063-2019-0003) and used the infrastructure of the Informatika Center for Collective Use of the Federal Research Center “Computer Science and Control,” Russian Academy of Sciences [1].
Ethics declarations
The authors declare that they have no conflicts of interest.
Additional information
Publisher’s Note.
Pleiades Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Dokukin, A.A., Sen’ko, O.V. New Two-Level Machine Learning Method for Evaluating the Real Characteristics of Objects. J. Comput. Syst. Sci. Int. 62, 619–626 (2023). https://doi.org/10.1134/S1064230723040020
DOI: https://doi.org/10.1134/S1064230723040020