On the universal consistency of an over-parametrized deep neural network estimate learned by gradient descent
Annals of the Institute of Statistical Mathematics (IF 1), Pub Date: 2024-04-08, DOI: 10.1007/s10463-024-00898-6
Selina Drews, Michael Kohler

Estimation of a multivariate regression function from independent and identically distributed data is considered. An estimate is defined which fits a deep neural network, consisting of a large number of fully connected neural networks computed in parallel, to the data via gradient descent. The estimate is over-parametrized in the sense that the number of its parameters is much larger than the sample size. It is shown that with a suitable random initialization of the network, a sufficiently small gradient descent step size, and a number of gradient descent steps that slightly exceeds the reciprocal of this step size, the estimate is universally consistent. This means that the expected \(L_2\) error converges to zero for all distributions of the data where the response variable is square integrable.
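
The ingredients named in the abstract (many small fully connected networks evaluated in parallel, random initialization, a small step size, and a step count slightly above the reciprocal of the step size) can be illustrated with a minimal sketch. This is not the authors' construction. The abstract does not specify the architecture or initialization, so the single hidden layer, the logistic activation, the uniform inner weights, the zero outer weights, the factor 1.1 in the step count, and all function and parameter names below are assumptions made for illustration only.

```python
import math
import random


def sigmoid(z):
    # Logistic activation (an assumption of this sketch).
    return 1.0 / (1.0 + math.exp(-z))


def subnetwork_hidden(net, x):
    # Hidden-layer outputs of one small fully connected network at input x.
    return [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
            for ws, b in zip(net["w_in"], net["b_in"])]


def predict(nets, x):
    # The estimate is the sum of the outputs of all parallel sub-networks.
    return sum(v * h
               for net in nets
               for v, h in zip(net["w_out"], subnetwork_hidden(net, x)))


def fit(xs, ys, num_networks=10, width=3, step_size=0.01, seed=0):
    rng = random.Random(seed)
    n, d = len(xs), len(xs[0])
    # Random inner weights; outer weights start at zero, so the initial
    # estimate is the zero function (assumed initialization scheme).
    nets = [{"w_in": [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(width)],
             "b_in": [rng.uniform(-1, 1) for _ in range(width)],
             "w_out": [0.0] * width}
            for _ in range(num_networks)]
    # Step count slightly above 1 / step_size (the factor 1.1 is illustrative).
    num_steps = math.ceil(1.1 / step_size)
    for _ in range(num_steps):
        grads = [{"w_in": [[0.0] * d for _ in range(width)],
                  "b_in": [0.0] * width,
                  "w_out": [0.0] * width} for _ in nets]
        # Full-batch gradient of the empirical L2 risk (mean squared residual).
        for x, y in zip(xs, ys):
            hiddens = [subnetwork_hidden(net, x) for net in nets]
            pred = sum(v * h for net, hs in zip(nets, hiddens)
                       for v, h in zip(net["w_out"], hs))
            scale = 2.0 * (pred - y) / n
            for net, g, hs in zip(nets, grads, hiddens):
                for j, h in enumerate(hs):
                    g["w_out"][j] += scale * h
                    delta = scale * net["w_out"][j] * h * (1.0 - h)
                    g["b_in"][j] += delta
                    for i in range(d):
                        g["w_in"][j][i] += delta * x[i]
        # Simultaneous gradient descent update of all parameters.
        for net, g in zip(nets, grads):
            for j in range(width):
                net["w_out"][j] -= step_size * g["w_out"][j]
                net["b_in"][j] -= step_size * g["b_in"][j]
                for i in range(d):
                    net["w_in"][j][i] -= step_size * g["w_in"][j][i]
    return nets


# Toy data: 20 samples of y = x^2 plus noise (hypothetical example data).
data_rng = random.Random(1)
xs = [[data_rng.uniform(-1, 1)] for _ in range(20)]
ys = [x[0] ** 2 + 0.1 * data_rng.gauss(0, 1) for x in xs]

nets = fit(xs, ys)
num_params = sum(len(net["w_in"]) * len(net["w_in"][0]) + len(net["b_in"])
                 + len(net["w_out"]) for net in nets)
initial_mse = sum(y ** 2 for y in ys) / len(ys)  # risk of the zero function
final_mse = sum((predict(nets, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)
```

With the defaults, the sketch has 90 parameters for 20 samples, so it is over-parametrized in the abstract's sense. The step size is kept small relative to the number of hidden units so that the gradient descent iteration stays stable. The toy run only illustrates the mechanics and says nothing about the consistency result itself.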



