Sparse additive models in high dimensions with wavelets
Scandinavian Journal of Statistics (IF 1), Pub Date: 2023-08-24, DOI: 10.1111/sjos.12680
Sylvain Sardy 1 , Xiaoyu Ma 2, 3

In multiple regression with many covariates, it is often reasonable to assume that only a small number of them carry predictive information. In some medical applications, for instance, it is believed that only a few genes out of thousands are responsible for cancer. In that case, the aim is not only to obtain a good fit, but also to select the relevant covariates (genes). We propose to perform model selection with additive models in high dimensions, where both the sample size and the number of covariates are large. Our approach is computationally efficient thanks to fast wavelet transforms. It does not rely on cross-validation, and it solves a convex optimization problem for a prescribed penalty parameter, called the quantile universal threshold. We also propose a second rule, based on Stein unbiased risk estimation, that is geared toward prediction. We use Monte Carlo simulations and real data to compare various methods in terms of false discovery rate (FDR), true positive rate (TPR), and mean squared error. Our approach is the only one that handles high dimensions, and it has a good FDR–TPR trade-off.
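
As an illustration of the selection criteria named in the abstract, the sketch below computes the false discovery and true positive proportions for one fitted model and averages them over Monte Carlo runs to estimate FDR and TPR. It is not taken from the paper: the function names `selection_metrics` and `average_metrics`, and the convention that an empty selection counts as zero false discoveries, are assumptions made for this example.

```python
def selection_metrics(selected, relevant):
    """Return (false discovery proportion, true positive proportion) for one fit.

    `selected` holds the indices of covariates the method kept;
    `relevant` holds the indices of covariates that truly matter.
    Both names are hypothetical, chosen for this illustration.
    """
    selected, relevant = set(selected), set(relevant)
    true_pos = len(selected & relevant)
    # Assumed convention: an empty selection makes no false discoveries.
    fdp = (len(selected) - true_pos) / len(selected) if selected else 0.0
    # Assumed convention: with no relevant covariates, the proportion is 0.
    tpp = true_pos / len(relevant) if relevant else 0.0
    return fdp, tpp


def average_metrics(runs):
    """Average per-run proportions over Monte Carlo runs to estimate (FDR, TPR).

    `runs` is a non-empty list of (selected, relevant) pairs, one per simulation.
    """
    pairs = [selection_metrics(s, r) for s, r in runs]
    n = len(pairs)
    return sum(p[0] for p in pairs) / n, sum(p[1] for p in pairs) / n
```

For example, if a method selects covariates `[1, 2, 3, 7]` when the relevant ones are `[1, 2, 3, 4, 5]`, one of its four selections is false and three of the five relevant covariates are recovered. A method with a good FDR–TPR trade-off keeps the first proportion low while keeping the second high.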

Updated: 2023-08-24