Subexponential-Time Algorithms for Sparse PCA
Foundations of Computational Mathematics (IF 3), Pub Date: 2023-01-19, DOI: 10.1007/s10208-023-09603-0
Yunzi Ding, Dmitriy Kunisky, Alexander S. Wein, Afonso S. Bandeira

We study the computational cost of recovering a unit-norm sparse principal component \(x \in \mathbb {R}^n\) planted in a random matrix, in either the Wigner or Wishart spiked model (observing either \(W + \lambda xx^\top \) with W drawn from the Gaussian orthogonal ensemble, or N independent samples from \(\mathcal {N}(0, I_n + \beta xx^\top )\), respectively). Prior work has shown that when the signal-to-noise ratio (\(\lambda \) or \(\beta \sqrt{N/n}\), respectively) is a small constant and the fraction of nonzero entries in the planted vector is \(\Vert x\Vert _0 / n = \rho \), it is possible to recover x in polynomial time if \(\rho \lesssim 1/\sqrt{n}\). While it is possible to recover x in exponential time under the weaker condition \(\rho \ll 1\), it is believed that polynomial-time recovery is impossible unless \(\rho \lesssim 1/\sqrt{n}\). We investigate the precise amount of time required for recovery in the “possible but hard” regime \(1/\sqrt{n} \ll \rho \ll 1\) by exploring the power of subexponential-time algorithms, i.e., algorithms running in time \(\exp (n^\delta )\) for some constant \(\delta \in (0,1)\). For any \(1/\sqrt{n} \ll \rho \ll 1\), we give a recovery algorithm with runtime roughly \(\exp (\rho ^2 n)\), demonstrating a smooth tradeoff between sparsity and runtime. Our family of algorithms interpolates smoothly between two existing algorithms: the polynomial-time diagonal thresholding algorithm and the \(\exp (\rho n)\)-time exhaustive search algorithm. Furthermore, by analyzing the low-degree likelihood ratio, we give rigorous evidence suggesting that the tradeoff achieved by our algorithms is optimal.
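
To make the Wishart setup concrete, here is a minimal Python sketch that draws N samples from \(\mathcal {N}(0, I_n + \beta xx^\top )\) with a sparse unit-norm x and then estimates the support from the diagonal of the sample covariance. It is an illustration under stated assumptions, not the authors' algorithm: the function names, the choice of \(\pm 1/\sqrt{k}\) entries for x, and the "keep the k largest diagonal entries" rule (which assumes the sparsity k is known) are all hypothetical simplifications, and the parameters are chosen to be easy rather than to match the small-constant signal-to-noise regime studied in the paper.

```python
import numpy as np

def sample_spiked_wishart(n, N, k, beta, rng):
    """Draw N samples from N(0, I_n + beta * x x^T) for a k-sparse unit-norm x.

    Assumption for illustration: nonzero entries of x are +-1/sqrt(k).
    """
    support = rng.choice(n, size=k, replace=False)
    x = np.zeros(n)
    x[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)
    # y = z + sqrt(beta) * g * x has covariance I_n + beta * x x^T
    # when z ~ N(0, I_n) and g ~ N(0, 1) are independent.
    z = rng.standard_normal((N, n))
    g = rng.standard_normal((N, 1))
    Y = z + np.sqrt(beta) * g * x
    return Y, x, np.sort(support)

def diagonal_support_estimate(Y, k):
    """Simplified diagonal-thresholding step: keep the k coordinates
    with the largest sample variance (assumes k is known)."""
    diag = np.mean(Y**2, axis=0)  # diagonal of the sample covariance
    return np.sort(np.argsort(diag)[-k:])

rng = np.random.default_rng(0)
Y, x, support = sample_spiked_wishart(n=200, N=4000, k=5, beta=5.0, rng=rng)
estimate = diagonal_support_estimate(Y, k=5)
print("true support:     ", support)
print("estimated support:", estimate)
```

On the planted support each diagonal entry of the population covariance is \(1 + \beta x_i^2\) rather than 1, which is why the diagonal alone carries signal when x is sparse enough; with these parameters the gap is large relative to the sampling noise.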




Updated: 2023-01-21