A statistical interpretation of spectral embedding: The generalised random dot product graph
The Journal of the Royal Statistical Society, Series B (Statistical Methodology) (IF 5.8), Pub Date: 2022-06-03, DOI: 10.1111/rssb.12509
Patrick Rubin‐Delanchy, Joshua Cape, Minh Tang, Carey E. Priebe

Spectral embedding is a procedure which can be used to obtain vector representations of the nodes of a graph. This paper proposes a generalisation of the latent position network model known as the random dot product graph, to allow interpretation of those vector representations as latent position estimates. The generalisation is needed to model heterophilic connectivity (e.g. ‘opposites attract’) and to cope with negative eigenvalues more generally. We show that, whether the adjacency or normalised Laplacian matrix is used, spectral embedding produces uniformly consistent latent position estimates with asymptotically Gaussian error (up to identifiability). The standard and mixed membership stochastic block models are special cases in which the latent positions take only K distinct vector values, representing communities, or live in the (K − 1)-simplex with those vertices respectively. Under the stochastic block model, our theory suggests spectral clustering using a Gaussian mixture model (rather than K-means) and, under mixed membership, fitting the minimum volume enclosing simplex, existing recommendations previously only supported under non-negative-definite assumptions. Empirical improvements in link prediction (over the random dot product graph), and the potential to uncover richer latent structure (than posited under the standard or mixed membership stochastic block models) are demonstrated in a cyber-security example.
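To make the role of negative eigenvalues concrete, here is a minimal sketch of adjacency spectral embedding on a simulated two-community heterophilic stochastic block model. It is an illustration under stated assumptions, not the authors' code: the function name `adjacency_spectral_embedding`, the block matrix `B`, the graph size `n` and the random seed are all hypothetical choices made for this example. The sketch keeps the `d` eigenvalues of largest magnitude, scales the eigenvectors by the square root of their absolute values, and records the eigenvalue signs so that edge probabilities can be reconstructed with an indefinite inner product.

```python
import numpy as np


def adjacency_spectral_embedding(A, d):
    """Embed nodes using the d eigenvalues of A that are largest in magnitude.

    Returns the n x d embedding and the signs of the selected eigenvalues.
    """
    evals, evecs = np.linalg.eigh(A)           # A is symmetric
    top = np.argsort(-np.abs(evals))[:d]       # largest |eigenvalue| first
    X_hat = evecs[:, top] * np.sqrt(np.abs(evals[top]))
    signature = np.sign(evals[top])            # +1 / -1 per dimension
    return X_hat, signature


# Illustrative heterophilic block model ("opposites attract"):
# between-community edges are more likely than within-community edges,
# so B has one positive and one negative eigenvalue (0.6 and -0.4).
rng = np.random.default_rng(0)
n = 400
B = np.array([[0.1, 0.5],
              [0.5, 0.1]])
z = np.repeat([0, 1], n // 2)                  # community labels
P = B[z][:, z]                                 # n x n edge probabilities

# Sample a symmetric, hollow adjacency matrix.
upper = np.triu(rng.random((n, n)) < P, k=1)
A = (upper + upper.T).astype(float)

X_hat, signature = adjacency_spectral_embedding(A, d=2)

# Indefinite inner product: weight each dimension by its eigenvalue sign.
P_hat_signed = (X_hat * signature) @ X_hat.T
# Ordinary dot product: treats every dimension as positive.
P_hat_dot = X_hat @ X_hat.T

err_signed = np.mean(np.abs(P_hat_signed - P))
err_dot = np.mean(np.abs(P_hat_dot - P))
print("signature:", signature)
print("mean abs error, signed inner product:", err_signed)
print("mean abs error, ordinary dot product:", err_dot)
```

In this simulation the ordinary dot product flips the sign of the negative-eigenvalue dimension, so its reconstruction of the edge probabilities is visibly worse than the signed one. For the clustering step suggested in the abstract, the rows of `X_hat` could then be passed to a Gaussian mixture model, for example scikit-learn's `GaussianMixture`, rather than K-means.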

Updated: 2022-06-03