Clustering, coding, and the concept of similarity
Annals of Mathematics and Artificial Intelligence (IF 1.2), Pub Date: 2024-03-19, DOI: 10.1007/s10472-024-09929-7
L. Thorne McCarty

This paper develops a theory of clustering and coding that combines a geometric model with a probabilistic model in a principled way. The geometric model is a Riemannian manifold with a Riemannian metric, \({g}_{ij}(\textbf{x})\), which we interpret as a measure of dissimilarity. The probabilistic model consists of a stochastic process with an invariant probability measure that matches the density of the sample input data. The link between the two models is a potential function, \(U(\textbf{x})\), and its gradient, \(\nabla U(\textbf{x})\). We use the gradient to define the dissimilarity metric, which guarantees that our measure of dissimilarity will depend on the probability measure. Finally, we use the dissimilarity metric to define a coordinate system on the embedded Riemannian manifold, which gives us a low-dimensional encoding of our original data.
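As an illustration of how a potential function can tie a stochastic process to an invariant probability measure, here is a minimal sketch. The abstract does not specify the process, so everything concrete here is an assumption: a one-dimensional diffusion dX = ∇U(X) dt + dW with unit noise, the toy potential U(x) = -x²/2, and the hypothetical helper names `grad_U`, `simulate`, and `variance`. For this assumed diffusion the invariant density is proportional to exp(2U(x)), which for the toy potential is a Gaussian with variance 1/2. It is not the paper's construction, and it does not attempt the dissimilarity metric or the coordinate system.

```python
import math
import random


def grad_U(x):
    """Gradient of the assumed toy potential U(x) = -x**2 / 2."""
    return -x


def simulate(n_steps=200_000, dt=0.01, x0=0.0, seed=0):
    """Euler-Maruyama discretization of dX = grad_U(X) dt + dW.

    This diffusion is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    noise_scale = math.sqrt(dt)  # standard deviation of one Brownian increment
    x = x0
    samples = []
    for _ in range(n_steps):
        x += grad_U(x) * dt + noise_scale * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def variance(values):
    """Population variance of a list of floats."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


samples = simulate()
burn_in = 1_000  # discard early steps taken before the chain settles
sample_var = variance(samples[burn_in:])
# For the assumed process the invariant density is proportional to
# exp(2 * U(x)) = exp(-x**2), so sample_var should be close to 0.5.
```

In this toy setting, knowing the gradient of U is enough to generate samples whose long-run density is fixed by U, which is the sense in which a potential can link a gradient-based geometric quantity to a probability measure.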




Updated: 2024-03-20