On characterizing the evolution of embedding space of neural networks using algebraic topology
Pattern Recognition Letters (IF 5.1). Pub Date: 2024-02-07. DOI: 10.1016/j.patrec.2024.02.003
S. Suresh, B. Das, V. Abrol, S. Dutta Roy

We study how the topology of the feature embedding space changes as data passes through the layers of a well-trained deep neural network (DNN), using Betti numbers. Motivated by existing studies that apply simplicial complexes to shallow fully connected networks (FCNs), we present an extended analysis based on cubical homology instead, covering a variety of popular deep architectures and real image datasets. We demonstrate that as depth increases, a topologically complicated dataset is transformed into a simple one, with the Betti numbers attaining their lowest possible values. The rate of decay in topological complexity, used as a metric, helps quantify the impact of architectural choices on generalization ability. Interestingly, from a representation learning perspective, we highlight several invariances, such as the topological invariance of (1) an architecture across similar datasets; (2) the embedding space of a dataset across architectures of varying depth; (3) the embedding space with respect to input resolution/size; and (4) data sub-sampling. To further demonstrate the link between the expressivity and the generalization capability of a network, we consider the task of ranking pre-trained models for a downstream classification task (transfer learning). Compared to existing approaches, the proposed metric correlates better with the accuracy actually achievable by fine-tuning the pre-trained model.
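The core quantity in the abstract, Betti numbers of a layer's embedding computed via cubical homology, can be sketched with GUDHI's CubicalComplex. The sketch below is illustrative only, not the authors' pipeline: the 2-D PCA projection, the grid resolution, and the sublevel-set filtration on a negated density histogram are all assumptions made to keep the example self-contained, and betti_numbers_of_embedding is a hypothetical helper name.

```python
import numpy as np
import gudhi


def betti_numbers_of_embedding(features, bins=32):
    """Estimate Betti numbers of a 2-D projection of a feature embedding.

    Hypothetical helper: project the activations to 2-D, histogram them
    onto a regular grid, and take sublevel-set cubical homology of the
    negated density so that dense regions enter the filtration first.
    """
    # Project to 2-D with PCA (an illustrative choice, not the paper's).
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[:2].T

    # Density histogram on a regular grid -> top-dimensional cells.
    hist, _, _ = np.histogram2d(proj[:, 0], proj[:, 1], bins=bins)
    filtration = -hist  # dense cells get low filtration values

    # Cubical complex over the grid; persistence must be computed
    # before Betti numbers can be read off.
    cc = gudhi.CubicalComplex(top_dimensional_cells=filtration)
    cc.persistence()
    return cc.betti_numbers()  # [b0, b1] for a 2-D complex


# Usage sketch: random activations stand in for a real layer's output.
feats = np.random.randn(1000, 64)
print(betti_numbers_of_embedding(feats))
```

Tracking how these numbers shrink from layer to layer would then give the decay-rate style metric the abstract describes, though the paper's exact filtration and dimensionality choices may differ.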

Updated: 2024-02-07