On Human-like Biases in Convolutional Neural Networks for the Perception of Slant from Texture,ACM Transactions on Applied Perception

ACM Transactions on Applied Perception (IF 1.6), Pub Date: 2023-10-25, DOI: 10.1145/3613451
Yuanhao Wang₁, Qian Zhang₁, Celine Aubuchon₂, Jovan Kemp₂, Fulvio Domini₂, James Tompkin₁

Depth estimation is fundamental to 3D perception, and humans are known to have biased estimates of depth. This study investigates whether convolutional neural networks (CNNs) can be biased when predicting the sign of curvature and the depth of textured surfaces under different viewing conditions (field of view) and surface parameters (slant and texture irregularity). This hypothesis is drawn from the idea that texture gradients described by local neighborhoods, a cue identified in the human vision literature, are also representable within convolutional neural networks. To this end, we trained both unsupervised and supervised CNN models on renderings of slanted surfaces with random polka dot patterns and analyzed their internal latent representations. The results show that the unsupervised models exhibit prediction biases similar to those of humans across all experiments, while the supervised CNN models do not. The latent spaces of the unsupervised models can be linearly separated into axes representing field of view and optical slant. For supervised models, this ability varies substantially with model architecture and the kind of supervision (continuous slant vs. sign of slant). Even though this study says nothing about any shared mechanism, these findings suggest that unsupervised CNN models can make predictions similar to those of the human visual system. Code: github.com/brownvc/Slant-CNN-Biases.
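
The claim that a latent space can be linearly separated into field-of-view and slant axes is commonly checked with a linear probe. The sketch below is only an illustration of that general idea and is not taken from the authors' repository: the latent vectors are synthetic stand-ins for an encoder's output, and the function names, value ranges, latent dimension, and noise level are all assumptions made for this example.

```python
import numpy as np


def add_bias(latents):
    """Append a constant column so the probe can learn an offset."""
    return np.hstack([latents, np.ones((latents.shape[0], 1))])


def fit_linear_probe(latents, targets):
    """Least-squares linear map from latent codes to target factors."""
    weights, *_ = np.linalg.lstsq(add_bias(latents), targets, rcond=None)
    return weights


def r2_per_factor(latents, targets, weights):
    """Coefficient of determination of the probe, one value per factor."""
    pred = add_bias(latents) @ weights
    ss_res = ((targets - pred) ** 2).sum(axis=0)
    ss_tot = ((targets - targets.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


def make_synthetic_latents(n=2000, dim=16, noise=0.05, seed=0):
    """Hypothetical data: latents are a noisy linear mix of slant and FOV.

    A real analysis would replace this with codes produced by a trained
    encoder on rendered images.
    """
    rng = np.random.default_rng(seed)
    slant = rng.uniform(-60.0, 60.0, size=n)  # degrees (assumed range)
    fov = rng.uniform(10.0, 60.0, size=n)     # degrees (assumed range)
    factors = np.stack([slant, fov], axis=1)
    standardized = (factors - factors.mean(axis=0)) / factors.std(axis=0)
    mixing = rng.normal(size=(2, dim))
    latents = standardized @ mixing + noise * rng.normal(size=(n, dim))
    return latents, factors


latents, factors = make_synthetic_latents()
train, test = slice(0, 1500), slice(1500, None)

weights = fit_linear_probe(latents[train], factors[train])
r2_slant, r2_fov = r2_per_factor(latents[test], factors[test], weights)
print(f"held-out R^2: slant={r2_slant:.3f}, fov={r2_fov:.3f}")
```

A high held-out R² for both factors would indicate that each can be read out by a linear map from the latent code. With real encoder outputs, a low score for one factor would suggest that factor is not linearly decodable from that representation.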

Updated: 2023-10-25
