A contrastive learning based unsupervised multi-view stereo with multi-stage self-training strategy
Displays (IF 4.3) Pub Date: 2024-02-24, DOI: 10.1016/j.displa.2024.102672
Zihang Wang, Haonan Luo, Xiang Wang, Jin Zheng, Xin Ning, Xiao Bai

In recent years, unsupervised multi-view stereo (MVS) methods have achieved remarkable success, producing results comparable to earlier supervised work. However, because unsupervised MVS uses image reconstruction as its pretext task, it faces two critical drawbacks: the RGB value, which serves as the image measurement, is not robust across views under complicated conditions such as varying illumination, and the reconstruction error itself does not reflect the quality of the depth estimate linearly. These problems cause the actual optimization goal to diverge from the intended one and can therefore impair training. To enhance the robustness of the pretext task, we propose a contrastive-learning-based constraint. The constraint adds featuremetric consistency across views by forcing the features of matching points to be similar and the features of unmatched points to be dissimilar. To make the overall training objective behave more linearly, we propose a multi-stage training strategy that uses pseudo labels as supervision after an initial unsupervised training stage. In addition, we adopt an iterative optimizer, which has proven highly effective in supervised MVS, to accelerate training. Finally, a series of experiments on the DTU and Tanks and Temples datasets demonstrates the efficiency and robustness of our method compared with state-of-the-art methods in terms of accuracy, completeness, and speed.
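The abstract describes two concrete mechanisms: a contrastive featuremetric-consistency constraint across views, and a multi-stage self-training schedule driven by pseudo labels. Since only the abstract is reproduced here, the two sketches below are one plausible reading of those ideas rather than the authors' implementation; the InfoNCE-style loss, the tensor shapes, the model interface (a call returning a depth map and a confidence map), and hyper-parameters such as temperature, num_negatives and conf_thresh are all illustrative assumptions.

import torch
import torch.nn.functional as F

def contrastive_featuremetric_loss(ref_feat, src_feat_warped, valid_mask,
                                   temperature=0.07, num_negatives=64):
    """InfoNCE-style cross-view featuremetric constraint (illustrative sketch).

    ref_feat:        (B, C, H, W) features of the reference view.
    src_feat_warped: (B, C, H, W) source-view features warped into the reference
                     frame with the current depth estimate, so the same pixel
                     location forms a matching point across the two views.
    valid_mask:      (B, 1, H, W), 1 where the warp lands inside the source image.
    """
    B, C, H, W = ref_feat.shape
    ref = F.normalize(ref_feat.flatten(2), dim=1)          # (B, C, H*W)
    src = F.normalize(src_feat_warped.flatten(2), dim=1)    # (B, C, H*W)
    mask = valid_mask.flatten(2).squeeze(1) > 0.5            # (B, H*W)

    # Positive pairs: the same pixel location in the two aligned feature maps.
    pos = (ref * src).sum(dim=1, keepdim=True)               # (B, 1, H*W)

    # Negative pairs: each reference pixel against randomly drawn source pixels.
    idx = torch.randint(0, H * W, (B, num_negatives), device=ref.device)
    neg_bank = torch.gather(src, 2, idx.unsqueeze(1).expand(-1, C, -1))  # (B, C, K)
    neg = torch.einsum('bcn,bck->bkn', ref, neg_bank)         # (B, K, H*W)

    # Cross-entropy over [positive, negatives]: pull matched features together,
    # push features of unmatched points apart.
    logits = torch.cat([pos, neg], dim=1) / temperature        # (B, 1+K, H*W)
    labels = torch.zeros(B, H * W, dtype=torch.long, device=ref.device)
    loss = F.cross_entropy(logits, labels, reduction='none')   # (B, H*W)
    return (loss * mask).sum() / mask.sum().clamp(min=1)

The multi-stage strategy can likewise be summarized as an unsupervised warm-up followed by rounds of pseudo-label generation and pseudo-supervised fine-tuning. The stage count, epoch budget, loss-function arguments and confidence filtering below are again assumptions made only to keep the sketch concrete.

import torch

def multi_stage_self_training(model, batches, optimizer, unsup_loss_fn, sup_loss_fn,
                              num_stages=3, epochs_per_stage=4, conf_thresh=0.8):
    """Hypothetical outline: unsupervised warm-up, then pseudo-label self-training."""
    # Stage 1: purely unsupervised training (reconstruction + contrastive terms).
    model.train()
    for _ in range(epochs_per_stage):
        for batch in batches:
            optimizer.zero_grad()
            unsup_loss_fn(model(batch['imgs'], batch['cams']), batch).backward()
            optimizer.step()

    for _ in range(num_stages - 1):
        # Generate pseudo depth labels with the current model, keeping only
        # predictions whose confidence passes the threshold.
        model.eval()
        pseudo = []
        with torch.no_grad():
            for batch in batches:
                depth, conf = model(batch['imgs'], batch['cams'])
                pseudo.append((depth.detach(), conf > conf_thresh))

        # Next stage: train against the filtered pseudo labels as supervision.
        model.train()
        for _ in range(epochs_per_stage):
            for batch, (pl_depth, pl_mask) in zip(batches, pseudo):
                optimizer.zero_grad()
                depth, _ = model(batch['imgs'], batch['cams'])
                sup_loss_fn(depth, pl_depth, pl_mask).backward()
                optimizer.step()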
