-
I2DFormer+: Learning Image to Document Summary Attention for Zero-Shot Image Classification Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-24 Muhammad Ferjad Naeem, Yongqin Xian, Luc Van Gool, Federico Tombari
-
Integrated Heterogeneous Graph Attention Network for Incomplete Multi-modal Clustering Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-24 Yu Wang, Xinjie Yao, Pengfei Zhu, Weihao Li, Meng Cao, Qinghua Hu
-
WildCLIP: Scene and Animal Attribute Retrieval from Camera Trap Data with Domain-Adapted Vision-Language Models Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-24 Valentin Gabeff, Marc Rußwurm, Devis Tuia, Alexander Mathis
-
An Open-World, Diverse, Cross-Spatial-Temporal Benchmark for Dynamic Wild Person Re-Identification Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-24 Lei Zhang, Xiaowei Fu, Fuxiang Huang, Yi Yang, Xinbo Gao
-
Position, Padding and Predictions: A Deeper Look at Position Information in CNNs Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-24 Md Amirul Islam, Matthew Kowal, Sen Jia, Konstantinos G. Derpanis, Neil D. B. Bruce
-
Descriptor Distillation: A Teacher-Student-Regularized Framework for Learning Local Descriptors Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-24 Yuzhen Liu, Qiulei Dong
-
MutualFormer: Multi-modal Representation Learning via Cross-Diffusion Attention Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-24 Xixi Wang, Xiao Wang, Bo Jiang, Jin Tang, Bin Luo
-
Multimodal Machine Learning in Image-Based and Clinical Biomedicine: Survey and Prospects Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-23 Elisa Warner, Joonsang Lee, William Hsu, Tanveer Syeda-Mahmood, Charles E. Kahn Jr., Olivier Gevaert, Arvind Rao
-
VNAS: Variational Neural Architecture Search Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-23 Benteng Ma, Jing Zhang, Yong Xia, Dacheng Tao
-
Augmenting the Softmax with Additional Confidence Scores for Improved Selective Classification with Out-of-Distribution Data Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-23 Guoxuan Xia, Christos-Savvas Bouganis
-
On Finite Difference Jacobian Computation in Deformable Image Registration Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-18 Yihao Liu, Junyu Chen, Shuwen Wei, Aaron Carass, Jerry Prince
-
Learning with Noisy Correspondence Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-13 Zhenyu Huang, Peng Hu, Guocheng Niu, Xinyan Xiao, Jiancheng Lv, Xi Peng
-
Ensemble Quadratic Assignment Network for Graph Matching Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-13 Haoru Tan, Chuang Wang, Sitong Wu, Xu-Yao Zhang, Fei Yin, Cheng-Lin Liu
-
Error-Aware Conversion from ANN to SNN via Post-training Parameter Calibration Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-08 Yuhang Li, Shikuang Deng, Xin Dong, Shi Gu
-
CRetinex: A Progressive Color-Shift Aware Retinex Model for Low-Light Image Enhancement Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-08 Han Xu, Hao Zhang, Xunpeng Yi, Jiayi Ma
-
FSODv2: A Deep Calibrated Few-Shot Object Detection Network Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-04 Qi Fan, Wei Zhuo, Chi-Keung Tang, Yu-Wing Tai
Traditional methods for object detection typically necessitate a substantial amount of training data, and creating high-quality training data is time-consuming. We propose a novel Few-Shot Object Detection network (FSODv2) in this paper that aims to detect objects from previously unseen categories using only a few annotated examples. Attention RPN, Multi-Relation Detector, and Contrastive Training
-
EATFormer: Improving Vision Transformer Inspired by Evolutionary Algorithm Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-02
Motivated by biological evolution, this paper explains the rationality of Vision Transformer by analogy with the proven practical evolutionary algorithm (EA) and derives that both have consistent mathematical formulation. Then inspired by effective EA variants, we propose a novel pyramid EATFormer backbone that only contains the proposed EA-based transformer (EAT) block, which consists of
-
MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-02 Jianbin Zheng, Daqing Liu, Chaoyue Wang, Minghui Hu, Zuopeng Yang, Changxing Ding, Dacheng Tao
Existing multimodal conditional image synthesis (MCIS) methods generate images conditioned on any combinations of various modalities that require all of them must be exactly conformed, hindering the synthesis controllability and leaving the potential of cross-modality under-exploited. To this end, we propose to generate images conditioned on the compositions of multimodal control signals, where modalities
-
SegViT v2: Exploring Efficient and Continual Semantic Segmentation with Plain Vision Transformers Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-04-01 Bowen Zhang, Liyang Liu, Minh Hieu Phan, Zhi Tian, Chunhua Shen, Yifan Liu
-
Pictorial and Apictorial Polygonal Jigsaw Puzzles from Arbitrary Number of Crossing Cuts Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-22 Peleg Harel, Ofir Itzhak Shahar, Ohad Ben-Shahar
-
UrbanEvolver: Function-Aware Urban Layout Regeneration Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-19 Yiming Qin, Nanxuan Zhao, Jiale Yang, Siyuan Pan, Bin Sheng, Rynson W. H. Lau
-
Vision-Language Alignment Learning Under Affinity and Divergence Principles for Few-Shot Out-of-Distribution Generalization Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-18 Lin Zhu, Weihan Yin, Yiyao Yang, Fan Wu, Zhaoyu Zeng, Qinying Gu, Xinbing Wang, Chenghu Zhou, Nanyang Ye
-
Softmax-Free Linear Transformers Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-13 Jiachen Lu, Junge Zhang, Xiatian Zhu, Jianfeng Feng, Tao Xiang, Li Zhang
-
One-Shot Neural Face Reenactment via Finding Directions in GAN’s Latent Space Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-13
In this paper, we present our framework for neural face/head reenactment whose goal is to transfer the 3D head orientation and expression of a target face to a source face. Previous methods focus on learning embedding networks for identity and head pose/expression disentanglement which proves to be a rather hard task, degrading the quality of the generated images. We take a different approach
-
PL₁P: Point-Line Minimal Problems under Partial Visibility in Three Views Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-10
We present a complete classification of minimal problems for generic arrangements of points and lines in space observed partially by three calibrated perspective cameras when each line is incident to at most one point. This is a large class of interesting minimal problems that allows missing observations in images due to occlusions and missed detections. There is an infinite number of such
-
Deep Learning Technique for Human Parsing: A Survey and Outlook Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-09 Lu Yang, Wenhe Jia, Shan Li, Qing Song
-
Unsupervised Point Cloud Representation Learning by Clustering and Neural Rendering Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-08 Guofeng Mei, Cristiano Saltori, Elisa Ricci, Nicu Sebe, Qiang Wu, Jian Zhang, Fabio Poiesi
-
Adaptive Multi-Source Predictor for Zero-Shot Video Object Segmentation Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-07 Xiaoqi Zhao, Shijie Chang, Youwei Pang, Jiaxing Yang, Lihe Zhang, Huchuan Lu
-
Open Set Recognition in Real World Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-07 Zhen Yang, Jun Yue, Pedram Ghamisi, Shiliang Zhang, Jiayi Ma, Leyuan Fang
-
Does Confusion Really Hurt Novel Class Discovery? Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-07
When sampling data of specific classes (i.e., known classes) for a scientific task, collectors may encounter unknown classes (i.e., novel classes). Since these novel classes might be valuable for future research, collectors will also sample them and assign them to several clusters with the help of known-class data. This assigning process is known as novel class discovery (NCD). However, category
-
Domain Generalization with Small Data Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-06 Kecheng Chen, Elena Gal, Hong Yan, Haoliang Li
-
A Survey on Global LiDAR Localization: Challenges, Advances and Open Problems Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-06 Huan Yin, Xuecheng Xu, Sha Lu, Xieyuanli Chen, Rong Xiong, Shaojie Shen, Cyrill Stachniss, Yue Wang
-
CBNet: A Plug-and-Play Network for Segmentation-Based Scene Text Detection Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-05 Xi Zhao, Wei Feng, Zheng Zhang, Jingjing Lv, Xin Zhu, Zhangang Lin, Jinghe Hu, Jingping Shao
-
Automated Detection of Cat Facial Landmarks Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-05
The field of animal affective computing is rapidly emerging, and analysis of facial expressions is a crucial aspect. One of the most significant challenges that researchers in the field currently face is the scarcity of high-quality, comprehensive datasets that allow the development of models for facial expressions analysis. One of the possible approaches is the utilisation of facial landmarks
-
PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-04
We present the PanAf20K dataset, the largest and most diverse open-access annotated video dataset of great apes in their natural environment. It comprises more than 7 million frames across approximately 20,000 camera trap videos of chimpanzees and gorillas collected at 18 field sites in tropical Africa as part of the Pan African Programme: The Cultured Chimpanzee. The footage is accompanied by
-
Cross-Modal Fusion and Progressive Decoding Network for RGB-D Salient Object Detection Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-02 Xihang Hu, Fuming Sun, Jing Sun, Fasheng Wang, Haojie Li
-
Uncertainty Modeling for Group Re-Identification Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-01 Quan Zhang, Jianhuang Lai, Zhanxiang Feng, Xiaohua Xie
-
SplatFlow: Learning Multi-frame Optical Flow via Splatting Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-29
The occlusion problem remains a crucial challenge in optical flow estimation (OFE). Despite the recent significant progress brought about by deep learning, most existing deep learning OFE methods still struggle to handle occlusions; in particular, those based on two frames cannot correctly handle occlusions because occluded regions have no visual correspondences. However, there is still hope
-
A Survey on Adaptive Cameras Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-28 Julien Ducrocq, Guillaume Caron
-
Estimation of Near-Instance-Level Attribute Bottleneck for Zero-Shot Learning Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-27 Chenyi Jiang, Yuming Shen, Dubing Chen, Haofeng Zhang, Ling Shao, Philip H. S. Torr
-
Correction to: Deep Unpaired Blind Image Super-Resolution Using Self-supervised Learning and Exemplar Distillation Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-26 Jiangxin Dong, Haoran Bai, Jinhui Tang, Jinshan Pan
-
Training Object Detectors from Scratch: An Empirical Study in the Era of Vision Transformer Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-26 Weixiang Hong, Wang Ren, Jiangwei Lao, Lele Xie, Liheng Zhong, Jian Wang, Jingdong Chen, Honghai Liu, Wei Chu
-
Robust Heterogeneous Model Fitting for Multi-source Image Correspondences Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-23 Shuyuan Lin, Feiran Huang, Taotao Lai, Jianhuang Lai, Hanzi Wang, Jian Weng
-
FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-23 Zhi-Song Liu, Robin Courant, Vicky Kalogeiton
-
Learning to Generalize over Subpartitions for Heterogeneity-Aware Domain Adaptive Nuclei Segmentation Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-22 Jianan Fan, Dongnan Liu, Hang Chang, Weidong Cai
-
UniMod1K: Towards a More Universal Large-Scale Dataset and Benchmark for Multi-modal Learning Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-22 Xue-Feng Zhu, Tianyang Xu, Zongtao Liu, Zhangyong Tang, Xiao-Jun Wu, Josef Kittler
-
Semantic-Aligned Matching for Enhanced DETR Convergence and Multi-Scale Feature Fusion Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-20 Gongjie Zhang, Zhipeng Luo, Jiaxing Huang, Shijian Lu, Eric P. Xing
-
Cross-Architecture Knowledge Distillation Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-19 Yufan Liu, Jiajiong Cao, Bing Li, Weiming Hu, Jingting Ding, Liang Li, Stephen Maybank
-
Hugs Bring Double Benefits: Unsupervised Cross-Modal Hashing with Multi-granularity Aligned Transformers Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-18
Unsupervised cross-modal hashing (UCMH) has been commonly explored to support large-scale cross-modal retrieval of unlabeled data. Despite promising progress, most existing approaches are developed on convolutional neural network and multilayer perceptron architectures, sacrificing the quality of hash codes due to limited capacity for excavating multi-modal semantics. To pursue better content
-
Annotation-Free Human Sketch Quality Assessment Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-17 Lan Yang, Kaiyue Pang, Honggang Zhang, Yi-Zhe Song
-
MixStyle Neural Networks for Domain Generalization and Adaptation Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-01 Kaiyang Zhou, Yongxin Yang, Yu Qiao, Tao Xiang
-
Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-01 Wei Zhai, Pingyu Wu, Kai Zhu, Yang Cao, Feng Wu, Zheng-Jun Zha
-
3D Adversarial Augmentations for Robust Out-of-Domain Predictions Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-03-01 Alexander Lehner, Stefano Gasperini, Alvaro Marcos-Ramiro, Michael Schmidt, Nassir Navab, Benjamin Busam, Federico Tombari
-
ReliTalk: Relightable Talking Portrait Generation from a Single Video Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-16
Recent years have witnessed great progress in creating vivid audio-driven portraits from monocular videos. However, how to seamlessly adapt the created video avatars to other scenarios with different backgrounds and lighting conditions remains unsolved. On the other hand, existing relighting studies mostly rely on dynamically lighted or multi-view data, which are too expensive for creating
-
A New Dataset and a Distractor-Aware Architecture for Transparent Object Tracking Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-16 Alan Lukežič, Žiga Trojer, Jiří Matas, Matej Kristan
-
Learning Adaptive Spatio-Temporal Inference Transformer for Coarse-to-Fine Animal Visual Tracking: Algorithm and Benchmark Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-12 Tianyang Xu, Ze Kang, Xuefeng Zhu, Xiao-Jun Wu
-
Benchmarking the Robustness of LiDAR Semantic Segmentation Models Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-12 Xu Yan, Chaoda Zheng, Ying Xue, Zhen Li, Shuguang Cui, Dengxin Dai
-
Are Multi-view Edges Incomplete for Depth Estimation? Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-12 Numair Khan, Min H. Kim, James Tompkin
-
Relative Norm Alignment for Tackling Domain Shift in Deep Multi-modal Classification Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-09 Mirco Planamente, Chiara Plizzari, Simone Alberto Peirone, Barbara Caputo, Andrea Bottino
-
Focus for Free in Density-Based Counting Int. J. Comput. Vis. (IF 19.5) Pub Date : 2024-02-09 Zenglin Shi, Pascal Mettes, Cees G. M. Snoek