-
Detail-aware image denoising via structure preserved network and residual diffusion model Vis. Comput. (IF 3.5) Pub Date : 2024-04-18 Jing Wu, Hao Wu, Guowu Yuan
-
RESTHT: relation-enhanced spatial–temporal hierarchical transformer for video captioning Vis. Comput. (IF 3.5) Pub Date : 2024-04-18 Lihuan Zheng, Wanru Xu, Zhenjiang Miao, Xinxiu Qiu, Shanshan Gong
-
Feature distribution normalization network for multi-view stereo Vis. Comput. (IF 3.5) Pub Date : 2024-04-17 Ziyang Chen, Yang Zhao, Junling He, Yujie Lu, Zhongwei Cui, Wenting Li, Yongjun Zhang
-
Robust corner detection in continuous space Vis. Comput. (IF 3.5) Pub Date : 2024-04-17 Xiyu Wei, Yanmei Dong, Qin Liu, Lei Wang, Liantang Lou
-
Military education in extended reality (XR): learning troublesome knowledge through immersive experiential application Vis. Comput. (IF 3.5) Pub Date : 2024-04-17 Jose Garcia Estrada, Ekaterina Prasolova-Førland, Stian Kjeksrud, Chryssa Themelis, Petter Lindqvist, Kristine Kvam, Ole Midthun, Knut Sverre, Leif Martin Hokstad, Soud Khalifa Mohamed, Simone Grassini, Serena Ricci
-
GFPE-ViT: vision transformer with geometric-fractal-based position encoding Vis. Comput. (IF 3.5) Pub Date : 2024-04-17 Lei Wang, Xue-song Tang, Kuangrong Hao
-
A personalized insertion centers preoperative positioning method for minimally invasive surgery of cruciate ligament reconstruction Vis. Comput. (IF 3.5) Pub Date : 2024-04-17 Hui Liu, Pengxi Li, Dongpei Liu, Bocheng Zhang, Jieshu Ren, Yichao Wang, Hongyu Li, Jianxin Zhang, Liang Yang, Bin Liu
-
Point clouds feature frequency domain analysis based on multilayer perceptron Vis. Comput. (IF 3.5) Pub Date : 2024-04-17 Can Zhang, Feipeng Da, Shaoyan Gai
-
Topology-guided accelerated vector field streamline visualization Vis. Comput. (IF 3.5) Pub Date : 2024-04-16 Hao Zhou, Junjie Yin, Yilun Yang, Meie Fang, Ping Li
-
Instructor emotion recognition system using manta ray foraging algorithm for improving the content delivery in video lecture Vis. Comput. (IF 3.5) Pub Date : 2024-04-16 Sameer Bhimrao Patil, Suresh Shirgave
-
Structural self-contrast learning based on adaptive weighted negative samples for facial expression recognition Vis. Comput. (IF 3.5) Pub Date : 2024-04-15 Huihui Li, Junhao Zhu, Guihua Wen, Haoyang Zhong
-
Per-class curriculum for Unsupervised Domain Adaptation in semantic segmentation Vis. Comput. (IF 3.5) Pub Date : 2024-04-15 Roberto Alcover-Couso, Juan C. SanMiguel, Marcos Escudero-Viñolo, Pablo Carballeira
-
Automated fabric defect detection using multi-scale fusion MemAE Vis. Comput. (IF 3.5) Pub Date : 2024-04-14 Kun Wu, Lei Zhu, Weihang Shi, Wenwu Wang
-
QEAN: quaternion-enhanced attention network for visual dance generation Vis. Comput. (IF 3.5) Pub Date : 2024-04-14 Zhizhen Zhou, Yejing Huo, Guoheng Huang, An Zeng, Xuhang Chen, Lian Huang, Zinuo Li
-
Underwater image sharpening and color correction via dataset based on revised underwater image formation model Vis. Comput. (IF 3.5) Pub Date : 2024-04-13 Shunsuke Takao
-
A novel approach for improving open scene text translation with modified GAN Vis. Comput. (IF 3.5) Pub Date : 2024-04-13 Yasmeen Cheema, Muhammad Nadeem Cheema, Anam Nazir, Fahad Ahmed Khokhar, Ping Li, Ayaz Ahmed
-
Twinenet: coupling features for synthesizing volume rendered images via convolutional encoder–decoders and multilayer perceptrons Vis. Comput. (IF 3.5) Pub Date : 2024-04-12 Shengzhou Luo, Jingxing Xu, John Dingliana, Mingqiang Wei, Lu Han, Lewei He, Jiahui Pan
-
Collaborative neural radiance fields for novel view synthesis Vis. Comput. (IF 3.5) Pub Date : 2024-04-12 Junqing Yuan, Mengting Fan, Zhenyang Liu, Tongxuan Han, Zhenzhong Kuang, Chihao Pan, Jiajun Ding
-
A dual-branch feature fusion neural network for fish image fine-grained recognition Vis. Comput. (IF 3.5) Pub Date : 2024-04-10 Xu Geng, Jinxiong Gao, Yonghui Zhang, Rong Wang
-
Visual question answering on blood smear images using convolutional block attention module powered object detection Vis. Comput. (IF 3.5) Pub Date : 2024-04-09 A. Lubna, Saidalavi Kalady, A. Lijiya
-
An improved residual learning model and its application to hardware image classification Vis. Comput. (IF 3.5) Pub Date : 2024-04-09 Zhentao Zhang, Wenhao Li, Yuxi Cheng, Qingnan Huang, Taorong Qiu
-
Content-based image retrieval through fusion of deep features extracted from segmented neutrosophic using depth map Vis. Comput. (IF 3.5) Pub Date : 2024-04-09 Fatemeh Taheri, Kambiz Rahbar, Ziaeddin Beheshtifard
-
A cascaded graph convolutional network for point cloud completion Vis. Comput. (IF 3.5) Pub Date : 2024-04-09 Luhan Wang, Jun Li, Shangwei Guo, Shaokun Han
-
A survey on soccer player detection and tracking with videos Vis. Comput. (IF 3.5) Pub Date : 2024-04-08 Chao Yang, Meng Yang, Hongyu Li, Linlu Jiang, Xiang Suo, Lijuan Mao, Weiliang Meng, Zhen Li
-
Learnable scene prior for point cloud semantic segmentation Vis. Comput. (IF 3.5) Pub Date : 2024-04-08 Yuanhao Chai, Jingyu Gong, Xin Tan, Jiachen Xu, Yuan Xie, Lizhuang Ma
-
Diff-pcg: diffusion point cloud generation conditioned on continuous normalizing flow Vis. Comput. (IF 3.5) Pub Date : 2024-04-08 Ting Yu, Weiliang Meng, Zhongqi Wu, Jianwei Guo, Xiaopeng Zhang
-
Coarse-to-fine cascaded 3D hand reconstruction based on SSGC and MHSA Vis. Comput. (IF 3.5) Pub Date : 2024-04-08 Wenji Yang, Liping Xie, Wenbin Qian, Canghai Wu, Hongyun Yang
-
Image classification with consistency-regularized bad semi-supervised generative adversarial networks: a visual data analysis and synthesis Vis. Comput. (IF 3.5) Pub Date : 2024-04-06
Semi-supervised learning, which entails training a model with manually labeled images and pseudo-labels for unlabeled images, has garnered considerable attention for its potential to improve image classification performance. Nevertheless, incorrect decision boundaries of classifiers and wrong pseudo-labels for beneficial unlabeled images below the confidence threshold increase the generalization
-
Facial expression recognition based on local–global information reasoning and spatial distribution of landmark features Vis. Comput. (IF 3.5) Pub Date : 2024-04-06
In the field of facial expression recognition (FER), two main trends exist: data-driven FER and feature-driven FER. The former focuses on data problems (e.g., sample imbalance and multimodal fusion), while the latter explores facial expression features. As feature-driven FER is more important than data-driven FER, for deeper mining of facial features, we propose
-
OBB detector: occluded object detection based on geometric modeling of video frames Vis. Comput. (IF 3.5) Pub Date : 2024-04-04 Supriya Agrawal, Prachi Natu
Object detection is an important research area in video surveillance systems, aimed at identifying and locating target objects within recorded scenes. Many object detectors fail under partial occlusion, in which only some features of the objects are visible due to overlapping bounding boxes. This situation can result in miscounting the objects and misaligning the bounding boxes, leading to
-
Infrared tracking for accurate localization by capturing global context information Vis. Comput. (IF 3.5) Pub Date : 2024-04-04 Zhixuan Tang, Haiyun Shen, Peng Yu, Kaisong Zhang, Jianyu Chen
-
Video anomaly detection based on attention and efficient spatio-temporal feature extraction Vis. Comput. (IF 3.5) Pub Date : 2024-04-04 Seyed Mohammad Rahimpour, Mohammad Kazemi, Payman Moallem, Mehran Safayani
-
A new multi-focus image fusion quality assessment method with convolutional sparse representation Vis. Comput. (IF 3.5) Pub Date : 2024-04-03 Yanxiang Hu, Panpan Wu, Bo Zhang, Wenhao Sun, Yaru Gao, Caixia Hao, Xinran Chen
Assessing image fusion quality purposefully is a challenging task due to the diversity of fused features. In this work, a specific multi-focus image fusion quality assessment method is proposed based on joint image layering and convolutional sparse representation. Specifically, the proposed method includes two stages: Tikhonov regularization optimization-based joint image layering and convolutional
-
StylePart: image-based shape part manipulation Vis. Comput. (IF 3.5) Pub Date : 2024-04-02
Direct part-level manipulation of man-made shapes in an image is desired given its simplicity. However, it is not intuitive given the existing manually created cuboid and cylinder controllers. To tackle this problem, we present StylePart, a framework that enables direct shape manipulation of an image by leveraging generative models of both images and 3D shapes. Our key contribution is a shape-consistent
-
Boosted verification using siamese neural network with DiffBlock Vis. Comput. (IF 3.5) Pub Date : 2024-04-02 Junjie Liu, Junlong Liu, Rongxin Jiang, Boxuan Gu, Yaowu Chen, Chen Shen
In face recognition and in person and vehicle re-identification tasks, different networks and losses have been proposed to learn better features, which further maximize the decision margin in the feature space. Despite the promising progress that has been made, it remains a challenge to discriminate between different but similar targets while recognizing the same but dissimilar objects, which results from
-
Unpaired semantic neural person image synthesis Vis. Comput. (IF 3.5) Pub Date : 2024-04-02 Yixiu Liu, Tao Jiang, Pengju Si, Shangdong Zhu, Chenggang Yan, Shuai Wang, Haibing Yin
Pose-guided person image synthesis is a challenging task that aims to generate photo-realistic images of a person with the same appearance as a source image but the same pose as a target image. Existing methods often suffer from noticeable artifacts due to the omission of multi-view information, and the requirement for paired source–target images in certain methods during training further limits the
-
SSNet: a joint learning network for semantic segmentation and disparity estimation Vis. Comput. (IF 3.5) Pub Date : 2024-04-02 Dayu Jia, Yanwei Pang, Jiale Cao, Pan Jing
Joint learning for semantic segmentation and disparity estimation is applied to scene parsing for mutual benefit. However, existing joint learning approaches unify the two tasks simply, which may result in negative feature mixing. To solve this problem, a win–win approach, Stereo Semantic Network (SSNet), is proposed for pixel-wise scene parsing. SSNet is the first Transformer-based end-to-end
-
A novel highland and freshwater-circumstance dataset: advancing underwater image enhancement Vis. Comput. (IF 3.5) Pub Date : 2024-04-01
As an important underlying visual processing task, underwater image enhancement has received a lot of attention from researchers due to its importance in marine engineering and lake ecosystem optimization. However, the various underwater image enhancement algorithms that have been proposed are evaluated mainly on marine water body datasets, and it is not clear whether these algorithms
-
Encoder-decoder networks with guided transmission map for effective image dehazing Vis. Comput. (IF 3.5) Pub Date : 2024-04-01
A plain-architecture and effective image dehazing scheme, called Encoder-Decoder Network with Guided Transmission Map (EDN-GTM), is proposed in this paper. Nowadays, neural networks are often built on complex architectures and modules, which inherently prevent them from being efficiently deployed on general mobile platforms that are not integrated with the latest deep learning operators
-
HGMVAE: hierarchical disentanglement in Gaussian mixture variational autoencoder Vis. Comput. (IF 3.5) Pub Date : 2024-04-01 Jiashuang Zhou, Yongqi Liu, Xiaoqin Du
Recent advancements in deep neural networks have shown great potential in generating realistic data and performing clustering tasks. This is due to their ability to capture intricate patterns. However, current generative models face challenges such as poor performance and computational complexity caused by the issue of dimension disaster. The variational autoencoder (VAE), a commonly used method, also
-
Robust gradient aware and reliable entropy minimization for stable test-time adaptation in dynamic scenarios Vis. Comput. (IF 3.5) Pub Date : 2024-04-01
Test-time adaptation (TTA) aims to provide neural networks capable of adapting to the target domain distribution using only unlabeled test data. Most existing TTA methods have achieved success under mild conditions, such as independently sampled data from a single or multiple static domains. However, these attempts may fail in dynamic scenarios, where the test data distribution undergoes continuous
-
An improved target tracking method based on extraction of corner points Vis. Comput. (IF 3.5) Pub Date : 2024-03-30 Qingyang Jing, Peng Zhang, Wei Zhang, Weimin Lei
The kernel correlation filter (KCF) algorithm is known for its fast tracking speed and has been widely used, but it is prone to failure in some challenging scenes. To address the limited range of scenes in which KCF is applicable, an improved target tracking method based on extraction of corner points is proposed. An eight-neighborhood template is used to filter corner points, and an adaptive threshold approach is introduced
-
Single image reflection removal via self-attention and local discrimination Vis. Comput. (IF 3.5) Pub Date : 2024-03-30 Yan Huang, Xinchang Lu, Jia Fu
In practical scenarios, reflections may impair the visual quality of images, negatively affecting both human perception and subsequent computer vision tasks. Removing reflections from an image poses considerable challenges due to the diverse nature of reflection content, often blended with background targets. To address the issue, this paper proposes a single-image reflection removal method, which
-
Graph neural networks in vision-language image understanding: a survey Vis. Comput. (IF 3.5) Pub Date : 2024-03-29 Henry Senior, Gregory Slabaugh, Shanxin Yuan, Luca Rossi
2D image understanding is a complex problem within computer vision, but it holds the key to providing human-level scene comprehension. It goes further than identifying the objects in an image, and instead, it attempts to understand the scene. Solutions to this problem form the underpinning of a range of tasks, including image captioning, visual question answering (VQA), and image retrieval. Graphs
-
Loose–tight cluster regularization for unsupervised person re-identification Vis. Comput. (IF 3.5) Pub Date : 2024-03-29 Yixiu Liu, Long Zhan, Yu Feng, Pengju Si, Shaowei Jiang, Qiang Zhao, Chenggang Yan
Unsupervised person re-identification (Re-ID) is a critical and challenging task in computer vision. It aims to identify the same person across different camera views or locations without using any labeled data or annotations. Most existing unsupervised Re-ID methods adopt a clustering and fine-tuning strategy, which alternates between generating pseudo-labels through clustering and updating the model
-
CS-VITON: a realistic virtual try-on network based on clothing region alignment and SPM Vis. Comput. (IF 3.5) Pub Date : 2024-03-28 Jinguang Chen, Xin Zhang, Lili Ma, Bo Yang, Kaibing Zhang
Image-based virtual try-on involves generating an image of a person wearing a given clothing item. Existing virtual try-on works suffer from misaligned regions between the predicted segmentation map and the deformed clothing, and the generated try-on results look unnatural. To address this issue, we refine the definition of the misaligned regions and propose a high-resolution virtual try-on
-
ISOD: improved small object detection based on extended scale feature pyramid network Vis. Comput. (IF 3.5) Pub Date : 2024-03-28 Ping Ma, Xinyi He, Yiyang Chen, Yuan Liu
-
PMGAN: pretrained model-based generative adversarial network for text-to-image generation Vis. Comput. (IF 3.5) Pub Date : 2024-03-28
Text-to-image generation is a challenging task. Although diffusion models can generate high-quality images of complex scenes, they sometimes suffer from a lack of realism. Additionally, there is often a large diversity among images generated from different texts with the same semantics. Furthermore, the generation of details is sometimes insufficient. Generative adversarial networks can generate
-
SISIM: statistical information similarity-based point cloud quality assessment Vis. Comput. (IF 3.5) Pub Date : 2024-03-28 Shuyu Xiao, Yongfang Wang, Yihan Wang
-
Search on dual-space: discretization accuracy-based architecture search for person re-identification Vis. Comput. (IF 3.5) Pub Date : 2024-03-28
Network architectures automatically generated for person re-identification (re-ID) using neural architecture search (NAS) algorithms exhibit unique advantages. However, existing NAS algorithms are primarily designed to solve the image classification task, and person re-ID, as a sub-problem of image retrieval, differs significantly from classification. To address this issue, we propose a neural
-
TAMDepth: self-supervised monocular depth estimation with transformer and adapter modulation Vis. Comput. (IF 3.5) Pub Date : 2024-03-27 Shaokang Li, Chengzhi Lyu, Bin Xia, Ziheng Chen, Lei Zhang
-
BENet: boundary-enhanced network for real-time semantic segmentation Vis. Comput. (IF 3.5) Pub Date : 2024-03-27 Xiaochun Lei, Zeyu Chen, Zhaoxin Yu, Zetao Jiang
-
Multi-view stereo-regulated NeRF for urban scene novel view synthesis Vis. Comput. (IF 3.5) Pub Date : 2024-03-27
Neural radiance fields (NeRF), which encode a scene into a neural representation, have demonstrated impressive novel view synthesis quality on single objects and small regions of space. However, when faced with urban outdoor environments, NeRF is limited by the capacity of a single MLP and insufficient input views, leading to incorrect geometries that hinder the production of realistic renderings
-
Multi-keypoints matching network for clothing detection Vis. Comput. (IF 3.5) Pub Date : 2024-03-25 Ye Li, Wu Zhang, Meiling Wu, Di Zhang, Zhiguo Wang, Changjiang You
-
Self-supervised single-image 3D face reconstruction method based on attention mechanism and attribute refinement Vis. Comput. (IF 3.5) Pub Date : 2024-03-22 Xujia Qin, Xinyu Li, Mengjia Li, Hongbo Zheng, Xiaogang Xu
-
Soccer player tracking and data correction based on attention with full-field videos Vis. Comput. (IF 3.5) Pub Date : 2024-03-22 Chao Yang, Meng Yang, Hongyu Li, Linlu Jiang, Xiang Suo, Zhen Li, Weiliang Meng, Lijuan Mao
-
Combining YOLO and background subtraction for small dynamic target detection Vis. Comput. (IF 3.5) Pub Date : 2024-03-21 Jian Xiong, Jie Wu, Ming Tang, Pengwen Xiong, Yushui Huang, Hang Guo
-
Shape generation via learning an adaptive multimodal prior Vis. Comput. (IF 3.5) Pub Date : 2024-03-20 Xianglin Guo, Mingqiang Wei
-
Internal and external transmission encoder–decoder network for single-image deraining Vis. Comput. (IF 3.5) Pub Date : 2024-03-20 Yingcheng Xu, Congwei Han, Shuqi Lv, Ze Wang, Miao Wang
-
Veintr: robust end-to-end full-hand vein identification with transformer Vis. Comput. (IF 3.5) Pub Date : 2024-03-20 Shenglin Lu, Sheldon Fung, Wei Pan, Nilmini Wickramasinghe, Xuequan Lu