Exploring Video Denoising in Thermal Infrared Imaging: Physics-inspired Noise Generator, Dataset and Model IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-23 Lijing Cai, Xiangyu Dong, Kailai Zhou, Xun Cao
-
Accurate 3D measurement of complex texture objects by height compensation using a dual-projector structure IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-22 Pengcheng Yao, Yuchong Chen, Shaoyan Gai, Feipeng Da
-
Classification of Small Drones Using Low-Uncertainty Micro-Doppler Signature Images and Ultra-Lightweight Convolutional Neural Network IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-19 Junhyeong Park, Jun-Sung Park
-
Image Reconstruction for Accelerated MR Scan with Faster Fourier Convolutional Neural Networks IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-19 Xiaohan Liu, Yanwei Pang, Xuebin Sun, Yiming Liu, Yonghong Hou, Zhenchang Wang, Xuelong Li
-
Fast Continual Multi-View Clustering with Incomplete Views IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-19 Xinhang Wan, Bin Xiao, Xinwang Liu, Jiyuan Liu, Weixuan Liang, En Zhu
-
Multi-Relational Deep Hashing for Cross-Modal Search IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-16 Xiao Liang, Erkun Yang, Yanhua Yang, Cheng Deng
-
GLPanoDepth: Global-to-Local Panoramic Depth Estimation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-15 Jiayang Bai, Haoyu Qin, Shuichang Lai, Jie Guo, Yanwen Guo
Depth estimation is a fundamental task in many vision applications. With the growing popularity of omnidirectional cameras, tackling this problem in spherical space has become a new trend. In this paper, we propose a learning-based method for predicting dense depth values of a scene from a monocular omnidirectional image. An omnidirectional image has a full field-of-view, providing much more complete
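The spherical-space formulation rests on the fact that every pixel of an equirectangular panorama corresponds to a ray on the unit sphere. As an illustrative sketch (not the paper's method), the pixel-to-ray mapping can be written as:

```python
import numpy as np

def equirect_to_rays(width, height):
    """Map each pixel of an equirectangular image to a unit ray direction.

    Longitude spans [-pi, pi) across columns, latitude [pi/2, -pi/2] down rows.
    Returns a (height, width, 3) array of unit vectors (x, y, z).
    """
    u = (np.arange(width) + 0.5) / width      # pixel centers in [0, 1)
    v = (np.arange(height) + 0.5) / height
    lon = (u - 0.5) * 2.0 * np.pi             # longitude in [-pi, pi)
    lat = (0.5 - v) * np.pi                   # latitude in [pi/2, -pi/2]
    lon, lat = np.meshgrid(lon, lat)          # shapes (height, width)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)

rays = equirect_to_rays(8, 4)
```

Each output vector has unit norm, so per-pixel depth values scale these rays directly into 3D points.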
-
ISTR: Mask-Embedding-Based Instance Segmentation Transformer IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Jie Hu, Yao Lu, Shengchuan Zhang, Liujuan Cao
Transformer-based instance-level recognition has attracted increasing research attention recently due to its superior performance. However, although attempts have been made to encode masks as embeddings into Transformer-based frameworks, how to combine mask embeddings and spatial information in a transformer-based approach is still not fully explored. In this paper, we revisit the design of mask-embedding-based
-
Deep Variation Prior: Joint Image Denoising and Noise Variance Estimation Without Clean Data IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Rihuan Ke
With recent deep learning based approaches showing promising results in removing noise from images, the best denoising performance has been reported in a supervised learning setup that requires a large set of paired noisy images and ground truth data for training. The strong data requirement can be mitigated by unsupervised learning techniques, however, accurate modelling of images or noise variances
-
Saliency Guided Deep Neural Network for Color Transfer With Light Optimization IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Yuming Fang, Pengwei Yuan, Chenlei Lv, Chen Peng, Jiebin Yan, Weisi Lin
Color transfer aims to change the color information of the target image according to the reference one. Many studies propose color transfer methods based on analysis of color distribution and semantic relevance, which do not take the perceptual characteristics of visual quality into consideration. In this study, we propose a novel color transfer method based on the saliency information with brightness optimization
-
Single Stage Adaptive Multi-Attention Network for Image Restoration IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Anas Zafar, Danyal Aftab, Rizwan Qureshi, Xinqi Fan, Pingjun Chen, Jia Wu, Hazrat Ali, Shah Nawaz, Sheheryar Khan, Mubarak Shah
-
High-Quality and Diverse Few-Shot Image Generation via Masked Discrimination IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Jingyuan Zhu, Huimin Ma, Jiansheng Chen, Jian Yuan
Few-shot image generation aims to generate images of high quality and great diversity with limited data. However, it is difficult for modern GANs to avoid overfitting when trained on only a few images. The discriminator can easily remember all the training samples and guide the generator to replicate them, leading to severe diversity degradation. Several methods have been proposed to relieve overfitting
-
RefQSR: Reference-Based Quantization for Image Super-Resolution Networks IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Hongjae Lee, Jun-Sang Yoo, Seung-Won Jung
Single image super-resolution (SISR) aims to reconstruct a high-resolution image from its low-resolution observation. Recent deep learning-based SISR models show high performance at the expense of increased computational costs, limiting their use in resource-constrained environments. As a promising solution for computationally efficient network design, network quantization has been extensively studied
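As background for network quantization (a generic symmetric uniform scheme, not RefQSR's reference-based one), quantizing a weight tensor to low-bit integers and dequantizing it back might look like:

```python
import numpy as np

def quantize_uniform(w, bits=8):
    """Symmetric uniform quantization of a weight tensor.

    A generic sketch, not RefQSR's scheme: one scale per tensor,
    integers clipped to the signed range of the given bit width.
    """
    qmax = 2 ** (bits - 1) - 1                    # e.g. 127 for 8 bits
    wmax = np.abs(w).max()
    scale = wmax / qmax if wmax > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from integers and the scale."""
    return q.astype(np.float32) * scale

w = np.array([-0.5, 0.0, 0.25, 0.5], dtype=np.float32)
q, s = quantize_uniform(w, bits=8)
w_hat = dequantize(q, s)
```

The round-trip error is bounded by half a quantization step, which is the accuracy/efficiency trade-off such SISR quantization work tries to push.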
-
Nonconvex Robust High-Order Tensor Completion Using Randomized Low-Rank Approximation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Wenjin Qin, Hailin Wang, Feng Zhang, Weijun Ma, Jianjun Wang, Tingwen Huang
Within the tensor singular value decomposition (T-SVD) framework, existing robust low-rank tensor completion approaches have made great achievements in various areas of science and engineering. Nevertheless, these methods involve the T-SVD based low-rank approximation, which suffers from high computational costs when dealing with large-scale tensor data. Moreover, most of them are only applicable to
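For context, the T-SVD framework defines the tensor nuclear norm by taking an FFT along the third mode and summing the matrix nuclear norms of the resulting frontal slices (normalization conventions vary by paper); the full SVDs in this loop are exactly the cost that randomized low-rank approximation targets. A minimal sketch:

```python
import numpy as np

def tensor_nuclear_norm(x):
    """Tensor nuclear norm under the T-SVD framework.

    FFT along the third mode, then the sum of the matrix nuclear norms
    of the frontal slices, divided by n3 (one common normalization).
    """
    n3 = x.shape[2]
    xf = np.fft.fft(x, axis=2)
    total = 0.0
    for k in range(n3):
        # Full SVD per slice: the expensive step for large-scale tensors.
        total += np.linalg.svd(xf[:, :, k], compute_uv=False).sum()
    return total / n3

# For a tensor with identical frontal slices, the FFT concentrates all
# energy in the zero-frequency slice, so the TNN equals the slice's
# matrix nuclear norm.
a = np.arange(20, dtype=float).reshape(4, 5) / 10.0
x = np.stack([a, a, a], axis=2)
tnn = tensor_nuclear_norm(x)
```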
-
Source-Guided Target Feature Reconstruction for Cross-Domain Classification and Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-09 Yifan Jiao, Hantao Yao, Bing-Kun Bao, Changsheng Xu
Existing cross-domain classification and detection methods usually apply a consistency constraint between the target sample and its self-augmentation for unsupervised learning without considering the essential source knowledge. In this paper, we propose a Source-guided Target Feature Reconstruction (STFR) module for cross-domain visual tasks, which applies source visual words to reconstruct the target
-
Relationship-Incremental Scene Graph Generation by a Divide-and-Conquer Pipeline with Feature Adapter IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-08 Xuewei Li, Guangcong Zheng, Yunlong Yu, Naye Ji, Xi Li
-
DriftRec: Adapting Diffusion Models to Blind JPEG Restoration IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-05 Simon Welker, Henry N. Chapman, Timo Gerkmann
In this work, we utilize the high-fidelity generation abilities of diffusion models to solve blind JPEG restoration at high compression levels. We propose an elegant modification of the forward stochastic differential equation of diffusion models to adapt them to this restoration task and name our method DriftRec. Comparing DriftRec against an $L_{2}$ regression baseline with the same network architecture
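The forward SDE of a diffusion model can be discretized with Euler-Maruyama; the sketch below uses an illustrative mean-reverting (Ornstein-Uhlenbeck-type) drift toward a degraded image y, not necessarily DriftRec's exact formulation, with `gamma` and `sigma` as assumed hyperparameters:

```python
import numpy as np

def forward_ou_step(x, y, gamma, sigma, dt, rng):
    """One Euler-Maruyama step of a mean-reverting forward SDE
    dx = gamma * (y - x) dt + sigma dW, which drifts a clean image x
    toward a degraded image y while injecting Gaussian noise.
    (An illustrative OU-type process, not DriftRec's exact SDE.)"""
    drift = gamma * (y - x) * dt
    diffusion = sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x + drift + diffusion

rng = np.random.default_rng(0)
x = np.ones((8, 8))   # stand-in for a clean image
y = np.zeros((8, 8))  # stand-in for its degraded version
for _ in range(200):
    x = forward_ou_step(x, y, gamma=2.0, sigma=0.05, dt=0.05, rng=rng)
```

After many steps the state has contracted onto a noisy neighborhood of y, which is the terminal distribution such a restoration model reverses from.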
-
Generalizing to Out-of-Sample Degradations via Model Reprogramming IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-05 Runhua Jiang, Yahong Han
Existing image restoration models are typically designed for specific tasks and struggle to generalize to out-of-sample degradations not encountered during training. While zero-shot methods can address this limitation by fine-tuning model parameters on testing samples, their effectiveness relies on predefined natural priors and physical models of specific degradations. Nevertheless, determining out-of-sample
-
Shared Manifold Regularized Joint Feature Selection for Joint Classification and Regression in Alzheimer’s Disease Diagnosis IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-04 Zhi Chen, Yongguo Liu, Yun Zhang, Jiajing Zhu, Qiaoqin Li, Xindong Wu
In Alzheimer’s disease (AD) diagnosis, joint feature selection for predicting disease labels (classification) and estimating cognitive scores (regression) with neuroimaging data has received increasing attention. In this paper, we propose a model named Shared Manifold regularized Joint Feature Selection (SMJFS) that performs classification and regression in a unified framework for AD diagnosis. For
-
Orthogonal Spatial Binary Coding Method for High-Speed 3D Measurement IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-01 Haitao Wu, Yiping Cao, Yongbo Dai, Zhimi Wei
Temporal phase unwrapping based on a single auxiliary binary coded pattern has been proven effective for high-speed 3D measurement. However, traditional spatial binary coding often leads to an imbalance between the number of periodic divisions and the number of codewords. To meet this challenge, an orthogonal spatial binary coding method with a large number of codewords is proposed in this paper. By expanding spatial multiplexing
-
Hierarchical Perceptual Noise Injection for Social Media Fingerprint Privacy Protection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-01 Simin Li, Huangxinxin Xu, Jiakai Wang, Ruixiao Xu, Aishan Liu, Fazhi He, Xianglong Liu, Dacheng Tao
Billions of people share images from their daily lives on social media every day. However, their biometric information (e.g., fingerprints) could be easily stolen from these images. The threat of fingerprint leakage from social media has created a strong desire to anonymize shared images while maintaining image quality, since fingerprints act as a lifelong individual biometric password. To guard the
-
Bilateral Context Modeling for Residual Coding in Lossless 3D Medical Image Compression IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-25 Xiangrui Liu, Meng Wang, Shiqi Wang, Sam Kwong
Residual coding has gained prevalence in lossless compression, where a lossy layer is initially employed and the reconstruction errors (i.e., residues) are then losslessly compressed. The underlying principle of the residual coding revolves around the exploration of priors based on context modeling. Herein, we propose a residual coding framework for 3D medical images, involving the off-the-shelf video
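The residual-coding principle (a lossy layer plus losslessly coded residues) can be sketched with a crude uniform quantizer standing in for the paper's video-codec lossy layer:

```python
import numpy as np

def residual_code(img, step=8):
    """Residual coding sketch: a crude lossy layer (uniform quantization)
    plus the exact residue. Storing (lossy, residue) losslessly
    reconstructs the input perfectly. (Illustrative only; the paper's
    lossy layer is an off-the-shelf video codec.)"""
    lossy = (np.round(img / step) * step).astype(np.int32)
    residue = img - lossy   # what the lossless entropy coder must model
    return lossy, residue

img = np.arange(0, 64, dtype=np.int32).reshape(8, 8)
lossy, residue = residual_code(img, step=8)
recon = lossy + residue
```

The residues are small (here bounded by half the quantization step), which is what makes context modeling of their priors effective for compression.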
-
Anomaly Detection for Medical Images Using Heterogeneous Auto-Encoder IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-29 Shuai Lu, Weihang Zhang, He Zhao, Hanruo Liu, Ningli Wang, Huiqi Li
Anomaly detection is an important task for medical image analysis, which can alleviate the reliance of supervised methods on large labelled datasets. Most existing methods use a pixel-wise self-reconstruction framework for anomaly detection. However, these studies face two challenges: 1) they tend to overfit to learning an identity mapping between the input and output, which leads to failure in
-
Region Aware Video Object Segmentation With Deep Motion Modeling IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-29 Bo Miao, Mohammed Bennamoun, Yongsheng Gao, Ajmal Mian
Current semi-supervised video object segmentation (VOS) methods often employ the entire features of one frame to predict object masks and update memory. This introduces significant redundant computations. To reduce redundancy, we introduce a Region Aware Video Object Segmentation (RAVOS) approach, which predicts regions of interest (ROIs) for efficient object segmentation and memory storage. RAVOS
-
Knowledge-Augmented Visual Question Answering With Natural Language Explanation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-28 Jiayuan Xie, Yi Cai, Jiali Chen, Ruohang Xu, Jiexin Wang, Qing Li
Visual question answering with natural language explanation (VQA-NLE) is a challenging task that requires models to not only generate accurate answers but also to provide explanations that justify the relevant decision-making processes. This task is accomplished by generating natural language sentences based on the given question-image pair. However, existing methods often struggle to ensure consistency
-
Robust Fine-Grained Visual Recognition With Neighbor-Attention Label Correction IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-28 Shunan Mao, Shiliang Zhang
Existing deep learning methods for fine-grained visual recognition often rely on large-scale, well-annotated training data. Obtaining fine-grained annotations in the wild typically requires concentration and expertise, such as fine category annotation for species recognition, instance annotation for person re-identification (re-id) and dense annotation for segmentation, which inevitably leads to label
-
Label-Aware Calibration and Relation-Preserving in Visual Intention Understanding IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 QingHongYa Shi, Mang Ye, Wenke Huang, Weijian Ruan, Bo Du
Visual intention understanding is a challenging task that explores the hidden intention behind the images of publishers in social media. Visual intention represents implicit semantics, whose ambiguous definition inevitably leads to label shifting and label blemish. The former indicates that the same image delivers intention discrepancies under different data augmentations, while the latter represents
-
Weakly-Supervised Contrastive Learning for Unsupervised Object Discovery IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Yunqiu Lv, Jing Zhang, Nick Barnes, Yuchao Dai
Unsupervised object discovery (UOD) refers to the task of discriminating the whole region of objects from the background within a scene without relying on labeled datasets, which benefits the task of bounding-box-level localization and pixel-level segmentation. This task is promising due to its ability to discover objects in a generic manner. We roughly categorize existing techniques into two main
-
Temporal Feature Fusion for 3D Detection in Monocular Video IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Haoran Cheng, Liang Peng, Zheng Yang, Binbin Lin, Xiaofei He, Boxi Wu
Previous monocular 3D detection works focus on the single frame input in both training and inference. In real-world applications, temporal and motion information naturally exists in monocular video. It is valuable for 3D detection but under-explored in monocular works. In this paper, we propose a straightforward and effective method for temporal feature fusion, which exhibits low computation cost and
-
Instance-Specific Semantic Augmentation for Long-Tailed Image Classification IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Jiahao Chen, Bing Su
Recent long-tailed classification methods generally adopt the two-stage pipeline and focus on learning the classifier to tackle the imbalanced data in the second stage via re-sampling or re-weighting, but the classifier is easily prone to overconfidence in head classes. Data augmentation is a natural way to tackle this issue. Existing augmentation methods either perform low-level transformations or
-
BadCM: Invisible Backdoor Attack Against Cross-Modal Learning IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Zheng Zhang, Xu Yuan, Lei Zhu, Jingkuan Song, Liqiang Nie
Despite remarkable successes in unimodal learning tasks, backdoor attacks against cross-modal learning are still underexplored due to the limited generalization and inferior stealthiness when involving multiple modalities. Notably, since works in this area mainly inherit ideas from unimodal visual attacks, they struggle to deal with diverse cross-modal attack circumstances and to manipulate imperceptible
-
Toward Accurate Human Parsing Through Edge Guided Diffusion IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Ting Liu, Hongkun Zhu, Yunchao Wei, Shikui Wei, Yao Zhao, Yanning Zhang
Existing human parsing frameworks commonly employ joint learning of semantic edge detection and human parsing to facilitate the localization around boundary regions. Nevertheless, the parsing prediction within the interior of the part contour may still exhibit inconsistencies due to the inherent ambiguity of fine-grained semantics. In contrast, binary edge detection does not suffer from such fine-grained
-
In Defense of Clip-Based Video Relation Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Meng Wei, Long Chen, Wei Ji, Xiaoyu Yue, Roger Zimmermann
Video Visual Relation Detection (VidVRD) aims to detect visual relationship triplets in videos using spatial bounding boxes and temporal boundaries. Existing VidVRD methods can be broadly categorized into bottom-up and top-down paradigms, depending on their approach to classifying relations. Bottom-up methods follow a clip-based approach where they classify relations of short clip tubelet pairs and
-
Cross-Layer Contrastive Learning of Latent Semantics for Facial Expression Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Weicheng Xie, Zhibin Peng, Linlin Shen, Wenya Lu, Yang Zhang, Siyang Song
Convolutional neural networks (CNNs) have achieved significant improvement for the task of facial expression recognition. However, current training still suffers from the inconsistent learning intensities among different layers, i.e., the feature representations in the shallow layers are not sufficiently learned compared with those in deep layers. To this end, this work proposes a contrastive learning
-
Single-Image-Based Deep Learning for Segmentation of Early Esophageal Cancer Lesions IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Haipeng Li, Dingrui Liu, Yu Zeng, Shuaicheng Liu, Tao Gan, Nini Rao, Jinlin Yang, Bing Zeng
Accurate segmentation of lesions is crucial for the diagnosis and treatment of early esophageal cancer (EEC). However, neither traditional nor deep learning-based methods up to today can meet the clinical requirements, with the mean Dice score (the most important metric in medical image analysis) hardly exceeding 0.75. In this paper, we present a novel deep learning approach for segmenting EEC lesions
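For reference, the Dice score mentioned above compares a predicted binary mask against ground truth as twice the intersection over the sum of the mask sizes:

```python
import numpy as np

def dice_score(pred, gt, eps=1e-7):
    """Dice coefficient between two binary masks:
    2 * |pred AND gt| / (|pred| + |gt|)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

pred = np.zeros((4, 4), dtype=bool)
pred[:2, :] = True   # predicted mask: top half
gt = np.zeros((4, 4), dtype=bool)
gt[1:3, :] = True    # ground truth: middle rows
```

Here the masks share one of their two rows each, so the Dice score is 0.5; a perfect overlap scores 1.0.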
-
DeGCN: Deformable Graph Convolutional Networks for Skeleton-Based Action Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-25 Woomin Myung, Nan Su, Jing-Hao Xue, Guijin Wang
Graph convolutional networks (GCN) have recently been studied to exploit the graph topology of the human body for skeleton-based action recognition. However, most of these methods unfortunately aggregate messages via an inflexible pattern for various action samples, lacking the awareness of intra-class variety and the suitableness for skeleton sequences, which often contain redundant or even detrimental
-
Cross-Modal Retrieval With Noisy Correspondence via Consistency Refining and Mining IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-25 Xinran Ma, Mouxing Yang, Yunfan Li, Peng Hu, Jiancheng Lv, Xi Peng
The success of existing cross-modal retrieval (CMR) methods relies heavily on the assumption that the annotated cross-modal correspondence is faultless. In practice, however, the correspondence of some pairs is inevitably contaminated during data collection or annotation, thus leading to the so-called Noisy Correspondence (NC) problem. To alleviate the influence of NC, we propose a novel method
-
Unsupervised Out-of-Distribution Object Detection via PCA-Driven Dynamic Prototype Enhancement IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Aming Wu, Cheng Deng, Wei Liu
To promote the application of object detectors in real scenes, out-of-distribution object detection (OOD-OD) is proposed to distinguish whether detected objects belong to the ones that are unseen during training or not. One of the key challenges is that detectors lack unknown data for supervision, and as a result, can produce overconfident detection results on OOD data. Thus, this task requires to
-
Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Christiaan G. A. Viviers, Lena Filatova, Maurice Termeer, Peter H. N. de With, Fons van der Sommen
Accurate 6-DoF pose estimation of surgical instruments during minimally invasive surgeries can substantially improve treatment strategies and eventual surgical outcome. Existing deep learning methods have achieved accurate results, but they require custom approaches for each object and laborious setup and training environments often stretching to extensive simulations, whilst lacking real-time computation
-
Neighbor-Guided Pseudo-Label Generation and Refinement for Single-Frame Supervised Temporal Action Localization IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Guozhang Li, De Cheng, Nannan Wang, Jie Li, Xinbo Gao
Due to the sparse single-frame annotations, current Single-Frame Temporal Action Localization (SF-TAL) methods generally employ threshold-based pseudo-label generation strategies. However, these approaches suffer from inefficient data utilization, as only parts of unlabeled frames with confidence scores surpassing a predefined threshold are selected for training. Moreover, the variability of single-frame
-
TOPIQ: A Top-Down Approach From Semantics to Distortions for Image Quality Assessment IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Chaofeng Chen, Jiadi Mo, Jingwen Hou, Haoning Wu, Liang Liao, Wenxiu Sun, Qiong Yan, Weisi Lin
Image Quality Assessment (IQA) is a fundamental task in computer vision that has witnessed remarkable progress with deep neural networks. Inspired by the characteristics of the human visual system, existing methods typically use a combination of global and local representations (i.e., multi-scale features) to achieve superior performance. However, most of them adopt simple linear fusion of multi-scale
-
Depth-Aware Unpaired Video Dehazing IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Yang Yang, Chun-Le Guo, Xiaojie Guo
This paper investigates a novel unpaired video dehazing framework, which can be a good candidate in practice by relieving the pressure of collecting paired data. In such a paradigm, two key issues, including 1) temporal consistency, which is uninvolved in single image dehazing, and 2) better dehazing ability, need to be considered for satisfactory performance. To handle the mentioned problems, we alternatively resort
-
CCDet: Confidence-Consistent Learning for Dense Object Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Chang Liu, Xiaomao Li, Weiping Xiao, Shaorong Xie
Modern detectors commonly employ classification scores to reflect the localization quality of detection results. However, there exists an inconsistency between them, misguiding the selection of high-quality predictions and providing unreliable results for downstream applications. In this paper, we find that the root of this confidence inconsistency lies in the inaccurate IoU estimation and the spatial
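For context, the IoU (intersection over union) whose estimation the paper targets is, for axis-aligned boxes:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) \
          + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0
```

Using a classification score as a stand-in for this quantity is exactly the inconsistency the abstract describes: a confidently classified box can still have a poor IoU with the object.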
-
Learning Temporal Distribution and Spatial Correlation Toward Universal Moving Object Segmentation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Guanfang Dong, Chenqiu Zhao, Xichen Pan, Anup Basu
The goal of moving object segmentation is separating moving objects from stationary backgrounds in videos. One major challenge in this problem is how to develop a universal model for videos from various natural scenes since previous methods are often effective only in specific scenes. In this paper, we propose a method called Learning Temporal Distribution and Spatial Correlation (LTS) that has the
-
Double Discrete Cosine Transform-Oriented Multi-View Subspace Clustering IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Yongyong Chen, Shuqin Wang, Yin-Ping Zhao, C. L. Philip Chen
Low-rank tensor representation with the tensor nuclear norm has been rising in popularity in multi-view subspace clustering (MVSC), in which the tensor nuclear norm is commonly implemented using discrete Fourier transform (DFT). Unfortunately, existing DFT-oriented MVSC methods may provide unsatisfactory results since (1) DFT exploits complex arithmetic in the Fourier domain, usually resulting in high
-
Toward Robust and Unconstrained Full Range of Rotation Head Pose Estimation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-21 Thorsten Hempel, Ahmed A. Abdelrahman, Ayoub Al-Hamadi
Estimating the head pose of a person is a crucial problem for numerous applications that is yet mainly addressed as a subtask of frontal pose prediction. We present a novel method for unconstrained end-to-end head pose estimation to tackle the challenging task of full range of orientation head pose prediction. We address the issue of ambiguous rotation labels by introducing the rotation matrix formalism
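The rotation matrix formalism avoids the ambiguities of Euler-angle labels (e.g., gimbal lock and wrap-around); a common unambiguous error measure is the geodesic angle between two rotation matrices, sketched here as an illustration rather than the paper's exact loss:

```python
import numpy as np

def geodesic_angle(r1, r2):
    """Geodesic distance (radians) between two rotation matrices:
    theta = arccos((trace(R1^T R2) - 1) / 2). Well-defined over the
    full range of rotations, unlike Euler-angle differences."""
    cos = (np.trace(r1.T @ r2) - 1.0) / 2.0
    return np.arccos(np.clip(cos, -1.0, 1.0))  # clip guards float error

def rot_z(t):
    """Rotation by angle t about the z-axis (helper for the example)."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])
```

Two z-rotations of 0.3 and 1.0 radians are a geodesic angle of 0.7 radians apart, with no wrap-around discontinuity at the range boundaries.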
-
MERF: A Practical HDR-Like Image Generator via Mutual-Guided Learning Between Multi-Exposure Registration and Fusion IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-21 Wenhui Hong, Hao Zhang, Jiayi Ma
In this paper, we present a novel high dynamic range (HDR)-like image generator that utilizes mutual-guided learning between multi-exposure registration and fusion, leading to promising dynamic multi-exposure image fusion. The method consists of three main components: the registration network, the fusion network, and the dual attention network which seamlessly integrates registration and fusion processes
-
Meta Clothing Status Calibration for Long-Term Person Re-Identification IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yan Huang, Qiang Wu, Zhang Zhang, Caifeng Shan, Yan Huang, Yi Zhong, Liang Wang
Recent studies have seen significant advancements in the field of long-term person re-identification (LT-reID) through the use of clothing-irrelevant or insensitive features. This work takes the field a step further by addressing a previously unexplored issue, the Clothing Status Distribution Shift (CSDS). CSDS refers to the differing ratios of samples with clothing changes to those without clothing
-
Online Streaming Video Super-Resolution With Convolutional Look-Up Table IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Guanghao Yin, Zefan Qu, Xinyang Jiang, Shan Jiang, Zhenhua Han, Ningxin Zheng, Huan Yang, Xiaohong Liu, Yuqing Yang, Dongsheng Li, Lili Qiu
Online video streaming has fundamental limitations on the transmission bandwidth and computational capacity and super-resolution is a promising potential solution. However, applying existing video super-resolution methods to online streaming is non-trivial. Existing video codecs and streaming protocols (e.g., WebRTC) dynamically change the video quality both spatially and temporally, which leads to
-
Convolution-Enhanced Bi-Branch Adaptive Transformer With Cross-Task Interaction for Food Category and Ingredient Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yuxin Liu, Weiqing Min, Shuqiang Jiang, Yong Rui
Recently, visual food analysis has received more and more attention in the computer vision community due to its wide application scenarios, e.g., diet nutrition management, smart restaurant, and personalized diet recommendation. Considering that food images are unstructured images with complex and unfixed visual patterns, mining food-related semantic-aware regions is crucial. Furthermore, the ingredients
-
Neuromorphic Imaging With Joint Image Deblurring and Event Denoising IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Pei Zhang, Haosen Liu, Zhou Ge, Chutian Wang, Edmund Y. Lam
Neuromorphic imaging reacts to per-pixel brightness changes of a dynamic scene with high temporal precision and responds with asynchronous streaming events as a result. It also often supports a simultaneous output of an intensity image. Nevertheless, the raw events typically involve a large amount of noise due to the high sensitivity of the sensor, while capturing fast-moving objects at low frame rates
-
Adaptive Feature Learning for Unbiased Scene Graph Generation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Jiarui Yang, Chuan Wang, Liang Yang, Yuchen Jiang, Angelina Cao
Scene Graph Generation (SGG) aims to detect all objects and identify their pairwise relationships in the scene. Recently, tremendous progress has been made in exploring better context relationship representations. Previous work mainly focuses on contextual information aggregation and uses de-biasing strategies on samples to eliminate the preference for head predicates. However, there remain challenges
-
CreativeSeg: Semantic Segmentation of Creative Sketches IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yixiao Zheng, Kaiyue Pang, Ayan Das, Dongliang Chang, Yi-Zhe Song, Zhanyu Ma
The problem of sketch semantic segmentation is far from being solved. Despite existing methods exhibiting near-saturating performance on simple sketches with high recognisability, they suffer serious setbacks when the target sketches are products of an imaginative process with a high degree of creativity. We hypothesise that human creativity, being highly individualistic, induces a significant shift
-
Semantics Disentangling for Cross-Modal Retrieval IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Zheng Wang, Xing Xu, Jiwei Wei, Ning Xie, Yang Yang, Heng Tao Shen
Cross-modal retrieval (e.g., query a given image to obtain a semantically similar sentence, and vice versa) is an important but challenging task, as the heterogeneous gap and inconsistent distributions exist between different modalities. The dominant approaches struggle to bridge the heterogeneity by capturing the common representations among heterogeneous data in a constructed subspace which can reflect
-
RCUMP: Residual Completion Unrolling With Mixed Priors for Snapshot Compressive Imaging IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yin-Ping Zhao, Jiancheng Zhang, Yongyong Chen, Zhen Wang, Xuelong Li
Deep unrolling-based snapshot compressive imaging (SCI) methods, which employ iterative formulas to construct interpretable iterative frameworks and embedded learnable modules, have achieved remarkable success in reconstructing 3-dimensional (3D) hyperspectral images (HSIs) from 2D measurement induced by coded aperture snapshot spectral imaging (CASSI). However, the existing deep unrolling-based methods
-
Satellite Video Multi-Label Scene Classification With Spatial and Temporal Feature Cooperative Encoding: A Benchmark Dataset and Method IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Weilong Guo, Shengyang Li, Feixiang Chen, Yuhan Sun, Yanfeng Gu
Satellite video multi-label scene classification predicts semantic labels of multiple ground contents to describe a given satellite observation video, which plays an important role in applications like ocean observation and smart cities. However, the lack of a high-quality and large-scale dataset prevents further improvement of the task, and existing methods designed for general videos have difficulty
-
Toward Video Anomaly Retrieval From Video Anomaly Detection: New Benchmarks and Model IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Peng Wu, Jing Liu, Xiangteng He, Yuxin Peng, Peng Wang, Yanning Zhang
Video anomaly detection (VAD) has received increasing attention due to its potential applications. Its current dominant tasks focus on detecting anomalies online, which can be roughly interpreted as binary or multiple event classification. However, such a setup, which builds relationships between complicated anomalous events and single labels, e.g., “vandalism”, is superficial, since single labels
-
MM-Net: A MixFormer-Based Multi-Scale Network for Anatomical and Functional Image Fusion IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yu Liu, Chen Yu, Juan Cheng, Z. Jane Wang, Xun Chen
Anatomical and functional image fusion is an important technique in a variety of medical and biological applications. Recently, deep learning (DL)-based methods have become a mainstream direction in the field of multi-modal image fusion. However, existing DL-based fusion approaches have difficulty in effectively capturing local features and global contextual information simultaneously. In addition
-
Relationship-Guided Knowledge Transfer for Class-Incremental Facial Expression Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yuanling Lv, Yan Yan, Jing-Hao Xue, Si Chen, Hanzi Wang
Human emotions contain both basic and compound facial expressions. In many practical scenarios, it is difficult to access all the compound expression categories at one time. In this paper, we investigate comprehensive facial expression recognition (FER) in the class-incremental learning paradigm, where we define well-studied and easily-accessible basic expressions as initial classes and learn new compound
-
Anycost Network Quantization for Image Super-Resolution IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Jingyi Zhang, Ziwei Wang, Haoyu Wang, Jie Zhou, Jiwen Lu
In this paper, we propose an anycost network quantization method for efficient image super-resolution with variable resource budgets. Conventional quantization approaches acquire discrete network parameters for deployment with fixed complexity constraints, while image super-resolution networks are usually applied on mobile devices with frequently modified resource budgets due to the change of battery