Exploring Video Denoising in Thermal Infrared Imaging: Physics-inspired Noise Generator, Dataset and Model IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-23 Lijing Cai, Xiangyu Dong, Kailai Zhou, Xun Cao
-
Accurate 3D measurement of complex texture objects by height compensation using a dual-projector structure IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-22 Pengcheng Yao, Yuchong Chen, Shaoyan Gai, Feipeng Da
-
Classification of Small Drones Using Low-Uncertainty Micro-Doppler Signature Images and Ultra-Lightweight Convolutional Neural Network IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-19 Junhyeong Park, Jun-Sung Park
-
Image Reconstruction for Accelerated MR Scan with Faster Fourier Convolutional Neural Networks IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-19 Xiaohan Liu, Yanwei Pang, Xuebin Sun, Yiming Liu, Yonghong Hou, Zhenchang Wang, Xuelong Li
-
Fast Continual Multi-View Clustering with Incomplete Views IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-19 Xinhang Wan, Bin Xiao, Xinwang Liu, Jiyuan Liu, Weixuan Liang, En Zhu
-
Multi-Relational Deep Hashing for Cross-Modal Search IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-16 Xiao Liang, Erkun Yang, Yanhua Yang, Cheng Deng
-
GLPanoDepth: Global-to-Local Panoramic Depth Estimation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-15 Jiayang Bai, Haoyu Qin, Shuichang Lai, Jie Guo, Yanwen Guo
Depth estimation is a fundamental task in many vision applications. With the growing popularity of omnidirectional cameras, tackling this problem in spherical space has become a new trend. In this paper, we propose a learning-based method for predicting dense depth values of a scene from a monocular omnidirectional image. An omnidirectional image has a full field-of-view, providing much more complete
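The spherical-space formulation rests on the fact that every pixel of an equirectangular panorama corresponds to a ray on the unit sphere. As an illustrative sketch (not the paper's method), the pixel-to-ray mapping can be written as:

```python
import numpy as np

def equirect_to_rays(width, height):
    """Map each pixel of an equirectangular image to a unit ray direction.

    Longitude spans [-pi, pi) across columns, latitude [pi/2, -pi/2] down rows.
    Returns a (height, width, 3) array of unit vectors (x, y, z).
    """
    u = (np.arange(width) + 0.5) / width      # pixel centers in [0, 1)
    v = (np.arange(height) + 0.5) / height
    lon = (u - 0.5) * 2.0 * np.pi             # longitude in [-pi, pi)
    lat = (0.5 - v) * np.pi                   # latitude in [pi/2, -pi/2]
    lon, lat = np.meshgrid(lon, lat)          # shapes (height, width)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)

rays = equirect_to_rays(8, 4)
```

Each output vector has unit norm, so per-pixel depth values scale these rays directly into 3D points.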
-
ISTR: Mask-Embedding-Based Instance Segmentation Transformer IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Jie Hu, Yao Lu, Shengchuan Zhang, Liujuan Cao
Transformer-based instance-level recognition has attracted increasing research attention recently due to its superior performance. However, although attempts have been made to encode masks as embeddings into Transformer-based frameworks, how to combine mask embeddings and spatial information in a transformer-based approach is still not fully explored. In this paper, we revisit the design of mask-embedding-based
-
Deep Variation Prior: Joint Image Denoising and Noise Variance Estimation Without Clean Data IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Rihuan Ke
With recent deep learning based approaches showing promising results in removing noise from images, the best denoising performance has been reported in a supervised learning setup that requires a large set of paired noisy images and ground truth data for training. The strong data requirement can be mitigated by unsupervised learning techniques, however, accurate modelling of images or noise variances
-
Saliency Guided Deep Neural Network for Color Transfer With Light Optimization IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Yuming Fang, Pengwei Yuan, Chenlei Lv, Chen Peng, Jiebin Yan, Weisi Lin
Color transfer aims to change the color information of the target image according to the reference one. Many studies propose color transfer methods based on analysis of color distribution and semantic relevance, which do not take the perceptual characteristics of visual quality into consideration. In this study, we propose a novel color transfer method based on the saliency information with brightness optimization
-
Single Stage Adaptive Multi-Attention Network for Image Restoration IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Anas Zafar, Danyal Aftab, Rizwan Qureshi, Xinqi Fan, Pingjun Chen, Jia Wu, Hazrat Ali, Shah Nawaz, Sheheryar Khan, Mubarak Shah
-
High-Quality and Diverse Few-Shot Image Generation via Masked Discrimination IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Jingyuan Zhu, Huimin Ma, Jiansheng Chen, Jian Yuan
Few-shot image generation aims to generate images of high quality and great diversity with limited data. However, it is difficult for modern GANs to avoid overfitting when trained on only a few images. The discriminator can easily remember all the training samples and guide the generator to replicate them, leading to severe diversity degradation. Several methods have been proposed to relieve overfitting
-
RefQSR: Reference-Based Quantization for Image Super-Resolution Networks IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Hongjae Lee, Jun-Sang Yoo, Seung-Won Jung
Single image super-resolution (SISR) aims to reconstruct a high-resolution image from its low-resolution observation. Recent deep learning-based SISR models show high performance at the expense of increased computational costs, limiting their use in resource-constrained environments. As a promising solution for computationally efficient network design, network quantization has been extensively studied
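As background for network quantization (a generic symmetric uniform scheme, not RefQSR's reference-based one), quantizing a weight tensor to low-bit integers and dequantizing it back might look like:

```python
import numpy as np

def quantize_uniform(w, bits=8):
    """Symmetric uniform quantization of a weight tensor.

    A generic sketch, not RefQSR's scheme: one scale per tensor,
    integers clipped to the signed range of the given bit width.
    """
    qmax = 2 ** (bits - 1) - 1                    # e.g. 127 for 8 bits
    wmax = np.abs(w).max()
    scale = wmax / qmax if wmax > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from integers and the scale."""
    return q.astype(np.float32) * scale

w = np.array([-0.5, 0.0, 0.25, 0.5], dtype=np.float32)
q, s = quantize_uniform(w, bits=8)
w_hat = dequantize(q, s)
```

The round-trip error is bounded by half a quantization step, which is the accuracy/efficiency trade-off such SISR quantization work tries to push.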
-
Nonconvex Robust High-Order Tensor Completion Using Randomized Low-Rank Approximation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Wenjin Qin, Hailin Wang, Feng Zhang, Weijun Ma, Jianjun Wang, Tingwen Huang
Within the tensor singular value decomposition (T-SVD) framework, existing robust low-rank tensor completion approaches have made great achievements in various areas of science and engineering. Nevertheless, these methods involve the T-SVD based low-rank approximation, which suffers from high computational costs when dealing with large-scale tensor data. Moreover, most of them are only applicable to
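For context, the T-SVD framework defines the tensor nuclear norm by taking an FFT along the third mode and summing the matrix nuclear norms of the resulting frontal slices (normalization conventions vary by paper); the full SVDs in this loop are exactly the cost that randomized low-rank approximation targets. A minimal sketch:

```python
import numpy as np

def tensor_nuclear_norm(x):
    """Tensor nuclear norm under the T-SVD framework.

    FFT along the third mode, then the sum of the matrix nuclear norms
    of the frontal slices, divided by n3 (one common normalization).
    """
    n3 = x.shape[2]
    xf = np.fft.fft(x, axis=2)
    total = 0.0
    for k in range(n3):
        # Full SVD per slice: the expensive step for large-scale tensors.
        total += np.linalg.svd(xf[:, :, k], compute_uv=False).sum()
    return total / n3

# For a tensor with identical frontal slices, the FFT concentrates all
# energy in the zero-frequency slice, so the TNN equals the slice's
# matrix nuclear norm.
a = np.arange(20, dtype=float).reshape(4, 5) / 10.0
x = np.stack([a, a, a], axis=2)
tnn = tensor_nuclear_norm(x)
```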
-
Source-Guided Target Feature Reconstruction for Cross-Domain Classification and Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-09 Yifan Jiao, Hantao Yao, Bing-Kun Bao, Changsheng Xu
Existing cross-domain classification and detection methods usually apply a consistency constraint between the target sample and its self-augmentation for unsupervised learning without considering the essential source knowledge. In this paper, we propose a Source-guided Target Feature Reconstruction (STFR) module for cross-domain visual tasks, which applies source visual words to reconstruct the target
-
Relationship-Incremental Scene Graph Generation by a Divide-and-Conquer Pipeline with Feature Adapter IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-08 Xuewei Li, Guangcong Zheng, Yunlong Yu, Naye Ji, Xi Li
-
DriftRec: Adapting Diffusion Models to Blind JPEG Restoration IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-05 Simon Welker, Henry N. Chapman, Timo Gerkmann
In this work, we utilize the high-fidelity generation abilities of diffusion models to solve blind JPEG restoration at high compression levels. We propose an elegant modification of the forward stochastic differential equation of diffusion models to adapt them to this restoration task and name our method DriftRec. Comparing DriftRec against an $L_{2}$ regression baseline with the same network architecture
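The forward SDE of a diffusion model can be discretized with Euler-Maruyama; the sketch below uses an illustrative mean-reverting (Ornstein-Uhlenbeck-type) drift toward a degraded image y, not necessarily DriftRec's exact formulation, with `gamma` and `sigma` as assumed hyperparameters:

```python
import numpy as np

def forward_ou_step(x, y, gamma, sigma, dt, rng):
    """One Euler-Maruyama step of a mean-reverting forward SDE
    dx = gamma * (y - x) dt + sigma dW, which drifts a clean image x
    toward a degraded image y while injecting Gaussian noise.
    (An illustrative OU-type process, not DriftRec's exact SDE.)"""
    drift = gamma * (y - x) * dt
    diffusion = sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x + drift + diffusion

rng = np.random.default_rng(0)
x = np.ones((8, 8))   # stand-in for a clean image
y = np.zeros((8, 8))  # stand-in for its degraded version
for _ in range(200):
    x = forward_ou_step(x, y, gamma=2.0, sigma=0.05, dt=0.05, rng=rng)
```

After many steps the state has contracted onto a noisy neighborhood of y, which is the terminal distribution such a restoration model reverses from.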
-
Generalizing to Out-of-Sample Degradations via Model Reprogramming IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-05 Runhua Jiang, Yahong Han
Existing image restoration models are typically designed for specific tasks and struggle to generalize to out-of-sample degradations not encountered during training. While zero-shot methods can address this limitation by fine-tuning model parameters on testing samples, their effectiveness relies on predefined natural priors and physical models of specific degradations. Nevertheless, determining out-of-sample
-
Shared Manifold Regularized Joint Feature Selection for Joint Classification and Regression in Alzheimer’s Disease Diagnosis IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-04 Zhi Chen, Yongguo Liu, Yun Zhang, Jiajing Zhu, Qiaoqin Li, Xindong Wu
In Alzheimer’s disease (AD) diagnosis, joint feature selection for predicting disease labels (classification) and estimating cognitive scores (regression) with neuroimaging data has received increasing attention. In this paper, we propose a model named Shared Manifold regularized Joint Feature Selection (SMJFS) that performs classification and regression in a unified framework for AD diagnosis. For
-
Orthogonal Spatial Binary Coding Method for High-Speed 3D Measurement IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-01 Haitao Wu, Yiping Cao, Yongbo Dai, Zhimi Wei
Temporal phase unwrapping based on a single auxiliary binary coded pattern has been proven effective for high-speed 3D measurement. However, traditional spatial binary coding often leads to an imbalance between the number of periodic divisions and the number of codewords. To meet this challenge, an orthogonal spatial binary coding method with a large number of codewords is proposed in this paper. By expanding spatial multiplexing
-
Hierarchical Perceptual Noise Injection for Social Media Fingerprint Privacy Protection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-01 Simin Li, Huangxinxin Xu, Jiakai Wang, Ruixiao Xu, Aishan Liu, Fazhi He, Xianglong Liu, Dacheng Tao
Billions of people share images from their daily lives on social media every day. However, their biometric information (e.g., fingerprints) could be easily stolen from these images. The threat of fingerprint leakage from social media has created a strong desire to anonymize shared images while maintaining image quality, since fingerprints act as a lifelong individual biometric password. To guard the
-
Bilateral Context Modeling for Residual Coding in Lossless 3D Medical Image Compression IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-25 Xiangrui Liu, Meng Wang, Shiqi Wang, Sam Kwong
Residual coding has gained prevalence in lossless compression, where a lossy layer is initially employed and the reconstruction errors (i.e., residues) are then losslessly compressed. The underlying principle of the residual coding revolves around the exploration of priors based on context modeling. Herein, we propose a residual coding framework for 3D medical images, involving the off-the-shelf video
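The residual-coding principle (a lossy layer plus losslessly coded residues) can be sketched with a crude uniform quantizer standing in for the paper's video-codec lossy layer:

```python
import numpy as np

def residual_code(img, step=8):
    """Residual coding sketch: a crude lossy layer (uniform quantization)
    plus the exact residue. Storing (lossy, residue) losslessly
    reconstructs the input perfectly. (Illustrative only; the paper's
    lossy layer is an off-the-shelf video codec.)"""
    lossy = (np.round(img / step) * step).astype(np.int32)
    residue = img - lossy   # what the lossless entropy coder must model
    return lossy, residue

img = np.arange(0, 64, dtype=np.int32).reshape(8, 8)
lossy, residue = residual_code(img, step=8)
recon = lossy + residue
```

The residues are small (here bounded by half the quantization step), which is what makes context modeling of their priors effective for compression.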
-
Anomaly Detection for Medical Images Using Heterogeneous Auto-Encoder IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-29 Shuai Lu, Weihang Zhang, He Zhao, Hanruo Liu, Ningli Wang, Huiqi Li
Anomaly detection is an important task for medical image analysis, which can alleviate the reliance of supervised methods on large labelled datasets. Most existing methods use a pixel-wise self-reconstruction framework for anomaly detection. However, these studies face two challenges: 1) they tend to overfit to learning an identity mapping between the input and output, which leads to failure in
-
Region Aware Video Object Segmentation With Deep Motion Modeling IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-29 Bo Miao, Mohammed Bennamoun, Yongsheng Gao, Ajmal Mian
Current semi-supervised video object segmentation (VOS) methods often employ the entire features of one frame to predict object masks and update memory. This introduces significant redundant computations. To reduce redundancy, we introduce a Region Aware Video Object Segmentation (RAVOS) approach, which predicts regions of interest (ROIs) for efficient object segmentation and memory storage. RAVOS
-
Knowledge-Augmented Visual Question Answering With Natural Language Explanation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-28 Jiayuan Xie, Yi Cai, Jiali Chen, Ruohang Xu, Jiexin Wang, Qing Li
Visual question answering with natural language explanation (VQA-NLE) is a challenging task that requires models to not only generate accurate answers but also to provide explanations that justify the relevant decision-making processes. This task is accomplished by generating natural language sentences based on the given question-image pair. However, existing methods often struggle to ensure consistency
-
Robust Fine-Grained Visual Recognition With Neighbor-Attention Label Correction IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-28 Shunan Mao, Shiliang Zhang
Existing deep learning methods for fine-grained visual recognition often rely on large-scale, well-annotated training data. Obtaining fine-grained annotations in the wild typically requires concentration and expertise, such as fine category annotation for species recognition, instance annotation for person re-identification (re-id) and dense annotation for segmentation, which inevitably leads to label
-
Label-Aware Calibration and Relation-Preserving in Visual Intention Understanding IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 QingHongYa Shi, Mang Ye, Wenke Huang, Weijian Ruan, Bo Du
Visual intention understanding is a challenging task that explores the hidden intention behind the images of publishers in social media. Visual intention represents implicit semantics, whose ambiguous definition inevitably leads to label shifting and label blemish. The former indicates that the same image delivers intention discrepancies under different data augmentations, while the latter represents
-
Weakly-Supervised Contrastive Learning for Unsupervised Object Discovery IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Yunqiu Lv, Jing Zhang, Nick Barnes, Yuchao Dai
Unsupervised object discovery (UOD) refers to the task of discriminating the whole region of objects from the background within a scene without relying on labeled datasets, which benefits the task of bounding-box-level localization and pixel-level segmentation. This task is promising due to its ability to discover objects in a generic manner. We roughly categorize existing techniques into two main
-
Temporal Feature Fusion for 3D Detection in Monocular Video IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Haoran Cheng, Liang Peng, Zheng Yang, Binbin Lin, Xiaofei He, Boxi Wu
Previous monocular 3D detection works focus on the single frame input in both training and inference. In real-world applications, temporal and motion information naturally exists in monocular video. It is valuable for 3D detection but under-explored in monocular works. In this paper, we propose a straightforward and effective method for temporal feature fusion, which exhibits low computation cost and
-
Instance-Specific Semantic Augmentation for Long-Tailed Image Classification IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Jiahao Chen, Bing Su
Recent long-tailed classification methods generally adopt the two-stage pipeline and focus on learning the classifier to tackle the imbalanced data in the second stage via re-sampling or re-weighting, but the classifier is easily prone to overconfidence in head classes. Data augmentation is a natural way to tackle this issue. Existing augmentation methods either perform low-level transformations or
-
BadCM: Invisible Backdoor Attack Against Cross-Modal Learning IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Zheng Zhang, Xu Yuan, Lei Zhu, Jingkuan Song, Liqiang Nie
Despite remarkable successes in unimodal learning tasks, backdoor attacks against cross-modal learning are still underexplored due to the limited generalization and inferior stealthiness when involving multiple modalities. Notably, since works in this area mainly inherit ideas from unimodal visual attacks, they struggle to deal with diverse cross-modal attack circumstances and to manipulate imperceptible
-
Toward Accurate Human Parsing Through Edge Guided Diffusion IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Ting Liu, Hongkun Zhu, Yunchao Wei, Shikui Wei, Yao Zhao, Yanning Zhang
Existing human parsing frameworks commonly employ joint learning of semantic edge detection and human parsing to facilitate the localization around boundary regions. Nevertheless, the parsing prediction within the interior of the part contour may still exhibit inconsistencies due to the inherent ambiguity of fine-grained semantics. In contrast, binary edge detection does not suffer from such fine-grained
-
In Defense of Clip-Based Video Relation Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Meng Wei, Long Chen, Wei Ji, Xiaoyu Yue, Roger Zimmermann
Video Visual Relation Detection (VidVRD) aims to detect visual relationship triplets in videos using spatial bounding boxes and temporal boundaries. Existing VidVRD methods can be broadly categorized into bottom-up and top-down paradigms, depending on their approach to classifying relations. Bottom-up methods follow a clip-based approach where they classify relations of short clip tubelet pairs and
-
Cross-Layer Contrastive Learning of Latent Semantics for Facial Expression Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Weicheng Xie, Zhibin Peng, Linlin Shen, Wenya Lu, Yang Zhang, Siyang Song
Convolutional neural networks (CNNs) have achieved significant improvement for the task of facial expression recognition. However, current training still suffers from the inconsistent learning intensities among different layers, i.e., the feature representations in the shallow layers are not sufficiently learned compared with those in deep layers. To this end, this work proposes a contrastive learning
-
Single-Image-Based Deep Learning for Segmentation of Early Esophageal Cancer Lesions IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Haipeng Li, Dingrui Liu, Yu Zeng, Shuaicheng Liu, Tao Gan, Nini Rao, Jinlin Yang, Bing Zeng
Accurate segmentation of lesions is crucial for the diagnosis and treatment of early esophageal cancer (EEC). However, neither traditional nor deep learning-based methods up to today can meet the clinical requirements, with the mean Dice score (the most important metric in medical image analysis) hardly exceeding 0.75. In this paper, we present a novel deep learning approach for segmenting EEC lesions
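For reference, the Dice score mentioned above compares a predicted binary mask against ground truth as twice the intersection over the sum of the mask sizes:

```python
import numpy as np

def dice_score(pred, gt, eps=1e-7):
    """Dice coefficient between two binary masks:
    2 * |pred AND gt| / (|pred| + |gt|)."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

pred = np.zeros((4, 4), dtype=bool)
pred[:2, :] = True   # predicted mask: top half
gt = np.zeros((4, 4), dtype=bool)
gt[1:3, :] = True    # ground truth: middle rows
```

Here the masks share one of their two rows each, so the Dice score is 0.5; a perfect overlap scores 1.0.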
-
DeGCN: Deformable Graph Convolutional Networks for Skeleton-Based Action Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-25 Woomin Myung, Nan Su, Jing-Hao Xue, Guijin Wang
Graph convolutional networks (GCN) have recently been studied to exploit the graph topology of the human body for skeleton-based action recognition. However, most of these methods unfortunately aggregate messages via an inflexible pattern for various action samples, lacking the awareness of intra-class variety and the suitableness for skeleton sequences, which often contain redundant or even detrimental
-
Cross-Modal Retrieval With Noisy Correspondence via Consistency Refining and Mining IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-25 Xinran Ma, Mouxing Yang, Yunfan Li, Peng Hu, Jiancheng Lv, Xi Peng
The success of existing cross-modal retrieval (CMR) methods relies heavily on the assumption that the annotated cross-modal correspondence is faultless. In practice, however, the correspondence of some pairs is inevitably contaminated during data collection or annotation, thus leading to the so-called Noisy Correspondence (NC) problem. To alleviate the influence of NC, we propose a novel method
-
Unsupervised Out-of-Distribution Object Detection via PCA-Driven Dynamic Prototype Enhancement IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Aming Wu, Cheng Deng, Wei Liu
To promote the application of object detectors in real scenes, out-of-distribution object detection (OOD-OD) is proposed to distinguish whether detected objects belong to the ones that are unseen during training or not. One of the key challenges is that detectors lack unknown data for supervision, and as a result, can produce overconfident detection results on OOD data. Thus, this task requires to
-
Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Christiaan G. A. Viviers, Lena Filatova, Maurice Termeer, Peter H. N. de With, Fons van der Sommen
Accurate 6-DoF pose estimation of surgical instruments during minimally invasive surgeries can substantially improve treatment strategies and eventual surgical outcome. Existing deep learning methods have achieved accurate results, but they require custom approaches for each object and laborious setup and training environments often stretching to extensive simulations, whilst lacking real-time computation
-
Neighbor-Guided Pseudo-Label Generation and Refinement for Single-Frame Supervised Temporal Action Localization IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Guozhang Li, De Cheng, Nannan Wang, Jie Li, Xinbo Gao
Due to the sparse single-frame annotations, current Single-Frame Temporal Action Localization (SF-TAL) methods generally employ threshold-based pseudo-label generation strategies. However, these approaches suffer from inefficient data utilization, as only parts of unlabeled frames with confidence scores surpassing a predefined threshold are selected for training. Moreover, the variability of single-frame
-
TOPIQ: A Top-Down Approach From Semantics to Distortions for Image Quality Assessment IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Chaofeng Chen, Jiadi Mo, Jingwen Hou, Haoning Wu, Liang Liao, Wenxiu Sun, Qiong Yan, Weisi Lin
Image Quality Assessment (IQA) is a fundamental task in computer vision that has witnessed remarkable progress with deep neural networks. Inspired by the characteristics of the human visual system, existing methods typically use a combination of global and local representations (i.e., multi-scale features) to achieve superior performance. However, most of them adopt simple linear fusion of multi-scale
-
Depth-Aware Unpaired Video Dehazing IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Yang Yang, Chun-Le Guo, Xiaojie Guo
This paper investigates a novel unpaired video dehazing framework, which can be a good candidate in practice by relieving the pressure of collecting paired data. In such a paradigm, two key issues, including 1) temporal consistency, which is uninvolved in single image dehazing, and 2) better dehazing ability, need to be considered for satisfactory performance. To handle the mentioned problems, we alternatively resort
-
CCDet: Confidence-Consistent Learning for Dense Object Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Chang Liu, Xiaomao Li, Weiping Xiao, Shaorong Xie
Modern detectors commonly employ classification scores to reflect the localization quality of detection results. However, there exists an inconsistency between them, misguiding the selection of high-quality predictions and providing unreliable results for downstream applications. In this paper, we find that the root of this confidence inconsistency lies in the inaccurate IoU estimation and the spatial
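For context, the IoU (intersection over union) whose estimation the paper targets is, for axis-aligned boxes:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) \
          + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0
```

Using a classification score as a stand-in for this quantity is exactly the inconsistency the abstract describes: a confidently classified box can still have a poor IoU with the object.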
-
Learning Temporal Distribution and Spatial Correlation Toward Universal Moving Object Segmentation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Guanfang Dong, Chenqiu Zhao, Xichen Pan, Anup Basu
The goal of moving object segmentation is separating moving objects from stationary backgrounds in videos. One major challenge in this problem is how to develop a universal model for videos from various natural scenes since previous methods are often effective only in specific scenes. In this paper, we propose a method called Learning Temporal Distribution and Spatial Correlation (LTS) that has the
-
Double Discrete Cosine Transform-Oriented Multi-View Subspace Clustering IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-22 Yongyong Chen, Shuqin Wang, Yin-Ping Zhao, C. L. Philip Chen
Low-rank tensor representation with the tensor nuclear norm has been rising in popularity in multi-view subspace clustering (MVSC), in which the tensor nuclear norm is commonly implemented using discrete Fourier transform (DFT). Unfortunately, existing DFT-oriented MVSC methods may provide unsatisfactory results since (1) DFT exploits complex arithmetic in the Fourier domain, usually resulting in high
-
Toward Robust and Unconstrained Full Range of Rotation Head Pose Estimation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-21 Thorsten Hempel, Ahmed A. Abdelrahman, Ayoub Al-Hamadi
Estimating the head pose of a person is a crucial problem for numerous applications that is yet mainly addressed as a subtask of frontal pose prediction. We present a novel method for unconstrained end-to-end head pose estimation to tackle the challenging task of full range of orientation head pose prediction. We address the issue of ambiguous rotation labels by introducing the rotation matrix formalism
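The rotation matrix formalism avoids the ambiguities of Euler-angle labels (e.g., gimbal lock and wrap-around); a common unambiguous error measure is the geodesic angle between two rotation matrices, sketched here as an illustration rather than the paper's exact loss:

```python
import numpy as np

def geodesic_angle(r1, r2):
    """Geodesic distance (radians) between two rotation matrices:
    theta = arccos((trace(R1^T R2) - 1) / 2). Well-defined over the
    full range of rotations, unlike Euler-angle differences."""
    cos = (np.trace(r1.T @ r2) - 1.0) / 2.0
    return np.arccos(np.clip(cos, -1.0, 1.0))  # clip guards float error

def rot_z(t):
    """Rotation by angle t about the z-axis (helper for the example)."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])
```

Two z-rotations of 0.3 and 1.0 radians are a geodesic angle of 0.7 radians apart, with no wrap-around discontinuity at the range boundaries.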
-
MERF: A Practical HDR-Like Image Generator via Mutual-Guided Learning Between Multi-Exposure Registration and Fusion IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-21 Wenhui Hong, Hao Zhang, Jiayi Ma
In this paper, we present a novel high dynamic range (HDR)-like image generator that utilizes mutual-guided learning between multi-exposure registration and fusion, leading to promising dynamic multi-exposure image fusion. The method consists of three main components: the registration network, the fusion network, and the dual attention network which seamlessly integrates registration and fusion processes
-
Meta Clothing Status Calibration for Long-Term Person Re-Identification IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yan Huang, Qiang Wu, Zhang Zhang, Caifeng Shan, Yan Huang, Yi Zhong, Liang Wang
Recent studies have seen significant advancements in the field of long-term person re-identification (LT-reID) through the use of clothing-irrelevant or insensitive features. This work takes the field a step further by addressing a previously unexplored issue, the Clothing Status Distribution Shift (CSDS). CSDS refers to the differing ratios of samples with clothing changes to those without clothing
-
Online Streaming Video Super-Resolution With Convolutional Look-Up Table IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Guanghao Yin, Zefan Qu, Xinyang Jiang, Shan Jiang, Zhenhua Han, Ningxin Zheng, Huan Yang, Xiaohong Liu, Yuqing Yang, Dongsheng Li, Lili Qiu
Online video streaming has fundamental limitations on the transmission bandwidth and computational capacity and super-resolution is a promising potential solution. However, applying existing video super-resolution methods to online streaming is non-trivial. Existing video codecs and streaming protocols (e.g., WebRTC) dynamically change the video quality both spatially and temporally, which leads to
-
Convolution-Enhanced Bi-Branch Adaptive Transformer With Cross-Task Interaction for Food Category and Ingredient Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yuxin Liu, Weiqing Min, Shuqiang Jiang, Yong Rui
Recently, visual food analysis has received more and more attention in the computer vision community due to its wide application scenarios, e.g., diet nutrition management, smart restaurant, and personalized diet recommendation. Considering that food images are unstructured images with complex and unfixed visual patterns, mining food-related semantic-aware regions is crucial. Furthermore, the ingredients
-
Neuromorphic Imaging With Joint Image Deblurring and Event Denoising IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Pei Zhang, Haosen Liu, Zhou Ge, Chutian Wang, Edmund Y. Lam
Neuromorphic imaging reacts to per-pixel brightness changes of a dynamic scene with high temporal precision and responds with asynchronous streaming events as a result. It also often supports a simultaneous output of an intensity image. Nevertheless, the raw events typically involve a large amount of noise due to the high sensitivity of the sensor, while capturing fast-moving objects at low frame rates
-
Adaptive Feature Learning for Unbiased Scene Graph Generation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Jiarui Yang, Chuan Wang, Liang Yang, Yuchen Jiang, Angelina Cao
Scene Graph Generation (SGG) aims to detect all objects and identify their pairwise relationships in the scene. Recently, tremendous progress has been made in exploring better context relationship representations. Previous work mainly focuses on contextual information aggregation and uses de-biasing strategies on samples to eliminate the preference for head predicates. However, there remain challenges
-
CreativeSeg: Semantic Segmentation of Creative Sketches IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yixiao Zheng, Kaiyue Pang, Ayan Das, Dongliang Chang, Yi-Zhe Song, Zhanyu Ma
The problem of sketch semantic segmentation is far from being solved. Despite existing methods exhibiting near-saturating performance on simple sketches with high recognisability, they suffer serious setbacks when the target sketches are products of an imaginative process with a high degree of creativity. We hypothesise that human creativity, being highly individualistic, induces a significant shift
-
Semantics Disentangling for Cross-Modal Retrieval IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Zheng Wang, Xing Xu, Jiwei Wei, Ning Xie, Yang Yang, Heng Tao Shen
Cross-modal retrieval (e.g., query a given image to obtain a semantically similar sentence, and vice versa) is an important but challenging task, as the heterogeneous gap and inconsistent distributions exist between different modalities. The dominant approaches struggle to bridge the heterogeneity by capturing the common representations among heterogeneous data in a constructed subspace which can reflect
-
RCUMP: Residual Completion Unrolling With Mixed Priors for Snapshot Compressive Imaging IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yin-Ping Zhao, Jiancheng Zhang, Yongyong Chen, Zhen Wang, Xuelong Li
Deep unrolling-based snapshot compressive imaging (SCI) methods, which employ iterative formulas to construct interpretable iterative frameworks and embedded learnable modules, have achieved remarkable success in reconstructing 3-dimensional (3D) hyperspectral images (HSIs) from 2D measurement induced by coded aperture snapshot spectral imaging (CASSI). However, the existing deep unrolling-based methods
-
Satellite Video Multi-Label Scene Classification With Spatial and Temporal Feature Cooperative Encoding: A Benchmark Dataset and Method IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Weilong Guo, Shengyang Li, Feixiang Chen, Yuhan Sun, Yanfeng Gu
Satellite video multi-label scene classification predicts semantic labels of multiple ground contents to describe a given satellite observation video, which plays an important role in applications like ocean observation and smart cities. However, the lack of a high-quality and large-scale dataset prevents further improvement of the task, and existing methods designed for general videos have difficulty
-
Toward Video Anomaly Retrieval From Video Anomaly Detection: New Benchmarks and Model IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Peng Wu, Jing Liu, Xiangteng He, Yuxin Peng, Peng Wang, Yanning Zhang
Video anomaly detection (VAD) has received increasing attention due to its potential applications. Its current dominant tasks focus on detecting anomalies online, which can be roughly interpreted as binary or multiple event classification. However, such a setup, which builds relationships between complicated anomalous events and single labels, e.g., “vandalism”, is superficial, since single labels
-
MM-Net: A MixFormer-Based Multi-Scale Network for Anatomical and Functional Image Fusion IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yu Liu, Chen Yu, Juan Cheng, Z. Jane Wang, Xun Chen
Anatomical and functional image fusion is an important technique in a variety of medical and biological applications. Recently, deep learning (DL)-based methods have become a mainstream direction in the field of multi-modal image fusion. However, existing DL-based fusion approaches have difficulty in effectively capturing local features and global contextual information simultaneously. In addition
-
Relationship-Guided Knowledge Transfer for Class-Incremental Facial Expression Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Yuanling Lv, Yan Yan, Jing-Hao Xue, Si Chen, Hanzi Wang
Human emotions contain both basic and compound facial expressions. In many practical scenarios, it is difficult to access all the compound expression categories at one time. In this paper, we investigate comprehensive facial expression recognition (FER) in the class-incremental learning paradigm, where we define well-studied and easily-accessible basic expressions as initial classes and learn new compound
-
Anycost Network Quantization for Image Super-Resolution IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-18 Jingyi Zhang, Ziwei Wang, Haoyu Wang, Jie Zhou, Jiwen Lu
In this paper, we propose an anycost network quantization method for efficient image super-resolution with variable resource budgets. Conventional quantization approaches acquire discrete network parameters for deployment with fixed complexity constraints, while image super-resolution networks are usually applied on mobile devices with frequently modified resource budgets due to the change of battery