-
Weakly-supervised Incremental learning for Semantic segmentation with Class Hierarchy Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-11 Hyoseo Kim, Junsuk Choe
Although current semantic segmentation approaches have achieved impressive performance, their ability to incrementally learn new classes is limited. Moreover, pixel-by-pixel annotations are costly and time-consuming. Therefore, a new field called Weakly-supervised Incremental Learning for Semantic Segmentation (WILSS) has emerged, which learns new classes using image-level labels. However, image-level
-
Towards better small object detection in UAV scenes: Aggregating more object-oriented information Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-06 Chenyue Yang, Yichao Cao, Xiaobo Lu
Security, transportation, and rescue applications require thorough analysis and interpretation of visual data captured by drone platforms. While various aspects of object detection research are expanding at a rapid pace, the detection of small objects on drone platforms continues to pose significant challenges. Specifically, targets in drone-captured scenarios are notoriously hard to detect due to factors such
-
Multiresolution causality of Bitcoin on GCC stock markets: Utilizing EMD-Granger analytical methodology Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-05 Foued Saâdaoui, Bochra Rabbouch, Harish Garg
This article employs an Empirical Mode Decomposition (EMD)-based multiresolution causality approach to explore the scale-by-scale interconnectedness between Bitcoin and the stock markets of Gulf Cooperation Council (GCC) countries. EMD is utilized to decompose signals into intrinsic mode functions (IMFs), which delineate variations across different frequency scales, thus facilitating the identification
-
Co–TES: Learning noisy labels with a Co-Teaching Exchange Student method Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-04 Chan Ho Shin, Seong-jun Oh
The performance of a machine-learning model is influenced by two main factors: the structure of the model, and the quality of the dataset it processes. As high-quality labeled data in substantial size is often difficult to obtain, there are ongoing efforts to develop machine learning algorithms that are robust with noisy datasets. Among these algorithms, multi-network learning utilizes learning from
-
On the prediction of power outage length based on linear multifractional Lévy stable motion Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-04 Wanqing Song, Wujin Deng, Piercarlo Cattani, Deyu Qi, Xianhua Yang, Xuyin Yao, Dongdong Chen, Wenduan Yan, Enrico Zio
-
Enhancing mass spectrometry data analysis: A novel framework for calibration, outlier detection, and classification Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-03 Weili Peng, Tao Zhou, Yuanyuan Chen
-
A guided-based approach for deepfake detection: RGB-depth integration via features fusion Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-01 Giorgio Leporoni, Luca Maiano, Lorenzo Papa, Irene Amerini
Deep fake technology paves the way for a new generation of super realistic artificial content. While this opens the door to extraordinary new applications, the malicious use of deepfakes allows for far more realistic disinformation attacks than ever before. In this paper, we start from the intuition that generating fake content introduces possible inconsistencies in the depth of the generated images
-
Kreĭn twin support vector machines for imbalanced data classification Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-30 C. Jimenez-Castaño, A. Álvarez-Meza, D. Cárdenas-Peña, A. Orozco-Gutíerrez, J. Guerrero-Erazo
Conventional classification assumes a balanced sample distribution among classes. However, such a premise leads to biased performance over the majority class (with the highest number of instances). The Twin Support Vector Machines (TWSVM) obtained great prominence due to their low computational burden compared to the standard SVM. Besides, traditional machine learning seeks methods whose solution depends
-
Interpretable answer retrieval based on heterogeneous network embedding Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-30 Yongliang Wu, Xiao Pan, Jinghui Li, Shimao Dou, Xiaoxue Wang
Community question answering is a rising technology based on users' autonomous interactive behaviors, such as posting their issues, answering questions based on their experience, and commenting on existing questions. As a result of its use of natural language for communication and stimulation of user interest in information sharing, it has increasingly taken the place of other channels as the main
-
Paired relation feature network for spatial relation recognition Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-27 Nanxi Chen, Xu Wang, Qi Sun, Jiamao Li, Xiaolin Zhang
Recognizing relations between objects in an image is challenging for neural networks because some relations may not have obvious dedicated visual features. This paper proposes a Paired Relation Feature Network (PRFN), where all spatial and semantic features are extracted from the subject–object pair jointly, without using any hand-crafted features. PRFN includes a paired 2D spatial feature module that
-
Loose to compact feature alignment for domain adaptive object detection Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-27 Yang Li, Shanshan Zhang, Yunan Liu, Jian Yang
Recently, great achievements have been made for deep learning based object detection methods. But their performance drops significantly when domain shifts occur. To address this problem, in this work we propose a loose to compact feature alignment method under an unsupervised domain adaptation framework. The entire feature alignment is performed in a manner, so as to distribute the alignment difficulties
-
A flexible non-monotonic discretization method for pre-processing in supervised learning Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-27 Hatice Şenozan, Banu Soylu
-
Multi-layer encoder–decoder time-domain single channel speech separation Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-27 Debang Liu, Tianqi Zhang, Mads Græsbøll Christensen, Chen Yi, Ying Wei
With the emergence of more advanced separation networks, significant progress has been made in time-domain speech separation methods. These methods typically use a temporal encoder–decoder structure to encode speech feature sequences, thereby accomplishing the separation task. However, due to the limitation of traditional encoder–decoder structure, the separation performance decreases sharply when
-
Keep DRÆMing: Discriminative 3D anomaly detection through anomaly simulation Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-27 Vitjan Zavrtanik, Matej Kristan, Danijel Skočaj
Recent surface anomaly detection methods rely on pretrained backbone networks for efficient anomaly detection. On standard RGB anomaly detection benchmarks these methods achieve excellent results but fail on 3D anomaly detection due to a lack of pretrained backbones that suit this domain. Additionally, there is a lack of industrial depth data that would enable the backbone network training that could
-
Joint facial action unit recognition and self-supervised optical flow estimation Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-27 Zhiwen Shao, Yong Zhou, Feiran Li, Hancheng Zhu, Bing Liu
Facial action unit (AU) recognition and optical flow estimation are two highly correlated tasks, since optical flow can provide motion information of facial muscles to facilitate AU recognition. However, most existing AU recognition methods handle the two tasks independently by offline extracting optical flow as auxiliary information or directly ignoring the use of optical flow. In this paper, we propose
-
Deep neural networks for automatic speaker recognition do not learn supra-segmental temporal features Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-26 Daniel Neururer, Volker Dellwo, Thilo Stadelmann
While deep neural networks have shown impressive results in automatic speaker recognition and related tasks, it is dissatisfactory how little is understood about what exactly is responsible for these results. Part of the success has been attributed in prior work to their capability to model supra-segmental temporal information (SST), i.e., learn rhythmic-prosodic characteristics of speech in addition
-
YOLO2U-Net: Detection-guided 3D instance segmentation for microscopy Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-24 Amirkoushyar Ziabari, Derek C. Rose, Abbas Shirinifard, David Solecki
Microscopy imaging techniques are instrumental for characterization and analysis of biological structures. As these techniques typically render 3D visualization of cells by stacking 2D projections, issues such as out-of-plane excitation and low resolution in the z-axis may pose challenges (even for human experts) to detect individual cells in 3D volumes as these non-overlapping cells may appear as overlapping
-
Graph contrastive learning with consistency regularization Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-21 Soohong Lee, Sangho Lee, Jaehwan Lee, Woojin Lee, Youngdoo Son
Contrastive learning has actively been used for unsupervised graph representation learning owing to its success in computer vision. Most graph contrastive learning methods use instance discrimination. It treats each instance as a distinct class against a query instance as the pretext task. However, such methods inevitably cause a class collision problem because some instances may belong to the same
-
EgoCap and EgoFormer: First-person image captioning with context fusion Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-20 Zhuangzhuang Dai, Vu Tran, Andrew Markham, Niki Trigoni, M. Arif Rahman, L.N.S. Wijayasingha, John Stankovic, Chen Li
First-person captioning is significant because it provides veracious descriptions of egocentric scenes in a unique perspective. Also, there is a need to caption the scene, a.k.a. life-logging, for patients, travellers, and emergency responders in an egocentric narrative. Ego-captioning is indeed non-trivial since (1) Ego-images can be noisy due to motion and angles; (2) Describing a scene in a first-person
-
Real-time 3-D image analysis via Jacobi moments Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-19 Puwei Wang, Simon Liao
In this research, we propose parallel GPU-accelerated algorithms to compute the Jacobi moments defined in a rectangular region with substantially improved computational efficiency and highly satisfactory accuracy. In our algorithms, the parallel 3-D matrix multiplications are adopted to increase the computational efficiency, while the techniques of coalesced memory access, shared memory and
-
Analysis of systems’ performance in natural language processing competitions Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-19 Sergio Nava-Muñoz, Mario Graff, Hugo Jair Escalante
Collaborative competitions have gained popularity in the scientific and technological fields. These competitions involve defining tasks, selecting evaluation scores, and devising result verification methods. In the standard scenario, participants receive a training set and are expected to provide a solution for a held-out dataset kept by organizers. An essential challenge for organizers arises when
-
A simple and efficient filter feature selection method via document-term matrix unitization Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-19 Qing Li, Shuai Zhao, Tengjiao He, Jinming Wen
Text processing tasks commonly grapple with the challenge of high dimensionality. One of the most effective solutions to this challenge is to preprocess text data through feature selection methods. Feature selection can select the most advantageous features for subsequent operations (e.g., classification) from the native feature space of the text. This process effectively trims the feature space’s
-
Uncovering the authorship: Linking media content to social user profiles Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-16 Daniele Baracchi, Dasara Shullani, Massimo Iuliani, Damiano Giani, Alessandro Piva
The extensive spread of fake news on social networks is carried out by a diverse range of users, encompassing private individuals, newspapers, and organizations. With widely accessible image and video editing tools, malicious users can easily create manipulated media. They can then distribute this content through multiple fake profiles, aiming to maximize its social impact. To tackle this problem effectively
-
Multimodal prediction of student performance: A fusion of signed graph neural networks and large language models Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-16 Sijie Wang, Lin Ni, Zeyu Zhang, Xiaoxuan Li, Xianda Zheng, Jiamou Liu
In online education platforms, accurately predicting student performance is essential for timely dropout prevention and interventions for at-risk students. This task is made difficult by the prevalent use of Multiple-Choice Questions (MCQs) in learnersourcing platforms, where noise in student-generated content and the limitations of existing unsigned graph-based models, specifically their inability
-
A more reliable local-global-guided network for correspondence pruning Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-15 Chengli Peng, Zhenyu Yang, Yiwei Lu, Zizhuo Li, Qiwen Jin
The correspondence pruning task relies on both local and global contexts, which are considered to be essential in inferring the probability of inliers. Many previous approaches seek to devise various structures to make effective use of them, but they either use only a plain structure or base it on their own hypothetical relationships, which leads to some limitations remaining to be improved. Derived
-
PDTE: Pyramidal deep Taylor expansion for optical flow estimation Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-15 Zifan Zhu, Qing An, Chen Huang, Zhenghua Huang, Likun Huang, Hao Fang
Optical flow estimation is an important and active research topic in computer vision. Although existing methods have made considerable progress in improving their performance, they still have drawbacks, such as heavy computational burden, inaccurate pixel-level offset estimation, and poor interpretability. To address these issues, this letter proposes a pyramidal deep Taylor expansion (PDTE) framework, including:
-
OSPC: Online Sequential Photometric Calibration Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-14 Jawad Haidar, Douaa Khalil, Daniel Asmar
Photometric calibration is essential to many computer vision applications. One of its key benefits is enhancing the performance of Visual SLAM, especially when it depends on a direct method for tracking, such as the standard KLT algorithm. Additionally, it proves valuable in extracting sensor irradiance values from measured intensities, serving as a pre-processing step for a number of vision algorithms
-
Improvised contrastive loss for improved face recognition in open-set nature Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-13 Zafran Khan, Abhijeet Boragule, Brian J. d’Auriol, Moongu Jeon
Face recognition models often encounter various unseen domains and environments in real-world applications, leading to unsatisfactory performance due to the open-set nature of face recognition. Models trained on central datasets may exhibit poor generalization when faced with different candidates under varying illumination and blur conditions. In this paper, our goal is to enhance the generalization
-
Hierarchical matrix factorization for interpretable collaborative filtering Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-07 Kai Sugahara, Kazushi Okamoto
-
Learning on sample-efficient and label-efficient multi-view cardiac data with graph transformer Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-05 Lujing Wang, Yunting Ma, Wanqiu Zhang, Xiaoying Zhao, Xinxiang Zhao
Predicting cardiovascular disease has been a challenging task, as assessing samples based on a single view of information may be insufficient. Therefore, in this paper, we focus on the challenge of predicting cardiovascular disease using multi-view cardiac data. However, multi-view cardiac data is usually difficult to collect and label. Based on this motivation, learning an effective predictive model
-
A siamese-based verification system for open-set architecture attribution of synthetic images Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-05 Lydia Abady, Jun Wang, Benedetta Tondi, Mauro Barni
Despite the wide variety of methods developed for synthetic image attribution, most of them can only attribute images generated by models or architectures included in the training set and do not work with unseen architectures, hindering their applicability in real-world scenarios. In this paper, we propose a verification framework that relies on a Siamese Network to address the problem of open-set attribution
-
Multifractal characterization and recognition of animal behavior based on deep wavelet transform Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-04 Kexin Meng, Shanjie Yang, Piercarlo Cattani, Shijiao Gao, Shuli Mei
This study conducts an in-depth exploration of the multifractal characteristics of dairy cows' behavioral data, aiming to reveal their complexity and representation in behavioral patterns. By means of Multifractal Detrended Fluctuation Analysis (MFDFA) in conjunction with deep wavelet transform, we extract multifractal indices that precisely depict the differences and dynamic changes of cows' behavior
-
Towards high-fidelity facial UV map generation in real-world Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-02 Yuanming Li, Jeong-gi Kwak, Bon-hwa Ku, David Han, Hanseok Ko
We present a framework for completing high-fidelity 3D facial UV maps from a single face image. Despite the success of Generative Adversarial Networks (GANs) in this area, generating accurate UV maps from in-the-wild images remains challenging. Our approach involves a novel network called “Map and Edit” that combines a 2D generative model and a 3D prior to explicitly control the generation of multi-view
-
Adaptive regularized ensemble for evolving data stream classification Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-01 Aldo M. Paim, Fabrício Enembreck
Extracting knowledge from data streams requires fast incremental algorithms that are able to handle unlimited processing and ever-changing data with finite memory. A strategy for this challenge is the use of ensembles owing to their ability to tackle concept drift and achieve highly accurate predictions. However, ensembles often require a lot of computational resources. In this study, we propose a
-
Channel-spatial knowledge distillation for efficient semantic segmentation Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-01 Ayoub Karine, Thibault Napoléon, Maher Jridi
In this paper, we propose a new lightweight Channel-Spatial Knowledge Distillation (CSKD) method to handle the task of efficient image semantic segmentation. More precisely, we investigate the KD approach that trains a compressed neural network, called the student, under the supervision of a heavy one, called the teacher. In this context, we propose to improve the distillation mechanism by capturing the contextual
-
Frame-part-activated deep reinforcement learning for Action Prediction Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-01 Lei Chen, Zhanjie Song
In this paper, we propose a frame-part-activated deep reinforcement learning (FPA-DRL) for action prediction. Most existing methods for action prediction utilize the evolution of whole frames to model actions, which cannot avoid the noise of the current action, especially in the early prediction. Moreover, the loss of structural information of human body diminishes the capacity of features to describe
-
Continual learning for adaptive social network identification Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-28 Simone Magistri, Daniele Baracchi, Dasara Shullani, Andrew D. Bagdanov, Alessandro Piva
The popularity of social networks as primary mediums for sharing visual content has made it crucial for forensic experts to identify the original platform of multimedia content. Various methods address this challenge, but the constant emergence of new platforms and updates to existing ones often render forensic tools ineffective shortly after release. This necessitates the regular updating of methods
-
SPACE: Senti-Prompt As Classifying Embedding for sentiment analysis Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-28 Jinyoung Kim, Youngjoong Ko
In natural language processing, the general approach to sentiment analysis involves a pre-training and fine-tuning paradigm using pre-trained language models combined with classifier models. Recently, numerous studies have applied prompts not only to downstream generation but also to classification tasks. However, to fully utilize the advantages of prompts and incorporate the context-dependent
-
Adaptive watermarking with self-mutual check parameters in deep neural networks Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-24 Zhenzhe Gao, Zhaoxia Yin, Hongjian Zhan, Heng Yin, Yue Lu
Artificial Intelligence has found wide application, but also poses risks due to unintentional or malicious tampering during deployment. Regular checks are therefore necessary to detect and prevent such risks. Fragile watermarking is a technique used to identify tampering in AI models. However, previous methods have faced challenges including risks of omission, additional information transmission, and
-
GBCA: Graph Convolution Network and BERT combined with Co-Attention for fake news detection Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-23 Zhen Zhang, Qiyun Lv, Xiyuan Jia, Wenhao Yun, Gongxun Miao, Zongqing Mao, Guohua Wu
Social media has evolved into a widely influential information source in contemporary society. However, the widespread use of social media also enables the rapid spread of fake news, which can pose a significant threat to national and social stability. Current fake news detection methods primarily rely on graph neural networks, which analyze the dissemination patterns of news articles. Nevertheless
-
Attention based multi-task interpretable graph convolutional network for Alzheimer’s disease analysis Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-22 Shunqin Jiang, Qiyuan Feng, Hengxin Li, Zhenyun Deng, Qinghong Jiang
Alzheimer’s Disease impairs the memory and cognitive function of patients, and early intervention can effectively mitigate its deterioration. Most existing methods for Alzheimer’s analysis rely solely on medical images, ignoring the impact of some clinical indicators associated with the disease. Furthermore, these methods have thus far failed to identify the specific brain regions affected by the disease
-
Forensic analysis of AI-compression traces in spatial and frequency domain Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-22 Sandra Bergmann, Denise Moussa, Fabian Brand, André Kaup, Christian Riess
-
Less is more: A minimalist approach to robust GAN-generated face detection Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-22 Tanusree Ghosh, Ruchira Naskar
Hyper-realistic images that are not differentiable from authentic images to regular viewers have become extremely easy to generate and highly accessible. Furthermore, the increasing pervasiveness of social media networks in our daily lives has facilitated the easy dissemination of fake news accompanied by such synthetic images. Hyper-realistic artificial face images are often illicitly used as profile
-
Learning interactions across sentiment and emotion with graph attention network and position encodings Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-16 Ao Jia, Yazhou Zhang, Sagar Uprety, Dawei Song
Sentiment classification and emotion recognition are two closely related tasks in NLP. However, most recent studies have treated them as two separate tasks, neglecting their shared knowledge. In this paper, we propose a multi-task interactive graph attention network with position encodings, termed MIP-GAT, to improve the performance of each task by simultaneously leveraging similarities
-
PNSP: Overcoming catastrophic forgetting using Primary Null Space Projection in continual learning Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-15 DaiLiang Zhou, YongHong Song
Continual Learning (CL) plays a crucial role in enhancing learning performance for both new and previous tasks in continuous data streams, thus contributing to the advancement of cognitive computing. However, CL faces a fundamental challenge known as the stability-plasticity quandary. In this research, we present an innovative and effective CL algorithm called Primary Null Space Projection (PNSP) to
-
CrossFormer: Cross-guided attention for multi-modal object detection Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-15 Seungik Lee, Jaehyeong Park, Jinsun Park
Object detection is one of the essential tasks in a variety of real-world applications such as autonomous driving and robotics. In a real-world scenario, unfortunately, there are numerous challenges such as illumination changes, adverse weather conditions, and geographical changes, to name a few. To tackle the problem, we propose a novel multi-modal object detection model that is built upon a hierarchical
-
A lightness-aware loss for low-light image enhancement Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-14 Dian Xie, Huajun Xing, Liangyu Chen, Shijie Hao
Current low-light image enhancement methods have made great progress on improving the visibility of low-light images. Nevertheless, they pay less attention to preserving visual naturalness and therefore often introduce over-enhancement and local artifacts into their results. To address this issue, it is useful to introduce additional multi-view information of an image into enhancement models, such
-
Human Gait Recognition by using Two Stream Neural Network along with Spatial and Temporal Features Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-11 Asif Mehmood, Javeria Amin, Muhammad Sharif, Seifedine Kadry
Human Gait Recognition (HGR) is a biometric technique widely used to recognize an individual by their walking pattern. Key factors such as angle variation, clothing variation, foot shadows, and carrying conditions affect human gait. In this work, a new approach to HGR is proposed that contains five major steps. In the first step, the
-
M[formula omitted]TTS: Multi-modal text-to-speech of multi-scale style control for dubbing Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-10 Yan Liu, Li-Fang Wei, Xinyuan Qian, Tian-Hao Zhang, Song-Lu Chen, Xu-Cheng Yin
Dubbing refers to the procedure of recording characters by professional voice actors in films and games. It is more expressive and immersive than conventional Text-to-Speech (TTS) technologies and requires synchronization and style consistency of audio and video. Previous dubbing methods use video to provide either a global style vector or a local prosody embedding, limiting the expressiveness of the
-
CustomDepth: Customizing point-wise depth categories for depth completion Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-10 Shenglun Chen, Xinchen Ye, Hong Zhang, Haojie Li, Zhihui Wang
Classification-based depth completion methods have achieved remarkable performance. However, the result is still coarse due to the limitation of using unified depth categories to represent depth distribution. In this work, we propose CustomDepth which can customize exclusive depth categories for each image point to boost performance. To this end, CustomDepth introduces a depth subdivision module that
-
Feature enhancement and coarse-to-fine detection for RGB-D tracking Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-10 Xue-Feng Zhu, Tianyang Xu, Xiao-Jun Wu, Josef Kittler
Existing RGB-D tracking algorithms advance the performance by constructing typical appearance models from the RGB-only tracking frameworks. There is no attempt to exploit any complementary visual information from the multi-modal input. This paper addresses this deficit and presents a novel algorithm to boost the performance of RGB-D tracking by taking advantage of collaborative clues. To guarantee
-
Machine learning for low signal-to-noise ratio detection Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-09 Fred Lacy, Angel Ruiz-Reyes, Anthony Brescia
-
Data-efficient 3D instance segmentation by transferring knowledge from synthetic scans Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07 Xiaodong Wu, Ruiping Wang, Xilin Chen
The 3D comprehension ability of indoor environments is critical for robots. While deep learning-based methods have improved performance, they require significant amounts of annotated training data. Nevertheless, the cost of scanning and annotating point cloud data in real scenes is high, leading to data scarcity. Consequently, there is an urgent need to investigate data-efficient methods for point
-
On characterizing the evolution of embedding space of neural networks using algebraic topology Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07 S. Suresh, B. Das, V. Abrol, S. Dutta Roy
We study how the topology of feature embedding space changes as it passes through the layers of a well-trained deep neural network (DNN) through Betti numbers. Motivated by existing studies using simplicial complexes on shallow fully connected networks (FCN), we present an extended analysis using Cubical homology instead, with a variety of popular deep architectures and real image datasets. We demonstrate
-
Hierarchical reinforcement learning for chip-macro placement in integrated circuit Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07 Zhentao Tan, Yadong Mu
The complexity of chip design has consistently grown, adhering to Moore’s law. In this paper, we examine a crucial step in integrated circuit design called chip macro placement. Traditionally, human experts are consulted to optimize placement for reduced power consumption, but this requires significant effort. Recently, machine learning-based methods have emerged to address this task, showing promising
-
Enhanced blind face inpainting via structured mask prediction Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07 Honglei Li, Yifan Zhang, Wenmin Wang
Blind face inpainting is the task of automatically recovering an occluded face image without given masks indicating missing areas. Popular inpainting methods assume that the occlusion patterns are known with given occlusion masks. Previous blind inpainting methods, ignoring the structure in faces and occlusions, treat occlusion detection as an independent pixel prediction problem. To overcome the limitations
-
N-QGNv2: Predicting the optimum quadtree representation of a depth map from a monocular camera Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-05 Daniel Braun, Olivier Morel, Cédric Demonceaux, Pascal Vasseur
Self-supervised monocular depth prediction is a widely researched field that aims to provide a better scene understanding. However, most existing methods prioritize prediction accuracy over computation cost, which can hinder the deployment of these methods in real-world applications. Our objective is to propose a solution that efficiently compresses the depth map while maintaining a high level of accuracy
-
Even small correlation and diversity shifts pose dataset-bias issues Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-03 Alceu Bissoto, Catarina Barata, Eduardo Valle, Sandra Avila
Distribution shifts hinder the deployment of deep learning in real-world problems. Distribution shifts appear when train and test data come from different sources, which commonly happens in practice. Despite shifts occurring concurrently in many forms (e.g., correlation and diversity shifts) and intensities, the literature focuses only on severe and isolated shifts. In this work, we propose a comprehensive
-
Subdivided Mask Dispersion Framework for semi-supervised semantic segmentation Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-02 Yooseung Wang, Jaehyuk Jang, Changick Kim
Learning the relationship between weak and strong perturbations has been considered a major part of semi-supervised semantic segmentation. We observed two problems with a publicly used perturbation method, which randomly generates a mask with a single large bounding box. The large single bounding box entirely covers the important object components in an image, hindering the model from capturing