Latest papers from the journal Pattern Recognition Letters (Computer Science, general journals)

Weakly-supervised Incremental learning for Semantic segmentation with Class Hierarchy

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-11
Hyoseo Kim, Junsuk Choe

Although current semantic segmentation approaches have achieved impressive performance, their ability to incrementally learn new classes is limited. Moreover, pixel-by-pixel annotations are costly and time-consuming. Therefore, a new field called Weakly-supervised Incremental Learning for Semantic Segmentation (WILSS) has emerged, which learns new classes using image-level labels. However, image-level

Updated: 2024-04-11

Towards better small object detection in UAV scenes: Aggregating more object-oriented information

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-06
Chenyue Yang, Yichao Cao, Xiaobo Lu

Security, transportation, and rescue applications require fully analyzing the visual data interpretation via drone platforms. While various aspects of object detection research are expanding at a rapid pace, the detection of small objects in drone platforms continues to pose significant challenges. Specifically, targets in drone-captured scenarios are notoriously hard to detect due to factors such

Updated: 2024-04-06

Multiresolution causality of Bitcoin on GCC stock markets: Utilizing EMD-Granger analytical methodology

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-05
Foued Saâdaoui, Bochra Rabbouch, Harish Garg

This article employs an Empirical Mode Decomposition (EMD)-based multiresolution causality approach to explore the scale-by-scale interconnectedness between Bitcoin and the stock markets of Gulf Cooperation Council (GCC) countries. EMD is utilized to decompose signals into intrinsic mode functions (IMFs), which delineate variations across different frequency scales, thus facilitating the identification

Updated: 2024-04-05

Co–TES: Learning noisy labels with a Co-Teaching Exchange Student method

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-04
Chan Ho Shin, Seong-jun Oh

The performance of a machine-learning model is influenced by two main factors: the structure of the model, and the quality of the dataset it processes. As high-quality labeled data in substantial size is often difficult to obtain, there are ongoing efforts to develop machine learning algorithms that are robust with noisy datasets. Among these algorithms, multi-network learning utilizes learning from

Updated: 2024-04-04

A guided-based approach for deepfake detection: RGB-depth integration via features fusion

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-04-01
Giorgio Leporoni, Luca Maiano, Lorenzo Papa, Irene Amerini

Deep fake technology paves the way for a new generation of super realistic artificial content. While this opens the door to extraordinary new applications, the malicious use of deepfakes allows for far more realistic disinformation attacks than ever before. In this paper, we start from the intuition that generating fake content introduces possible inconsistencies in the depth of the generated images

Updated: 2024-04-01

Kreĭn twin support vector machines for imbalanced data classification

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-30
C. Jimenez-Castaño, A. Álvarez-Meza, D. Cárdenas-Peña, A. Orozco-Gutíerrez, J. Guerrero-Erazo

Conventional classification assumes a balanced sample distribution among classes. However, such a premise leads to biased performance over the majority class (with the highest number of instances). The Twin Support Vector Machines (TWSVM) obtained great prominence due to their low computational burden compared to the standard SVM. Besides, traditional machine learning seeks methods whose solution depends

Updated: 2024-03-30

Interpretable answer retrieval based on heterogeneous network embedding

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-30
Yongliang Wu, Xiao Pan, Jinghui Li, Shimao Dou, Xiaoxue Wang

Community question answering is a rising technology based on users' autonomous interactive behaviors, such as posting their issues, answering questions based on their experience, and commenting on existing questions. As a result of its use of natural language for communication and stimulation of user interest in information sharing, it has increasingly taken the place of other channels as the main

Updated: 2024-03-30

Paired relation feature network for spatial relation recognition

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-27
Nanxi Chen, Xu Wang, Qi Sun, Jiamao Li, Xiaolin Zhang

Recognizing relations between objects in an image is challenging for neural networks because some relations may not have obvious dedicated visual features. This paper proposes a Paired Relation Feature Network (PRFN), where all spatial and semantic features are extracted from the subject–object pair jointly, without using any hand-crafted features. PRFN includes a paired 2D spatial feature module that

Updated: 2024-03-27

Loose to compact feature alignment for domain adaptive object detection

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-27
Yang Li, Shanshan Zhang, Yunan Liu, Jian Yang

Recently, great achievements have been made for deep learning based object detection methods. But their performance drops significantly when domain shifts occur. To address this problem, in this work we propose a loose to compact feature alignment method under an unsupervised domain adaptation framework. The entire feature alignment is performed in a manner, so as to distribute the alignment difficulties

Updated: 2024-03-27

Multi-layer encoder–decoder time-domain single channel speech separation

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-27
Debang Liu, Tianqi Zhang, Mads Græsbøll Christensen, Chen Yi, Ying Wei

With the emergence of more advanced separation networks, significant progress has been made in time-domain speech separation methods. These methods typically use a temporal encoder–decoder structure to encode speech feature sequences, thereby accomplishing the separation task. However, due to the limitation of traditional encoder–decoder structure, the separation performance decreases sharply when

Updated: 2024-03-27

Keep DRÆMing: Discriminative 3D anomaly detection through anomaly simulation

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-27
Vitjan Zavrtanik, Matej Kristan, Danijel Skočaj

Recent surface anomaly detection methods rely on pretrained backbone networks for efficient anomaly detection. On standard RGB anomaly detection benchmarks these methods achieve excellent results but fail on 3D anomaly detection due to a lack of pretrained backbones that suit this domain. Additionally, there is a lack of industrial depth data that would enable the backbone network training that could

Updated: 2024-03-27

Joint facial action unit recognition and self-supervised optical flow estimation

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-27
Zhiwen Shao, Yong Zhou, Feiran Li, Hancheng Zhu, Bing Liu

Facial action unit (AU) recognition and optical flow estimation are two highly correlated tasks, since optical flow can provide motion information of facial muscles to facilitate AU recognition. However, most existing AU recognition methods handle the two tasks independently by offline extracting optical flow as auxiliary information or directly ignoring the use of optical flow. In this paper, we propose

Updated: 2024-03-27

Deep neural networks for automatic speaker recognition do not learn supra-segmental temporal features

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-26
Daniel Neururer, Volker Dellwo, Thilo Stadelmann

While deep neural networks have shown impressive results in automatic speaker recognition and related tasks, it is unsatisfactory how little is understood about what exactly is responsible for these results. Part of the success has been attributed in prior work to their capability to model supra-segmental temporal information (SST), i.e., learn rhythmic-prosodic characteristics of speech in addition

Updated: 2024-03-26

YOLO2U-Net: Detection-guided 3D instance segmentation for microscopy

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-24
Amirkoushyar Ziabari, Derek C. Rose, Abbas Shirinifard, David Solecki

Microscopy imaging techniques are instrumental for characterization and analysis of biological structures. As these techniques typically render 3D visualization of cells by stacking 2D projections, issues such as out-of-plane excitation and low resolution in the z-axis may pose challenges (even for human experts) to detect individual cells in 3D volumes as these non-overlapping cells may appear as overlapping

Updated: 2024-03-24

Graph contrastive learning with consistency regularization

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-21
Soohong Lee, Sangho Lee, Jaehwan Lee, Woojin Lee, Youngdoo Son

Contrastive learning has actively been used for unsupervised graph representation learning owing to its success in computer vision. Most graph contrastive learning methods use instance discrimination. It treats each instance as a distinct class against a query instance as the pretext task. However, such methods inevitably cause a class collision problem because some instances may belong to the same

Updated: 2024-03-21

EgoCap and EgoFormer: First-person image captioning with context fusion

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-20
Zhuangzhuang Dai, Vu Tran, Andrew Markham, Niki Trigoni, M. Arif Rahman, L.N.S. Wijayasingha, John Stankovic, Chen Li

First-person captioning is significant because it provides veracious descriptions of egocentric scenes in a unique perspective. Also, there is a need to caption the scene, a.k.a. life-logging, for patients, travellers, and emergency responders in an egocentric narrative. Ego-captioning is indeed non-trivial since (1) Ego-images can be noisy due to motion and angles; (2) Describing a scene in a first-person

Updated: 2024-03-20

Real-time 3-D image analysis via Jacobi moments

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-19
Puwei Wang, Simon Liao

In this research, we propose parallel GPU-accelerated algorithms to compute the Jacobi moments defined in a rectangular region with substantially improved computational efficiency and highly satisfactory accuracy. In our algorithms, the parallel 3-D matrix multiplications are adopted to increase the computational efficiency, while the techniques of coalesced memory access, shared memory and

Updated: 2024-03-19

Analysis of systems’ performance in natural language processing competitions

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-19
Sergio Nava-Muñoz, Mario Graff, Hugo Jair Escalante

Collaborative competitions have gained popularity in the scientific and technological fields. These competitions involve defining tasks, selecting evaluation scores, and devising result verification methods. In the standard scenario, participants receive a training set and are expected to provide a solution for a held-out dataset kept by organizers. An essential challenge for organizers arises when

Updated: 2024-03-19

A simple and efficient filter feature selection method via document-term matrix unitization

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-19
Qing Li, Shuai Zhao, Tengjiao He, Jinming Wen

Text processing tasks commonly grapple with the challenge of high dimensionality. One of the most effective solutions to this challenge is to preprocess text data through feature selection methods. Feature selection can select the most advantageous features for subsequent operations (e.g., classification) from the native feature space of the text. This process effectively trims the feature space’s

Updated: 2024-03-19

Uncovering the authorship: Linking media content to social user profiles

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-16
Daniele Baracchi, Dasara Shullani, Massimo Iuliani, Damiano Giani, Alessandro Piva

The extensive spread of fake news on social networks is carried out by a diverse range of users, encompassing private individuals, newspapers, and organizations. With widely accessible image and video editing tools, malicious users can easily create manipulated media. They can then distribute this content through multiple fake profiles, aiming to maximize its social impact. To tackle this problem effectively

Updated: 2024-03-16

Multimodal prediction of student performance: A fusion of signed graph neural networks and large language models

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-16
Sijie Wang, Lin Ni, Zeyu Zhang, Xiaoxuan Li, Xianda Zheng, Jiamou Liu

In online education platforms, accurately predicting student performance is essential for timely dropout prevention and interventions for at-risk students. This task is made difficult by the prevalent use of Multiple-Choice Questions (MCQs) in learnersourcing platforms, where noise in student-generated content and the limitations of existing unsigned graph-based models, specifically their inability

Updated: 2024-03-16

A more reliable local-global-guided network for correspondence pruning

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-15
Chengli Peng, Zhenyu Yang, Yiwei Lu, Zizhuo Li, Qiwen Jin

The correspondence pruning task relies on both local and global contexts, which are considered to be essential in inferring the probability of inliers. Many previous approaches seek to devise various structures to make effective use of them, but they either use only a plain structure or base it on their own hypothetical relationships, which leads to some limitations remaining to be improved. Derived

Updated: 2024-03-15

PDTE: Pyramidal deep Taylor expansion for optical flow estimation

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-15
Zifan Zhu, Qing An, Chen Huang, Zhenghua Huang, Likun Huang, Hao Fang

Optical flow estimation is an active research topic in computer vision. Although existing methods have made considerable progress in improving their performance, they still have drawbacks, such as heavy computational burden, inaccurate pixel-level offset estimation, and poor interpretability. To address these issues, this letter proposes a pyramidal deep Taylor expansion (PDTE) framework, including:

Updated: 2024-03-15

OSPC: Online Sequential Photometric Calibration

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-14
Jawad Haidar, Douaa Khalil, Daniel Asmar

Photometric calibration is essential to many computer vision applications. One of its key benefits is enhancing the performance of Visual SLAM, especially when it depends on a direct method for tracking, such as the standard KLT algorithm. Additionally, it proves valuable in extracting sensor irradiance values from measured intensities, serving as a pre-processing step for a number of vision algorithms

Updated: 2024-03-14

Improvised contrastive loss for improved face recognition in open-set nature

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-13
Zafran Khan, Abhijeet Boragule, Brian J. d’Auriol, Moongu Jeon

Face recognition models often encounter various unseen domains and environments in real-world applications, leading to unsatisfactory performance due to the open-set nature of face recognition. Models trained on central datasets may exhibit poor generalization when faced with different candidates under varying illumination and blur conditions. In this paper, our goal is to enhance the generalization

Updated: 2024-03-13

Learning on sample-efficient and label-efficient multi-view cardiac data with graph transformer

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-05
Lujing Wang, Yunting Ma, Wanqiu Zhang, Xiaoying Zhao, Xinxiang Zhao

Predicting cardiovascular disease has been a challenging task, as assessing samples based on a single view of information may be insufficient. Therefore, in this paper, we focus on the challenge of predicting cardiovascular disease using multi-view cardiac data. However, multi-view cardiac data is usually difficult to collect and label. Based on this motivation, learning an effective predictive model

Updated: 2024-03-05

A siamese-based verification system for open-set architecture attribution of synthetic images

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-05
Lydia Abady, Jun Wang, Benedetta Tondi, Mauro Barni

Despite the wide variety of methods developed for synthetic image attribution, most of them can only attribute images generated by models or architectures included in the training set and do not work with architectures, hindering their applicability in real-world scenarios. In this paper, we propose a verification framework that relies on a Siamese Network to address the problem of open-set attribution

Updated: 2024-03-05

Multifractal characterization and recognition of animal behavior based on deep wavelet transform

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-04
Kexin Meng, Shanjie Yang, Piercarlo Cattani, Shijiao Gao, Shuli Mei

The study conducts an in-depth exploration of the multifractal characteristics of dairy cows' behavioral data, aiming to reveal their complexity and representation in behavioral patterns. By means of Multifractal Detrended Fluctuation Analysis (MFDFA) in conjunction with deep wavelet transform, we extract multifractal indices that precisely depict the differences and dynamic changes of cows' behavior

Updated: 2024-03-04

Towards high-fidelity facial UV map generation in real-world

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-02
Yuanming Li, Jeong-gi Kwak, Bon-hwa Ku, David Han, Hanseok Ko

We present a framework for completing high-fidelity 3D facial UV maps from a single face image. Despite the success of Generative Adversarial Networks (GANs) in this area, generating accurate UV maps from in-the-wild images remains challenging. Our approach involves a novel network called “Map and Edit” that combines a 2D generative model and a 3D prior to explicitly control the generation of multi-view

Updated: 2024-03-02

Adaptive regularized ensemble for evolving data stream classification

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-01
Aldo M. Paim, Fabrício Enembreck

Extracting knowledge from data streams requires fast incremental algorithms that are able to handle unlimited processing and ever-changing data with finite memory. A strategy for this challenge is the use of ensembles owing to their ability to tackle concept drift and achieve highly accurate predictions. However, ensembles often require a lot of computational resources. In this study, we propose a

Updated: 2024-03-01

Channel-spatial knowledge distillation for efficient semantic segmentation

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-01
Ayoub Karine, Thibault Napoléon, Maher Jridi

In this paper, we propose a new lightweight Channel-Spatial Knowledge Distillation (CSKD) method to handle the task of efficient image semantic segmentation. More precisely, we investigate the KD approach, which trains a compressed neural network called the student under the supervision of a heavy one called the teacher. In this context, we propose to improve the distillation mechanism by capturing the contextual

Updated: 2024-03-01

Frame-part-activated deep reinforcement learning for Action Prediction

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-03-01
Lei Chen, Zhanjie Song

In this paper, we propose a frame-part-activated deep reinforcement learning (FPA-DRL) method for action prediction. Most existing methods for action prediction utilize the evolution of whole frames to model actions, which cannot avoid the noise of the current action, especially in the early prediction. Moreover, the loss of structural information of the human body diminishes the capacity of features to describe

Updated: 2024-03-01

Continual learning for adaptive social network identification

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-28
Simone Magistri, Daniele Baracchi, Dasara Shullani, Andrew D. Bagdanov, Alessandro Piva

The popularity of social networks as primary mediums for sharing visual content has made it crucial for forensic experts to identify the original platform of multimedia content. Various methods address this challenge, but the constant emergence of new platforms and updates to existing ones often render forensic tools ineffective shortly after release. This necessitates the regular updating of methods

Updated: 2024-02-28

SPACE: Senti-Prompt As Classifying Embedding for sentiment analysis

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-28
Jinyoung Kim, Youngjoong Ko

In natural language processing, the general approach to sentiment analysis involves a pre-training and fine-tuning paradigm using pre-trained language models combined with classifier models. Recently, numerous studies have applied prompts not only to downstream generation but also to classification tasks. However, to fully utilize the advantages of prompts and incorporate the context-dependent

Updated: 2024-02-28

Adaptive watermarking with self-mutual check parameters in deep neural networks

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-24
Zhenzhe Gao, Zhaoxia Yin, Hongjian Zhan, Heng Yin, Yue Lu

Artificial Intelligence has found wide application, but also poses risks due to unintentional or malicious tampering during deployment. Regular checks are therefore necessary to detect and prevent such risks. Fragile watermarking is a technique used to identify tampering in AI models. However, previous methods have faced challenges including risks of omission, additional information transmission, and

Updated: 2024-02-24

GBCA: Graph Convolution Network and BERT combined with Co-Attention for fake news detection

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-23
Zhen Zhang, Qiyun Lv, Xiyuan Jia, Wenhao Yun, Gongxun Miao, Zongqing Mao, Guohua Wu

Social media has evolved into a widely influential information source in contemporary society. However, the widespread use of social media also enables the rapid spread of fake news, which can pose a significant threat to national and social stability. Current fake news detection methods primarily rely on graph neural networks, which analyze the dissemination patterns of news articles. Nevertheless

Updated: 2024-02-23

Attention based multi-task interpretable graph convolutional network for Alzheimer’s disease analysis

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-22
Shunqin Jiang, Qiyuan Feng, Hengxin Li, Zhenyun Deng, Qinghong Jiang

Alzheimer’s Disease impairs the memory and cognitive function of patients, and early intervention can effectively mitigate its deterioration. Most existing methods for Alzheimer’s analysis rely solely on medical images, ignoring the impact of some clinical indicators associated with the disease. Furthermore, these methods have thus far failed to identify the specific brain regions affected by the disease

Updated: 2024-02-22

Less is more: A minimalist approach to robust GAN-generated face detection

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-22
Tanusree Ghosh, Ruchira Naskar

Hyper-realistic images that are not differentiable from authentic images to regular viewers have become extremely easy to generate and highly accessible. Furthermore, the increasing pervasiveness of social media networks in our daily lives has facilitated the easy dissemination of fake news accompanied by such synthetic images. Hyper-realistic artificial face images are often illicitly used as profile

Updated: 2024-02-22

Learning interactions across sentiment and emotion with graph attention network and position encodings

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-16
Ao Jia, Yazhou Zhang, Sagar Uprety, Dawei Song

Sentiment classification and emotion recognition are two closely related tasks in NLP. However, most of the recent studies have treated them as two separate tasks, where the shared knowledge is neglected. In this paper, we propose a multi-task interactive graph attention network with position encodings, termed MIP-GAT, to improve the performance of each task by simultaneously leveraging similarities

Updated: 2024-02-16

PNSP: Overcoming catastrophic forgetting using Primary Null Space Projection in continual learning

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-15
DaiLiang Zhou, YongHong Song

Continual Learning (CL) plays a crucial role in enhancing learning performance for both new and previous tasks in continuous data streams, thus contributing to the advancement of cognitive computing. However, CL faces a fundamental challenge known as the stability-plasticity quandary. In this research, we present an innovative and effective CL algorithm called Primary Null Space Projection (PNSP) to

Updated: 2024-02-15

CrossFormer: Cross-guided attention for multi-modal object detection

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-15
Seungik Lee, Jaehyeong Park, Jinsun Park

Object detection is one of the essential tasks in a variety of real-world applications such as autonomous driving and robotics. In a real-world scenario, unfortunately, there are numerous challenges such as illumination changes, adverse weather conditions, and geographical changes, to name a few. To tackle the problem, we propose a novel multi-modal object detection model that is built upon a hierarchical

Updated: 2024-02-15

A lightness-aware loss for low-light image enhancement

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-14
Dian Xie, Huajun Xing, Liangyu Chen, Shijie Hao

Current low-light image enhancement methods have made great progress on improving the visibility of low-light images. Nevertheless, they pay less attention to preserving visual naturalness and therefore often introduce over-enhancement and local artifacts into their results. To address this issue, it is useful to introduce additional multi-view information of an image into enhancement models, such

Updated: 2024-02-14

Human Gait Recognition by using Two Stream Neural Network along with Spatial and Temporal Features

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-11
Asif Mehmood, Javeria Amin, Muhammad Sharif, Seifedine Kadry

Human Gait Recognition (HGR) is referred to as a biometric tactic that is broadly used for the recognition of an individual by using the pattern of walking. There are some key factors such as angle variation, clothing variation, foot shadows, and carrying conditions that affect the human gait. In this work, a new approach is proposed for the HGR that contains five major steps. In the first step, the

Updated: 2024-02-11

M[formula omitted]TTS: Multi-modal text-to-speech of multi-scale style control for dubbing

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-10
Yan Liu, Li-Fang Wei, Xinyuan Qian, Tian-Hao Zhang, Song-Lu Chen, Xu-Cheng Yin

Dubbing refers to the procedure of recording characters by professional voice actors in films and games. It is more expressive and immersive than conventional Text-to-Speech (TTS) technologies and requires synchronization and style consistency of audio and video. Previous dubbing methods use video to provide either a global style vector or a local prosody embedding, limiting the expressiveness of the

Updated: 2024-02-10

CustomDepth: Customizing point-wise depth categories for depth completion

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-10
Shenglun Chen, Xinchen Ye, Hong Zhang, Haojie Li, Zhihui Wang

Classification-based depth completion methods have achieved remarkable performance. However, the result is still coarse due to the limitation of using unified depth categories to represent depth distribution. In this work, we propose CustomDepth which can customize exclusive depth categories for each image point to boost performance. To this end, CustomDepth introduces a depth subdivision module that

Updated: 2024-02-10

Feature enhancement and coarse-to-fine detection for RGB-D tracking

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-10
Xue-Feng Zhu, Tianyang Xu, Xiao-Jun Wu, Josef Kittler

Existing RGB-D tracking algorithms advance the performance by constructing typical appearance models from the RGB-only tracking frameworks. There is no attempt to exploit any complementary visual information from the multi-modal input. This paper addresses this deficit and presents a novel algorithm to boost the performance of RGB-D tracking by taking advantage of collaborative clues. To guarantee

Updated: 2024-02-10

Machine learning for low signal-to-noise ratio detection

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-09
Fred Lacy, Angel Ruiz-Reyes, Anthony Brescia

Updated: 2024-02-09

Data-efficient 3D instance segmentation by transferring knowledge from synthetic scans

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07
Xiaodong Wu, Ruiping Wang, Xilin Chen

The 3D comprehension ability of indoor environments is critical for robots. While deep learning-based methods have improved performance, they require significant amounts of annotated training data. Nevertheless, the cost of scanning and annotating point cloud data in real scenes is high, leading to data scarcity. Consequently, there is an urgent need to investigate data-efficient methods for point

Updated: 2024-02-07

On characterizing the evolution of embedding space of neural networks using algebraic topology

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07
S. Suresh, B. Das, V. Abrol, S. Dutta Roy

We study how the topology of feature embedding space changes as it passes through the layers of a well-trained deep neural network (DNN) through Betti numbers. Motivated by existing studies using simplicial complexes on shallow fully connected networks (FCN), we present an extended analysis using Cubical homology instead, with a variety of popular deep architectures and real image datasets. We demonstrate

Updated: 2024-02-07

Hierarchical reinforcement learning for chip-macro placement in integrated circuit

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07
Zhentao Tan, Yadong Mu

The complexity of chip design has consistently grown, adhering to Moore’s law. In this paper, we examine a crucial step in integrated circuit design called chip macro placement. Traditionally, human experts are consulted to optimize placement for reduced power consumption, but this requires significant effort. Recently, machine learning-based methods have emerged to address this task, showing promising

Updated: 2024-02-07

Enhanced blind face inpainting via structured mask prediction

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-07
Honglei Li, Yifan Zhang, Wenmin Wang

Blind face inpainting is the task of automatically recovering an occluded face image without given masks indicating missing areas. Popular inpainting methods assume that the occlusion patterns are known with given occlusion masks. Previous blind inpainting methods, ignoring the structure in faces and occlusions, treat occlusion detection as an independent pixel prediction problem. To overcome the limitations

Updated: 2024-02-07

N-QGNv2: Predicting the optimum quadtree representation of a depth map from a monocular camera

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-05
Daniel Braun, Olivier Morel, Cédric Demonceaux, Pascal Vasseur

Self-supervised monocular depth prediction is a widely researched field that aims to provide a better scene understanding. However, most existing methods prioritize prediction accuracy over computation cost, which can hinder the deployment of these methods in real-world applications. Our objective is to propose a solution that efficiently compresses the depth map while maintaining a high level of accuracy

Updated: 2024-02-05

Editorial Board

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-04

Updated: 2024-02-04

Even small correlation and diversity shifts pose dataset-bias issues

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-03
Alceu Bissoto, Catarina Barata, Eduardo Valle, Sandra Avila

Distribution shifts hinder the deployment of deep learning in real-world problems. Distribution shifts appear when train and test data come from different sources, which commonly happens in practice. Despite shifts occurring concurrently in many forms (e.g., correlation and diversity shifts) and intensities, the literature focuses only on severe and isolated shifts. In this work, we propose a comprehensive

Updated: 2024-02-03

Subdivided Mask Dispersion Framework for semi-supervised semantic segmentation

Pattern Recogn. Lett. (IF 5.1) Pub Date : 2024-02-02
Yooseung Wang, Jaehyuk Jang, Changick Kim

Learning the relationship between weak and strong perturbations has been considered a major part of semi-supervised semantic segmentation. We observed two problems with a publicly used perturbation method, which randomly generates a mask with a single large bounding box. The large single bounding box entirely covers the important object components in an image, hindering the model from capturing

Updated: 2024-02-02