-
A review on network representation learning with multi-granularity perspective Intell. Data Anal. (IF 1.7) Pub Date : 2024-02-03 Shun Fu, Lufeng Wang, Jie Yang
Network data is ubiquitous, such as telecommunication, transport systems, online social networks, protein-protein interactions, etc. Since the huge scale and the complexity of network data, former machine learning system tried to understand network data arduously. On the other hand, thought of multi-granular cognitive computation simulates the problem-solving process of human brains. It simplifies
-
Knowledge graph embedding in a uniform space Intell. Data Anal. (IF 1.7) Pub Date : 2024-02-03 Da Tong, Shudong Chen, Rong Ma, Donglin Qi, Yong Yu
Knowledge graph embedding (KGE) is typically used for link prediction to automatically predict missing links in knowledge graphs. Current KGE models are mainly based on complicated mathematical associations, which are highly expressive but ignore the uniformity behind the classical bilinear translational model TransE, a model that embeds all entities of knowledge graphs in a uniform space, enabling
-
How graph features from message passing affect graph classification and regression? Intell. Data Anal. (IF 1.7) Pub Date : 2024-02-03 Masatsugu Yamada, Mahito Sugiyama
Graph neural networks (GNNs) have been applied to various graph domains. However, GNNs based on the message passing scheme, which iteratively aggregates information from neighboring nodes, have difficulty learning to represent larger subgraph structures because of the nature of the scheme. We investigate the prediction performance of GNNs when the number of message passing iteration increases
-
TSAGNN: Temporal link predict method based on two stream adaptive graph neural network Intell. Data Anal. (IF 1.7) Pub Date : 2024-02-03 Yuhang Zhu, Jing Guo, Haitao Li, Shuxin Liu, Yingle Li
Temporal link prediction based on graph neural networks has become a hot spot in the field of complex networks. To solve the problems of the existing temporal link prediction methods based on graph neural networks do not consider the future time-domain features and spatial-domain features are limited used, this paper proposes a novel temporal link prediction method based on two streams adaptive graph
-
Evolutionary feature selection based on hybrid bald eagle search and particle swarm optimization Intell. Data Anal. (IF 1.7) Pub Date : 2024-02-03 Zhao Liu, Aimin Wang, Geng Sun, Jiahui Li, Haiming Bao, Yanheng Liu
Feature selection is a complicated multi-objective optimization problem with aims at reaching to the best subset of features while remaining a high accuracy in the field of machine learning, which is considered to be a difficult task. In this paper, we design a fitness function to jointly optimize the classification accuracy and the selected features in the linear weighting manner. Then, we propose
-
Supervised probabilistic latent semantic analysis with applications to controversy analysis of legislative bills Intell. Data Anal. (IF 1.7) Pub Date : 2024-02-03 Eyor Alemayehu, Yi Fang
Probabilistic Latent Semantic Analysis (PLSA) is a fundamental text analysis technique that models each word in a document as a sample from a mixture of topics. PLSA is the precursor of probabilistic topic models including Latent Dirichlet Allocation (LDA). PLSA, LDA and their numerous extensions have been successfully applied to many text mining and retrieval tasks. One important extension
-
Economic and financial news hybrid-classification based on category-associated feature set Intell. Data Anal. (IF 1.7) Pub Date : 2024-02-03 Wilawan Yathongkhum, Yongyut Laosiritaworn, Jakramate Bootkrajang, Pucktada Treeratpituk, Jeerayut Chaijaruwanich
A large amount of economic and financial news is now accessible through various news websites and social media platforms. Categorizing them into appropriate categories can be advantageous for various tasks, such as sentiment analysis and news-based market prediction. Unfortunately, news headlines categories may contain ambiguities due to the subjective nature of label assignment by authors or publishers
-
The role of consultative leadership on administrative development Intell. Data Anal. (IF 1.7) Pub Date : 2024-02-03 Ayas Mohammed Rasheed Omar, Khairi Ali Auso
Consultative leadership is a democratic style that deliberately incorporates employees into organizational management and decision-making to increase employees’ feelings of ownership and align their objectives with company objectives. As a result, during their everyday tasks, leaders constantly utilize “consultation management” for their staff. As examples, consider how to coordinate reports, communicate
-
Resformer: Combine quadratic linear transformation with efficient sparse Transformer for long-term series forecasting Intell. Data Anal. (IF 1.7) Pub Date : 2023-11-20 Gongguan Chen, Hua Wang, Yepeng Liu, Mingli Zhang, Fan Zhang
With the continuous development of deep learning, long sequence time-series forecasting (LSTF) has attracted more and more attention in power consumption prediction, traffic prediction and stock prediction. In recent studies, various improved models of Transformer are favored. While these models have made breakthroughs in reducing the time and space complexity of Transformer, there are still some problems
-
HSNF: Hybrid sampling with two-step noise filtering for imbalanced data classification Intell. Data Anal. (IF 1.7) Pub Date : 2023-11-20 Lilong Duan, Wei Xue, Xiaolei Gu, Xiao Luo, Yongsheng He
Imbalanced data classification has received much attention in machine learning, and many oversampling methods exist to solve this problem. However, these methods may suffer from insufficient noise filtering, overlap between synthetic and original samples, etc., resulting in degradation of classification performance. To this end, we propose a hybrid sampling with two-step noise filtering (HSNF) method
-
Heterogeneous information fusion based graph collaborative filtering recommendation Intell. Data Anal. (IF 1.7) Pub Date : 2023-11-20 Ruihui Mu, Xiaoqin Zeng, Jiying Zhang
Nowadays, with the application of 5G, graph-based recommendation algorithms have become a research hotspot. Graph neural networks encode the graph structure information in the node representation through an iterative neighbor aggregation method, which can effectively alleviate the problem of data sparsity. In addition, more and more information graph can be used in collaborative filtering recommendation
-
SARW: Similarity-Aware Random Walk for GCN Intell. Data Anal. (IF 1.7) Pub Date : 2023-11-20 Linlin Hou, Haixiang Zhang, Qing-Hu Hou, Alan J.X. Guo, Ou Wu, Ting Yu, Ji Zhang
Graph Convolutional Network (GCN) is an important method for learning graph representations of nodes. For large-scale graphs, the GCN could meet with the neighborhood expansion phenomenon, which makes the model complexity high and the training time long. An efficient solution is to adopt graph sampling techniques, such as node sampling and random walk sampling. However, the existing sampling methods
-
Small object detection based on attention mechanism and enhanced network Intell. Data Anal. (IF 1.7) Pub Date : 2023-11-20 Bingbing Wang, Fengxiang Zhang, Kaipeng Li, Kuijie Shi, Lei Wang, Gang Liu
Small object detection has a broad application prospect in image processing of unmanned aerial vehicles, autopilot and remote sensing. However, some difficulties exactly exist in small object detection, such as aggregation, occlusion and insufficient feature extraction, resulting in a great challenge for small object detection. In this paper, we propose an improved algorithm for small object detection
-
A lightweight vision transformer with symmetric modules for vision tasks Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-19 Shengjun Liang, Mingxin Yu, Wenshuai Lu, Xinglong Ji, Xiongxin Tang, Xiaolin Liu, Rui You
Transformer-based networks have demonstrated their powerful performance in various vision tasks. However, these transformer-based networks are heavyweight and cannot be applied to edge computing (mobile) devices. Despite that the lightweight transformer network has emerged, several problems remain, i.e., weak feature extraction ability, feature redundancy, and lack of convolutional inductive bias. To
-
Multiple Distilling-based spatial-temporal attention networks for unsupervised human action recognition Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-13 Cheng Zhang, Jianqi Zhong, Wenming Cao, Jianhua Ji
Unsupervised action recognition based on spatiotemporal fusion feature extraction has attracted much attention in recent years. However, existing methods still have several limitations: (1) The long-term dependence relationship is not effectively extracted at the time level. (2) The high-order motion relationship between non-adjacent nodes is not effectively captured at the spatial level. (3) The model
-
Ordination-based verification of feature selection in pattern evolution research Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-12 Gábor Hosszú
This article explains the idea of pattern systems that develop gradually. These systems involve symbolic communication that includes symbols, syntax, and layout rules. Some pattern systems change over time, like historical scripts. The scientific study of pattern systems is called pattern evolution research, and scriptinformatics is concerned with the modelling of the evolution of scripts. The symbol
-
CMCEE: A joint learning framework for cascade decoding with multi-feature fusion and conditional enhancement for overlapping event extraction Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-12 Zerui Dai, Shengwei Tian, Long Yu, Qimeng Yang
Event extraction (EE) is an important natural language processing task. With the passage of time, many powerful and effective models for event extraction tasks have been developed. However, there has been limited research on complex overlapping event extraction. Therefore, we propose a new cascade decoding model: A Joint Learning Framework for Cascade Decoding with Multi-Feature Fusion and Conditional
-
Generate custom travel magazine layouts Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-12 Xiangping Wu, Shuaiwei Yao, Zheng Zhang, Jun Hu
Among the problems of specifying the style and number of elements of a travel magazine, the problem of generating magazine layout by constraining text, and constraining graph layout remains a complex and unsolved problem. In this paper, we generate layouts of text satisfying constraints via GAN. Due to the complexity and variety of graph designs, we enhance the performance of the discriminator and
-
Learning bayesian multinets from labeled and unlabeled data for knowledge representation Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-09 Meng Pang, Limin Wang, Qilong Li, Guo Lu, Kuo Li
The Bayesian network classifiers (BNCs) learned from labeled training data are expected to generalize to fit unlabeled testing data based on the independent and identically distributed (i.i.d.) assumption, whereas the asymmetric independence assertion demonstrates the uncertainty of significance of dependency or independency relationships mined from data. A highly scalable BNC should form a distinct
-
A multi-instance multi-label learning algorithm based on radial basis functions and multi-objective particle swarm optimization Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-08 Xiang Bao, Fei Han, Qing-Hua Ling, Yan-Qiong Ren
Radial basis function (RBF) neural networks for Multi-Instance Multi-Label (MIML) directly can exploit the connections between instances and labels so that they can preserve useful prior information, but they only adopt Gaussian radial basis function as their RBF whose parameters are difficult to determine. In this paper, parameters can be obtained by multi-objective optimization methods with multi
-
An improved k-NN anomaly detection framework based on locality sensitive hashing for edge computing environment Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-06 Cong Gao, Yuzhe Chen, Yanping Chen, Zhongmin Wang, Hong Xia
Large deployment of wireless sensor networks in various fields bring great benefits. With the increasing volume of sensor data, traditional data collection and processing schemes gradually become unable to meet the requirements in actual scenarios. As data quality is vital to data mining and value extraction, this paper presents a distributed anomaly detection framework which combines cloud computing
-
Oversampling method based on GAN for tabular binary classification problems Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-06 Jie Yang, Zhenhao Jiang, Tingting Pan, Yueqi Chen, Witold Pedrycz
Data-imbalanced problems are present in many applications. A big gap in the number of samples in different classes induces classifiers to skew to the majority class and thus diminish the performance of learning and quality of obtained results. Most data level imbalanced learning approaches generate new samples only using the information associated with the minority samples through linearly generating
-
Data management scheme for building internet of things based on blockchain sharding Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-06 Xu Wang, Wenhu Zheng, Jinlong Wang, Xiaoyun Xiong, Yumin Shen, Wei Mu, Zengliang Fan
As an important part of digital building, building internet of things (BIoT) plays a positive role in promoting the construction of smart cities. Existing schemes utilize blockchain to achieve trusted data storage in BIoT. However, the full-copy storage mechanism of blockchain and the management requirements of massive data have brought computing and storage challenges to edge nodes with limited resources
-
NonPC: Non-parametric clustering algorithm with adaptive noise detecting Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-06 Lin Li, Xiang Chen, Chengyun Song
Graph-based clustering performs efficiently for identifying clusters in local and nonlinear data patterns. The existing methods face the problem of parameter selection, such as the setting of k of the k-nearest neighbor graph and the threshold in noise detection. In this paper, a non-parametric clustering algorithm (NonPC) is proposed to tackle those inherent limitations and improve clustering performance
-
Mining skyline frequent-utility patterns from big data environment based on MapReduce framework Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-06 Jimmy Ming-Tai Wu, Ranran Li, Mu-En Wu, Jerry Chun-Wei Lin
When the concentration focuses on data mining, frequent itemset mining (FIM) and high-utility itemset mining (HUIM) are commonly addressed and researched. Many related algorithms are proposed to reveal the general relationship between utility, frequency, and items in transaction databases. Although these algorithms can mine FIMs or HUIMs quickly, these algorithms merely take into account frequency or
-
Design of an energy efficient dynamic virtual machine consolidation model for smart cities in urban areas Intell. Data Anal. (IF 1.7) Pub Date : 2023-10-06 Nirmal Kr. Biswas, Sourav Banerjee, Uttam Ghosh, Utpal Biswas
The growing smart cities in urban areas are becoming more intelligent day by day. Massive storage and high computational resources are required to provide smart services in urban areas. It can be provided through intelligence cloud computing. The establishment of large-scale cloud data centres is rapidly increasing to provide utility-based services in urban areas. Enormous energy consumption of data
-
Exploiting scatter matrix on one-class support vector machine based on low variance direction Intell. Data Anal. (IF 1.7) Pub Date : 2023-09-26 Soumaya Nheri, Riadh Ksantini, Mohamed Bécha Kaâniche, Adel Bouhoula
When building a performing one-class classifier, the low variance direction of the training data set might provide important information. The low variance direction of the training data set improves the Covariance-guided One-Class Support Vector Machine (COSVM), resulting in better accuracy. However, this classifier does not use data dispersion in the one class. It explicitly does not make use of target
-
Incremental density clustering framework based on dynamic microlocal clusters Intell. Data Anal. (IF 1.7) Pub Date : 2023-09-26 Tao Zhang, Decai Li, Jingya Dong, Yuqing He, Yanchun Chang
With the prevailing development of the internet and sensors, various streaming raw data are generated continually. However, traditional clustering algorithms are unfavorable for discovering the underlying patterns of incremental data in time; clustering accuracy cannot be assured if fixed parameters clustering algorithms are used to handle incremental data. In this paper, an Incremental-Density-Micro-Clustering
-
Learning hierarchical embedding space for image-text matching Intell. Data Anal. (IF 1.7) Pub Date : 2023-09-14 Sun Hao, Xiaolin Qin, Xiaojing Liu
There are two mainstream strategies for image-text matching at present. The one, termed as joint embedding learning, aims to model the semantic information of both image and sentence in a shared feature subspace, which facilitates the measurement of semantic similarity but only focuses on global alignment relationship. To explore the local semantic relationship more fully, the other one, termed as
-
A multi-domain adaptive neural machine translation method based on domain data balancer Intell. Data Anal. (IF 1.7) Pub Date : 2023-09-14 Jinlei Xu, Yonghua Wen, Shuanghong Huang, Zhengtao Yu
Most methods for multi-domain adaptive neural machine translation (NMT) currently rely on mixing data from multiple domains in a single model to achieve multi-domain translation. However, this mixing can lead to imbalanced training data, causing the model to focus on training for the large-scale general domain while ignoring the scarce resources of specific domains, resulting in a decrease in translation
-
Conversational recommender based on graph sparsification and multi-hop attention Intell. Data Anal. (IF 1.7) Pub Date : 2023-09-14 Yihao Zhang, Yuhao Wang, Wei Zhou, Pengxiang Lan, Haoran Xiang, Junlin Zhu, Meng Yuan
Conversational recommender systems provide users with item recommendations via interactive dialogues. Existing methods using graph neural networks have been proven to be an adequate representation of the learning framework for knowledge graphs. However, the knowledge graph involved in the dialogue context is vast and noisy, especially the noise graph nodes, which restrict the primary node’s aggregation
-
Cross-modality semantic guidance for multi-label image classification Intell. Data Anal. (IF 1.7) Pub Date : 2023-09-14 Jun Huang, Dian Wang, Xudong Hong, Xiwen Qu, Wei Xue
Multi-label image classification aims to predict a set of labels that are present in an image. The key challenge of multi-label image classification lies in two aspects: modeling label correlations and utilizing spatial information. However, the existing approaches mainly calculate the correlation between labels according to co-occurrence among them. While the result is easily affected by the label
-
Pure large kernel convolutional neural network transformer for medical image registration Intell. Data Anal. (IF 1.7) Pub Date : 2023-09-14 Zhao Fang, Wenming Cao
Deformable medical image registration is a fundamental and critical task in medical image analysis. Recently, deep learning-based methods have rapidly developed and have shown impressive results in deformable image registration. However, existing approaches still suffer from limitations in registration accuracy or generalization performance. To address these challenges, in this paper, we propose a
-
Boosting active domain adaptation with exploration of samples Intell. Data Anal. (IF 1.7) Pub Date : 2023-09-14 Qing Tian, Heng Zhang
Nowadays, the idea of active learning is gradually adopted to assist domain adaptation. However, due to the existence of domain shift, the traditional active learning methods originating from semi-supervised scenarios can not be directly applied to domain adaptation. To solve the problem, active domain adaptation is proposed as a new domain adaptation paradigm, which aims to improve the performance
-
A feature-aware long-short interest evolution network for sequential recommendation Intell. Data Anal. (IF 1.7) Pub Date : 2023-09-14 Jing Tang, Yongquan Fan, Yajun Du, Xianyong Li, Xiaoliang Chen
Recommendation systems are an effective solution to deal with information overload, particularly in the e-commerce sector, in which sequential recommendation is extensively utilized. Sequential recommendations aim to acquire users’ interests and provide accurate recommendations by analyzing users’ historical interaction sequences. To improve recommendation performance, it is vital to take into account
-
E-Health technological barriers faced by Iraqi healthcare institutions Intell. Data Anal. (IF 1.7) Pub Date : 2023-08-24 Saif Mohammed Ali, M.A. Burhanuddin, Ali Taha Yaseen, Mustafa Musa Jaber, Mustafa Mohammed Jassim, Aseel Mohammed Ali, Ahmed Alkhayyat, Mohammed A. Mohammed, Auday A.H. Mohamad
The health records management issues have detrimentally affected the Iraqi healthcare sector resultant from the inferior information technology integrity and the complicatedness of data. In order to resolve this problem, other methods of storage, management, and retrieval of health-related data can be offered by e-Health services. These aspects are important in tracking patients’ health conditions using
-
Comparative observation of changes in natriuretic peptides before and after interventional therapy for congenital heart disease Intell. Data Anal. (IF 1.7) Pub Date : 2023-08-24 Xinghui Liu, Hongwen Tan, Xiaoqiao Liu, Qiang Wu
OBJECTIVE: To explore changes in the plasma atrial natriuretic peptide (ANP) and brain natriuretic peptide (BNP) in patients with left-to-right shunt congenital heart disease (CHD) before and in the early stage after interventional occlusion and to evaluate the clinical significance. METHODS: Among 97 patients with left-to-right shunt CHD undergoing interventional occlusion, 34 cases had a VSD (ventricular
-
Deep learning models for predicting the position of the head on an X-ray image for Cephalometric analysis Intell. Data Anal. (IF 1.7) Pub Date : 2023-08-17 K. Prasanna, Chinna Babu Jyothi, M. Sandeep Kumar, J. Prabhu, Abdu Saif, Dinesh Jackson Samuel
Cephalometric analysis is used to identify problems in the development of the skull, evaluate their treatment, and plan for possible surgical interventions. The paper aims to develop a Convolutional Neural Network that will analyze the head position on an X-ray image. It takes place in such a way that it recognizes whether the image is suitable and, if not, suggests a change in the position
-
A hybrid classification method via keywords screening and attention mechanisms in extreme short text Intell. Data Anal. (IF 1.7) Pub Date : 2023-08-10 Xinke Zhou, Yi Zhu, Yun Li, Jipeng Qiang, Yunhao Yuan, Xingdong Wu, Runmei Zhang
Short text classification has provoked a vast amount of attention and research in recent decades. However, most existing methods only focus on the short texts that contain dozens of words like Twitter and Microblog, while pay far less attention to the extreme short texts like news headline and search snippets. Meanwhile, contemporary short text classification methods that extend the features via external
-
I2R: Intra and inter-modal representation learning for code search Intell. Data Anal. (IF 1.7) Pub Date : 2023-08-10 Xu Zhang, Yanzheng Xiang, Zejie Liu, Xiaoyu Hu, Deyu Zhou
Code search, which locates code snippets in large code repositories based on natural language queries entered by developers, has become increasingly popular in the software development process. It has the potential to improve the efficiency of software developers. Recent studies have demonstrated the effectiveness of using deep learning techniques to represent queries and codes accurately for code
-
A systematic review on recommendation systems applied to chronic diseases Intell. Data Anal. (IF 1.7) Pub Date : 2023-08-10 Ana Vieira, João Carneiro, Paulo Novais, Juan Corchado, Goreti Marreiros
A large percentage of the worldwide population is affected by chronic diseases, leading to a burden of the patient and the national healthcare systems. Recommendation systems are used for the personalization of healthcare due to their capacity of performing predictive analyses based on the patient’s clinical data. This systematic literature review presents four research questions to provide an overall
-
Temporal attention-aware evidential recurrent network for trustworthy prediction of Alzheimer’s disease progression Intell. Data Anal. (IF 1.7) Pub Date : 2023-08-10 Chenran Zhang, Qingsen Bao, Feng Zhang, Ping Li, Lei Chen
Accurate and reliable prediction of Alzheimer’s disease (AD) progression is crucial for effective interventions and treatment to delay its onset. Recently, deep learning models for AD progression achieve excellent predictive accuracy. However, their predictions lack reliability due to the non-calibration defects, that affects their recognition and acceptance. To address this issue, this paper proposes
-
A fast and distributed C4.5 algorithm for urban big data Intell. Data Anal. (IF 1.7) Pub Date : 2023-08-10 Wan-Shu Cheng, Peng-Yu Huang, Jheng-Yu Huang, Ju-Chin Chen, Kawuu W. Lin
The amount of information nowadays is rapidly growing. Aside from valuable information, information that is unrelated to a target or is meaningless is also growing. Big data and broader digital technologies are considered the primary components of smart city governance and planning. Big data analysis is considered to define a new era in urban planning, research, and policy. Effective data mining and
-
Detection of multi-size peach in orchard using RGB-D camera combined with an improved DEtection Transformer model Intell. Data Anal. (IF 1.7) Pub Date : 2023-08-07 Yu Yang, Xin Wang, Zhenfang Liu, Min Huang, Shangpeng Sun, Qibing Zhu
The first major contribution of the paper is the proposal of using an improved DEtection Transformer network (named R2N-DETR) and Kinect-V2 camera for detecting multiple-size peaches under orchards with varied illumination and fruit occlusion. R2N-DETR model first employed Res2Net-50 to extract a fused low-high level feature map containing fine spatial features and precise semantic information of multi-size
-
A multi-layer multi-view stacking model for credit risk assessment Intell. Data Anal. (IF 1.7) Pub Date : 2023-08-01 Wenfang Han, Xiao Gu, Ling Jian
Credit risk assessment plays a key role in determining the banking policies and commercial strategies of financial institutions. Ensemble learning approaches have been validated to be more competitive than individual classifiers and statistical techniques for default prediction. However, most researches focused on improving overall prediction accuracy rather than improving the identification of actual
-
Adversarial unsupervised domain adaptation based on generative adversarial network for stock trend forecasting Intell. Data Anal. (IF 1.7) Pub Date : 2023-08-02 Qiheng Wei, Qun Dai
Stock trend forecasting, which refers to the prediction of the rise and fall of the next day’s stock price, is a promising research field in financial time series forecasting, with a large quantity of well-performing algorithms and models being proposed. However, most of the studies focus on trend prediction for stocks with a large number of samples, while the trend prediction problem of newly listed
-
FairAW – Additive weighting without discrimination Intell. Data Anal. (IF 1.7) Pub Date : 2023-07-20 Sandro Radovanović, Andrija Petrović, Zorica Dodevska, Boris Delibašić
With growing awareness of the societal impact of decision-making, fairness has become an important issue. More specifically, in many real-world situations, decision-makers can unintentionally discriminate a certain group of individuals based on either inherited or appropriated attributes, such as gender, age, race, or religion. In this paper, we introduce a post-processing technique, called fair additive
-
A parallel and balanced SVM algorithm on spark for data-intensive computing Intell. Data Anal. (IF 1.7) Pub Date : 2023-07-20 Jianjiang Li, Jinliang Shi, Zhiguo Liu, Can Feng
Support Vector Machine (SVM) is a machine learning with excellent classification performance, which has been widely used in various fields such as data mining, text classification, face recognition and etc. However, when data volume scales to a certain level, the computational time becomes too long and the efficiency becomes low. To address this issue, we propose a parallel balanced SVM algorithm based
-
GeoNLPlify: A spatial data augmentation enhancing text classification for crisis monitoring Intell. Data Anal. (IF 1.7) Pub Date : 2023-07-06 Rémy Decoupes, Mathieu Roche, Maguelonne Teisseire
Crises such as natural disasters and public health emergencies generate vast amounts of text data, making it challenging to classify the information into relevant categories. Acquiring expert-labeled data for such scenarios can be difficult, leading to limited training datasets for text classification by fine-tuning BERT-like models. Unfortunately, traditional data augmentation techniques
-
Enhancing link prediction efficiency with shortest path and structural attributes Intell. Data Anal. (IF 1.7) Pub Date : 2023-06-29 Muhammad Wasim, Feras Al-Obeidat, Adnan Amin, Haji Gul, Fernando Moreira
Link prediction is one of the most essential and crucial tasks in complex network research since it seeks to forecast missing links in a network based on current ones. This problem has applications in a variety of scientific disciplines, including social network research, recommendation systems, and biological networks. In previous work, link prediction has been solved through different methods such
-
Asymmetric multilevel interactive attention network integrating reviews for item recommendation Intell. Data Anal. (IF 1.7) Pub Date : 2023-06-29 Peilin Yang, Wenguang Zheng, Yingyuan Xiao, Xu Jiao
Recently, most studies in the field have focused on integrating reviews behind ratings to improve recommendation performance. However, two main problems remain (1) Most works use a unified data form and the same processing method to address the user and the item reviews, regardless of their essential differences. (2) Most works only adopt simple concatenation operation when constructing user-item interaction
-
Comparative analysis of epidemic public opinion and policies in two regions of China based on big data Intell. Data Anal. (IF 1.7) Pub Date : 2023-06-28 Dong Qiu, Lin Huang
Since the outbreak of COVID-19 (Corona Virus Disease 2019), the Chinese government has taken strict measures to prevent and control the epidemic. Although the spread of the virus has been controlled, people’s daily life and work have been affected and restricted to varying degrees. Thus people have different sentiments, these may affect people’s implementation and compliance with the policies, thus
-
Feature evolvable learning with image streams Intell. Data Anal. (IF 1.7) Pub Date : 2023-06-01 Tianxiang Zheng, Xianmin Wang, Yixiang Chen, Fujia Yu, Jing Li
Feature Evolvable Stream Learning (FESL) has received extensive attentions during the past few years where old features could vanish and new features could appear when learning with streaming data. Existing FESL algorithms are mainly designed for simple datasets with low-dimension features, nevertheless they are ineffective to deal with complex streams such as image sequences. Such crux lies in two
-
A new Chinese text clustering algorithm based on WRD and improved K-means Intell. Data Anal. (IF 1.7) Pub Date : 2023-06-01 Zicai Cui, Bocheng Zhong, Chen Bai
Text clustering has been widely used in data mining, document management, search engines, and other fields. The K-means algorithm is a representative algorithm of text clustering. However, traditional K-means algorithm often uses Euclidean distance or cosine distance to measure the similarity between texts, which is not effective in face of high-dimensional data and cannot retain enough semantic information
-
Data-driven predictive maintenance framework for railway systems Intell. Data Anal. (IF 1.7) Pub Date : 2023-06-01 Jorge Meira, Bruno Veloso, Verónica Bolón-Canedo, Goreti Marreiros, Amparo Alonso-Betanzos, João Gama
The emergence of the Industry 4.0 trend brings automation and data exchange to industrial manufacturing. Using computational systems and IoT devices allows businesses to collect and deal with vast volumes of sensorial and business process data. The growing and proliferation of big data and machine learning technologies enable strategic decisions based on the analyzed data. This study suggests a data-driven
-
A discrete equilibrium optimization algorithm for breast cancer diagnosis Intell. Data Anal. (IF 1.7) Pub Date : 2023-06-01 Haouassi Hichem, Mahdaoui Rafik, Chouhal Ouahiba
Illness diagnosis is the essential step in designating a treatment. Nowadays, technological advancements in medical equipment can produce many features to describe breast cancer disease with more comprehensive and discriminant data. Based on the patient’s medical data, several data-driven models are proposed for breast cancer diagnosis using learning techniques such as naive Bayes, neural networks
-
An improved hybrid structure learning strategy for Bayesian networks based on ensemble learning Intell. Data Anal. (IF 1.7) Pub Date : 2023-06-01 Wenlong Gao, Zhimei Zeng, Xiaojie Ma, Yongsong Ke, Minqian Zhi
In the application of Bayesian networks to solve practical problems, it is likely to encounter the situation that the data set is expensive and difficult to obtain in large quantities and the small data set is easy to cause the inaccuracy of Bayesian network (BN) scoring functions, which affects the BN optimization results. Therefore, how to better learn Bayesian network structures under a small data
-
Trajectory personalization privacy preservation method based on multi-sensitivity attribute generalization and local suppression Intell. Data Anal. (IF 1.7) Pub Date : 2023-05-25 Qingying Yu, Feng Yang, Zhenxing Xiao, Shan Gong, Liping Sun, Chuanming Chen
Fast-developing mobile location-aware services generate an enormous volume of trajectory data while adding value to people’s lives. However, trajectory data contains not only location information, but also sensitive personal information. If the original trajectory data is published directly, it could result in serious privacy leaks. Most of the existing privacy-preserving trajectory publishing methods
-
Active ordinal classification by querying relative information Intell. Data Anal. (IF 1.7) Pub Date : 2023-05-25 Deniu He
Collecting and learning with auxiliary information is a way to further reduce the labeling cost of active learning. This paper studies the problem of active learning for ordinal classification by querying low-cost relative information (instance-pair relation information) through pairwise queries. Two challenges in this study that arise are how to train an ordinal classifier with absolute information
-
Safe co-training for semi-supervised regression Intell. Data Anal. (IF 1.7) Pub Date : 2023-05-25 Liyan Liu, Peng Huang, Hong Yu, Fan Min
Co-training is a popular semi-supervised learning method. The learners exchange pseudo-labels obtained from different views to reduce the accumulation of errors. One of the key issues is how to ensure the quality of pseudo-labels. However, the pseudo-labels obtained during the co-training process may be inaccurate. In this paper, we propose a safe co-training (SaCo) algorithm for regression with two