-
GraphMatcher: A Graph Representation Learning Approach for Ontology Matching arXiv.cs.AI Pub Date : 2024-04-20 Sefika Efeoglu
Ontology matching is defined as finding a relationship or correspondence between two or more entities in two or more ontologies. To solve the interoperability problem of the domain ontologies, semantically similar entities in these ontologies must be found and aligned before merging them. GraphMatcher, developed in this study, is an ontology matching system using a graph attention approach to compute
-
CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method arXiv.cs.AI Pub Date : 2024-04-23 Mingbao Lin, Zhihang Lin, Wengyi Zhan, Liujuan Cao, Rongrong Ji
Transforming large pre-trained low-resolution diffusion models to cater to higher-resolution demands, i.e., diffusion extrapolation, significantly improves diffusion adaptability. We propose tuning-free CutDiffusion, aimed at simplifying and accelerating the diffusion extrapolation process, making it more affordable and improving performance. CutDiffusion abides by the existing patch-wise extrapolation
-
A review of deep learning-based information fusion techniques for multimodal medical image classification arXiv.cs.AI Pub Date : 2024-04-23 Yihao Li, Mostafa El Habib Daho, Pierre-Henri Conze, Rachid Zeghlache, Hugo Le Boité, Ramin Tadayoni, Béatrice Cochener, Mathieu Lamard, Gwenolé Quellec
Multimodal medical imaging plays a pivotal role in clinical diagnosis and research, as it combines information from various imaging modalities to provide a more comprehensive understanding of the underlying pathology. Recently, deep learning-based multimodal fusion techniques have emerged as powerful tools for improving medical image classification. This review offers a thorough analysis of the developments
-
SGFormer: Spherical Geometry Transformer for 360 Depth Estimation arXiv.cs.AI Pub Date : 2024-04-23 Junsong Zhang, Zisong Chen, Chunyu Lin, Lang Nie, Zhijie Shen, Junda Huang, Yao Zhao
Panoramic distortion poses a significant challenge in 360 depth estimation, particularly pronounced at the north and south poles. Existing methods either adopt a bi-projection fusion strategy to remove distortions or model long-range dependencies to capture global structures, which can result in either unclear structure or insufficient local perception. In this paper, we propose a spherical geometry
-
Leveraging Speech for Gesture Detection in Multimodal Communication arXiv.cs.AI Pub Date : 2024-04-23 Esam Ghaleb, Ilya Burenko, Marlou Rasenberg, Wim Pouw, Ivan Toni, Peter Uhrig, Anna Wilson, Judith Holler, Aslı Özyürek, Raquel Fernández
Gestures are inherent to human interaction and often complement speech in face-to-face communication, forming a multimodal communication system. An important task in gesture analysis is detecting a gesture's beginning and end. Research on automatic gesture detection has primarily focused on visual and kinematic information to detect a limited set of isolated or silent gestures with low variability
-
CoProNN: Concept-based Prototypical Nearest Neighbors for Explaining Vision Models arXiv.cs.AI Pub Date : 2024-04-23 Teodor Chiaburu, Frank Haußer, Felix Bießmann
Mounting evidence in explainability for artificial intelligence (XAI) research suggests that good explanations should be tailored to individual tasks and should relate to concepts relevant to the task. However, building task specific explanations is time consuming and requires domain expertise which can be difficult to integrate into generic XAI methods. A promising approach towards designing useful
-
CNN2GNN: How to Bridge CNN with GNN arXiv.cs.AI Pub Date : 2024-04-23 Ziheng Jiao, Hongyuan Zhang, Xuelong Li
Although the convolutional neural network (CNN) has achieved excellent performance in vision tasks by extracting the intra-sample representation, it will take a higher training expense because of stacking numerous convolutional layers. Recently, as the bilinear models, graph neural networks (GNN) have succeeded in exploring the underlying topological relationship among the graph data with a few graph
-
Grounded Knowledge-Enhanced Medical VLP for Chest X-Ray arXiv.cs.AI Pub Date : 2024-04-23 Qiao Deng, Zhongzhen Huang, Yunqi Wang, Zhichuan Wang, Zhao Wang, Xiaofan Zhang, Qi Dou, Yeung Yu Hui, Edward S. Hui
Medical vision-language pre-training has emerged as a promising approach for learning domain-general representations of medical image and text. Current algorithms that exploit the global and local alignment between medical image and text could however be marred by the redundant information in medical data. To address this issue, we propose a grounded knowledge-enhanced medical vision-language pre-training
-
Cross-Task Multi-Branch Vision Transformer for Facial Expression and Mask Wearing Classification arXiv.cs.AI Pub Date : 2024-04-22 Armando Zhu, Keqin Li, Tong Wu, Peng Zhao, Wenjing Zhou, Bo Hong
With wearing masks becoming a new cultural norm, facial expression recognition (FER) while taking masks into account has become a significant challenge. In this paper, we propose a unified multi-branch vision transformer for facial expression recognition and mask wearing classification tasks. Our approach extracts shared features for both tasks using a dual-branch architecture that obtains multi-scale
-
Explaining Arguments' Strength: Unveiling the Role of Attacks and Supports (Technical Report) arXiv.cs.AI Pub Date : 2024-04-22 Xiang Yin, Nico Potyka, Francesca Toni
Quantitatively explaining the strength of arguments under gradual semantics has recently received increasing attention. Specifically, several works in the literature provide quantitative explanations by computing the attribution scores of arguments. These works disregard the importance of attacks and supports, even though they play an essential role when explaining arguments' strength. In this paper
-
Mechanistic Interpretability for AI Safety -- A Review arXiv.cs.AI Pub Date : 2024-04-22 Leonard Bereska, Efstratios Gavves
Understanding AI systems' inner workings is critical for ensuring value alignment and safety. This review explores mechanistic interpretability: reverse-engineering the computational mechanisms and representations learned by neural networks into human-understandable algorithms and concepts to provide a granular, causal understanding. We establish foundational concepts such as features encoding knowledge
-
Multi-channel Emotion Analysis for Consensus Reaching in Group Movie Recommendation Systems arXiv.cs.AI Pub Date : 2024-04-21 Adilet Yerkin, Elnara Kadyrgali, Yerdauit Torekhan, Pakizar Shamoi
Watching movies is one of the social activities typically done in groups. Emotion is the most vital factor that affects movie viewers' preferences. So, the emotional aspect of the movie needs to be determined and analyzed for further recommendations. It can be challenging to choose a movie that appeals to the emotions of a diverse group. Reaching an agreement for a group can be difficult due to the
-
On the Value of Labeled Data and Symbolic Methods for Hidden Neuron Activation Analysis arXiv.cs.AI Pub Date : 2024-04-21 Abhilekha Dalal, Rushrukh Rayan, Adrita Barua, Eugene Y. Vasserman, Md Kamruzzaman Sarker, Pascal Hitzler
A major challenge in Explainable AI is in correctly interpreting activations of hidden neurons: accurate interpretations would help answer the question of what a deep learning system internally detects as relevant in the input, demystifying the otherwise black-box nature of deep learning systems. The state of the art indicates that hidden node activations can, in some cases, be interpretable in a way
-
A Survey on the Memory Mechanism of Large Language Model based Agents arXiv.cs.AI Pub Date : 2024-04-21 Zeyu Zhang, Xiaohe Bo, Chen Ma, Rui Li, Xu Chen, Quanyu Dai, Jieming Zhu, Zhenhua Dong, Ji-Rong Wen
Large language model (LLM) based agents have recently attracted much attention from the research and industry communities. Compared with original LLMs, LLM-based agents are distinguished by their self-evolving capability, which is the basis for solving real-world problems that need long-term and complex agent-environment interactions. The key component to support agent-environment interactions is the memory
-
MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering arXiv.cs.AI Pub Date : 2024-04-19 Avinash Anand, Janak Kapuriya, Chhavi Kirtani, Apoorv Singh, Jay Saraf, Naman Lal, Jatin Kumar, Adarsh Raj Shivam, Astha Verma, Rajiv Ratn Shah, Roger Zimmermann
Recent advancements in LLMs have shown their significant potential in tasks like text summarization and generation. Yet, they often encounter difficulty while solving complex physics problems that require arithmetic calculation and a good understanding of concepts. Moreover, many physics problems include images that contain important details required to understand the problem's context. We propose
-
A Clean-graph Backdoor Attack against Graph Convolutional Networks with Poisoned Label Only arXiv.cs.AI Pub Date : 2024-04-19 Jiazhu Dai, Haoyu Sun
Graph Convolutional Networks (GCNs) have shown excellent performance in dealing with various graph tasks such as node classification, graph classification and other tasks. However, recent studies have shown that GCNs are vulnerable to a novel threat known as backdoor attacks. Moreover, all existing backdoor attacks in the graph domain require modifying the training samples to accomplish the backdoor
-
How Real Is Real? A Human Evaluation Framework for Unrestricted Adversarial Examples arXiv.cs.AI Pub Date : 2024-04-19 Dren Fazlija, Arkadij Orlov, Johanna Schrader, Monty-Maximilian Zühlke, Michael Rohs, Daniel Kudenko
With an ever-increasing reliance on machine learning (ML) models in the real world, adversarial examples threaten the safety of AI-based systems such as autonomous vehicles. In the image domain, they represent maliciously perturbed data points that look benign to humans (i.e., the image modification is not noticeable) but greatly mislead state-of-the-art ML models. Previously, researchers ensured the
-
Learning to Cut via Hierarchical Sequence/Set Model for Efficient Mixed-Integer Programming arXiv.cs.AI Pub Date : 2024-04-19 Jie Wang, Zhihai Wang, Xijun Li, Yufei Kuang, Zhihao Shi, Fangzhou Zhu, Mingxuan Yuan, Jia Zeng, Yongdong Zhang, Feng Wu
Cutting planes (cuts) play an important role in solving mixed-integer linear programs (MILPs), which formulate many important real-world applications. Cut selection heavily depends on (P1) which cuts to prefer and (P2) how many cuts to select. Although modern MILP solvers tackle (P1)-(P2) by human-designed heuristics, machine learning carries the potential to learn more effective heuristics. However
-
GluMarker: A Novel Predictive Modeling of Glycemic Control Through Digital Biomarkers arXiv.cs.AI Pub Date : 2024-04-19 Ziyi Zhou, Ming Cheng, Xingjian Diao, Yanjun Cui, Xiangling Li
The escalating prevalence of diabetes globally underscores the need for diabetes management. Recent research highlights the growing focus on digital biomarkers in diabetes management, with innovations in computational frameworks and noninvasive monitoring techniques using personalized glucose metrics. However, they predominantly focus on insulin dosing and specific glucose values, or with limited attention
-
Reinforcement Learning Approach for Integrating Compressed Contexts into Knowledge Graphs arXiv.cs.AI Pub Date : 2024-04-19 Ngoc Quach, Qi Wang, Zijun Gao, Qifeng Sun, Bo Guan, Lillian Floyd
The widespread use of knowledge graphs in various fields has brought about a challenge in effectively integrating and updating information within them. When it comes to incorporating contexts, conventional methods often rely on rules or basic machine learning models, which may not fully grasp the complexity and fluidity of context information. This research suggests an approach based on reinforcement
-
Centralized vs. Decentralized Multi-Agent Reinforcement Learning for Enhanced Control of Electric Vehicle Charging Networks arXiv.cs.AI Pub Date : 2024-04-18 Amin Shojaeighadikolaei, Zsolt Talata, Morteza Hashemi
The widespread adoption of electric vehicles (EVs) poses several challenges to power distribution networks and smart grid infrastructure due to the possibility of significantly increasing electricity demands, especially during peak hours. Furthermore, when EVs participate in demand-side management programs, charging expenses can be reduced by using optimal charging control policies that fully utilize
-
The collective use and evaluation of generative AI tools in digital humanities research: Survey-based results arXiv.cs.AI Pub Date : 2024-04-18 Meredith Dedema, Rongqian Ma
The advent of generative artificial intelligence (GenAI) technologies has revolutionized research, with significant implications for Digital Humanities (DH), a field inherently intertwined with technological progress. This article investigates how digital humanities scholars adopt, practice, as well as critically evaluate, GenAI technologies such as ChatGPT in the research process. Drawing on 76 responses
-
DF-DM: A foundational process model for multimodal data fusion in the artificial intelligence era arXiv.cs.AI Pub Date : 2024-04-18 David Restrepo, Chenwei Wu, Constanza Vásquez-Venegas, Luis Filipe Nakayama, Leo Anthony Celi, Diego M López
In the big data era, integrating diverse data modalities poses significant challenges, particularly in complex fields like healthcare. This paper introduces a new process model for multimodal Data Fusion for Data Mining, integrating embeddings and the Cross-Industry Standard Process for Data Mining with the existing Data Fusion Information Group model. Our model aims to decrease computational costs
-
A Time-Inhomogeneous Markov Model for Resource Availability under Sparse Observations arXiv.cs.AI Pub Date : 2024-04-18 Lukas Rottkamp, Matthias Schubert
Accurate spatio-temporal information about the current situation is crucial for smart city applications such as modern routing algorithms. Often, this information describes the state of stationary resources, e.g. the availability of parking bays, charging stations or the number of people waiting for a vehicle to pick them up near a given location. To exploit this kind of information, predicting future
-
An Adaptive Metaheuristic Framework for Changing Environments arXiv.cs.AI Pub Date : 2024-04-18 Bestoun S. Ahmed
The rapidly changing landscapes of modern optimization problems require algorithms that can be adapted in real-time. This paper introduces an Adaptive Metaheuristic Framework (AMF) designed for dynamic environments. It is capable of intelligently adapting to changes in the problem parameters. The AMF combines a dynamic representation of problems, a real-time sensing system, and adaptive techniques
-
AccidentBlip2: Accident Detection With Multi-View MotionBlip2 arXiv.cs.AI Pub Date : 2024-04-18 Yihua Shao, Hongyi Cai, Wenxin Long, Weiyi Lang, Zhe Wang, Haoran Wu, Yan Wang, Yang Yang, Zhen Lei
Multimodal Large Language Models (MLLMs) have shown outstanding capabilities in many areas of multimodal reasoning. Therefore, we use the reasoning ability of Multimodal Large Language Models for environment description and scene understanding in complex transportation environments. In this paper, we propose AccidentBlip2, a multimodal large language model that can predict in real time whether an accident
-
Character is Destiny: Can Large Language Models Simulate Persona-Driven Decisions in Role-Playing? arXiv.cs.AI Pub Date : 2024-04-18 Rui Xu, Xintao Wang, Jiangjie Chen, Siyu Yuan, Xinfeng Yuan, Jiaqing Liang, Zulong Chen, Xiaoqing Dong, Yanghua Xiao
Can Large Language Models substitute humans in making important decisions? Recent research has unveiled the potential of LLMs to role-play assigned personas, mimicking their knowledge and linguistic habits. However, imitative decision-making requires a more nuanced understanding of personas. In this paper, we benchmark the ability of LLMs in persona-driven decision-making. Specifically, we investigate
-
Personalized Forgetting Mechanism with Concept-Driven Knowledge Tracing arXiv.cs.AI Pub Date : 2024-04-18 Shanshan Wang, Ying Hu, Xun Yang, Zhongzhou Zhang, Keyang Wang, Xingyi Zhang
Knowledge Tracing (KT) aims to trace changes in students' knowledge states throughout their entire learning process by analyzing their historical learning data and predicting their future learning performance. Existing forgetting curve theory based knowledge tracing models only consider the general forgetting caused by time intervals, ignoring the individualization of students and the causal relationship
-
X-Light: Cross-City Traffic Signal Control Using Transformer on Transformer as Meta Multi-Agent Reinforcement Learner arXiv.cs.AI Pub Date : 2024-04-18 Haoyuan Jiang, Ziyue Li, Hua Wei, Xuantang Xiong, Jingqing Ruan, Jiaming Lu, Hangyu Mao, Rui Zhao
The effectiveness of traffic light control has been significantly improved by current reinforcement learning-based approaches via better cooperation among multiple traffic lights. However, a persisting issue remains: how to obtain a multi-agent traffic signal control algorithm with remarkable transferability across diverse cities? In this paper, we propose a Transformer on Transformer (TonT) model
-
DST-GTN: Dynamic Spatio-Temporal Graph Transformer Network for Traffic Forecasting arXiv.cs.AI Pub Date : 2024-04-18 Songtao Huang, Hongjin Song, Tianqi Jiang, Akbar Telikani, Jun Shen, Qingguo Zhou, Binbin Yong, Qiang Wu
Accurate traffic forecasting is essential for effective urban planning and congestion management. Deep learning (DL) approaches have gained colossal success in traffic forecasting but still face challenges in capturing the intricacies of traffic dynamics. In this paper, we identify and address these challenges by emphasizing that spatial features are inherently dynamic and change over time. A novel
-
Exploring the landscape of large language models: Foundations, techniques, and challenges arXiv.cs.AI Pub Date : 2024-04-18 Milad Moradi, Ke Yan, David Colwell, Matthias Samwald, Rhona Asgari
In this review paper, we delve into the realm of Large Language Models (LLMs), covering their foundational principles, diverse applications, and nuanced training processes. The article sheds light on the mechanics of in-context learning and a spectrum of fine-tuning approaches, with a special focus on methods that optimize efficiency in parameter usage. Additionally, it explores how LLMs can be more
-
From Language Models to Practical Self-Improving Computer Agents arXiv.cs.AI Pub Date : 2024-04-18 Alex Sheng
We develop a simple and straightforward methodology to create AI computer agents that can carry out diverse computer tasks and self-improve by developing tools and augmentations to enable themselves to solve increasingly complex tasks. As large language models (LLMs) have been shown to benefit from non-parametric augmentations, a significant body of recent work has focused on developing software that
-
Toward Short-Term Glucose Prediction Solely Based on CGM Time Series arXiv.cs.AI Pub Date : 2024-04-18 Ming Cheng, Xingjian Diao, Ziyi Zhou, Yanjun Cui, Wenjun Liu, Shitong Cheng
The global diabetes epidemic highlights the importance of maintaining good glycemic control. Glucose prediction is a fundamental aspect of diabetes management, facilitating real-time decision-making. Recent research has introduced models focusing on long-term glucose trend prediction, which are unsuitable for real-time decision-making and result in delayed responses. Conversely, models designed to
-
Sampling-based Pareto Optimization for Chance-constrained Monotone Submodular Problems arXiv.cs.AI Pub Date : 2024-04-18 Xiankun Yan, Aneta Neumann, Frank Neumann
Recently, surrogate functions based on tail inequalities were developed to evaluate the chance constraints in the context of evolutionary computation, and several Pareto optimization algorithms using these surrogates were successfully applied in optimizing chance-constrained monotone submodular problems. However, the difference in performance between algorithms using the surrogates and those employing
-
Enhancing Financial Inclusion and Regulatory Challenges: A Critical Analysis of Digital Banks and Alternative Lenders Through Digital Platforms, Machine Learning, and Large Language Models Integration arXiv.cs.AI Pub Date : 2024-04-18 Luke Lee
This paper explores the dual impact of digital banks and alternative lenders on financial inclusion and the regulatory challenges posed by their business models. It discusses the integration of digital platforms, machine learning (ML), and Large Language Models (LLMs) in enhancing financial services accessibility for underserved populations. Through a detailed analysis of operational frameworks and
-
Concept Induction using LLMs: a user experiment for assessment arXiv.cs.AI Pub Date : 2024-04-18 Adrita Barua, Cara Widmer, Pascal Hitzler
Explainable Artificial Intelligence (XAI) poses a significant challenge in providing transparent and understandable insights into complex AI models. Traditional post-hoc algorithms, while useful, often struggle to deliver interpretable explanations. Concept-based models offer a promising avenue by incorporating explicit representations of concepts to enhance interpretability. However, existing research
-
SGRU: A High-Performance Structured Gated Recurrent Unit for Traffic Flow Prediction arXiv.cs.AI Pub Date : 2024-04-18 Wenfeng Zhang, Xin Li, Anqi Li, Xiaoting Huang, Ti Wang, Honglei Gao
Traffic flow prediction is an essential task in constructing smart cities and is a typical Multivariate Time Series (MTS) Problem. Recent research has abandoned Gated Recurrent Units (GRU) and utilized dilated convolutions or temporal slicing for feature extraction, and they have the following drawbacks: (1) Dilated convolutions fail to capture the features of adjacent time steps, resulting in the
-
CAUS: A Dataset for Question Generation based on Human Cognition Leveraging Large Language Models arXiv.cs.AI Pub Date : 2024-04-18 Minjung Shin, Donghyun Kim, Jeh-Kwang Ryu
We introduce the CAUS (Curious About Uncertain Scene) dataset, designed to enable Large Language Models, specifically GPT-4, to emulate human cognitive processes for resolving uncertainties. Leveraging this dataset, we investigate the potential of LLMs to engage in questioning effectively. Our approach involves providing scene descriptions embedded with uncertainties to stimulate the generation of
-
Planning with Language Models Through The Lens of Efficiency arXiv.cs.AI Pub Date : 2024-04-18 Michael Katz, Harsha Kokel, Kavitha Srinivas, Shirin Sohrabi
We analyse the cost of using LLMs for planning and highlight that recent trends are profoundly uneconomical. We propose a significantly more efficient approach and argue for a responsible use of compute resources, urging the research community to investigate LLM-based approaches that uphold efficiency.
-
Enhancing Q&A with Domain-Specific Fine-Tuning and Iterative Reasoning: A Comparative Study arXiv.cs.AI Pub Date : 2024-04-17 Zooey Nguyen, Anthony Annunziata, Vinh Luong, Sang Dinh, Quynh Le, Anh Hai Ha, Chanh Le, Hong An Phan, Shruti Raghavan, Christopher Nguyen
This paper investigates the impact of domain-specific model fine-tuning and of reasoning mechanisms on the performance of question-answering (Q&A) systems powered by large language models (LLMs) and Retrieval-Augmented Generation (RAG). Using the FinanceBench SEC financial filings dataset, we observe that, for RAG, combining a fine-tuned embedding model with a fine-tuned LLM achieves better accuracy
-
Meta-Decomposition: Dynamic Segmentation Approach Selection in IoT-based Activity Recognition arXiv.cs.AI Pub Date : 2024-04-17 Seyed M. R. Modaresi, Aomar Osmani, Mohammadreza Razzazi, Abdelghani Chibani
Internet of Things (IoT) devices generate heterogeneous data over time, and relying solely on individual data points is inadequate for accurate analysis. Segmentation is a common preprocessing step in many IoT applications, including IoT-based activity recognition, aiming to address the limitations of individual events and streamline the process. However, this step introduces at least two families
-
GEOBIND: Binding Text, Image, and Audio through Satellite Images arXiv.cs.AI Pub Date : 2024-04-17 Aayush Dhakal, Subash Khanal, Srikumar Sastry, Adeel Ahmad, Nathan Jacobs
In remote sensing, we are interested in modeling various modalities for some geographic location. Several works have focused on learning the relationship between a location and type of landscape, habitability, audio, textual descriptions, etc. Recently, a common way to approach these problems is to train a deep-learning model that uses satellite images to infer some unique characteristics of the location
-
A Survey on Semantic Modeling for Building Energy Management arXiv.cs.AI Pub Date : 2024-04-17 Miracle Aniakor, Vinicius V. Cogo, Pedro M. Ferreira
Buildings account for a substantial portion of global energy consumption. Reducing buildings' energy usage primarily involves obtaining data from building systems and environment, which are instrumental in assessing and optimizing the building's performance. However, as devices from various manufacturers represent their data in unique ways, this disparity introduces challenges for semantic interoperability
-
Implementation and Evaluation of a Gradient Descent-Trained Defensible Blackboard Architecture System arXiv.cs.AI Pub Date : 2024-04-17 Jordan Milbrath, Jonathan Rivard, Jeremy Straub
A variety of forms of artificial intelligence systems have been developed. Two well-known techniques are neural networks and rule-fact expert systems. The former can be trained from presented data while the latter is typically developed by human domain experts. A combined implementation that uses gradient descent to train a rule-fact expert system has been previously proposed. A related system type
-
Pretraining Billion-scale Geospatial Foundational Models on Frontier arXiv.cs.AI Pub Date : 2024-04-17 Aristeidis Tsaris, Philipe Ambrozio Dias, Abhishek Potnis, Junqi Yin, Feiyi Wang, Dalton Lunga
As AI workloads increase in scope, generalization capability becomes challenging for small task-specific models and their demand for large amounts of labeled training samples increases. On the contrary, Foundation Models (FMs) are trained with internet-scale unlabeled data via self-supervised learning and have been shown to adapt to various tasks with minimal fine-tuning. Although large FMs have demonstrated
-
Cross-Problem Learning for Solving Vehicle Routing Problems arXiv.cs.AI Pub Date : 2024-04-17 Zhuoyi Lin, Yaoxin Wu, Bangjian Zhou, Zhiguang Cao, Wen Song, Yingqian Zhang, Senthilnath Jayavelu
Existing neural heuristics often train a deep architecture from scratch for each specific vehicle routing problem (VRP), ignoring the transferable knowledge across different VRP variants. This paper proposes the cross-problem learning to assist heuristics training for different downstream VRP variants. Particularly, we modularize neural architectures for complex VRPs into 1) the backbone Transformer
-
Spatial Context-based Self-Supervised Learning for Handwritten Text Recognition arXiv.cs.AI Pub Date : 2024-04-17 Carlos Penarrubia, Carlos Garrido-Munoz, Jose J. Valero-Mas, Jorge Calvo-Zaragoza
Handwritten Text Recognition (HTR) is a relevant problem in computer vision, and implies unique challenges owing to its inherent variability and the rich contextualization required for its interpretation. Despite the success of Self-Supervised Learning (SSL) in computer vision, its application to HTR has been rather scattered, leaving key SSL methodologies unexplored. This work focuses on one of them
-
Learn to Tour: Operator Design For Solution Feasibility Mapping in Pickup-and-delivery Traveling Salesman Problem arXiv.cs.AI Pub Date : 2024-04-17 Bowen Fang, Xu Chen, Xuan Di
This paper aims to develop a learning method for a special class of traveling salesman problems (TSP), namely, the pickup-and-delivery TSP (PDTSP), which finds the shortest tour along a sequence of one-to-one pickup-and-delivery nodes. One-to-one here means that the transported people or goods are associated with designated pairs of pickup and delivery nodes, in contrast to that indistinguishable goods
-
Prediction of Unmanned Surface Vessel Motion Attitude Based on CEEMDAN-PSO-SVM arXiv.cs.AI Pub Date : 2024-04-17 Zhuoya Geng, Jianmei Chen, Wanqiang Zhu
Unmanned boats, while navigating at sea, utilize active compensation systems to mitigate wave disturbances experienced by onboard instruments and equipment. However, the measurement of unmanned boat attitudes lags, so motion attitude prediction is introduced to compensate for the lag in the signal acquisition process. This paper, based on the basic principles of waves
-
Instantiations and Computational Aspects of Non-Flat Assumption-based Argumentation arXiv.cs.AI Pub Date : 2024-04-17 Tuomo Lehtonen, Anna Rapberger, Francesca Toni, Markus Ulbricht, Johannes P. Wallner
Most existing computational tools for assumption-based argumentation (ABA) focus on so-called flat frameworks, disregarding the more general case. In this paper, we study an instantiation-based approach for reasoning in possibly non-flat ABA. We make use of a semantics-preserving translation between ABA and bipolar argumentation frameworks (BAFs). By utilizing compilability theory, we establish that
-
DUPE: Detection Undermining via Prompt Engineering for Deepfake Text arXiv.cs.AI Pub Date : 2024-04-17 James Weichert, Chinecherem Dimobi
As large language models (LLMs) become increasingly commonplace, concern about distinguishing between human and AI text increases as well. The growing power of these models is of particular concern to teachers, who may worry that students will use LLMs to write school assignments. Facing a technology with which they are unfamiliar, teachers may turn to publicly-available AI text detectors. Yet the
-
How to Exhibit More Predictable Behaviors arXiv.cs.AI Pub Date : 2024-04-17 Salomé Lepers, Sophie Lemonnier, Vincent Thomas, Olivier Buffet
This paper looks at predictability problems, i.e., wherein an agent must choose its strategy in order to optimize the predictions that an external observer could make. We address these problems while taking into account uncertainties on the environment dynamics and on the observed agent's policy. To that end, we assume that the observer 1. seeks to predict the agent's future action or state at each
-
Inductive Cognitive Diagnosis for Fast Student Learning in Web-Based Online Intelligent Education Systems arXiv.cs.AI Pub Date : 2024-04-17 Shuo Liu, Junhao Shen, Hong Qian, Aimin Zhou
Cognitive diagnosis aims to gauge students' mastery levels based on their response logs. Serving as a pivotal module in web-based online intelligent education systems (WOIESs), it plays an upstream and fundamental role in downstream tasks like learning item recommendation and computerized adaptive testing. WOIESs are open learning environments where numerous new students constantly register and complete
-
CAGE: Causality-Aware Shapley Value for Global Explanations arXiv.cs.AI Pub Date : 2024-04-17 Nils Ole Breuer, Andreas Sauter, Majid Mohammadi, Erman Acar
As Artificial Intelligence (AI) is having more influence on our everyday lives, it becomes important that AI-based decisions are transparent and explainable. As a consequence, the field of eXplainable AI (or XAI) has become popular in recent years. One way to explain AI models is to elucidate the predictive importance of the input features for the AI model in general, also referred to as global explanations
-
Low-Cost Language Models: Survey and Performance Evaluation on Python Code Generation arXiv.cs.AI Pub Date : 2024-04-17 Jessica López Espejel, Mahaman Sanoussi Yahaya Alassan, Merieme Bouhandi, Walid Dahhane, El Hassane Ettifouri
Large Language Models (LLMs) have become the go-to solution for many Natural Language Processing (NLP) tasks due to their ability to tackle various problems and produce high-quality results. Specifically, they are increasingly used to automatically generate code, easing the burden on developers by handling repetitive tasks. However, this improvement in quality has led to high computational and memory
-
Small Language Models are Good Too: An Empirical Study of Zero-Shot Classification arXiv.cs.AI Pub Date : 2024-04-17 Pierre Lepagnol (LISN), Thomas Gerald (LISN), Sahar Ghannay (LISN), Christophe Servan (STL, ILES), Sophie Rosset (LISN)
This study is part of the debate on the efficiency of large versus small language models for text classification by prompting. We assess the performance of small language models in zero-shot text classification, challenging the prevailing dominance of large models. Across 15 datasets, our investigation benchmarks language models from 77M to 40B parameters using different architectures and scoring functions
-
Empowering Large Language Models on Robotic Manipulation with Affordance Prompting arXiv.cs.AI Pub Date : 2024-04-17 Guangran Cheng, Chuheng Zhang, Wenzhe Cai, Li Zhao, Changyin Sun, Jiang Bian
While large language models (LLMs) are successful in completing various language processing tasks, they easily fail to interact with the physical world by generating control sequences properly. We find that the main reason is that LLMs are not grounded in the physical world. Existing LLM-based approaches circumvent this problem by relying on additional pre-defined skills or pre-trained sub-policies
-
Causal Effect Estimation Using Random Hyperplane Tessellations arXiv.cs.AI Pub Date : 2024-04-16 Abhishek Dalvi, Neil Ashtekar, Vasant Honavar
Matching is one of the simplest approaches for estimating causal effects from observational data. Matching techniques compare the observed outcomes across pairs of individuals with similar covariate values but different treatment statuses in order to estimate causal effects. However, traditional matching techniques are unreliable given high-dimensional covariates due to the infamous curse of dimensionality
-
CrossGP: Cross-Day Glucose Prediction Excluding Physiological Information arXiv.cs.AI Pub Date : 2024-04-16 Ziyi Zhou, Ming Cheng, Yanjun Cui, Xingjian Diao, Zhaorui Ma
The increasing number of diabetic patients is a serious issue in society today, which has significant negative impacts on people's health and the country's financial expenditures. Because diabetes may develop into potential serious complications, early glucose prediction for diabetic patients is necessary for timely medical treatment. Existing glucose prediction methods typically utilize patients'
-
Cognitive-Motor Integration in Assessing Bimanual Motor Skills arXiv.cs.AI Pub Date : 2024-04-16 Erim Yanik, Xavier Intes, Suvranu De
Accurate assessment of bimanual motor skills is essential across various professions, yet, traditional methods often rely on subjective assessments or focus solely on motor actions, overlooking the integral role of cognitive processes. This study introduces a novel approach by leveraging deep neural networks (DNNs) to analyze and integrate both cognitive decision-making and motor execution. We tested