-
EASE: An Effort-aware Extension of Unsupervised Key Class Identification Approaches ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-21 Weifeng Pan, Marouane Kessentini, Hua Ming, Zijiang Yang
Key class identification approaches aim at identifying the most important classes to help developers, especially newcomers, start the software comprehension process. So far, many supervised and unsupervised approaches have been proposed; however, they have not considered the effort to comprehend classes. In this article, we identify the challenge of “effort-aware key class identification”; to partially
-
RE Methods for Virtual Reality Software Product Development: A Mapping Study ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-20 Sai Anirudh Karre, Y. Raghu Reddy, Raghav Mittal
Software practitioners use various methods in Requirements Engineering (RE) to elicit, analyze, and specify the requirements of enterprise products. The methods impact the final product characteristics and influence product delivery. Ad-hoc usage of the methods by software practitioners can lead to inconsistency and ambiguity in the product. With the notable rise in enterprise products, games, and
-
Exploring Semantic Redundancy using Backdoor Triggers: A Complementary Insight into the Challenges Facing DNN-based Software Vulnerability Detection ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-20 Changjie Shao, Gaolei Li, Jun Wu, Xi Zheng
To detect software vulnerabilities with better performance, deep neural networks (DNNs) have received extensive attention recently. However, these vulnerability detection DNN models trained with code representations are vulnerable to specific perturbations on code representations. This motivates us to rethink the bane of software vulnerability detection and find function-agnostic features during code
-
Battling against Protocol Fuzzing: Protecting Networked Embedded Devices from Dynamic Fuzzers ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-20 Puzhuo Liu, Yaowen Zheng, Chengnian Sun, Hong Li, Zhi Li, Limin Sun
Networked Embedded Devices (NEDs) are increasingly targeted by cyberattacks, mainly due to their widespread use in our daily lives. Vulnerabilities in NEDs are the root causes of these cyberattacks. Although deployed NEDs go through thorough code audits, there can still be considerable exploitable vulnerabilities. Existing mitigation measures like code encryption and obfuscation adopted by vendors
-
Enablers and Barriers of Empathy in Software Developer and User Interactions: A Mixed Methods Case Study ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-20 Hashini Gunatilake, John Grundy, Rashina Hoda, Ingo Mueller
Software engineering (SE) requires developers to collaborate with stakeholders, and understanding their emotions and perspectives is often vital. Empathy is a concept characterising a person’s ability to understand and share the feelings of another. However, empathy continues to be an under-researched human aspect in SE. We studied how empathy is practised between developers and end users using a mixed
-
Understanding Real-Time Collaborative Programming: A Study of Visual Studio Live Share ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-20 Xin Tan, Xinyue Lv, Jing Jiang, Li Zhang
Real-time collaborative programming (RCP) entails developers working simultaneously, regardless of their geographic locations. RCP differs from traditional asynchronous online programming methods, such as Git or SVN, where developers work independently and update the codebase at separate times. Although various real-time code collaboration tools (e.g., Visual Studio Live Share, Code with Me, and Replit)
-
Test Optimization in DNN Testing: A Survey ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-20 Qiang Hu, Yuejun Guo, Xiaofei Xie, Maxime Cordy, Lei Ma, Mike Papadakis, Yves Le Traon
This article presents a comprehensive survey on test optimization in deep neural network (DNN) testing. Here, test optimization refers to testing with low data labeling effort. We analyzed 90 papers, including 43 from the software engineering (SE) community, 32 from the machine learning (ML) community, and 15 from other communities. Our study: (i) unifies the problems as well as terminologies associated
-
Identifying and Explaining Safety-critical Scenarios for Autonomous Vehicles via Key Features ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Neelofar Neelofar, Aldeida Aleti
Ensuring the safety of autonomous vehicles (AVs) is of utmost importance, and testing them in simulated environments is a safer option than conducting in-field operational tests. However, generating an exhaustive test suite to identify critical test scenarios is computationally expensive, as the representation of each test is complex and contains various dynamic and static features, such as the AV
-
Rigorous Assessment of Model Inference Accuracy using Language Cardinality ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Donato Clun, Donghwan Shin, Antonio Filieri, Domenico Bianculli
Models such as finite state automata are widely used to abstract the behavior of software systems by capturing the sequences of events observable during their execution. Nevertheless, models rarely exist in practice and, when they do, get easily outdated; moreover, manually building and maintaining models is costly and error-prone. As a result, a variety of model inference methods that automatically
-
ARCTURUS: Full Coverage Binary Similarity Analysis with Reachability-guided Emulation ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Anshunkang Zhou, Yikun Hu, Xiangzhe Xu, Charles Zhang
Binary code similarity analysis is extremely useful, since it provides rich information about an unknown binary, such as revealing its functionality and identifying reused libraries. Robust binary similarity analysis is challenging, as heavy compiler optimizations can make semantically similar binaries have gigantic syntactic differences. Unfortunately, existing semantic-based methods still suffer
-
Characterizing Deep Learning Package Supply Chains in PyPI: Domains, Clusters, and Disengagement ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Kai Gao, Runzhi He, Bing Xie, Minghui Zhou
Deep learning (DL) frameworks have become the cornerstone of the rapidly developing DL field. Through installation dependencies specified in the distribution metadata, numerous packages directly or transitively depend on DL frameworks, layer after layer, forming DL package supply chains (SCs), which are critical for DL frameworks to remain competitive. However, vital knowledge on how to nurture and
-
Method-level Bug Prediction: Problems and Promises ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Shaiful Chowdhury, Gias Uddin, Hadi Hemmati, Reid Holmes
Fixing software bugs can be colossally expensive, especially if they are discovered in the later phases of the software development life cycle. As such, bug prediction has been a classic problem for the research community. As of now, the Google Scholar site generates ∼113,000 hits if searched with the “bug prediction” phrase. Despite this staggering effort by the research community, bug prediction
-
Industry Practices for Challenging Autonomous Driving Systems with Critical Scenarios ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Qunying Song, Emelie Engström, Per Runeson
Testing autonomous driving systems for safety and reliability is essential, yet complex. A primary challenge is identifying relevant test scenarios, especially the critical ones that may expose hazards or harm to autonomous vehicles and other road users. Although numerous approaches and tools for critical scenario identification are proposed, the industry practices for selection, implementation, and
-
Compiler Autotuning through Multiple-phase Learning ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Mingxuan Zhu, Dan Hao, Junjie Chen
Widely used compilers like GCC and LLVM usually have hundreds of optimizations controlled by optimization flags, which are enabled or disabled during compilation to improve the runtime performance (e.g., shorter execution time) of the compiled program. Due to the large number of optimization flags and their combinations, it is difficult for compiler users to manually tune compiler optimization flags.
-
Bug Analysis in Jupyter Notebook Projects: An Empirical Study ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Taijara Loiola De Santana, Paulo Anselmo Da Mota Silveira Neto, Eduardo Santana De Almeida, Iftekhar Ahmed
Computational notebooks, such as Jupyter, have been widely adopted by data scientists to write code for analyzing and visualizing data. Despite their growing adoption and popularity, few studies have been found to understand Jupyter development challenges from the practitioners’ point of view. This article presents a systematic study of bugs and challenges that Jupyter practitioners face through a
-
Mapping APIs in Dynamic-typed Programs by Leveraging Transfer Learning ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Zhenfei Huang, Junjie Chen, Jiajun Jiang, Yihua Liang, Hanmo You, Fengjie Li
Application Programming Interface (API) migration is a common task for adapting software across different programming languages and platforms, where manually constructing the mapping relations between APIs is indeed time-consuming and error-prone. To facilitate this process, many automated API mapping approaches have been proposed. However, existing approaches were mainly designed and evaluated for
-
Deceiving Humans and Machines Alike: Search-based Test Input Generation for DNNs Using Variational Autoencoders ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Sungmin Kang, Robert Feldt, Shin Yoo
Due to the rapid adoption of Deep Neural Networks (DNNs) into larger software systems, testing of DNN-based systems has received much attention recently. While many different test adequacy criteria have been suggested, we lack effective test input generation techniques. Inputs such as images of real-world objects and scenes are not only expensive to collect but also difficult to randomly sample. Consequently
-
Beyond Accuracy: An Empirical Study on Unit Testing in Open-source Deep Learning Projects ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Han Wang, Sijia Yu, Chunyang Chen, Burak Turhan, Xiaodong Zhu
Deep Learning (DL) models have rapidly advanced, focusing on achieving high performance through testing model accuracy and robustness. However, it is unclear whether DL projects, as software systems, are tested thoroughly or functionally correct when there is a need to treat and test them like other software systems. Therefore, we empirically study the unit tests in open-source DL projects, analyzing
-
Estimating Uncertainty in Labeled Changes by SZZ Tools on Just-In-Time Defect Prediction ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Shikai Guo, Dongmin Li, Lin Huang, Sijia Lv, Rong Chen, Hui Li, Xiaochen Li, He Jiang
The aim of Just-In-Time (JIT) defect prediction is to predict software changes that are prone to defects in a project in a timely manner, thereby improving the efficiency of software development and ensuring software quality. Identifying changes that introduce bugs is a critical task in just-in-time defect prediction, and researchers have introduced the SZZ approach and its variants to label these
-
Smart Contract Code Repair Recommendation based on Reinforcement Learning and Multi-metric Optimization ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Hanyang Guo, Yingye Chen, Xiangping Chen, Yuan Huang, Zibin Zheng
A smart contract is a kind of code deployed on the blockchain that executes automatically once an event triggers a clause in the contract. Since smart contracts involve businesses such as asset transfer, they are more vulnerable to attacks, so it is crucial to ensure the security of smart contracts. Because a smart contract cannot be tampered with once deployed on the blockchain, for smart contract
-
Mitigating Debugger-based Attacks to Java Applications with Self-debugging ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-18 Davide Pizzolotto, Stefano Berlato, Mariano Ceccato
Java bytecode is quite a high-level language and, as such, it is fairly easy to analyze and decompile with malicious intent, e.g., to tamper with code and skip license checks. Code obfuscation was a first attempt to mitigate malicious reverse-engineering based on static analysis. However, obfuscated code can still be dynamically analyzed with standard debuggers to perform step-wise execution and to
-
PACE: A Program Analysis Framework for Continuous Performance Prediction ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-17 Chidera Biringa, Gökhan Kul
Software development teams establish elaborate continuous integration pipelines containing automated test cases to accelerate the software development process. Automated tests help to verify the correctness of code modifications, decreasing the response time to changing requirements. However, when the software teams do not track the performance impact of pending modifications, they may need to spend
-
Assessing Effectiveness of Test Suites: What Do We Know and What Should We Do? ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-17 Peng Zhang, Yang Wang, Xutong Liu, Zeyu Lu, Yibiao Yang, Yanhui Li, Lin Chen, Ziyuan Wang, Chang-Ai Sun, Xiao Yu, Yuming Zhou
Background. Software testing is a critical activity for ensuring the quality and reliability of software systems. To evaluate the effectiveness of different test suites, researchers have developed a variety of metrics. Problem. However, comparing these metrics is challenging due to the lack of a standardized evaluation framework including comprehensive factors. As a result, researchers often focus
-
Using Voice and Biofeedback to Predict User Engagement during Product Feedback Interviews ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-17 Alessio Ferrari, Thaide Huichapa, Paola Spoletini, Nicole Novielli, Davide Fucci, Daniela Girardi
Capturing users’ engagement is crucial for gathering feedback about the features of a software product. In a market-driven context, current approaches to collecting and analyzing users’ feedback are based on techniques leveraging information extracted from product reviews and social media. These approaches are hardly applicable in contexts where online feedback is limited, as for the majority of apps
-
A Smart Status Based Monitoring Algorithm for the Dynamic Analysis of Memory Safety ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-17 Zhe Chen, Rui Yan, Yingzi Ma, Yulei Sui, Jingling Xue
C is a dominant programming language for implementing system and low-level embedded software. Unfortunately, the unsafe nature of its low-level control of memory often leads to memory errors. Dynamic analysis has been widely used to detect memory errors at runtime. However, existing monitoring algorithms for dynamic analysis are not yet satisfactory, as they cannot deterministically and completely
-
Measuring and Clustering Heterogeneous Chatbot Designs ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-17 Pablo C. Cañizares, Jose María López-Morales, Sara Pérez-Soler, Esther Guerra, Juan de Lara
Conversational agents, or chatbots, have become a popular way to access all kinds of software services. They provide an intuitive natural language interface for interaction, available from a wide range of channels including social networks, web pages, intelligent speakers, or cars. In response to this demand, many chatbot development platforms and tools have emerged. However, they typically lack support to
-
Building Domain-Specific Machine Learning Workflows: A Conceptual Framework for the State of the Practice ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-17 Bentley James Oakes, Michalis Famelis, Houari Sahraoui
Domain experts are increasingly employing machine learning to solve their domain-specific problems. This article presents to software engineering researchers the six key challenges that a domain expert faces in addressing their problem with a computational workflow, and the underlying executable implementation. These challenges arise out of our conceptual framework which presents the “route” of transformations
-
Test Generation Strategies for Building Failure Models and Explaining Spurious Failures ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-17 Baharin A. Jodat, Abhishek Chandar, Shiva Nejati, Mehrdad Sabetzadeh
Test inputs fail not only when the system under test is faulty but also when the inputs are invalid or unrealistic. Failures resulting from invalid or unrealistic test inputs are spurious. Avoiding spurious failures improves the effectiveness of testing in exercising the main functions of a system, particularly for compute-intensive (CI) systems where a single test execution takes significant time
-
Replication in Requirements Engineering: the NLP for RE Case ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-15 Sallam Abualhaija, Fatma Başak Aydemir, Fabiano Dalpiaz, Davide Dell’Anna, Alessio Ferrari, Xavier Franch, Davide Fucci
[Context] Natural language processing (NLP) techniques have been widely applied in the requirements engineering (RE) field to support tasks such as classification and ambiguity detection. Despite its empirical vocation, RE research has given limited attention to replication of NLP for RE studies. Replication is hampered by several factors, including the context specificity of the studies, the heterogeneity
-
BatFix: Repairing language model-based transpilation ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-12 Daniel Ramos, Inês Lynce, Vasco Manquinho, Ruben Martins, Claire Le Goues
To keep up with changes in requirements, frameworks, and coding practices, software organizations might need to migrate code from one language to another. Source-to-source migration, or transpilation, is often a complex, manual process. Transpilation requires expertise in both the source and target languages, making it highly laborious and costly. Language models for code generation and transpilation
-
MR-Scout: Automated Synthesis of Metamorphic Relations from Existing Test Cases ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-09 Congying Xu, Valerio Terragni, Hengcheng Zhu, Jiarong Wu, Shing-Chi Cheung
Metamorphic Testing (MT) alleviates the oracle problem by defining oracles based on metamorphic relations (MRs) that govern multiple related inputs and their outputs. However, designing MRs is challenging, as it requires domain-specific knowledge. This hinders the widespread adoption of MT. We observe that developer-written test cases can embed domain knowledge that encodes MRs. Such encoded MRs could
-
A Survey of Source Code Search: A 3-Dimensional Perspective ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-06 Weisong Sun, Chunrong Fang, Yifei Ge, Yuling Hu, Yuchen Chen, Quanjun Zhang, Xiuting Ge, Yang Liu, Zhenyu Chen
(Source) code search has drawn wide attention from software engineering researchers because it can improve the productivity and quality of software development. Given a functionality requirement, usually described in a natural language sentence, a code search system can retrieve code snippets that satisfy the requirement from a large-scale code corpus, e.g., GitHub. To realize effective and efficient code
-
Help Them Understand: Testing and Improving Voice User Interfaces ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-05 Emanuela Guglielmi, Giovanni Rosa, Simone Scalabrino, Gabriele Bavota, Rocco Oliveto
Voice-based virtual assistants are becoming increasingly popular. Such systems provide frameworks to developers for building custom apps. End-users can interact with such apps through a Voice User Interface (VUI), which allows the user to use natural language commands to perform actions. Testing such apps is not trivial: The same command can be expressed in different semantically equivalent ways. In
-
Testing Multi-Subroutine Quantum Programs: From Unit Testing to Integration Testing ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-05 Peixun Long, Jianjun Zhao
Quantum computing has emerged as a promising field with the potential to revolutionize various domains by harnessing the principles of quantum mechanics. As quantum hardware and algorithms continue to advance, developing high-quality quantum software has become crucial. However, testing quantum programs poses unique challenges due to the distinctive characteristics of quantum systems and the complexity
-
On the Impact of Lower Recall and Precision in Defect Prediction for Guiding Search-Based Software Testing ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-04-04 Anjana Perera, Burak Turhan, Aldeida Aleti, Marcel Böhme
Defect predictors, static bug detectors and humans inspecting the code can propose locations in the program that are more likely to be buggy before they are discovered through testing. Automated test generators such as search-based software testing (SBST) techniques can use this information to direct their search for test cases to likely-buggy code, thus speeding up the process of detecting existing
-
The IDEA of Us: An Identity-Aware Architecture for Autonomous Systems ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-28 Carlos Gavidia-Calderon, Anastasia Kordoni, Amel Bennaceur, Mark Levine, Bashar Nuseibeh
Autonomous systems, such as drones and rescue robots, are increasingly used during emergencies. They deliver services and provide situational awareness that facilitate emergency management and response. To do so, they need to interact and cooperate with humans in their environment. Human behaviour is uncertain and complex, so it can be difficult to reason about it formally. In this paper, we propose
-
Octopus: Scaling Value-Flow Analysis via Parallel Collection of Realizable Path Conditions ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-29 Wensheng Tang, Dejun Dong, Shijie Li, Chengpeng Wang, Peisen Yao, Jinguo Zhou, Charles Zhang
Value-flow analysis is a fundamental technique in program analysis, benefiting various clients, such as memory corruption detection and taint analysis. However, existing efforts suffer from a low potential speedup, which limits their scalability. In this work, we present a parallel algorithm, Octopus, to collect path conditions for realizable paths efficiently. Octopus builds on the realizability
-
Navigating the Complexity of Generative AI Adoption in Software Engineering ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-28 Daniel Russo
This paper explores the adoption of Generative Artificial Intelligence (AI) tools within the domain of software engineering, focusing on the influencing factors at the individual, technological, and social levels. We applied a convergent mixed-methods approach to offer a comprehensive understanding of AI adoption dynamics. We initially conducted a questionnaire survey with 100 software engineers, drawing
-
MTL-TRANSFER: Leveraging Multi-task Learning and Transferred Knowledge for Improving Fault Localization and Program Repair ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-27 Xu Wang, Hongwei Yu, Xiangxin Meng, Hongliang Cao, Hongyu Zhang, Hailong Sun, Xudong Liu, Chunming Hu
Fault localization (FL) and automated program repair (APR) are two main tasks of automatic software debugging. Compared with traditional methods, deep learning-based approaches have been demonstrated to achieve better performance in FL and APR tasks. However, the existing deep learning-based FL methods ignore the deep semantic features or only consider simple code representations. And for APR tasks
-
Early and Realistic Exploitability Prediction of Just-Disclosed Software Vulnerabilities: How Reliable Can It Be? ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-27 Emanuele Iannone, Giulia Sellitto, Emanuele Iaccarino, Filomena Ferrucci, Andrea De Lucia, Fabio Palomba
With the rate of discovered and disclosed vulnerabilities escalating, researchers have been experimenting with machine learning to predict whether a vulnerability will be exploited. Existing solutions leverage information unavailable when a CVE is created, making them unsuitable just after the disclosure. This paper experiments with early exploitability prediction models driven exclusively by the initial
-
On Estimating the Feasible Solution Space of Multi-Objective Testing Resource Allocation ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-26 Guofu Zhang, Lei Li, Zhaopin Su, Feng Yue, Yang Chen, Miqing Li, Xin Yao
The multi-objective testing resource allocation problem (MOTRAP) is concerned with how to reasonably plan the testing time of software testers to reduce cost and improve reliability as much as possible. The feasible solution space of a MOTRAP is determined by its variables (i.e., the time invested in each component) and constraints (e.g., the pre-specified reliability, cost, or time). Although
-
DiPri: Distance-based Seed Prioritization for Greybox Fuzzing ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-26 Ruixiang Qian, Quanjun Zhang, Chunrong Fang, Ding Yang, Shun Li, Binyu Li, Zhenyu Chen
Greybox fuzzing is a powerful testing technique. Given a set of initial seeds, greybox fuzzing continuously generates new test inputs to execute the program under test and drives executions with code coverage as feedback. Seed prioritization is an important step of greybox fuzzing that helps greybox fuzzing choose promising seeds for input generation in priority. However, mainstream greybox fuzzers
-
On the Way to SBOMs: Investigating Design Issues and Solutions in Practice ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-26 Tingting Bi, Boming Xia, Zhenchang Xing, Qinghua Lu, Liming Zhu
The increase of software supply chain threats has underscored the necessity for robust security mechanisms, among which the Software Bill of Materials (SBOM) stands out as a promising solution. SBOMs, by providing a machine-readable inventory of software composition details, play a crucial role in enhancing transparency and traceability within software supply chains. This empirical study delves into
-
Deep Is Better? An Empirical Comparison of Information Retrieval and Deep Learning Approaches to Code Summarization ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Tingwei Zhu, Zhong Li, Minxue Pan, Chaoxuan Shi, Tian Zhang, Yu Pei, Xuandong Li
Code summarization aims to generate short functional descriptions for source code to facilitate code comprehension. While Information Retrieval (IR) approaches that leverage similar code snippets and corresponding summaries have led the early research, Deep Learning (DL) approaches that use neural models to capture statistical properties between code and summaries are now mainstream. Although some
-
Improving Automated Program Repair with Domain Adaptation ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Armin Zirak, Hadi Hemmati
Automated Program Repair (APR) is defined as the process of fixing a bug/defect in the source code by an automated tool. APR tools have recently shown promising results by leveraging state-of-the-art Natural Language Processing (NLP) techniques. APR tools such as TFix and CodeXGLUE, which combine text-to-text transformers with software-specific techniques, are outperforming alternatives, these days
-
Ethics in the Age of AI: An Analysis of AI Practitioners’ Awareness and Challenges ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Aastha Pant, Rashina Hoda, Simone V. Spiegler, Chakkrit Tantithamthavorn, Burak Turhan
Ethics in AI has become a debated topic of public and expert discourse in recent years. But what do people who build AI—AI practitioners—have to say about their understanding of AI ethics and the challenges associated with incorporating it into the AI-based systems they develop? Understanding AI practitioners’ views on AI ethics is important as they are the ones closest to the AI systems and can bring
-
Measurement of Embedding Choices on Cryptographic API Completion Tasks ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Ya Xiao, Wenjia Song, Salman Ahmed, Xinyang Ge, Bimal Viswanath, Na Meng, Danfeng (Daphne) Yao
In this article, we conduct a measurement study to comprehensively compare the accuracy impacts of multiple embedding options in cryptographic API completion tasks. Embedding is the process of automatically learning vector representations of program elements. Our measurement focuses on design choices in three important aspects: program analysis preprocessing, token-level embedding, and sequence-level
-
Understanding Developers Well-Being and Productivity: A 2-year Longitudinal Analysis during the COVID-19 Pandemic ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Daniel Russo, Paul H. P. Hanel, Niels Van Berkel
The COVID-19 pandemic has brought significant and enduring shifts in various aspects of life, including increased flexibility in work arrangements. In a longitudinal study, spanning 24 months with six measurement points from April 2020 to April 2022, we explore changes in well-being, productivity, social contacts, and needs of software engineers during this time. Our findings indicate systematic changes
-
Learning from Very Little Data: On the Value of Landscape Analysis for Predicting Software Project Health ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-14 Andre Lustosa, Tim Menzies
When data is scarce, software analytics can make many mistakes. For example, consider learning predictors for open source project health (e.g., the number of closed pull requests in 12 months' time). The training data for this task may be very small (e.g., 5 years of data collected every month means just 60 rows of training data). The models generated from such tiny datasets can make many prediction
-
The Lost World: Characterizing and Detecting Undiscovered Test Smells ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Yanming Yang, Xing Hu, Xin Xia, Xiaohu Yang
Test smells are poor programming and design practices in testing, and they are widespread across software projects. Because test smells harm the comprehension and maintenance of test code and can even make the code under test more defect-prone, mining, detecting, and refactoring them is of great importance. Since Deursen et al. introduced the definition of “test smell”,
-
How Important Are Good Method Names in Neural Code Generation? A Model Robustness Perspective ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-14 Guang Yang, Yu Zhou, Wenhua Yang, Tao Yue, Xiang Chen, Taolue Chen
Pre-trained code generation models (PCGMs) have been widely applied in neural code generation, which can generate executable code from functional descriptions in natural languages, possibly together with signatures. Despite substantial performance improvement of PCGMs, the role of method names in neural code generation has not been thoroughly investigated. In this article, we study and demonstrate
-
A Post-training Framework for Improving the Performance of Deep Learning Models via Model Transformation ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Jiajun Jiang, Junjie Yang, Yingyi Zhang, Zan Wang, Hanmo You, Junjie Chen
Deep learning (DL) techniques have attracted much attention in recent years and have been applied to many application scenarios. To improve the performance of DL models regarding different properties, many approaches have been proposed in the past decades, such as improving the robustness and fairness of DL models to meet the requirements for practical use. Among existing approaches, post-training
-
Safety of Perception Systems for Automated Driving: A Case Study on Apollo ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Sangeeth Kochanthara, Tajinder Singh, Alexandru Forrai, Loek Cleophas
The automotive industry is now known for its software-intensive and safety-critical nature. The industry is on a path to the holy grail of completely automating driving, starting from relatively simple operational areas like highways. One of the most challenging, evolving, and essential parts of automated driving is the software that enables understanding of surroundings and the vehicle’s own as well
-
Attack as Detection: Using Adversarial Attack Methods to Detect Abnormal Examples ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Zhe Zhao, Guangke Chen, Tong Liu, Taishan Li, Fu Song, Jingyi Wang, Jun Sun
As a new programming paradigm, deep learning (DL) has achieved impressive performance in areas such as image processing and speech recognition, and has expanded its application to solve many real-world problems. However, neural networks and DL are normally black-box systems; even worse, DL-based software is vulnerable to threats from abnormal examples, such as adversarial and backdoored examples constructed
-
Representation Learning for Stack Overflow Posts: How Far Are We? ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Junda He, Xin Zhou, Bowen Xu, Ting Zhang, Kisub Kim, Zhou Yang, Ferdian Thung, Ivana Clairine Irsan, David Lo
The tremendous success of Stack Overflow has accumulated an extensive corpus of software engineering knowledge, thus motivating researchers to propose various solutions for analyzing its content. The performance of such solutions hinges significantly on the selection of representation models for Stack Overflow posts. As the volume of literature on Stack Overflow continues to burgeon, it highlights
-
Reusing Convolutional Neural Network Models through Modularization and Composition ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Binhang Qi, Hailong Sun, Hongyu Zhang, Xiang Gao
With the widespread success of deep learning technologies, many trained deep neural network (DNN) models are now publicly available. However, directly reusing the public DNN models for new tasks often fails due to mismatching functionality or performance. Inspired by the notion of modularization and composition in software reuse, we investigate the possibility of improving the reusability of DNN models
-
SourcererJBF: A Java Build Framework For Large-Scale Compilation ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Md Rakib Hossain Misu, Rohan Achar, Cristina V. Lopes
Researchers and tool developers working on dynamic analysis, software testing, automated program repair, verification, and validation need large compiled, compilable, and executable code corpora to test their ideas. The publicly available corpora are relatively small, and/or non-compilable, and/or non-executable. Developing a compiled code corpus is a laborious activity demanding significant manual
-
PTM-APIRec: Leveraging Pre-trained Models of Source Code in API Recommendation ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Zhihao Li, Chuanyi Li, Ze Tang, Wanhong Huang, Jidong Ge, Bin Luo, Vincent Ng, Ting Wang, Yucheng Hu, Xiaopeng Zhang
Recommending APIs is a practical and essential feature of IDEs. Improving the accuracy of API recommendations is an effective way to improve coding efficiency. With the success of deep learning in software engineering, the state-of-the-art (SOTA) performance of API recommendation is also achieved by deep-learning-based approaches. However, existing SOTAs either only consider the API sequences in the
-
Causality-driven Testing of Autonomous Driving Systems ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Luca Giamattei, Antonio Guerriero, Roberto Pietrantuono, Stefano Russo
Testing Autonomous Driving Systems (ADS) is essential for safe development of self-driving cars. For thorough and realistic testing, ADS are usually embedded in a simulator and tested in interaction with the simulated environment. However, their high complexity and the multiple safety requirements lead to costly and ineffective testing. Recent techniques exploit many-objective strategies and ML to
-
Learning-based Relaxation of Completeness Requirements for Data Entry Forms ACM Trans. Softw. Eng. Methodol. (IF 4.4) Pub Date : 2024-03-15 Hichem Belgacem, Xiaochen Li, Domenico Bianculli, Lionel Briand
Data entry forms use completeness requirements to specify the fields that are required or optional to fill for collecting necessary information from different types of users. However, because of the evolving nature of software, some required fields may not be applicable for certain types of users anymore. Nevertheless, they may still be incorrectly marked as required in the form; we call such fields