Test Data Generation for Mutation Testing Based on Markov Chain Usage Model and Estimation of Distribution Algorithm, IEEE Transactions on Software Engineering

IEEE Transactions on Software Engineering (IF 7.4), Pub Date: 2024-01-24, DOI: 10.1109/tse.2024.3358297
Changqing Wei, Xiangjuan Yao, Dunwei Gong, Huai Liu

Mutation testing, a mainstream fault-based software testing technique, mimics a wide variety of software faults by seeding them into the target program, producing so-called mutants. Test data generated in mutation testing should kill as many mutants as possible, so that testing achieves high fault-detection effectiveness. Nevertheless, test data generation can be very expensive, because mutation testing normally involves an extremely large number of mutants, some of which are hard to kill. It is thus a critical yet challenging task to efficiently generate a small set of test data that can kill multiple mutants at the same time and also reveal hard-to-detect faults. In this paper, we propose a new approach to test data generation in mutation testing, through novel applications of the Markov chain usage model and the estimation of distribution algorithm. We first use the Markov chain usage model to reduce the so-called mutant branches in weak mutation testing and generate a minimal set of extended paths. We then treat test data generation as the problem of covering these extended paths and solve it with an estimation of distribution algorithm based on a probability model. Finally, we develop a framework, TAMMEA, that implements the new approach. Empirical studies on fifteen object programs show that TAMMEA kills more mutants using fewer test data than the baseline techniques. In addition, the computation overhead of TAMMEA is lower than that of the baseline technique based on the traditional genetic algorithm, and comparable to that of the random method. These results indicate that the new approach improves both the effectiveness and efficiency of mutation testing, thus promoting its practicability.
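
As a rough illustration of the idea of treating test data generation as a path-coverage problem solved by an estimation of distribution algorithm, the sketch below searches for an input that drives a toy triangle program down one target path. It is not TAMMEA and is not taken from the paper: the program under test, the cost function, the univariate Gaussian probability model, and every name and parameter (trace, cost, eda_generate, pop, elite, gens, seed) are assumptions made for this example only. The Markov chain usage model step is not shown.

```python
import random
import statistics


def trace(a, b, c):
    """Hypothetical program under test: returns the branch decisions it takes."""
    is_triangle = a + b > c and a + c > b and b + c > a
    if not is_triangle:
        return (False,)
    return (True, a == b, b == c)


def cost(x):
    """Distance to the target path (True, True, True); 0 means the path is covered.

    Assumed guidance function: invalid triangles cost more than 1, valid ones
    cost less than 1 and approach 0 as the sides become equal.
    """
    a, b, c = x
    if trace(a, b, c) == (False,):
        violation = max(c - a - b, b - a - c, a - b - c) + 1
        return 1.0 + violation / (violation + 1)
    d = abs(a - b) + abs(b - c)
    return d / (d + 1)


def eda_generate(target_cost, dims=3, lo=1, hi=100, pop=60, elite=15,
                 gens=200, seed=0):
    """Univariate Gaussian EDA: sample, select the best, re-estimate the model."""
    rng = random.Random(seed)
    mu = [(lo + hi) / 2] * dims      # model mean per input variable
    sigma = [(hi - lo) / 2] * dims   # model spread per input variable
    best = None
    for _ in range(gens):
        # Sample a population of integer inputs from the current model.
        population = [
            tuple(min(hi, max(lo, round(rng.gauss(m, s))))
                  for m, s in zip(mu, sigma))
            for _ in range(pop)
        ]
        population.sort(key=target_cost)
        if best is None or target_cost(population[0]) < target_cost(best):
            best = population[0]
        if target_cost(best) == 0:
            break  # target path covered
        # Re-estimate the model from the lowest-cost individuals.
        selected = population[:elite]
        for i in range(dims):
            column = [ind[i] for ind in selected]
            mu[i] = statistics.mean(column)
            # Floor on sigma is an assumption to avoid premature collapse.
            sigma[i] = max(statistics.pstdev(column), 0.5)
    return best


if __name__ == "__main__":
    candidate = eda_generate(cost)
    print(candidate, trace(*candidate), cost(candidate))
```

The search is not guaranteed to reach cost 0 within the generation budget; the returned candidate is simply the lowest-cost input seen. Unlike a genetic algorithm, no crossover or mutation operators are applied here: new candidates come only from sampling the re-estimated probability model.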

Updated: 2024-01-24
