An empirical study on software understandability and its dependence on code characteristics, Empirical Software Engineering

Empirical Software Engineering (IF 4.1), Pub Date: 2023-11-15, DOI: 10.1007/s10664-023-10396-7
Luigi Lavazza, Sandro Morasca, Marco Gatto

Context

Insufficient code understandability makes software difficult to inspect and maintain and is a primary cause of software development cost. Several source code measures may be used to identify difficult-to-understand code, including well-known ones such as Lines of Code and McCabe’s Cyclomatic Complexity, and novel ones, such as Cognitive Complexity.
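To make the two well-known measures concrete, here is a minimal sketch that approximates them for a small function. It is written in Python purely for illustration and is not the tooling used in the study. The set of node types counted as decision points, the helper names `cyclomatic_complexity` and `lines_of_code`, and the `SAMPLE` function are all assumptions of this sketch; production tools differ in exactly what they count.

```python
import ast

# Node types treated as decision points. This choice is an assumption of the
# sketch; real measurement tools differ in what they count.
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe's Cyclomatic Complexity as decision points + 1."""
    tree = ast.parse(source)
    complexity = 1  # a single straight-line path through the code
    for node in ast.walk(tree):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has three operands and adds two extra branches
            complexity += len(node.values) - 1
    return complexity


def lines_of_code(source: str) -> int:
    """Count lines that are neither blank nor whole-line comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


# Hypothetical toy function, used only to exercise the two helpers.
SAMPLE = '''
def classify(n):
    # toy function, used only for illustration
    if n < 0 and n % 2 == 0:
        return "negative even"
    for i in range(n):
        if i == 3:
            return "has three"
    return "other"
'''

if __name__ == "__main__":
    print("Cyclomatic Complexity:", cyclomatic_complexity(SAMPLE))
    print("Lines of Code:", lines_of_code(SAMPLE))
```

Both measures are purely structural: they are computed from the shape of the code without reference to naming, domain knowledge, or the reader's experience.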

Objective

We investigate whether and to what extent source code measures, individually or together, are correlated with code understandability.

Method

We carried out an empirical study in which students performed realistic maintenance tasks on methods from real-life Open Source Software projects. We collected several data items, including the time needed to correctly complete each maintenance task, which we used to quantify method understandability. We then used several Machine Learning techniques to investigate whether the collected code measures are correlated with code understandability.

Results

We obtained models of code understandability using one or two code measures. However, the obtained models are not very accurate, the average prediction error being around 30%.
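As a rough illustration of what an average prediction error means for a one-measure model, the sketch below fits a straight line of task time against a single code measure and reports the mean absolute relative error. The abstract does not say which error definition or which modeling techniques were used, so the least-squares fit, the error formula, the function names `fit_line` and `mean_relative_error`, and the toy numbers are all assumptions made for illustration, not data or methods from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_relative_error(actual, predicted):
    """Mean of |actual - predicted| / actual (assumes every actual value > 0)."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Invented toy values: one code measure per method, and the minutes
    # needed to complete a maintenance task on that method.
    measure = [2, 4, 5, 8, 11, 15]
    minutes = [6.0, 14.0, 9.0, 21.0, 18.0, 35.0]

    slope, intercept = fit_line(measure, minutes)
    predicted = [slope * m + intercept for m in measure]
    error = mean_relative_error(minutes, predicted)
    print(f"time = {slope:.2f} * measure + {intercept:.2f}")
    print(f"mean relative error: {error:.1%}")
```

Under this definition, an error of about 30% would mean that predicted task times are off by roughly a third of the observed time on average.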

Conclusions

Based on our empirical study, it does not appear possible to build an understandability model based on structural code measures alone. Specifically, even the newly introduced Cognitive Complexity measure does not seem able to fulfill the promise of providing substantial improvements over existing measures, at least as far as code understandability prediction is concerned. It seems that, to obtain models of code understandability of acceptable accuracy, process measures should be used, possibly together with new source code measures that are better related to code understandability.

Updated: 2023-11-16
