Semantic segmentation network stacking with genetic programming
Genetic Programming and Evolvable Machines (IF 2.6), Pub Date: 2023-10-26, DOI: 10.1007/s10710-023-09464-0
Illya Bakurov, Marco Buzzelli, Raimondo Schettini, Mauro Castelli, Leonardo Vanneschi

Semantic segmentation consists of classifying each pixel of an image and is an essential step towards scene recognition and understanding. Deep convolutional encoder–decoder neural networks are now the state-of-the-art methods in the field. Street scene segmentation for automotive applications is an important application of such networks and imposes strict requirements. Since the models must run on self-driving vehicles and make fast decisions in response to a constantly changing environment, they are expected not only to operate reliably but also to process input images rapidly. In this paper, we explore genetic programming (GP) as a meta-model that combines four different efficiency-oriented networks for the analysis of urban scenes. We present and examine two approaches. In the first, we represent solutions as GP trees that combine the networks' outputs such that every output class's prediction is obtained through the same meta-model. In the second, we represent solutions as lists of GP trees, each designed to provide a unique meta-model for a given target class. The main objective is to develop efficient and accurate combination models that can be easily interpreted, and that therefore offer hints on how to improve the existing networks. Experiments on the Cityscapes dataset of urban scene images with pixel-wise semantic annotations confirm the effectiveness of the proposed approach. Specifically, our best-performing models improve the systems' generalization ability by approximately 5% compared to traditional ensembles and by approximately 30% compared to the less performing state-of-the-art CNN, and they show competitive results with respect to state-of-the-art ensembles. Additionally, they are small in size, allow interpretability, and use fewer features due to GP's automatic feature selection.
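
The difference between the two representations described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each tree receives, for a single pixel and a single class, the four networks' scores for that class, and it uses hand-written functions (`mean_tree`, `max_of_two`, `product_tree`, all hypothetical) in place of evolved GP trees.

```python
from typing import Callable, List, Sequence

# scores[n][c] is network n's score for class c at one pixel (assumed layout).
PixelScores = Sequence[Sequence[float]]
# A "tree" is modeled as a function of the networks' scores for ONE class.
Tree = Callable[[Sequence[float]], float]


def shared_tree_predict(scores: PixelScores, tree: Tree) -> int:
    """First representation: the same tree scores every class."""
    n_classes = len(scores[0])
    combined = [tree([net[c] for net in scores]) for c in range(n_classes)]
    return max(range(n_classes), key=combined.__getitem__)


def per_class_predict(scores: PixelScores, trees: List[Tree]) -> int:
    """Second representation: tree c is used only to score class c."""
    combined = [trees[c]([net[c] for net in scores]) for c in range(len(trees))]
    return max(range(len(trees)), key=combined.__getitem__)


# Hypothetical stand-ins for evolved trees.
def mean_tree(x: Sequence[float]) -> float:
    return sum(x) / len(x)


def max_of_two(x: Sequence[float]) -> float:
    # Uses only networks 0 and 2, mimicking GP's implicit feature selection.
    return max(x[0], x[2])


def product_tree(x: Sequence[float]) -> float:
    return x[1] * x[3]


# One pixel, four networks, three classes (made-up scores for illustration).
pixel = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
    [0.1, 0.6, 0.3],
]

shared_label = shared_tree_predict(pixel, mean_tree)
per_class_label = per_class_predict(pixel, [max_of_two, product_tree, mean_tree])
```

With these made-up scores the shared tree and the per-class list pick different labels for the same pixel, which shows why a list of trees can express class-specific combinations that a single shared tree cannot.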




Updated: 2023-10-27