Towards better small object detection in UAV scenes: Aggregating more object-oriented information
Pattern Recognition Letters (IF 5.1) Pub Date: 2024-04-06, DOI: 10.1016/j.patrec.2024.04.002
Chenyue Yang, Yichao Cao, Xiaobo Lu

Security, transportation, and rescue applications require thorough analysis and interpretation of visual data captured by drone platforms. While object detection research is advancing rapidly on many fronts, detecting small objects from drone platforms remains a significant challenge. Specifically, targets in drone-captured scenes are notoriously hard to detect because of low resolution, indistinguishable features, occlusion, and scale variation, among other factors. To address this, our work supplies abundant object-oriented information to improve the recognition of small objects at large scale, and is structured as follows. First, we devise a method for small object detection in UAV-captured scenes that incorporates enhanced object-oriented information. This involves improving the fundamental convolutional feature transformation to yield more discriminative contexts for small objects. Second, we represent the input tokens of the vision Multilayer Perceptron (MLP) as a wave function. To obtain a more effective global representation, we compute the amplitude and phase of these tokens using a local-maximum approach, which facilitates dynamic aggregation tailored to each object's unique semantic information. Lastly, we propose a local-to-global fusion method for comprehensive learning of object features. Experiments confirm the efficacy of our model, which achieves a mean Average Precision (mAP) of 30.6% on the VisDrone dataset. This result, obtained at a consistent input size, outperforms other state-of-the-art methods, underscoring our model's reliability for small object detection in drone-captured imagery.
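The wave-function view of tokens mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how amplitude and phase are computed (beyond naming a local-maximum approach), so the values below are supplied by hand, and the helper names `to_wave` and `aggregate` are hypothetical. The sketch only shows the general property that makes such a representation useful: when tokens are summed as waves, their phases determine whether they reinforce or cancel.

```python
import cmath
import math

def to_wave(amplitude, phase):
    """Represent one token value as a complex wave: amplitude * exp(i * phase)."""
    return cmath.rect(amplitude, phase)

def aggregate(tokens):
    """Sum (amplitude, phase) tokens as waves and return the magnitude of the sum.

    Assumption for illustration only: a plain unweighted sum stands in for
    whatever learned token mixing a real model would apply.
    """
    return abs(sum(to_wave(a, p) for a, p in tokens))

# Two unit-amplitude tokens with the same phase reinforce each other.
in_phase = aggregate([(1.0, 0.0), (1.0, 0.0)])

# Two unit-amplitude tokens with opposite phase cancel each other.
opposed = aggregate([(1.0, 0.0), (1.0, math.pi)])
```

Here `in_phase` has magnitude 2 and `opposed` is zero up to floating-point error, so the same pair of amplitudes yields very different aggregates depending on phase. A model that sets phase from token content can exploit this to weight aggregation dynamically.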

Updated: 2024-04-06