Taming the data deluge: A novel end-to-end deep learning system for classifying marine biological and environmental images, Limnology and Oceanography: Methods



Limnology and Oceanography: Methods (IF 2.7), Pub Date: 2023-11-10, DOI: 10.1002/lom3.10591
Hongsheng Bi₁, Yunhao Cheng₂, Xuemin Cheng₃, Mark C. Benfield₄, David G. Kimmel₅, Haiyong Zheng₂, Sabrina Groves₁, Kezhen Ying₆


Underwater imaging enables nondestructive plankton sampling at frequencies, durations, and resolutions unattainable by traditional methods. These systems necessitate automated processes to identify organisms efficiently. Early underwater image processing used a standard approach: binarizing images to segment targets, then integrating deep learning models for classification. While intuitive, this infrastructure has limitations in handling high concentrations of biotic and abiotic particles, rapid changes in dominant taxa, and highly variable target sizes. To address these challenges, we introduce a new framework that starts with a scene classifier to capture large within-image variation, such as disparities in the layout of particles and dominant taxa. After scene classification, scene-specific Mask regional convolutional neural network (Mask R-CNN) models are trained to separate target objects into different groups. The procedure allows information to be extracted from different image types, while minimizing potential bias for commonly occurring features. Using in situ coastal plankton images, we compared the scene-specific models to the Mask R-CNN model encompassing all scene categories as a single full model. Results showed that the scene-specific approach outperformed the full model by achieving a 20% accuracy improvement in complex noisy images. The full model yielded counts that were up to 78% lower than those enumerated by the scene-specific model for some small-sized plankton groups. We further tested the framework on images from a benthic video camera and an imaging sonar system with good results. The integration of scene classification, which groups similar images together, can improve the accuracy of detection and classification for complex marine biological images.
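The two-stage routing described in the abstract (classify the scene first, then hand the image to a model trained for that scene) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the scene labels, the `Detection` type, the confidence threshold, and every `toy_*` function are hypothetical stand-ins, and a real system would plug trained scene-classification and Mask R-CNN models into the same callable slots.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# An "image" here is just a nested list of pixel intensities in [0, 1].
Image = Sequence[Sequence[float]]


@dataclass
class Detection:
    label: str    # predicted taxon or particle group
    score: float  # detector confidence in [0, 1]


SceneClassifier = Callable[[Image], str]
Detector = Callable[[Image], List[Detection]]


def classify_with_scene_routing(
    image: Image,
    scene_classifier: SceneClassifier,
    detectors: Dict[str, Detector],
    fallback: Detector,
    min_score: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Detection]:
    """Pick a detector by scene label, run it, and keep confident detections."""
    scene = scene_classifier(image)
    # Scenes without a dedicated model fall back to a single "full" model.
    detector = detectors.get(scene, fallback)
    return [d for d in detector(image) if d.score >= min_score]


def count_by_label(detections: List[Detection]) -> Counter:
    """Tally detections per group, e.g. to compare counts between models."""
    return Counter(d.label for d in detections)


# --- Hypothetical stand-ins for trained models (illustration only) ---

def toy_scene_classifier(image: Image) -> str:
    """Label an image 'dense' when more than 30% of its pixels are bright."""
    pixels = [p for row in image for p in row]
    occupied = sum(1 for p in pixels if p > 0.5) / len(pixels)
    return "dense" if occupied > 0.3 else "sparse"


def toy_dense_detector(image: Image) -> List[Detection]:
    return [Detection("copepod", 0.9), Detection("detritus", 0.4)]


def toy_sparse_detector(image: Image) -> List[Detection]:
    return [Detection("jellyfish", 0.8)]


def toy_full_detector(image: Image) -> List[Detection]:
    return [Detection("unknown", 0.6)]


DETECTORS: Dict[str, Detector] = {
    "dense": toy_dense_detector,
    "sparse": toy_sparse_detector,
}

if __name__ == "__main__":
    dense_image = [[0.9, 0.8], [0.1, 0.7]]
    found = classify_with_scene_routing(
        dense_image, toy_scene_classifier, DETECTORS, toy_full_detector
    )
    print(count_by_label(found))
```

Keeping the detectors in a dictionary keyed by scene label means a scene-specific model can be added or retrained without touching the routing logic, and the fallback slot gives a natural place for the single full model that the abstract uses as its baseline.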


Updated: 2023-11-10
