Convolutional neural network application on a new middle Eocene radiolarian dataset, Marine Micropaleontology

Marine Micropaleontology (IF 1.9), Pub Date: 2023-07-18, DOI: 10.1016/j.marmicro.2023.102268
Veronica Carlsson, Taniel Danelian, Martin Tetard, Mathias Meunier, Pierre Boulet, Philippe Devienne, Sandra Ventalon

A new radiolarian image database was used to train a Convolutional Neural Network (CNN) for automatic image classification. The focus was on 39 commonly occurring nassellarian species, which are important for biostratigraphy.

The database consisted of tropical radiolarian assemblages from 129 middle Eocene samples retrieved from Ocean Drilling Program (ODP) Holes 1258A, 1259A, and 1260A (Demerara Rise). A total of 116 taxonomic classes were established, of which 96 were used to train a ResNet50 CNN. To represent the diverse radiolarian assemblage, some classes were formed by grouping forms according to external morphological criteria. This approach yielded a training accuracy of 86.6%.

A test set of 800 images from new samples obtained from Hole 1260A was used to validate the CNN, which achieved an accuracy of 75.69%. The focus then shifted to 39 well-known nassellarian species, using a total of 15,932 images from the new samples. The goal was to determine whether the targeted species were correctly classified and to explore potential real-world applications of the trained CNN.

Several prediction threshold values were tested. In most cases, a lower threshold was preferred so that all species were captured in the correct groups, even though this resulted in lower accuracies within the classes.
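The threshold trade-off described above can be illustrated with a minimal Python sketch. The abstract does not say how the threshold was applied, so this assumes the common scheme in which an image is assigned to its highest-scoring class only if that score reaches the threshold, and is otherwise left unassigned. The class names, scores, and function names are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
def classify_with_threshold(scores, class_names, threshold):
    """Return the top-scoring class, or None if its score is below threshold."""
    best = max(range(len(scores)), key=scores.__getitem__)
    return class_names[best] if scores[best] >= threshold else None


def recall_and_precision(predictions, true_labels, target):
    """Recall: share of true `target` images assigned to `target`.
    Precision: share of images assigned to `target` that truly are `target`."""
    assigned = [t for p, t in zip(predictions, true_labels) if p == target]
    hits = sum(1 for t in assigned if t == target)
    n_true = sum(1 for t in true_labels if t == target)
    recall = hits / n_true if n_true else 0.0
    precision = hits / len(assigned) if assigned else 0.0
    return recall, precision


# Hypothetical two-class example (not data from the paper).
class_names = ["species_a", "species_b"]
scores = [[0.95, 0.05], [0.60, 0.40], [0.55, 0.45], [0.10, 0.90]]
true_labels = ["species_a", "species_a", "species_b", "species_b"]


def evaluate(threshold, target="species_a"):
    """Classify every image at the given threshold and score one target class."""
    predictions = [classify_with_threshold(s, class_names, threshold) for s in scores]
    return recall_and_precision(predictions, true_labels, target)


low = evaluate(0.5)   # permissive: every species_a image is captured
high = evaluate(0.9)  # strict: fewer false assignments, but images are missed
print("threshold 0.5 (recall, precision):", low)
print("threshold 0.9 (recall, precision):", high)
```

In this toy example the lower threshold captures every image of the target class at the cost of one wrong assignment, while the higher threshold keeps the class clean but leaves a genuine specimen unassigned. This is the same kind of trade-off the abstract reports when it prefers lower thresholds.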

Updated: 2023-07-18
