Comparative analysis of various models for image classification on Cifar-100 dataset,Journal of Physics: Conference Series


Comparative analysis of various models for image classification on Cifar-100 dataset
Journal of Physics: Conference Series, Pub Date: 2024-02-01, DOI: 10.1088/1742-6596/2711/1/012015
YuYu Zheng, HaoXuan Huang, JunMing Chen

Many convolutional neural network (CNN) based models have been developed for computer vision. Well-known models such as GoogLeNet, Residual Network (ResNet), Visual Geometry Group (VGG), and You Only Look Once (YOLO) differ in architecture and performance, so choosing among them can be difficult for those just starting to study image classification. To address this problem, we introduce the GoogLeNet, ResNet-18, and VGG-16 models and compare their architectures, features, and performance, then give suggestions based on the test results to help beginners choose a suitable model. We trained and tested GoogLeNet, ResNet-18, and VGG-16 on the CIFAR-100 dataset with the same hyperparameters. For each model we recorded the test accuracy, average test loss, and training loss, and analyzed the resulting curves for trends, key points, rate of increase, and other features. We then related these observations to the architecture of each model to draw our conclusions. The experimental results show that ResNet-18 is a good choice for training on CIFAR-100 because it performs well after training and has a low time complexity; it also converges fastest. GoogLeNet is the second choice: it performs similarly to ResNet-18 or even better, but training it is time-consuming. VGG-16 is not recommended in this experiment because it has the worst performance of the three while its training complexity is similar to that of ResNet-18.
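As a rough illustration of two of the reported metrics, the sketch below computes test accuracy and average test loss from raw class scores (logits). The abstract does not name the loss function, so cross-entropy is an assumption here, and the function names (`cross_entropy`, `evaluate`) and the toy 3-class data are hypothetical, not taken from the paper.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of one example, computed with a stable log-sum-exp.

    Assumption: the abstract does not state the loss; cross-entropy is
    the usual choice for multi-class classification.
    """
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[label]


def evaluate(batch_logits, labels):
    """Return (test accuracy, average test loss) over a labelled test set."""
    if len(batch_logits) != len(labels) or not labels:
        raise ValueError("need one label per example and at least one example")
    correct = 0
    loss_sum = 0.0
    for logits, label in zip(batch_logits, labels):
        # The predicted class is the index of the largest logit.
        predicted = max(range(len(logits)), key=logits.__getitem__)
        correct += int(predicted == label)
        loss_sum += cross_entropy(logits, label)
    count = len(labels)
    return correct / count, loss_sum / count


# Hypothetical toy data with 3 classes (CIFAR-100 itself has 100 classes).
toy_logits = [[2.0, 0.5, 0.1], [0.2, 0.3, 1.5], [1.0, 1.2, 0.1]]
toy_labels = [0, 2, 0]
accuracy, avg_loss = evaluate(toy_logits, toy_labels)
print(f"test accuracy = {accuracy:.3f}, average test loss = {avg_loss:.3f}")
```

Calling `evaluate` with the outputs of each trained model on the same test split yields numbers that can be compared directly across models, which is the kind of like-for-like comparison the abstract describes.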


Updated: 2024-02-01
