
A dual progressive strategy for long-tailed visual recognition

  • Original Paper
  • Published:
Machine Vision and Applications

Abstract

Unlike the roughly balanced datasets used in most experiments, datasets encountered in practice commonly exhibit a long-tail phenomenon. Previous work has typically used re-sampling, re-weighting, or ensemble learning to mitigate the long-tail problem; the first two are the most widely adopted (and are the ones we build on) owing to their generality. However, weighting classes directly by the inverse of their sample sizes may not be a good strategy, as it often sacrifices the performance of the head classes. We propose a new cost-allocation approach consisting of two parts: the first part is trained without weighting to ensure that the network fits the head-class data adequately; the second part then dynamically assigns weights based on the relative difficulty of each class. In addition, we propose a novel, practical GrabCut-based data augmentation approach to increase the diversity and differentiation of the mid- and tail-class data. Extensive experiments on public and self-constructed long-tailed datasets demonstrate the effectiveness of our approach, which achieves excellent performance.
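The two-part cost allocation described above can be sketched as follows. This is a minimal illustration, not the authors' exact formulation: the function name `class_weights`, the warm-up schedule, and the use of `1 - accuracy` as the per-class difficulty signal are all assumptions; the abstract only specifies an unweighted first stage followed by difficulty-driven dynamic weighting.

```python
def class_weights(per_class_accuracy, epoch, warmup_epochs, tau=1.0):
    """Two-stage weight assignment (a sketch, assuming difficulty = 1 - accuracy).

    Stage 1 (epoch < warmup_epochs): all weights are 1, i.e. unweighted
    training, so the network fits the abundant head classes adequately.
    Stage 2: weights are set from relative class difficulty, raised to an
    assumed temperature `tau`, and normalised to mean 1 so the overall loss
    scale stays stable.
    """
    n = len(per_class_accuracy)
    if epoch < warmup_epochs:
        return [1.0] * n
    difficulty = [(1.0 - a) ** tau for a in per_class_accuracy]
    mean_d = sum(difficulty) / n
    return [d / mean_d for d in difficulty]
```

In a training loop, these weights would be recomputed each epoch from validation accuracy and passed to a class-weighted cross-entropy loss; harder (typically tail) classes receive weights above 1, easier head classes below 1.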
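The GrabCut-based augmentation amounts to extracting a foreground object and compositing it onto a new background to diversify mid- and tail-class images. The segmentation itself would come from an iterated graph-cut segmenter such as OpenCV's `cv2.grabCut`; the sketch below assumes that binary mask is already available and shows only the compositing step. The function name `composite_foreground` and the shapes are illustrative assumptions.

```python
import numpy as np

def composite_foreground(fg_img, fg_mask, bg_img):
    """Paste a segmented foreground onto a new background image.

    fg_mask is the binary mask (1 = foreground) that a segmenter such as
    GrabCut would output. fg_img and bg_img have shape (H, W, 3); fg_mask
    has shape (H, W). Foreground pixels come from fg_img, the rest from
    bg_img.
    """
    mask3 = fg_mask[..., None].astype(fg_img.dtype)  # broadcast over channels
    return fg_img * mask3 + bg_img * (1 - mask3)
```

Applying this with backgrounds drawn from other images increases the diversity of the scarce mid- and tail-class samples without altering the object itself.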


Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Code Availability

The code that supports the findings of this study is available from the corresponding author upon reasonable request.


Acknowledgements

The work described in this paper was supported by the National Natural Science Foundation of China (No. 61673396) and the Natural Science Foundation of Shandong Province (No. ZR2022MF260).


Author information

Authors and Affiliations

Authors

Contributions

Q.Z. contributed to data curation; M.S. contributed to funding acquisition; G.C. contributed to writing (original draft); H.L. contributed to writing (review and editing). All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Guoqing Cao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liang, H., Cao, G., Shao, M. et al. A dual progressive strategy for long-tailed visual recognition. Machine Vision and Applications 35, 1 (2024). https://doi.org/10.1007/s00138-023-01480-5


  • DOI: https://doi.org/10.1007/s00138-023-01480-5
