
The limitations of differentiable architecture search

Original Article, published in Pattern Analysis and Applications

Abstract

In this paper, we provide a detailed explanation of the limitations of differentiable architecture search (DARTS). Algorithms based on the DARTS paradigm tend to converge towards degenerate solutions, i.e., architectures with a shallow graph consisting mainly of skip connections. We identify six sources of error that could explain this phenomenon, some of which can only be partially eliminated. We therefore propose a novel approach that removes degenerate solutions from the search space, and we demonstrate its validity through experiments on the CIFAR-10 and CIFAR-100 datasets. Our code is available at the following link: https://scm.univ-tours.fr/projetspublics/lifat/darts_ibpria_sparcity
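To make the degeneracy issue concrete, the following is a minimal sketch (not the authors' code; it assumes PyTorch and a reduced candidate set) of the DARTS continuous relaxation. Each edge of the searched cell computes a softmax-weighted mixture of candidate operations, and discretization keeps the highest-weighted operation per edge; a degenerate solution is one where the softmax mass collapses onto the parameter-free skip connection on most edges.

```python
# Minimal sketch of the DARTS mixed operation (illustrative, not the
# authors' implementation). DARTS uses 8 candidate operations per edge;
# three suffice to show the mechanism.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """One DARTS edge: a softmax-weighted sum of candidate operations."""

    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Identity(),                                # skip connection
            nn.Conv2d(channels, channels, 3, padding=1),  # 3x3 convolution
            nn.AvgPool2d(3, stride=1, padding=1),         # 3x3 avg pooling
        ])
        # Architecture parameters: one logit per candidate operation.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

if __name__ == "__main__":
    edge = MixedOp(channels=16)
    x = torch.randn(2, 16, 32, 32)
    print(edge(x).shape)  # torch.Size([2, 16, 32, 32])
    # After bilevel optimization, discretization keeps the argmax operation
    # on each edge. If alpha favors Identity almost everywhere, the derived
    # cell is the shallow, skip-dominated architecture described above.
```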


Availability of data and materials

The CIFAR-10 and CIFAR-100 datasets can be obtained from the following link: https://www.cs.toronto.edu/~kriz/cifar.html
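For convenience, here is one way (a minimal sketch, assuming the torchvision package) to fetch the same datasets programmatically instead of downloading them from the link above:

```python
# Fetch CIFAR-10 and CIFAR-100 via torchvision (downloads and caches the
# official archives under ./data on first use).
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

cifar10 = datasets.CIFAR10(root="./data", train=True,
                           download=True, transform=to_tensor)
cifar100 = datasets.CIFAR100(root="./data", train=True,
                             download=True, transform=to_tensor)

print(len(cifar10), len(cifar100))  # 50000 50000
```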


Funding

Région Centre-Val de Loire.

Author information


Corresponding author

Correspondence to Guillaume Lacharme.

Ethics declarations

Ethical approval

This study did not require ethical approval.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Experimental settings


Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Lacharme, G., Cardot, H., Lenté, C. et al. The limitations of differentiable architecture search. Pattern Anal Applic 27, 40 (2024). https://doi.org/10.1007/s10044-024-01260-5
