A Hybrid Machine Learning Model for Code Optimization

Hakimi, Yacine; Baghdadi, Riyadh; Challal, Yacine

doi:10.1007/s10766-023-00758-5

Published: 22 September 2023

Volume 51, pages 309–331, (2023)

International Journal of Parallel Programming

Abstract

Programming modern heterogeneous systems is complex and raises significant challenges. Over the past two decades, researchers have sought to alleviate these difficulties by employing classical machine learning and deep learning techniques within compilers to optimize code automatically. This work presents a novel approach that combines classical machine learning and deep learning to optimize code, maximizing the benefits of each while mitigating their drawbacks. Our proposed model extracts features from the code using deep learning and then applies classical machine learning to map these features to specific outputs for various tasks. We evaluate the model on three downstream tasks: device mapping, optimal thread coarsening, and algorithm classification. Our experimental results show that the model outperforms previous models in device mapping, with an average accuracy of 91.60% on two datasets, and in optimal thread coarsening, where we are the first to achieve a positive speedup on all four platforms; in algorithm classification it achieves a comparable result of 91.48%. Notably, our approach yields better results even with a small dataset, without requiring a pre-training phase or a complex code representation, which reduces training time and data volume requirements.
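The two-stage idea described in the abstract (a feature extractor followed by a classical learner) can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not specify the network or the classifier, so everything named here is an assumption made for illustration. In particular, `embed_token` is an untrained stand-in for a learned deep-learning encoder, `KNNClassifier` is one arbitrary choice of classical learner, and `TOY_KERNELS` with its CPU/GPU labels is made-up data in the spirit of the device mapping task.

```python
import math
import random
import re
from collections import Counter

EMBED_DIM = 16  # hypothetical feature width, not taken from the paper


def tokenize(source: str) -> list[str]:
    """Split code into identifiers, integer literals, and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def embed_token(token: str) -> list[float]:
    """Stand-in for a learned embedding: a fixed pseudo-random vector per token.

    A real system would use a trained deep network here (assumption).
    """
    rng = random.Random(token)  # string seeds are deterministic across runs
    return [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]


def extract_features(source: str) -> list[float]:
    """Stage 1: mean-pool token vectors into one fixed-length feature vector."""
    tokens = tokenize(source)
    if not tokens:
        return [0.0] * EMBED_DIM
    vectors = [embed_token(t) for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


class KNNClassifier:
    """Stage 2: a classical learner (k-nearest neighbours) over the features."""

    def __init__(self, k: int = 1) -> None:
        self.k = k
        self.features: list[list[float]] = []
        self.labels: list[str] = []

    def fit(self, features: list[list[float]], labels: list[str]) -> None:
        self.features = list(features)
        self.labels = list(labels)

    def predict(self, feature: list[float]) -> str:
        # Rank stored samples by Euclidean distance, then take a majority vote.
        ranked = sorted(
            zip(self.features, self.labels),
            key=lambda pair: math.dist(pair[0], feature),
        )
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]


# Made-up toy data: kernel source text paired with a hypothetical device label.
TOY_KERNELS = [
    ("for (i = 0; i < n; i++) c[i] = a[i] + b[i];", "GPU"),
    ("while (p != NULL) { total += p->value; p = p->next; }", "CPU"),
]

model = KNNClassifier(k=1)
model.fit(
    [extract_features(src) for src, _ in TOY_KERNELS],
    [label for _, label in TOY_KERNELS],
)
print(model.predict(extract_features(TOY_KERNELS[0][0])))
```

Keeping the two stages separate means the classical learner can be retrained or swapped per task without touching the extractor, which is the kind of flexibility the abstract attributes to the hybrid design.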

Author information

Authors and Affiliations

Laboratoire des Méthodes de Conception des Systèmes (LMCS), Ecole nationale Supérieure d’Informatique, BP 68M, 16309, Oued Smar, Algeria
Yacine Hakimi
Massachusetts Institute of Technology (MIT), Massachusetts, USA
Riyadh Baghdadi
New York University Abu Dhabi (NYUAD), Abu Dhabi, UAE
Riyadh Baghdadi
College of Computing and Information Technology, University of Doha for Science and Technology (UDST), Doha, Qatar
Yacine Challal

Corresponding author

Correspondence to Yacine Hakimi.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hakimi, Y., Baghdadi, R. & Challal, Y. A Hybrid Machine Learning Model for Code Optimization. Int J Parallel Prog 51, 309–331 (2023). https://doi.org/10.1007/s10766-023-00758-5

Received: 12 February 2023
Accepted: 05 September 2023
Published: 22 September 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s10766-023-00758-5
