Abstract
Programming modern heterogeneous systems is complex and poses significant challenges. Over the past two decades, researchers have sought to alleviate these difficulties by employing Classical Machine Learning and Deep Learning techniques within compilers to optimize code automatically. This work presents a novel approach that combines Classical Machine Learning and Deep Learning for code optimization, maximizing the benefits of each while mitigating their drawbacks. Our proposed model extracts features from the code using Deep Learning and then applies Classical Machine Learning to map these features to specific outputs for various tasks. We evaluate the model on three downstream tasks: device mapping, optimal thread coarsening, and algorithm classification. Our experimental results show that the model outperforms previous models in device mapping, with an average accuracy of 91.60% on two datasets, and in optimal thread coarsening, where we are the first to achieve a positive speedup on all four platforms, while achieving a comparable result of 91.48% in algorithm classification. Notably, our approach yields better results even with a small dataset, without requiring a pre-training phase or a complex code representation, which reduces training time and data volume requirements.
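The two-stage structure described in the abstract (a learned feature extractor followed by a classical model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' model: `EmbeddingEncoder` is a hypothetical stand-in that mean-pools fixed random token embeddings where a trained deep network would go, `NearestCentroid` is a hypothetical stand-in for the classical learner, and the kernels and `"CPU"`/`"GPU"` labels are invented toy data in the spirit of the device mapping task.

```python
import math
import random
import re


def tokenize(source):
    """Split source text into identifiers and single non-space symbols."""
    return re.findall(r"[A-Za-z_]\w*|\S", source)


class EmbeddingEncoder:
    """Hypothetical stand-in for a deep feature extractor.

    A real system would train this stage; here each token simply gets a
    fixed pseudo-random vector, and a program is the mean of its tokens.
    """

    def __init__(self, dim=16, seed=0):
        self.dim = dim
        self.seed = seed
        self.table = {}

    def _embed(self, token):
        if token not in self.table:
            # Seeding with a string keeps the embedding reproducible.
            rng = random.Random(f"{self.seed}:{token}")
            self.table[token] = [rng.gauss(0.0, 1.0) for _ in range(self.dim)]
        return self.table[token]

    def encode(self, source):
        tokens = tokenize(source)
        pooled = [0.0] * self.dim
        for token in tokens:
            for i, value in enumerate(self._embed(token)):
                pooled[i] += value
        count = max(len(tokens), 1)
        return [value / count for value in pooled]


class NearestCentroid:
    """Hypothetical classical stage: assign the label of the closest class mean."""

    def fit(self, features, labels):
        sums, counts = {}, {}
        for vector, label in zip(features, labels):
            acc = sums.setdefault(label, [0.0] * len(vector))
            for i, value in enumerate(vector):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [value / counts[label] for value in acc]
            for label, acc in sums.items()
        }
        return self

    def predict(self, features):
        return [
            min(self.centroids,
                key=lambda label: math.dist(vector, self.centroids[label]))
            for vector in features
        ]


# Toy data (invented): one kernel per device label.
encoder = EmbeddingEncoder()
train_sources = [
    "for (i = 0; i < n; i++) a[i] = b[i] + c[i];",
    "while (p) { p = p->next; }",
]
train_labels = ["GPU", "CPU"]

# Stage 1 produces feature vectors; stage 2 maps them to task outputs.
classifier = NearestCentroid().fit(
    [encoder.encode(s) for s in train_sources], train_labels
)
print(classifier.predict(
    [encoder.encode("for (j = 0; j < m; j++) x[j] = y[j] * z[j];")]
))
```

Keeping the two stages behind separate `encode` and `fit`/`predict` interfaces means the same extracted features could, in principle, feed a different classical model for each downstream task.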
About this article
Cite this article
Hakimi, Y., Baghdadi, R. & Challal, Y. A Hybrid Machine Learning Model for Code Optimization. Int J Parallel Prog 51, 309–331 (2023). https://doi.org/10.1007/s10766-023-00758-5
DOI: https://doi.org/10.1007/s10766-023-00758-5