Abstract
Widely used compilers such as GCC and LLVM typically provide hundreds of optimizations controlled by optimization flags, which are enabled or disabled during compilation to improve the runtime performance (e.g., shorter execution time) of the compiled program. Because of the large number of optimization flags and their combinations, it is difficult for compiler users to tune these flags manually. The literature proposes a number of autotuning techniques, which tune optimization flags for a compiled program by comparing its actual runtime performance under different flag combinations. Because the search space is huge and measuring actual runtime is costly, these techniques suffer from a widely recognized efficiency problem. To reduce the runtime cost, in this article we propose a lightweight learning approach that uses a small amount of actual runtime performance data to predict the runtime performance of a compiled program under various optimization flag combinations. Furthermore, to reduce the search space, we design a novel particle swarm algorithm that tunes compiler optimization flags using the prediction model. To evaluate the proposed approach, CompTuner, we conduct an extensive experimental study on two popular C compilers, GCC and LLVM, with two widely used benchmarks, cBench and PolyBench. The experimental results show that CompTuner significantly outperforms the six compared techniques, including the state-of-the-art technique BOCA.
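To make the two ideas in the abstract concrete (a prediction model trained on a few real measurements, and a particle swarm search guided by that model), the following is a minimal sketch and not CompTuner itself. Everything in it is an assumption made for illustration: `measure_runtime` is a synthetic cost function standing in for compiling and timing a program, `fit_predictor` is a deliberately crude per-flag additive model rather than the learner used in the article, and `pso_search` is a textbook binary particle swarm with a sigmoid transfer function, not the novel variant the article proposes.

```python
import math
import random

N_FLAGS = 8  # hypothetical number of on/off optimization flags


def measure_runtime(flags):
    """Toy stand-in for compiling with `flags` and timing the program."""
    cost = 10.0
    for i, on in enumerate(flags):
        # Assumed synthetic effects: flags 0-3 help, flags 4-7 hurt.
        cost += (-1.0 if i < 4 else 0.5) * on
    if flags[0] and flags[4]:
        cost += 1.5  # one assumed negative interaction between two flags
    return cost


def fit_predictor(samples):
    """Fit a crude additive model from (flags, runtime) measurements."""
    mean = sum(r for _, r in samples) / len(samples)
    effects = []
    for i in range(N_FLAGS):
        on = [r for f, r in samples if f[i]]
        off = [r for f, r in samples if not f[i]]
        # Effect of a flag = mean runtime with it on minus mean with it off.
        effects.append(sum(on) / len(on) - sum(off) / len(off) if on and off else 0.0)

    def predict(flags):
        return mean + sum(e * (b - 0.5) for e, b in zip(effects, flags))

    return predict


def pso_search(predict, rng, n_particles=10, n_iters=30):
    """Binary PSO that minimizes the predicted (not measured) runtime."""
    xs = [[rng.randint(0, 1) for _ in range(N_FLAGS)] for _ in range(n_particles)]
    vs = [[0.0] * N_FLAGS for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_score = [predict(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]
    for _ in range(n_iters):
        for p in range(n_particles):
            for d in range(N_FLAGS):
                r1, r2 = rng.random(), rng.random()
                v = (0.7 * vs[p][d]
                     + 1.5 * r1 * (pbest[p][d] - xs[p][d])
                     + 1.5 * r2 * (gbest[d] - xs[p][d]))
                vs[p][d] = max(-4.0, min(4.0, v))  # clamp the velocity
                # Sigmoid transfer: velocity becomes the probability of bit = 1.
                prob_on = 1.0 / (1.0 + math.exp(-vs[p][d]))
                xs[p][d] = 1 if rng.random() < prob_on else 0
            score = predict(xs[p])
            if score < pbest_score[p]:
                pbest[p], pbest_score[p] = xs[p][:], score
                if score < gbest_score:
                    gbest, gbest_score = xs[p][:], score
    return gbest


def tune(seed=0, n_samples=12):
    """Measure a few random configurations, search on the model, verify once."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        flags = [rng.randint(0, 1) for _ in range(N_FLAGS)]
        samples.append((flags, measure_runtime(flags)))
    predict = fit_predictor(samples)
    candidate = pso_search(predict, rng)
    samples.append((candidate, measure_runtime(candidate)))
    # Report the best configuration that was actually measured.
    return min(samples, key=lambda s: s[1])


if __name__ == "__main__":
    best_flags, best_runtime = tune()
    print(best_flags, best_runtime)
```

In this sketch the expensive call, `measure_runtime`, runs only `n_samples + 1` times, while the swarm evaluates the cheap predictor hundreds of times. That split is the efficiency argument the abstract makes; how well it works in practice depends on the predictor's accuracy, which this toy model does not attempt to reproduce.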