Hardware Acceleration for SLAM in Mobile Systems

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

The emerging mobile robot industry has spurred a flurry of interest in solving the simultaneous localization and mapping (SLAM) problem. However, existing SLAM platforms struggle to meet the real-time and low-power requirements imposed by mobile systems. Although specialized hardware is a promising way to achieve high performance and lower power consumption, designing an efficient accelerator for SLAM is severely hindered by the wide variety of SLAM algorithms. Based on our detailed analysis of representative SLAM algorithms, we observe that they pose two challenges for designing efficient hardware accelerators: a large number of computational primitives and irregular control flows. To address these two challenges, we propose a hardware accelerator built from composable computation units, classified as matrix, vector, scalar, and control units. In addition, we design a hierarchical instruction set to cope with a broad range of SLAM algorithms with irregular control flows. Experimental results show that, compared against an Intel x86 processor, our accelerator, with an area of 7.41 mm², achieves on average 10.52x better performance and 112.62x energy savings across different datasets. Compared against a more energy-efficient ARM Cortex processor, our accelerator still achieves 33.03x better performance and 62.64x energy savings.
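
The reported speedup and energy figures can be related with a short back-of-the-envelope calculation. The sketch below is not from the paper: it assumes that "energy savings" means the ratio of baseline energy to accelerator energy, that energy equals average power times runtime, and that the averaged ratios can be divided directly (averages over datasets need not compose exactly). The function name `implied_power_ratio` is hypothetical.

```python
def implied_power_ratio(speedup: float, energy_saving: float) -> float:
    """Baseline average power divided by accelerator average power.

    Assumption (not stated in the abstract): E = P * t, so
    E_base / E_acc = (P_base / P_acc) * (t_base / t_acc).
    """
    return energy_saving / speedup


# (speedup, energy saving) pairs as reported in the abstract.
REPORTED = {
    "Intel x86": (10.52, 112.62),
    "ARM Cortex": (33.03, 62.64),
}

for baseline, (speedup, energy_saving) in REPORTED.items():
    ratio = implied_power_ratio(speedup, energy_saving)
    print(f"{baseline}: ~{ratio:.2f}x lower average power")
```

Under these assumptions the accelerator would draw roughly 10.7x less average power than the x86 baseline and roughly 1.9x less than the ARM baseline, which suggests that most of the energy advantage over the ARM processor comes from shorter runtime rather than lower power.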

Author information

Corresponding author

Correspondence to Zi-Dong Du.

Supplementary Information

ESM 1

(PDF 349 kb)

About this article

Cite this article

Fan, Z., Hao, YF., Zhi, T. et al. Hardware Acceleration for SLAM in Mobile Systems. J. Comput. Sci. Technol. 38, 1300–1322 (2023). https://doi.org/10.1007/s11390-021-1523-5

  • DOI: https://doi.org/10.1007/s11390-021-1523-5
