
Cooperative Grasp Detection using Convolutional Neural Network

  • Regular paper
  • Published in: Journal of Intelligent & Robotic Systems (2024)

Abstract

In this work, we develop a complete robotic system that enables a robot to perform object-independent cooperative grasp tasks. The human subject initiates a task either by holding an object in hand or by placing it on the table. The human's intention is inferred from body motion, after which the robot selects the corresponding grasp strategy. A novel real-time grasp detection model is proposed to choose the best picking locations according to the object's shape; this module enables the robot to grasp any object placed on the table. Moreover, if a handover grasp task is triggered, hand pixels are detected and filtered out of the candidate grasp poses for safety. The proposed grasp detection model is evaluated on two public grasping datasets and a set of everyday objects. The best model variant achieves accuracies of 97.8% and 96.6% on the image-wise and object-wise splits of the Cornell Grasp Dataset, respectively, and 93.9% on the Jacquard Dataset. The overall system is also evaluated on real cooperative grasp tasks, and the experimental results demonstrate the effectiveness of the proposed grasp detection model and system implementation.
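To make the decision flow of the abstract concrete, the following is a minimal Python sketch. It is an illustration, not the authors' implementation: the `Grasp` fields follow the common five-dimensional grasp-rectangle representation, and `select_grasp` and the dummy hand-overlap test are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Grasp:
    x: float       # grasp centre (image coordinates)
    y: float
    angle: float   # gripper rotation, radians
    width: float   # gripper opening
    quality: float # detector confidence

def select_grasp(intention: str,
                 candidates: List[Grasp],
                 on_hand: Callable[[Grasp], bool]) -> Grasp:
    """Choose the best candidate grasp. For a handover task, first
    discard any grasp that lies on detected hand pixels, mirroring
    the safety filtering described in the abstract."""
    if intention == "handover":
        candidates = [g for g in candidates if not on_hand(g)]
    if not candidates:
        raise ValueError("no safe grasp candidates remain")
    return max(candidates, key=lambda g: g.quality)

# Usage with dummy candidates and a dummy hand-overlap test.
cands = [Grasp(120, 80, 0.3, 40.0, 0.91), Grasp(60, 200, 1.2, 35.0, 0.87)]
print(select_grasp("handover", cands, on_hand=lambda g: g.x > 100))
```

For a tabletop task, the hand filter is skipped and the highest-quality candidate is executed directly.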


Data Availability

The Cornell Grasp Dataset used during the current study is available in the Kaggle repository at https://www.kaggle.com/datasets/oneoneliu/cornell-grasp, and the Jacquard Dataset is available at https://jacquard.liris.cnrs.fr/.
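As a convenience for readers fetching the Cornell data, the following is a minimal parsing sketch. It assumes the dataset's customary layout, in which each `pcd####cpos.txt` (positive) or `pcd####cneg.txt` (negative) file lists grasp rectangles as four `x y` vertex lines per rectangle; verify this against the downloaded copy, and note that `load_grasp_rectangles` is a name introduced here.

```python
import numpy as np

def load_grasp_rectangles(path: str) -> list:
    """Parse a Cornell annotation file into 4x2 vertex arrays,
    one per grasp rectangle (four 'x y' lines each)."""
    rects, verts = [], []
    with open(path) as f:
        for line in f:
            x, y = line.split()
            verts.append((float(x), float(y)))  # float('NaN') parses too
            if len(verts) == 4:
                rect = np.array(verts)
                verts = []
                if not np.isnan(rect).any():  # a few files contain NaNs
                    rects.append(rect)
    return rects
```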


Acknowledgements

The authors thank Shenzhen Technology University for sponsoring the equipment used in this work.

Funding

This project was supported by the Guangdong-Hong Kong-Macao Joint Laboratory of Human–Machine Intelligence-Synergy Systems, Project No. 2019B121205007 (Ye Gu); the Shenzhen Science and Technology Program, No. JCYJ20220818102215034 (Jianmin Cao); the Scientific Research Capacity Improvement Project of Key Construction Disciplines in Guangdong Province, No. 2021ZDJS109 (Jianmin Cao); and the Research Promotion Project of Key Construction Disciplines in Guangdong Province, No. 2022ZDJS112 (Ye Gu).

Author information

Contributions

Ye Gu and Jianmin Cao proposed the general idea of this paper. Ye Gu designed the overall system and the structure of each module. Dujia Wei implemented and verified the grasp detection model. Ye Gu implemented the action recognition and hand segmentation models and evaluated the overall system. Yawei Du implemented the hand-eye calibration. Ye Gu was the major contributor in writing the manuscript.

Corresponding author

Correspondence to Ye Gu.

Ethics declarations

Ethics approval

Not applicable

Consent to participate

Not applicable

Consent for publication

Not applicable

Conflicts of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Gu, Y., Wei, D., Du, Y. et al. Cooperative Grasp Detection using Convolutional Neural Network. J Intell Robot Syst 110, 5 (2024). https://doi.org/10.1007/s10846-023-02028-5
