research-article
Just Accepted

Double Reference Guided Interactive 2D and 3D Caricature Generation

Online AM: 01 April 2024
Abstract

In this paper, we propose the first interactive 2D and 3D caricature generation and editing method guided by both geometry and texture references (double reference). The main challenge of caricature generation is that it must not only exaggerate the facial geometry but also refresh the facial texture. We address this challenge by using semantic segmentation maps as an intermediate domain, which removes the influence of photo texture while preserving person-specific geometric features. Specifically, our method consists of two main components: 3D-CariNet and CariMaskGAN. 3D-CariNet uses sketches or caricatures as references to exaggerate the input photo into several types of 3D caricatures. To generate a CariMask, we geometrically exaggerate the photo using the projection of the exaggerated 3D landmarks; CariMaskGAN then converts the CariMask into a caricature. In this step, users can freely edit and adjust the geometry of the caricature. Moreover, we propose a semantic detail preprocessing approach that considerably increases the detail of generated caricatures and allows modification of hair strands, wrinkles, and beards. By rendering high-quality 2D caricatures as textures, we produce 3D caricatures with a variety of texture styles. Extensive experiments demonstrate that our method produces higher-quality caricatures and supports easy interactive modification.
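The geometric step described above (exaggerate 3D landmarks, then project them to 2D to drive the photo warp) can be illustrated with a toy sketch. This is not the paper's 3D-CariNet: it assumes a simple rule that pushes each landmark away from a mean face, followed by an orthographic projection. The function names, the `strength` parameter, and the sample coordinates are all hypothetical.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def exaggerate_landmarks(
    landmarks: List[Point3D], mean_face: List[Point3D], strength: float
) -> List[Point3D]:
    """Push each landmark away from the mean face (assumed rule, not the paper's).

    strength = 0 returns the input unchanged; larger values exaggerate more.
    """
    if len(landmarks) != len(mean_face):
        raise ValueError("landmarks and mean_face must have the same length")
    return [
        tuple(p + strength * (p - m) for p, m in zip(point, mean))
        for point, mean in zip(landmarks, mean_face)
    ]


def project_orthographic(
    landmarks: List[Point3D],
    scale: float = 1.0,
    offset: Point2D = (0.0, 0.0),
) -> List[Point2D]:
    """Drop the depth coordinate, then scale and translate into image space."""
    return [(scale * x + offset[0], scale * y + offset[1]) for x, y, _z in landmarks]


# Toy data: three hypothetical landmarks in a normalized face coordinate frame.
mean_face = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (0.0, 1.0, 0.5)]
face = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.5), (0.0, 0.5, 1.0)]

exaggerated = exaggerate_landmarks(face, mean_face, strength=1.0)
projected = project_orthographic(exaggerated, scale=100.0, offset=(128.0, 128.0))
```

In a full pipeline, the projected 2D points would serve as target positions for warping the photo's segmentation map; the warp itself and the learned exaggeration are outside the scope of this sketch.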

    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Just Accepted
      ISSN:1551-6857
      EISSN:1551-6865
      Copyright © 2024 held by the owner/author(s). Publication rights licensed to ACM.

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Online AM: 1 April 2024
      • Accepted: 18 March 2024
      • Revised: 30 January 2024
      • Received: 31 March 2023