
Towards Few-Label Vertical Federated Learning

Online AM: 9 April 2024

Abstract

Federated Learning (FL) provides a novel paradigm for privacy-preserving machine learning, enabling multiple clients to collaborate on model training without sharing private data. To handle multi-source heterogeneous data, vertical federated learning (VFL) has been extensively investigated. However, in the VFL setting, label information tends to be held by a single authoritative client and is often very limited. This poses two challenges for model training: on the one hand, a small number of labels cannot guarantee a well-trained VFL model with informative network parameters, resulting in unclear classification decision boundaries; on the other hand, the large amount of unlabeled data dominates and should not be discarded, so it is worthwhile to study how to leverage it to improve representation learning. To address these two challenges, we first introduce a supervised contrastive loss that enhances intra-class aggregation and inter-class separation, deeply exploiting the label information and improving the effectiveness of downstream classification tasks. Second, for the unlabeled data, we introduce a pseudo-label-guided consistency mechanism that encourages classification results to be coherent across clients, which allows the representations learned by local networks to absorb knowledge from other clients and alleviates the disagreement between clients on the classification task. We conduct extensive experiments on four commonly used datasets, and the results demonstrate that our method outperforms state-of-the-art methods, with the improvement becoming more significant in low-label-rate scenarios.
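
The abstract does not give the exact formulations, so the following are orientation-only sketches rather than the authors' method. A widely used form of the supervised contrastive loss, over a batch with anchor index set I, positives P(i) (other samples sharing the anchor's label), remaining batch samples A(i), representations z, and temperature \tau, is (assuming the paper adopts this standard variant):

\mathcal{L}_{\mathrm{sup}} = \sum_{i \in I} \frac{-1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp(\mathbf{z}_i \cdot \mathbf{z}_p / \tau)}{\sum_{a \in A(i)} \exp(\mathbf{z}_i \cdot \mathbf{z}_a / \tau)}

For the pseudo-label-guided consistency mechanism, a minimal PyTorch-style sketch follows; it assumes each client has a local classification head evaluated on the same aligned batch of unlabeled samples, and the function name, prediction-averaging scheme, and confidence threshold are illustrative assumptions, not the paper's implementation.

import torch
import torch.nn.functional as F

def pseudo_label_consistency(client_logits, threshold=0.95):
    # client_logits: list of [B, C] logit tensors, one per client, all computed
    # on the same batch of aligned (entity-matched) unlabeled samples.
    probs = torch.stack([F.softmax(l, dim=1) for l in client_logits]).mean(dim=0)
    conf, pseudo = probs.max(dim=1)           # shared confidence and hard pseudo-label
    mask = (conf >= threshold).float()        # use only confidently pseudo-labeled samples
    # Pull every client's prediction toward the shared pseudo-label, which
    # encourages coherent classification decisions across clients.
    loss = sum(
        (F.cross_entropy(l, pseudo, reduction="none") * mask).sum()
        / mask.sum().clamp(min=1.0)
        for l in client_logits
    ) / len(client_logits)
    return loss

Averaging the clients' softened predictions into a shared pseudo-label and masking by confidence is one common way to let local representations absorb knowledge from other clients while suppressing noisy targets.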



Published in

ACM Transactions on Knowledge Discovery from Data (Just Accepted)
ISSN: 1556-4681
EISSN: 1556-472X

          Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Online AM: 9 April 2024
          • Accepted: 31 March 2024
          • Revised: 24 January 2024
          • Received: 1 April 2023
Published in TKDD (Just Accepted)

          Qualifiers

          • research-article
