
Towards Few-Label Vertical Federated Learning

Online AM: 9 April 2024

Abstract

Federated Learning (FL) provides a novel paradigm for privacy-preserving machine learning, enabling multiple clients to collaborate on model training without sharing private data. To handle multi-source heterogeneous data, vertical federated learning (VFL) has been extensively investigated. However, in the VFL setting, label information tends to be held by a single authoritative client and is often very limited. This poses two challenges for model training: on the one hand, a small number of labels cannot guarantee a well-trained VFL model with informative network parameters, resulting in unclear classification decision boundaries; on the other hand, the large amount of unlabeled data dominates and should not be discarded, so it is worthwhile to study how to leverage it to improve representation learning. To address these two challenges, we first introduce a supervised contrastive loss that enhances intra-class aggregation and inter-class separation, deeply exploiting the label information and improving the effectiveness of downstream classification tasks. Second, for the unlabeled data, we introduce a pseudo-label-guided consistency mechanism that encourages classification results to be coherent across clients, which allows the representations learned by local networks to absorb knowledge from other clients and alleviates the disagreement between clients on the classification task. We conduct extensive experiments on four commonly used datasets, and the results demonstrate that our method outperforms state-of-the-art methods, with the improvement becoming more significant in low-label-rate scenarios.
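
The abstract does not give the exact formulations, so the following are orientation-only sketches rather than the authors' method. A widely used form of the supervised contrastive loss, over a batch with anchor index set I, positives P(i) (other samples sharing the anchor's label), remaining batch samples A(i), representations z, and temperature \tau, is (assuming the paper adopts this standard variant):

\mathcal{L}_{\mathrm{sup}} = \sum_{i \in I} \frac{-1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp(\mathbf{z}_i \cdot \mathbf{z}_p / \tau)}{\sum_{a \in A(i)} \exp(\mathbf{z}_i \cdot \mathbf{z}_a / \tau)}

For the pseudo-label-guided consistency mechanism, a minimal PyTorch-style sketch follows; it assumes each client has a local classification head evaluated on the same aligned batch of unlabeled samples, and the function name, prediction-averaging scheme, and confidence threshold are illustrative assumptions, not the paper's implementation.

import torch
import torch.nn.functional as F

def pseudo_label_consistency(client_logits, threshold=0.95):
    # client_logits: list of [B, C] logit tensors, one per client, all computed
    # on the same batch of aligned (entity-matched) unlabeled samples.
    probs = torch.stack([F.softmax(l, dim=1) for l in client_logits]).mean(dim=0)
    conf, pseudo = probs.max(dim=1)           # shared confidence and hard pseudo-label
    mask = (conf >= threshold).float()        # use only confidently pseudo-labeled samples
    # Pull every client's prediction toward the shared pseudo-label, which
    # encourages coherent classification decisions across clients.
    loss = sum(
        (F.cross_entropy(l, pseudo, reduction="none") * mask).sum()
        / mask.sum().clamp(min=1.0)
        for l in client_logits
    ) / len(client_logits)
    return loss

Averaging the clients' softened predictions into a shared pseudo-label and masking by confidence is one common way to let local representations absorb knowledge from other clients while suppressing noisy targets.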



Published in

ACM Transactions on Knowledge Discovery from Data (Just Accepted)
ISSN: 1556-4681
EISSN: 1556-472X

          Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Online AM: 9 April 2024
          • Accepted: 31 March 2024
          • Revised: 24 January 2024
          • Received: 1 April 2023
Published in TKDD (Just Accepted)

          Qualifiers

          • research-article
