Mitigate noisy data for smart IoT via GAN based machine unlearning

  • Research Paper
  • Science China Information Sciences

Abstract

With the development of IoT applications, machine learning has dramatically improved the utility of various IoT systems such as autonomous driving. Although the pretrain-finetune framework copes well with data heterogeneity in complex IoT scenarios, the data collected by sensors often contain unexpected noisy data, e.g., out-of-distribution (OOD) data, which reduce the performance of fine-tuned models. To resolve this problem, this paper proposes MuGAN, a method that mitigates the side effects of OOD data via generative adversarial network (GAN)-based machine unlearning. MuGAN follows a straightforward but effective idea to mitigate the performance loss caused by OOD data, i.e., “flashbacking” the model to the condition where OOD data are excluded from model training. To achieve this goal, we design an adversarial game in which a discriminator is trained to identify whether a sample belongs to the training set by observing the confidence score. Meanwhile, a generator (i.e., the target model) is updated to fool the discriminator into believing that the OOD data are not included in the training set while the remaining data are. The experimental results show that, benefiting from a high unlearning rate (more than 90%) and a high retention rate (99%), MuGAN reduces the model performance degradation caused by OOD data from 5.88% to 0.8%.
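The adversarial game described in the abstract can be illustrated with a toy sketch. The abstract does not give the loss functions, so everything below is an assumption made for illustration: a one-dimensional logistic discriminator over a single confidence score, binary cross-entropy losses, and made-up confidence values. The names `discriminator`, `train_discriminator`, and `generator_loss` are hypothetical and are not taken from the paper. The sketch only shows the direction of the objective, not MuGAN itself.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, label, eps=1e-12):
    """Binary cross-entropy of one predicted probability against a 0/1 label."""
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def discriminator(conf, w, b):
    """Hypothetical 1-D discriminator: estimated probability that a sample
    was in the training set, given only the model's confidence score."""
    return sigmoid(w * conf + b)

def train_discriminator(member_conf, nonmember_conf, lr=0.5, steps=2000):
    """Fit w, b by gradient descent on BCE (members -> 1, non-members -> 0)."""
    w, b = 0.0, 0.0
    data = [(c, 1) for c in member_conf] + [(c, 0) for c in nonmember_conf]
    for _ in range(steps):
        gw = gb = 0.0
        for c, y in data:
            err = discriminator(c, w, b) - y  # gradient of BCE w.r.t. the logit
            gw += err * c
            gb += err
        w -= lr * gw / len(data)
        b -= lr * gb / len(data)
    return w, b

def generator_loss(ood_conf, retained_conf, w, b):
    """Assumed objective for the target model: OOD samples should be judged
    non-members (label 0), retained samples should be judged members (label 1)."""
    losses = [bce(discriminator(c, w, b), 0) for c in ood_conf]
    losses += [bce(discriminator(c, w, b), 1) for c in retained_conf]
    return sum(losses) / len(losses)

# Made-up confidence scores, for illustration only.
members = [0.97, 0.95, 0.99, 0.93]
nonmembers = [0.55, 0.62, 0.48, 0.70]
w, b = train_discriminator(members, nonmembers)

# Before unlearning: the model is still confident on the OOD samples.
before = generator_loss(ood_conf=[0.96, 0.94], retained_conf=[0.98, 0.95], w=w, b=b)
# After unlearning: OOD confidences resemble those of unseen data.
after = generator_loss(ood_conf=[0.58, 0.60], retained_conf=[0.98, 0.95], w=w, b=b)
print(f"generator loss before: {before:.3f}, after: {after:.3f}")
```

In this toy setting the generator loss falls once the OOD confidence scores look like those of non-members while the retained samples keep high confidence, which is the behaviour the abstract attributes to the updated target model. A real implementation would backpropagate this loss into the target model's parameters and alternate it with discriminator updates.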

Acknowledgements

This work was supported by the National Key Research and Development Program of China (Grant No. 2022YFB3103500), the National Natural Science Foundation of China (Grant Nos. U21A20464, 61872283), the Natural Science Basic Research Program of Shaanxi (Grant No. 2021JC-22), the Key Research and Development Program of Shaanxi (Grant No. 2022GY029), and the China 111 Project (Grant No. B16037).

Author information

Corresponding authors

Correspondence to Yilong Yang or Yang Liu.

About this article

Cite this article

Ma, Z., Yang, Y., Liu, Y. et al. Mitigate noisy data for smart IoT via GAN based machine unlearning. Sci. China Inf. Sci. 67, 132104 (2024). https://doi.org/10.1007/s11432-022-3671-9

  • DOI: https://doi.org/10.1007/s11432-022-3671-9
