Regional filtering distillation for object detection

Wu, Pingfan; Zhang, Jiayu; Sun, Han; Liu, Ningzhong

doi:10.1007/s00138-023-01503-1

Original Paper
Published: 31 January 2024

Machine Vision and Applications, volume 35, article number 24 (2024)

Pingfan Wu^1,2,3, Jiayu Zhang^1, Han Sun^1 & Ningzhong Liu^1,2,3 (ORCID: orcid.org/0000-0002-6735-7130)


Abstract

Knowledge distillation is a common and effective model compression method in which a compact student model is trained to mimic a large teacher model and thereby generalize better. Previous knowledge distillation methods underperform on challenging tasks such as object detection compared with simpler classification tasks. In this paper, we argue that the failure of knowledge distillation on object detection is mainly caused by the imbalance between informative features and invalid background features. Not all background is redundant: the background that remains after screening carries relations between foreground and background. Therefore, we propose a novel regional filtering distillation (RFD) algorithm that addresses this problem with two modules: region selection and attention-guided distillation. Region selection first filters out the large amount of invalid background and retains knowledge-dense regions at anchor locations near objects. Attention-guided distillation further improves distillation performance on object detection by extracting the relations between foreground and background to transfer key features. Extensive experiments on both one-stage and two-stage detectors demonstrate the effectiveness of RFD. For example, RFD improves mAP by 2.8% and 2.6% for the ResNet50-RetinaNet and ResNet50-FPN student networks on the MS COCO dataset, respectively. We also evaluate our method with the Faster R-CNN model on the Pascal VOC and KITTI benchmarks, where it yields mAP gains of 1.52% and 4.36% for the ResNet18-FPN student network, respectively. Furthermore, our method increases mAP by 5.70% for MobileNetv2-SSD compared with the original model. Overall, the proposed RFD technique performs strongly on detection tasks. In the future, we plan to extend it to more challenging task scenarios, such as segmentation.
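The two modules described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the selection rule or the attention formula, so the code below makes its own assumptions. Region selection is approximated by keeping feature-map cells whose centre lies inside a ground-truth box dilated by a hypothetical `margin` (in cells), and attention is taken as a softmax over the teacher's mean absolute channel activation within the kept region. All function and parameter names are illustrative.

```python
import math


def region_mask(height, width, stride, boxes, margin=1.0):
    """Keep cells whose centre lies inside a ground-truth box dilated by
    `margin` cells. Assumption: a stand-in for anchor-based selection."""
    mask = [[0.0] * width for _ in range(height)]
    pad = margin * stride
    for x1, y1, x2, y2 in boxes:
        for i in range(height):
            for j in range(width):
                cx, cy = (j + 0.5) * stride, (i + 0.5) * stride
                if x1 - pad <= cx <= x2 + pad and y1 - pad <= cy <= y2 + pad:
                    mask[i][j] = 1.0
    return mask


def spatial_attention(feat, mask, temperature=1.0):
    """Softmax, over kept cells only, of the mean absolute channel
    activation. `feat` is indexed as feat[channel][row][col]."""
    channels = len(feat)
    energy = {}
    for i, row in enumerate(mask):
        for j, keep in enumerate(row):
            if keep > 0:
                total = sum(abs(feat[c][i][j]) for c in range(channels))
                energy[(i, j)] = total / channels
    if not energy:
        return {}
    peak = max(energy.values())  # subtract the max for numerical stability
    exps = {k: math.exp((v - peak) / temperature) for k, v in energy.items()}
    norm = sum(exps.values())
    return {k: v / norm for k, v in exps.items()}


def masked_attention_loss(teacher, student, mask, temperature=1.0):
    """Attention-weighted mean squared error between teacher and student
    features, restricted to the cells kept by `mask`."""
    attention = spatial_attention(teacher, mask, temperature)
    channels = len(teacher)
    loss = 0.0
    for (i, j), weight in attention.items():
        sq_err = sum(
            (teacher[c][i][j] - student[c][i][j]) ** 2 for c in range(channels)
        )
        loss += weight * sq_err / channels
    return loss
```

In this sketch, cells outside the mask contribute nothing to the loss, so the student is not penalized for differing from the teacher on filtered background, while the dilation margin lets some background context around each object remain in the distillation target. In practice the loss would be added to the detector's usual training loss with a weighting factor, which is omitted here.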

Acknowledgements

This work was supported by the Natural Science Foundation of Jiangsu Province of China (BK20222012), the Guangxi Science and Technology Project (AB22080026/2021AB22167), and the National Natural Science Foundation of China (No. 61375021).

Author information

Authors and Affiliations

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Jiangsu, 211106, Nanjing, China
Pingfan Wu, Jiayu Zhang, Han Sun & Ningzhong Liu
MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Jiangsu, 211106, Nanjing, China
Pingfan Wu & Ningzhong Liu
Collaborative Innovation Center of Novel Software Technology and Industrialization, Jiangsu, 211106, Nanjing, China
Pingfan Wu & Ningzhong Liu

Corresponding author

Correspondence to Ningzhong Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, P., Zhang, J., Sun, H. et al. Regional filtering distillation for object detection. Machine Vision and Applications 35, 24 (2024). https://doi.org/10.1007/s00138-023-01503-1

Received: 30 March 2023
Revised: 15 September 2023
Accepted: 18 December 2023
Published: 31 January 2024
DOI: https://doi.org/10.1007/s00138-023-01503-1
