Abstract
Various neuron approximations can be used to reduce the computational complexity of neural networks. One such approximation based on summation and maximum operations is a bipolar morphological neuron. This paper presents an improved structure of the bipolar morphological neuron that enhances its computational efficiency and a new approach to training based on continuous approximations of the maximum and knowledge distillation. Experiments were carried out on the MNIST dataset using a LeNet-like neural network architecture and on the CIFAR10 dataset using a ResNet-22 model architecture. The proposed training method achieves 99.45% classification accuracy on the LeNet-like model (the same accuracy as that provided by the classical network) and 86.69% accuracy on the ResNet-22 model compared with 86.43% accuracy of the classical model. The results show that the proposed method with log-sum-exp (LSE) approximation of the maximum and layer-by-layer knowledge distillation makes it possible to obtain a simplified bipolar morphological network that is not inferior to the classical networks.
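The bipolar morphological (BM) neuron replaces multiplications with additions and maxima in the logarithmic domain, and the training method substitutes a smooth log-sum-exp (LSE) function for the hard maximum. Below is a minimal NumPy sketch of this idea; the function names, the sign-splitting layout of the four branches, the smoothness parameter `alpha`, and the `eps` regularizer are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def lse_max(z, alpha=10.0, axis=-1):
    """Smooth maximum via log-sum-exp: (1/alpha) * log(sum(exp(alpha * z))).
    Approaches the hard maximum as alpha grows."""
    m = np.max(z, axis=axis, keepdims=True)  # shift for numerical stability
    s = np.sum(np.exp(alpha * (z - m)), axis=axis, keepdims=True)
    return (m + np.log(s) / alpha).squeeze(axis)

def bm_neuron(x, w, alpha=10.0, eps=1e-12):
    """Sketch of a bipolar morphological neuron with the maximum replaced
    by its LSE approximation (layout and eps are illustrative)."""
    # split inputs and weights into non-negative parts
    xp, xn = np.maximum(x, 0), np.maximum(-x, 0)
    wp, wn = np.maximum(w, 0), np.maximum(-w, 0)
    # move to the log domain; eps keeps log() finite at zero
    lxp, lxn = np.log(xp + eps), np.log(xn + eps)
    lwp, lwn = np.log(wp + eps), np.log(wn + eps)
    # four summation-and-maximum branches: same-sign pairs contribute to
    # the positive part of the output, mixed-sign pairs to the negative part
    pos = np.exp(lse_max(lxp + lwp, alpha)) + np.exp(lse_max(lxn + lwn, alpha))
    neg = np.exp(lse_max(lxp + lwn, alpha)) + np.exp(lse_max(lxn + lwp, alpha))
    return pos - neg
```

For large `alpha` the sketch approaches the hard-maximum form, e.g. `bm_neuron(np.array([1.0, 2.0]), np.array([3.0, -1.0]))` is close to `max(1*3, 0) - max(0, 2*1) = 1`; during training a moderate `alpha` keeps the approximation differentiable everywhere.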
Funding
This work was supported by ongoing institutional funding. No additional grants to carry out or direct this particular research were obtained.
Ethics declarations
The authors declare that they have no conflicts of interest.
Additional information
Translated by A. Klimontovich
Publisher’s Note.
Pleiades Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Zingerenko, M.V., Limonova, E.E. Layer-by-Layer Knowledge Distillation for Training Simplified Bipolar Morphological Neural Networks. Program Comput Soft 49 (Suppl 2), S108–S114 (2023). https://doi.org/10.1134/S0361768823100080