Loss Function for Training Models of Segmentation of Document Images

Perminov, A. I.; Turdakov, D. Yu.; Belyaeva, O. V.

doi:10.1134/S0361768823070058

Published: 07 December 2023

Volume 49, pages 574–589, (2023)

Programming and Computer Software

Abstract

This work aims to improve how well neural network models segment images of scientific papers and legal acts by training them with modified loss functions that take into account the specific features of images in this subject domain. Existing loss functions are analyzed, and new functions are proposed that work with the coordinates of bounding boxes and also use information about the pixels of the input image. To assess quality, a neural network segmentation model is trained with the modified loss functions, and a theoretical assessment is carried out through a simulation experiment that shows the convergence rate and segmentation error. The study yields rapidly converging loss functions that improve the quality of document image segmentation by using additional information about the input data.
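
For readers unfamiliar with coordinate-based losses, the sketch below shows two widely used losses from the family of existing functions the abstract refers to: the plain IoU loss and the generalized IoU (GIoU) loss for axis-aligned bounding boxes. It is not the loss proposed in this article, which is not given in the preview. The function names and the (x1, y1, x2, y2) box layout are assumptions made for illustration.

```python
# Illustrative sketch only: standard IoU and GIoU losses for axis-aligned boxes.
# Assumption: each box is a tuple (x1, y1, x2, y2) with x1 < x2 and y1 < y2.

def box_area(box):
    """Area of a single box."""
    x1, y1, x2, y2 = box
    return (x2 - x1) * (y2 - y1)


def iou(pred, target):
    """Intersection over union of two boxes, in [0, 1]."""
    inter_w = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    inter_h = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = inter_w * inter_h
    union = box_area(pred) + box_area(target) - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """1 - IoU: zero for a perfect match, constant (1) for disjoint boxes."""
    return 1.0 - iou(pred, target)


def giou_loss(pred, target):
    """1 - GIoU: also penalizes the empty space in the smallest enclosing box,
    so disjoint boxes that are farther apart receive a larger loss."""
    inter_w = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    inter_h = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = inter_w * inter_h
    union = box_area(pred) + box_area(target) - inter
    # Smallest axis-aligned box that encloses both inputs.
    enclose = (
        (max(pred[2], target[2]) - min(pred[0], target[0]))
        * (max(pred[3], target[3]) - min(pred[1], target[1]))
    )
    giou = inter / union - (enclose - union) / enclose
    return 1.0 - giou


if __name__ == "__main__":
    a, b = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    print(iou_loss(a, b), giou_loss(a, b))
```

The plain IoU loss is the same for every pair of non-overlapping boxes, which gives a training signal that does not depend on how far apart the boxes are. GIoU and related variants address this, and that is one reason convergence rate is a natural criterion when comparing such losses.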

Author information

Authors and Affiliations

Ivannikov Institute for System Programming, Russian Academy of Sciences, 109004, Moscow, Russia
A. I. Perminov, D. Yu. Turdakov & O. V. Belyaeva
Moscow State University, 119991, Moscow, Russia
A. I. Perminov & D. Yu. Turdakov

Corresponding authors

Correspondence to A. I. Perminov, D. Yu. Turdakov or O. V. Belyaeva.

Ethics declarations

The authors declare that they have no conflicts of interest.

Additional information

Translated by A. Klimontovich

About this article

Cite this article

Perminov, A.I., Turdakov, D.Y. & Belyaeva, O.V. Loss Function for Training Models of Segmentation of Document Images. Program Comput Soft 49, 574–589 (2023). https://doi.org/10.1134/S0361768823070058

Received: 10 January 2023
Revised: 18 January 2023
Accepted: 13 February 2023
Published: 07 December 2023
Issue Date: December 2023
DOI: https://doi.org/10.1134/S0361768823070058
