Loss Function for Training Models of Segmentation of Document Images

Programming and Computer Software

Abstract

This work aims to improve the quality of neural-network segmentation of images of scientific papers and legal acts by training the models with modified loss functions that take into account features specific to document images in this subject domain. Existing loss functions are analyzed, and new ones are proposed that work with the coordinates of bounding boxes and also use information about the pixels of the input image. To assess quality, a neural network segmentation model is trained with the modified loss functions, and a theoretical assessment is carried out in a simulation experiment that shows the convergence rate and segmentation error. The result of the study is a set of rapidly converging loss functions that improve the quality of document image segmentation by using additional information about the input data.
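
For context, losses that work with bounding-box coordinates are commonly built on intersection over union (IoU). The sketch below is a minimal plain-Python illustration of a baseline IoU loss, not the modified functions proposed in the paper. The names `Box`, `box_area`, `iou`, and `iou_loss`, as well as the `(x1, y1, x2, y2)` corner convention, are assumptions made for this example.

```python
from typing import Tuple

# Assumed convention: a box is (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero area."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(pred: Box, target: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be inverted if the boxes are disjoint).
    ix1 = max(pred[0], target[0])
    iy1 = max(pred[1], target[1])
    ix2 = min(pred[2], target[2])
    iy2 = min(pred[3], target[3])
    inter = box_area((ix1, iy1, ix2, iy2))
    union = box_area(pred) + box_area(target) - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred: Box, target: Box) -> float:
    """Baseline IoU loss: 0 for a perfect match, 1 for disjoint boxes."""
    return 1.0 - iou(pred, target)
```

Note that this baseline loss stays at 1 for every pair of disjoint boxes, so it gives no signal about how far apart they are. That limitation is one reason modified bounding-box losses are studied.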

Author information

Corresponding authors

Correspondence to A. I. Perminov, D. Yu. Turdakov or O. V. Belyaeva.

Ethics declarations

The authors declare that they have no conflicts of interest.

Additional information

Translated by A. Klimontovich

About this article

Cite this article

Perminov, A.I., Turdakov, D.Y. & Belyaeva, O.V. Loss Function for Training Models of Segmentation of Document Images. Program Comput Soft 49, 574–589 (2023). https://doi.org/10.1134/S0361768823070058

  • DOI: https://doi.org/10.1134/S0361768823070058
