TableStrRec: framework for table structure recognition in data sheet images

Fernandes, Johan; Xiao, Bin; Simsek, Murat; Kantarci, Burak; Khan, Shahzad; Alkheir, Ala Abu

doi:10.1007/s10032-023-00453-8

Johan Fernandes¹, Bin Xiao¹, Murat Simsek¹, Burak Kantarci¹ (ORCID: orcid.org/0000-0003-0220-7956), Shahzad Khan² & Ala Abu Alkheir³

Abstract

Billions of documents in data sheet format are shared daily between organizations across the globe. The essential information in these documents is presented in tabular format, and extracting and assimilating it can help organizations make data-driven decisions. Solutions for detecting tables in document images have been well explored. In this work, we therefore propose TableStrRec, a deep learning-based approach that recognizes the structure of such detected tables by detecting their rows and columns. TableStrRec comprises two Cascade R-CNN architectures, each with a deformable backbone and Complete IoU (CIoU) loss to improve detection performance. The first architecture detects rows and classifies them as regular rows (rows without a merged cell) or irregular rows (groups of regular rows that share a merged cell). The second architecture detects columns and classifies them as regular columns (columns without a merged cell) or irregular columns (groups of regular columns that share a merged cell). Both architectures work in parallel to provide the results in a single inference. We show that using TableStrRec to detect these four classes of objects improves table structure recognition performance on three public test sets. On the ICDAR2013 test set, we achieve weighted average F1 scores of 90.5% for rows and 89.6% for columns. On the TabStructDB test set, we achieve weighted average F1 scores of 72.7% for rows and 78.5% for columns. We also evaluate the proposed method on the FinTabNet dataset using the structure-only TEDS score and achieve 98.34%, which outperforms most state-of-the-art benchmark models.
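The abstract names Complete IoU loss as the bounding-box regression loss in both detectors. As a minimal illustration, the sketch below computes that loss for a single pair of axis-aligned boxes, following the commonly published CIoU formulation (an IoU term, a normalized center-distance term, and an aspect-ratio term). It is not the authors' training code: the function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made here for illustration.

```python
import math


def ciou_loss(pred, target, eps=1e-9):
    """Complete IoU loss for two boxes in (x1, y1, x2, y2) format (assumed)."""
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Squared distance between box centers.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    rho2 = dx * dx + dy * dy

    # Squared diagonal of the smallest box enclosing both boxes.
    enc_w = max(px2, tx2) - min(px1, tx1)
    enc_h = max(py2, ty2) - min(py1, ty1)
    c2 = enc_w * enc_w + enc_h * enc_h + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan(tw / (th + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    # A wide, thin "row-like" ground-truth box and a slightly shifted prediction.
    ground_truth = (0.0, 10.0, 200.0, 30.0)
    prediction = (5.0, 12.0, 195.0, 34.0)
    print(ciou_loss(prediction, ground_truth))
```

Unlike a plain IoU loss, the center-distance term still gives a useful signal when the boxes do not overlap at all, and the aspect-ratio term penalizes predictions whose shape differs from the target, which plausibly matters for the long, thin boxes that rows and columns produce.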

Author information

Authors and Affiliations

School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, ON, K1N 6N5, Canada
Johan Fernandes, Bin Xiao, Murat Simsek & Burak Kantarci
Gnowit, 383 Celtic Ridge Cres, Ottawa, ON, K2W 0B5, Canada
Shahzad Khan
Lytica, 555 Legget Dr, Ottawa, ON, K2K 2X3, Canada
Ala Abu Alkheir

Corresponding author

Correspondence to Burak Kantarci.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fernandes, J., Xiao, B., Simsek, M. et al. TableStrRec: framework for table structure recognition in data sheet images. IJDAR (2023). https://doi.org/10.1007/s10032-023-00453-8

Received: 24 November 2021
Revised: 11 August 2023
Accepted: 16 August 2023
Published: 08 September 2023
DOI: https://doi.org/10.1007/s10032-023-00453-8
