DWT-CompCNN: deep image classification network for high throughput JPEG 2000 compressed documents

Bisen, Tejasvee; Javed, Mohammed; Kirtania, Shashank; Nagabhushan, P.

doi:10.1007/s10044-023-01190-8

Theoretical Advances
Published: 02 August 2023

Pattern Analysis and Applications, Volume 26, pages 1641–1655 (2023)

Tejasvee Bisen ORCID: orcid.org/0000-0002-3156-5536¹,
Mohammed Javed¹,
Shashank Kirtania² &
…
P. Nagabhushan¹

Abstract

Document image classification is an essential stage in any digital application that works with document images, such as retrieval. Conventionally, the input dataset consists of the full, uncompressed versions of the document images, which is a drawback because of the large storage volume those versions require. A novel alternative is therefore to accomplish the same classification task directly on the compressed representation of the documents, with only partial decompression, so that the whole process becomes computationally more efficient. In this work, a novel deep learning model, DWT-CompCNN, is proposed for classifying documents compressed with the High Throughput JPEG 2000 (HTJ2K) algorithm. The proposed DWT-CompCNN comprises five convolutional layers with filter sizes of 16, 32, 64, 128, and 256, increasing from one layer to the next, to improve learning from the wavelet coefficients extracted from the compressed images. Experiments on two benchmark datasets, Tobacco-3482 and RVL-CDIP, demonstrate that the proposed model is time and space efficient and also achieves better classification accuracy in the compressed domain.
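
The abstract does not say how the wavelet coefficients are arranged before they reach the network, so the following is only a minimal sketch of what one level of a 2-D discrete wavelet transform (DWT) produces. It assumes the Haar wavelet purely for brevity, whereas JPEG 2000 and HTJ2K use the 5/3 and 9/7 wavelets, and the names haar_dwt_1d and haar_dwt_2d are hypothetical rather than taken from the paper.

```python
# Illustrative only: one level of a 2-D Haar DWT in plain Python.
# Assumption: Haar is a stand-in; JPEG 2000 / HTJ2K use the 5/3 or 9/7 wavelets.


def haar_dwt_1d(signal):
    """Split an even-length sequence into pairwise averages and differences."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def _filter_columns(block):
    """Apply the 1-D transform down every column of a row-major block."""
    lows, highs = [], []
    for column in zip(*block):
        approx, detail = haar_dwt_1d(list(column))
        lows.append(approx)
        highs.append(detail)
    # Each entry of lows/highs holds one column, so transpose back to rows.
    return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]


def haar_dwt_2d(image):
    """Return the LL, HL, LH and HH subbands of an image with even dimensions."""
    low_rows, high_rows = [], []
    for row in image:
        approx, detail = haar_dwt_1d(row)
        low_rows.append(approx)    # horizontally low-pass
        high_rows.append(detail)   # horizontally high-pass
    ll, lh = _filter_columns(low_rows)
    hl, hh = _filter_columns(high_rows)
    return ll, hl, lh, hh


# A 4x4 toy "page" with a vertical edge between the first and second column.
image = [[10, 50, 50, 50] for _ in range(4)]
ll, hl, lh, hh = haar_dwt_2d(image)

print(ll)  # [[30.0, 50.0], [30.0, 50.0]]   half-resolution approximation
print(hl)  # [[-20.0, 0.0], [-20.0, 0.0]]   the vertical edge shows up here
print(lh)  # [[0.0, 0.0], [0.0, 0.0]]
print(hh)  # [[0.0, 0.0], [0.0, 0.0]]
```

Each subband has half the width and height of the input, which is one reason a classifier that reads wavelet coefficients directly can be cheaper than one that first reconstructs the full image. Whether DWT-CompCNN consumes all four subbands or only some of them is not stated in the abstract.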

Availability of data and materials

Data set and materials will be made available on request.

Funding

No funding was received for conducting this study.

Author information

Authors and Affiliations

Computer Vision & Biometrics Lab, Department of Information Technology, IIIT Allahabad, Prayagraj, U.P., India
Tejasvee Bisen, Mohammed Javed & P. Nagabhushan
Department of Computer Engineering, Thapar Institute of Engineering and Technology, Patiala, Punjab, India
Shashank Kirtania

Corresponding author

Correspondence to Tejasvee Bisen.

Ethics declarations

Conflict of interest

The authors have no relevant financial or nonfinancial interests to disclose.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

All authors have agreed with the content and give explicit consent to submit the work. All authors have obtained consent from the responsible authorities at the institute/organization where the work has been carried out.

Ethical conduct

The submitted work is original and has not been published or submitted elsewhere in any form or language.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bisen, T., Javed, M., Kirtania, S. et al. DWT-CompCNN: deep image classification network for high throughput JPEG 2000 compressed documents. Pattern Anal Applic 26, 1641–1655 (2023). https://doi.org/10.1007/s10044-023-01190-8

Received: 03 July 2022
Accepted: 14 July 2023
Published: 02 August 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s10044-023-01190-8
