DWT-CompCNN: deep image classification network for high throughput JPEG 2000 compressed documents

  • Theoretical Advances
Pattern Analysis and Applications

Abstract

Classifying document images is an essential stage in any digital application that works with them, such as retrieval. Conventionally, the input dataset consists of the full, uncompressed document images, which is costly because of the large volume of storage they require. It would therefore be advantageous to perform the same classification task directly on the compressed representation of the documents, with only partial decompression, making the whole process computationally more efficient. This work proposes a novel deep learning model, DWT-CompCNN, for classifying documents compressed with the High Throughput JPEG 2000 (HTJ2K) algorithm. DWT-CompCNN comprises five convolutional layers with 16, 32, 64, 128, and 256 filters, respectively, to improve learning from the wavelet coefficients extracted from the compressed images. Experiments on two benchmark datasets, Tobacco-3482 and RVL-CDIP, demonstrate that the proposed model is time and space efficient and also achieves better classification accuracy in the compressed domain.
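The two ideas in the abstract, wavelet sub-bands as network input and a five-layer convolutional stack, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a Haar transform stands in for the wavelet filters that JPEG 2000 actually uses (the reversible 5/3 and irreversible 9/7 filter banks), the values 16 to 256 are read as per-layer filter counts, and the 3x3 kernel and four stacked input channels are hypothetical choices. The function names `haar_dwt2` and `conv_param_counts` are illustrative only.

```python
# Illustrative sketch only. Haar stands in for the JPEG 2000 wavelet filters;
# the 3x3 kernel and 4 input channels are assumptions, not the paper's values.

FILTERS = [16, 32, 64, 128, 256]  # per-layer values listed in the abstract


def haar_dwt2(image):
    """Single-level 2D Haar transform (averaging variant) of a 2D list.

    Returns four half-resolution sub-bands:
    (approximation, column difference, row difference, diagonal difference).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must both be even")
    approx, col_diff, row_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # Each 2x2 block is laid out as:  p q
            #                                 s t
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 4)  # local average
            c_row.append((p - q + s - t) / 4)  # left column minus right column
            r_row.append((p + q - s - t) / 4)  # top row minus bottom row
            d_row.append((p - q - s + t) / 4)  # diagonal difference
        approx.append(a_row)
        col_diff.append(c_row)
        row_diff.append(r_row)
        diag.append(d_row)
    return approx, col_diff, row_diff, diag


def conv_param_counts(in_channels, filters=FILTERS, kernel=3):
    """Weights plus biases for each layer of a plain stack of 2D convolutions."""
    counts = []
    for out_channels in filters:
        counts.append(kernel * kernel * in_channels * out_channels + out_channels)
        in_channels = out_channels  # the next layer consumes this layer's output
    return counts
```

Calling `haar_dwt2` on an 8x8 image yields four 4x4 sub-bands, which could be stacked as four input channels. Under the assumptions above, `sum(conv_param_counts(4))` comes to 392,752 convolutional parameters; this is an illustration of how the layer widths scale, not a figure reported by the authors.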

Availability of data and materials

The dataset and materials will be made available on request.

Funding

No funding was received for conducting this study.

Author information

Corresponding author

Correspondence to Tejasvee Bisen.

Ethics declarations

Conflict of interest

The authors have no relevant financial or nonfinancial interests to disclose.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

All authors have agreed with the content and given explicit consent to submit the work. All authors have obtained consent from the responsible authorities at the institute/organization where the work was carried out.

Ethical conduct

The submitted work is original and has not been published or submitted elsewhere in any form or language.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bisen, T., Javed, M., Kirtania, S. et al. DWT-CompCNN: deep image classification network for high throughput JPEG 2000 compressed documents. Pattern Anal Applic 26, 1641–1655 (2023). https://doi.org/10.1007/s10044-023-01190-8

  • DOI: https://doi.org/10.1007/s10044-023-01190-8
