research-article
Just Accepted

Complexity Analysis of Chinese Text Based on the Construction Grammar Theory and Deep Learning

Online AM: 10 April 2024

Abstract

Because Chinese is complex and differs structurally from English, processing Chinese text in digital applications is difficult. This paper analyzes the complexity of Chinese text, taking Chinese text in Open Relation Extraction (ORE) as the research object. A word-vector extraction system based on construction grammar theory and Deep Learning (DL) is constructed to achieve smooth extraction of Chinese text. The work has three parts. First, the connotation of construction grammar and its role in Chinese text analysis are explored. Second, from the perspective of word vectors in language analysis, an ORE model based on word vectors is implemented. Third, an extraction method based on the distance between word vectors is proposed. Test results show that the proposed algorithm achieves an F1 value of 67% on the public WEB-500 and NYT-500 datasets, which is superior to other similar text extraction algorithms. When the recall rate exceeds 30%, the accuracy of the proposed method is higher than that of several other recent language analysis systems. These results indicate that the proposed Chinese text extraction system, based on the DL algorithm and construction grammar theory, has advantages in complexity analysis and can provide a new research direction for Chinese text analysis.
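As a rough illustration of what a distance-based comparison of word vectors can look like, the sketch below picks the relation label whose vector lies closest to a candidate phrase vector. The abstract does not say which distance measure the paper uses, so cosine distance is an assumption here, and the function names, relation labels, and three-dimensional toy vectors are all hypothetical rather than taken from the paper.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def closest_relation(candidate, relation_vectors):
    """Return the label whose vector has the smallest distance to candidate."""
    return min(
        relation_vectors,
        key=lambda label: cosine_distance(candidate, relation_vectors[label]),
    )


# Hypothetical toy vectors; real word vectors would come from a trained model.
relation_vectors = {
    "located_in": [1.0, 0.0, 0.0],
    "born_in": [0.0, 1.0, 0.0],
}
candidate_phrase = [0.9, 0.1, 0.0]

best_label = closest_relation(candidate_phrase, relation_vectors)
```

Cosine distance depends only on the angle between vectors, which is one common reason to prefer it over Euclidean distance when vector magnitudes vary with word frequency; whether the paper makes the same choice cannot be determined from the abstract.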

  • Published in

    ACM Transactions on Asian and Low-Resource Language Information Processing, Just Accepted
    ISSN:2375-4699
    EISSN:2375-4702

    Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Online AM: 10 April 2024
    • Accepted: 16 September 2023
    • Revised: 18 August 2023
    • Received: 29 May 2023