Complexity Analysis of Chinese Text Based on the Construction Grammar Theory and Deep Learning

Authors:
Changlin Wu, Northeast Normal University, Changchun 130024, Jilin, China (ORCID: 0009-0000-7745-7269)
Changan Wu, Northeast Normal University, Changchun 130024, Jilin, China (ORCID: 0009-0002-1195-7732)

ACM Transactions on Asian and Low-Resource Language Information Processing
Accepted: September 2023
https://doi.org/10.1145/3625390

Online AM: 10 April 2024

Abstract

Chinese is a complex language that differs substantially from English, and these differences complicate the use of Chinese text in digital applications. This paper analyzes the complexity of Chinese text, taking Chinese text in Open Relation Extraction (ORE) as the research object. A word-vector extraction system based on construction grammar theory and Deep Learning (DL) is constructed to achieve smooth extraction of Chinese text. The work covers three aspects. First, the connotation of construction grammar and its role in Chinese text analysis are explored. Second, from the perspective of word-vector ORE in language analysis, an ORE model based on word vectors is implemented. Third, an extraction method based on the distance between word vectors is proposed. Test results show that the proposed algorithm achieves an F1 value of 67% on the public WEB-500 and NYT-500 datasets, which is superior to other similar text extraction algorithms. When the recall rate exceeds 30%, the accuracy of the proposed method is higher than that of several other recent language analysis systems. These results indicate that the proposed Chinese text extraction system, based on the DL algorithm and construction grammar theory, has advantages in complexity analysis and can provide a new research idea for Chinese text analysis.
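
As a rough illustration of two ideas named in the abstract, the distance between word vectors and the F1 value, the following Python sketch picks a candidate relation word by vector distance and scores extracted triples against a gold set. It is not the authors' system. The three-dimensional vectors, the prototype vector, the function names, and the sample triples are all hypothetical, and cosine distance is an assumed choice of metric, since the abstract does not say which distance is used.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: treat a zero vector as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def pick_relation_word(candidates, prototype, vectors):
    """Pick the candidate word whose vector is closest to the prototype.

    `prototype` is a hypothetical vector standing in for "relation-like"
    words; a real system would learn or derive it from data.
    """
    return min(candidates, key=lambda w: cosine_distance(vectors[w], prototype))


def precision_recall_f1(predicted, gold):
    """Score a set of predicted triples against a set of gold triples."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Toy, hand-made embeddings (hypothetical values, not trained vectors).
    toy_vectors = {
        "创立": [0.9, 0.1, 0.0],  # "to found"
        "在": [0.1, 0.9, 0.1],    # "at / in"
        "的": [0.0, 0.2, 0.9],    # possessive particle
    }
    relation_prototype = [1.0, 0.0, 0.0]
    print(pick_relation_word(["创立", "在", "的"], relation_prototype, toy_vectors))

    predicted = [("A", "founded", "B"), ("C", "located_in", "D")]
    gold = [("A", "founded", "B"), ("E", "founded", "F")]
    print(precision_recall_f1(predicted, gold))
```

The F1 value is the harmonic mean of precision and recall, so a single reported figure such as 67% is compatible with many different precision and recall pairs. This is why the abstract's separate statement about accuracy at recall above 30% adds information beyond the F1 value.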

Published in

ACM Transactions on Asian and Low-Resource Language Information Processing Just Accepted
ISSN: 2375-4699
EISSN: 2375-4702

Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Online AM: 10 April 2024
- Accepted: 16 September 2023
- Revised: 18 August 2023
- Received: 29 May 2023
Author Tags
deep learning
Word2Vec
complexity analysis of Chinese text
construction grammar
word vector
open domain relation extraction
Qualifiers
- research-article