Abstract
Because Chinese is structurally complex and differs substantially from English, processing Chinese text in the digital field is difficult. This paper takes Chinese text in Open Relation Extraction (ORE) as its research object and analyzes the complexity of that text. It constructs a word-vector extraction system based on construction grammar theory and Deep Learning (DL) to achieve smooth extraction from Chinese text. To study how DL applies to the complexity analysis of Chinese text based on construction grammar, the paper first explores the connotation of construction grammar and its role in Chinese text analysis. Second, from the perspective of word-vector ORE in language analysis, it implements an ORE model based on word vectors. Third, it proposes an extraction method based on the distance between word vectors. The test results show that the proposed algorithm reaches an F1 value of 67% on the public WEB-500 and NYT-500 datasets, which is superior to other similar text extraction algorithms. When the recall rate exceeds 30%, the accuracy of the proposed method is higher than that of several other recent language analysis systems. These results indicate that the proposed Chinese text extraction system, based on the DL algorithm and construction grammar theory, has advantages in complexity analysis and can provide a new research idea for Chinese text analysis.
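For readers unfamiliar with the two quantities named above, the following is a minimal sketch of a distance between word vectors and of the F1 value. The abstract does not say which distance measure the paper uses, so cosine distance here is an assumption, and the function names and toy vectors are hypothetical illustrations rather than the authors' implementation.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors.

    Assumption: cosine distance stands in for the unspecified
    "distance of word vectors" mentioned in the abstract.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy vectors, not taken from the paper.
vec_a = [1.0, 0.0]
vec_b = [0.0, 1.0]
print(cosine_distance(vec_a, vec_b))  # orthogonal vectors give distance 1.0
print(f1_score(0.5, 0.5))             # equal precision and recall give F1 = 0.5
```

A smaller distance means the two words point in more similar directions in the embedding space, and F1 rewards a system only when precision and recall are both high.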