A NLP-based stylometric approach for tracking the evolution of L1 written language competence
Journal of Writing Research, Pub Date: 2021-05-01, DOI: 10.17239/jowr-2021.13.01.03
Alessio Miaschi, Dominique Brunato, Felice Dell'Orletta

In this study we present a Natural Language Processing (NLP)-based stylometric approach for tracking the evolution of written language competence in Italian L1 learners. The approach relies on a wide set of linguistically motivated features capturing stylistic aspects of a text, extracted from students’ essays in CItA (Corpus Italiano di Apprendenti L1), the first longitudinal corpus of texts written by Italian L1 learners enrolled in the first and second year of lower secondary school. We address the problem of modeling written language development as a supervised classification task that consists of predicting the chronological order of essays written by the same student at different temporal spans. The promising results obtained in several classification scenarios allow us to conclude that it is possible to automatically model the highly relevant changes affecting written language evolution over time, and to identify which features are most predictive of this process. In the last part of the article, we focus on the possible influence of background variables on language learning and present preliminary results of a pilot study aimed at understanding how the observed developmental patterns are affected by information about the student’s school environment.
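
The classification task described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two features (average sentence length and type-token ratio), the toy values, the function names, and the use of a simple perceptron are all assumptions made for illustration. Each training example is the feature difference between two essays by the same student, labeled by which essay was written first.

```python
import random

# Hypothetical toy corpus: student id -> list of (time_index, feature_vector).
# The features [average sentence length, type-token ratio] and their values
# are invented for illustration; they are not taken from CItA.
toy_corpus = {
    "s1": [(0, [12.0, 0.40]), (1, [15.0, 0.48]), (2, [17.5, 0.55])],
    "s2": [(0, [10.5, 0.35]), (1, [13.0, 0.44])],
}

def make_pairs(essays_by_student):
    """Build (feature_difference, label) examples from same-student essay pairs.

    The label is +1 when the first essay of the pair was written earlier
    than the second, and -1 otherwise.
    """
    examples = []
    for essays in essays_by_student.values():
        for t_a, x_a in essays:
            for t_b, x_b in essays:
                if t_a == t_b:
                    continue  # skip an essay paired with itself
                diff = [b - a for a, b in zip(x_a, x_b)]
                examples.append((diff, 1 if t_a < t_b else -1))
    return examples

def train_perceptron(examples, epochs=20, seed=0):
    """Train a bias-free perceptron; a stand-in for any linear classifier."""
    rng = random.Random(seed)
    weights = [0.0] * len(examples[0][0])
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            score = sum(w * v for w, v in zip(weights, x))
            if y * score <= 0:  # misclassified (or undecided): update weights
                weights = [w + y * v for w, v in zip(weights, x)]
    return weights

def predict_order(weights, x_a, x_b):
    """Return True if the model predicts essay A was written before essay B."""
    diff = [b - a for a, b in zip(x_a, x_b)]
    return sum(w * v for w, v in zip(weights, diff)) > 0

examples = make_pairs(toy_corpus)
weights = train_perceptron(examples)

# Query the model on two essays from a hypothetical held-out student.
print(predict_order(weights, [11.0, 0.38], [16.0, 0.50]))
```

Working on feature differences keeps the model symmetric: swapping the two essays flips the sign of the input and therefore the prediction. In this sketch the learned weights also give a rough indication of which features drive the ordering decision, which loosely mirrors the abstract's interest in identifying the most predictive features.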

Updated: 2021-05-01