Text Model for the Automatic Scoring of Business Letter Writing
Automatic Control and Computer Sciences, Pub Date: 2024-02-27, DOI: 10.3103/s0146411623070167
D. D. Zafievsky, N. S. Lagutina, O. A. Melnikova, A. Y. Poletaev

Abstract

This article describes a text model designed to automatically score a cohesive text written as a letter on a given topic. The scoring parameters were formulated and formalized as 14 criteria with the help of expert English language teachers. The criteria cover vocabulary (including features of the subject domain), text topic, writing style and format, and logical connection between sentences. The authors developed algorithms that compute the corresponding numerical characteristics using methods and tools for automatic text analysis. The algorithms analyze the composition and structure of sentences and draw on data from specialized dictionaries. The characteristics are aimed at checking business e-mails but can be adapted to other written texts, for example by replacing the dictionaries. A system for automatic text scoring was built on these algorithms. An experiment analyzed the system's output on a corpus of 20 texts previously marked up by English teachers. Automatic and expert scores were compared using heat maps and a two-dimensional UMAP projection of the characteristic text vectors. In most cases there are no significant differences between the scores; moreover, automatic scoring turns out to be more objective. The authors conclude that the model copes with the task successfully and can be used to evaluate texts written by humans. The results will be used for automatic student language profiling. The advantages of the model are good interpretability of the results, credibility, and prospects for further development.
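
The abstract does not give the 14 criteria or the dictionaries themselves, so the following is only a minimal sketch of the general idea: dictionary-based numerical characteristics for a letter, plus a simple comparison of automatic and expert scores. The word lists, the function names, and the mean-absolute-difference comparison are all assumptions made for illustration and are not the authors' method.

```python
import re

# Hypothetical mini-dictionaries (assumptions for illustration only);
# the specialized dictionaries used in the paper are not listed in the abstract.
BUSINESS_TERMS = {"invoice", "meeting", "contract", "delivery", "order"}
LINKING_WORDS = {"however", "therefore", "moreover", "furthermore"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def dictionary_share(tokens, dictionary):
    """Return the fraction of tokens found in the dictionary (0.0 if empty)."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in dictionary) / len(tokens)


def feature_vector(text):
    """Build a two-element characteristic vector for one letter:
    share of business vocabulary and share of linking words."""
    tokens = tokenize(text)
    return [
        dictionary_share(tokens, BUSINESS_TERMS),
        dictionary_share(tokens, LINKING_WORDS),
    ]


def mean_abs_difference(auto_scores, expert_scores):
    """Average absolute gap between automatic and expert scores
    over the same list of criteria."""
    if len(auto_scores) != len(expert_scores):
        raise ValueError("score lists must have the same length")
    pairs = zip(auto_scores, expert_scores)
    return sum(abs(a - e) for a, e in pairs) / len(auto_scores)


if __name__ == "__main__":
    letter = "Dear Sir, however the invoice is late. Therefore we cancel the order."
    print(feature_vector(letter))
    print(mean_abs_difference([1, 2, 3], [1, 3, 5]))
```

Swapping `BUSINESS_TERMS` for another word list is the kind of adaptation the abstract mentions when it says the characteristics can be reused for other text types by replacing dictionaries.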




Updated: 2024-02-28