The Role of Typological Feature Prediction in NLP and Linguistics
Computational Linguistics (IF 9.3), Pub Date: 2023-11-20, DOI: 10.1162/coli_a_00498
Johannes Bjerva

Computational typology has gained traction in Natural Language Processing (NLP) in recent years, as evidenced by the growing number of papers on the topic and the establishment of a dedicated Special Interest Group (SIGTYP), along with its successful workshops and shared tasks. A considerable amount of work in this sub-field is concerned with predicting typological features, e.g., for databases such as the World Atlas of Language Structures (WALS) or Grambank. Prediction is argued to be useful for two reasons: (1) it allows feature values to be obtained for relatively undocumented languages, alleviating the sparseness of WALS, which is in turn argued to be useful for both NLP and linguistics; and (2) it allows us to probe models to see whether these typological features are encapsulated in, e.g., language representations. In this article, we take a critical stance on the prediction of typological features, investigating to what extent this line of research is aligned with purported needs, both from the perspective of NLP practitioners and, perhaps more importantly, from the perspective of linguists specialized in typology and language documentation. We provide evidence that this line of research in its current state suffers from a lack of interdisciplinary alignment. Based on an extensive survey of the linguistic typology community, we present concrete recommendations for future research to improve the alignment between linguists and NLP researchers, beyond the scope of typological feature prediction.

Updated: 2023-11-20