Modeling the Sacred: Considerations when Using Religious Texts in Natural Language Processing
arXiv - CS - Computation and Language, Pub Date: 2024-04-23, DOI: arxiv-2404.14740
Ben Hutchinson

This position paper concerns the use of religious texts in Natural Language Processing (NLP), which is of special interest to the Ethics of NLP. Religious texts are expressions of culturally important values, and machine learned models have a propensity to reproduce cultural values encoded in their training data. Furthermore, translations of religious texts are frequently used by NLP researchers when language data is scarce. This repurposes the translations from their original uses and motivations, which often involve attracting new followers. This paper argues that NLP's use of such texts raises considerations that go beyond model biases, including data provenance, cultural contexts, and their use in proselytism. We argue for more consideration of researcher positionality, and of the perspectives of marginalized linguistic and religious communities.

Updated: 2024-04-24