Language Models in Sociological Research: An Application to Classifying Large Administrative Data and Measuring Religiosity
Sociological Methodology (IF 6.118), Pub Date: 2021-10-25, DOI: 10.1177/00811750211053370
Jeffrey L. Jensen 1, Daniel Karell 2, Cole Tanigawa-Lau 3, Nizar Habash 4, Mai Oudah 4, Dhia Fairus Shofia Fani 5

Computational methods have become widespread in the social sciences, but probabilistic language models remain relatively underused. We introduce language models to a general social science readership. First, we offer an accessible explanation of language models, detailing how they estimate the probability of a piece of language, such as a word or sentence, on the basis of the linguistic context. Second, we apply language models in an illustrative analysis to demonstrate the mechanics of using these models in social science research. The example application uses language models to classify names in a large administrative database; the classifications are then used to measure a sociologically important phenomenon: the spatial variation of religiosity. This application highlights several advantages of language models, including their effectiveness in classifying text that contains variation around the base structures, as is often the case with localized naming conventions and dialects. We conclude by discussing language models’ potential to contribute to sociological research beyond classification through their ability to generate language.
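To make the idea of estimating the probability of a string from its linguistic context concrete, the following is a minimal sketch of a character-level bigram language model used as a classifier. It is not the authors' pipeline: the class name `CharBigramLM`, the helper `classify`, the labels `group_a` and `group_b`, the add-one smoothing, and the toy name lists are all assumptions chosen for illustration. One model is trained per class, and a new name is assigned to the class whose model gives it the higher log-probability.

```python
import math
from collections import Counter

BOS, EOS = "^", "$"  # boundary markers, assumed not to occur inside names


class CharBigramLM:
    """Character bigram model with add-one (Laplace) smoothing (illustrative)."""

    def __init__(self, names, alphabet):
        # Possible next symbols: every known character plus the end marker.
        self.vocab = set(alphabet) | {EOS}
        self.bigrams = Counter()   # counts of (previous char, current char)
        self.contexts = Counter()  # counts of previous char
        for name in names:
            chars = [BOS] + list(name.lower()) + [EOS]
            for prev, cur in zip(chars, chars[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def log_prob(self, name):
        """Sum of log P(char | previous char) over the whole name."""
        chars = [BOS] + list(name.lower()) + [EOS]
        total = 0.0
        for prev, cur in zip(chars, chars[1:]):
            num = self.bigrams[(prev, cur)] + 1
            den = self.contexts[prev] + len(self.vocab)
            total += math.log(num / den)
        return total


def classify(name, models):
    """Return the label whose model assigns the name the highest log-probability."""
    return max(models, key=lambda label: models[label].log_prob(name))


# Toy training strings, invented for this sketch (not data from the paper).
training = {
    "group_a": ["marova", "telina", "sarova", "kalina", "denova"],
    "group_b": ["larson", "petsen", "madson", "hansen", "nilson"],
}

# Shared alphabet so both models smooth over the same set of characters.
alphabet = {ch for names in training.values() for name in names for ch in name}
models = {label: CharBigramLM(names, alphabet) for label, names in training.items()}

for unseen in ["melova", "mansen"]:
    print(unseen, classify(unseen, models))
```

Neither "melova" nor "mansen" appears in the toy training lists, yet each is assigned to the group whose character patterns it shares. This mirrors, at a very small scale, the abstract's point that language models can handle variation around a base structure. A real application would need far more data, held-out evaluation, and likely a higher-order or neural model.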




Updated: 2021-10-25