Causality and signalling of garden-path sentences
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences (IF 5), Pub Date: 2024-01-29, DOI: 10.1098/rsta.2023.0013
Daphne Wang, Mehrnoosh Sadrzadeh

Sheaves are mathematical objects that describe the globally compatible data associated with open sets of a topological space. Original examples of sheaves were continuous functions; later they also became powerful tools in algebraic geometry, as well as logic and set theory. More recently, sheaves have been applied to the theory of contextuality in quantum mechanics. Whenever the local data are not necessarily compatible, sheaves are replaced by the simpler setting of presheaves. In previous work, we used presheaves to model lexically ambiguous phrases in natural language and identified the order of their disambiguation. In the work presented here, we model syntactic ambiguities and study a phenomenon in human parsing called garden-pathing. It has been shown that the information-theoretic quantity known as ‘surprisal’ correlates with human reading times in natural language but fails to do so in garden-path sentences. We compute the degree of signalling in our presheaves using probabilities from the large language model BERT and evaluate predictions on two psycholinguistic datasets. Our degree of signalling outperforms surprisal in two ways: (i) it distinguishes between hard and easy garden-path sentences (with a p-value < 10⁻⁵), whereas existing work could not, (ii) its garden-path effect is larger in one of the datasets (32 ms versus 8.75 ms per word), leading to better prediction accuracies. This article is part of the theme issue ‘Quantum contextuality, causality and freedom of choice’.
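To make the two quantities in the abstract more concrete, here is a minimal Python sketch. It assumes the standard definition of surprisal as the negative base-2 logarithm of a word's probability given its context. The second part is only an illustrative proxy for signalling: it measures how much the marginal distribution of a shared observable differs between two contexts. It is not the degree of signalling defined in the paper, and all function names and toy probabilities below are hypothetical, not taken from BERT or from the datasets.

```python
import math


def surprisal(prob: float) -> float:
    """Surprisal in bits: -log2 of a probability in (0, 1]."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def marginal(joint: dict, index: int) -> dict:
    """Marginalise a joint distribution {(a, b): p} onto one coordinate."""
    out = {}
    for outcome, p in joint.items():
        key = outcome[index]
        out[key] = out.get(key, 0.0) + p
    return out


def marginal_gap(m1: dict, m2: dict) -> float:
    """Total variation distance between two marginals (0 means they agree)."""
    keys = set(m1) | set(m2)
    return 0.5 * sum(abs(m1.get(k, 0.0) - m2.get(k, 0.0)) for k in keys)


# Toy joint distributions over readings of a shared word (coordinate 0)
# and one other word, in two different contexts. Values are made up.
context_a = {("noun", "x"): 0.5, ("noun", "y"): 0.25, ("verb", "x"): 0.25}
context_b = {("noun", "z"): 0.25, ("verb", "z"): 0.75}

gap = marginal_gap(marginal(context_a, 0), marginal(context_b, 0))
bits = surprisal(0.25)
```

In this toy case the shared word is read as a noun with probability 0.75 in one context and 0.25 in the other, so the marginals disagree and the gap is non-zero. A gap of zero for every shared observable would correspond to the no-signalling (compatible local data) situation.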


