Language, Common Sense, and the Winograd Schema Challenge
Artificial Intelligence (IF 14.4), Pub Date: 2023-10-06, DOI: 10.1016/j.artint.2023.104031
Jacob Browning, Yann LeCun

Since the 1950s, philosophers and AI researchers have held that disambiguating natural language sentences depends on common sense. In 2011, the Winograd Schema Challenge was established to evaluate the common-sense reasoning abilities of a machine by testing its ability to disambiguate sentences. The designers argued that only a system capable of “thinking in the full-bodied sense” would be able to pass the test. However, by 2021, the original authors conceded that the test had been soundly defeated by large language models (LLMs), which still seem to lack common sense or full-bodied thinking. In this paper, we argue that disambiguating sentences only seemed like a good test of common sense based on a certain picture of the relationship between linguistic comprehension and semantic knowledge, one typically associated with the early computational theory of mind and Symbolic AI. If this picture is rejected, as it is by most LLM researchers, then disambiguation ceases to look like a comprehensive test of common sense and instead appears only to test linguistic competence. The upshot is that any linguistic test, not just disambiguation, is unlikely to tell us much about common sense or genuine intelligence.




Updated: 2023-10-06