CLASSIC Utterance Boundary: A Chunking-Based Model of Early Naturalistic Word Segmentation
Language Learning (IF 5.240), Pub Date: 2023-02-02, DOI: 10.1111/lang.12559
Francesco Cabiddu, Lewis Bott, Gary Jones, Chiara Gambi

Word segmentation is a crucial step in children's vocabulary learning. While computational models of word segmentation can capture infants’ performance in small-scale artificial tasks, the examination of early word segmentation in naturalistic settings has been limited by the lack of measures that can relate models’ performance to developmental data. Here, we extended CLASSIC (Chunking Lexical and Sublexical Sequences in Children; Jones et al., 2021), a corpus-trained chunking model that can simulate several memory, phonological, and vocabulary learning phenomena, so that it can perform word segmentation using utterance boundary information. We named this extended version CLASSIC Utterance Boundary (CLASSIC-UB). Further, we compared our model's performance with that of children on a wide range of new measures, capitalizing on the link between word segmentation and vocabulary learning abilities. We showed that the combination of chunking and utterance-boundary information used by CLASSIC-UB allowed a better prediction of English-learning children's output vocabulary than did other models.

Updated: 2023-02-02