Generalization Measures for Zero-Shot Cross-Lingual Transfer
arXiv - CS - Computation and Language. Pub Date: 2024-04-24, DOI: arxiv-2404.15928
Saksham Bassi, Duygu Ataman, Kyunghyun Cho

A model's capacity to generalize its knowledge to unseen inputs with different characteristics is crucial for building robust and reliable machine learning systems. Language model evaluation tasks lack informative metrics about model generalization, and a model's applicability in a new setting is measured using task- and language-specific downstream performance, which is often unavailable for many languages and tasks. In this paper, we explore a set of efficient and reliable measures that could provide more information about the generalization capability of language models in cross-lingual zero-shot settings. In addition to traditional measures such as variance in parameters after training and distance from initialization, we also measure how effectively sharpness in the loss landscape captures the success of cross-lingual transfer, and we propose a novel and stable algorithm to reliably compute the sharpness of a model optimum that correlates with generalization.
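As a rough illustration of two of the measures named in the abstract, the sketch below computes distance from initialization and a simple random-perturbation estimate of sharpness on toy quadratic losses. This is not the algorithm the paper proposes, which the abstract does not describe. The function names, the perturbation radius, the sample count, and the toy losses are all assumptions made for illustration.

```python
import numpy as np


def distance_from_init(theta_init, theta_final):
    """Euclidean (L2) distance between initial and trained parameter vectors."""
    return float(np.linalg.norm(theta_final - theta_init))


def random_sharpness(loss_fn, theta, radius=0.05, n_samples=100, seed=0):
    """Largest loss increase over random perturbations of norm `radius`.

    A crude sampling-based proxy for sharpness (assumed here for
    illustration; not the paper's proposed algorithm).
    """
    rng = np.random.default_rng(seed)
    base = loss_fn(theta)
    worst = 0.0
    for _ in range(n_samples):
        direction = rng.standard_normal(theta.shape)
        direction *= radius / np.linalg.norm(direction)  # rescale to the sphere
        worst = max(worst, loss_fn(theta + direction) - base)
    return float(worst)


# Hypothetical toy losses, both minimized at the origin.
def flat_loss(theta):
    return 0.5 * float(np.sum(theta ** 2))


def sharp_loss(theta):
    return 50.0 * float(np.sum(theta ** 2))


theta_init = np.ones(4)    # stand-in for pretrained weights
theta_final = np.zeros(4)  # stand-in for fine-tuned weights at the optimum

dist = distance_from_init(theta_init, theta_final)
flat_value = random_sharpness(flat_loss, theta_final)
sharp_value = random_sharpness(sharp_loss, theta_final)
print(dist, flat_value, sharp_value)
```

On these toy losses the steeper quadratic yields the larger sharpness value, which is the qualitative behavior such a measure is meant to capture. For a real language model, `loss_fn` would evaluate the task loss on held-out data, and the perturbation would be applied to the flattened model weights.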

Updated: 2024-04-25