Learning legal text representations via disentangling elements
Expert Systems with Applications ( IF 8.5 ) Pub Date : 2024-03-21 , DOI: 10.1016/j.eswa.2024.123749
Yingzhi Miao , Fang Zhou , Martin Pavlovski , Weining Qian

Recently, a growing number of works have focused on tasks in the legal field, providing references to professionals in order to improve their work efficiency. Learning legal text representations, being the most common initial step, can strongly influence the performance of downstream tasks. Existing works have shown that utilizing domain knowledge, such as legal elements, in text representation learning can improve the prediction performance of downstream models. However, existing methods typically focus on specific downstream tasks, which hinders their generalization to other legal tasks. Moreover, these models tend to entangle various legal elements into a unified representation, overlooking the nuances among distinct legal elements. To address these limitations, we (1) introduce a generic model (legal text to element-related vector), based on a triplet loss, to learn discriminative representations of legal texts concerning a specific element, and (2) present a framework for learning disentangled representations with respect to multiple elements. The learned representations are independent of each other in terms of elements, and can be directly applied to or fine-tuned for various downstream tasks. We conducted comprehensive experiments on two real-world legal applications. The results indicate that the proposed model outperforms a range of baselines by a margin of up to 34.2% on a similar case matching task and 14% on a legal element identification task. When only a small quantity of labeled data is available, the proposed model’s superior performance becomes even more evident.
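The abstract does not spell out the exact loss formulation, so the following is only a minimal sketch of a standard margin-based triplet loss, the general technique the abstract names. The Euclidean distance, the margin value, and the function names `euclidean` and `triplet_loss` are illustrative assumptions, not the authors' implementation. The idea is that, with respect to one legal element, a text (anchor) should sit closer to a text sharing that element (positive) than to one that does not (negative).

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss for one (anchor, positive, negative) triple.

    The loss is zero once the negative is at least `margin` farther from the
    anchor than the positive is; otherwise it grows linearly with the violation.
    The margin of 1.0 is a hypothetical default, not a value from the paper.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D "representations" (made up for illustration only).
anchor = [0.0, 0.0]
near = [0.0, 1.0]   # distance 1 from the anchor
far = [3.0, 4.0]    # distance 5 from the anchor

satisfied = triplet_loss(anchor, near, far)   # positive is near, negative is far
violated = triplet_loss(anchor, far, near)    # roles swapped, constraint violated
print(satisfied, violated)
```

In this toy case the first triple already satisfies the margin, so its loss is zero, while the swapped triple is penalized. In practice the vectors would come from a trainable text encoder and the loss would be averaged over many sampled triples.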

Updated: 2024-03-21