CPT: a pre-trained unbalanced transformer for both Chinese language understanding and generation

Science China Information Sciences (IF 8.8), Pub Date: 2024-03-27, DOI: 10.1007/s11432-021-3536-5
Yunfan Shao, Zhichao Geng, Yitao Liu, Junqi Dai, Hang Yan, Fei Yang, Zhe Li, Hujun Bao, Xipeng Qiu

In this paper, we take advantage of previous pre-trained models (PTMs) and propose a novel Chinese pre-trained unbalanced transformer (CPT). Unlike previous Chinese PTMs, CPT is designed to exploit the knowledge shared between natural language understanding (NLU) and natural language generation (NLG) to improve performance on both. CPT consists of three parts: a shared encoder, an understanding decoder, and a generation decoder. The two task-specific decoders sit on top of the shared encoder and are pre-trained with masked language modeling (MLM) and denoising auto-encoding (DAE) tasks, respectively. With this partially shared architecture and multi-task pre-training, CPT can (1) learn task-specific knowledge for both NLU and NLG through its two decoders and (2) be fine-tuned flexibly in ways that fully exploit the potential of the model. Moreover, the unbalanced transformer reduces computational and storage costs, which keeps CPT competitive and greatly accelerates inference for text generation. Experimental results on a wide range of Chinese NLU and NLG tasks demonstrate the effectiveness of CPT.
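To make the inference claim concrete, here is a minimal sketch of why a deep encoder paired with shallow decoders can speed up generation. It relies only on the general property of encoder-decoder transformers that the encoder runs once per input while the decoder runs once per generated token. The layer counts, the names `UnbalancedConfig`, `generation_layer_passes` and `understanding_layer_passes`, and the balanced baseline are illustrative assumptions, not figures taken from the abstract. Counting layer passes is also a rough proxy that ignores sequence length and per-layer cost differences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnbalancedConfig:
    """Layer counts for a shared encoder with two task-specific decoders.

    All values used below are hypothetical and chosen for illustration only.
    """
    encoder_layers: int        # shared encoder
    understanding_layers: int  # understanding decoder (pre-trained with MLM)
    generation_layers: int     # generation decoder (pre-trained with DAE)


def generation_layer_passes(cfg: UnbalancedConfig, num_new_tokens: int) -> int:
    """Rough count of layer passes for autoregressive generation.

    The encoder processes the input once; the generation decoder is
    invoked once for every generated token.
    """
    return cfg.encoder_layers + num_new_tokens * cfg.generation_layers


def understanding_layer_passes(cfg: UnbalancedConfig) -> int:
    """Rough count of layer passes for an understanding task.

    Simplifying assumption: the input goes once through the shared
    encoder and once through the understanding decoder.
    """
    return cfg.encoder_layers + cfg.understanding_layers


# Hypothetical configurations: deep encoder with shallow decoders,
# versus a balanced encoder-decoder of comparable encoder+decoder depth.
unbalanced = UnbalancedConfig(encoder_layers=10, understanding_layers=2, generation_layers=2)
balanced = UnbalancedConfig(encoder_layers=6, understanding_layers=6, generation_layers=6)

if __name__ == "__main__":
    for name, cfg in (("unbalanced", unbalanced), ("balanced", balanced)):
        print(name, generation_layer_passes(cfg, num_new_tokens=50))
```

Under these assumed depths, the per-token cost is dominated by the decoder depth, so moving layers from the decoder to the encoder lowers the total as outputs get longer.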

Updated: 2024-03-27
