Abstract
With its unique information-filtering capability, text summarization has become a significant component of search engines and question-answering systems. However, existing models that include a copy mechanism often fail to extract important fragments, so the generated content suffers from thematic deviation and insufficient generalization. Chinese automatic summarization in particular often loses semantics under traditional generation methods because of their reliance on word lists. To address these issues, we introduce the BioCopy mechanism into the summarization task. By training BIO tags for the predicted words and narrowing the probability distribution over the vocabulary, we strengthen the model's ability to generate continuous segments, which effectively alleviates the above problems. We additionally apply regularization to the inputs so that the model shares sub-network weight parameters, and we sparsify the model output to reduce the search space for prediction. To further improve performance, we compute the bilingual evaluation understudy (BLEU) score on the English CNN/DailyMail dataset to select filtering thresholds, reducing the difficulty of word segmentation and the output's dependence on the word list. We fully fine-tune the model on the LCSTS dataset for the Chinese summarization task and conduct small-sample experiments on the CSL dataset, along with ablation experiments on the Chinese data. The experimental results demonstrate that the optimized model learns the semantic representation of the original text better than competing models and performs well with small sample sizes.
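The two mechanisms named in the abstract can be illustrated compactly. The sketch below is a minimal NumPy illustration, not the paper's implementation: `sparsemax` follows Martins and Astudillo's projection of logits onto the probability simplex (the sparse output distribution the abstract mentions), and `restrict_to_span` is a hypothetical helper showing how a BioCopy-style BIO tag could narrow the next-token distribution to tokens that continue a copied source span; the actual tag predictor and span bookkeeping in the model are more involved.

```python
import numpy as np

def sparsemax(z):
    """Project logits z onto the simplex, yielding a sparse distribution
    (Martins & Astudillo, 2016). Zeroed entries shrink the search space."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]              # logits in descending order
    cumsum = np.cumsum(z_sorted)
    ks = np.arange(1, z.size + 1)
    support = 1 + ks * z_sorted > cumsum     # entries kept in the support
    k = ks[support][-1]
    tau = (cumsum[k - 1] - 1) / k            # threshold so kept mass sums to 1
    return np.maximum(z - tau, 0.0)

def restrict_to_span(logits, allowed_ids):
    """Illustrative BioCopy-style restriction (assumed interface): once a B tag
    has been emitted, keep only tokens that continue a source span."""
    masked = np.full_like(logits, -1e9)
    idx = list(allowed_ids)
    masked[idx] = logits[idx]
    return masked

logits = np.array([2.0, 1.0, 0.5, 0.1])
print(sparsemax(logits))                         # e.g. [1., 0., 0., 0.]
print(sparsemax(restrict_to_span(logits, {1, 2})))  # mass only on span tokens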