
Chinese Named Entity Recognition Augmented with Lexicon Memory

  • Regular Paper
  • Published in: Journal of Computer Science and Technology

Abstract

Inspired by the concept of content-addressable retrieval from cognitive science, we propose a novel fragment-based Chinese named entity recognition (NER) model augmented with a lexicon-based memory, in which both character-level and word-level features are combined to generate better feature representations for candidate entity names. Observing that the boundary information of entity names is particularly useful for locating them and classifying them into pre-defined categories, position-dependent features, such as prefixes and suffixes, are introduced and incorporated into the NER task in the form of distributed representations. The lexicon-based memory is built to help generate such position-dependent features and to deal with the problem of out-of-vocabulary words. Experimental results show that the proposed model, called LEMON, achieves state-of-the-art performance, improving the F1-score by up to 3.2% over previous best models on four widely-used NER datasets.
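As a concrete illustration of the position-dependent lexicon lookup described above, the sketch below indexes a small word lexicon so that a candidate text fragment can be matched against it in three ways: as a whole lexicon word, as a prefix of a lexicon word, or as a suffix. All names here are hypothetical, and the raw match lists merely stand in for the learned distributed representations used by the actual LEMON model.

```python
# Minimal sketch of a lexicon-based memory lookup for fragment-based NER.
# Illustrative only: the real model encodes matches as distributed
# representations rather than the raw word lists returned here.

from collections import defaultdict

class LexiconMemory:
    """Index a word lexicon so a candidate fragment can be matched
    in three position-dependent ways: as a prefix of a lexicon word,
    as a suffix of one, or as a whole word."""

    def __init__(self, lexicon):
        self.whole = set(lexicon)
        self.by_prefix = defaultdict(list)
        self.by_suffix = defaultdict(list)
        for word in lexicon:
            # Register every prefix and suffix of every lexicon word.
            for i in range(1, len(word) + 1):
                self.by_prefix[word[:i]].append(word)
                self.by_suffix[word[-i:]].append(word)

    def lookup(self, fragment):
        """Return the position-dependent lexicon evidence for one fragment."""
        return {
            "is_word": fragment in self.whole,
            "as_prefix": self.by_prefix.get(fragment, []),
            "as_suffix": self.by_suffix.get(fragment, []),
        }

# Toy lexicon: 北京 (Beijing), 北京大学 (Peking University), 大学生 (student).
memory = LexiconMemory(["北京", "北京大学", "大学生"])
print(memory.lookup("北京"))  # a whole word, and also a prefix of 北京大学
print(memory.lookup("大学"))  # a suffix of 北京大学 and a prefix of 大学生
```

The point of the position split is that the same fragment carries different boundary evidence depending on where it matches: a fragment that is only ever a suffix of lexicon words is unlikely to start an entity name.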



Author information

Corresponding author

Correspondence to Xiao-Qing Zheng.

Supplementary Information

ESM 1

(PDF 351 kb)


About this article


Cite this article

Zhou, Y., Zheng, XQ. & Huang, XJ. Chinese Named Entity Recognition Augmented with Lexicon Memory. J. Comput. Sci. Technol. 38, 1021–1035 (2023). https://doi.org/10.1007/s11390-021-1153-y

