Journal of Geographical Systems (IF 2.417), Pub Date: 2022-02-18, DOI: 10.1007/s10109-022-00375-9
Kai Ma 1, YongJian Tan 1, Zhong Xie 2,3, Qinjun Qiu 2,3,4, Siqiong Chen 3
Many natural language tasks in geographic information retrieval (GIR) depend on toponym recognition, and identifying Chinese toponyms in social media messages to share real-time information is critical for practical applications such as natural disaster response and geolocation. In this article, we focus on toponym recognition in Chinese social media messages. Although existing off-the-shelf Chinese named entity recognition (NER) tools can be applied to identify toponyms, they cannot handle the language irregularities common in social media messages, including abbreviated location names, informal sentence structures and combination toponyms. We present a deep neural network named BERT-BiLSTM-CRF, which extends a basic bidirectional long short-term memory (BiLSTM) recurrent network with pretrained Bidirectional Encoder Representations from Transformers (BERT) representations to handle toponym recognition in Chinese text. On three datasets taken from lists of alternative location names, the experimental results show that the proposed model significantly outperforms previous Chinese NER models and algorithms as well as a set of state-of-the-art deep learning models.
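To make the role of the final CRF layer in a BERT-BiLSTM-CRF tagger more concrete, the sketch below shows Viterbi decoding over per-character tag scores, followed by extraction of toponym spans from the decoded BIO tags. It is an illustration under stated assumptions and not the authors' implementation: the tag set (`O`, `B-LOC`, `I-LOC`), the function names, the hand-set transition scores and the emission values for the sample sentence are all hypothetical. In a trained model the emission scores would come from the BERT-BiLSTM encoder and the transition scores would be learned.

```python
from typing import Dict, List

# Assumed BIO tag set for toponyms (not taken from the paper).
TAGS = ["O", "B-LOC", "I-LOC"]
NEG = -1e9  # large penalty that effectively forbids a transition


def transition_score(prev: str, curr: str) -> float:
    """Hypothetical hand-set transitions: I-LOC may not follow O."""
    if curr == "I-LOC" and prev == "O":
        return NEG
    return 0.0


def viterbi(emissions: List[Dict[str, float]]) -> List[str]:
    """Return the highest-scoring tag path for per-character emission scores."""
    if not emissions:
        return []
    # A sequence may not start with I-LOC.
    score = {t: emissions[0][t] + (NEG if t == "I-LOC" else 0.0) for t in TAGS}
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = {}, {}
        for curr in TAGS:
            best_prev = max(TAGS, key=lambda p: score[p] + transition_score(p, curr))
            new_score[curr] = (
                score[best_prev] + transition_score(best_prev, curr) + emit[curr]
            )
            pointers[curr] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(TAGS, key=lambda t: score[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def extract_toponyms(chars: str, tags: List[str]) -> List[str]:
    """Collect the character spans labelled B-LOC followed by I-LOC tags."""
    spans, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "B-LOC":
            if current:
                spans.append(current)
            current = ch
        elif tag == "I-LOC" and current:
            current += ch
        else:
            if current:
                spans.append(current)
            current = ""
    if current:
        spans.append(current)
    return spans


# Made-up emission scores for the sentence "武汉下雨了" ("It is raining in Wuhan").
text = "武汉下雨了"
emissions = [
    {"O": 0.1, "B-LOC": 2.0, "I-LOC": 0.3},
    {"O": 0.2, "B-LOC": 0.1, "I-LOC": 1.5},
    {"O": 2.0, "B-LOC": 0.1, "I-LOC": 0.1},
    {"O": 0.5, "B-LOC": 0.1, "I-LOC": 0.9},
    {"O": 1.0, "B-LOC": 0.0, "I-LOC": 0.0},
]
tags = viterbi(emissions)
print(tags, extract_toponyms(text, tags))
```

In this toy input the fourth character scores highest as `I-LOC` when considered in isolation, but the decoded path labels it `O` because the transition constraint rules out `I-LOC` directly after `O`. Enforcing this kind of sequence-level consistency is the usual motivation for placing a CRF layer on top of a BiLSTM encoder.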