Abstract
A natural language interface (NLI) to databases translates a natural language question into a structured query that a database management system (DBMS) can execute. However, an NLI trained in the general domain is hard to apply in the spatial domain because of the idiosyncrasy and expressiveness of spatial semantics. Moreover, a wide range of database servers is available, and an NLI model that targets a single query language has limited practical use. In this article, we address the spatial domain generalization challenge and also support a multilingual back-end, i.e., different query languages such as SQL and Prolog. For the challenge of spatial semantics, we propose a spatial comprehension model that recognizes the meaning of spatial entities from the semantics of their context and resolves ambiguity accordingly. The spatial semantics learned by the spatial comprehension model is then injected into the natural language question to ease the burden of capturing spatial-specific semantics. We also propose adding a prefix symbol to the question to select the target query language of the multilingual back-end. With the spatial comprehension model and these symbol injections, our NLI for the spatial domain, named SpatialNLI, captures the semantic structure of the question and accurately translates it into the corresponding syntax of an executable query. Our experiments show that SpatialNLI outperforms state-of-the-art methods.
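The two kinds of symbol injection described in the abstract can be illustrated with a minimal sketch. It is not the SpatialNLI implementation: the prefix tokens `<2sql>` and `<2prolog>`, the function name `inject_symbols`, and the tag format are hypothetical, and a plain dictionary lookup stands in for the learned spatial comprehension model.

```python
from typing import Dict, List

# Hypothetical prefix tokens that select the target query language.
# The symbols actually used by SpatialNLI may differ.
TARGET_PREFIX: Dict[str, str] = {"sql": "<2sql>", "prolog": "<2prolog>"}


def inject_symbols(question: str, spatial_tags: Dict[str, str], target: str) -> List[str]:
    """Prepend a target-language prefix and annotate spatial entities.

    `spatial_tags` maps a lowercased word to its spatial type. In the paper
    this information comes from a trained comprehension model; here it is
    supplied by hand (an assumption made for illustration only).
    """
    if target not in TARGET_PREFIX:
        raise ValueError(f"unsupported target language: {target}")
    tokens = [TARGET_PREFIX[target]]
    for word in question.lower().rstrip("?").split():
        tokens.append(word)
        tag = spatial_tags.get(word)
        if tag is not None:
            # Insert the resolved spatial type right after the entity.
            tokens.append(f"<{tag}>")
    return tokens


if __name__ == "__main__":
    tags = {"mississippi": "river"}  # "Mississippi" could also be a state
    print(inject_symbols("How long is the Mississippi?", tags, "sql"))
    print(inject_symbols("How long is the Mississippi?", tags, "prolog"))
```

The same annotated question is produced for both back-ends; only the leading symbol changes, which is what would let a single sequence-to-sequence model emit either SQL or Prolog.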
Data Availability
The datasets generated and/or analysed during the current study are available at https://github.com/jzl0166/SpatialNLI.
Notes
Our code and data are publicly available at https://github.com/jzl0166/SpatialNLI.
Funding
This research has been funded in part by the U.S. National Science Foundation grants IIS-1618669 (III) and ACI-1642133 (CICI).
Ethics declarations
Competing Interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, W., Li, J., Ku, WS. et al. Multilingual spatial domain natural language interface to databases. Geoinformatica 28, 29–52 (2024). https://doi.org/10.1007/s10707-023-00496-3
DOI: https://doi.org/10.1007/s10707-023-00496-3