Multilingual spatial domain natural language interface to databases

GeoInformatica

Abstract

A natural language interface (NLI) to databases translates a natural language question into a structured query that a database management system (DBMS) can execute. However, an NLI trained in the general domain is hard to apply in the spatial domain because of the idiosyncrasy and expressiveness of spatial semantics. Moreover, a wide range of database servers is available, and an NLI model that targets a single query language has limited practical use. In this article, we propose not only to address the spatial domain generalization challenge, but also to support a multilingual back-end, i.e., different query languages such as SQL and Prolog. To address the challenge of spatial semantics, we propose a spatial comprehension model that recognizes the meaning of spatial entities from the semantics of their context and thereby resolves their ambiguity. The spatial semantics learned by the spatial comprehension model are then injected into the natural language question to ease the burden of capturing spatial-specific semantics. We also propose adding a prefix symbol to support the multilingual back-end (i.e., multiple query languages). With our spatial comprehension model and symbol injections, our NLI for the spatial domain, named SpatialNLI, is able to capture the semantic structure of the question and translate it accurately into the corresponding syntax of an executable query. Our experiments also show that SpatialNLI outperforms state-of-the-art methods.
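The two input-side ideas in the abstract, symbol injection and a back-end prefix symbol, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the token spellings (`<2sql>`, `<2prolog>`, `<state>`), the function names, the choice to prepend the prefix to the question, and the hand-written entity tags, which in SpatialNLI would come from the spatial comprehension model. The abstract does not show the article's actual symbols or preprocessing.

```python
# Hypothetical prefix symbols, one per target query language (assumed spellings).
TARGET_PREFIX = {"sql": "<2sql>", "prolog": "<2prolog>"}


def inject_symbols(tokens, spatial_tags):
    """Append a type symbol after every token tagged as a spatial entity.

    spatial_tags maps a token index to an entity type such as "state" or
    "river". The tags are supplied by hand in this sketch; a trained
    comprehension model would predict them from context.
    """
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i in spatial_tags:
            out.append(f"<{spatial_tags[i]}>")
    return out


def build_model_input(question, spatial_tags, target):
    """Build the token sequence fed to a sequence-to-sequence translator."""
    if target not in TARGET_PREFIX:
        raise ValueError(f"unsupported target language: {target}")
    tokens = question.lower().split()
    # The prefix tells the shared model which query language to emit.
    return [TARGET_PREFIX[target]] + inject_symbols(tokens, spatial_tags)


if __name__ == "__main__":
    # "Mississippi" may name a state or a river; the tag records the chosen reading.
    question = "How many people live in Mississippi"
    print(build_model_input(question, {5: "state"}, "sql"))
    print(build_model_input(question, {5: "state"}, "prolog"))
```

With this layout, the same question yields two inputs that differ only in the first token, so a single model can be trained on question and query pairs for several query languages at once.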

Data Availability

The datasets generated and/or analysed during the current study are available at https://github.com/jzl0166/SpatialNLI.

Notes

  1. https://www.swi-prolog.org/pldoc/man?section=db

  2. Our code and data are publicly available at https://github.com/jzl0166/SpatialNLI

  3. https://github.com/jkkummerfeld/text2sql-data

  4. https://github.com/Kyubyong/transformer

  5. https://yale-lily.github.io/spide

  6. https://github.com/microsoft/rat-sql

  7. https://huggingface.co/transformers/

Funding

This research has been funded in part by the U.S. National Science Foundation grants IIS-1618669 (III) and ACI-1642133 (CICI).

Author information

Corresponding author

Correspondence to Wenlu Wang.

Ethics declarations

Competing Interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, W., Li, J., Ku, WS. et al. Multilingual spatial domain natural language interface to databases. Geoinformatica 28, 29–52 (2024). https://doi.org/10.1007/s10707-023-00496-3

  • DOI: https://doi.org/10.1007/s10707-023-00496-3
