Abstract
We present a rule-based approach to automatic factual question generation implemented in the Adaptive Courseware and Natural Language Tutor, a natural language-based intelligent tutoring system. Since machine-generated questions are intended for adaptive teaching, learning, and assessment, their accuracy is of the utmost importance; however, generating high-quality questions remains challenging. The proposed approach relies on pre-processing techniques and on syntactic and semantic feature extraction to transform declarative sentences and their segments into questions. The quality of questions generated from domain-specific texts was evaluated using a mixed strategy: (1) human evaluation, (2) qualitative error analysis, (3) automatic evaluation, (4) human and automatic evaluation of questions generated from paraphrases, compared against a set of human-authored questions, and (5) a preliminary comparison to other approaches. The human evaluation involved two teachers of English as a foreign language, who set up the evaluation criteria (grammaticality, semantic accuracy, and answerability), and a group of 30 English language graduates. Student-generated questions were validated and used as reference questions for automatic evaluation based on similarity metrics (BLEU-4, METEOR, chrF, NIST, and ROUGE-L). Both human and automatic evaluation results were satisfactory and improved significantly with the paraphrasing strategy. The preliminary comparison showed that, despite its limitations, the proposed rule-based approach performed on par with other approaches.
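To illustrate the kind of automatic evaluation described above, the following is a minimal, self-contained sketch of a smoothed sentence-level BLEU-4 score, comparing a machine-generated question against a set of human-authored reference questions. This is an illustrative re-implementation under simplifying assumptions (add-one smoothing, whitespace tokenization), not the exact metric implementation used in the study, which would typically rely on standard metric packages.

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu4(references, hypothesis):
    """Simplified sentence-level BLEU-4: geometric mean of modified
    n-gram precisions (n = 1..4) with add-one smoothing, multiplied
    by a brevity penalty against the closest-length reference."""
    hyp = hypothesis.lower().split()
    refs = [r.lower().split() for r in references]
    log_prec = 0.0
    for n in range(1, 5):
        hyp_counts = Counter(ngrams(hyp, n))
        # Clip each n-gram count by its maximum count over all references.
        max_ref = Counter()
        for r in refs:
            for g, c in Counter(ngrams(r, n)).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        log_prec += math.log((clipped + 1) / (total + 1)) / 4  # add-one smoothing
    # Brevity penalty: penalize hypotheses shorter than the closest reference.
    closest = min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
    bp = 1.0 if len(hyp) >= closest else math.exp(1 - closest / len(hyp))
    return bp * math.exp(log_prec)

# Hypothetical example: machine question vs. validated student references.
score = sentence_bleu4(
    ["What does the parser extract from a sentence?",
     "What is extracted from a sentence by the parser?"],
    "What does the parser extract from a sentence?")
```

In the study's setup, such per-question scores would be aggregated over the full set of machine-generated questions and complemented by the other metrics (METEOR, chrF, NIST, ROUGE-L), since no single n-gram metric captures answerability or semantic accuracy on its own.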
Data availability
The dataset used and analyzed during the current study is available from the corresponding author on reasonable request.
Notes
stanfordnlp.github.io/CoreNLP/index.html.
References
Alsubait, T. (2015). Ontology-based question generation. University of Manchester.
Amidei, J., Piwek, P., & Willis, A. (2018). Evaluation methodologies in automatic question generation 2013–2018. Proceedings of the 11th international conference on natural language generation (pp. 307–317). ACL.
Banerjee, S., & Lavie, A. (2005). METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. Proceedings of the ACL workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization (pp. 65–72). ACL.
Blšták, M. (2018). Automatic question generation based on sentence structure analysis. Information Sciences & Technologies: Bulletin of the ACM Slovakia, 10(2), 20.
Blšták, M., & Rozinajová, V. (2022). Automatic question generation based on sentence structure analysis using machine learning approach. Natural Language Engineering, 28(4), 487–517. https://doi.org/10.1017/S1351324921000139
Collobert, R. (2011). Deep learning for efficient discriminative parsing. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 15, 224–232.
Danon, G., & Last, M. (2017). A syntactic approach to domain-specific automatic question generation. Preprint retrieved from https://arxiv.org/abs/1712.09827
Deutsch, D., Bedrax-Weiss, T., & Roth, D. (2021). Towards question-answering as an automatic metric for evaluating the content quality of a summary. Transactions of the Association for Computational Linguistics, 9, 774–789. https://doi.org/10.1162/tacl_a_00397
Doddington, G. (2002). Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. Proceedings of the Second International Conference on Human Language Technology Research. https://doi.org/10.5555/1289189.1289273
Du, X., Shao, J., & Cardie, C. (2017). Learning to ask: Neural question generation for reading comprehension. Preprint retrieved from https://arxiv.org/abs/1705.00106v1
Fellbaum, C. (1998). Wordnet: An electronic lexical database. MIT Press.
Flor, M., & Riordan, B. (2018). A semantic role-based approach to open-domain automatic question generation. Proceedings of the thirteenth workshop on innovative use of NLP for building educational applications (pp. 254–263). ACL.
Heilman, M. (2011). Automatic factual question generation from text. Carnegie Mellon University.
Heilman, M., & Smith, N. A. (2010). Extracting simplified statements for factual question generation. Proceedings of QG2010: The third workshop on question generation (pp. 11–20). ACL.
Honnibal, M., & Johnson, M. (2015). An improved non-monotonic transition system for dependency parsing. Proceedings of the 2015 conference on empirical methods in natural language processing (pp. 1373–1378). ACL. https://doi.org/10.18653/v1/D15-1162
Huang, Y., & He, L. (2016). Automatic generation of short answer questions for reading comprehension assessment. Natural Language Engineering, 22(3), 457–489. https://doi.org/10.1017/S1351324915000455
Khullar, P., Rachna, K., Hase, M., & Shrivastava, M. (2018). Automatic question generation using relative pronouns and adverbs. Proceedings of the ACL 2018, student research workshop (pp. 153–158). ACL.
Kumar, G., Banchs, R. E., & D’Haro, L. F. (2015). RevUP: Automatic gap-fill question generation from educational texts. ACL.
Kurdi, G., Leo, J., Parsia, B., Sattler, U., & Al-Emari, S. (2020). A systematic review of automatic question generation for educational purposes. International Journal of Artificial Intelligence in Education, 30(1), 121–204. https://doi.org/10.1007/s40593-019-00186-y
Lin, C.-Y. (2004). ROUGE: A package for automatic evaluation of summaries. Proceedings of the 42nd annual meeting of the association for computational linguistics (ACL 2004) (pp. 74–81). ACL.
Lindberg, D., Popowich, F., Nesbit, J., & Winne, P. (2013). Generating natural language questions to support learning on-line. Proceedings of the 14th European workshop on natural language generation (pp. 105–114). ACL.
Majumder, M., & Saha, S. K. (2015). A system for generating multiple choice questions: with a novel approach for sentence selection. Proceedings of the 2nd workshop on natural language processing techniques for educational applications (pp. 64–72). ACL.
Mazidi, K., & Nielsen, R. D. (2014). Linguistic considerations in automatic question generation. Proceedings of the 42nd annual meeting of the association for computational linguistics (Volume 2: Short Papers) (pp. 321–326). ACL.
Mazidi, K., & Tarau, P. (2016a). Automatic question generation: From NLU to NLG. In A. Micarelli, J. Stamper, & K. Panourgia (Eds.), Proceedings of the international conference on intelligent tutoring systems—ITS 2016 (pp. 23–33). Springer.
Mazidi, K., & Tarau, P. (2016b). Infusing NLU into automatic question generation. The 9th international natural language generation conference (pp. 51–60). ACL.
Nema, P., & Khapra, M. M. (2018). Towards a better metric for evaluating question generation systems. Proceedings of the 2018 conference on empirical methods in natural language processing (pp. 3950–3959). ACL.
Olney, A. M., Graesser, A. C., & Person, N. K. (2012). Question generation from concept maps. Dialogue & Discourse, 3(2), 75–99. https://doi.org/10.5087/d&d.v3i2.1480
Olney, A. M., Pavlik, P. I., & Maass, J. K. (2017). Improving reading comprehension with automatically generated cloze item practice. In E. André, R. Baker, X. Hu, M. Ma, T. Rodrigo, & B. Du Boulay (Eds.), Artificial intelligence in education (pp. 262–273). Springer.
Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2002). Bleu: A method for automatic evaluation of machine translation. Proceedings of the 40th annual meeting of the association for computational linguistics (pp. 311–318). ACL.
Patra, R., & Saha, S. K. (2019). A hybrid approach for automatic generation of named entity distractors for multiple choice questions. Education and Information Technologies, 24(2), 973–993. https://doi.org/10.1007/s10639-018-9814-3
Popović, M. (2015). chrF: Character n-gram F-score for automatic MT evaluation. Proceedings of the tenth workshop on statistical machine translation (pp. 392–395). ACL.
Popović, M. (2016). chrF deconstructed: Beta parameters and n-gram weights. Proceedings of the first conference on machine translation (Vol. 2, pp. 499–504). ACL.
Wang, Z., Lan, A. S., Nie, W., Waters, A. E., Grimaldi, P. J., & Baraniuk, R. G. (2018). QG-net: A data-driven question generation model for educational content. Proceedings of the fifth annual ACM conference on learning at scale (pp. 1–10). ACM.
Yao, X., Bouma, G., & Zhang, Y. (2012). Semantics-based question generation and implementation. Dialogue & Discourse, 3(2), 11–42. https://doi.org/10.5087/d&d.v3i2.1439
Zhang, L., & VanLehn, K. (2016). How do machine-generated questions compare to human-generated questions? Research and Practice in Technology Enhanced Learning, 11(1), 7. https://doi.org/10.1186/s41039-016-0031-7
Zhang, X., Yan, X., & Yao, Z. (2021). The automatic question generation system for CET. Journal of Computer and Communications, 9(9), 9. https://doi.org/10.4236/jcc.2021.99013
Funding
This work has been supported by the Office of Naval Research under grant N00014-20-1-2066, "Enhancing Adaptive Courseware based on Natural Language Processing".
Author information
Authors and Affiliations
Contributions
AGa made substantial contributions to the conception and design of the work and to the acquisition, analysis, and interpretation of data, and drafted the manuscript. AGr was responsible for supervision, project administration, and funding acquisition for the supporting project. ISG participated in the formal analysis of data and the visualization of results. All authors approved the submitted version and agreed both to be personally accountable for their own contributions and to ensure that questions related to the accuracy or integrity of any part of the work, even parts in which they were not personally involved, are appropriately investigated, resolved, and documented in the literature.
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Gašpar, A., Grubišić, A. & Šarić-Grgić, I. Evaluation of a rule-based approach to automatic factual question generation using syntactic and semantic analysis. Lang Resources & Evaluation 57, 1431–1461 (2023). https://doi.org/10.1007/s10579-023-09672-1