
Evaluation of a rule-based approach to automatic factual question generation using syntactic and semantic analysis

  • Original Paper
  • Published in Language Resources and Evaluation

Abstract

We present a rule-based approach to automatic factual question generation implemented in the Adaptive Courseware and Natural Language Tutor, a natural language-based intelligent tutoring system. Since machine-generated questions are intended for adaptive teaching, learning, and assessment, their accuracy is of the utmost importance; however, generating high-quality questions remains challenging. The proposed approach relies on pre-processing techniques and syntactic and semantic feature extraction to transform declarative sentences and their segments into questions. The quality of questions generated from domain-specific texts was evaluated using mixed evaluation strategies: (1) human evaluation, (2) qualitative error analysis, (3) automatic evaluation, (4) human and automatic evaluation of machine-generated questions from paraphrases compared to a set of human-authored questions, and (5) a preliminary comparison to other approaches. The human evaluation involved two teachers of English as a foreign language, who set up the evaluation criteria (grammaticality, semantic accuracy, and answerability), and a group of 30 English language graduates. Student-generated questions were validated and used as reference questions for automatic evaluation based on similarity metrics (BLEU-4, METEOR, CHRF, NIST, and ROUGE-L). Human and automatic evaluation results were satisfactory but improved significantly with the paraphrasing strategy. The preliminary comparison showed that, despite its limitations, the proposed rule-based approach performed as well as other approaches.
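The transformation step the abstract describes can be made concrete with a small example. Below is a minimal, hypothetical sketch of a single subject-replacement rule: parse a declarative sentence, locate its syntactic subject, and splice in a wh-word. It is not the authors' actual rule set, and it assumes spaCy with its small English model for parsing (the system itself relies on Stanford CoreNLP; see the Notes); the function name `who_what_question` is invented for illustration.

```python
# A minimal sketch of one rule-based transformation, NOT the paper's
# actual rules. Assumes: pip install spacy
#                        python -m spacy download en_core_web_sm
from typing import Optional

import spacy

nlp = spacy.load("en_core_web_sm")

def who_what_question(sentence: str) -> Optional[str]:
    """Replace the subject of a simple SVO/copular declarative with
    Who/What to produce a factual question."""
    doc = nlp(sentence)
    for token in doc:
        if token.dep_ == "nsubj" and token.head.pos_ in ("VERB", "AUX"):
            # Semantic feature: a PERSON subject licenses "Who".
            wh = "Who" if token.ent_type_ == "PERSON" else "What"
            # The full subject phrase spans the token's syntactic subtree.
            subject = doc[token.left_edge.i : token.right_edge.i + 1]
            rest = doc[subject.end :].text.rstrip(".")
            return f"{wh} {rest}?"
    return None  # no rule matched; a real system would try other rules

print(who_what_question("The mitochondrion produces most of the cell's ATP."))
# -> What produces most of the cell's ATP?
```

The automatic evaluation can be sketched in the same hedged spirit. The snippet below scores one generated question against one validated reference question with the five metrics named above; the library choices (sacrebleu for BLEU-4 and CHRF, NLTK for METEOR and NIST, rouge-score for ROUGE-L) are assumptions, since the paper does not state its tooling here.

```python
# Hedged sketch of the similarity-metric evaluation; library choices are
# assumptions. Requires: pip install sacrebleu nltk rouge-score
# plus nltk.download("wordnet") for METEOR.
import sacrebleu
from nltk.translate.meteor_score import meteor_score
from nltk.translate.nist_score import sentence_nist
from rouge_score import rouge_scorer

generated = "What produces most of the cell's ATP?"
reference = "What produces the majority of the cell's ATP?"

print("BLEU-4 :", sacrebleu.sentence_bleu(generated, [reference]).score)  # max n-gram order 4
print("CHRF   :", sacrebleu.sentence_chrf(generated, [reference]).score)
print("METEOR :", meteor_score([reference.split()], generated.split()))
print("NIST   :", sentence_nist([reference.split()], generated.split()))
rouge = rouge_scorer.RougeScorer(["rougeL"]).score(reference, generated)
print("ROUGE-L:", rouge["rougeL"].fmeasure)
```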


Data availability

The dataset used and analyzed during the current study is available from the corresponding author on reasonable request.

Notes

  1. stanfordnlp.github.io/CoreNLP/index.html.

  2. https://www.webfx.com/tools/read-able/.
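Note 1 refers to Stanford CoreNLP, the toolkit that supplies the syntactic and semantic annotations the generation rules consume. As a hedged illustration (assuming the CoreNLP server accessed through the stanza Python client; the annotator configuration actually used in the paper is not stated here), a constituency parse can be obtained like this:

```python
# Hedged sketch of driving Stanford CoreNLP (note 1) from Python via the
# stanza client. Assumes: pip install stanza, a local CoreNLP download,
# and CORENLP_HOME pointing at it; the annotator list is an assumption.
from stanza.server import CoreNLPClient

text = "The cell membrane regulates what enters and leaves the cell."

with CoreNLPClient(annotators=["tokenize", "ssplit", "pos", "parse"],
                   timeout=30000, memory="4G", be_quiet=True) as client:
    ann = client.annotate(text)
    print(ann.sentence[0].parseTree)  # constituency tree the rules pattern-match
```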

References

  • Alsubait, T. (2015). Ontology-based question generation. University of Manchester.

  • Amidei, J., Piwek, P., & Willis, A. (2018). Evaluation methodologies in automatic question generation 2013–2018. Proceedings of the 11th international conference on natural language generation (pp. 307–317). ACL.

  • Banerjee, S., & Lavie, A. (2005). METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. Proceedings of the ACL workshop on intrinsic and extrinsic evaluation measures for machine translation and/or summarization (pp. 65–72). ACL.

  • Blšták, M. (2018). Automatic question generation based on sentence structure analysis. Information Sciences & Technologies: Bulletin of the ACM Slovakia, 10(2), 20.

  • Blšták, M., & Rozinajová, V. (2022). Automatic question generation based on sentence structure analysis using machine learning approach. Natural Language Engineering, 28(4), 487–517. https://doi.org/10.1017/S1351324921000139

  • Collobert, R. (2011). Deep learning for efficient discriminative parsing. Proceedings of the fourteenth international conference on artificial intelligence and statistics, 15, 224–232.

  • Danon, G., & Last, M. (2017). A syntactic approach to domain-specific automatic question generation. Preprint retrieved from https://arxiv.org/abs/1712.09827

  • Deutsch, D., Bedrax-Weiss, T., & Roth, D. (2021). Towards question-answering as an automatic metric for evaluating the content quality of a summary. Transactions of the Association for Computational Linguistics, 9, 774–789. https://doi.org/10.1162/tacl_a_00397

  • Doddington, G. (2002). Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. Proceedings of the second international conference on human language technology research. https://doi.org/10.5555/1289189.1289273

  • Du, X., Shao, J., & Cardie, C. (2017). Learning to ask: Neural question generation for reading comprehension. Preprint retrieved from https://arxiv.org/abs/1705.00106v1

  • Fellbaum, C. (1998). WordNet: An electronic lexical database. MIT Press.

  • Flor, M., & Riordan, B. (2018). A semantic role-based approach to open-domain automatic question generation. Proceedings of the thirteenth workshop on innovative use of NLP for building educational applications (pp. 254–263). ACL.

  • Heilman, M. (2011). Automatic factual question generation from text. Carnegie Mellon University.

  • Heilman, M., & Smith, N. A. (2010). Extracting simplified statements for factual question generation. Proceedings of QG2010: The third workshop on question generation (pp. 11–20). ACL.

  • Honnibal, M., & Johnson, M. (2015). An improved non-monotonic transition system for dependency parsing. Proceedings of the 2015 conference on empirical methods in natural language processing (pp. 1373–1378). ACL. https://doi.org/10.18653/v1/D15-1162

  • Huang, Y., & He, L. (2016). Automatic generation of short answer questions for reading comprehension assessment. Natural Language Engineering, 22(3), 457–489. https://doi.org/10.1017/S1351324915000455

  • Khullar, P., Rachna, K., Hase, M., & Shrivastava, M. (2018). Automatic question generation using relative pronouns and adverbs. Proceedings of ACL 2018, student research workshop (pp. 153–158). ACL.

  • Kumar, G., Banchs, R. E., & D’Haro, L. F. (2015). RevUP: Automatic gap-fill question generation from educational texts. ACL.

  • Kurdi, G., Leo, J., Parsia, B., Sattler, U., & Al-Emari, S. (2020). A systematic review of automatic question generation for educational purposes. International Journal of Artificial Intelligence in Education, 30(1), 121–204. https://doi.org/10.1007/s40593-019-00186-y

  • Lin, C.-Y. (2004). ROUGE: A package for automatic evaluation of summaries. Proceedings of the 42nd annual meeting of the association for computational linguistics (ACL 2004) (pp. 74–81). ACL.

  • Lindberg, D., Popowich, F., Nesbit, J., & Winne, P. (2013). Generating natural language questions to support learning on-line. Proceedings of the 14th European workshop on natural language generation (pp. 105–114). ACL.

  • Majumder, M., & Saha, S. K. (2015). A system for generating multiple choice questions: With a novel approach for sentence selection. Proceedings of the 2nd workshop on natural language processing techniques for educational applications (pp. 64–72). ACL.

  • Mazidi, K., & Nielsen, R. D. (2014). Linguistic considerations in automatic question generation. Proceedings of the 52nd annual meeting of the association for computational linguistics (Volume 2: Short papers) (pp. 321–326). ACL.

  • Mazidi, K., & Tarau, P. (2016a). Automatic question generation: From NLU to NLG. In A. Micarelli, J. Stamper, & K. Panourgia (Eds.), Proceedings of the international conference on intelligent tutoring systems—ITS 2016 (pp. 23–33). Springer.

  • Mazidi, K., & Tarau, P. (2016b). Infusing NLU into automatic question generation. Proceedings of the 9th international natural language generation conference (pp. 51–60). ACL.

  • Nema, P., & Khapra, M. M. (2018). Towards a better metric for evaluating question generation systems. Proceedings of the 2018 conference on empirical methods in natural language processing (pp. 3950–3959). ACL.

  • Olney, A. M., Graesser, A. C., & Person, N. K. (2012). Question generation from concept maps. Dialogue & Discourse, 3(2), 75–99. https://doi.org/10.5087/d&d.v3i2.1480

  • Olney, A. M., Pavlik, P. I., & Maass, J. K. (2017). Improving reading comprehension with automatically generated cloze item practice. In E. André, R. Baker, X. Hu, M. Ma, T. Rodrigo, & B. Du Boulay (Eds.), Artificial intelligence in education (pp. 262–273). Springer.

  • Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2002). BLEU: A method for automatic evaluation of machine translation. Proceedings of the 40th annual meeting of the association for computational linguistics (pp. 311–318). ACL.

  • Patra, R., & Saha, S. K. (2019). A hybrid approach for automatic generation of named entity distractors for multiple choice questions. Education and Information Technologies, 24(2), 973–993. https://doi.org/10.1007/s10639-018-9814-3

  • Popović, M. (2015). chrF: Character n-gram F-score for automatic MT evaluation. Proceedings of the tenth workshop on statistical machine translation (pp. 392–395). ACL.

  • Popović, M. (2016). chrF deconstructed: Beta parameters and n-gram weights. Proceedings of the first conference on machine translation (Vol. 2, pp. 499–504). ACL.

  • Wang, Z., Lan, A. S., Nie, W., Waters, A. E., Grimaldi, P. J., & Baraniuk, R. G. (2018). QG-net: A data-driven question generation model for educational content. Proceedings of the fifth annual ACM conference on learning at scale (pp. 1–10). ACM.

  • Yao, X., Bouma, G., & Zhang, Y. (2012). Semantics-based question generation and implementation. Dialogue & Discourse, 3(2), 11–42. https://doi.org/10.5087/d&d.v3i2.1439

  • Zhang, L., & VanLehn, K. (2016). How do machine-generated questions compare to human-generated questions? Research and Practice in Technology Enhanced Learning, 11(1), 7. https://doi.org/10.1186/s41039-016-0031-7

  • Zhang, X., Yan, X., & Yao, Z. (2021). The automatic question generation system for CET. Journal of Computer and Communications, 9(9), 9. https://doi.org/10.4236/jcc.2021.99013

Funding

This work has been supported by the Office of Naval Research under grant N00014-20-1-2066, Enhancing Adaptive Courseware based on Natural Language Processing.

Author information


Contributions

AGa made substantial contributions to the conception and design of the work and to the acquisition, analysis, and interpretation of data, and drafted the manuscript. AGr was responsible for supervision, project administration, and funding of the supporting project. ISG participated in the formal analysis of data and the visualization of the results. All authors approved the submitted version and agreed both to be personally accountable for their own contributions and to ensure that questions related to the accuracy or integrity of any part of the work, even ones in which an author was not personally involved, are appropriately investigated, resolved, and the resolution documented in the literature.

Corresponding author

Correspondence to Ines Šarić-Grgić.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Gašpar, A., Grubišić, A. & Šarić-Grgić, I. Evaluation of a rule-based approach to automatic factual question generation using syntactic and semantic analysis. Lang Resources & Evaluation 57, 1431–1461 (2023). https://doi.org/10.1007/s10579-023-09672-1

