Abstract
Time-sensitive question answering aims to answer questions constrained to specific timestamps, given a long document that mixes many temporal events carrying explicit or implicit timestamps. Although existing models have made great progress on time-sensitive questions, their performance degrades dramatically when the correct answer lies far from the timestamp mentioned in the question. In this paper, we propose a Context-enhanced Adaptive Graph network (CoAG) to capture long-distance dependencies between sentences within the extracted question-related episodes. Specifically, we propose a time-aware episode extraction module that obtains question-related context based on timestamps in the question and document. Because the extracted episodes can make sentences with adjacent timestamps hard to tell apart, we design an adaptive message passing mechanism to capture and transfer inter-sentence differences. In addition, we present a hybrid text encoder to highlight question-related context built on global information. Experimental results show that CoAG significantly outperforms state-of-the-art models on five benchmarks. Moreover, our model has a noticeable advantage in solving long-distance time-sensitive questions, improving the EM scores by 2.03% to 6.04% on TimeQA-Hard.
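To make the idea of time-aware episode extraction concrete, the following is a minimal illustrative sketch, not the CoAG module itself. It assumes that timestamps are four-digit years found with a regular expression, that a sentence without an explicit year inherits the most recent year seen earlier in the document (a crude stand-in for implicit timestamps), and that an "episode" is simply the set of sentences whose years fall within a small window around the years in the question. The names `years_in`, `extract_episode`, and `window` are hypothetical.

```python
import re

# Assumption: timestamps are plain four-digit years (1000-2099).
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")


def years_in(text):
    """Return all four-digit years mentioned in the text, in order."""
    return [int(y) for y in YEAR.findall(text)]


def extract_episode(question, sentences, window=1):
    """Select sentences whose (explicit or inherited) years lie within
    `window` years of the years mentioned in the question.

    Hypothetical heuristic for illustration only: a sentence with no
    explicit year inherits the years of the closest preceding sentence
    that had one.
    """
    q_years = years_in(question)
    if not q_years:
        # No timestamp in the question: keep the whole document.
        return list(sentences)

    lo, hi = min(q_years) - window, max(q_years) + window
    episode, last_years = [], []
    for sentence in sentences:
        s_years = years_in(sentence) or last_years
        last_years = s_years
        if any(lo <= y <= hi for y in s_years):
            episode.append(sentence)
    return episode


if __name__ == "__main__":
    doc = [
        "In 1990 she joined Club A.",
        "She scored twice that season.",
        "In 2005 she moved to Club B.",
        "She retired in 2010.",
    ]
    print(extract_episode("Which club did she play for in 2005?", doc))
```

A filter of this kind only narrows the context; the abstract's point is that sentences with adjacent timestamps inside such an episode remain easy to confuse, which is what the adaptive message passing mechanism is meant to address.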