Abstract
The paper introduces a novel graph-based methodology for intention prediction in dialog systems. The methodology constructs graph structures that represent dialogs, thereby capturing contextual information effectively. Analyzing results on various open- and closed-domain datasets, the authors demonstrate that combining graph models with text encoders substantially improves intention prediction accuracy. The primary focus of the study is assessing how different graph architectures and encoders affect the performance of the proposed technique. The experimental outcomes confirm the superiority of graph neural networks over alternative methods, both in the \(Recall@k\) (MAR) metric and in computational resources. This research thus opens a new avenue for intention prediction in dialog systems based on graph representations.
REFERENCES
M. Burtsev, A. Seliverstov, R. Airapetyan, M. Arkhipov, D. Baymurzina, N. Bushkov, O. Gureenkova, T. Khakhulin, Yu. Kuratov, D. Kuznetsov, A. Litinsky, V. Logacheva, A. Lymar, V. Malykh, M. Petrov, V. Polulyakh, L. Pugachev, A. Sorokin, M. Vikhreva, and M. Zaynutdinov, “DeepPavlov: Open-source library for dialog systems,” in Proc. ACL 2018, System Demonstrations, Melbourne, Australia, 2018, Ed. by F. Liu and Th. Solorio (Association for Computational Linguistics, 2018), pp. 122–127. https://doi.org/10.18653/v1/p18-4021
N. Muennighoff, N. Tazi, L. Magne, and N. Reimers, “MTEB: Massive text embedding benchmark,” in Proc. 17th Conf. of the European Chapter of the Association for Computational Linguistics, Dubrovnik, Croatia, 2023, Ed. by A. Vlachos and I. Augenstein (Association for Computational Linguistics, 2023), pp. 2014–2037. https://doi.org/10.18653/v1/2023.eacl-main.148
D. Steinley, “K-means clustering: A half-century synthesis,” Br. J. Math. Stat. Psychol. 59, 1–34 (2006). https://doi.org/10.1348/000711005x48266
J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, and M. Sun, “Graph neural networks: A review of methods and applications,” AI Open 1, 57–81 (2020). https://doi.org/10.1016/j.aiopen.2021.01.001
P. Veličković, A. Casanova, P. Liò, G. Cucurull, A. Romero, and Y. Bengio, “Graph attention networks,” in 6th Int. Conf. on Learning Representations, ICLR 2018—Conf. Track Proc. (OpenReview.net, 2018). https://doi.org/10.17863/CAM.48429
S. Yun, M. Jeong, S. Yoo, S. Lee, S. S. Yi, R. Kim, J. Kang, and H. J. Kim, “Graph transformer networks: Learning meta-path graphs to improve GNNs,” Neural Networks 153, 104–119 (2022). https://doi.org/10.1016/j.neunet.2022.05.026
M. Nagovitsin and D. Kuznetsov, “DGAC: Dialog graph auto construction based on data with a regular structure,” in Advances in Neural Computation, Machine Learning, and Cognitive Research VI. Neuroinformatics 2022, Ed. by B. Kryzhanovsky, W. Dunin-Barkowski, V. Redko, and Y. Tiumentsev, Studies in Computational Intelligence, Vol. 1064 (Springer, Cham, 2022), pp. 508–529. https://doi.org/10.1007/978-3-031-19032-2_52
F. Feng, Yi. Yang, D. Cer, N. Arivazhagan, and W. Wang, “Language-agnostic BERT Sentence Embedding,” in Proc. 60th Annu. Meeting of the Association for Computational Linguistics, Dublin, Ireland, 2022, Ed. by S. Muresan, P. Nakov, and A. Villavicencio (Association for Computational Linguistics, 2022), Vol. 1, pp. 878–891. https://doi.org/10.18653/v1/2022.acl-long.62
ACKNOWLEDGMENTS
We would like to express our gratitude to the Laboratory of Neural Systems and Deep Learning DeepPavlov for their invaluable assistance and support during the research and development.
Funding
This work was supported by a grant for research centers in the field of artificial intelligence, provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730321P5Q0002) and the agreement with the Moscow Institute of Physics and Technology dated November 1, 2021 no. 70-2021-00138.
Ethics declarations
The authors of this work declare that they have no conflicts of interest.
Additional information
Translated by E. Oborin
Appendices
APPENDIX A
COMPARISON OF TEXT ENCODERS
To create the vertices of the multipartite dialog graph, we need to form vector representations of the utterances in the dialog dataset; the graph is then constructed on the basis of these representations. In this study we use vector representations of dialog utterances obtained with the language-agnostic BERT sentence embedding (LaBSE) text encoder [8]. The choice of the LaBSE sentence encoder is based on a comparative analysis of modern sentence encoder architectures (see Table 4). Within this analysis we evaluated the capability of text encoders to take context into account when producing vector representations of utterances. The study demonstrated that the accuracy of intention prediction substantially depends on the quality of the vector representations of sentences.
APPENDIX B
DETAILED RESEARCH RESULTS
In this section we provide detailed results of applying the different approaches to dialog datasets with a closed domain (see Table 5) and with an open domain (see Table 6), where the approaches are evaluated using the \(Recall@k\) metric, \(k \in \{ 1,3,5,10\} \).
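The \(Recall@k\) metric counts a prediction as correct if the true next intention appears among the top-\(k\) ranked candidates. A minimal sketch of this computation (the function name and the toy intention labels are our own illustration, not from the paper):

```python
def recall_at_k(ranked_candidates, true_items, k):
    """Fraction of queries whose true next intention appears
    among the top-k ranked candidates (Recall@k)."""
    hits = sum(
        1 for ranking, true in zip(ranked_candidates, true_items)
        if true in ranking[:k]
    )
    return hits / len(true_items)

# Toy example: three queries, each with a ranked list of predicted intentions.
rankings = [
    ["greet", "ask_time", "bye"],
    ["order", "greet", "cancel"],
    ["bye", "cancel", "order"],
]
truth = ["greet", "cancel", "order"]

print(recall_at_k(rankings, truth, 1))  # only the first query hits at k=1
print(recall_at_k(rankings, truth, 3))  # all three hit within the top 3
```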
APPENDIX C
EXPERIMENTS
Sets of Dialog Data
In this research we evaluated the approaches on various datasets with both open and closed domains. The closed-domain datasets included MultiWOZ, FoCus, and Taskmaster, translated from English into Russian using the NLLB machine translation model. In addition, we used the Telecom Domain Dataset, which contains dialogs concerning telecommunications.
The open-domain dialog datasets included Toloka Persona Chat Rus, containing dialogs on the topic of dating, as well as Russian Dialogs, consisting of dialogs on general topics. Since the full Russian Dialogs dataset comprises more than a million dialogs, which would require considerable computational resources for evaluating the methods, in our study we used a subset of it. In addition, we used the Matreshka dataset, an open-domain dataset whose distinctive feature is the explicit marking of not one but two roles: bot and user.
Number of Clusters
At the first and second clustering stages, we used different configurations of cluster counts: 200 and 30, 400 and 60, and 800 and 120 clusters, respectively. These values were selected according to the methodology described in paper [7]. Nevertheless, the optimal number of clusters for each dataset should be adjusted on the basis of its individual characteristics.
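The two-stage scheme first clusters utterance embeddings into fine-grained intentions and then clusters the resulting centroids into coarser groups. A scaled-down sketch of this idea, assuming a hand-rolled k-means in place of the paper's actual clustering implementation (the `kmeans` helper and the synthetic embeddings are our own illustration):

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old centroid if a cluster emptied.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

# Stage 1: cluster utterance embeddings into fine-grained intention clusters.
embeddings = np.random.default_rng(1).normal(size=(500, 16))
fine_labels, fine_centroids = kmeans(embeddings, k=20)   # scaled-down "200"
# Stage 2: cluster the fine centroids into coarser intention groups.
coarse_labels, _ = kmeans(fine_centroids, k=3)           # scaled-down "30"
```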
Details of Implementation
All models were trained with the Adam optimizer, which promotes faster convergence. In all graph-based approaches, a pooling module was used to aggregate information from graph nodes. In addition, a linear layer was added to all graph approaches to perform the graph classification task.
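The pooling-plus-linear-layer head can be sketched in a few lines. This is a simplified illustration with NumPy (mean pooling and a randomly initialized linear layer; the function name and dimensions are our own assumptions, not the paper's implementation):

```python
import numpy as np

def classify_graph(node_features, W, b):
    """Mean-pool node features into one graph vector, then apply a
    linear layer to score each candidate intention class."""
    graph_vec = node_features.mean(axis=0)   # pooling over graph nodes
    return graph_vec @ W + b                 # linear classification head

rng = np.random.default_rng(0)
nodes = rng.normal(size=(7, 32))   # 7 nodes with 32-dim GNN outputs
W = rng.normal(size=(32, 5))       # 5 candidate intentions
b = np.zeros(5)
scores = classify_graph(nodes, W, b)
pred = int(scores.argmax())        # index of the predicted intention
```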
To prevent overfitting, two methods were employed: early stopping, which monitors model performance on the validation set and stops training in the absence of improvement, and the Reduce Learning Rate on Plateau scheduler, which automatically decreases the learning rate. Hyperparameters were tuned using default values adapted to the specific structure of dialogue graphs.
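Both regularization mechanisms can be combined in a single controller that watches the validation loss. A minimal sketch under assumed default-like hyperparameters (the class name, `factor`, and `patience` values are our own illustration, not the paper's settings):

```python
class PlateauController:
    """Tracks validation loss; multiplies the learning rate by `factor`
    every `patience` epochs without improvement, and signals early
    stopping after `stop_patience` epochs without improvement."""
    def __init__(self, lr, factor=0.5, patience=2, stop_patience=5):
        self.lr, self.factor = lr, factor
        self.patience, self.stop_patience = patience, stop_patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best, self.bad_epochs = val_loss, 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs % self.patience == 0:
                self.lr *= self.factor            # reduce LR on plateau
        return self.bad_epochs >= self.stop_patience  # True => stop early

ctrl = PlateauController(lr=1e-3)
losses = [0.9, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8]  # validation loss per epoch
for epoch, loss in enumerate(losses):
    if ctrl.step(loss):
        break  # early stopping triggered
```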
To account for the possible dependence of the quality of the approaches on the specific configuration of partitioning the dialog utterances into clusters, we trained and evaluated all approaches on three different sets of clusters. We then averaged the obtained metric values over the cluster sets to obtain a more reliable estimate of the quality of the approaches.
APPENDIX D
LIMITATIONS
Although the study yields promising results, the following limitations should be considered.
Multilinguality and Domain Diversity
One of the main limitations of our study is its focus on Russian-language dialog datasets. Although this restriction is reasonable for the current research purposes, it limits the generalizability of the approaches to multilingual dialog systems. To evaluate the efficiency of our graph approaches in different language environments more comprehensively, investigations on dialog datasets in other languages are needed. Another important limitation is the restricted set of domains represented in the dialog datasets used in our experiments. Future research should explore the effectiveness of our approaches across a broader spectrum of domains to better understand their universality and applicability.
Limited Number of Roles in Dialogs
The dialog datasets used in the study cover a small number of dialog participant roles. As a result, they cannot fully reflect the diversity of real dialogs, for instance, dialogs in social networks and messengers. Carrying out experiments on larger and more diverse datasets would enable a more comprehensive evaluation of the efficiency of the presented approaches.
Choice of Text Encoder
The selection of the sentence encoder is of key importance, especially when analyzing specialized datasets. The experimental results emphasized the importance of choosing a suitable text encoder. Subsequent research should focus on adapting text encoders to specific domains to further improve the approaches.
Number of Clusters
In this work we used fixed numbers of vertices of the multipartite dialog graph. However, determining the optimal number of clusters is a difficult problem that depends on the specific dialog dataset. Reducing the number of clusters may improve the metrics because fewer candidate vertices compete for the intention of the next utterance, but it can also degrade the quality of the vertices in terms of their correspondence to the intentions of the dialog participants.
Cite this article
Kuznetsov, D.P., Ledneva, D.R. Graph Models for Contextual Intention Prediction in Dialog Systems. Dokl. Math. 108 (Suppl 2), S399–S415 (2023). https://doi.org/10.1134/S106456242370117X