
Graph Models for Contextual Intention Prediction in Dialog Systems

Published in Doklady Mathematics

Abstract

The paper introduces a novel graph-based methodology for predicting intentions in dialog systems. The methodology constructs graph structures that represent dialogs and thereby capture contextual information effectively. Using results on several open- and closed-domain datasets, the authors demonstrate that combining graph models with text encoders substantially improves intention prediction accuracy. The primary focus of the study is assessing how different graph architectures and encoders affect the performance of the proposed technique. The experimental results confirm the advantage of graph neural networks over alternative methods in terms of both the \(Recall@k\) (MAR) metric and computational resources. This research opens a new avenue for intention prediction in dialog systems by leveraging graph-based representations.


REFERENCES

  1. M. Burtsev, A. Seliverstov, R. Airapetyan, M. Arkhipov, D. Baymurzina, N. Bushkov, O. Gureenkova, T. Khakhulin, Yu. Kuratov, D. Kuznetsov, A. Litinsky, V. Logacheva, A. Lymar, V. Malykh, M. Petrov, V. Polulyakh, L. Pugachev, A. Sorokin, M. Vikhreva, and M. Zaynutdinov, “DeepPavlov: Open-source library for dialog systems,” in Proc. ACL 2018, System Demonstrations, Melbourne, Australia, 2018, Ed. by F. Liu and Th. Solorio (Association for Computational Linguistics, 2018), pp. 122–127. https://doi.org/10.18653/v1/p18-4021

  2. N. Muennighoff, N. Tazi, L. Magne, and N. Reimers, “MTEB: Massive text embedding benchmark,” in Proc. 17th Conf. of the European Chapter of the Association for Computational Linguistics, Dubrovnik, Croatia, 2023, Ed. by A. Vlachos and I. Augenstein (Association for Computational Linguistics, 2023), pp. 2014–2037. https://doi.org/10.18653/v1/2023.eacl-main.148

  3. D. Steinley, “K-means clustering: A half-century synthesis,” Br. J. Math. Stat. Psychol. 59, 1–34 (2006). https://doi.org/10.1348/000711005x48266


  4. J. Zhou, G. Cui, S. Hu, Z. Zhang, C. Yang, Z. Liu, L. Wang, C. Li, and M. Sun, “Graph neural networks: A review of methods and applications,” AI Open 1, 57–81 (2020). https://doi.org/10.1016/j.aiopen.2021.01.001


  5. P. Veličković, A. Casanova, P. Liò, G. Cucurull, A. Romero, and Y. Bengio, “Graph attention networks,” in 6th Int. Conf. on Learning Representations, ICLR 2018—Conf. Track Proc. (OpenReview.net, 2018). https://doi.org/10.17863/CAM.48429

  6. S. Yun, M. Jeong, S. Yoo, S. Lee, S. S. Yi, R. Kim, J. Kang, and H. J. Kim, “Graph transformer networks: Learning meta-path graphs to improve GNNs,” Neural Networks 153, 104–119 (2022). https://doi.org/10.1016/j.neunet.2022.05.026


  7. M. Nagovitsin and D. Kuznetsov, “DGAC: Dialog graph auto construction based on data with a regular structure,” in Advances in Neural Computation, Machine Learning, and Cognitive Research VI. Neuroinformatics 2022, Ed. by B. Kryzhanovsky, W. Dunin-Barkowski, V. Redko, and Y. Tiumentsev, Studies in Computational Intelligence, Vol. 1064 (Springer, Cham, 2022), pp. 508–529. https://doi.org/10.1007/978-3-031-19032-2_52

  8. F. Feng, Yi. Yang, D. Cer, N. Arivazhagan, and W. Wang, “Language-agnostic BERT Sentence Embedding,” in Proc. 60th Annu. Meeting of the Association for Computational Linguistics, Dublin, Ireland, 2022, Ed. by S. Muresan, P. Nakov, and A. Villavicencio (Association for Computational Linguistics, 2022), Vol. 1, pp. 878–891. https://doi.org/10.18653/v1/2022.acl-long.62


ACKNOWLEDGMENTS

We would like to express our gratitude to the Laboratory of Neural Systems and Deep Learning DeepPavlov for their invaluable assistance and support during the research and development.

Funding

This work was supported by a grant for research centers in the field of artificial intelligence, provided by the Analytical Center for the Government of the Russian Federation in accordance with the subsidy agreement (agreement identifier 000000D730321P5Q0002) and the agreement with the Moscow Institute of Physics and Technology dated November 1, 2021 no. 70-2021-00138.

Author information

Authors and Affiliations

Authors

Corresponding authors

Correspondence to D. P. Kuznetsov or D. R. Ledneva.

Ethics declarations

The authors of this work declare that they have no conflicts of interest.

Additional information

Translated by E. Oborin

Publisher’s Note.

Pleiades Publishing remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

APPENDIX A

COMPARISON OF TEXT ENCODERS

To create the vertices of the multipartite dialog graph, we need to form vector representations of the utterances from the dialog dataset; the graph is then constructed on the basis of these representations. In this study we use vector representations of dialog utterances obtained with the language-agnostic BERT sentence embedding (LaBSE) text encoder [8]. The choice of the LaBSE sentence encoder is based on a comparative analysis of modern sentence encoder architectures (see Table 4). Within this analysis we evaluated the ability of text encoders to take context into account when producing vector representations of utterances. The study demonstrated that the accuracy of intention prediction depends substantially on the quality of the vector representations of sentences.

Table 4. Comparison of the quality of dialog utterance embeddings produced by various text encoders on the Russian MultiWOZ dataset. We investigate the impact of the choice of sentence encoder on the performance of the three main approaches: Message Passing, Encoder, and Markov Chain.
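The encode-then-cluster pipeline described above can be sketched as follows. Since LaBSE requires downloading a pretrained model, this minimal sketch substitutes scikit-learn's TfidfVectorizer as a stand-in encoder purely to illustrate the pipeline shape; in the paper's setting one would instead obtain LaBSE embeddings (e.g., via the sentence-transformers package). The function names and toy utterances are illustrative, not from the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def embed_utterances(utterances):
    """Stand-in encoder: TF-IDF vectors instead of LaBSE embeddings.
    In the paper's setup these would be LaBSE sentence embeddings."""
    return TfidfVectorizer().fit_transform(utterances).toarray()

def build_graph_vertices(utterances, n_clusters):
    """Cluster utterance embeddings; each cluster becomes a vertex
    (a candidate intention) of the multipartite dialog graph."""
    X = embed_utterances(utterances)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
    return km.labels_

utterances = [
    "hello, how can I help you?",
    "hi, I need to book a table",
    "what time would you like?",
    "seven in the evening please",
]
labels = build_graph_vertices(utterances, n_clusters=2)
```

Each utterance is thus mapped to a cluster label, and these labels define the graph vertices over which intention prediction is performed.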

APPENDIX B

DETAILED RESEARCH RESULTS

In this section we provide detailed results of applying the different approaches to closed-domain dialog datasets (see Table 5) and open-domain dialog datasets (see Table 6), where the approaches are evaluated using the \(Recall@k\) metric, \(k \in \{ 1,3,5,10\} \).
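For concreteness, the evaluation metric admits a straightforward implementation. The sketch below is a generic \(Recall@k\) computation under the assumption that each prediction is a ranked list of candidate intention labels; the function names and toy data are illustrative.

```python
def recall_at_k(ranked_candidates, true_label, k):
    """Recall@k for one prediction: 1.0 if the true next-intention
    label appears among the top-k ranked candidates, else 0.0."""
    return float(true_label in ranked_candidates[:k])

def mean_recall_at_k(batch_rankings, batch_labels, k):
    """Average Recall@k over a batch of (ranking, true label) pairs."""
    hits = [recall_at_k(r, y, k) for r, y in zip(batch_rankings, batch_labels)]
    return sum(hits) / len(hits)

rankings = [[3, 1, 7, 2], [5, 4, 1, 9]]
labels = [7, 1]
# Each true label is in the top 3 but not the top 1 of its ranking.
r1 = mean_recall_at_k(rankings, labels, k=1)  # 0.0
r3 = mean_recall_at_k(rankings, labels, k=3)  # 1.0
```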

Table 5. Experimental results by the \(Recall@k\) metric, \(k \in \{ 1,3,5,10\} \), on the closed-domain datasets. The All metric is the mean of the user’s and dialog system’s metrics. For stability of the results, all methods were trained on three different sets of clusters and the results were averaged.
Table 6. Experimental results by the \(Recall@k\) metric, \(k \in \{ 1,3,5,10\} \), on the open-domain datasets. The All metric is the mean of the user’s and dialog system’s metrics. For stability of the results, all methods were trained on three different sets of clusters and the results were averaged.

APPENDIX C

EXPERIMENTS

Sets of Dialog Data

In this research we evaluated the approaches on several datasets, both open- and closed-domain. The closed-domain datasets were MultiWOZ, FoCus, and Taskmaster, translated from English into Russian using the NLLB machine translation model. In addition, we used the Telecom Domain Dataset, which contains dialogs concerning telecommunications.

The open-domain dialog datasets included the Toloka Persona Chat Rus dataset, containing dialogs on the topic of dating, as well as the Russian Dialogs dataset, consisting of dialogs on general topics. Since the full Russian Dialogs dataset comprises more than a million dialogs, which would require considerable computational resources to evaluate the methods, in our study we used a subset of it. In addition, we used the Matreshka dataset, an open-domain dataset whose distinctive feature is the explicit labeling of not one but two roles: bot and user.

Number of Clusters

At the first and second clustering stages, we used different configurations of the numbers of clusters: 200 and 30, 400 and 60, and 800 and 120 clusters, respectively. These values were selected according to the methodology described in [7]. Nevertheless, the optimal number of clusters for each dataset should be adjusted on the basis of its individual characteristics.
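One plausible reading of the two-stage scheme, following the construction in [7], is that the second stage clusters the centroids produced by the first. The sketch below illustrates this with K-means on random toy embeddings and a scaled-down configuration (20 and 5 instead of, e.g., 200 and 30); the interpretation and all names here are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(300, 16))  # toy utterance embeddings

# First stage: fine-grained clusters over the raw utterance embeddings.
first_k, second_k = 20, 5  # scaled-down analog of a (200, 30) configuration
stage1 = KMeans(n_clusters=first_k, n_init=10, random_state=0).fit(embeddings)

# Second stage: coarse clusters over the first-stage centroids.
stage2 = KMeans(n_clusters=second_k, n_init=10, random_state=0).fit(
    stage1.cluster_centers_)

# Map every utterance to its coarse cluster via its fine cluster.
coarse_labels = stage2.labels_[stage1.labels_]
```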

Details of Implementation

All models were trained with the Adam optimizer, which promotes faster convergence. In all graph-based approaches, a pooling module was used to aggregate information from the graph nodes. In addition, a linear layer was added to all graph approaches to perform the graph classification task.

To prevent overfitting, two methods were employed: early stopping, which monitors model performance on the validation set and stops training in the absence of improvement, and the Reduce Learning Rate on Plateau scheduler, which automatically decreases the learning rate. Hyperparameters were initialized with default values and adapted to the specific structure of the dialog graphs.
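The logic of these two mechanisms can be sketched schematically as follows. This is a pure-Python illustration of the behavior, not the actual training code; in a deep learning framework one would typically use the framework's built-in scheduler (e.g., PyTorch's torch.optim.lr_scheduler.ReduceLROnPlateau), and the class names and constants below are chosen for illustration.

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""
    def __init__(self, patience=5):
        self.patience, self.best, self.bad_epochs = patience, float("inf"), 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best, self.bad_epochs = val_loss, 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience  # True => stop training

class PlateauLRScheduler:
    """Halve the learning rate when validation loss plateaus."""
    def __init__(self, lr=1e-3, factor=0.5, patience=2):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.best, self.bad_epochs = float("inf"), 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best, self.bad_epochs = val_loss, 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr

stopper, scheduler = EarlyStopping(patience=3), PlateauLRScheduler(lr=1e-3)
losses = [1.0, 0.8, 0.8, 0.8, 0.8, 0.8]  # validation loss plateaus after epoch 1
stopped_at = None
for epoch, loss in enumerate(losses):
    lr = scheduler.step(loss)       # lr is halved once the plateau persists
    if stopper.step(loss):          # training halts after 3 flat epochs
        stopped_at = epoch
        break
```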

To take into account the possible dependence of the quality of the approaches on the specific configuration of dividing the dialog utterances into clusters, we trained and evaluated all approaches on three different sets of clusters. We then averaged the obtained metric values over the cluster sets to obtain a more reliable estimate of the quality of the approaches.

APPENDIX D

LIMITATIONS

Although the study yields promising results, the following limitations need to be considered.

Multilinguality and Domain Diversity

One of the main limitations of our study is its focus on Russian-language dialog datasets. Although this restriction is reasonable for the current research purposes, it limits the generalizability of the approaches to multilingual dialog systems. To evaluate the efficiency of our graph approaches more comprehensively in different language environments, investigations on dialog datasets in other languages are needed. Another important limitation is the restricted set of domains represented in the dialog datasets used in our experiments. Future research should explore the effectiveness of our approaches across a broader spectrum of domains to better understand their universality and applicability.

Limited Number of Roles in Dialogs

The dialog datasets used in the study cover a small number of dialog participant roles. Because of this limitation, they cannot fully reflect the diversity of real dialogs, for instance, dialogs in social networks and messengers. Carrying out experiments on larger and more diverse datasets would enable a more comprehensive evaluation of the efficiency of the presented approaches.

Choice of Text Encoder

The choice of sentence encoder is of key importance, especially when analyzing specialized datasets. The experimental results emphasized the importance of choosing a suitable text encoder. Subsequent research should focus on adapting text encoders to specific domains to further improve the approaches.

Number of Clusters

In this work we used fixed numbers of vertices of the multipartite dialog graph. However, determining the optimal number of clusters is a nontrivial problem that depends on the specific dialog dataset. Reducing the number of clusters may improve the metrics because there are fewer candidate vertices for the intention of the next utterance, but it can also degrade the quality of the vertices in terms of their correspondence to the intentions of the dialog participants.


Cite this article

Kuznetsov, D.P., Ledneva, D.R. Graph Models for Contextual Intention Prediction in Dialog Systems. Dokl. Math. 108 (Suppl 2), S399–S415 (2023). https://doi.org/10.1134/S106456242370117X
