
Learning consumer preferences through textual and visual data: a multi-modal approach

Electronic Commerce Research

Abstract

This paper proposes a novel multi-modal probabilistic topic model (LSTIT) that infers consumer preferences by jointly leveraging textual and visual data, specifically the titles and images of the items that consumers purchase. Because item titles are relatively short texts, we restrict the topic assignment for these titles. We also use the same topic distribution to model the relationship between the title and the image of an item. To learn consumer preferences, the proposed model extracts several important dimensions from the textual words in titles and the visual features in images. Experiments on the Amazon dataset show that the proposed model outperforms baseline models on the task of learning consumer preferences. Our findings have implications for managers seeking to understand the personalized interests behind users’ purchase behavior at a fine-grained level and from a multi-modal perspective.
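As an illustration only, the two modeling ideas stated in the abstract (a restricted topic assignment for short titles, and a topic distribution shared by the title and the image) can be sketched as a toy generative process. This sketch is not the LSTIT specification: it assumes one possible reading in which each purchased item receives a single topic drawn from the consumer's topic distribution, and both the title words and the quantized visual features ("visual words") are drawn from that topic. The function name, parameter names, Dirichlet priors, and all sizes are hypothetical.

```python
import numpy as np


def generate_purchases(n_items, n_topics=3, vocab_size=20, n_visual_words=15,
                       title_len=5, n_patches=8, alpha=0.5, beta=0.5, seed=0):
    """Toy generative sketch (an assumption, not the paper's model)."""
    rng = np.random.default_rng(seed)

    # Topic-specific distributions over textual words and visual words.
    phi_text = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)
    phi_image = rng.dirichlet(np.full(n_visual_words, beta), size=n_topics)

    # One consumer's preference, expressed as a distribution over topics.
    theta = rng.dirichlet(np.full(n_topics, alpha))

    items = []
    for _ in range(n_items):
        # Assumed restriction: a short title gets exactly one topic.
        z = int(rng.choice(n_topics, p=theta))
        # Title words and visual words of the item share that topic.
        title = rng.choice(vocab_size, size=title_len, p=phi_text[z])
        image = rng.choice(n_visual_words, size=n_patches, p=phi_image[z])
        items.append({"topic": z, "title": title, "image": image})
    return theta, items


if __name__ == "__main__":
    theta, items = generate_purchases(n_items=4)
    print("consumer topic preferences:", np.round(theta, 3))
    for item in items:
        print(item["topic"], item["title"].tolist(), item["image"].tolist())
```

Under this reading, inference would run in the opposite direction: given observed titles and images, one would estimate the consumer-level topic distribution, which plays the role of the learned preference.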


Notes

  1. https://www.amazon.com/.


Acknowledgements

We appreciate the constructive comments from the anonymous reviewers. This work is supported by the National Natural Science Foundation of China (72101072, 72271084, 72171071, 91846201, 72322019), the Fundamental Research Funds for the Central Universities (JZ2022HGTB0282, PA2023IISL0103, JZ2023HGQA0469), and the National Engineering Laboratory for Big Data Distribution and Exchange Technologies.

Author information


Contributions

XL: idea, experiments, result analysis, writing the manuscript. YL: design of the study, final proofreading. YQ: idea, writing the manuscript. YJ: final proofreading. HL: writing the manuscript, algorithm implementation.

Corresponding author

Correspondence to Yang Qian.

Ethics declarations

Competing interest

The authors certify that there is no conflict of interest in the subject matter discussed in the manuscript.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liu, X., Liu, Y., Qian, Y. et al. Learning consumer preferences through textual and visual data: a multi-modal approach. Electron Commer Res (2023). https://doi.org/10.1007/s10660-023-09780-8


  • DOI: https://doi.org/10.1007/s10660-023-09780-8
