research-article

Package Arrival Time Prediction via Knowledge Distillation Graph Neural Network

Published: 28 February 2024

Abstract

Accurately estimating packages’ arrival time in e-commerce can enhance users’ shopping experience and improve the placement rate of products. This problem is often formalized as an Origin-Destination (OD)-based ETA (i.e., estimated time of arrival) prediction task, where the delivery time is estimated mainly from the sender and receiver addresses and other context information. One inherent challenge of the OD-based ETA problem is that the delivery time depends heavily on the actual delivery trajectory, which is unknown at the time of prediction. In this article, we tackle this challenge by effectively exploiting historical delivery trajectories. We propose a novel Knowledge Distillation Graph neural network-based package ETA prediction (KDG-ETA) model, which uses knowledge distillation in the training phase to distill the knowledge of historical trajectories into OD pair embeddings. In KDG-ETA, a multi-level trajectory graph representation model is proposed to fully exploit trajectory information at the node level, edge level, and path level. An adaptive attention module then combines the OD representations, embedded with trajectory knowledge, with context embeddings from a feature extraction module to predict the delivery time. In our extensive empirical evaluation on three real-world Alibaba datasets, KDG-ETA consistently outperforms existing state-of-the-art OD-based ETA prediction methods, reducing the Mean Absolute Error (MAE) by 3.0%–39.1%.
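
The training-time idea in the abstract can be illustrated with a minimal sketch. It is not the KDG-ETA objective itself, which the abstract does not spell out. It assumes, for illustration only, that the distillation term is a mean squared error between a student OD-pair embedding and a teacher trajectory embedding, and that the term is added to an MAE task loss with a hypothetical weight `alpha`. All function and parameter names below are hypothetical.

```python
from typing import Sequence


def mean_absolute_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """MAE between predicted and observed delivery times."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def embedding_mse(student: Sequence[Sequence[float]],
                  teacher: Sequence[Sequence[float]]) -> float:
    """Mean over samples of the mean squared difference between embeddings."""
    per_sample = []
    for s_vec, t_vec in zip(student, teacher):
        sq = [(s - t) ** 2 for s, t in zip(s_vec, t_vec)]
        per_sample.append(sum(sq) / len(sq))
    return sum(per_sample) / len(per_sample)


def distillation_objective(pred_time: Sequence[float],
                           true_time: Sequence[float],
                           student_od_emb: Sequence[Sequence[float]],
                           teacher_traj_emb: Sequence[Sequence[float]],
                           alpha: float = 0.5) -> float:
    """Task loss plus a weighted embedding-matching term.

    Assumption: the teacher embedding comes from historical trajectories and is
    only available during training; the student sees only the OD pair.
    `alpha` is a hypothetical trade-off weight, not a value from the article.
    """
    task_loss = mean_absolute_error(pred_time, true_time)
    distill_loss = embedding_mse(student_od_emb, teacher_traj_emb)
    return task_loss + alpha * distill_loss


# Toy usage: two packages, two-dimensional embeddings.
loss = distillation_objective(
    pred_time=[2.0, 4.0],
    true_time=[3.0, 4.0],
    student_od_emb=[[0.0, 0.0], [1.0, 1.0]],
    teacher_traj_emb=[[1.0, 1.0], [1.0, 1.0]],
    alpha=0.5,
)
```

Under this sketch the teacher embedding is needed only to compute the training loss, which matches the abstract's point that the actual trajectory is unknown at prediction time.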


      • Published in

        ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 5
        June 2024
        699 pages
        ISSN: 1556-4681
        EISSN: 1556-472X
        DOI: 10.1145/3613659

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 28 February 2024
        • Online AM: 24 January 2024
        • Accepted: 16 January 2024
        • Revised: 19 November 2023
        • Received: 27 April 2023