City indicators for geographical transfer learning: an application to crash prediction

Nanni, Mirco; Guidotti, Riccardo; Bonavita, Agnese; Alamdari, Omid Isfahani

doi:10.1007/s10707-022-00464-3

Published: 22 March 2022

GeoInformatica, volume 26, pages 581–612 (2022)

Mirco Nanni¹ (ORCID: orcid.org/0000-0003-3534-4332), Riccardo Guidotti¹,², Agnese Bonavita³ & Omid Isfahani Alamdari²

Abstract

The massive and increasing availability of mobility data enables the study and prediction of human mobility behavior and activities at various levels. In this paper, we tackle the problem of predicting the long-term crash risk of a car driver. This is a very challenging task, requiring deep knowledge of both the driver and their surroundings, yet it has several useful applications to public safety (e.g. coaching high-risk drivers) and the insurance market (e.g. adapting pricing to risk). We model each user with a data-driven approach based on a network representation of the user’s mobility. In addition, we represent the areas in which users move by defining a wide set of city indicators that capture different aspects of the city. These indicators are based on human mobility and are computed automatically from a set of different data sources, including mobility traces and road networks. Through these city indicators we develop a geographical transfer learning approach for the crash risk task, so that effective predictive models can be built for another area where labeled data is not available. Empirical results over real datasets show the superiority of our solution.

Availability of Data and Material

The vehicle datasets adopted in this work are private and were provided within the scope of the Track & Know project (https://trackandknowproject.eu/), while the city indicators can be requested through the project website.

Code Availability

The code is open source, and can be downloaded at: https://github.com/riccotti/CrashPrediction

Notes

https://tinyurl.com/32k589z2
We refer the interested reader to: https://christophm.github.io/interpretable-ml-book/shapley.html
The source code is available at: https://github.com/riccotti/CrashPrediction. The city indicators used in this paper can be obtained from the Track & Know project website (see next footnote), while the mobility datasets are proprietary, and cannot be publicly shared.
https://trackandknowproject.eu/
The drivers were sampled from those with consistent data throughout the 12 months, while making sure to keep all those who had at least one crash in the year. This latter step was not possible on Dataset 2; as a side effect, Dataset 1 has a higher percentage of crash events.
Cross-validation was also tested, yet the results did not change in any significant way.
https://lightgbm.readthedocs.io/en/latest/index.html
https://scikit-learn.org/stable/
https://keras.io/
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.RandomizedSearchCV.html
In particular, we used a random forest (RF) with 100 estimators, allowing only leaves that contain at least 1% of the training data, and with a cost matrix weighting a crash 100 times more than a non-crash.
An ablation study (omitted due to space limits) showed that both IMN-based and context-based features contributed significantly to this performance.
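The random forest configuration described in the notes above can be sketched with scikit-learn, one of the libraries linked there. This is a minimal illustration, not the authors' code: the synthetic data stands in for the private driver datasets, and mapping the cost matrix to `class_weight` is an assumption, since the notes do not say how the cost matrix was applied.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the private driver data (assumption):
# class 1 plays the role of "crash", class 0 of "no crash".
X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.95, 0.05], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

rf = RandomForestClassifier(
    n_estimators=100,             # 100 trees, as stated in the note
    min_samples_leaf=0.01,        # a float is read as a fraction of the training samples
    class_weight={0: 1, 1: 100},  # assumption: class weights approximate the cost matrix
    random_state=0,
)
rf.fit(X_train, y_train)

# Estimated crash probability for each held-out driver.
crash_risk = rf.predict_proba(X_test)[:, 1]
```

Passing `min_samples_leaf` as a float makes the minimum leaf size scale with the training set, which matches the "at least 1% of the training data" wording. A weight of 100 on the crash class is one common way to approximate an asymmetric misclassification cost.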

Acknowledgements

This work is partially supported by the European Community H2020 programme under the funding scheme Track&Know (Big Data for Mobility Tracking Knowledge Extraction in Urban Areas), G.A. 780754, https://trackandknowproject.eu/; and SoBigData++, G.A. 871042, http://www.sobigdata.eu.

Author information

Authors and Affiliations

ISTI, CNR, Via G. Moruzzi, 1, 56127, Pisa, Italy
Mirco Nanni & Riccardo Guidotti
Computer Science, University of Pisa, Largo B. Pontecorvo, 3, 56127, Pisa, Italy
Riccardo Guidotti & Omid Isfahani Alamdari
Scuola Normale Superiore, Piazza dei Cavalieri, 7, 56126, Pisa, Italy
Agnese Bonavita

Contributions

Mirco Nanni: Conceptualization, Methodology, Formal analysis, Investigation, Resources, Writing, Supervision, Project administration, Funding acquisition. Riccardo Guidotti: Conceptualization, Methodology, Formal analysis, Investigation, Software, Writing, Supervision. Agnese Bonavita: Methodology, Software, Validation, Investigation, Writing, Visualization. Omid Isfahani Alamdari: Methodology, Software, Validation, Investigation, Writing, Visualization.

Corresponding author

Correspondence to Mirco Nanni.

Ethics declarations

Conflicts of interest

Not applicable

Ethics Approval

Not applicable

Consent to Participate

Not applicable

Consent for Publication

Not applicable

About this article

Cite this article

Nanni, M., Guidotti, R., Bonavita, A. et al. City indicators for geographical transfer learning: an application to crash prediction. Geoinformatica 26, 581–612 (2022). https://doi.org/10.1007/s10707-022-00464-3

Received: 15 September 2021
Revised: 03 February 2022
Accepted: 01 March 2022
Published: 22 March 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s10707-022-00464-3
