Multivariate count time series segmentation with “sums and shares” and Poisson lognormal mixture models: a comparative study using pedestrian flows within a multimodal transport hub

  • Regular Article
  • Published:
Advances in Data Analysis and Classification

Abstract

This paper deals with a clustering approach based on mixture models to analyze multidimensional mobility count time series within a multimodal transport hub. These time series are very likely to evolve depending on various periods characterized by strikes, maintenance works, or health measures against the Covid-19 pandemic. In addition, exogenous one-off factors, such as concerts and transport disruptions, can also impact mobility. Our approach flexibly detects time segments within which the very noisy count data are synthesized into regular spatio-temporal mobility profiles. At the upper level of the modeling, evolving mixing weights are designed to detect segments properly. At the lower level, segment-specific count regression models take into account correlations between series and overdispersion, as well as the impact of exogenous factors. For this purpose, we set up and compare two promising strategies that can address this issue, namely the “sums and shares” and “Poisson log-normal” models. The proposed methodologies are applied to actual data collected within a multimodal transport hub in the Paris region, namely ticketing logs and pedestrian counts provided by stereo cameras. Experiments are carried out to show the ability of the statistical models to highlight mobility patterns within the transport hub. One model is chosen based on its ability to detect the most continuous segments possible while fitting the count time series well. An in-depth analysis of the time segmentation, mobility patterns, and impact of exogenous factors obtained with the chosen model is finally performed.


Code availability

The R code and simulated data used are available in the GitHub repository https://github.com/pdenailly/segmentation_models

Notes

  1. For these sections, only \(\textbf{x}_{j,t}^{1,...,8}\) (see Table 3) are used to compute the results, in order to exclude non-calendar effects. The total profiles and spatial distributions presented in these sections are therefore identical across days.

References

  • Agard B, Morency C, Trépanier M (2006) Mining public transport user behaviour from smart card data. IFAC Proc Vol 39(3):399–404

  • Aitchison J, Ho C (1989) The multivariate Poisson-log normal distribution. Biometrika 76(4):643–653

  • Bai J (2010) Common breaks in means and variances for panel data. J Econom 157(1):78–92

  • Baid U, Talbar S (2016) Comparative study of k-means, Gaussian mixture model, fuzzy c-means algorithms for brain tumor segmentation. In: International conference on communication and signal processing 2016 (ICCASP 2016), Atlantis Press, pp 583–588

  • Balzotti C, Bragagnini A, Briani M et al (2018) Understanding human mobility flows from aggregated mobile phone data. IFAC-PapersOnLine 51(9):25–30

  • Bouveyron C, Celeux G, Murphy TB et al (2019) Model-based clustering and classification for data science: with applications in R, vol 50. Cambridge University Press, Cambridge

  • Briand AS, Côme E, Trépanier M et al (2017) Analyzing year-to-year changes in public transport passenger behaviour using smart card data. Transp Res Part C Emerg Technol 79:274–289

  • Briand AS, Côme E, Khouadjia M et al (2019) Detection of atypical events on a public transport network using smart card data. In: European transport conference 2019, Association for European Transport (AET)

  • Cecaj A, Lippi M, Mamei M et al (2021) Sensing and forecasting crowd distribution in smart cities: potentials and approaches. IoT 2(1):33–49

  • Celeux G, Soromenho G (1996) An entropy criterion for assessing the number of clusters in a mixture model. J Classif 13(2):195–212

  • Chiquet J, Robin S, Mariadassou M (2019) Variational inference for sparse network reconstruction from count data. In: International conference on machine learning, PMLR, pp 1162–1171

  • Chiquet J, Mariadassou M, Robin S (2021) The Poisson-lognormal model as a versatile framework for the joint analysis of species abundances. Front Ecol Evol 9:188

  • Côme E, Oukhellou L (2014) Model-based count series clustering for bike sharing system usage mining: a case study with the Vélib’ system of Paris. ACM Trans Intell Syst Technol (TIST) 5(3):1–21

  • Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–22

  • Fernández-Ares A, Mora A, Arenas MG et al (2017) Studying real traffic and mobility scenarios for a smart city using a new monitoring and tracking system. Futur Gener Comput Syst 76:163–179

  • Ghaemi MS, Agard B, Trépanier M et al (2017) A visual segmentation method for temporal smart card data. Transp A Transp Sci 13(5):381–404

  • Hilbe JM (2011) Negative binomial regression. Cambridge University Press, Cambridge

  • Holland PW, Welsch RE (1977) Robust regression using iteratively reweighted least-squares. Commun Stat Theory Methods 6(9):813–827

  • Jones M, Marchand É (2019) Multivariate discrete distributions via sums and shares. J Multivar Anal 171:83–93

  • Kim J, Zhang Y, Day J et al (2018) MGLM: an R package for multivariate categorical data analysis. R J 10(1):73

  • Kristoffersen MS, Dueholm JV, Gade R et al (2016) Pedestrian counting with occlusion handling using stereo thermal cameras. Sensors 16(1):62

  • Lange K, Hunter DR, Yang I (2000) Optimization transfer using surrogate objective functions. J Comput Graph Stat 9(1):1–20

  • Lashkari D, Golland P (2007) Convex clustering with exemplar-based models. Adv Neural Inf Process Syst 20

  • Li J, Zheng P, Zhang W (2020) Identifying the spatial distribution of public transportation trips by node and community characteristics. Transp Plan Technol 43(3):325–340

  • Li Y, Rahman T, Ma T et al (2021) A sparse negative binomial mixture model for clustering RNA-seq count data. Biostatistics 24(1):68–84

  • Magidson J, Vermunt J (2002) Latent class models for clustering: a comparison with k-means. Can J Marketing Res 20(1):36–43

  • Manley E, Zhong C, Batty M (2018) Spatiotemporal variation in travel regularity through transit user profiling. Transportation 45(3):703–732

  • McLachlan GJ, Lee SX, Rathnayake SI (2019) Finite mixture models. Annu Rev Stat Appl 6:355–378

  • Mohamed K, Côme E, Oukhellou L et al (2016) Clustering smart card data for urban mobility analysis. IEEE Trans Intell Transp Syst 18(3):712–728

  • Mützel CM, Scheiner J (2021) Investigating spatio-temporal mobility patterns and changes in metro usage under the impact of COVID-19 using Taipei Metro smart card data. Public Transp 1–24

  • de Nailly P, Côme E, Samé A et al (2021) What can we learn from 9 years of ticketing data at a major transport hub? A structural time series decomposition. Transp A Transp Sci 18(3):1445–1469

  • Pavlyuk D, Spiridovska N, Yatskiv I (2020) Spatiotemporal dynamics of public transport demand: a case study of Riga. Transport 35(6):576–587

  • Peláez G, Bacara D, de la Escalera A et al (2015) Road detection with thermal cameras through 3D information. In: 2015 IEEE intelligent vehicles symposium (IV), IEEE, pp 255–260

  • Peyhardi J, Fernique P, Durand JB (2021) Splitting models for multivariate count data. J Multivar Anal 181:104677

  • Ren B, Barnett I (2020) Autoregressive mixture models for serial correlation clustering of time series data. arXiv preprint arXiv:2006.16539

  • Ripley B, Venables B, Bates DM et al (2013) Package ‘MASS’. CRAN R 538:113–120

  • Ripley B, Venables W, Ripley MB (2016) Package ‘nnet’. R package version 7(3–12):700

  • Ronchi E, Scozzari R, Fronterrè M (2020) A risk analysis methodology for the use of crowd models during the COVID-19 pandemic. LUTVDG/TVBB (3235)

  • Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464

  • Sibuya M, Yoshimura I, Shimizu R (1964) Negative multinomial distribution. Ann Inst Stat Math 16(1):409–426

  • Silva A, Rothstein SJ, McNicholas PD et al (2019) A multivariate Poisson-log normal mixture model for clustering transcriptome sequencing data. BMC Bioinf 20(1):1–11

  • Singh U, Determe JF, Horlin F et al (2020) Crowd forecasting based on WiFi sensors and LSTM neural networks. IEEE Trans Instrum Meas 69(9):6121–6131

  • Toqué F, Côme E, Oukhellou L et al (2018) Short-term multi-step ahead forecasting of railway passenger flows during special events with machine learning methods. In: CASPT 2018, conference on advanced systems in public transport and transitdata 2018, p 15

  • Truong C, Oudre L, Vayatis N (2020) Selective review of offline change point detection methods. Signal Process 167:107299

  • Wang Z, Liu H, Zhu Y et al (2021) Identifying urban functional areas and their dynamic changes in Beijing: using multiyear transit smart card data. J Urban Plan Dev 147(2):04021002

  • Winkelmann R (2008) Econometric analysis of count data. Springer Science and Business Media, Berlin

  • Zhang Y, Zhou H, Zhou J et al (2017) Regression models for multivariate count data. J Comput Graph Stat 26(1):1–13

  • Zhong C, Manley E, Arisona SM et al (2015) Measuring variability of mobility patterns from multiday smart-card data. J Comput Sci 9:125–130

  • Zhou M, Hannah L, Dunson D et al (2012) Beta-negative binomial process and Poisson factor analysis. In: Artificial intelligence and statistics, PMLR, pp 1462–1471


Author information

Corresponding author

Correspondence to Paul de Nailly.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Mixture of sums and shares model estimation

Given the covariates \(\textbf{x}_{j,t}\), the series \(\textbf{y}_{j,t}\) are distributed according to the following mixture model:

$$\begin{aligned} p(\textbf{y}_{j,t};{\varvec{\alpha }}, {\varvec{\gamma }}, r, {\varvec{\xi }}) = \sum _{s=1}^S \pi _s(j;{\varvec{\alpha }}) g(v_{j,t}|\textbf{x}_{j,t}, {\varvec{\gamma }}_s, r_s) h(\textbf{y}_{j,t}|v_{j,t},\textbf{x}_{j,t}, {\varvec{\xi }}_s), \end{aligned}$$
(A1)

with \({\varvec{\gamma }} = ({\varvec{\gamma }}_s)_{s=1,...,S}\), \(r = (r_s)_{s=1,...,S}\) and \({\varvec{\xi }} = ({\varvec{\xi }}_s)_{s=1,...,S}\). The parameters of the model are estimated with the Expectation-Maximization (EM) algorithm (Dempster et al. 1977), which iteratively maximizes the expected complete data log-likelihood. The complete data log-likelihood can be written:

$$\begin{aligned} {\mathcal {L}}_c({\varvec{\alpha }}, {\varvec{\gamma }}, r, {\varvec{\xi }} ) = \sum _{s=1}^S\sum _{j=1}^J\sum _{t=1}^T z_{j,s} \log \left( \pi _s(j;{\varvec{\alpha }}) g(v_{j,t}|\textbf{x}_{j,t}, {\varvec{\gamma }}_s, r_s) h(\textbf{y}_{j,t}|v_{j,t}, \textbf{x}_{j,t}, {\varvec{\xi }}_s) \right) . \end{aligned}$$
(A2)

Given initial values of the parameters \({\varvec{\xi }}^{(0)}\), \({\varvec{\gamma }}^{(0)}\), \(r^{(0)}\) and \({\varvec{\alpha }}^{(0)}\), the following two steps are repeated until convergence.

  • Expectation step (E) The expectation of the complete data log-likelihood is evaluated given the observed data Y and the current parameters \({\varvec{\xi }}^{(c)}\), \({\varvec{\gamma }}^{(c)}\), \(r^{(c)}\) and \({\varvec{\alpha }}^{(c)}\):

    $$\begin{aligned} Q({\varvec{\alpha }}^{(c)}, {\varvec{\gamma }}^{(c)}, r^{(c)}, {\varvec{\xi }}^{(c)}) =&\sum _{s=1}^S\sum _{j=1}^J\sum _{t=1}^T E_{{\varvec{\xi }}^{(c)}, {\varvec{\gamma }}^{(c)}, r^{(c)}, {\varvec{\alpha }}^{(c)}}[z_{j,s}|Y] \end{aligned}$$
    (A3)
    $$\begin{aligned}&\quad \times \log \left( \pi _s(j;{\varvec{\alpha }}^{(c)}) g(v_{j,t}|\textbf{x}_{j,t}, {\varvec{\gamma }}^{(c)}_s, r^{(c)}_s) h(\textbf{y}_{j,t}|v_{j,t},\textbf{x}_{j,t}, {\varvec{\xi }}^{(c)}_s) \right) , \end{aligned}$$
    (A4)

    where

    $$\begin{aligned} E_{{\varvec{\xi }}^{(c)}, {\varvec{\gamma }}^{(c)}, r^{(c)}, {\varvec{\alpha }}^{(c)}}[z_{j,s}|Y]&= \tau ^{(c)}_{j,s} \end{aligned}$$
    (A5)
    $$\begin{aligned}&= \frac{\pi _s(j;{\varvec{\alpha }}^{(c)}) \prod _{t=1}^T g(v_{j,t}|\textbf{x}_{j,t}, {\varvec{\gamma }}^{(c)}_s, r^{(c)}_s) h(\textbf{y}_{j,t}|\textbf{x}_{j,t}, v_{j,t}, {\varvec{\xi }}^{(c)}_s)}{ \sum _{s'=1}^S\pi _{s'}(j;{\varvec{\alpha }}^{(c)}) \prod _{t=1}^T g(v_{j,t}|\textbf{x}_{j,t}, {\varvec{\gamma }}^{(c)}_{s'}, r^{(c)}_{s'}) h(\textbf{y}_{j,t}|\textbf{x}_{j,t}, v_{j,t}, {\varvec{\xi }}^{(c)}_{s'})}. \end{aligned}$$
    (A6)

    The a posteriori probabilities that each day j belongs to segment s, \(\tau ^{(c)}_{j,s}\), are updated at each iteration of step E.

  • Maximization step (M) The parameters \({\varvec{\xi }}^{(c+1)}\), \({\varvec{\gamma }}^{(c+1)}\), \(r^{(c+1)}\) and \({\varvec{\alpha }}^{(c+1)}\) that maximize \(Q({\varvec{\alpha }}^{(c)}, {\varvec{\gamma }}^{(c)}, r^{(c)}, {\varvec{\xi }}^{(c)})\) are computed. This quantity can be decomposed as:

    $$\begin{aligned} Q({\varvec{\alpha }}^{(c)}, {\varvec{\gamma }}^{(c)}, r^{(c)}, {\varvec{\xi }}^{(c)}) = Q_1({\varvec{\alpha }}^{(c)}) + Q_2({\varvec{\gamma }}^{(c)},r^{(c)}) + Q_3({\varvec{\xi }}^{(c)}) \end{aligned}$$
    (A7)

    where

    $$\begin{aligned} Q_1({\varvec{\alpha }}^{(c)}) = \sum _{s=1}^S\sum _{j=1}^J \tau ^{(c)}_{j,s} \log (\pi _s(j;{\varvec{\alpha }}^{(c)})) \end{aligned}$$
    (A8)
    $$\begin{aligned} Q_2({\varvec{\gamma }}^{(c)},r^{(c)}) = \sum _{s=1}^S\sum _{j=1}^J\sum _{t=1}^T \tau ^{(c)}_{j,s} \log (g(v_{j,t}|\textbf{x}_{j,t}, {\varvec{\gamma }}^{(c)}_s, r^{(c)}_s)) \end{aligned}$$
    (A9)
    $$\begin{aligned} Q_3({\varvec{\xi }}^{(c)}) = \sum _{s=1}^S\sum _{j=1}^J\sum _{t=1}^T\tau ^{(c)}_{j,s} \log (h(\textbf{y}_{j,t}|\textbf{x}_{j,t}, v_{j,t}, {\varvec{\xi }}^{(c)}_s)). \end{aligned}$$
    (A10)

    The maximization of \(Q_1\) amounts to solving a weighted multinomial logistic regression. New values of \({\varvec{\alpha }}\) can be found using iterative procedures such as iteratively reweighted least squares (IRLS) (Holland and Welsch 1977). This problem is solved with the multinom function of the nnet package (Ripley et al. 2016). \(Q_2\) is the log-likelihood of a negative binomial generalized linear model. Its maximization is carried out through the alternating iteration process provided by the glm.nb function of the MASS package (Ripley et al. 2013). Within each segment s, for a given value of \(r^{(c)}_s\), the linear model is fitted using an IRLS method. Then, for the fitted \({\varvec{\gamma }}^{(c)}_s\) parameters, the \(r^{(c)}_s\) parameter is estimated with score and information iterations. The two steps are alternated until convergence, yielding \({\varvec{\gamma }}^{(c+1)}_s\) and \(r^{(c+1)}_s\). Note that the \(\tau ^{(c)}_{j,s}\) are used as prior weights in the fitting process. The criterion \(Q_3\), which corresponds to a weighted Dirichlet multinomial regression model, is maximized with the MGLM package (Kim et al. 2018). Because the Dirichlet multinomial distribution does not belong to the exponential family, the IRLS method is not used, as the expected information matrix is difficult to calculate. The method used here combines the minorization-maximization (MM) algorithm (Lange et al. 2000) and Newton's method: both updates are computed at each iteration and the one yielding the higher log-likelihood is retained. A minimal R sketch of these weighted fits is given below.
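
    This sketch is illustrative only: the data layout (objects dat, Y and tau), the covariate names x1 and x2, and the weight argument passed to MGLMreg are assumptions made for the example, not the code from the authors' repository.

    library(nnet)   # multinom: weighted multinomial logistic regression (Q_1)
    library(MASS)   # glm.nb: weighted negative binomial regression (Q_2)
    library(MGLM)   # MGLMreg: Dirichlet multinomial regression (Q_3)

    set.seed(1)
    J <- 40; Tt <- 24; S <- 2                           # days, time slots, segments
    dat <- data.frame(j  = rep(seq_len(J), each = Tt),  # one row per (day, time slot)
                      x1 = rnorm(J * Tt),
                      x2 = rbinom(J * Tt, 1, 0.3))
    dat$v <- rnbinom(nrow(dat), size = 5, mu = 50)      # total counts v_{j,t} ("sums")
    Y <- t(sapply(dat$v, function(v) rmultinom(1, v, prob = c(0.5, 0.3, 0.2))))  # "shares"
    tau <- matrix(runif(J * S), J, S)                   # E-step responsibilities tau_{j,s}
    tau <- tau / rowSums(tau)

    ## Q_1: mixing weights pi_s(j; alpha), fitted as a multinomial logistic regression
    ## with one row per (day, segment) and the responsibilities used as case weights.
    mix_df <- data.frame(seg = factor(rep(seq_len(S), each = J)),
                         day = rep(seq_len(J), times = S),
                         w   = as.vector(tau))
    fit_alpha <- multinom(seg ~ day, data = mix_df, weights = w, trace = FALSE)

    ## Q_2 and Q_3: one weighted regression per segment, with tau[j, s] attached to
    ## every time slot of day j as a prior weight.
    for (s in seq_len(S)) {
      dat$w_s <- tau[dat$j, s]
      fit_nb <- glm.nb(v ~ x1 + x2, data = dat, weights = w_s)      # sums part
      fit_dm <- MGLMreg(Y ~ x1 + x2, data = dat, dist = "DM",       # shares part
                        weight = dat$w_s)
    }

    In a complete implementation these fits would be wrapped in the EM loop together with the update of \(\tau ^{(c)}_{j,s}\) given in (A6).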

Appendix B: Poisson log-normal mixture model estimation

The series \(\textbf{y}_{j,t}\) are distributed according to the following mixture model:

$$\begin{aligned} p(\textbf{y}_{j,t};{\varvec{\alpha }}, {\varvec{\rho }}, {\varvec{\Sigma }}) = \sum _{s=1}^S \pi _s(j;{\varvec{\alpha }}) \int _{R^P}\left[ \prod _{p=1}^P g(\textbf{y}_{j,t,p}|\theta _{j,t,p})\right] m(\theta _{j,t}|{\varvec{\rho }}_s, {\varvec{\Sigma }}_s)d\theta _{j,t}, \end{aligned}$$
(B11)

with \({\varvec{\rho }} = ({\varvec{\rho }}_s)_{s=1,...,S}\) and \({\varvec{\Sigma }} = ({\varvec{\Sigma }}_s)_{s=1,...,S}\), where g is a Poisson distribution and m a Gaussian density. The EM algorithm can be used for parameter estimation, but computing the expected complete data log-likelihood requires the conditional expectations \(\mathop {\mathbb {E}}(Z_{js}\theta _{j,t}|\textbf{y}_{j,t},{\varvec{\rho }}_s,{\varvec{\Sigma }}_s)\) and \(\mathop {\mathbb {E}}(Z_{js}\theta _{j,t}\theta '_{j,t}|\textbf{y}_{j,t},{\varvec{\rho }}_s,{\varvec{\Sigma }}_s)\), which are intractable. These conditional expectations can be approximated with an EM algorithm coupled with Markov chain Monte Carlo (MCMC-EM), as in Silva et al. (2019), but this comes with a heavy computational load. We rely instead on the approach of Chiquet et al. (2019), which uses a variational approximation. The idea behind variational inference is to approximate the intractable posterior distribution \(p(\theta )\) by a tractable density \(q(\theta )\), here Gaussian, chosen to minimize the Kullback–Leibler divergence between the two. The marginal log-likelihood for \(\textbf{y}_{j,t}\) can be written as

$$\begin{aligned} \log p(\textbf{y}_{j,t}) = F(q(\theta _{j,t}), \textbf{y}_{j,t}) + D_{KL}(q(\theta _{j,t})|p(\theta _{j,t})), \end{aligned}$$
(B12)

with \(D_{KL}(q(\theta _{j,t})|p(\theta _{j,t}))\) the Kullback–Leibler divergence between \(q(\theta _{j,t})\) and the posterior distribution of \(\theta _{j,t}\) given \(\textbf{y}_{j,t}\). \(F(q(\theta _{j,t}), \textbf{y}_{j,t})\) is the variational lower bound of the log-likelihood; it is the criterion maximized during parameter estimation. In the case of the Poisson log-normal model, q is assumed to be a Gaussian distribution:

$$\begin{aligned} q(\theta _{j,t}; \textbf{m}_{j,t}, \textbf{S}_{j,t}) = {\mathcal {N}}(\theta _{j,t}; \textbf{m}_{j,t}, \textbf{S}_{j,t}), \end{aligned}$$
(B13)

with \(\textbf{m}_{j,t}\) and the diagonal matrix \(\textbf{S}_{j,t} = diag(\textbf{s}_{j,t})\) the variational parameters associated with the observation \(\textbf{y}_{j,t}\) of day j and time slot t. Minimizing the Kullback–Leibler divergence amounts to maximizing the variational lower bound. The complete data log-likelihood can be written as follows:

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_c({\varvec{\alpha }}, {\varvec{\rho }}, {\varvec{\Sigma }}, {\varvec{m}}, {\varvec{S}})&= \sum _{s=1}^S\sum _{j=1}^J\sum _{t=1}^T z_{j,s} \log (\pi _s(j;{\varvec{\alpha }}))+\\ {}&\sum _{s=1}^S\sum _{j=1}^J\sum _{t=1}^T z_{j,s}[F(q^{(s)}(\theta _{j,t}), \textbf{y}_{j,t}) + D_{KL}(q^{(s)}(\theta _{j,t})|p^{(s)}(\theta _{j,t}))], \end{aligned} \end{aligned}$$
(B14)

where \(D_{KL}(q^{(s)}(\theta _{j,t})|p^{(s)}(\theta _{j,t}))\) is the Kullback–Leibler divergence between \(p(\theta _{j,t}|\textbf{y}_{j,t}, z_{j}=s)\) and \(q^{(s)}(\theta _{j,t})\), with \(q^{(s)}(\theta _{j,t}) = {\mathcal {N}}(\textbf{m}_{j,t}^{(s)}, \textbf{S}_{j,t}^{(s)})\). The variational lower bound of the log-likelihood for each observation \(\textbf{y}_{j,t}\) is

$$\begin{aligned} \begin{aligned} & F(q^{(s)}(\theta _{j,t}), \textbf{y}_{j,t})= \frac{1}{2}\log |\textbf{S}_{j,t}^{(s)}|- \frac{1}{2}(\textbf{m}_{j,t}^{(s)} - \textbf{x}_{j,t}^T{\varvec{\rho }}_{s})'{\varvec{\Sigma }}_s^{-1}(\textbf{m}_{j,t}^{(s)} - \textbf{x}_{j,t}^T{\varvec{\rho }}_{s}) - tr({\varvec{\Sigma }}_s^{-1}\textbf{S}_{j,t}^{(s)}) - \\&\quad \frac{1}{2}\log |{\varvec{\Sigma }}_s|- \frac{P}{2} + (\textbf{m}^{(s)})^{'}_{j,t}\textbf{y}_{j,t} - \sum _{p=1}^P(\exp (m_{j,t,p}^{(s)} + \frac{1}{2}s_{j,t,p}^{(s)}) + \log (y_{j,t,p}!)). \end{aligned} \end{aligned}$$
(B15)
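
As a purely illustrative transcription of (B15), the following R function evaluates this bound for a single observation. The argument names are hypothetical, and the PLNmodels package implements an optimized version of this computation.

    ## variational lower bound F(q^{(s)}(theta_{j,t}), y_{j,t}) of (B15) for one observation:
    ## y and m are vectors of length P, S_diag holds the diagonal of S_{j,t}^{(s)},
    ## rho is the D x P matrix of regression coefficients and Sigma the P x P covariance
    elbo_obs <- function(y, x, m, S_diag, rho, Sigma) {
      P <- length(y)
      mu <- as.vector(crossprod(rho, x))            # x_{j,t}^T rho_s
      Sigma_inv <- solve(Sigma)
      0.5 * sum(log(S_diag)) -
        0.5 * as.numeric(t(m - mu) %*% Sigma_inv %*% (m - mu)) -
        sum(diag(Sigma_inv %*% diag(S_diag, nrow = P))) -          # tr(Sigma^{-1} S)
        0.5 * as.numeric(determinant(Sigma, logarithm = TRUE)$modulus) - P / 2 +
        sum(m * y) -
        sum(exp(m + 0.5 * S_diag) + lgamma(y + 1))                 # log(y!) = lgamma(y + 1)
    }

    ## arbitrary values, only to show the call
    elbo_obs(y = c(4, 0, 7), x = c(1, 0.5), m = log(c(4, 0, 7) + 1),
             S_diag = rep(0.1, 3), rho = matrix(0.1, 2, 3), Sigma = diag(3))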

The EM algorithm is used to estimate the parameters and the following two steps are repeated until convergence.

  • Expectation step (E) The expectation of the complete data log-likelihood is evaluated given the observed data Y, the current parameters \({\varvec{\rho }}^{(c)}\), \({\varvec{\Sigma }}^{(c)}\) and \({\varvec{\alpha }}^{(c)}\), and the variational parameters \(\textbf{m}^{(c)}_{j,t}\), \(\textbf{S}^{(c)}_{j,t}\):

    $$\begin{aligned} \begin{aligned} Q({\varvec{\rho }}^{(c)},{\varvec{\Sigma }}^{(c)},{\varvec{\alpha }}^{(c)},\textbf{m}^{(c)},\textbf{S}^{(c)})&= \sum _{s=1}^S\sum _{j=1}^J\sum _{t=1}^T \tau _{j,s}^{(c)} \log (\pi _s(j;{\varvec{\alpha }}^{(c)})) +\\&\sum _{s=1}^S\sum _{j=1}^J\sum _{t=1}^T \tau _{j,s}^{(c)} E_{{\varvec{\rho }}^{(c)}, {\varvec{\Sigma }}^{(c)}, {\varvec{\alpha }}^{(c)}, m^{(c)}_{j,t}, S^{(c)}_{j,t}}[F(q^{(s)}(\theta _{j,t}), \textbf{y}_{j,t}) +\\&D_{KL}(q^{(s)}(\theta _{j,t})|p^{(s)}(\theta _{j,t}))], \end{aligned} \end{aligned}$$
    (B16)

    with \( \tau _{j,s}^{(c)} = E_{{\varvec{\rho }}^{(c)}, {\varvec{\Sigma }}^{(c)}, {\varvec{\alpha }}^{(c)}, m^{(c)}_{j,t}, S^{(c)}_{j,t}}[z_{j,s}|Y]\). The variational lower bound of the log-likelihood is used to approximate \( \tau _{j,s}^{(c)}\):

    $$\begin{aligned} \tau _{j,s}^{(c)} = \frac{\pi _s(j;{\varvec{\alpha }}^{(c)})\prod _{t=1}^T \exp (F(q^{(s)}(\theta _{j,t}), \textbf{y}_{j,t}))}{\sum _{h=1}^S\pi _h(j;{\varvec{\alpha }}^{(c)})\prod _{t=1}^T \exp (F(q^{(h)}(\theta _{j,t}), \textbf{y}_{j,t}))}. \end{aligned}$$
    (B17)

    Note that this approximation is used in the R package PLNmodels; a brief usage sketch of this package is given after this list.

  • Maximization step (M) The maximization step is divided into two parts:

    • Conditionally on \({\varvec{\rho }}_s\) and \({\varvec{\Sigma }}_s\) and given \(\tau _{j,s}\), variational parameters \(\textbf{m}^{(c)}_{j,t}\) and \(\textbf{S}^{(c)}_{j,t}\) are updated. Because \(F(q^{(s)}(\theta _{j,t}), \textbf{y}_{j,t})\) is strictly concave with respect to \(\textbf{m}^{(c)}_{j,t}\) and \(\textbf{S}^{(c)}_{j,t}\), it is possible to obtain \(\textbf{S}^{(c+1)}_{j,t}\) with the fixed-point method and \(\textbf{m}^{(c+1)}_{j,t}\) with Newton’s method.

    • Knowing \(\tau _{j,s}^{(c)}\), \(\textbf{m}^{(c+1)}_{j,t}\) and \(\textbf{S}^{(c+1)}_{j,t}\), the parameters \({\varvec{\rho }}^{(c+1)}\), \({\varvec{\Sigma }}^{(c+1)}\) and \({\varvec{\alpha }}^{(c+1)}\) are obtained.
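
For reference, the sketch below shows how a single Poisson log-normal regression, the building block that the mixture model described above wraps inside the EM loop, can be fitted with the PLNmodels package. The count matrix, the covariate names and the dimensions are invented for the example, and the standard prepare_data/PLN workflow of the package is assumed.

    library(PLNmodels)

    ## simulated stand-in for the (J*T) x P matrix of pedestrian counts and the
    ## accompanying calendar covariates (names and sizes are hypothetical)
    n <- 200
    counts <- matrix(rpois(n * 3, lambda = 20), ncol = 3,
                     dimnames = list(paste0("obs", 1:n), c("gateA", "gateB", "gateC")))
    covariates <- data.frame(day_type = factor(sample(c("working", "weekend"), n, replace = TRUE)),
                             row.names = paste0("obs", 1:n))

    pln_data <- prepare_data(counts, covariates)
    fit <- PLN(Abundance ~ day_type, data = pln_data)
    coef(fit)    # regression coefficients, playing the role of rho_s
    sigma(fit)   # estimated covariance matrix, playing the role of Sigma_s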

Appendix C: Description of the time segments

See Table 5.

Table 5 Time segmentation

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

de Nailly, P., Côme, E., Oukhellou, L. et al. Multivariate count time series segmentation with “sums and shares” and Poisson lognormal mixture models: a comparative study using pedestrian flows within a multimodal transport hub. Adv Data Anal Classif (2023). https://doi.org/10.1007/s11634-023-00543-9


  • Received:

  • Revised:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s11634-023-00543-9

Keywords

Mathematical Subject Classification
