Mixture of shifted binomial distributions for rating data

Annals of the Institute of Statistical Mathematics

Abstract

Rating data are ordinal categorical data routinely collected in survey sampling, where each response is confined to a finite number of ordered categories. Because of population heterogeneity, respondents may follow several different rating styles, so a finite mixture model is well suited to datasets of this nature. In this paper, we propose a two-component mixture of shifted binomial distributions for rating data. We show that the model is identifiable and propose a numerically stable penalized likelihood approach for parameter estimation, adapting an expectation-maximization (EM) algorithm to compute the penalized maximum likelihood estimator. Our simulation results show that the penalized maximum likelihood estimator is consistent and effective. We fit the proposed model and other models from the literature to several real-world datasets and find that the proposed model can fit substantially better.
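
As a rough illustration of the kind of model and algorithm the abstract describes, the sketch below fits a two-component mixture of shifted binomial distributions to rating counts with an EM iteration. It is a minimal sketch under stated assumptions, not the authors' implementation. The parameterization (a rating R on 1..m with R - 1 following a Binomial(m - 1, theta) distribution), the penalty c * log(pi * (1 - pi)) on the mixing proportion, and the names shifted_binom_pmf, penalized_loglik and fit_mixture are all illustrative choices; the penalty actually used in the paper is not stated in the abstract and may differ.

```python
from math import comb

import numpy as np


def shifted_binom_pmf(m, theta):
    """P(R = r) for r = 1..m, assuming R - 1 ~ Binomial(m - 1, theta)."""
    k = np.arange(m)  # k = r - 1
    coef = np.array([comb(m - 1, j) for j in k], dtype=float)
    return coef * theta**k * (1.0 - theta) ** (m - 1 - k)


def penalized_loglik(counts, pi, t1, t2, c):
    """Mixture log-likelihood plus the assumed penalty c * log(pi * (1 - pi))."""
    m = counts.size
    mix = pi * shifted_binom_pmf(m, t1) + (1 - pi) * shifted_binom_pmf(m, t2)
    return counts @ np.log(mix) + c * np.log(pi * (1 - pi))


def fit_mixture(counts, c=1.0, init=(0.5, 0.25, 0.75), n_iter=1000, tol=1e-10):
    """EM for the penalized likelihood; counts[r - 1] = number of ratings equal to r."""
    counts = np.asarray(counts, dtype=float)
    m, n = counts.size, counts.sum()
    k = np.arange(m)
    eps = 1e-8  # keeps theta away from 0 and 1 so that log(mix) stays finite
    pi, t1, t2 = init
    pll = penalized_loglik(counts, pi, t1, t2, c)
    for _ in range(n_iter):
        # E-step: posterior probability that a rating r comes from component 1.
        f1 = pi * shifted_binom_pmf(m, t1)
        f2 = (1 - pi) * shifted_binom_pmf(m, t2)
        w = f1 / (f1 + f2)
        # M-step: closed-form updates; the penalty shifts pi towards 1/2.
        n1 = counts @ w
        pi = (n1 + c) / (n + 2 * c)
        t1 = np.clip((counts * w) @ k / ((m - 1) * n1), eps, 1 - eps)
        t2 = np.clip((counts * (1 - w)) @ k / ((m - 1) * (n - n1)), eps, 1 - eps)
        new_pll = penalized_loglik(counts, pi, t1, t2, c)
        converged = new_pll - pll < tol
        pll = new_pll
        if converged:
            break
    return pi, t1, t2, pll


if __name__ == "__main__":
    # Hypothetical counts for a 7-point rating scale (not data from the paper).
    counts = [120, 260, 210, 90, 110, 140, 70]
    pi_hat, t1_hat, t2_hat, pll = fit_mixture(counts)
    print(pi_hat, t1_hat, t2_hat, pll)
```

With this particular penalty the update for the mixing proportion stays in closed form, (n1 + c) / (n + 2c), and can never reach 0 or 1, which is one simple way to keep the iteration numerically stable. Like any EM fit of a mixture, the result depends on the starting values and the component labels can swap, so several initializations are advisable.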

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 11701071 and 11871419), the Natural Science and Engineering Research Council (Grant No. 2019–04204), and the Scientific Research Projects of Dongbei University of Finance and Economics (Grant No. 20210261). We thank the associate editor and the referee for their helpful comments and suggestions.

Author information

Corresponding author

Correspondence to Jiahua Chen.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 133 KB)

About this article

Cite this article

Li, S., Chen, J. Mixture of shifted binomial distributions for rating data. Ann Inst Stat Math 75, 833–853 (2023). https://doi.org/10.1007/s10463-023-00865-7

  • DOI: https://doi.org/10.1007/s10463-023-00865-7
