CAPTURING EXCESS ZEROS IN MODELING AUTO-INSURANCE CLAIMS IN AN INDIGENOUS INSURANCE FIRM USING ZERO INFLATED MODELS AND HURDLE MODELS

Mary Akinyemi; Abisola Rufai; Nofiu Idowu Badmus

Mary Akinyemi University of Lagos
Abisola Rufai Bank of America
Nofiu Idowu Badmus University of Lagos

Abstract

Count data occur naturally in a number of disciplines, ranging from economics and the social sciences to finance and the medical sciences. Most count data are affected by over-dispersion and excess zeros, which makes them difficult to model with standard linear models. Several models have been proposed to capture these features. Classical regression models such as the generalized Poisson and negative binomial have been used to model over-dispersed count data, while hurdle and zero-inflated models are reported to capture both over-dispersion and excess zeros.
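For reference, the distinction between the two families of models can be written out for the Poisson case. This is the standard textbook form rather than notation taken from this paper; the symbols $\pi$ (probability of a structural zero), $\pi_0$ (probability of not crossing the hurdle) and $\lambda$ (Poisson mean) are assumptions introduced here for illustration.

```latex
% Zero-inflated Poisson: zeros arise from a point mass at zero
% and from the Poisson component.
\[
P(Y = y) =
\begin{cases}
\pi + (1 - \pi)\, e^{-\lambda}, & y = 0, \\[4pt]
(1 - \pi)\, \dfrac{e^{-\lambda} \lambda^{y}}{y!}, & y = 1, 2, \ldots
\end{cases}
\]

% Poisson hurdle: all zeros come from the binary part; positive
% counts follow a zero-truncated Poisson.
\[
P(Y = y) =
\begin{cases}
\pi_0, & y = 0, \\[4pt]
(1 - \pi_0)\, \dfrac{e^{-\lambda} \lambda^{y}}{y!\,\bigl(1 - e^{-\lambda}\bigr)}, & y = 1, 2, \ldots
\end{cases}
\]
```

In a regression setting, $\lambda$ and the zero probability are typically linked to covariates, for example through log and logit links respectively.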

In this paper, we compare the performance of Poisson and negative binomial hurdle models, zero-inflated Poisson and negative binomial models, classical Poisson and negative binomial regression models, and zero-inflated compound Poisson generalized linear models in modelling the frequency of auto insurance claims in a typical emerging market.

The model parameters are estimated by maximum likelihood. The models' performance is compared using several model selection criteria, including the Akaike and Bayesian information criteria (AIC and BIC) and the Gini index. The zero-inflated compound Poisson generalized linear models performed better than the other models considered.
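To illustrate the kind of comparison described, the following is a minimal sketch, not the analysis carried out in the paper. It assumes intercept-only models (no covariates), fits a classical Poisson and a zero-inflated Poisson by maximum likelihood, and compares them by AIC and BIC. The claim counts and all function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical claim counts (NOT the paper's data): 80 policies with
# no claim, 10 with one, 6 with two and 4 with three claims.
counts = [0] * 80 + [1] * 10 + [2] * 6 + [3] * 4
n = len(counts)


def poisson_loglik(ys, lam):
    """Log-likelihood of an intercept-only Poisson model."""
    return sum(-lam + y * math.log(lam) - math.lgamma(y + 1) for y in ys)


def zip_loglik(ys, pi, lam):
    """Log-likelihood of an intercept-only zero-inflated Poisson model."""
    total = 0.0
    for y in ys:
        if y == 0:
            total += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            total += (math.log(1 - pi) - lam + y * math.log(lam)
                      - math.lgamma(y + 1))
    return total


def fit_zip_em(ys, iters=200):
    """Maximum likelihood estimates of (pi, lam) via a simple EM loop."""
    size = len(ys)
    n_zero = sum(1 for y in ys if y == 0)
    total = sum(ys)
    pi, lam = 0.5, total / size
    for _ in range(iters):
        # E-step: probability that an observed zero is a structural zero.
        w = pi / (pi + (1 - pi) * math.exp(-lam))
        # M-step: update the mixing weight and the Poisson mean.
        pi = n_zero * w / size
        lam = total / (size - n_zero * w)
    return pi, lam


def aic(loglik, k):
    return 2 * k - 2 * loglik


def bic(loglik, k, size):
    return k * math.log(size) - 2 * loglik


# Classical Poisson: the MLE of the mean is the sample mean.
lam_pois = sum(counts) / n
ll_pois = poisson_loglik(counts, lam_pois)

# Zero-inflated Poisson: two parameters (pi, lam).
pi_hat, lam_hat = fit_zip_em(counts)
ll_zip = zip_loglik(counts, pi_hat, lam_hat)

aic_pois, bic_pois = aic(ll_pois, 1), bic(ll_pois, 1, n)
aic_zip, bic_zip = aic(ll_zip, 2), bic(ll_zip, 2, n)

print(f"Poisson: AIC={aic_pois:.2f} BIC={bic_pois:.2f}")
print(f"ZIP:     AIC={aic_zip:.2f} BIC={bic_zip:.2f}")
```

Lower AIC and BIC values indicate the preferred model. On zero-heavy data such as this toy sample, the zero-inflated model is expected to be favoured despite its extra parameter. A full analysis would include covariates and the other model families listed above.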
