ACTUARIAL APPLICATIONS OF WORD EMBEDDING MODELS

Published online by Cambridge University Press:  22 October 2019

Gee Y Lee*
Affiliation:
Department of Statistics and Probability, Department of Mathematics, Michigan State University, C337 Wells Hall, 619 Red Cedar Rd, East Lansing, MI 48824, USA. E-mail: leegee@msu.edu
Scott Manski
Affiliation:
Department of Statistics and Probability, Michigan State University, C511 Wells Hall, 619 Red Cedar Rd, East Lansing, MI 48824, USA. E-mail: manskisc@stt.msu.edu
Tapabrata Maiti
Affiliation:
Department of Statistics and Probability, Michigan State University, C424 Wells Hall, 619 Red Cedar Rd, East Lansing, MI 48824, USA. E-mail: maiti@stt.msu.edu

Abstract

In insurance analytics, textual descriptions of claims are often discarded because traditional empirical analyses require numeric descriptor variables. This paper demonstrates how textual data can be easily used in insurance analytics. Using the concept of word similarities, we illustrate how to extract variables from text and incorporate them into claims analyses using standard generalized linear models or generalized additive regression models. The procedure is applied to the Wisconsin Local Government Property Insurance Fund (LGPIF) data to demonstrate how insurance claims management and risk mitigation procedures can be improved. We illustrate two applications. First, we show how the claims classification problem can be solved using textual information. Second, we analyze the relationship between risk metrics and the probability of large losses. We obtain good results for both applications, in which short textual descriptions of insurance claims are used to extract features.
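
As a rough illustration of the word-similarity idea summarized above, the following minimal sketch turns a short claim description into numeric features by averaging the cosine similarity between its words and a chosen keyword. It is not the authors' procedure: the three-dimensional vectors, the keywords, the example descriptions and the helper name `similarity_feature` are all hypothetical, and a real analysis would presumably use pre-trained embeddings instead of the toy table.

```python
import math

# Toy 3-dimensional word vectors, invented purely for illustration.
# A real analysis would load pre-trained embeddings instead.
TOY_VECTORS = {
    "lightning": [0.9, 0.1, 0.0],
    "hail": [0.8, 0.2, 0.1],
    "vandalism": [0.0, 0.9, 0.2],
    "graffiti": [0.1, 0.8, 0.3],
    "damage": [0.4, 0.4, 0.4],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_feature(description, keyword, vectors=TOY_VECTORS):
    """Mean cosine similarity between the description's words and a keyword.

    Words without a vector are skipped; the result is 0.0 if no word
    in the description has a vector.
    """
    target = vectors[keyword]
    sims = [
        cosine_similarity(vectors[word], target)
        for word in description.lower().split()
        if word in vectors
    ]
    return sum(sims) / len(sims) if sims else 0.0


# Hypothetical claim descriptions and keywords.
claims = ["Lightning damage to roof", "Graffiti on building wall"]
keywords = ("hail", "vandalism")

# One row per claim, one numeric column per keyword.
features = [[similarity_feature(c, k) for k in keywords] for c in claims]
```

Each row of `features` is an ordinary numeric covariate vector, so under these assumptions it could be passed to any standard regression routine alongside the usual rating variables.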

Type
Research Article
Copyright
© Astin Bulletin 2019 
