
Case studies on using natural language processing techniques in customer relationship management software

Journal of Intelligent Information Systems

Abstract

How can we use a text corpus stored in a customer relationship management (CRM) database for data mining and segmentation? To answer this question, we adopted state-of-the-art methods commonly used in the natural language processing (NLP) literature, such as word embeddings, and in the deep learning literature, such as recurrent neural networks (RNNs). We used the text notes that customer representatives of an internet ads consultancy agency entered into its CRM system between 2009 and 2020. We trained word embeddings on this corpus and showed that they can be used directly for data mining and can also serve as input to RNN architectures built with long short-term memory (LSTM) units for more comprehensive segmentation objectives. The results show that structured text data populated in a CRM can be mined for valuable information. Hence, any CRM can be equipped with useful NLP features once the problem definitions are built correctly and the solution methods are implemented conveniently.
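As a rough illustration of how word vectors learned from CRM notes can be queried directly for data mining, the sketch below builds simple count-based co-occurrence vectors and ranks words by cosine similarity. It is not the method used in the article: the toy notes, the function names, and the window size are all hypothetical, and count-based vectors stand in here for trained word embeddings only so that the example runs with the Python standard library.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy notes; the article's CRM corpus is confidential.
NOTES = [
    "customer asked about ads budget",
    "client asked about ads campaign",
    "customer requested invoice copy",
    "client requested invoice email",
]


def cooccurrence_vectors(notes, window=2):
    """Map each word to a Counter of the words seen within `window` tokens."""
    vectors = defaultdict(Counter)
    for note in notes:
        tokens = note.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return dict(vectors)  # plain dict, so lookups never insert new keys


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(word, vectors, topn=3):
    """Return up to `topn` (word, similarity) pairs, most similar first."""
    target = vectors.get(word)
    if target is None:
        return []  # unknown word: nothing to compare against
    scores = [(other, cosine(target, vec))
              for other, vec in vectors.items() if other != word]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:topn]


VECTORS = cooccurrence_vectors(NOTES)
```

On this toy corpus, `most_similar("customer", VECTORS)` ranks `client` first, because the two words occur in identical contexts. Trained embeddings would support the same kind of nearest-neighbour query over a real corpus.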


Notes

  1. In this study, we followed the procedures and principles set by the personal data protection regulations to which the company is subject, as well as the company’s privacy policy, of which customers are notified.


Author information


Contributions

All the related work was carried out by the corresponding author.

Corresponding author

Correspondence to Şükrü Ozan.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Availability of Data and Material

The data are strictly confidential and therefore cannot be shared under any circumstances.

Code Availability

The code is available in Jupyter Notebook format and can be shared upon request.


Cite this article

Ozan, Ş. Case studies on using natural language processing techniques in customer relationship management software. J Intell Inf Syst 56, 233–253 (2021). https://doi.org/10.1007/s10844-020-00619-4


  • DOI: https://doi.org/10.1007/s10844-020-00619-4
