Opinion mining with reviews summarization based on clustering

  • Original Research
International Journal of Information Technology

Abstract

Automatic text summarization can be used in recommendation systems to present useful text extracted from the available comments and reviews. To summarize, a human reads the whole text and builds a background understanding of it, whereas computers work differently. Many methods have been proposed for automatic text summarization, ranging from abstractive methods, which generate new sentences from the important points in the text, to extractive methods, which select the most important original sentences from the text. In this study, we present an extractive method for text summarization. First, the sentences are preprocessed, and the similarities between sentences are calculated using a proposed similarity measure. The sentences are then clustered based on these similarities, and finally a certain number of sentences are selected from each cluster. The Gaussian Mixture Model (GMM) algorithm is used to cluster the sentences. The proposed method is tested on a dataset of customer reviews collected from Tripadvisor (https://www.tripadvisor.com/), and the results show that GMM yields a more informative summary with more varied sentences than K-means.
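
The pipeline described in the abstract (vectorize sentences, cluster them with a GMM, pick a few sentences per cluster) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define the proposed similarity measure, so plain TF-IDF vectors are used as a stand-in, and the function name `summarize`, its parameters, and the sample `reviews` are hypothetical. Selecting the sentences closest to each cluster mean is likewise an assumed selection rule.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.mixture import GaussianMixture


def summarize(sentences, n_clusters=2, per_cluster=1, seed=0):
    """Pick `per_cluster` representative sentences from each GMM cluster."""
    # Assumption: TF-IDF vectors stand in for the paper's similarity measure.
    vectorizer = TfidfVectorizer(stop_words="english")
    features = vectorizer.fit_transform(sentences).toarray()

    # Diagonal covariances keep the fit stable when there are few sentences.
    gmm = GaussianMixture(n_components=n_clusters, covariance_type="diag",
                          reg_covar=1e-3, random_state=seed)
    labels = gmm.fit_predict(features)

    chosen = []
    for k in range(n_clusters):
        members = np.flatnonzero(labels == k)
        if members.size == 0:
            continue  # a component may end up with no assigned sentences
        # Rank the cluster's sentences by distance to its mean, closest first.
        dist = np.linalg.norm(features[members] - gmm.means_[k], axis=1)
        chosen.extend(members[np.argsort(dist)[:per_cluster]].tolist())

    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]


# Hypothetical hotel review sentences, for illustration only.
reviews = [
    "The room was clean and the bed was comfortable.",
    "Our room was spotless and very comfortable.",
    "The bed was comfortable and the room was quiet.",
    "Breakfast was tasty with plenty of choice.",
    "The breakfast buffet had a great choice of food.",
    "Tasty food at breakfast every morning.",
]
summary = summarize(reviews, n_clusters=2, per_cluster=1)
print(summary)
```

Swapping `GaussianMixture` for `sklearn.cluster.KMeans` (and `gmm.means_` for `cluster_centers_`) would give a K-means baseline of the kind the abstract compares against.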

Fig. 1
Fig. 2

Notes

  1. https://www.nltk.org.

  2. https://scikit-learn.org.

Author information

Corresponding author

Correspondence to Hedieh Sajedi.

Ethics declarations

Conflict of interest

None of the authors has any conflict of interest.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 14 kb)

About this article

Cite this article

Marzijarani, S.B., Sajedi, H. Opinion mining with reviews summarization based on clustering. Int. j. inf. tecnol. 12, 1299–1310 (2020). https://doi.org/10.1007/s41870-020-00511-y

  • DOI: https://doi.org/10.1007/s41870-020-00511-y
