A Review of Text-Based Recommendation Systems
IEEE Access (IF 3.4), Pub Date: 2021-02-13, DOI: 10.1109/access.2021.3059312
Safia Kanwal, Sidra Nawaz, Muhammad Kamran Malik, Zubair Nawaz

Websites across the Internet produce a wide variety of textual data, such as news, research articles, ebooks, personal blogs, and user reviews. The volume of text on these websites is so large that finding pertinent information often becomes cumbersome for users. To overcome this issue, text-based recommendation systems (RS) are being developed. These systems find relevant information in minimal time using text as the primary feature. Several techniques exist to build and evaluate such systems. Although a good number of surveys compile the general attributes of recommendation systems, a comprehensive literature review of text-based recommendation systems is still lacking. In this paper, we present a review of the latest studies on text-based RS. We conducted this survey by collecting literature published during 2010-2020 from preeminent digital repositories. The survey covers four major aspects of the text-based recommendation systems in the reviewed literature: datasets, feature extraction techniques, computational approaches, and evaluation metrics. Because benchmark datasets play a vital role in any research, publicly available datasets are extensively reviewed in this paper. Many proprietary datasets, which are not publicly available, are also used for text-based RS. We have consolidated the attributes of both the publicly available and the proprietary datasets to familiarize new researchers with them. Furthermore, methods for extracting features from text are briefly described, and their use in the construction of text-based RS is discussed. Various computational approaches that use these features are then discussed. We also present an overview of the evaluation metrics adopted to evaluate these systems and chart them according to their popularity. The survey concludes that word embedding is a widely used feature selection technique in the latest research. It also finds that hybridizing text features with other features enhances recommendation accuracy. The study highlights that most of the work is on English textual data and that news recommendation is the most popular domain.
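As a rough illustration of what "using text as the primary feature" can mean in practice, the sketch below ranks items by the similarity of their text to an item a user already liked. It is not taken from the paper: it assumes TF-IDF weights and cosine similarity as a simple, dependency-free stand-in for the feature extraction techniques the survey covers (the survey itself reports word embeddings as widely used), and every name in it (`tokenize`, `tfidf_vectors`, `cosine`, `recommend`, `ITEMS`) is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(items, liked_index, top_k=2):
    """Return indices of the top_k items whose text is closest to the liked item."""
    vectors = tfidf_vectors(items)
    target = vectors[liked_index]
    scored = [
        (cosine(target, vec), i)
        for i, vec in enumerate(vectors)
        if i != liked_index
    ]
    scored.sort(reverse=True)
    return [i for _, i in scored[:top_k]]


# Made-up example items (illustrative only, not from any dataset in the survey).
ITEMS = [
    "Election results announced after a close national vote",
    "Parliament debates new election law before the national vote",
    "Local team wins the football championship final",
    "Recipe for a quick vegetable soup",
]

if __name__ == "__main__":
    print(recommend(ITEMS, liked_index=0, top_k=1))
```

In this toy setup, a reader who liked the first news item is recommended the second one, because the two share the terms "election", "national", and "vote". A hybrid system of the kind the abstract mentions would combine such text similarity with other signals, which this sketch does not attempt.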

Updated: 2021-02-13