Abstract

As library collections grow, readers find it increasingly difficult to locate the books they want quickly, so book recommendation systems are becoming more and more important. Building on previous research, this paper proposes a book recommendation algorithm based on collaborative filtering and interest degree. The interest degree of the book itself is taken as an important measurement index; it comprises the number of searches, borrowing time, number of borrowings, borrowing interval, and number of renewals. Experiments evaluated with MAE and RMSE show that the proposed method converges faster than the traditional method.

1. Introduction

With the continuous deepening of informatization, all walks of life are carrying out informatization reforms. In this context, the digitalization of library management information is gradually improving [1]. Digital libraries are popular with readers for their convenient and fast document retrieval, personalized recommendations, and other characteristic services [2]. Faced with a large and varied catalog, however, readers find it difficult to locate books of interest in a short time, so the user experience of the traditional library borrowing method is poor. Many scholars have therefore proposed book recommendation methods. Content-based recommendation algorithms recommend items similar to those the user liked in the past [3]. Recommendation algorithms based on association rules [4] treat different books borrowed by the same user as associated [5]; the set of books with the highest degree of association found in the borrowing records can serve as an important reference for book recommendation [6]. Other work combines tagging and association rule mining with user-based collaborative filtering, which does not depend on the items themselves [7]: it finds the users most similar to the current user and recommends the books borrowed by these neighboring users to the current user [8]. Further approaches include a collaborative filtering recommendation algorithm based on item popularity in the scoring system combined with average preference weights [9] and book recommendation algorithms based on social networks [10].

This paper proposes a book recommendation algorithm based on collaborative filtering and interest degree. The collaborative filtering component is calculated with cosine similarity; the interest degree is derived from book attributes, namely search times, borrowing time, borrowing times, borrowing interval, and renewal times. Finally, the mean absolute error (MAE) and the root mean square error (RMSE) are used to evaluate the results.

The rest of this paper is organized as follows. Section 2 analyzes the factors that affect book recommendation from the two aspects of interest and similarity. Section 3 discusses the book recommendation algorithm based on collaborative filtering and interest degree. Section 4 presents the experimental analysis, and Section 5 concludes the paper with a summary and future research directions.

2. Analysis of the Factors Affecting Book Recommendation

Recommending books to users can be analyzed in terms of interest and similarity, as shown in Figure 1.

Interest degree is a characteristic of the book itself; recommendation is made from the perspective of the book's attractiveness to the target user [11]. It covers both the interest of the book itself and the interest of the user. Book interest refers to attributes of the book itself, including search times, borrowing time, borrowing interval, borrowing times, and renewal times [12]. User interest refers to the books that a user likes, the books that the user has already borrowed, and so on.

Similarity approaches recommendation from the perspective of relevance to the target user. Other users can be related to the target user in two ways [13]. Either they share attributes with the target user, such as major, grade, and gender, or they have borrowed the same books as the target user; in both cases, the books borrowed by these users can also be recommended to the target user [14].

After interest and similarity have been analyzed together, the relevant model is used to make predictions, and a recommendation list is generated [15]. Finally, the prediction results are assessed with the evaluation indicators.

3. Book Recommendation Algorithm Based on Collaborative Filtering and Interest

3.1. Book Interest Model

Attributes related to books include search times, borrowing time, borrowing times, borrowing interval, and renewing times.

The search proportion of a book is the ratio of the number of searches for that book to the total number of searches for all books. For normalization, the proportion is divided into five levels, and the grade score of the search proportion is assigned by level. A higher ranking yields a higher score.

The length of the borrowing time broadly reflects the popularity of a book [16]. A loan may of course run long because the reader forgot to return the book or could not return it during a holiday; such special cases are not considered here for the time being. The grade score of the borrowing time is computed from the number of borrowers of the book, the longest borrowing time of the book, and the return time and borrowing time of each loan. A higher ranking yields a higher score.

The number of borrowings accurately reflects the popularity of a book [17]: the more often a book is borrowed, the more popular it is. The rating value of the number of borrowings is computed from the number of borrowings of the book and the total number of borrowings of all books. The books are ranked by number of borrowings and divided into five levels. Books ranked in the top 20% are assigned the highest level; books ranked in the 81%-100% band are assigned the lowest level and receive the lowest score.
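
The five-level ranking described above can be sketched as follows. This is a minimal illustration, not the paper's formula: the function name `level_scores`, the integer scores 1 to 5, and the tie-breaking by input order are all assumptions made for the example.

```python
def level_scores(counts, levels=5):
    """Map each book's count to a level score from 1 (lowest) to `levels` (highest).

    Books are ranked by count in descending order. With levels=5, ranks in the
    top 20% receive score 5 and ranks in the 81%-100% band receive score 1.
    Ties keep their input order (an assumption; the source does not say).
    """
    ranked = sorted(counts, key=counts.get, reverse=True)
    n = len(ranked)
    scores = {}
    for position, book in enumerate(ranked, start=1):
        # Integer ceiling of position * levels / n: band 1 is the top band.
        band = (position * levels + n - 1) // n
        scores[book] = levels - band + 1
    return scores


# Hypothetical borrowing counts for ten books.
borrow_counts = {"A": 50, "B": 40, "C": 30, "D": 20, "E": 10,
                 "F": 9, "G": 8, "H": 7, "I": 6, "J": 5}
scores = level_scores(borrow_counts)
print(scores["A"], scores["E"], scores["J"])  # 5 3 1
```

The same function can be applied to any of the five attributes once each book has a single numeric value for that attribute.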

The borrowing interval is the time between a book being returned and being borrowed again [18]. The shorter the interval, the greater the demand for the book, that is, the greater its popularity. Conversely, if a book is returned and never borrowed again, its popularity is very low. The grade score of the borrowing interval is computed from the number of borrowers of the book, the longest borrowing time of the book, and the sum of the gaps between each return time and the next reader's borrowing time. A higher ranking yields a higher score.
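
As a rough illustration of the raw quantity behind this indicator, the sketch below averages the gaps between one return and the next borrowing. The function name `mean_borrowing_interval`, the loan records, and the use of a plain average in days are assumptions for the example; the normalization used in the paper's formula is not reproduced here.

```python
from datetime import date


def mean_borrowing_interval(loans):
    """Average number of days a book waits between a return and the next loan.

    `loans` is a list of (borrow_date, return_date) pairs for one book.
    Returns None when the book has fewer than two loans, since no interval exists.
    """
    ordered = sorted(loans)  # chronological order by borrow date
    gaps = [
        (next_borrow - prev_return).days
        for (_, prev_return), (next_borrow, _) in zip(ordered, ordered[1:])
    ]
    return sum(gaps) / len(gaps) if gaps else None


# Hypothetical loan history of a single book.
loans = [
    (date(2020, 3, 1), date(2020, 3, 15)),
    (date(2020, 3, 18), date(2020, 4, 2)),
    (date(2020, 4, 9), date(2020, 4, 20)),
]
print(mean_borrowing_interval(loans))  # 5.0
```

A book that is never borrowed again yields `None`, which a caller could map to the lowest level.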

The number of renewals of a book also reflects its popularity to a certain extent. The grade score of the renewal proportion is computed from the number of renewals of the book and the total number of renewals of all books. A higher ranking yields a higher score.

Finally, the average of the five indicators is used as the comprehensive indicator of book interest.
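
A minimal sketch of this averaging step, assuming that each of the five indicators has already been converted to a level score on a common scale (here 1 to 5). The function and parameter names are illustrative only.

```python
def book_interest(search, borrow_time, borrow_count, interval, renewals):
    """Comprehensive book interest: arithmetic mean of the five indicator scores."""
    indicators = [search, borrow_time, borrow_count, interval, renewals]
    return sum(indicators) / len(indicators)


# Hypothetical level scores for one book.
print(book_interest(search=5, borrow_time=4, borrow_count=4, interval=3, renewals=2))  # 3.6
```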

3.2. Collaborative Filtering Recommendation Model

The basic idea of the collaborative filtering algorithm is to find users similar to the current user and to predict the current user's scores from the scores given by those similar users, in order to make recommendations. A recommendation system based on collaborative filtering does not analyze the content of the data; instead, it establishes an effective evaluation feedback mechanism through which users provide feedback [19]. In other words, the recommendations a user receives may not be mined from the data at all but contributed by other users. There are three main steps: collecting scoring data, finding neighbors, and generating a recommendation list.

The scoring data is shown in Table 1.

The user-based collaborative filtering algorithm finds neighbor users with high similarity to the current user and then recommends to the current user items that the neighbors have rated but the current user has not. The steps are as follows: calculate the similarity between the current user and every other user; sort the users by similarity from highest to lowest and take the top-ranked users as the current user's neighbors; remove the items already rated by the current user from the neighbors' rating lists; predict the current user's score for each remaining unrated item; and recommend the item with the highest predicted score to the current user.
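
The steps above can be sketched as follows. The similarities are assumed to be precomputed, and the prediction uses a similarity-weighted average of the neighbors' ratings; the source does not state its prediction rule, so that choice, like the names `recommend`, `ratings`, and `k`, is an assumption for illustration.

```python
def recommend(target, ratings, similarity, k=2):
    """Return (best_book, predictions) for `target` using its k nearest neighbors.

    ratings:    {user: {book: score}}
    similarity: {other_user: similarity to `target`}, assumed precomputed
    """
    # Rank the other users by similarity and keep the top k as neighbors.
    neighbors = sorted(similarity, key=similarity.get, reverse=True)[:k]

    # Candidate books: rated by some neighbor but not yet by the target user.
    seen = set(ratings[target])
    candidates = {book for n in neighbors for book in ratings[n]} - seen

    # Predict each candidate's score as a similarity-weighted average.
    predictions = {}
    for book in candidates:
        raters = [n for n in neighbors if book in ratings[n]]
        weight = sum(abs(similarity[n]) for n in raters)
        if weight > 0:
            total = sum(similarity[n] * ratings[n][book] for n in raters)
            predictions[book] = total / weight

    # Recommend the candidate with the highest predicted score, if any.
    best = max(predictions, key=predictions.get) if predictions else None
    return best, predictions


# Hypothetical rating data and similarities to user "u1".
ratings = {
    "u1": {"b1": 5, "b2": 3},
    "u2": {"b1": 5, "b2": 3, "b3": 4},
    "u3": {"b1": 4, "b2": 2, "b4": 5},
    "u4": {"b1": 1, "b5": 5},
}
similarity = {"u2": 0.9, "u3": 0.8, "u4": 0.1}
best, predictions = recommend("u1", ratings, similarity, k=2)
print(best)  # b4
```

With `k=2`, user `u4` is excluded from the neighborhood, so `b5` is never considered even though `u4` rated it highly.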

Cosine similarity measures the linear correlation between two sets of ratings, and its value lies between -1 and 1. It is calculated over the set of items rated by both users [20]. Before the calculation, each user's average rating over all items that the user has rated is subtracted from that user's ratings. The similarity between two users is then computed from the set of items rated by each user and from each user's rating of every item in the common set.
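
A minimal sketch of this mean-centered cosine similarity is given below. It assumes that each user's mean is taken over all items that user has rated and that the norms are taken over the co-rated items only; the paper's formula may differ in these details, and the function name `centered_cosine` is illustrative.

```python
from math import sqrt


def centered_cosine(ratings_u, ratings_v):
    """Cosine similarity over co-rated items after subtracting each user's mean.

    ratings_u, ratings_v: {item: score} for the two users.
    Returns 0.0 when the users share no items or when one user's centered
    ratings on the shared items are all zero.
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    mean_v = sum(ratings_v.values()) / len(ratings_v)
    dot = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    norm_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    norm_v = sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical ratings: bob agrees with alice, carol has opposite preferences.
alice = {"b1": 5, "b2": 3, "b3": 1}
bob = {"b1": 4, "b2": 3, "b3": 2}
carol = {"b1": 1, "b2": 3, "b3": 5}
print(centered_cosine(alice, bob))    # close to 1
print(centered_cosine(alice, carol))  # close to -1
```

Centering is what allows the value to become negative: without it, non-negative ratings would always give a similarity between 0 and 1.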

In this paper, the interest degree and the collaborative filtering result are averaged to give a comprehensive score.
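
One way to read this averaging step is sketched below. It assumes that the interest degree and the collaborative filtering prediction are on the same scale and that only books with both values are ranked; neither point is stated in the source, and the name `rank_books` is illustrative.

```python
def rank_books(interest_scores, cf_predictions):
    """Rank books by the mean of their interest degree and their predicted rating.

    Both arguments map book -> score. Only books present in both are ranked.
    """
    books = interest_scores.keys() & cf_predictions.keys()
    combined = {b: (interest_scores[b] + cf_predictions[b]) / 2 for b in books}
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical scores on a shared 1-to-5 scale.
interest = {"b3": 3.6, "b4": 2.0, "b5": 4.8}
predicted = {"b3": 4.0, "b4": 5.0, "b5": 1.2}
print(rank_books(interest, predicted))  # ['b3', 'b4', 'b5']
```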

3.3. Evaluation Model

The mean absolute error (MAE) and the root mean square error (RMSE) are two standard measures of the accuracy of a recommendation system.

The attributes are divided into five levels and ranked according to the data. Those ranked in the top 20% are assigned the highest level; those ranked in the 81%-100% band are assigned the lowest level and receive the lowest score.

The mean absolute error, formula (8), is the average absolute difference between the predicted user rating and the user's actual rating. The smaller the MAE, the closer the scores predicted by the recommendation algorithm are to the actual scores.

The root mean square error, formula (9), is computed over the test data. For each user and book in the test set, the difference between the user's actual score for the book and the user's predicted score for the book is squared; the squares are summed and divided by the size of the test data set, and the square root of the result is taken.
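
Both measures can be computed with a few lines of code. The sketch below uses the standard definitions of MAE and RMSE; the function names and the sample scores are illustrative and not taken from the experiments.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error between actual and predicted scores."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error between actual and predicted scores."""
    return sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical test-set scores: one entry per (user, book) pair.
actual = [4, 3, 5, 2]
predicted = [3.5, 3.0, 4.0, 3.0]
print(mae(actual, predicted))   # 0.625
print(rmse(actual, predicted))  # 0.75
```

Because the errors are squared before averaging, RMSE penalizes large individual errors more heavily than MAE, which is why the two values differ on the same data.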

4. Experimental Results and Analysis

The experimental data consist of the borrowing data and user data of Wuxi Vocational College of Science and Technology from 2014 to 2020; the data were duplicated to three times the original volume. The borrowing data for 2020 are shown in Table 2. In the experiment, the data set is divided into a training set and a test set at a ratio of 8:2.
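
An 8:2 split of this kind could be produced as follows. The source does not say how records were assigned to the two sets, so the random shuffle, the fixed seed, and the name `split_records` are assumptions for the example.

```python
import random


def split_records(records, train_fraction=0.8, seed=0):
    """Shuffle the records and split them into (training set, test set)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten placeholder record ids standing in for borrowing records.
train, test = split_records(range(10))
print(len(train), len(test))  # 8 2
```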

The MAE is calculated according to formula (8). Part of the resulting data is shown in Table 3, and the MAE for the first 150 neighbors is shown in Figure 2.

Figure 2 shows that, over the first 150 neighbor users, the MAE gradually decreases from a maximum of 0.89. At about 40 neighbor users it becomes essentially stable at around 0.6. The data are then divided into two categories, literature and history and science and engineering, for separate analysis. Figure 3 shows the MAE for science and engineering, and Figure 4 shows the MAE for literature and history. The literature and history category becomes essentially stable at about 40 neighbor users, similar to Figure 2. The science and engineering category also tends to stabilize at about 40 neighbor users, but with some fluctuations. This indicates that literature and history data account for a higher proportion of the whole data set and that these books are borrowed more frequently.

The RMSE is calculated according to formula (9). Part of the resulting data is shown in Table 4, and the RMSE for the first 150 neighbors is shown in Figure 5.

Figure 5 shows that the literature and history category again becomes stable at about 40 neighbor users, while the science and engineering category becomes stable at about 60 users. As with the MAE results above, this indicates that literature and history data account for a higher proportion of the whole data set and that these books are borrowed more frequently.

This paper analyzes recommendation from the two perspectives of collaborative filtering and interest degree and compares the result with traditional collaborative filtering that uses cosine similarity alone, as shown in Figure 6. The proposed method becomes stable at about 50 times, with small fluctuations, whereas the traditional method becomes stable only after about 70 times and still fluctuates more after stabilizing. This shows that the proposed method has better convergence and stability.

5. Conclusion

This paper proposes a book recommendation algorithm based on collaborative filtering and interest degree. Collaborative filtering uses cosine similarity, and the interest degree uses the basic attributes of the book as its measurement index. Experimental analysis of several years of library data from Wuxi Vocational College of Science and Technology, using the two measurement indicators MAE and RMSE, shows that the proposed method converges well. The next step is to optimize the collaborative filtering algorithm and the measurement indicators in order to achieve better convergence.

Data Availability

The data used to support the findings of this study are included in the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.