1 Introduction

The Internet of Vehicles (IoV) is an integral part of the Internet of Things [1]. Equipment on a vehicle can obtain vehicle information from an information network platform through wireless communication technology, apply recommendation techniques to it, and monitor the vehicle in real time. A recommendation algorithm can then provide personalized services for vehicle operation [2, 3]. The key idea of the KGCN model can be summarized as follows: to compute the feature representation of a given entity in the knowledge graph, it first performs a breadth-first search to find the entity's multi-hop associated entities [4], and then aggregates the information of these neighbor nodes with learned weights and fuses it into the entity's features. This approach has two main implications [5, 6]: first, the feature vector of each entity in the knowledge graph is computed by weighted aggregation of the information of neighboring entities within a certain range of that entity; second, how much each neighbor's information contributes to a node is jointly determined by the given entity and its neighbors. The method therefore not only effectively combines the semantic information between entities [7], but also characterizes the user's own interests. In extreme cases, however, the number of neighbor nodes varies greatly from entity to entity and may be very large. For this reason, drawing on the idea of graph convolutional networks, the knowledge graph convolutional network defines the concept of a receptive field [7]: a fixed-size set of neighbors. In this way, the computational cost of the knowledge graph convolutional network remains controllable, which greatly improves the scalability of the algorithm and enables effective, real-time access to vehicle information.

2 Methods

2.1 KGCN layer

The predictive equation for judging whether a user is interested in vehicle information in the IoV system is:

$$\hat{y}_{uv} = F\left( u, v \mid \theta, Y, G \right)$$
(1)

where G is the knowledge graph composed of entity-relation-entity triples, Y is the sparse user-vehicle interaction matrix, u denotes a user, v denotes a vehicle, and \(\theta\) is the set of learnable parameters of the whole model.

2.1.1 Measuring users' preferences for different vehicle information relationships

$$\pi_{r}^{u} = g\left( {r,u} \right)$$
(2)

where r denotes a relation, u denotes a user, and \(\pi_{r}^{u}\) denotes the resulting weight. The function \(g: \mathbb{R}^{d} \times \mathbb{R}^{d} \to \mathbb{R}\) is a scoring function used to compute the score between a user and a relation, where d is the dimension of the feature representations in the knowledge graph.

In practical terms, Eq. (2) computes the user's preference for different relationships. For example, if a user pays close attention to the service information of online ride-hailing vehicles, he will pay more attention to information such as the number of service trips and user ratings, and will be more willing to choose high-quality vehicles based on these judgments. The function g is therefore used to measure the user's preference for different vehicle information relationships.
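As a concrete illustration, g can be instantiated as an inner product between the user embedding and the relation embedding; this is one common choice, since the text leaves the exact form of g open. A minimal sketch with hypothetical embeddings:

```python
import numpy as np

def g(user_emb: np.ndarray, rel_emb: np.ndarray) -> float:
    """Score a (user, relation) pair: R^d x R^d -> R.

    Instantiated here as an inner product (an assumption; the
    paper only requires some scoring function g).
    """
    return float(np.dot(user_emb, rel_emb))

# Example: weight pi_r^u for a hypothetical "service-count" relation
user = np.array([0.9, 0.1, 0.3, 0.5])
rel_service_count = np.array([1.0, 0.0, 0.2, 0.4])
pi = g(user, rel_service_count)
print(pi)  # 0.9*1.0 + 0.1*0.0 + 0.3*0.2 + 0.5*0.4 = 1.16
```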

2.1.2 Using a linear combination of neighbor information to represent the neighborhood of a node

$$v_{N(v)}^{u} = \sum_{e \in N(v)} \tilde{\pi}_{r_{v,e}}^{u} \, e$$
(3)

where N(v) is the set of all entities directly connected to entity v (that is, within one hop). Note that the weight \(\tilde{\pi}\) here depends not only on the relationship between the two nodes v and e, but also on the features of the corresponding user u. The weight is obtained by scoring, with the function g, every relation r connecting v to an entity e in N(v), and then normalizing the scores as in Eq. (4):

$$\tilde{\pi}_{r_{v,e}}^{u} = \frac{\exp\left( \pi_{r_{v,e}}^{u} \right)}{\sum_{e' \in N(v)} \exp\left( \pi_{r_{v,e'}}^{u} \right)}$$
(4)

where e denotes the feature vector of a neighboring entity in N(v).

When computing the neighborhood representation of a given entity, the user-relation scores act like personalized filters: the neighbor feature vectors are weighted by these user-specific scores, so the aggregation focuses on the parts of the neighborhood each user cares about.
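The following sketch (assuming NumPy arrays and the inner-product g from above) combines Eqs. (3) and (4): the raw user-relation scores are softmax-normalized and then used as weights in a linear combination of the neighbor entity vectors.

```python
import numpy as np

def neighborhood_repr(user_emb, rel_embs, neighbor_embs):
    """Compute v_{N(v)}^u per Eqs. (3) and (4).

    rel_embs[i] embeds the relation linking v to its i-th neighbor;
    neighbor_embs[i] embeds that neighbor. Both are (n, d) arrays.
    """
    rel_embs = np.asarray(rel_embs)
    neighbor_embs = np.asarray(neighbor_embs)
    scores = rel_embs @ user_emb              # raw scores pi_{r_{v,e}}^u
    weights = np.exp(scores - scores.max())   # softmax normalization, Eq. (4)
    weights /= weights.sum()
    # Weighted linear combination of the neighbor vectors, Eq. (3)
    return (weights[:, None] * neighbor_embs).sum(axis=0)
```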

2.1.3 Controlling the number of neighbors

$$S(v) \to \left\{ e \mid e \in N(v) \right\}$$
(5)
$$\left| S(v) \right| = K$$
(6)

where S(v) is the (single-layer) receptive field of entity v, sampled from N(v), the set of all entities directly connected to v (that is, within one hop), and K is a hyperparameter giving the number of neighbors to select.

In a real knowledge graph [7], a node v may have far too many neighbors, which would put enormous pressure on the computation of the overall model. KGCN therefore does not use all of a node's neighbors; instead, a hyperparameter K is defined and, for each node v, a fixed-size set of K neighbors is sampled uniformly at random for the computation. The neighborhood representation of v is then written as \(v_{S\left( v \right)}^{u}\). In KGCN, S(v) is also called the (single-layer) receptive field of v, because the final feature of v is sensitive to these regions.
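A minimal sketch of this sampling step follows. Sampling with replacement when a node has fewer than K neighbors is an illustrative assumption; the text only states that a fixed-size set is sampled uniformly.

```python
import numpy as np

def sample_receptive_field(neighbors, K, rng=None):
    """Sample the single-layer receptive field S(v) with |S(v)| = K."""
    rng = rng or np.random.default_rng()
    # If v has fewer than K neighbors, sample with replacement
    # (an assumption; the paper does not specify this case).
    replace = len(neighbors) < K
    idx = rng.choice(len(neighbors), size=K, replace=replace)
    return [neighbors[i] for i in idx]
```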

2.1.4 Defining the aggregators

The key step in the KGCN model is fusing the entity's own feature v with the neighborhood feature \(v_{S\left( v \right)}^{u}\). Three types of aggregators are defined in KGCN; in this experiment, the sum aggregator is used for the comparison experiments.

Sum Aggregator. The sum aggregator adds the two feature vectors element-wise and then applies a nonlinear transformation.

$$\text{agg}_{\text{sum}} = \sigma\left( W \cdot \left( v + v_{S(v)}^{u} \right) + b \right)$$
(7)

In the formula, W is the transformation weight matrix, b is the bias, \(\sigma\) is a nonlinear activation function, and S(v) is the entity's (single-layer) receptive field.

Concat Aggregator. The concat aggregator concatenates the two feature vectors and then applies a nonlinear transformation.

$$\text{agg}_{\text{concat}} = \sigma\left( W \cdot \text{concat}\left( v, v_{S(v)}^{u} \right) + b \right)$$
(8)

Neighbor Aggregator. The neighbor aggregator uses only the neighborhood features of entity v, directly replacing the representation of node v with its neighborhood representation.

$$\text{agg}_{\text{neighbor}} = \sigma\left( W \cdot v_{S(v)}^{u} + b \right)$$
(9)
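A minimal sketch of the three aggregators in Eqs. (7)-(9), assuming NumPy vectors and a ReLU for the nonlinearity \(\sigma\) (the choice of \(\sigma\) is an assumption; the equations only require some nonlinear transformation):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def agg_sum(W, b, v, v_neigh):
    """Eq. (7): element-wise sum, then nonlinear transform; W is (d, d)."""
    return relu(W @ (v + v_neigh) + b)

def agg_concat(W, b, v, v_neigh):
    """Eq. (8): concatenation, then transform; here W is (d, 2d)."""
    return relu(W @ np.concatenate([v, v_neigh]) + b)

def agg_neighbor(W, b, v, v_neigh):
    """Eq. (9): use only the neighborhood representation; W is (d, d)."""
    return relu(W @ v_neigh + b)
```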

3 Algorithm

For a given entity, the final feature vector must depend significantly on its directly connected neighbors; this is called the first-order entity feature.

To further exploit the relationships between entities in the knowledge graph and explore users' other potential interests, the neighborhood set is expanded from one layer to multiple layers: the feature of a selected entity is propagated through one layer of adjacent neighbors to obtain its first-order feature vector, as shown in Fig. 1. More generally, the h-order feature of an entity aggregates the neighbor feature representations within h hops of itself. In this design, different aggregators can be used to collect the neighborhood information.

Fig. 1 Neighborhood distribution for K = 2

Let H denote the depth of the receptive field (the maximum number of aggregation layers). For a given user-vehicle entity pair, the receptive field of the item is first obtained iteratively, and the aggregation is then repeated H times. In the h-th iteration, the neighborhood features \(e \in M\left[ h \right]\) of each entity are computed and aggregated with the entity's own features \(e^{u} \left[ {h - 1} \right]\) to obtain its representation for the next iteration, as shown in Fig. 2. Finally, the H-order representation \(v^{u}\) and the user feature u are passed together into a function \(f: \mathbb{R}^{d} \times \mathbb{R}^{d} \to \mathbb{R}\) to obtain the final predicted probability:

$$\hat{y}_{uv} = f\left( u, v^{u} \right)$$
(10)

where \(v^{u}\) denotes the H-order entity feature of v. Since the algorithm would otherwise traverse all possible user-item pairs, negative sampling is used during training to make it more efficient. The loss function of the KGCN model is therefore:

$$\Gamma = -\sum_{u \in U} \left( \sum_{v : y_{uv} = 1} \ell\left( y_{uv}, \hat{y}_{uv} \right) - \sum_{i = 1}^{T^{u}} \mathbb{E}_{v_{i} \sim P(v_{i})}\, \ell\left( y_{uv_{i}}, \hat{y}_{uv_{i}} \right) \right) + \lambda \left\| \mathcal{F} \right\|_{2}^{2}$$
(11)
Fig. 2 Self-vector representation in the h-th iteration

In the formula, \(\ell\) denotes the cross-entropy loss function, P is the distribution of negative samples (here taken to be uniform), and \(T^{u}\) is the number of negative samples for user u:

$$T^{u} = \left| \left\{ v : y_{uv} = 1 \right\} \right|$$
(12)
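A sketch of this objective, assuming \(\ell\) is the sigmoid cross-entropy and rewriting Eq. (11) in the equivalent per-user form commonly used in implementations (loss on positive pairs plus loss on \(T^{u}\) uniformly drawn negatives, plus L2 regularization of all parameters \(\mathcal{F}\)):

```python
import numpy as np

def bce(y, y_hat, eps=1e-9):
    """Cross-entropy loss l(y, y_hat) for a single prediction."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)
    return -(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))

def kgcn_loss(pos_preds, neg_preds, params, lam):
    """Per-user loss of Eq. (11) in its practical form (an assumption):
    cross-entropy on positive pairs, cross-entropy on T^u sampled
    negatives, plus L2 regularization with coefficient lambda."""
    pos_term = sum(bce(1.0, p) for p in pos_preds)
    neg_term = sum(bce(0.0, p) for p in neg_preds)  # T^u negatives
    l2 = lam * sum(float(np.sum(w ** 2)) for w in params)
    return pos_term + neg_term + l2
```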

The code of the KGCN algorithm is shown in Table 1.

Table 1 KGCN algorithm code
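To make the iterative procedure concrete, here is a compact Python sketch of the H-hop forward pass; it is a minimal sketch, not the exact code of Table 1. The embedding tables, the pre-sampled adjacency arrays, and a layer-shared sum aggregator are illustrative assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def kgcn_predict(u_emb, v_id, ent_emb, rel_emb, adj_ent, adj_rel, W, b, H, K):
    """H-hop KGCN forward pass with the sum aggregator.

    adj_ent / adj_rel are (num_entities, K) arrays holding K
    pre-sampled neighbor entity ids and connecting relation ids
    (the fixed-size receptive field). W, b are shared across
    layers here for brevity.
    """
    d = ent_emb.shape[1]
    # Expand the receptive field outward: hop i holds K**i entities
    entities = [np.array([v_id])]
    for _ in range(H):
        entities.append(adj_ent[entities[-1]].reshape(-1))
    vecs = [ent_emb[e] for e in entities]

    # Aggregate inward H times; after iteration h, vecs[i] holds
    # the (h+1)-order representations (cf. Fig. 2)
    for h in range(H):
        for i in range(H - h):
            rels = rel_emb[adj_rel[entities[i]]]          # (n, K, d)
            scores = rels @ u_emb                         # pi_r^u, (n, K)
            w = np.exp(scores - scores.max(axis=1, keepdims=True))
            w /= w.sum(axis=1, keepdims=True)             # Eq. (4)
            neigh = (w[..., None] * vecs[i + 1].reshape(-1, K, d)).sum(axis=1)
            vecs[i] = relu((vecs[i] + neigh) @ W.T + b)   # sum aggregator, Eq. (7)
    v_u = vecs[0][0]                                      # H-order feature v^u
    return float(1.0 / (1.0 + np.exp(-u_emb @ v_u)))      # y_hat = f(u, v^u)
```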

4 Results and discussion

4.1 Results

This experiment uses a medical database as the training data. The specific data are shown in Table 2.

Table 2 Training data

The experimental data and various parameters are shown in Table 3.

Table 3 Experimental parameters

In this experiment, a classic latent factor model based on collaborative filtering, singular value decomposition (SVD), is compared against the knowledge graph convolutional network KGCN [8, 9]; the results are shown in Table 4.

Table 4 Comparison of data and experimental results

Line charts of the F1 score and accuracy on the training, validation, and test sets are shown in Figs. 3 and 4.

Fig. 3 F1 line chart for the training, validation, and test sets

Fig. 4 Accuracy line chart for the training, validation, and test sets

4.2 Discussion

The traditional collaborative filtering algorithm SVD lacks the auxiliary information of the knowledge graph, and its experimental results are worse than those of KGCN. In reality, the attributes of users and vehicles do not exist in isolation; they are related to each other and form a knowledge graph. KGCN can therefore represent these relationships better and outperforms SVD [9]. Because a multi-hop neighborhood structure is used, the results also show that capturing neighborhood information from the knowledge graph is helpful for recommendation.

First, consider the influence of the neighborhood sample size: the performance of the algorithm is analyzed by varying the number of sampled neighbors, as shown in Table 5. The table shows that the best performance is achieved when the sample size K is 8. This is because too small a K does not have enough capacity to capture the neighborhood information, while too large a K makes the model susceptible to noise [10].

Table 5 The influence of increasing K value on accuracy

Second, consider the influence of the number of receptive field layers. Varying the depth H of the receptive field from 1 to 4 shows a clear effect on the performance of KGCN; the results are shown in Table 6. Comparing the data, the model is more sensitive to H than to the neighborhood sample size K. When H = 4, the accuracy drops significantly, because too many receptive field layers harm the experimental results. This is consistent with common-sense judgment: the longer a relation chain in the knowledge graph, the less meaningful the entities at its far end are for judging the similarity of vehicles. In other words, H = 1 or 2 is sufficient for real-life scenarios.

Table 6 Influence of increasing H value on accuracy

In the IoV system, extracting physical vehicle information requires a large amount of training data, and obtaining high-quality data carries a high labor cost. Unsupervised learning will therefore be considered in subsequent studies to reduce the manual effort as much as possible. For the recommendation technique itself, since the algorithm relies on knowledge graph construction, the associated workload is also relatively large.

5 Conclusions

The main contribution of this paper is an in-depth study of KGCN, a recommendation technique based on the knowledge graph. The method aggregates the feature vector representations of the entities adjacent to a given entity [11, 12], computes the entity's neighborhood features by weighting, and controls the computational cost of the algorithm by fixing the size of the neighborhood set used as the receptive field. Comparison with a traditional collaborative filtering algorithm demonstrates the superiority of the algorithm and its suitability for knowledge-graph-based recommendation in IoV applications. The proposed algorithm obtains real-time, effective vehicle information and provides personalized functions for vehicle operation, thereby delivering better services.