Abstract

In order to solve the problems of low accuracy and low efficiency of answer prediction in machine reading comprehension, a multitext English reading comprehension model based on the deep belief neural network is proposed. Firstly, the paragraph selector in the multitext reading comprehension model is constructed. Secondly, the text reader is designed, and the deep belief neural network is introduced to predict the question-answering probability. Finally, the popular English dataset SQuAD is used for test analysis. A comparative analysis of different learning methods shows that the English multitext reading comprehension model has a strong reading comprehension ability. In addition, two evaluation methods are used to score the overall performance of the model: the English multitext reading comprehension model based on the deep belief neural network scores above 90 overall, and its efficiency does not degrade as the number of documents in the dataset changes. These results show that using the deep belief neural network to improve the probability generation performance of the model can solve the task of English multitext reading comprehension well, effectively reduce the difficulty of machine reading comprehension across multiple documents, and provide useful guidance for convenient human acquisition of knowledge from the Internet.

1. Introduction

With the development of intelligence, the field of natural language processing has gradually become a research hotspot. How to use computer technology to imitate human consciousness has always been an urgent problem that scientists want to solve. Reading comprehension is a key human ability, and in order to improve its practical application, more and more studies have analysed the multitext reading comprehension task. Ruiz et al. [1] recently proposed a text classification method based on the minimum description length principle, which can be applied to multilabel classification without transforming the classification problem; at the same time, it can exploit the dependence information between labels and naturally supports online learning. Höhn et al. [2] designed a multilabel reasoning algorithm with a new iterative reasoning mechanism that effectively exploits the information between labels while avoiding the problem of label order sensitivity. Wang et al. [3] proposed the S&I reader: based on the idea of granular computing, multigranular modules computing context granularity and sequence granularity are added to the training model to simulate human text understanding behavior, and experiments show that the proposed model is effective on Chinese and English datasets. Facing the interpretable multihop reading comprehension problem over multiple documents, Tu et al. [4] proposed an effective system with selection, answering, and explanation (SAE) capabilities aimed at multitask learning, which predicts answers at the representation level, predicts supporting sentences at the sentence level, and conducts attention-based interaction between the two tasks. Feng et al. [5] proposed a sentence-based circular reasoning (SCR) method to solve the problem that text cannot be accurately understood; in addition, they proposed a nesting mechanism to expand the probability distribution into weights and verified the effectiveness and interpretability of the model through experiments. Pilozzi et al. [6] proposed a new path-based constraint-type attention model, which uses the relationship type constraints in the corresponding triples to identify entity types, and showed that the model outperforms the current state of the art. Baskin et al. [7] proposed a new method for the multihop reading comprehension problem, a gated RGCN that accumulates evidence on a path-based reasoning graph; the results show that the performance of this model is 4.2% higher than that of humans.

In addition, with the development of artificial intelligence technology, research on machine learning techniques for multitext reading comprehension is also continuing. Skolik et al. [8] proposed a multipassage MRC task model with a passage ranking framework and a hierarchical neural network, which produces more accurate answers by reducing noise information and extracting hierarchical information. Park et al. [9] constructed a Korean machine reading comprehension dataset and added a self-matching layer to the encoder recurrent neural network by using a multilayer SRU; the results show that the model achieves high performance. Guo et al. [10] proposed a new frame-based neural network for machine reading comprehension, which makes use of frame semantic knowledge to answer questions conveniently and can integrate multiframe semantic information to obtain better sentence representations. Huang et al. [11] proposed a new neural-network-based hybrid embedding training method built on text features, which introduces an additional attention layer and output layer and augments the data by exploiting the independence between questions and paragraphs. Mikalef et al. [12] proposed a gated feature network model for machine reading comprehension, which can selectively use language features according to their role in the process of choosing answers. Against the background of the new era of Internet of Things development, Jimenez et al. [13] analysed the development prospects of smart homes in recent years, designed a smart home voice assistant application based on Alexa, and further applied machine reading comprehension to people's daily life.

In conclusion, there are many studies worldwide on the task of multitext reading comprehension, and many of them combine the task with artificial intelligence techniques. However, few studies apply the deep belief neural network to the analysis of multitext reading comprehension. Therefore, in order to fill this gap, this study uses the deep belief neural network to address the task of English multitext reading comprehension, which is of great significance for solving the problem of multidocument mining in practical applications and for making human life more convenient.

2. Construction of English Multitext Reading Comprehension Model

2.1. Model Framework of English Multitext Reading Comprehension

Multitext reading is different from single-text reading. The task of multitext reading comprehension needs to screen out the important content and answers from multiple documents, while the answer is contained in only part of one document. When the content and answers of multiple documents must be extracted, the presence of multiple sources of interference information increases the difficulty of machine reading comprehension and answer screening to a certain extent [14–16]. In the face of massive documents, multitext machine reading comprehension needs to first retrieve and filter the multiple documents and then conduct single-text retrieval on the filtered documents to capture the content and answers in the documents. The specific framework is shown in Figure 1.

As shown in Figure 1, the multitext reading comprehension system first takes a series of documents related to the question as the system input and passes them to the paragraph selector. The paragraph selector searches the multiple documents one by one according to the given question to be solved, selects the most suitable paragraph from each document, and then splices the selected paragraphs into a new set, that is, synthesizes a new document. Finally, the text reader is used to make predictions on the new document, from which the final answer range is obtained. The paragraph selector can mine the semantic relevance of each paragraph in the document. In this study, a paragraph selector based on the deep belief network is proposed, which extracts the text answers by calculating the probability of each selected paragraph in the document. The essence of the text reader is a single-text reading comprehension model, which reads and understands the new document, predicts the beginning and ending positions of the answer in the single document, and takes the text between these positions as the final predicted answer range.
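To make the division of labour concrete, the following minimal Python sketch shows how the two stages compose. The function names, the word-overlap relevance score, and the reader stub are hypothetical stand-ins for the attention and deep-belief-network components described in the rest of this section, not the actual implementation.

```python
# Minimal sketch of the two-stage pipeline in Figure 1 (illustrative only).
from typing import List

def select_paragraphs(question: str, documents: List[List[str]]) -> List[str]:
    """Pick the most question-relevant paragraph from each document."""
    selected = []
    q_words = set(question.lower().split())
    for paragraphs in documents:
        # Placeholder relevance score: word overlap with the question; the
        # paper's selector replaces this with attention + DBN probabilities.
        best = max(paragraphs, key=lambda p: len(q_words & set(p.lower().split())))
        selected.append(best)
    return selected

def read_answer(question: str, merged_document: str) -> str:
    """Stub for the single-document text reader (Section 2.3)."""
    return merged_document  # a real reader returns the predicted answer span

def answer(question: str, documents: List[List[str]]) -> str:
    merged = " ".join(select_paragraphs(question, documents))  # splice paragraphs
    return read_answer(question, merged)
```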

2.2. Paragraph Selector Based on Deep Belief Network

According to the framework of the multitext reading comprehension system, the task of the paragraph selector is to select the relevant paragraphs from multiple documents in order to narrow the reading range of the text reader, so that the text reader can focus only on the paragraphs related to the answers, reducing the difficulty of machine reading. Therefore, the accuracy of the paragraph selector directly determines the final performance of the multitext reading comprehension model. In this study, the task of the paragraph selector is abstracted as text classification, which encodes the paragraph information of the multiple input documents and also encodes the question. In the process of document coding, the words in the document are first represented as vectors, and at the same time, the main semantic information must be mined from the questions and document paragraphs to choose the right answers [17–19]. The paragraph selector divides the paragraphs into two categories: those that contain the correct answer and those that do not.

Based on the analysis of the paragraph selector's function, this study uses the attention mechanism to mine and match document information during paragraph selection. In order to improve the prediction performance of the selector, the deep belief neural network is used to extract feature information from the multidocument paragraphs, and both binary classification and multiclass classification are used to classify the paragraphs. Firstly, the attention mechanism model obtains the context characteristics of the question information through the bidirectional attention mechanism and outputs the prediction range of the answer. The model structure includes four levels, namely, the embedding layer, the bidirectional attention mechanism layer, the modeling layer, and the output layer. The embedding layer includes character embedding, word embedding, and document paragraph context embedding. The attention mechanism model maps the characters in the document to high-dimensional vectors through the character embedding layer, and, in the same way, each word is mapped to a higher-dimensional vector using the word embedding layer. In addition, the context embedding layer links the paragraphs in the document and connects the outputs so that words, sentences, and the paragraph information built up from characters are fully integrated. The main work of the bidirectional attention mechanism is to link and fuse the semantic information between the input question and the document and to take the context vectors as the output [20, 21]. In the bidirectional attention mechanism layer, the two attention directions rely on and share the same similarity matrix, as shown in the following formula:

$$S_{tj} = \alpha\left(H_{:t}, U_{:j}\right) = w_{(S)}^{\top}\left[H_{:t}; U_{:j}; H_{:t} \circ U_{:j}\right]. \qquad (1)$$

In formula (1), $S_{tj}$ represents the similarity between the $t$-th word in the document and the $j$-th word in the question, $H_{:t}$ is the $t$-th column of the document vector matrix $H$, $U_{:j}$ is the $j$-th column of the question word vector matrix $U$, $\alpha$ is the trainable function used to calculate the similarity between the two matrices, $w_{(S)}$ is its weight vector, $[\cdot;\cdot]$ denotes concatenation, and $\circ$ denotes elementwise multiplication. Then, formula (2) is used to fuse the calculation results of the bidirectional attention mechanism:

$$G_{:t} = \beta\left(H_{:t}, \tilde{U}_{:t}, \tilde{H}_{:t}\right), \quad G \in \mathbb{R}^{8d \times T}. \qquad (2)$$

In equation (2), $G$ represents the document fusion matrix, $8d$ represents the dimension of each fused column vector, and $T$ represents the number of column vectors (words) in the matrix. Finally, the following formula defines the fusion function $\beta$ as an abstract summary:

$$\beta\left(h, \tilde{u}, \tilde{h}\right) = \left[h; \tilde{u}; h \circ \tilde{u}; h \circ \tilde{h}\right]. \qquad (3)$$

In equation (3), the columns of $G$ form the document word vector sequence produced by the final encoding, $h$ denotes a column of the document input encoding $H$, $\tilde{u}$ denotes the corresponding attended question word vector, and $\tilde{h}$ denotes the attended document vector.
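A minimal NumPy sketch of equations (1)–(3) may make the shapes concrete. It follows the standard BiDAF-style formulation that these equations mirror; all matrices here are random placeholders rather than trained encodings.

```python
# Illustrative sketch of the shared similarity matrix (eq. 1), the two
# attention directions, and the 8d fused representation G (eqs. 2-3).
import numpy as np

d, T, J = 100, 40, 12                # hidden size, #document words, #question words
H = np.random.randn(2 * d, T)        # document (context) matrix, one column per word
U = np.random.randn(2 * d, J)        # question matrix
w = np.random.randn(6 * d)           # weight vector of the similarity function alpha

# Equation (1): S[t, j] = w . [H[:,t]; U[:,j]; H[:,t] * U[:,j]]
S = np.empty((T, J))
for t in range(T):
    for j in range(J):
        S[t, j] = w @ np.concatenate([H[:, t], U[:, j], H[:, t] * U[:, j]])

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Document-to-question attention: attend over question words per document word.
A = softmax(S, axis=1)               # (T, J)
U_tilde = U @ A.T                    # (2d, T) attended question vectors

# Question-to-document attention: weight document words by best similarity.
b = softmax(S.max(axis=1))           # (T,)
h_tilde = H @ b                      # (2d,)
H_tilde = np.tile(h_tilde[:, None], (1, T))

# Equations (2)-(3): fuse into G with 8d rows and T columns.
G = np.concatenate([H, U_tilde, H * U_tilde, H * H_tilde], axis=0)
assert G.shape == (8 * d, T)
```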

In the modeling layer, the deep belief network is used to extract features from the fused document information. The deep belief network is a machine learning model composed of stacked restricted Boltzmann machines. As a probability generation model, it is applied in the multitext reading comprehension model to predict the probability of answers in the fused single document. The basic structure of the deep belief neural network model used in this study is shown in Figure 2.

As shown in Figure 2, the deep belief network has two parts with a total of six layers. Firstly, in the restricted Boltzmann machine (RBM) part, the fused document is taken as the input, and unbiased samples are extracted through the visible and hidden layers of the first and second layers and then passed through the two hidden units. After the RBM weight calculation, the information is input to the fifth-layer hidden unit, while the third hidden unit continuously optimizes the data through fine-tuning. Finally, the extracted feature information is input to the top unit of the sixth layer, where the classifier structure classifies the information and generates the label data.
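The following sketch illustrates the core mechanism: one restricted Boltzmann machine layer trained with a single contrastive divergence (CD-1) step, and greedy layer-wise stacking. The layer sizes, learning rate, and input batch are illustrative assumptions, not the paper's configuration, and the classifier on top of the sixth layer is omitted.

```python
# Minimal RBM with CD-1 training; a deep belief network stacks such layers.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden, lr=0.01):
        self.W = rng.normal(0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible bias
        self.b_h = np.zeros(n_hidden)    # hidden bias
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def cd1_update(self, v0):
        """One contrastive-divergence step on a batch of visible vectors."""
        h0 = self.hidden_probs(v0)
        h_sample = (rng.random(h0.shape) < h0).astype(float)  # sample hidden units
        v1 = sigmoid(h_sample @ self.W.T + self.b_v)          # reconstruction
        h1 = self.hidden_probs(v1)
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
        self.b_v += self.lr * (v0 - v1).mean(axis=0)
        self.b_h += self.lr * (h0 - h1).mean(axis=0)

# Greedy layer-wise pretraining: each RBM learns from the layer below it.
layers = [RBM(800, 400), RBM(400, 200), RBM(200, 100)]
X = rng.random((64, 800))            # stand-in batch of fused document features
for rbm in layers:
    for _ in range(10):
        rbm.cd1_update(X)
    X = rbm.hidden_probs(X)          # feed upward to the next RBM
```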

After the deep belief neural network is constructed, that is, after the classifier has processed the input samples, the range of the final answer needs to be predicted. Considering the influence of and correlation between different candidate paragraphs, this paper proposes a multicategory paragraph selector to solve this problem, which takes the difference information between paragraphs into account during paragraph selection. The construction framework is shown in Figure 3.

As shown in Figure 3, the question and all the paragraphs in the document are input to the Bi-LSTM layer. The Bi-LSTM encodes the two-way information in the paragraph sentences and splices the two directions to construct a new representation: word-mapping vectors that contain both historical and future information features. Then, the paragraph information of the document processed by the Bi-LSTM and part of the word information in the question are input to the attention mechanism layer, while the other part of the word information in the question is sent directly to the deep belief neural network, which is used to obtain the feature vector of the question. The similarity between the feature vectors of the document paragraphs and the question feature vector is then calculated, as shown in the following equation:

$$\cos\theta_i = \frac{p_i \cdot q}{\left\|p_i\right\|\left\|q\right\|}. \qquad (4)$$

In equation (4), $p_i$ represents the feature vector of the $i$-th paragraph in the input document, $q$ represents the question feature vector, and $\theta_i$ is the angle between the two vectors. As shown in Figure 3, after solving the similarity between the paragraph and question mapping vectors, the softmax function is executed to calculate the probability that an answer exists in each output paragraph. The process is shown in the following equation:

$$\hat{y}_i = \frac{e^{s_i}}{\sum_{k=1}^{n} e^{s_k}}. \qquad (5)$$

In equation (5), $\hat{y}$ is the combined vector of prediction probabilities, $s_i$ represents the similarity score of the $i$-th paragraph, and $n$ represents the number of paragraphs in the document.
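The two steps can be illustrated in a few lines of NumPy; the vectors below are random placeholders standing in for the feature vectors produced by the Bi-LSTM and the deep belief network.

```python
# Sketch of equations (4)-(5): cosine similarity per paragraph, then softmax.
import numpy as np

rng = np.random.default_rng(1)
q = rng.standard_normal(128)              # question feature vector (placeholder)
P = rng.standard_normal((5, 128))         # one feature vector per candidate paragraph

# Equation (4): cosine of the angle between each paragraph vector and q.
sims = (P @ q) / (np.linalg.norm(P, axis=1) * np.linalg.norm(q))

# Equation (5): softmax over the n paragraph similarities.
probs = np.exp(sims - sims.max())
probs /= probs.sum()
print("answer is most likely in paragraph", int(probs.argmax()))
```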

2.3. Text Reader Task

The text reader examines a given document and predicts the range of the answer. In this study, we use the Match-LSTM attention mechanism to build the text reader's answer prediction model, which represents the given document and question as matrices [22–24]. The LSTM of the Match-LSTM reading comprehension model encodes the question and the document paragraph separately, and the context information is then fused. See the following equation for the process:

$$H^{p} = \mathrm{LSTM}\left(P\right) \in \mathbb{R}^{l \times P}, \quad H^{q} = \mathrm{LSTM}\left(Q\right) \in \mathbb{R}^{l \times Q}. \qquad (6)$$

In equation (6), $H^{p}$ and $H^{q}$ are the hidden-layer representations of the document and the question, respectively, $P$ is the length of the document, and $Q$ is the length of the question. In addition, there is also an attention layer in the model to weight the words and phrases in the documents and questions. The calculation process is shown in the following equation:

$$\vec{G}_i = \tanh\left(W^{q}H^{q} + \left(W^{p}h_i^{p} + W^{r}\vec{h}_{i-1}^{r} + b^{p}\right) \otimes e_Q\right), \quad \vec{\alpha}_i = \mathrm{softmax}\left(w^{\top}\vec{G}_i + b \otimes e_Q\right). \qquad (7)$$

In equation (7), $W^{q}$, $W^{p}$, $W^{r}$, $b^{p}$, $w$, and $b$ are parameters learned during model training, and $h_i^{p}$ and $\vec{h}_{i-1}^{r}$ are hidden vectors in the model. In addition, the operator $\otimes e_Q$ repeats a vector $Q$ times to arrange it, from left to right, into a matrix or row vector.

To deal with unnecessary information in the document, a sigmoid function is added to filter the input information so that unnecessary, redundant, or wrong information is gradually filtered out. The calculation of the function is shown in the following equation:

$$\tilde{z}_i = \sigma\left(W^{g}z_i\right) \odot z_i. \qquad (8)$$

In equation (8), $z_i$ represents the vector spliced from the feature vectors of the document paragraph and the question, $W^{g}$ represents the filter parameter, and $\tilde{z}_i$ represents the filtered matrix or vector. Finally, the result is further input into the Match-LSTM layer for the next operation, as shown in the following equation:

$$\vec{h}_i^{r} = \overrightarrow{\mathrm{LSTM}}\left(\tilde{z}_i, \vec{h}_{i-1}^{r}\right). \qquad (9)$$

In equation (9), $\vec{h}_i^{r}$ represents the vector obtained by processing the spliced vector with the Match-LSTM layer. Based on the above operations, the structure of the text reader constructed in this study is shown in Figure 4.
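A compact NumPy sketch of one step of equations (6)–(9) is given below, under the Match-LSTM formulation above. All weights are random placeholders, and lstm_step is a simplified stand-in for a full LSTM cell, which would also maintain gates and a cell state.

```python
# One gated Match-LSTM step: attention (eq. 7), splice, sigmoid filter
# (eq. 8), and recurrent update (eq. 9). Illustrative dimensions only.
import numpy as np

rng = np.random.default_rng(2)
l, Q = 64, 12                                  # hidden size, question length

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Equation (6): these would come from LSTM encoders of question/document.
H_q = rng.standard_normal((l, Q))              # question hidden states
h_p = rng.standard_normal(l)                   # current document hidden state
h_r = np.zeros(l)                              # previous Match-LSTM state

# Equation (7): attention weights over the Q question positions.
W_q, W_p, W_r = (rng.standard_normal((l, l)) * 0.1 for _ in range(3))
b_p, w, b = rng.standard_normal(l) * 0.1, rng.standard_normal(l) * 0.1, 0.0
G = np.tanh(W_q @ H_q + (W_p @ h_p + W_r @ h_r + b_p)[:, None])  # broadcast over Q
alpha = np.exp(w @ G + b)
alpha /= alpha.sum()

# Splice the document state with the attended question summary.
z = np.concatenate([h_p, H_q @ alpha])         # (2l,)

# Equation (8): sigmoid gate filters redundant or wrong information.
W_g = rng.standard_normal((2 * l, 2 * l)) * 0.1
z_filtered = sigmoid(W_g @ z) * z

# Equation (9): feed the filtered vector to a (simplified) recurrent cell.
def lstm_step(x, h_prev):                      # stand-in for a real LSTM cell
    W_x = rng.standard_normal((l, x.size)) * 0.1
    W_h = rng.standard_normal((l, l)) * 0.1
    return np.tanh(W_x @ x + W_h @ h_prev)

h_r = lstm_step(z_filtered, h_r)
```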

As shown in Figure 4, the text reader includes an embedding layer, a coding layer, an attention interaction layer, and a prediction layer. The main function of the embedding layer is to encode words that have a certain similarity between the received document and the question. The coding layer is responsible for fusing the context information of the preprocessed document and question and then encoding the information to obtain a new representation. The calculation method is shown in the following equation:

$$u^{q} = \mathrm{BiLSTM}\left(Q\right) \in \mathbb{R}^{l \times m}, \quad u^{p} = \mathrm{BiLSTM}\left(P\right) \in \mathbb{R}^{l \times n}. \qquad (10)$$

In equation (10), $m$ and $n$ indicate the lengths of the question and document context information encodings, respectively. In addition, the attention interaction layer is responsible for mining the semantic information of the included documents and questions and for representing the information in the documents that is related to the question. Finally, the answer prediction layer predicts the range of the answers in the fused document and marks the beginning and ending positions of the answers; the text between these positions is selected as the predicted answer. In this study, BLEU-4 and ROUGE-L [25, 26] are used to evaluate the final prediction performance of the model. BLEU-4 is evaluated by analysing the frequency of the same words between multiple sentences. First, the coincidence accuracy of the segments composed of multiple words (n-grams) is calculated, as shown in the following equation:

$$p_n = \frac{\sum_{\text{n-gram} \in \hat{y}} \mathrm{Count}_{\mathrm{clip}}(\text{n-gram})}{\sum_{\text{n-gram} \in \hat{y}} \mathrm{Count}(\text{n-gram})}. \qquad (11)$$

In equation (11), $\hat{y}$ is the predicted answer, $\mathrm{Count}(\cdot)$ is the number of occurrences of an n-gram in the predicted answer, and $\mathrm{Count}_{\mathrm{clip}}(\cdot)$ clips that count to the number of times the n-gram appears in the reference answer. The length penalty formula is used to avoid the influence of statement length on the coincidence accuracy. The length penalty formula is shown in the following equation:

$$BP = \begin{cases} 1, & c > r, \\ e^{1 - r/c}, & c \le r. \end{cases} \qquad (12)$$

In equation (12), $c$ indicates the length of the candidate answer and $r$ is the length of the reference answer. Therefore, the calculation formula of BLEU-4 can be obtained, as shown in the following equation:

$$\mathrm{BLEU\text{-}4} = BP \cdot \exp\left(\sum_{n=1}^{x} w_n \log p_n\right), \quad x = 4. \qquad (13)$$

In equation (13), $x = 4$ is the maximum n-gram order, $w_n = 1/4$ are the weights, and the result is the BLEU-4 score. ROUGE-L is a probability analysis method based on the recall rate of the model [27–29]. The final probability is obtained by calculating the longest common subsequence of the reference answer and the predicted answer. The calculation formula is shown in the following equation:

$$\mathrm{ROUGE\text{-}L} = \frac{\mathrm{LCS}(X, Y)}{m}, \qquad (14)$$

where $X$ is the candidate answer sequence, $Y$ is the reference answer sequence, $\mathrm{LCS}(X, Y)$ is the length of the longest common subsequence of the candidate and reference answer sequences, and $m$ is the length of the reference answer.
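For reference, both metrics can be computed with the following plain Python sketch of equations (11)–(14); it uses uniform weights $w_n = 1/4$ and is not the official evaluation script.

```python
# BLEU-4 with brevity penalty (eqs. 11-13) and recall-oriented ROUGE-L (eq. 14).
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu4(candidate, reference):
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / max(c, 1))   # equation (12)
    log_p = 0.0
    for n in range(1, 5):                                # equations (11), (13)
        cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
        clipped = sum(min(k, ref[g]) for g, k in cand.items())
        total = max(sum(cand.values()), 1)
        log_p += 0.25 * math.log(max(clipped, 1e-9) / total)
    return bp * math.exp(log_p)

def rouge_l_recall(candidate, reference):
    # Longest common subsequence via dynamic programming, equation (14).
    m, n = len(reference), len(candidate)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if reference[i] == candidate[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n] / m

pred = "the deep belief network predicts the answer span".split()
gold = "the deep belief network predicts the correct answer span".split()
print(bleu4(pred, gold), rouge_l_recall(pred, gold))
```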

3. Experimental Analysis of English Multitext Reading Comprehension Based on Deep Belief Network

3.1. Training and Experimental Analysis of Paragraph Selector

This study builds a deep learning framework in the Python language and uses the SQuAD dataset as the English reading comprehension dataset. The SQuAD dataset, released by Stanford University in 2016, is based on questions posed by crowdworkers about Wikipedia articles; the answer to each question corresponds to a text span in the reading paragraph, covering a total of more than 500 articles and more than 100,000 question-answer pairs. Different from other datasets, the questions and answers of this dataset are manually annotated. In the experiment, the paragraph selector based on the deep belief neural network is trained and evaluated, and the experimental parameters are set as shown in Table 1.

As shown in Table 1, the parameters of the two schemes are set as follows: the proportional threshold of positive to negative samples is set to 0.1 and 0.05, respectively; the batch size is set to 8; and the word vector dimensions for the words in the question and in the paragraph are set to 300 and 200, respectively. Training and testing of the paragraph selectors are carried out, and the performance of the paragraph selectors is preliminarily judged by analysing their loss values and training times. The specific test results are shown in Figure 5.

As shown in Figure 5, the multiclassification paragraph selector based on the deep belief neural network achieves the minimum loss value on both the training set and the test set. Its loss values on the two datasets are therefore small, and, from the point of view of training time, the model's training time score is higher, indicating that less time is spent; thus, the overall performance of the multicategory paragraph selector can be preliminarily considered better. In addition, the parameter settings of scheme 1 achieve better results, so the parameters of scheme 1 are used for the model in the test. After the parameters are set, the computing capacity of the model is analysed, as shown in Figure 6.

In order to highlight the characteristics of the multicategory paragraph selector, this paper compares it with the methods proposed in previous studies and analyses their relative advantages and disadvantages, as shown in Figure 7.

As shown in Figure 7, the baseline method, TF-IDF, and the multiclassification selector are involved in the experimental comparison. It can be seen clearly that, on both the training set and the test set, the three algorithms retrieve and analyse the same questions and answers. The results show that the multiclassification selector mines a relatively high percentage of the information in questions and paragraphs, and it is not difficult to see that the multiclassification selector achieves more than 90% semantic mining. This shows that the proposed paragraph selection tool has good document information feature extraction performance.

3.2. Text Reader

Different from the task of the paragraph selector, the task of the text reader is to accurately predict the range and position of the answers within the selected paragraphs. Similarly, before the text reader is trained and analysed, its experimental parameters are set. The setting results are shown in Table 2.

As shown in Table 2, most of the parameter settings of the text reader are the same as those of the paragraph selector, but there is a big difference in the number of hidden units. The number of hidden units of the text reader is set to 150, and the dropout retention rate of the text reader is set to 1. After training the text reader, the results are shown in Figure 8. BLEU-4 and ROUGE-L are used to evaluate the final results.

As shown in Figure 8, the processing performance of the text reader varies with different paragraph selectors. However, as mentioned above, the premise of a good multitext reader is a paragraph selector that can accurately mine information. Therefore, according to the final training results, the final machine learning English multitext reading comprehension model composed of the paragraph selector and the text reader has great advantages. Finally, the BLEU-4 and ROUGE-L evaluation methods are used to analyse the machine learning performance of the text reader on real datasets, and the final results are shown in Figure 9.

It can be seen from Figure 9 that the English multitext reading comprehension model based on the deep belief network achieves high BLEU-4 and ROUGE-L scores when analysing different datasets. Moreover, as the number of text paragraphs in the test set increases, the analysis performance of the model still maintains a good score. The above results show that the model based on the deep belief network has a strong question-answering ability, which has good guiding significance for the development of reading comprehension.

4. Conclusion

With the advent of the intelligent era, machine learning has gradually become a research topic in many fields around the world. With the development of the Internet, surfing the web has become commonplace, and how to achieve intelligent online English question answering has attracted countless scholars. Therefore, this study addresses the task of English multitext reading comprehension based on the deep belief network, in order to improve the accuracy and efficiency of online question answering. Through the performance analysis of the selector and reader in the English multitext reading comprehension model based on the deep belief network, the results show that the paragraph selector proposed in this study has a good ability to extract the similarity features of the information in documents and questions, that the model achieves high BLEU-4 and ROUGE-L scores for its final classification ability, and that the performance of the model does not decline sharply as the number of documents in the dataset increases. To sum up, combining the deep belief network to solve the task of English multitext reading comprehension works well, which is of great significance for the improvement of machine learning technology and the development of computer artificial intelligence.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The author declares that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The work was performed as part of the author's employment at Hunan Institute of Technology.