Abstract

The evaluation of music teaching is a highly subjective task, often depending upon experts to assess both the technical and artistic characteristics of a performance from the audio signal. This article explores the construction of computational models for evaluating music teaching using machine learning algorithms. As one of the most widely used methods for building classifiers, the Naïve Bayes algorithm has become popular for music teaching evaluation because of its ability to incorporate prior knowledge, its learned features, and its high classification performance. In this article, we propose a music teaching evaluation model based on the weighted Naïve Bayes algorithm. Moreover, a weighted Bayesian classification incremental learning approach is employed to improve the efficiency of the music teaching evaluation system. Experimental results show that the proposed algorithm outperforms other algorithms in the context of music teaching evaluation.

1. Introduction

Recently, music has become an important part of education. With the reform and development of music education and the introduction of advanced foreign teaching methods, music education, as an important component of quality education, has received increasing attention [1]. Music teaching has shifted from focusing on skills to cultivating students' emotions, attitudes, and values. Correspondingly, music teaching evaluation has shifted from summative to formative evaluation, from partial to comprehensive evaluation, and from focusing on results to focusing on the process [2].

The task of developing and implementing an effective evaluation process for music education is more abstract than producing evaluation methods in other areas of teaching. Traditional approaches to music evaluation have to date failed to provide evaluators with the comprehensive information required to make educational decisions about music teacher performance. For educational leaders to make valid decisions about music education in schools, more precise and accurate models of the effectiveness and nature of music teacher evaluation are required. Music performance evaluation (MPE) is the process of identifying, assessing, and modeling the impact of music on the human listener [3]. The majority of early research in MPE investigated symbolic data collected from musical instrument digital interface (MIDI) devices. More recently, attention has moved to the analysis of audio signals.

The concept of applying technology to the evaluation of music teaching is an old one. An early attempt to point out the importance of systematic assessment of musical acts for enhancing learning was carried out by Seashore in the 1930s [4]. One of the first works to explore the benefits of computer-aided evaluation techniques for music was by Allvin [5], who described the application of pitch detection to analyze errors in musical presentation and provide feedback to learners. Today, however, the music signal differs from the original sound wave: the signal is sampled many times per second and converted by an analog-to-digital converter into a series of digital values. These values characterize the digital audio signal and are used to regenerate the music. The signal contains a wealth of acoustic information and features. Usually, these audio features are extracted from small segments of the audio signal and then combined into a more abstract feature vector [6].

Music teaching evaluation systems usually depend upon extracting prominent, standard audio features from the voice signal and then applying machine learning algorithms to rate the quality of the performance. Kosina [7], for example, extracted audio features from 3-second segments of musical sounds and classified them with a neural network. Knight et al. [8] extracted a group of audio features and employed a support vector machine (SVM) to classify the tonal characteristics of trumpet performances as "good" or "bad." An evaluation system based on pitch-interval identification for the assessment of singing voice was presented in [9]. Abeßer et al. used pitch and intonation features of vocal and instrumental performances to determine rhythmic accuracy [10]. Luo et al. extracted spectral, timbral, and pitch-based features to predict the mistakes of violin players [11].

Features based on associated pitch contours were extracted by Bozkurt et al. to analyze vocal conservatory exam recordings and categorize them as "pass" or "fail" [12]. Han and Lee [13] used mel-frequency cepstrum coefficients (MFCCs) as features to detect common musical mistakes. Wu and Lerch [14] utilized sparse coding to extract features for regression models that evaluate music performances; these models showed improved performance compared to other models.

Despite the different feature sets and machine learning algorithms used in the above works, the idea that links them is that they are all based on features and classification algorithms tuned for a specific task [15–18]. Motivated by the success of these music evaluation systems, this study explores the potential of machine learning, in the form of the Naïve Bayes classifier, for the evaluation of music performance. In this study, we employ an improved Naïve Bayes classifier (NBC) for music teaching evaluation. The Naïve Bayes method has become highly popular for building classifiers because of its ability to incorporate prior knowledge, its distinctive knowledge representation, and its accuracy. Although the NBC is capable of excellent classification performance [19, 20], questions remain about its performance in multilabel classification.

The remaining sections of the paper are organized as follows. Section 2 provides an overview of commonly used machine learning techniques. Section 3 presents the proposed music teaching evaluation model. The results are given in Section 4, and the work is summarized in Section 5.

2. Overview of Classification Algorithms

Classification is an important problem in supervised learning. Its purpose is to summarize the distinctive audio features and find a model or accurate description for each class. The trained classification model is used to classify the feature sets in the dataset, so that the class of new, unlabeled data can be recognized through the trained model. The classification problem is divided into two phases: learning and classification. In the learning phase, an appropriate learning method is used to train a classifier on the known training dataset. In the classification phase, new input instances are classified by the learned classifier.

In machine learning, the common classification algorithms are Naïve Bayes (NB), SVM, k-nearest neighbors (KNN), decision trees (DT), and artificial neural networks (ANN). The NB algorithm is a classification method built on Bayes' theorem; it introduces the assumption of conditional feature independence, and the resulting classification model is easy to understand. SVM is a linear classification model defined by the maximum-margin hyperplane in the feature space. KNN assumes that, given a training dataset, the class of each instance has already been determined; when classifying a new test case, KNN first finds the class labels of the k nearest training instances and then predicts the class by majority voting. The DT model is a tree structure that represents the process of predicting instances based on features; DT learning usually comprises three steps: feature selection, tree generation, and pruning. ANN can effectively solve nonlinear comprehensive evaluation problems and reduce the influence of human factors on decision results; it can model arbitrarily complex nonlinear relationships without knowledge of the process that generated the data.

Different machine learning algorithms exhibit different properties, and classification performance depends largely on the characteristics of the data and the application background. Table 1 provides a comparative analysis of several common classification algorithms.

Based on a given music teaching evaluation problem, different machine learning algorithms can be selected for the evaluation task. Taking a sequence of evaluation attributes as input data and an evaluation grade as the class label, a classification algorithm outputs the most likely class label for a new set of evaluation attributes, namely, the evaluation result. To ensure the reliability of the evaluation results, it is indispensable to use an appropriate evaluation index when constructing the classifier. Accuracy is an important index for evaluating classifier performance; it is the ratio of correctly classified data samples to the total number of samples in a given dataset. Accuracy is computed as follows:

$$\mathrm{Acc} = \frac{N_c}{N}, \tag{1}$$

where Acc is the accuracy rate, $N_c$ is the number of correctly classified samples, and $N$ is the total number of samples.

K-fold cross-validation is a key statistical assessment technique, mostly used to assess the performance of machine learning algorithms. In this study, 10-fold cross-validation was applied in addition to accuracy to gauge the performance of the proposed technique.
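To make the protocol concrete, the following Python sketch computes the accuracy of equation (1) inside a 10-fold cross-validation loop with scikit-learn. The random features and labels are placeholders, not the actual teaching evaluation dataset.

```python
# A minimal sketch of equation (1) inside 10-fold cross-validation.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.naive_bayes import GaussianNB

def accuracy(y_true, y_pred):
    # Acc = N_c / N, equation (1)
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

rng = np.random.default_rng(0)
X = rng.random((290, 10))            # 290 samples, 10 evaluation attributes
y = rng.integers(0, 5, 290)          # five evaluation grades (placeholder labels)

scores = []
for tr, te in KFold(n_splits=10, shuffle=True, random_state=0).split(X):
    clf = GaussianNB().fit(X[tr], y[tr])
    scores.append(accuracy(y[te], clf.predict(X[te])))
print("10-fold mean accuracy:", np.mean(scores))
```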

3. Music Teaching Evaluation Recognition Model

3.1. Principle of Naïve Bayes Algorithm

In the classification process of NB, Bayes' theorem is applied. During classification, the prior probability of each category is obtained by learning from a large training set, the posterior probability of an item x is then computed, and finally x is assigned to the category with the highest posterior probability. Suppose D is the training dataset, T = {T1, T2, …, Tn} is the set of attribute variables, n is the number of attributes, C = {C1, C2, …, Cm} is the set of class variables, and m is the number of classes. A training sample can then be represented as {x1, x2, …, xn, Cj}, meaning that its class label is given, and a test sample x can be represented as {x1, x2, …, xn}. The probability that the test sample belongs to a certain class $C_i$ is computed using the following equation:

$$P(C_i \mid x) = \frac{P(x \mid C_i)\, P(C_i)}{P(x)}. \tag{2}$$

NB is an effective classification algorithm. The classification model has the advantages of simple interpretation, high computational efficiency, and good stability, and its performance is superior to DT, SVM, and other classifiers in some cases. The Naïve Bayesian model adopts the simplest network structure, as shown in Figure 1.

The root node C is the class variable, and the leaf nodes A1, A2, …, An are the attribute variables. NB classification rests on ordinary Bayesian theory under the condition that the attributes are independent. Since $P(x)$ is a constant for a given sample, the NB model can be represented using the following equation:

$$C(x) = \arg\max_{C_i \in C} P(C_i)\, P(x \mid C_i), \tag{3}$$

where $P(C_i)$ is the prior probability, which can be learned from the training data as follows:

$$P(C_i) = \frac{s_i}{s}, \tag{4}$$

where $s_i$ is the number of training samples of class $C_i$ and $s$ is the total number of training samples.

The NB algorithm assumes that all attribute variables are conditionally independent given the class attribute, i.e., that there is no relationship between attributes once the class is known. The cost of computing $P(x \mid C_i)$ directly is very large; the independence condition reduces this cost, although it may also sacrifice some accuracy. Based on the conditional independence hypothesis, the equation can be simplified as follows:

$$P(x \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i), \tag{5}$$

where $P(x_k \mid C_i)$ can be obtained from the training set. Combining the above three equations, the class of a test sample can be determined.
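As a minimal illustration of equations (2)–(5), the following Python sketch estimates the priors and conditional probabilities by counting and scores a test sample; the toy records and attribute values are invented for demonstration.

```python
# Toy Naive Bayes: priors and conditional probabilities by counting,
# class score = P(C_i) * prod_k P(x_k | C_i).
from collections import Counter, defaultdict

train = [
    (("high", "good"), "pass"),
    (("high", "poor"), "pass"),
    (("low",  "good"), "pass"),
    (("low",  "poor"), "fail"),
    (("low",  "poor"), "fail"),
]

s = len(train)
class_count = Counter(c for _, c in train)      # s_i per class
cond_count = defaultdict(Counter)               # counts of (attribute k, value v) per class
for xs, c in train:
    for k, v in enumerate(xs):
        cond_count[c][(k, v)] += 1

def posterior_scores(x):
    scores = {}
    for c, s_i in class_count.items():
        p = s_i / s                             # prior P(C_i), equation (4)
        for k, v in enumerate(x):
            p *= cond_count[c][(k, v)] / s_i    # P(x_k | C_i), equation (5)
        scores[c] = p                           # proportional to P(C_i | x), equation (2)
    return scores

print(posterior_scores(("low", "good")))        # argmax over classes gives the label
```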

3.2. Determination of Weight Attributes Based on Weighted Naïve Bayes

To reduce the computational cost, the NB algorithm assumes that the condition attributes are independent of each other and that their importance in decision-making is the same for all attributes, that is, every weight equals 1. In practical applications, however, the significance of each conditional attribute varies, so defaulting all weights to 1 reduces classification accuracy. In this study, the weighted Naïve Bayes (WNB) algorithm is used to assign an appropriate weight to each attribute based on its contribution to classification. It retains the computation speed of NB while reducing the influence of the attribute conditional independence assumption on classifier performance. The weighted NB is computed as follows:

$$C(x) = \arg\max_{C_i \in C} P(C_i) \prod_{k=1}^{n} P(x_k \mid C_i)^{w_k}, \tag{6}$$

where $w_k$ denotes the weight of attribute $A_k$. The weight governs the significance of the various classification attributes: the higher the value of $w_k$, the greater the influence of the corresponding attribute $A_k$. In many applications, estimating a precise weight for each attribute is the key problem of the WNB model.

Based on the correlation between the evaluation attributes of the music teaching evaluation data and the comprehensive evaluation value, it can be observed that the value of each index has a distinct effect on the evaluation results. In this study, we propose a new method to define the weight of each evaluation attribute by calculating the correlation probability of attribute values within each class. Each attribute may take one of K different values. For a concrete instance z whose attribute a takes the value $a_j$ under a given class $C_i$, the correlation probability and uncorrelated probability of the attribute are calculated using equations (7) and (8), respectively:

$$P_{\mathrm{cor}}(a_j, C_i) = \frac{\operatorname{count}(a = a_j \wedge C = C_i)}{\operatorname{count}(C = C_i)}, \tag{7}$$

$$P_{\mathrm{unc}}(a_j, C_i) = \frac{\operatorname{count}(a = a_j \wedge C \neq C_i)}{\operatorname{count}(C \neq C_i)}, \tag{8}$$

where count(·) is the statistical number of samples satisfying the condition. When the value of a is $a_j$ and the sample belongs to class $C_i$, the attribute weight can be computed using the following equation:

$$w_{ij} = \frac{P_{\mathrm{cor}}(a_j, C_i)}{P_{\mathrm{cor}}(a_j, C_i) + P_{\mathrm{unc}}(a_j, C_i)}. \tag{9}$$

Hence, the equation of the WNB algorithm is as follows:

$$C(x) = \arg\max_{C_i \in C} P(C_i) \prod_{k=1}^{n} P(x_k \mid C_i)^{w_{ik}}, \tag{10}$$

where $w_{ik}$ is the weight of attribute $A_k$ taking the value $x_k$ under class $C_i$.

In a dataset, suppose there are l class labels, n attributes, and K possible values for each attribute; then the total number of weights over all attributes is l ∗ n ∗ K. Different values of the same attribute lead to different weights, and the same attribute value receives different weights under different classes. For each attribute value, the weight derived from its correlation probability with the current class label is used in the computation, the outputs for the classes are compared, and the class with the maximum value is the classification result. The specific steps of the WNB algorithm are given in Algorithm 1.

3.3. Incremental Learning of WNB Algorithm

In the WNB algorithm, incremental learning can reduce the performance requirements of the algorithm. The Bayesian classifier has a natural aptitude for incremental learning: a large part of the calculation can be carried out incrementally, which reduces the time consumption of the algorithm. At the same time, the prediction performance of the classification algorithm is not limited by the completeness of the training samples. Generally, the more complete the training samples, the stronger the prediction and generalization ability. In practice, the training samples of a classifier are accumulated gradually, and it is difficult to collect all of them at once. The incremental learning process of Bayes is to update the original class prior probability $P(C_i)$ and attribute conditional probability $P(x_k \mid C_i)$ according to the new sample data. The steps of the WNB algorithm (Algorithm 1) are as follows:

Input: test case to be classified
Step 1: scan all the training sample data and count the samples of each class label C. At the same time, for each attribute a, record the number of samples whose value is $a_j$ and the number of samples whose value is not $a_j$ in the count table.
Step 2: based on the information in the count table, calculate the correlation probability and noncorrelation probability of every attribute value using equations (7) and (8), and save the results in the RP table.
Step 3: acquire the weight parameters. Using the information in the RP table, calculate the weights of all attribute values for the various class labels using equation (9), and save them in the weight table.
Step 4: learn the prior probabilities. Using the class counts in the count table, calculate the prior probability of every class label with equation (4) and save it to the correlation probability (CP) table. Meanwhile, use equation (5) to calculate the conditional probabilities of all attributes and save the results in the conditional probability table (CPT).
Step 5: implement classification. When predicting the category of a test case, first consult the CP and CPT tables, then use the specific value of each attribute to look up the corresponding entry in the weight table. Finally, calculate the posterior probability of the case belonging to each category using equation (10), find the maximum posterior probability, and assign the corresponding class label.
Output: class label
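The following Python sketch condenses Algorithm 1 under the reconstructed equations (7)–(10) above; the weight formula and variable names are our assumptions rather than a verified reference implementation.

```python
# A compact sketch of Algorithm 1 (WNB training and classification).
from collections import Counter, defaultdict

def train_wnb(samples):
    """samples: list of (attribute-value tuple, class label)."""
    # Step 1: scan the training data and fill the count table.
    n = len(samples)
    class_count = Counter(c for _, c in samples)
    count = defaultdict(Counter)                    # count[c][(k, v)]
    values = defaultdict(set)                       # observed values of each attribute
    for xs, c in samples:
        for k, v in enumerate(xs):
            count[c][(k, v)] += 1
            values[k].add(v)

    prior, cpt, weight = {}, {}, {}
    for c, s_i in class_count.items():
        prior[c] = s_i / n                          # Step 4: prior, equation (4)
        for k, vs in values.items():
            for v in vs:
                m_in = count[c][(k, v)]             # samples of class c with value v
                m_out = sum(count[d][(k, v)] for d in class_count if d != c)
                p_cor = m_in / s_i                  # Step 2: equation (7)
                p_unc = m_out / max(n - s_i, 1)     # Step 2: equation (8)
                total = p_cor + p_unc               # Step 3: equation (9)
                weight[(c, k, v)] = p_cor / total if total else 0.0
                cpt[(c, k, v)] = m_in / s_i         # Step 4: equation (5)
    return prior, cpt, weight

def classify_wnb(x, prior, cpt, weight):
    # Step 5: weighted posterior of equation (10), argmax over classes.
    best, best_p = None, -1.0
    for c, p_c in prior.items():
        p = p_c
        for k, v in enumerate(x):
            p *= cpt.get((c, k, v), 0.0) ** weight.get((c, k, v), 1.0)
        if p > best_p:
            best, best_p = c, p
    return best
```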

The classifier does not need to retrain the classification model; it only needs to import the newly added data into the classification model and modify the relevant parameters of the classifier. The modified equation for the prior probability of the Bayesian incremental algorithm is as follows:

$$P^{*}(C_i) = \frac{n_i + \delta_i}{n + 1}, \qquad \delta_i = \begin{cases} 1, & \text{the new sample belongs to } C_i,\\ 0, & \text{otherwise.} \end{cases} \tag{11}$$

Likewise, the modified equation for the conditional probability of the Bayesian incremental algorithm is as follows:

$$P^{*}(x_k \mid C_i) = \frac{n_{ik} + \delta_{ik}}{n_i + \delta_i}, \qquad \delta_{ik} = \begin{cases} 1, & \text{the new sample belongs to } C_i \text{ and its attribute takes the value } x_k,\\ 0, & \text{otherwise,} \end{cases} \tag{12}$$

where $P^{*}(C_i)$ and $P^{*}(x_k \mid C_i)$ are the class prior probability and attribute conditional probability updated after adding new samples, n is the total number of original data records, $n_i$ is the number of original data records belonging to category $C_i$, $n_{ik}$ is the number of those records whose attribute takes the value $x_k$, and $x_k$ is the value of a feature. In addition, when a new sample is added, the number of samples of each category corresponding to the attribute values of the new sample must be counted again. Combined with the statistics of the old sample data, the correlation and noncorrelation probabilities of each attribute are updated, and hence the attribute weights. Using equations (9) and (10) together with the incremental updates of equations (11) and (12), the category of each data record x can be calculated.
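Because the priors and conditional probabilities are ratios of counts, the incremental update of equations (11) and (12) amounts to incrementing a few counters. The sketch below illustrates this under the same assumptions as the previous listing (the function and variable names are ours).

```python
# Count-based incremental update corresponding to the reconstructed
# equations (11) and (12).
from collections import Counter, defaultdict

def add_sample(xs, c, n, class_count, count):
    """Fold one new record (attribute tuple xs, class c) into the statistics."""
    n += 1                              # total records: n -> n + 1
    class_count[c] += 1                 # n_i -> n_i + 1, used by equation (11)
    for k, v in enumerate(xs):
        count[c][(k, v)] += 1           # n_ik -> n_ik + 1, used by equation (12)
    return n

# Updated probabilities are re-derived from the counters on demand:
#   P*(C_i)       = class_count[C_i] / n
#   P*(x_k | C_i) = count[C_i][(k, x_k)] / class_count[C_i]
# The attribute weights of equation (9) are refreshed from the same counts,
# so no pass over the previously processed training data is required.
```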

4. Results

4.1. Comparative Analysis of Classification Accuracy of Traditional Machine Learning Algorithms and NB Algorithm

In this section, five well-known machine learning classification algorithms are used to evaluate the existing teaching evaluation dataset, and the feasibility of each algorithm is judged by its accuracy. For each classification algorithm, we used the implementation provided by scikit-learn (Sklearn), a machine learning library for the Python language, with default parameters. The dataset was divided into 220 training samples and 70 test samples. We applied 10-fold cross-validation, in which the dataset was partitioned into 10 equal parts; at each iteration, 1 part was used for testing and the remaining 9 parts for training the classifier. The final results were averaged, as shown in Table 2.
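A sketch of this comparison protocol is shown below, using the scikit-learn estimators that correspond to the five algorithms of Section 2; the synthetic data stand in for the teaching evaluation dataset, and max_iter is raised from its default only so that the network converges.

```python
# Five classifiers scored by 10-fold cross-validated accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder data: 290 samples, 10 evaluation attributes, 5 grades.
X, y = make_classification(n_samples=290, n_features=10, n_informative=6,
                           n_classes=5, random_state=0)

models = {
    "NB":  GaussianNB(),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
    "DT":  DecisionTreeClassifier(),
    "ANN": MLPClassifier(max_iter=2000),
}
for name, clf in models.items():
    acc = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
    print(f"{name}: mean accuracy = {acc.mean():.3f}")
```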

The average classification results of these classification algorithms are shown in Figure 2.

For the same experimental dataset, the average computation time of each algorithm is shown in Table 3.

The NB algorithm achieved high classification accuracy and the minimum running time on the dataset. The highest average computation time, 34.9 s, was taken by the backpropagation (BP) neural network algorithm, whereas the NB algorithm took only 0.06 s to complete the classification task. This validates the selection of the NB algorithm as the best candidate machine learning algorithm for the construction of the music teaching evaluation model.

4.2. Comparative Analysis of Classification Accuracy of NB and WNB Algorithms

We compared the performance of the standard NB and WNB classifiers in terms of classification accuracy and computation time. All 290 music samples were divided into 10 equal parts, with 220 samples used for training and 70 samples for testing classification performance. Using 10-fold cross-validation, the classification performance of the NB and WNB algorithms was measured. The results are given in Table 4.

The experimental results show that the average accuracy of the NB algorithm is 0.71, whereas that of the WNB algorithm is 0.75. The classification accuracy of the WNB algorithm is therefore better than that of the traditional NB algorithm on the music teaching evaluation dataset.

Neural networks have rapidly gained traction in the music field [21]. Among the different classes of neural network, the BP network is probably the most widely used. When the BP neural network processes the training dataset, it normalizes the original score values (percentage system) into decimals in the [0, 1] interval and sets an error threshold to form a model that identifies the evaluation level of new sample data. We experimented with the BP algorithm, where 220 music samples were randomly chosen as the training set and 70 samples as the test set. According to the number of attributes, the input layer had 10 nodes, the hidden layer 5 nodes, and the output layer 1 node; the activation function was tanh(), the learning rate 0.01, and the number of training cycles 10,000. The test results after training are shown in Table 5.
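The configuration just described can be approximated with scikit-learn's MLPRegressor, as in the sketch below; the placeholder scores and random features are illustrative assumptions, not the authors' data.

```python
# BP-style network: 10 inputs, one hidden layer of 5 tanh units, 1 output,
# learning rate 0.01, 10,000 training cycles, scores normalized into [0, 1].
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train = rng.random((220, 10))          # 220 training samples, 10 attributes
raw = rng.integers(60, 101, 220)         # placeholder percentage scores
y_train = raw / 100.0                    # normalize scores into [0, 1]

bp = MLPRegressor(hidden_layer_sizes=(5,), activation="tanh", solver="sgd",
                  learning_rate_init=0.01, max_iter=10000)
bp.fit(X_train, y_train)

X_test = rng.random((70, 10))            # 70 test samples
pred = bp.predict(X_test)                # predicted normalized evaluation values
```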

In the actual evaluation process, the percentage scores given are generally very high, which leads to model overfitting and hence to uniformly high predicted levels. Therefore, after preprocessing, the percentage scores were discretized into evaluation values on a five-grade scale, and data of different grades were randomly selected and mixed into the training dataset, with 220 samples in the training set and 70 in the test set. The results of the BP network and WNB algorithm were computed and are shown in Table 6 and Figure 3.

The comparison of classification accuracy between the BP and WNB algorithms reported in Table 6 is plotted in Figure 3.

In the actual teaching evaluation dataset, excellent grades are far more numerous than the other grades, so when graded data are used to train the classification model, the experimental results depend heavily on which training subsets are extracted. In this experiment, the average accuracy of the WNB algorithm is 0.767 and that of the BP algorithm is 0.683, showing that WNB outperforms BP. Likewise, the average computation time of WNB is 0.025 s, while that of the BP algorithm is about 34 s. Compared with the BP algorithm, the WNB algorithm consumes less time and achieves higher recognition accuracy; hence, the WNB algorithm has clear advantages in music teaching evaluation.

4.3. Experimental Comparison of WNB and Incremental Learning-Based WNB

We constructed an incremental classification model based on WNB with 220 training samples and 70 test samples and gradually increased the training sample set. The calculation results for the test samples in each phase of random-sampling increments are shown in Table 7.

Using the incremental learning strategy, the calculation results lean more toward the correct category; that is, the probability value of the correct category increases while the probability values of the other categories decrease. The changes in average classification accuracy under the incremental strategy are shown in Table 8, which shows only a very small degradation in performance as more samples are added.

The WNB algorithm with incremental learning is referred to as the incremental WNB algorithm. Using the same experimental dataset, the time consumption of WNB with the incremental strategy is shown in Figure 4.

The introduction of the incremental model not only improves the classification model but also saves time: because the incremental model does not need to retrain on the previously trained dataset, it only classifies the newly added data, merges the results directly with the previous training values, and updates the relevant parameters of the model, which increases the effectiveness of the classification model. Although the results of the proposed model are encouraging, more work is required to test its performance on other musical datasets and against more advanced machine learning algorithms.

5. Conclusion

This article explored the construction of a music teaching evaluation model using machine learning techniques. After comparing the potential of traditional classification algorithms, the weighted Naïve Bayes algorithm was selected as the base algorithm for evaluating the effect of music teaching. In addition, the concept of incremental learning was introduced to improve the classification performance of the weighted Naïve Bayes classifier. The experimental results show that the incremental learning approach not only optimizes the performance of the classifier but also reduces the computation time of the algorithm. The results validate that the proposed algorithm is superior to other algorithms for music teaching evaluation.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.