Abstract

Because the artistry of a work cannot be described precisely, identifying reproducible plagiarism is difficult. Identifying reproducible plagiarism of digital image works requires in-depth study of the artistry of artistic works. This paper proposes a remote judgment method for plagiarism of painting image style based on wireless network multitask learning. First, edge sampling based on the multitask learning algorithm removes the uncertainty of the painting image samples. Next, the deep-level details of the painting image are extracted through the multitask classification kernel function, and most of the pixels in the image are eliminated from consideration. Finally, when the clustering density is greater than the judgment threshold, the two images are considered spatially consistent and, on that basis, similar; that is, the painting is judged to involve plagiarism. The experimental results show that the discrimination rate is always close to 100%, the misjudgment rate for plagiarism of painting images is reduced, and the evaluation indicators of the discrimination process are the lowest among the compared methods, which shows that the method yields satisfactory discrimination results.

1. Introduction

In a commercial society, competition in the cultural industry has intensified, and commercial imitation and plagiarism can directly affect the economic interests of both parties [1]. To avoid future infringement disputes, creators consciously avoid copying a work exactly as it is. Instead, borrowing and imitation, in which fragments or expressions are taken from others’ works and absorbed into one’s own, have become a creative norm, and this frequently triggers plagiarism disputes [2]. Whether the similarity between works constitutes infringement must be decided by balancing personal interests against social interests: obvious plagiarism of other people’s works cannot be ignored, but neither can every act that resembles plagiarism be treated as infringement without further consideration. In painting, plagiarism often means borrowing the content and style of an existing work while consciously changing the color, angle, and relative position of the main content, changing the size, shape, and details of objects, and redrawing other objects in the painting [3].

The intellectual property rights of digital image works are mainly embodied in the use of color, composition, and artistry of expression. Artistic plagiarism of digital image works is regenerative [4]. As with other forms of digital works, regenerative plagiarism is difficult to identify, and it is even harder for images because the artistry of a work cannot be described precisely. Identifying it requires in-depth study of the artistry of artistic works and refining that artistry into quantifiable artistic indicators; the two works are then tested for similarity against these indicators. The identification of reproducible plagiarism of digital image works will be an important direction of plagiarism identification research in the future.

Reference [5] discusses painting style conversion in non-photorealistic rendering (NPR). It uses image representations derived from convolutional neural networks to design an art style algorithm that can separate and recombine the content and style of an image and produce high-quality new images. Taking the painting style of the Xin’an School as an example, the algorithm converts a sample image into a new image in that style.

Reference [6] proposes the Cycle GANSN algorithm. A spectral normalization layer is added after each convolutional layer of the Cycle GAN discriminant network, the spectral norm of the convolutional layer parameter matrix is estimated by the power iteration method, and stochastic gradient descent is used to update the convolutional layer parameters. Because the parameters change very little in each update, a single iteration is enough to estimate the maximum singular value of the matrix quickly. The convolutional layer parameters are then normalized by this maximum singular value, so that the entire discriminant network satisfies 1-Lipschitz continuity.

Reference [7] proposes recognizing similar parts across multiple execution trajectories of a multithreaded program under the same input and abstracting behavior motifs that are not easily affected by thread interleaving, in order to detect plagiarism of multithreaded programs. The method captures the dynamic execution trajectory of the program. After trajectory pruning, gram matching, expansion, and abstraction, motif birthmarks are extracted to model the behavior of the program. Potential plagiarism between programs is then determined by measuring the similarity of the motif birthmarks.

Reference [8] investigates the usefulness of the normalized compression distance (NCD) for image similarity detection. Instead of the direct NCD between images, it considers the correlation between NCD-based feature vectors extracted for each image. The vectors are derived by computing the NCD between the original image and sequences of translated or rotated versions. Feature vectors for simple transforms (circular translations in the horizontal, vertical, and diagonal directions and rotations around the image center) and several standard compressors are generated and tested in a simple similarity detection experiment between the original image and two filtered versions (median and moving average). The promising vector configurations (geometric transform and lossless compressor) are further tested on the 24 images of the Kodak set after some common image processing operations. While the direct computation of NCD fails to detect image similarity even under simple windowed median and moving average filtering, for certain transforms and compressors the proposed approach appears robust against smoothing, lossy compression, contrast enhancement, and noise addition, and somewhat robust against geometrical transforms (scaling, cropping, and rotation).

Reference [9] observes that binary code similarity detection (BCSD) plays an important role in malware analysis and vulnerability discovery. Existing methods rely mainly on expert knowledge, which may not be reliable in some cases, and their detection accuracy is not satisfactory. To address these issues, the authors propose BinDeep, a deep learning approach for binary code similarity detection. The method first extracts the instruction sequence from a binary function and uses an instruction embedding model to vectorize the instruction features. BinDeep then applies a Recurrent Neural Network (RNN) model to identify the specific types of the two functions to be compared and, according to the type information, selects the corresponding deep learning model for similarity comparison. Specifically, BinDeep uses Siamese neural networks that combine LSTM and CNN to measure the similarity of the two target functions. Unlike traditional deep learning models, this hybrid model takes advantage of both CNN spatial structure learning and LSTM sequence learning. The evaluation shows that the approach achieves good BCSD across architectures, compilers, optimization levels, and versions of binary code.

For two images that are exactly the same, the pixel values at corresponding positions are the same. At present, image consistency detection software on the Internet commonly uses a cryptographic hash algorithm for this judgment. When the pixel values of two images are exactly the same, the results calculated by the cryptographic hash algorithm are also exactly the same, so verifying that the results are equal is enough to judge that the image content is identical. However, this method requires a large amount of calculation for point-by-point discrimination and demands exact image consistency. If the details of the image are modified, for example by toning, cropping, or zooming, the calculated results are completely different.
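The limitation described above can be illustrated with a minimal sketch. It assumes a grayscale image stored as rows of 0-255 integers and uses SHA-256 as the cryptographic hash; the function name `image_digest` and the sample values are hypothetical and are not taken from any of the cited software.

```python
import hashlib


def image_digest(pixels):
    """SHA-256 digest of a grayscale image given as rows of 0-255 ints.

    The image size is hashed too, so images with the same pixel stream
    but different shapes do not collide. (Illustrative helper only.)
    """
    h = hashlib.sha256()
    h.update(f"{len(pixels)}x{len(pixels[0])}".encode())
    for row in pixels:
        h.update(bytes(row))
    return h.hexdigest()


original = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
exact_copy = [row[:] for row in original]

# "Toning": change one pixel by a single gray level.
toned = [row[:] for row in original]
toned[1][1] += 1

same = image_digest(original) == image_digest(exact_copy)
differs = image_digest(original) != image_digest(toned)
print(same, differs)
```

An exact copy produces an identical digest, while a one-level change in a single pixel produces an unrelated digest, which is why hash comparison cannot detect modified copies.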

This paper proposes a remote judgment method for painting image style plagiarism based on wireless network multitask learning. First, edge sampling based on the multitask learning algorithm removes the uncertainty of the painting image samples. Second, the multitask classification kernel function extracts the deep-level details of the painting image, and most of the pixels in the image are excluded. Last, when the cluster density is greater than the threshold value, the two images are considered spatially consistent and, on that basis, similar; that is, the painting is judged to involve plagiarism.

2. Wireless Network Multitask Learning

2.1. Remove the Uncertainty of Painting Image Samples


The painting image sample describes the storage capacity of the network painting image. With the support of the multitask learning algorithm, the larger the painting image sample space, the stronger the storage capacity of the matching network painting image [10]. The uncertainty of painting image samples is a generalized property. In the painting image sample space, two adjacent painting image data items can be arbitrarily close, but the uncertainty is never completely eliminated; the larger the sample space of the painting image, the weaker the sample uncertainty [11]. Selecting painting image samples with the multitask learning algorithm better avoids the influence of uncertainty conditions on the Internet storage capacity, and all instruction behaviors in the entire storage process retain the multitasking ability. Therefore, whether the painting image sample space grows or shrinks, the multitask learning algorithm can stabilize the transmission and application of valuable information parameters. The definition uses two quantities: the information measurement value in the painting image sample space, and the storage amount of the painting image sample at a given measurement value. Combining these quantities gives the uncertainty condition of the painting image sample based on the multitask learning algorithm, formula (1), which is expressed in terms of the valuable information feature value in the painting image sample space and the uncertain search condition of the painting image sample.

Edge sampling is a common processing method in multitask learning algorithms. Because painting image samples are subject to uncertainty conditions, this processing step can directly determine the final result of painting image classification. When the information space of the painting image is large, the real value of the edge sampling coefficient increases; when the information space is small, it decreases [12]. In other words, the edge sampling coefficient is a flexible index parameter that can influence the actual processing behavior of painting image classification. As the uncertainty of painting image samples weakens, the amount of valuable painting image information affected by the multitask learning algorithm gradually increases. This is also the main reason why the edge sampling coefficient can, to a certain extent, determine the direction of painting image classification processing [13]. The coefficient is built from two sampling parameter values: that of the valuable painting image information and that of the multitask learning algorithm. Combining these with formula (1) gives the edge sampling coefficient of painting image classification, which is expressed in terms of two different items of valuable painting image information and an information extraction coefficient.

2.2. Extraction of Deep-Level Detail Feature of Painting Image

The multitask classification kernel function determines the practical applicability of the multitask learning algorithm. When the uncertainty condition of the painting image sample permits, the network host often needs to store a large amount of painting image sample data at the same time. As the edge sampling coefficient increases, the range of the kernel function continues to expand until the multitask learning algorithm can fully meet the actual storage requirements of the painting image sample data. If the exact value of the edge sampling coefficient is known, the space in which the painting image sample data have been stored can be arranged according to the specific requirements of the multitask learning algorithm. This provides information parameter support for subsequent image classification and collaborative processing, and it also avoids an obvious accumulation of stored image data [14]. The definition uses a first image information classification condition together with further image information classification conditions. Combining these with formula (2) defines the classification kernel function based on the multitask learning algorithm, which is expressed in terms of different image information classification coefficient items and the average value of those coefficient items.

For painting image reconstruction, it is first assumed that the two kinds of images in a sample pair each have a sparse representation over their respective dictionaries. The image sample features are then obtained through the PCA Net deep network, and a pair of over-complete dictionaries is obtained by combining the dictionaries. In the reconstruction stage, the input image is processed in the same way: deep-level features are extracted through PCA Net, and the sparse representation coefficients of these features are applied directly to the paired dictionary to obtain the corresponding painting feature image, which realizes the reconstruction of the painting image [15]. Mining image samples through the PCA Net deep network yields better image features than nondeep networks. The resulting deep feature dictionary also has better descriptive ability and significantly improves the quality of image reconstruction.

The fuzzy images in the training data set are sampled and adjusted to the same size, the painting images corresponding to them are obtained, and each pair is combined into a model sample pair whose two components represent the painting features. Matrix blocks are then calculated for all samples in the data set with a square sliding window (normally 3, 5, or 7 image pixels on a side). After features are extracted from all images with this sliding window, a new data matrix is obtained. Each column of this matrix represents one image block and contains all the pixels of that block.
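The construction of the block data matrix can be sketched as follows. This is a minimal illustration under stated assumptions: the window moves with a stride of one pixel, each block is flattened row by row, and no mean removal is applied. The function name `patches_to_columns` is hypothetical.

```python
def patches_to_columns(image, k):
    """Collect every k x k block of a 2-D image as one flattened column.

    Assumes stride 1 and row-major flattening; the returned list holds
    one entry (column) per block, each with k * k elements.
    """
    rows, cols = len(image), len(image[0])
    columns = []
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            block = [image[r + dr][c + dc]
                     for dr in range(k) for dc in range(k)]
            columns.append(block)
    return columns


# A 4 x 4 test image whose pixel value encodes its position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
cols = patches_to_columns(image, 3)
print(len(cols), len(cols[0]))
```

For a 4 x 4 image and a 3 x 3 window this gives 4 columns of 9 elements each; in general an image of height H and width W gives (H - k + 1)(W - k + 1) columns under these assumptions.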

The formula for obtaining the painting image training sample is as follows:

The features are extracted using the data matrix obtained above, and the extracted features are regarded as feature samples in the ScSR model and substituted into the feature dictionary of PCA Net.

Suppose the painting image result after sparse coding is given. The sparse coding result is quantized, the histogram coding is completed, and the deep-level detail features of the painting image are extracted.

The second image of the pair goes through the same processing as the painting image, which gives the extraction result for its deep-level details.

In these formulas, the two outputs are the feature extraction results for the two images, Bhist represents the histogram encoding process, and the remaining parameter is the number of image sample blocks into which the image is divided.
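The quantization and block-wise histogram coding step can be sketched as below. The exact form of Bhist is not recoverable from the text, so this is only an assumed, PCA Net-style variant: coefficients are quantized into a fixed number of integer levels, the map is split into non-overlapping square blocks, and the per-block histograms are concatenated. The names `quantize` and `block_histograms` are hypothetical.

```python
def quantize(coeffs, levels, lo, hi):
    """Map real coefficients in [lo, hi] to integer codes 0..levels-1."""
    scale = levels / (hi - lo)
    return [[min(levels - 1, int((v - lo) * scale)) for v in row]
            for row in coeffs]


def block_histograms(code_map, block, levels):
    """Concatenate the histograms of non-overlapping block x block tiles."""
    rows, cols = len(code_map), len(code_map[0])
    feature = []
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            hist = [0] * levels
            for r in range(r0, r0 + block):
                for c in range(c0, c0 + block):
                    hist[code_map[r][c]] += 1
            feature.extend(hist)
    return feature


codes = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [1, 1, 0, 1],
         [1, 1, 0, 0]]
feature = block_histograms(codes, block=2, levels=2)
print(feature)
```

The feature length equals the number of blocks times the number of quantization levels, so the number of divided blocks directly controls the dimensionality of the extracted feature.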

Take a pixel as the detection point and draw a discrete circle of radius 3 centered on it, which gives 16 pixels on the circumference surrounding the central pixel. A threshold suitable for the feature point is then set. The detection point is judged to be a corner point when a sufficiently long run of consecutive pixels on the circle all have gray values larger than the gray value of the central pixel plus the threshold, or all have gray values smaller than the gray value of the central pixel minus the threshold. Feature extraction is shown in Figure 1.

A scoring function assigns a score to each detection point from the pixel values of the pixels on the circle and the pixel value of the detection point, and the point with the largest score within the radius is taken as the distinctive feature. To increase the calculation speed, a fixed number of consecutive pixels is generally selected for the test. The 16 pixels on the circle are numbered clockwise in turn. In corner detection, the 4 pixels numbered 1, 5, 9, and 13 can be checked first; only when at least 3 of these 4 pixels satisfy the above condition can the point be a corner candidate. Therefore, the FAST feature extraction algorithm can quickly eliminate most of the pixels in the image, improve the calculation efficiency of corner detection, and save time.
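The corner test described above can be sketched as follows. This is a minimal illustration of the FAST segment test, not the authors’ implementation: the run length of 12 is an assumption (it is the value for which the check on pixels 1, 5, 9, and 13 is a valid shortcut), the scoring function is omitted, and the names `CIRCLE` and `is_fast_corner` are hypothetical.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 discrete circle,
# numbered clockwise from the top; y grows downward.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_fast_corner(img, x, y, t, n=12):
    """Segment test: n contiguous circle pixels all brighter than
    center + t or all darker than center - t (assumed n = 12)."""
    center = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]

    # Quick rejection on pixels 1, 5, 9, 13 (indices 0, 4, 8, 12).
    # Only valid when n >= 12, so it is skipped for shorter runs.
    if n >= 12:
        quick = [ring[i] for i in (0, 4, 8, 12)]
        brighter = sum(v > center + t for v in quick)
        darker = sum(v < center - t for v in quick)
        if brighter < 3 and darker < 3:
            return False

    # Full test; the flag list is doubled so runs may wrap around.
    for sign in (1, -1):
        flags = [sign * (v - center) > t for v in ring]
        run = 0
        for f in flags + flags:
            run = run + 1 if f else 0
            if run >= n:
                return True
    return False


def make_image(bright_indices):
    """7 x 7 image of gray level 100 with selected circle pixels at 200."""
    img = [[100] * 7 for _ in range(7)]
    for i in bright_indices:
        dx, dy = CIRCLE[i]
        img[3 + dy][3 + dx] = 200
    return img


print(is_fast_corner(make_image(range(12)), 3, 3, t=20))
```

Most pixels fail the four-pixel check and never reach the full 16-pixel loop, which is where the speed advantage mentioned in the text comes from.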

3. Remote Judgment of Plagiarism of Painting Image Style

If two pictures of paintings are spatially consistent, the two works can be considered similar. According to the principle of spatial consistency, if two pictures are similar, the lines connecting the matching feature points of the two pictures are parallel [16]. Therefore, most of the angle values obtained after the Hough transform fall within the same angle clustering range.

Assume two images A and B of given pixel sizes. The image copy detection process is as follows:

(1) Image Expansion. Expand image A into an image G by tiling A to the right and downward, as shown in Figure 2.

(2) Scan Comparison. First, align the upper edge of B with the upper edge of G and the left edge of B with the left edge of G, so that B covers part of G. Compare B with the part of G covered by B, as shown in Figure 3(a). Then shift B to the right by one pixel and again compare B with the part of G it covers. Continue in this way until the right edge of B is aligned with the right edge of G, as shown in Figures 3(b) and 3(c). Next, move B back to the left of G so that its left edge is aligned with the left edge of G, but with its top edge moved down by one pixel, and repeat the horizontal scan comparison until the right edge of B is aligned with the right edge of G, as shown in Figures 3(d)-3(f). This continues until the lower edge of B is aligned with the lower edge of G and the right edge of B is aligned with the right edge of G, as shown in Figures 3(g)-3(i).

(3) Comparison Result. Each comparison in the scan yields a copy amount, a copy pattern, and a copy method. The maximum copy amount is taken as the copy amount of A and B, and the copy pattern with the largest copy amount is taken as the copy pattern of A and B.

(4) Copy Judgment. According to the set copy amount threshold, it is judged whether copying between A and B is possible; the final decision is then made by human review.
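The four steps above can be sketched as follows. Several details are assumptions made only for illustration: G is taken to be a 2 x 2 tiling of A, the copy amount is taken to be the number of exactly matching pixels, B is assumed to be no larger than G, and the copy pattern is represented simply by the offset of the best match. The function names are hypothetical.

```python
def tile(a, reps_y=2, reps_x=2):
    """Step (1): expand A into G by tiling to the right and downward."""
    return [row * reps_x for _ in range(reps_y) for row in a]


def max_copy_amount(a, b):
    """Steps (2)-(3): slide B over G one pixel at a time, left to right
    and then top to bottom, and keep the largest count of equal pixels.

    Returns (copy_amount, (top, left)) for the first best position.
    """
    g = tile(a)
    gh, gw = len(g), len(g[0])
    bh, bw = len(b), len(b[0])
    best = (0, (0, 0))
    for top in range(gh - bh + 1):
        for left in range(gw - bw + 1):
            amount = sum(b[r][c] == g[top + r][left + c]
                         for r in range(bh) for c in range(bw))
            if amount > best[0]:
                best = (amount, (top, left))
    return best


def is_possible_copy(a, b, threshold):
    """Step (4): compare the copy ratio with a chosen threshold."""
    amount, _ = max_copy_amount(a, b)
    return amount / (len(b) * len(b[0])) >= threshold


a = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
crop = [[5, 6], [8, 9]]      # a plain crop of A
wrapped = [[9, 7], [3, 1]]   # a crop that wraps around the border of A
print(max_copy_amount(a, crop), max_copy_amount(a, wrapped))
```

The second example matches only because A was tiled, which suggests why the expansion step precedes the scan: a cyclically shifted copy crosses the border of A but lies entirely inside G.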

When some angle in the interval makes the calculation result satisfy formula (2), the images are judged to be spatially consistent, that is, the two pictures are similar. Here FW is the threshold that controls the clustering range, and TH is the judgment threshold. The formula also involves the total number of connections, the number of connections whose angle after transformation equals a given value, and the total number of angles that fall within the clustering range of the interval [17]. In other words, when the cluster density in any interval range is greater than the judgment threshold, the two images are considered spatially consistent. On this basis the two images are also judged to be similar, that is, the painting is judged to involve plagiarism.
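The cluster density rule can be sketched as below. The exact window used in the original formula is not recoverable, so this sketch assumes a half-open window of width FW that may start at any observed angle, and it ignores wrap-around at the ends of the angle range. The function names are hypothetical; `fw` and `th` stand for FW and TH.

```python
def max_cluster_density(angles, fw):
    """Largest fraction of connection angles inside any window of width fw.

    Uses a sliding window over the sorted angles; a window holds angles
    a_j..a_i with a_i - a_j < fw (assumed half-open interval).
    """
    if not angles:
        return 0.0
    s = sorted(angles)
    best = 0
    j = 0
    for i in range(len(s)):
        while s[i] - s[j] >= fw:
            j += 1
        best = max(best, i - j + 1)
    return best / len(s)


def spatially_consistent(angles, fw, th):
    """Judge similarity: some cluster density must exceed the threshold."""
    return max_cluster_density(angles, fw) > th


similar = [30, 31, 29, 30.5, 30, 80, 120, 10]       # most lines nearly parallel
dissimilar = [0, 20, 40, 60, 80, 100, 120, 140]     # angles spread out
print(max_cluster_density(similar, fw=5), max_cluster_density(dissimilar, fw=5))
```

With a width of 5 degrees, the first set has 5 of its 8 angles in one cluster and passes a threshold of 0.5, while the second set never has more than one angle per window.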

Different cluster width values give different judgment results, and so do different judgment thresholds. The cluster width and the judgment threshold are therefore important parameters that need to be tuned. The image clustering density curve is shown in Figure 4.

The blue line is the cluster density curve for similar images, and the orange line is the cluster density curve for dissimilar images. As the cluster width FW increases, the probability of meeting the specified threshold TH also increases, so the input image is more likely to be judged as a similar image; as FW decreases, the opposite holds. In addition, for the same cluster width, lowering the threshold TH increases the probability that the input image is judged as a similar image, and raising it decreases that probability.

4. Experimental Study

For the experiment, 1620 pairs of typical images were selected as the training set from the works of the National Science and Technology Innovation Competition for Children’s Science Fantasy Painting Competition, and a parameter optimization experiment was carried out. In addition, a total of 30,870 pairs of images were selected as the test set to verify the reliability of the algorithm. The main performance indicators of the painting duplicate check algorithm are the precision rate and the recall rate.

Discrimination rate: the ratio of the number of identified incidents to the total number of painting image plagiarism incidents within a certain time range when the discrimination method is applied. It is calculated as the number of identified painting image plagiarism incidents divided by the number of actual painting image plagiarism incidents. To verify the performance of the method proposed in this paper, the same experiment is also run with the five methods proposed in References [5-9], and the IR-FIR results are shown in Figure 5.

It can be seen from the above figure that, at the same false alarm rate, the discrimination rate of the proposed method is always better than that of the five methods proposed in References [5-9]. When the false alarm rate reaches 1%, the discrimination rate of the proposed method is close to 82%, compared with 43% for the method in Reference [7], 57% for the method in Reference [8], 70% for the method in Reference [9], and 63% and 61% for the methods in References [5, 6], respectively. The proposed method therefore has the best discrimination performance. At the same discrimination rate, the misjudgment rate of the proposed method is also lower than that of the methods in References [7-9]: when the discrimination rate is 60%, their misjudgment rates are 2.0%, 1.2%, and 0.7%, respectively, while that of the proposed method is 0.4%. A possible reason is that the proposed method makes full use of the advantages of multitask learning and can use acquired knowledge to obtain a better classification effect. Overall, the figure indicates that the proposed method is significantly more effective than the compared methods.

False alarm rate: the ratio of the number of false alarm events to the total number of detected events within a certain time range, that is, the number of false alarms divided by the number of detected events.

Average discrimination time: the average, over all incidents within a certain time interval, of the time difference between when an incident occurs and when the algorithm finishes discriminating it. It is calculated as the mean difference between the estimated time and the actual time of each event, taken over the number of incidents. Because detectors and painting image discrimination methods differ, the average detection time also varies, within a range of 4-10 min.
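The three indicators defined in this section can be sketched as simple functions. The original symbols are not recoverable, so the function and parameter names below are hypothetical; rates are returned as percentages, and the discrimination time is assumed to be the estimated time minus the actual time of each event.

```python
def discrimination_rate(identified, actual):
    """Identified plagiarism incidents as a percentage of actual ones."""
    return 100.0 * identified / actual


def false_alarm_rate(false_alarms, detected):
    """False alarms as a percentage of all detected events."""
    return 100.0 * false_alarms / detected


def mean_discrimination_time(actual_times, estimated_times):
    """Average of (estimated time - actual time) over all events."""
    diffs = [e - a for a, e in zip(actual_times, estimated_times)]
    return sum(diffs) / len(diffs)


# Illustrative numbers only, not experimental data from this paper.
print(discrimination_rate(82, 100))
print(false_alarm_rate(1, 100))
print(mean_discrimination_time([0, 10, 20], [5, 16, 27]))
```

Reporting the discrimination rate at a fixed false alarm rate, as in the IR-FIR comparison, keeps the two ratios from being traded against each other unnoticed.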

Result analysis of false alarm rate and average discrimination time is shown in Table 1.

It can be seen from the above table that the proposed method meets the requirements of the three-level alarm-related indicators and reduces the false judgment rate for plagiarism of painting images. While meeting the requirements of the first-level alarm, it also meets the requirements of the third-level alarm indicators, which ensures efficient management of the identification of plagiarism in painting images.

In order to verify the accuracy of the judgment result, the mean square percentage error index is selected as the evaluation index. The specific calculation formula is as follows:

In the above formula, MSPE reflects the distribution of errors, that is, the degree of deviation between the predicted value and the true value: the smaller the MSPE, the more accurate the judgment. In this paper, the average of the MSPEs is used as the forecast index, which is also used to compare the performance of the methods. The forecast index is defined as formula (13):

$$\mathrm{FI} = \frac{1}{N}\sum_{k=1}^{N}\mathrm{MSPE}_k$$

where FI is the forecast index, $\sum_{k}\mathrm{MSPE}_k$ is the sum of the MSPE values over the $N$ experiments, and $N$ is the number of experiments. Figure 6 compares the forecast index values of the different methods.
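The forecast index can be sketched as the plain average of per-experiment MSPE values. Because the MSPE formula itself is not recoverable from the text, the `mspe` function below is only an assumed variant (the mean of squared relative errors); the function names and sample values are hypothetical.

```python
def mspe(true_values, predicted):
    """Assumed MSPE: mean of squared relative errors ((y - y_hat) / y)^2."""
    terms = [((t - p) / t) ** 2 for t, p in zip(true_values, predicted)]
    return sum(terms) / len(terms)


def forecast_index(mspe_values):
    """FI: sum of the MSPE values divided by the number of experiments."""
    return sum(mspe_values) / len(mspe_values)


# Illustrative numbers only, not experimental data from this paper.
run_1 = mspe([1.0, 2.0], [1.0, 1.0])
run_2 = mspe([2.0, 4.0], [1.0, 2.0])
print(run_1, run_2, forecast_index([run_1, run_2]))
```

Averaging over experiments smooths out single runs with unusually large error, so a lower FI indicates more consistently accurate judgment.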

Figure 6 shows the forecast index of the six methods, comparing the method proposed in this paper with the methods proposed in References [5-9]. Among the six methods for identifying plagiarism in painting images, the proposed method has the lowest indicator, which shows that it achieves satisfactory results. Note that when the number of experiments is 40, the FI of the proposed method is much higher. The main reason is that the method relies on edge sampling in the multitask learning algorithm to remove the uncertainty of painting image samples, and negative transfer may occur in multitask learning, so its performance may be poor at times. With the accumulation of knowledge and continued training, however, the network performance eventually becomes very good; in this sense the method takes advantage of “experience.” The multitask classification kernel function extracts the deep-level details of the painting image, eliminates most of the pixels in the image, and improves the calculation efficiency of corner detection.

For References [8, 9], the FI of the method in Reference [9] is higher than that of the method in Reference [8] when the number of experiments is 30, possibly because the performance of deep learning is affected by parameter changes during learning. Meanwhile, the value for the method in Reference [7] is higher than for the other four compared methods. In the other cases, the FI of the method in Reference [7] is lower than that of the method in Reference [6], again mainly because of parameter changes during learning. The methods in References [6, 9] are both based on neural networks and therefore share the same drawbacks. Because it is built on multitask learning, the proposed method can take advantage of transfer learning and reuse acquired “knowledge,” which further improves its performance.

5. Conclusion

This paper proposes a remote judgment method for painting image style plagiarism based on a wireless network multitask learning strategy. First, the method uses edge sampling based on the multitask learning algorithm to remove the uncertainty of painting image samples. Second, the deep-level detail features of painting images are extracted through the multitask classification kernel function, and most of the pixels in the image are excluded. Finally, when the clustering density is greater than the determination threshold, the two images are considered spatially consistent and, on that basis, similar; that is, the painting is judged to involve plagiarism.

In this way, the judgment of “substantial similarity” becomes more reasonable, and the judgment result more objective and fair. To make trial results more predictable to the public, deeper discussion and research on the judgment of “substantial similarity” in copyright infringement are still needed.

The proposed method uses the basic features of the painting image for similarity detection. Edge sampling removes the uncertainty of the painting image samples, and the multitask classification kernel function then extracts the deep-level detail features of the painting images. In sum, the method extracts features of painting images from multiple angles to help judge whether plagiarism exists. However, the multitask learning problem is nonconvex, and its solution is generally difficult to obtain, so the method is difficult to apply in a practical system. In future work, more efficient feature extraction algorithms need to be introduced.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This study was supported by the 2019 Ministry of Education Humanities and Social Sciences Research Planning Project “Research on the Image of Utensils in the Remains of Ancient Murals in Shanxi-From Song and Jin to Ming and Qing” (approval number: 19YJA760066) and the 2018 Shanxi Province Soft Science Research Program Project: “Study on the Construction and Mechanism of Innovation and Entrepreneurship Environment for College Students of Art and Design in Shanxi Province under the Background of the New Era” (2018011038-1).