Abstract

With the improvement of material living standards, cultural and recreational entertainment has become increasingly important, and film and television, as one of its most popular forms, is receiving growing attention. In recent years, the film industry has developed rapidly, and the output of animation movies has increased year by year. How to quickly and accurately find a user’s favorite movies in this huge amount of animation movie data has become an urgent problem. Against this background, this article studies the new visual expression of animation movies based on artificial intelligence and machine learning technology. Taking the informatization and intelligent upgrading of the film industry as its background, it uses computer vision and machine learning technology to explore new methods and new models for realizing film visual expression and proposes strategic-level considerations to promote the innovative development of film visual expression. The article takes the Hollywood animated movie “Kung Fu Panda” as a sample and uses a convolutional neural network algorithm to study its new visual expression. The study found that, once the parameters of the model were determined, the accuracy on the test set did not change much and stayed at around 57%. This work is of significance for improving the audiovisual quality and creative standards of film works and for promoting the healthy and sustainable development of the film industry.

1. Introduction

The rapid development of the Internet industry has brought great convenience to users, but the amount of information has also grown exponentially [1, 2]. The cost for users of obtaining valuable information has greatly increased, and helping Internet users find valuable content efficiently in massive information data has become an important issue. The visual performance of film is an enduring topic in the field of film. It cannot be separated from the development of the times, the progress of the industry, and the development and application of new technologies [3]. Visual performance presents the conception of a film work naturally, smoothly, and efficiently: it shows the feelings of the characters, explains their experiences, and uses visual elements to convey the world view, outlook on life, and values of the work [4]. As society as a whole becomes more informatized and intelligent, the visual performance of film needs to keep pace with the times and continuously improve the visual impact of film works [5, 6]. Therefore, it is very important to innovate the visual performance of movies and to explore new ways of visual performance that match the characteristics of the times and the needs of the industry [7, 8].

Visual media began with the emergence of printed materials such as newspapers and magazines. Visual designers constantly think about how information is conveyed, that is, about the delicate relationship between the designer and the audience and the effectiveness of information transmission [9]. New visual media are a new generation of media that are more popular with, and more generally accepted by, the public. After immersive visual expression and other types of design opened up a new field of visual experience, visual designers adopted an approach that combines several senses, so that participants share the same experience and thus gain synchronicity and interaction [10, 11]. The emergence of interactive media such as the Internet has broadened the scope of traditional visual media. However, because audience acceptance varies, the effect of conveying information or emotion is often unsatisfactory [12, 13]. Visual designers therefore began to think about how to make the audience as empathetic as possible, since a work reflects its real value only when the designer and the audience resonate [14]. Visual artists are not only designers; they are also practitioners of technology and psychologists. The participant is an indispensable, even irreplaceable, component of the information or emotion, and the pleasure, imagination, comprehension, discovery, and exploration produced in the process contribute to the integrity of the expression of artistic ideas [15, 16].

The purpose of artificial intelligence (AI) is to enable machines to perform intelligent tasks that would otherwise require human processing. AI-based prediction models have great potential for solving complex environmental applications that involve a large number of independent parameters and nonlinear relationships. Owing to their predictive ability and nonlinear characteristics, several AI-based modeling techniques, such as artificial neural networks, fuzzy logic, and adaptive neural fuzzy inference systems, have recently been adopted to model various real processes in the field of energy engineering [17]. De Raedt et al. discuss the foundations of two widely used AI-based technologies, namely, artificial neural networks and support vector machines, and highlight the important mathematical aspects of these methods. They describe the computational problems and the respective advantages of these methods and note that many implementations of these models are now available in software repositories [18]. In recent years, deep learning has shown encouraging results in various artificial intelligence applications. Foulquier et al. compared a deep learning architecture, the deep neural network (DNN), with the classical random forest (RF) machine learning algorithm for malware classification. They studied the performance of classic RF and of DNNs with 2-, 4-, and 7-layer architectures on 4 different feature sets and found that the accuracy of classic RF is better than that of the DNN regardless of the feature input [7]. Research on artificial intelligence has become a hot spot in academia and industry, and with it has come interest in whether artificial intelligence can make the right decisions when compared with human decisions. Because of the complex relationships between the variables of interest, predicting the outcome of a sporting event has traditionally been considered a difficult task.
Because human decision-making is only boundedly rational, attempts to make accurate predictions are full of bias. Pascual and Zapirain propose that artificial intelligence methods using machine learning can produce a considerable level of accuracy. A random forest classification algorithm is used to predict the results of the 2015 Rugby World Cup, and the performance of this model is compared with the aggregated results of Super Bru and OddsPortal. Machine learning-based systems have accuracy rates of 89.58% and 85.58% at the 95% confidence intervals (77.83, 95.47) and 85.42% confidence intervals (72.83, 92.75), respectively. These results indicate that for rugby, for a limited time in a particular game, there is insufficient evidence, at the significance level used, that human agents are superior to machine learning methods in accuracy when predicting game results. However, compared with these two platforms, the model is better able to estimate probabilities, as measured by the money won from betting on each round [1]. Machine learning (ML) is now common. Traditional software engineering can be applied to develop ML applications, but the specific issues of ML applications in terms of quality requirements must be considered. Nalmpantis and Vrakas presented a survey of software quality for ML applications, treating the quality of ML applications as an emerging discussion. Through this survey, they raised issues with ML applications and identified software engineering methods and software testing research areas that address these issues. They divided the survey targets into academic conferences, journals, and communities, covering 16 academic conferences on artificial intelligence and software engineering (78 papers) and 5 journals (22 papers). The results indicate that key areas such as deep learning, fault localization, and prediction will be studied through software engineering and testing [19]. Azmat et al. compared the application of machine learning algorithms to sensor data collected from aggregation processes. Several machine learning algorithms, such as the adaptive neural fuzzy inference system, neural network, and genetic algorithm, were implemented in MATLAB, and the accuracy and robustness of the modeling process (MSE) were compared. The results show that, compared with manually established models, machine learning-based methods can produce more accurate and robust process models while being more adaptable to new data. Azmat et al. also looked ahead to the potential advantages of machine learning algorithms and their application prospects in the process industry [20].

This article takes the film industry’s informatization and intelligent development and upgrading as the background, uses computer vision and machine learning technology as the basis to explore new methods and new models for realizing film visual expression, and proposes relevant considerations at the strategic level to promote the innovative development of film visual expression. It is of great significance to improve the audiovisual quality and creative standard of film works and promote the healthy and sustainable development of the film industry.

2. Proposed Method

2.1. Artificial Intelligence

Artificial intelligence refers to machines that simulate human intelligence in various environments. Like many emerging disciplines, artificial intelligence does not yet have a unified definition. “Giving a general definition is almost impossible, because intelligence seems to be a hybrid of many information processing and information expression skills.” The main reason is that different disciplines define it from their own perspectives. In addition, and most fundamentally, the nature or mechanism of intelligence (such as hearing, vision, and the expression of knowledge) is not yet clear. One artificial intelligence dictionary gives the definition “making computer systems simulate human intelligent activities and accomplish tasks that humans can only accomplish with intelligence, which is called artificial intelligence.” Strictly speaking, this cannot serve as a precise definition of artificial intelligence; in practice, the functional parameters of a machine have to be tested in an appropriate way. Scholars who focus on the development of artificial intelligence have made notable achievements, and artificial intelligence technologies have found applications in medicine, biology, geology, geophysics, aeronautics, chemistry, and electronics.

2.1.1. Weak Artificial Intelligence

Weak artificial intelligence (top-down AI) refers to the use of designed programs to simulate the logical thinking of animals and humans. Scholars who hold the weak artificial intelligence view believe that it is impossible to create intelligent machines that can really reason and solve problems. At present, many electronic products have a certain degree of intelligence: when the external input data change, the corresponding programs are run to produce different results, which allows these products to replace people in completing simple, repetitive tasks. Weak artificial intelligence can be seen everywhere, but its application is limited to imitating lower-level human behavior.

2.1.2. Strong Artificial Intelligence

Strong artificial intelligence (bottom-up AI) is a more advanced form of artificial intelligence. Proponents of strong artificial intelligence believe that it is possible to create intelligent machines that have real consciousness, thinking ability, and feelings and that can reason and solve problems. From this perspective, computers are not only a tool for studying consciousness; after corresponding programming, a computer itself has certain cognitive abilities and can be understood as conscious.

2.2. Artificial Intelligence Applications

No matter how the structure of the convolutional neural network changes, its basic process mainly includes four parts: image input, region feature extraction, neural convolution feature calculation, and regional object classification. The core is to organically integrate feature extraction and the classifier. Errors are back-propagated using gradient descent, and the parameters of the convolution templates and of the fully connected layers are continuously optimized, so that the finally learned features and classifier are close to optimal and the classification features are obtained.

This paper uses a deep recursive convolutional neural network algorithm to implement platform construction, detection, and verification:
(1) The first part is feature extraction by machine learning, i.e., simulating human recognition of category features (the more one sees, the more one knows). The accuracy of the category features depends on the data set used; public image training data sets such as MNIST, ImageNet, the PASCAL VOC training set, and COCO are used to obtain features of major categories, such as people, various animals, and various vehicles.
(2) The second part performs feature vector recognition on specific targets in the movie being analyzed (multiple target objects are possible; in this case, Jean Agen is used as an example).
(3) The deep recursive convolutional neural network (CAFFE) is applied to classify and identify categories, scenes, and dialogues in movies (or in scenes and shot fragments), and category features and similar inputs are fed into the classifier to meet the specific needs of matching and classifying movie objects.
(4) Scenes and characters in movies and videos are recognized and calibrated, and the semantics of movie scenes and shot content are synthesized.

A movie video contains a large amount of data: it is composed of multiple frames of static images per second, and each frame contains rich information. To achieve efficient and accurate identification and retrieval of movie content, the recursive convolutional neural network algorithm is used together with the strategy of improving object recognition accuracy by adjusting the loss function, so as to achieve fast and accurate target detection and positioning.

2.3. Classification of Machine Learning

The learning process is a complex knowledge activity closely related to the inference process. Machine learning can be classified according to learning strategies, descriptions of knowledge, or application areas. In general, the form of knowledge expression is determined by the algorithm chosen for the machine learning system itself, and learners with the same structure can be applied in different fields. Reasoning strategies better reflect the form and method of learning and the relationship between the learner and the adjustment of data and knowledge, so the main categories of machine learning are briefly introduced below according to the reasoning strategy.

(1) Rote (mechanical) learning

Rote learning does not require the computer to automatically infer or summarize the input data into knowledge. Instead, the input data are processed step by step according to the designed operating procedures. It mainly considers how to search and use established facts and knowledge.

(2) Supervised learning

Supervised learning provides the learner with some correct input and output examples (similar to standard answers). The data provider does not need to care about the functional relationship between these inputs and outputs. The learner learns the mapping between input and output from the examples, adds it to its own knowledge base, and keeps it compatible with existing knowledge. Supervised learning is a common learning method for training neural network and decision tree learners.

(3) Deductive learning

Here the reasoning form of the learner is deductive reasoning: meaningful conclusions are derived from more basic axioms through logical transformation. This learning process helps computers explore unknown knowledge and obtain more useful knowledge.

(4) Analogy learning

Analogy learning is similar to human analogical thinking. It assumes that the learner has mastered the knowledge of the source domain, while the target domain is an unknown knowledge domain. Analogy learning is the process of comparing the similarity of knowledge in the source domain and the target domain to obtain useful knowledge in the target domain. It is more complicated than the previous three learning methods, mainly because it involves the knowledge of two domains. To make the knowledge obtained in the target domain more useful and effective, it is necessary to learn as much of the knowledge in the source domain as possible. One powerful function of analogy learning is that a computer system that already runs successfully in one field can be quickly adapted to an application in a similar field, which reduces the cost of secondary development.

(5) Explanation-based learning

In explanation-based learning, the environment provides the learner with a goal concept, an example of that concept, and the associated criteria. The learner explains how the example satisfies the goal concept and, drawing on its existing knowledge system, generalizes this explanation into a sufficient condition for the goal concept. Explanation-based learning is now widely used to streamline knowledge bases and improve system performance.

(6) Inductive learning

Among the learning categories above, inductive learning involves the most complicated inference process. When the learner conducts inductive learning, the environment does not provide a general description of the concept to be learned but provides some examples or counterexamples that reflect the concept, from which the learner learns the general description of the concept.
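As an illustration of the supervised learning category described above, the following is a minimal sketch of a learner that derives an input-to-output mapping from labeled examples. It uses a nearest-centroid rule, which is an assumption chosen for brevity and is not a method named in this article; the feature meanings (age, weekly viewing hours) and the labels are hypothetical.

```python
from collections import defaultdict


def fit_centroids(examples):
    """Learn one mean vector (centroid) per label from (features, label) pairs."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Hypothetical labeled examples: [age, hours of animation watched per week] -> label
training = [
    ([12.0, 9.0], "likes"),
    ([15.0, 8.0], "likes"),
    ([40.0, 1.0], "dislikes"),
    ([52.0, 2.0], "dislikes"),
]
model = fit_centroids(training)
print(predict(model, [14.0, 7.0]))
print(predict(model, [45.0, 1.5]))
```

The provider of the examples never states the rule that links inputs to outputs; the learner recovers an approximate mapping from the examples alone, which is the defining property of this category.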

2.4. Machine Learning Technology and Its Application in New Visual Expression of Movies

Machine learning is a class of algorithms with which a machine learns rules from a large amount of historical data in order to identify new samples intelligently and to analyze and predict the future. Machine learning follows these basic steps: determine the training data set; use the training data set to train the model and construct the learner; use the validation data set to evaluate learner performance and select the model; and use the final model to make predictions on the test data and output the results. The ultimate goal of machine learning is for the trained model to adapt well to new samples, that is, to have strong generalization ability and to avoid overfitting and underfitting. Overfitting occurs when the learner fits the known data too closely: it predicts the known data well but the unknown data poorly, which reduces its generalization performance. Overfitting is the main obstacle in machine learning and must be mitigated. Underfitting is due to insufficient learning ability and is generally easier to overcome, for example, by increasing model capacity or training for longer.
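The basic steps listed above (training set, validation set for model selection, test set for the final prediction) can be sketched as follows. This is an illustrative example only: the k-nearest-neighbor learner, the 60/20/20 split, the candidate values of k, and the synthetic data are all assumptions and do not come from the experiments in this article.

```python
import random


def split_dataset(samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle and split samples into disjoint training, validation, and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x (one feature)."""
    nearest = sorted(train, key=lambda s: abs(s[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > len(nearest) else 0


def accuracy(train, data, k):
    """Fraction of samples in data that k-NN, fitted on train, labels correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in data)
    return hits / len(data)


# Hypothetical data: the label is 1 when the feature exceeds 50.
samples = [(float(x), 1 if x > 50 else 0) for x in range(100)]
train, val, test = split_dataset(samples)

# Model selection uses the validation set only; the test set is touched once at the end.
best_k = max([1, 3, 5, 7], key=lambda k: accuracy(train, val, k))
train_acc = accuracy(train, train, best_k)
test_acc = accuracy(train, test, best_k)
print(best_k, train_acc, test_acc)
```

A training accuracy that is much higher than the test accuracy is the usual symptom of overfitting, while low accuracy on both sets points to underfitting.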

Machine learning technology is widely used in computer graphics, movie virtual asset analysis, and 3D animation production. For example, the reuse of 3D motion data, a basic data-driven motion generation technique, has become a focus of attention in the production of 3D animation movies. Given a large amount of 3D motion data and expression data, machine learning techniques such as subspace analysis, statistical learning, principal component analysis, and manifold learning can be used to analyze and learn from existing 3D motion data and to guide the generation of new motion data. In addition, the automatic and intelligent generation of action animation is an important research topic in 3D animation production; it is implemented by building agents with autonomous decision-making functions that create action animations for virtual characters. Machine learning can be applied here as well, for example, to construct the action model of a character. Establishing the action and consciousness models of the virtual character enables the agent to learn quickly and robustly how to interact with different users and provides an interactive virtual environment and dynamic action planning for the virtual character. Moreover, 3D motion data and expression data are multimedia data, and machine learning technology is widely used in multimedia content analysis and image understanding, so it has important application value for processing 3D motion data and expression data.

In addition, semisupervised learning technology plays an important role in the field of virtual filming of movies. In semisupervised learning, the learner does not rely on external interaction but automatically uses unlabeled samples to improve learning performance. Semisupervised learning can be divided into pure semisupervised learning and transductive learning. The latter assumes that the unlabeled samples considered in the learning process are exactly the data to be predicted, and its learning objective is to obtain the best generalization performance on these unlabeled samples. In the collection and analysis of a movie’s virtual assets, unlabeled samples are often collected, and obtaining labels requires a lot of time and effort. Combining a small number of labeled samples with unlabeled samples reduces the labeling workload, allows more accurate data models to be established, and completes the learning task. The semisupervised learning data model is shown in Figure 1.
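One simple way to combine a few labeled samples with many unlabeled ones is self-training, sketched below. Self-training is used here purely as an illustration and is an assumption; the article does not state which semisupervised algorithm it relies on. The one-dimensional features, the two classes, and the `margin` threshold are likewise hypothetical.

```python
def centroid(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def self_train(labeled, unlabeled, margin=2.0, max_rounds=10):
    """Self-training on one-dimensional features with two classes (0 and 1).

    labeled:   list of (x, y) pairs with known labels
    unlabeled: list of x values without labels
    A point receives a pseudo-label only when it is at least `margin`
    closer to one class centroid than to the other.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        c0 = centroid([x for x, y in labeled if y == 0])
        c1 = centroid([x for x, y in labeled if y == 1])
        confident, remaining = [], []
        for x in pool:
            d0, d1 = abs(x - c0), abs(x - c1)
            if abs(d0 - d1) >= margin:
                confident.append((x, 0 if d0 < d1 else 1))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new was learned in this round
        labeled.extend(confident)
        pool = remaining
    return labeled, pool


# Two labeled samples and seven unlabeled ones (hypothetical values).
labeled = [(1.0, 0), (9.0, 1)]
unlabeled = [0.5, 2.0, 3.0, 5.0, 7.5, 8.0, 10.0]
final_labeled, still_unlabeled = self_train(labeled, unlabeled)
print(len(final_labeled), still_unlabeled)
```

In this sketch only two samples need manual labels; the ambiguous point halfway between the classes is left unlabeled rather than guessed, which limits the propagation of wrong pseudo-labels.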

The basic mode by which human beings come to understand the unknown world is perception→learning→cognition. Scientific research that builds on this mode aims to understand the nature and laws of human beings and the workings of the human mind, so that more universal mechanisms of human behavior, expressions, and emotions can be discovered.

3. Experiments

3.1. Experimental Algorithm Steps

This article uses a convolutional neural network algorithm to study the new vision of anime films. The core idea of the algorithm is to train iteratively: gradient descent is used to reduce the loss function, the adjustments are propagated back through the layers, and the weights and offsets are continually adjusted and optimized according to the error until the parameters of each layer fit the training data well. The steps of the convolutional neural network algorithm are as follows:
(1) Network initialization. Define the number of convolution layers and pooling layers, the number of convolution kernels in each convolution layer, the stride of the convolution operation, the calculation method of the loss function, the number of iterations, and the learning rate.
(2) Initialize the weights and offsets with random numbers.
(3) Randomly select input data and the corresponding output data from the training set as the training sample.
(4) Calculate the output of each layer of neurons from its input, layer by layer, until the output layer is reached.
(5) After the neural network produces its final result, use the difference between the expected output from the training set and the actual output to adjust the convolution kernels of each convolution layer in the backward direction, by calculating the partial derivatives of the error function with respect to the weights and offsets.
(6) Continually correct the weights and offsets through iterative learning.
(7) Use the error function to determine whether the requirements are met. When the error falls within the preset range or the number of training iterations exceeds the preset value, training ends. Otherwise, return to step (3) and continue with the next round of training. The detailed flowchart of network training is shown in Figure 2.
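The seven steps above can be illustrated with a deliberately small sketch: a single one-dimensional convolution kernel followed by mean pooling, trained by stochastic gradient descent on a squared-error loss. This is not the eleven-layer network used in the experiments; the network shape, the loss 0.5·(prediction − target)², the learning rate, the stopping tolerance, and the synthetic data generated from a hypothetical "true" kernel are all assumptions made for illustration.

```python
import random


def forward(x, kernel, bias):
    """Valid 1-D convolution with stride 1, followed by mean pooling."""
    k = len(kernel)
    feature_map = [
        sum(kernel[i] * x[j + i] for i in range(k)) + bias
        for j in range(len(x) - k + 1)
    ]
    return sum(feature_map) / len(feature_map)


def gradients(x, y, kernel, bias):
    """Partial derivatives of 0.5 * (prediction - y)**2 w.r.t. kernel and bias."""
    k = len(kernel)
    m = len(x) - k + 1
    error = forward(x, kernel, bias) - y
    grad_kernel = [error * sum(x[j + i] for j in range(m)) / m for i in range(k)]
    grad_bias = error
    return grad_kernel, grad_bias


def mean_loss(data, kernel, bias):
    """Average squared-error loss over all (input, target) pairs."""
    return sum(0.5 * (forward(x, kernel, bias) - y) ** 2 for x, y in data) / len(data)


def train(data, kernel_size=2, lr=0.1, max_iters=2000, tol=1e-4, seed=0):
    """Steps (1)-(7): initialize, sample, forward, backward, update, check."""
    rng = random.Random(seed)
    kernel = [rng.uniform(-0.1, 0.1) for _ in range(kernel_size)]  # step (2)
    bias = rng.uniform(-0.1, 0.1)
    initial = mean_loss(data, kernel, bias)
    for _ in range(max_iters):
        x, y = rng.choice(data)                                    # step (3)
        grad_kernel, grad_bias = gradients(x, y, kernel, bias)     # steps (4)-(5)
        kernel = [w - lr * g for w, g in zip(kernel, grad_kernel)] # step (6)
        bias -= lr * grad_bias
        if mean_loss(data, kernel, bias) < tol:                    # step (7)
            break
    return kernel, bias, initial, mean_loss(data, kernel, bias)


# Hypothetical training data: 5-attribute inputs, targets from an assumed true kernel.
data_rng = random.Random(1)
true_kernel, true_bias = [2.0, -1.0], 0.5
inputs = [[data_rng.random() for _ in range(5)] for _ in range(20)]
data = [(x, forward(x, true_kernel, true_bias)) for x in inputs]

kernel, bias, initial_loss, final_loss = train(data)
print(initial_loss, final_loss)
```

A full convolutional network adds pooling layers, nonlinear activations, and fully connected layers, but the loop keeps the same shape: forward pass, partial derivatives of the error, parameter update, and a stopping check.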

3.2. Selection of Experimental Data

The training data is the MovieLens 100K dataset from MovieLens, which contains 100,000 rating records of 1,682 movies by 943 users. The information in the data set includes user ID, user age, user occupation, user gender, user region, anime movie name, anime movie type, and anime movie rating. Feature vector data is shown in Figure 3.

Occupations must be represented by numbers before they can be used as input data of the convolutional neural network. The paper assigns random initial values to the 21 types of occupations. These values are then adjusted during the later error-adjustment stage of training, according to the attributes of each occupation and its share of the user’s interest value.
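A minimal sketch of the random initial assignment described above is given below. The function name `encode_occupations`, the value range [0, 1), and the sample occupation labels are assumptions for illustration; the later adjustment of these values during training is not shown.

```python
import random


def encode_occupations(occupations, seed=0):
    """Assign each distinct occupation a random initial value in [0, 1)."""
    rng = random.Random(seed)
    return {name: rng.random() for name in sorted(set(occupations))}


# Hypothetical occupation labels; the data set described in the text has 21 types.
occupations = ["student", "engineer", "artist", "student", "doctor"]
codes = encode_occupations(occupations)
encoded = [codes[name] for name in occupations]
print(encoded)
```

Fixing the seed keeps the mapping reproducible between runs, and identical occupations always receive the same number, so the encoded column can be fed to the network as an ordinary numeric attribute.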

3.3. Experimental Data Verification Method

3.3.1. k-Fold Cross-Validation

The main idea of the k-fold cross-validation method is to divide the data set into k similarly sized, mutually disjoint subsets, each time using the union of k-1 subsets as the training data set and the remaining subset as the test data set; in this way, k groups of training and test data sets are obtained. Since the result of cross-validation depends on the value of k, the method is called “k-fold cross-validation.”
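The splitting scheme described above can be sketched as follows, writing k for the number of folds. The helper name `k_fold_splits` is hypothetical, and the sketch splits the samples in their given order; shuffling beforehand is a common additional step that is assumed to be done by the caller if needed.

```python
def k_fold_splits(samples, k):
    """Yield (train, test) pairs; each of the k folds serves as the test set once."""
    if not 2 <= k <= len(samples):
        raise ValueError("k must be between 2 and the number of samples")
    # Fold sizes differ by at most one when len(samples) is not divisible by k.
    base, extra = divmod(len(samples), k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(samples[start:start + size])
        start += size
    for i in range(k):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, folds[i]


data = list(range(10))
for train, test in k_fold_splits(data, k=5):
    print(test, train)
```

Each sample appears in exactly one test fold, so averaging the k test scores uses every record for evaluation once and for training k-1 times.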

3.3.2. 2-Fold Cross-Validation Method

Based on k-fold cross-validation, 2-fold cross-validation splits the data into two small sets of similar size, denoted here as D0 and D1; the set D0 is first used as the training set, and D1 is used as the test set. After that experiment is performed, D1 is used as the training set, and D0 is used as the test set. The advantage of the 2-fold cross-validation method is that the test set and the training set are both large enough and that the test can be repeated according to the number of samples. The parameter is often taken in the 2-fold cross-validation method.

4. Discussion

4.1. Convolutional Neural Network Training Analysis

Since different hyperparameters have a large impact on the final result during the training of the neural network, multiple combinations are used to verify the actual effect of the algorithm. The hyperparameters include the learning rate, the number of convolutional layers, the number of convolution kernels, the number of pooling layers, the number of training iterations, and the stride. The convolutional neural network hyperparameter combinations and their corresponding accuracy rates are shown in Table 1 and Figure 4.

After repeated training and verification, it is found that, considering both training time and accuracy, the eleven-layer convolutional neural network achieves the best training result, so it is taken as the example for comparison. In this paper, the size of the first-layer convolution kernel is selected according to the size of the effective user information and movie information. Each record has 5 valid attributes, which determines the size of the first-layer convolution kernel. Some parameter settings of the eleven-layer convolutional neural network suitable for score prediction are shown in Table 2 and Figure 5.

In this experiment, an iteration does not mean repeated training on the same batch of data. Instead, all data are divided into 200 batches of 200 records each, and training on one batch counts as one iteration, so the overfitting caused by training too many times on the same batch does not arise. During testing, 300 records were used for each batch. The experimental results show that, for the training set, the accuracy continues to increase as the number of training iterations increases. After the parameters of the model were determined, the accuracy on the test set did not change much and stayed at around 57%.
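The batching scheme described above, in which every batch is seen once and each batch corresponds to one iteration, can be sketched as follows. The helper `iter_batches` and the stand-in record list are hypothetical; the batch size of 200 is taken from the text, and the model update itself is omitted.

```python
def iter_batches(records, batch_size):
    """Yield consecutive, non-overlapping batches; the last one may be shorter."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


records = list(range(1000))  # hypothetical stand-in for rating records
iterations = 0
seen = set()
for batch in iter_batches(records, batch_size=200):
    iterations += 1          # one training iteration per batch
    seen.update(batch)       # a real loop would update the model on this batch
print(iterations, len(seen))
```

Because the batches do not overlap, no record is trained on twice within a pass, which is the property the text relies on when it argues that repeated exposure to the same batch is avoided.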

4.1.1. Comparative Experimental Analysis of Different Algorithms

In order to test the actual effect of the algorithm in this paper, it needs to be compared with existing algorithms. The higher the accuracy of the score prediction, the better the performance of the score prediction algorithm. For the different comparison algorithms, the parameter values are as follows:
(1) Support Vector Machine. The regularization parameter is 100, the kernel function is the Gaussian radial basis kernel function, and Gamma is 0.003.
(2) Decision Tree. Criterion is set to “entropy,” and Splitter is set to “best.”
(3) Naive Bayes Classifier. The data dimension is 5, and the feature vector dimension is 5.
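The comparison criterion used here, prediction accuracy, can be sketched as the fraction of ratings that a model predicts exactly. The rating values and the two model names below are hypothetical placeholders and are not results from this article.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted rating equals the actual rating."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one rating is required")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical held-out ratings and the predictions of two hypothetical models.
actual = [5, 3, 4, 4, 2, 5, 1, 3]
predictions = {
    "model_a": [5, 3, 4, 3, 2, 5, 1, 2],
    "model_b": [4, 3, 4, 3, 2, 4, 1, 2],
}
scores = {name: accuracy(pred, actual) for name, pred in predictions.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Evaluating every algorithm with the same function on the same held-out ratings keeps the comparison fair, since differences in the score then reflect the models rather than the metric.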

The accuracy comparison results of the eleven-layer convolutional neural network and three machine learning methods after three experiments are shown in Table 3 and Figure 6.

As shown in Table 3 and Figure 6, the convolutional-neural-network-based algorithm for studying the new visual expression of animation films outperforms the other machine learning algorithms in the comparison, and the eleven-layer convolutional neural network proposed in this paper has a clear advantage in accuracy.

4.2. New Visual Expression of Artificial Intelligence in Anime Films

At present, artificial intelligence technology is mainly based on the convolutional neural network (CNN), and many excellent algorithms have been developed for applications in different fields, such as the famous SIFT features, AlexNet, RCNN, GoogLeNet, Faster RCNN, YOLO, and SSD. No matter how the network structure changes, its basic process stays the same: it mainly includes image input, region feature extraction, region feature classification, local feature extraction, and classifier implementation. Through stochastic gradient descent, the parameters of the convolution templates and of the fully connected layers are optimized so that the learned features and classifier are close to optimal. This paper uses a deep recursive convolutional neural network algorithm to implement platform construction, detection, and verification, as shown in Figure 7.

A movie itself is composed of many static frames and therefore contains a large amount of data. Building on the deep recursive convolutional neural network algorithm, adjusting the loss function provides a method and strategy for improving object recognition accuracy, so as to achieve fast and accurate target detection and localization.

5. Conclusions

So far, the film industry has accumulated large-scale data that are massive, multisource, and heterogeneous in structure, and data-intensive applications have emerged in the industry. In this article, artificial intelligence is the core technology: film content and its characteristics are analyzed, and individual content elements are extracted. Through in-depth study of the content of moving images and analysis of the occurrence and formation characteristics of a large number of similar effects, the article looks for ways to improve film effects, in order to raise the level of animated films and promote innovation in their creation and marketing.

After reviewing and summarizing the current knowledge in the recommendation field, this article analyzes the personalized recommendation algorithms used in recommendation systems in various fields. It proposes applying convolutional neural networks from deep learning to the study of the new visual expression of anime films and proposes a corresponding algorithm. After further study of the principles of the new vision of animation film, a research model for the new visual expression of animation film is established according to the research process, and the proposed algorithm is applied in this model.

Owing to limitations of time and conditions, this article still has some shortcomings that remain to be improved, including the cold start of the algorithm, data setting and initialization, data structure, and the balance between accuracy and recall. With the continuous innovation of artificial intelligence technology, its analysis methods and means will be more widely used in movie creation, production, review, appreciation, marketing recommendation, and other stages.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.