Abstract

Severe weather conditions have a great impact on urban traffic. Automatic recognition of weather conditions has important application value in traffic condition warning, automobile auxiliary driving, intelligent transportation systems, and other areas. With the rapid development of deep learning, deep convolutional neural networks (CNNs) are used to recognize weather conditions on traffic roads. In this paper, a new simplified model named ResNet15 is proposed based on the residual network ResNet50. The convolutional layers of ResNet15 extract weather characteristics, and the characteristics extracted at the previous layer are passed to the next layer through the shortcut connections of four groups of residual modules. Finally, the weather images are classified through the fully connected layer and Softmax classifier. In addition, we build a medium-scale dataset of weather images on traffic roads, called “WeatherDataset-4,” which consists of 4 categories and contains 4,983 weather images covering most severe weather conditions. ResNet15 is trained and tested on “WeatherDataset-4,” and desirable recognition results are obtained. Extensive experiments demonstrate that the proposed ResNet15 is superior to traditional network models such as ResNet50 in recognition accuracy, recognition speed, and model size.

1. Introduction

In modern traffic, severe weather conditions have a great impact on urban traffic. Severe weather conditions such as rain, snow, and fog reduce visibility and the friction coefficient of the road surface, which may cause traffic jams and serious traffic accidents. Through real-time recognition of weather conditions and comprehensive use of traffic information, the timing parameters of traffic signals can be adjusted in real time according to weather conditions, for example by appropriately extending the green and yellow light times under severe weather conditions. This can help avoid serious traffic accidents and improve driving efficiency. Some automobile auxiliary systems can improve driving safety through weather identification, such as setting speed limits in bad weather conditions, prompting drivers to keep their distance, and automatically turning on wipers in rainy weather [1]. In summary, the automatic recognition of weather conditions has important application value in traffic condition warning, automobile auxiliary driving, and intelligent transportation systems [2–4].

Traditional methods of weather recognition were mainly based on multiple sensors [5]. However, the installation and maintenance of sensors cost a lot of manpower and material resources. In addition, the complexity of the external environment affects the recognition accuracy of the sensors, and it is difficult to track changes of weather conditions in space and time. In recent years, with the development of intelligent transportation systems, various monitoring devices have been installed on the road [6]. As a result, methods of weather recognition based on image processing and machine vision have gradually developed [7]. A method based on decision trees and support vector machines (SVM) was proposed to classify weather images by extracting features such as power spectrum slope, contrast, noise, and saturation [8, 9]. The method was tested on a small-scale dataset named Wild Dataset containing hundreds of images, and the recognition error rates for foggy and rainy days were 15% and 25%, respectively, which cannot meet actual requirements [10]. In addition, the images were taken from a fixed position and there were too few of them, which led to a lack of generalization and authority. In summary, most of these machine learning methods require complex feature engineering and the manual extraction of various weather features. The methods are too complex and have weak universality and generalization ability. At present, with the rapid development of deep learning, AlexNet, VGGNet, GoogLeNet, ResNet, and other classical convolutional neural networks (CNNs) have shown impressive performance in various machine vision tasks including image classification, object detection, and semantic segmentation [11–16]. As CNNs can extract rich, abstract, and deep semantic information from weather images, they are largely superior to traditional methods of weather recognition. 
A pretrained AlexNet network was fine-tuned on a two-class weather dataset, and the recognition accuracy reached 82.2% [17, 18]. Then, hand-crafted features and CNN features were combined to improve the classification performance, and the recognition accuracy reached 91.4% [18]. A pretrained GoogLeNet network was fine-tuned on a large-scale extreme weather dataset collected by its authors, yielding a more accurate weather recognition model with a recognition accuracy of up to 94.5% [19]. Then, a new double-fine-tuning strategy was proposed to train the GoogLeNet model and optimize the original GoogLeNet network [20]. The size of the resulting model was only one-third of that of the original model, and the recognition accuracy was improved to 95.46%. These deep learning methods of weather recognition are generally superior to traditional methods, but they require large-scale datasets as support and can only be trained efficiently on high-end GPUs, which makes weather recognition very expensive. Therefore, it is currently difficult to apply these methods widely to terminal equipment in the field of traffic.

Considering the limitations of the above methods, a novel method is proposed for weather recognition on traffic roads with deep convolutional neural networks. The contributions of this paper are as follows. First, deep convolutional neural networks (CNNs) are used to recognize weather conditions on traffic roads. Second, we establish a medium-scale dataset of weather images on traffic roads, named “WeatherDataset-4,” which covers most severe weather conditions. Last but not least, a new simplified model named ResNet15, based on ResNet50, is proposed for the task of weather recognition on traffic roads. Weighing network depth against computational complexity, we make a compromise: the number of layers in the network model is reduced from 50 to 15, and the number of convolution kernels in each convolutional layer is reduced accordingly. After these simplifications, the model size of ResNet15 is only one-eighth of that of ResNet50, and the model can work efficiently even on a common CPU. ResNet15 is trained and tested on the dataset, and desirable recognition results are obtained. The rest of this paper is structured as follows. Section 2 introduces the construction of “WeatherDataset-4.” Section 3 describes the architecture of ResNet15 for weather recognition. Section 4 presents the experimental results of ResNet15 on “WeatherDataset-4” and compares them with other methods. Finally, the conclusions are summarized in Section 5.

2. Dataset

Data is the key point of deep learning algorithms and it is no exception in the task of weather recognition in this paper. The recognition performance of the deep neural network is highly dependent on the size of datasets [21]. AlexNet achieved the best results in the ILSVRC-2010 and ILSVRC-2012 competitions. This model can classify 1000 different classes so precisely, which not only benefits from the advantages of network structure but also benefits from the large-scale dataset, including 1.2 million training images [11]. The above small-scale dataset named Wild Dataset contains only a few hundred weather images and the recognition results are not satisfactory. Due to the lack of public weather datasets, a medium-scale dataset of weather images on traffic road named “WeatherDataset-4” is established for the task of weather recognition in this paper. The details of this dataset are shown in Table 1. This dataset contains a total of 4,983 weather images covering most of the severe weather, which are divided into four categories: foggy days, rainy days, snowy days, and sunny days. As cloudy and sunny days have roughly the same impact on traffic road, they are uniformly merged into sunny days. Weather images of each category are divided into the train set and the test set, with 4,000 images for training and 983 ones for testing. Most of these images are collected from the Internet and selected according to the specific requirements. The image modes include aerial photography, camera, news, traffic accidents, and automobile data recorder. This dataset is collected to identify the weather conditions on traffic road, so most of the images contain complex road scenes such as city streets, highways, and country roads. 
Figure 1 shows four samples of weather images from “WeatherDataset-4.” Because these images are taken from multiple angles and contain a variety of complex scenes and the number of weather images of each category is relatively large, this dataset has certain generalization and universality.

The methods of deep learning usually need large-scale datasets as support to avoid overfitting in the process of network learning. In view of this situation, a series of data augmentation methods for the training images in “WeatherDataset-4” have been adopted in this paper. Firstly, the image block with size of 224 × 224 is randomly intercepted from the original image with size of 256 × 256. Then the image block is rotated, flipped, translated, cut, enlarged, and so forth. Finally, a series of different images are generated, as shown in Figure 2. Specifically, ImageDataGenerator function offered by the Keras API is used to generate weather images after stochastic transformation in each training epoch by setting parameters such as rotation range, width shift, and shear range [22]. After the adoption of this method, the model will not receive any two identical images, which is beneficial to the inhibition of overfitting and improves the generalization and robustness of the model. In order to verify the effectiveness of data augmentation for the task of weather recognition, the contrast experiments with or without data augmentation are described in Section 4.2.
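The random cropping step described above can be illustrated with a minimal NumPy sketch. It is not the Keras ImageDataGenerator pipeline used in this paper; the function name `random_crop_and_flip`, the fixed seed, and the random test image are assumptions made only for illustration, and only two of the transformations (cropping and horizontal flipping) are shown.

```python
import numpy as np

def random_crop_and_flip(image, crop_size=224, rng=None):
    """Return a random crop_size x crop_size patch, mirrored half of the time."""
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape[:2]
    if height < crop_size or width < crop_size:
        raise ValueError("image is smaller than the crop size")
    # Choose the top-left corner so that the patch stays inside the image.
    top = int(rng.integers(0, height - crop_size + 1))
    left = int(rng.integers(0, width - crop_size + 1))
    patch = image[top:top + crop_size, left:left + crop_size]
    # Horizontal flip: reverse the width axis with probability 0.5.
    if rng.random() < 0.5:
        patch = patch[:, ::-1]
    return patch

# Hypothetical 256 x 256 RGB image standing in for a weather photograph.
rng = np.random.default_rng(seed=0)
image = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
patches = [random_crop_and_flip(image, rng=rng) for _ in range(4)]
```

With a 256 × 256 source and a 224 × 224 patch there are 33 × 33 possible crop positions, so cropping and flipping alone already yield many distinct views of each training image before rotation, shifting, or shearing is applied.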

3. Method

3.1. Architecture of ResNet15

The residual network structure achieved great performance for image classification and object recognition in the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Residual modules are used in ResNet50 to solve the vanishing gradient problem caused by an increase in network depth [14]. The shortcut connections add no extra parameters, yet they allow convergence to be accelerated and accuracy to be improved as network depth increases. Due to the excellent performance of the residual network structure, a new simplified model, ResNet15, is proposed on the basis of ResNet50 in this paper. The architectures of ResNet50 and ResNet15 are shown in Table 2. ResNet50 begins with a 7 × 7 convolutional layer and a 3 × 3 MaxPooling layer, followed by a series of residual modules. There are two basic types of residual module. One is the Identity Block (IB) shown in Figure 3(a), whose input and output have the same dimensions, so Identity Blocks can be connected in series. The other is the Conv Block (CB) shown in Figure 3(b), whose input and output have different dimensions, so Conv Blocks cannot be connected in series; its function is to change the dimensions of the feature vector with a 1 × 1 convolutional layer. Finally, the weather images are classified through the AveragePooling layer and Softmax classifier.
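The difference between the two residual module types can be sketched in a few lines of NumPy. This is a deliberately simplified illustration, not the ResNet50 or ResNet15 implementation: the main path uses only two 1 × 1 convolutions (real bottleneck blocks also contain a 3 × 3 convolution and batch normalization), and all function names, shapes, and weights are assumptions.

```python
import numpy as np

def conv1x1(x, weights):
    """1 x 1 convolution: mix channels at every pixel.

    x has shape (H, W, C_in) and weights has shape (C_in, C_out).
    """
    return x @ weights

def relu(x):
    return np.maximum(x, 0.0)

def identity_block(x, w_a, w_b):
    """Identity Block: output = ReLU(F(x) + x), channel count unchanged."""
    residual = conv1x1(relu(conv1x1(x, w_a)), w_b)
    return relu(residual + x)

def conv_block(x, w_a, w_b, w_shortcut):
    """Conv Block: output = ReLU(F(x) + W_s x).

    The 1 x 1 convolution on the shortcut changes the channel count of x
    so that it can be added to F(x).
    """
    residual = conv1x1(relu(conv1x1(x, w_a)), w_b)
    return relu(residual + conv1x1(x, w_shortcut))

# Hypothetical feature map: 8 x 8 pixels, 16 channels.
rng = np.random.default_rng(seed=0)
x = rng.standard_normal((8, 8, 16))
same = identity_block(x, rng.standard_normal((16, 4)), rng.standard_normal((4, 16)))
wider = conv_block(x, rng.standard_normal((16, 4)), rng.standard_normal((4, 32)),
                   rng.standard_normal((16, 32)))
```

In the Identity Block the input is added to the main-path output unchanged, which is why its input and output must have the same dimensions; in the Conv Block the shortcut is projected first, which is what allows the dimensions to change.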

ResNet15 is simplified and improved on the basis of ResNet50. Firstly, the 7 × 7 convolutional layer and the 3 × 3 MaxPooling layer of ResNet50 are retained. Secondly, the first-level Conv Block in each of the four groups of residual modules is retained, the stride of the residual module of the first group is changed from 1 to 2, and all Identity Blocks are deleted. Then, the AveragePooling layer is replaced by a fully connected layer with 512 dimensions, and a dropout layer is added after the fully connected layer. Finally, the Softmax classifier is kept unchanged. The architecture of ResNet15 is shown in Figure 4. The dotted box in the figure shows the four groups of residual modules. Since the number of parameters of the network model should be proportional to the size of the dataset, the number of convolutional kernels is adjusted in this paper to reduce the number of parameters of the network model from 45.7 M to 5.4 M, so that the size of the model is only one-eighth of that of ResNet50. These changes are based on the best results of a large number of experiments, which are described in Section 3.2.

In this paper, the convolutional layers of ResNet15 network model are utilized to extract weather characteristics of the sky, road, background, and so forth in the weather images layer by layer. Figure 5 shows the visual feature maps extracted from the convolutional layers of each group. The deeper the network level is, the fewer the pixel number of feature maps is, indicating more abstract weather characteristics. Then the weather characteristics extracted at the previous layer are shortcut to the next layer through four groups of residual modules to prevent the loss and damage of important characteristics in the process of transmission through the deep convolutional layers. Finally, the weather images are classified and recognized through the fully connected layer and Softmax classifier.

3.2. Setup of Hyperparameters

The residual network model can accelerate the convergence speed and improve the accuracy by increasing the depth of the network without adding additional parameters. However, with the increase of the network depth, the gradient will disappear and the important weather characteristics extracted by the convolutional layers will be lost and destroyed during the transmission. Even if the introduction of residual modules can solve most of these problems, it does not mean that the deeper the better, and the computational complexity increases greatly as the depth increases. Therefore, it is necessary to find a balance between model performance and computational complexity. For this reason, we stack 2, 3, 4, and 5 residual modules, respectively, to constitute ResNet9, ResNet12, ResNet15, and ResNet18 for comparative experiment. The comparison results are shown in Figure 6(a). As the depth increases, the recognition accuracy of ResNet9, ResNet12, and ResNet15 network models increases successively, but the recognition accuracy of ResNet18 is slightly lower than that of ResNet15. Considering the influence of weather recognition performance and computational complexity overall, the ResNet15 network structure composed of four groups of residual modules is adopted in this paper finally.
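The names ResNet9, ResNet12, ResNet15, and ResNet18 follow a simple layer count, which can be checked with a short sketch. The counting convention below is an assumption that happens to reproduce the four names (one initial convolutional layer, three convolutional layers in the main path of each residual module, plus the fully connected layer and the Softmax output layer); it is not stated explicitly in the text.

```python
def resnet_depth(num_residual_modules, convs_per_module=3):
    """Count weighted layers under the assumed convention described above."""
    stem_conv = 1      # the initial 7 x 7 convolutional layer
    head_layers = 2    # fully connected layer + Softmax output layer
    return stem_conv + convs_per_module * num_residual_modules + head_layers

# Depths for the four candidate networks compared in Figure 6(a).
depths = {n: resnet_depth(n) for n in (2, 3, 4, 5)}
```

Under this convention each additional residual module adds three layers, so four groups of residual modules give the 15 layers of ResNet15.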

In ResNet50, a MaxPooling layer follows the first convolutional layer and an AveragePooling layer precedes the Softmax classifier. A pooling layer extracts the significant features once more: the MaxPooling layer focuses on edge and texture features, while the AveragePooling layer focuses on background features. Pooling not only reduces the amount of data to process but also retains important features, achieving feature dimensionality reduction, data compression, and overfitting inhibition. We try various combinations of MaxPooling and AveragePooling layers at these two positions in ResNet15, and also replace the pooling layer before the Softmax classifier with a fully connected layer, in a series of comparative experiments. As shown in Figure 6(b), the combination of a MaxPooling layer after the first convolutional layer and a 512-dimensional fully connected layer before the Softmax classifier achieves the highest recognition accuracy, 96.03%. In addition, we add a dropout layer between the fully connected layer and the Softmax classifier. During training, a certain proportion of the outputs of the fully connected layer is randomly discarded to inhibit overfitting. The model is trained and compared with dropout rates from 0.2 to 0.8. As shown in Figure 6(c), the recognition accuracy of the model is highest when the dropout rate is set to 0.2.
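The effect of the dropout layer on the 512-dimensional fully connected output can be sketched as follows. This is a generic "inverted dropout" illustration in NumPy, the scheme commonly used by deep learning libraries, rather than code from this paper; the function name, the seed, and the all-ones input are assumptions.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Zero a fraction `rate` of the activations and rescale the survivors."""
    if not training or rate == 0.0:
        # At test time the layer passes its input through unchanged.
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    # Dividing by keep_prob keeps the expected activation unchanged.
    return activations * mask / keep_prob

rng = np.random.default_rng(seed=0)
x = np.ones(512)                 # stand-in for the fully connected output
y = dropout(x, rate=0.2, rng=rng)
```

With a rate of 0.2, roughly one in five of the 512 values is set to zero in each training step and the remaining values are scaled by 1/0.8, so the classifier cannot rely on any single feature being present.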

3.3. Details of Learning the ResNet15

The ResNet15 network model proposed in this paper is built with the Keras library in Python. The 4,000 training images in “WeatherDataset-4” are traversed by the ImageDataGenerator function, and stochastic data augmentation transformations are applied to each image. The training batch size is set to 40, and the number of epochs is set to 100. The Stochastic Gradient Descent (SGD) optimizer is used to update the weights of ResNet15. The momentum is set to 0.9 and the weight decay is set to 0.0002. In addition, the learning rate is initialized to 0.001 and reduced by 10% after every 4,000 iterations. After training, the 983 test images are tested. The experiments are run on an NVIDIA GeForce GTX 1080 Ti GPU, and the recognition time is also tested on the CPU. The comparison experiments are shown in Section 4.3.
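The learning rate schedule can be written out as a short sketch. It assumes that "reduced by 10%" means multiplying the current rate by 0.9 at each step, and that one epoch is one pass over the 4,000 training images at batch size 40; both readings, and the function name, are assumptions rather than details confirmed by the text.

```python
def learning_rate(iteration, base_lr=0.001, decay=0.9, step=4000):
    """Step decay: multiply the rate by `decay` after every `step` iterations."""
    return base_lr * decay ** (iteration // step)

# Iterations implied by the stated batch size and epoch count.
iterations_per_epoch = 4000 // 40               # 100 iterations per epoch
total_iterations = iterations_per_epoch * 100   # 10,000 iterations in total

schedule = [learning_rate(i) for i in (0, 3999, 4000, 8000, total_iterations - 1)]
```

Under these assumptions the rate is lowered only twice during training, at iterations 4,000 and 8,000, ending at 0.001 × 0.9² = 0.00081.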

4. Experimental Results and Analysis

4.1. Recognition Performance

In order to verify the validity of the method in this paper, ResNet15 is built with the Keras library in Python. First, ResNet15 is trained on the training set of “WeatherDataset-4” and the model is saved after training. Then the test images are fed into the trained model for weather recognition and the classification results are output (one of the four weather conditions: foggy, rainy, snowy, and sunny). Finally, according to the classification results of the 983 test images, the recognition accuracy is calculated and the confusion matrix of weather recognition is drawn, as shown in Figure 7(f). The values on the diagonal of the confusion matrix represent the recognition accuracy of each category. The recognition accuracy is 96.38% for foggy days, 97.25% for rainy days, 94.65% for snowy days, and 95.08% for sunny days. The average accuracy of weather recognition is 96.03%. Therefore, ResNet15 achieves great performance for the task of weather recognition on traffic roads.
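How the diagonal of a confusion matrix yields per-category and overall accuracy can be shown with a small sketch. The counts below are hypothetical and are not the values of Figure 7(f); rows are taken to be true categories and columns predicted categories, which is also an assumption.

```python
LABELS = ["foggy", "rainy", "snowy", "sunny"]

# Hypothetical counts: confusion[i][j] = images of class i predicted as class j.
confusion = [
    [48, 1, 1, 0],
    [1, 47, 1, 1],
    [2, 1, 45, 2],
    [0, 1, 1, 98],
]

def per_class_accuracy(matrix):
    """Diagonal entry divided by the number of images of that true class."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

def overall_accuracy(matrix):
    """Correctly classified images divided by all images."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

class_acc = dict(zip(LABELS, per_class_accuracy(confusion)))
overall = overall_accuracy(confusion)
unweighted_mean = sum(class_acc.values()) / len(class_acc)
```

The overall accuracy and the unweighted mean of the per-category accuracies coincide only when every category has the same number of test images. The unweighted mean of the four per-category values reported above is 95.84%, so the reported 96.03% is presumably the overall figure.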

In this paper, abstract deep-level characteristics of the sky, road, and background in weather images are extracted through the convolutional layers of ResNet15, and then the weather images are classified and recognized by the fully connected layer and Softmax classifier. Examples that can be correctly identified are shown in Figure 8(a). Those obvious weather characteristics such as dense fog, smooth ground, white snow, and blue sky make weather images be effectively identified as foggy days, rainy days, snowy days, and sunny days, respectively. Among them, fog will reduce visibility, and rain and snow will make roads smoother, which are not conducive to traffic and may cause traffic accidents and traffic jams. If these severe weather conditions can be recognized in real time, serious traffic accidents can be effectively avoided and driving efficiency can be improved.

However, most of the background information in the weather images is relatively complex, and not all weather images can be correctly recognized. The first reason is uncertainty, as there is no clear boundary between the various weather categories. The second is incompleteness: some weather images contain more than one weather element, so recognizing them is closer to a multilabel classification task than to a simple single-label one [23, 24]. Therefore, some weather images could not be identified correctly, as shown in Figure 8(b). In some images, the hazy sky and background characteristics belong to foggy days, but the image was misrecognized as a snowy day because of snow on the road. In others, despite snow on the road, the blue sky caused the image to be misrecognized as a sunny day. Sometimes the image was misrecognized as a foggy day due to complex background characteristics such as splashing water, thick cloud, and gray sky. Because weather recognition is complex, many weather images are difficult to recognize accurately even manually, and the labels of the dataset may contain errors. Moreover, the cleanliness of a dataset is also a very important property for image classification, no less important than its size [21]. Therefore, screening the dataset for label errors and removing weather images with ambiguous labels is also an important measure for improving the performance of weather recognition.

4.2. Effects of Data Augmentation and Residual Models

In order to validate the effectiveness of data augmentation for the task of weather recognition, the training images in “WeatherDataset-4” are fed into the ResNet50 and ResNet15 network models with and without data augmentation, and the test images are used for validation. The accuracy curves of the two methods on the validation set are shown in Figure 9(a), in which the dotted lines represent the curves without data augmentation. The experimental results are shown in Table 3. Without data augmentation, the recognition accuracy of ResNet50 is 75.38% and that of ResNet15 is 80.57%. With data augmentation, the recognition accuracy of ResNet50 reaches 85.76%, an increase of about 10 percentage points, and that of ResNet15 reaches 96.03%, an increase of about 15 percentage points. Moreover, the curves show that, without data augmentation, overfitting began to appear in ResNet50 and ResNet15 after about 30 training epochs, and the accuracy on the validation set no longer increased. Data augmentation significantly inhibits this overfitting, greatly improves the recognition accuracy of weather images, and enhances the generalization and robustness of the models.

In order to prevent the loss and damage of deep-level weather characteristics extracted by the convolutional layers during the transmission, four groups of residual modules are introduced into the ResNet15. The weather characteristics extracted at the previous layer are shortcut to the next layer through the residual modules, speeding up the convergence rate of the network model, improving the recognition accuracy, and solving the vanishing gradient problem caused by the increase in network depth. In order to validate the effectiveness of the residual modules for the task of weather recognition, four groups of residual modules in the dotted box shown in Figure 4 are removed, thus forming a 15-layer convolutional neural network without residual modules (CNN15) as a comparative experiment. The accuracy curve of the validation set is shown in Figure 9(b). The recognition accuracy of CNN15 is only 79.35%, while the recognition accuracy of ResNet15 with the addition of four groups of residual modules is improved by about 16%, and the vanishing gradient problem is solved, which greatly improves the performance of weather recognition.

4.3. Comparison with Other Methods

With the development of big data and the increase of computing power on CPU and GPU, deep learning has made a very outstanding breakthrough in computer vision. Over the years, classical convolutional neural networks such as AlexNet, VGGNet, GoogLeNet, and ResNet have achieved the best results in the ILSVRC competitions every year. Many advanced network structures are proposed in these four classic networks, which greatly promoted the development of deep learning. Among these classic networks, the improvement of performance is almost accompanied by the deepening of convolutional neural network. However, it is not certain in all aspects due to the limitations of available datasets for weather recognition in scale and quantity. In order to evaluate the weather recognition performance of ResNet15, the recognition performance of ResNet15 on “WeatherDataset-4” is compared with that of other deep learning methods such as AlexNet, VGG16, GoogLeNet, ResNet50, and ResNet34. Figure 9(c) shows the accuracy curves of various methods, in which different colored curves represent different methods. As can be seen from the curves, ResNet15 proposed in this paper is superior to other methods in recognition accuracy, while the gap between other classical networks is not obvious. The accuracy of four categories of weather condition and average recognition accuracy are shown in Table 4. According to the average recognition accuracy of various methods, the histogram is drawn as shown in Figure 10(a), in which ResNet15 proposed in this paper has the highest recognition accuracy, followed by ResNet34, GoogLeNet, AlexNet, ResNet50, and VGG16 successively. Figure 7 shows the confusion matrices of different network models. This also validates that the size of network model should be proportional to the size of dataset, and a network model with too deep levels cannot get better performance because of the limitation of the dataset, which makes the network overfit. 
Obviously, “WeatherDataset-4” is not very suitable for those complex networks, while the ResNet15 network model with relatively simple structure is more suitable for this dataset, so it has better recognition performance.

Weather recognition is time-sensitive: the ResNet15-based method proposed in this paper is intended for the real-time control of traffic signal parameters, automobile auxiliary driving, and other fields, so the training time of the network model and the recognition time for weather images are also very important parameters. Moreover, given the high computational cost of GPUs at present, the training time and test time of the various deep learning methods are measured on both the CPU and the GPU. However, VGG16, GoogLeNet, ResNet50, and ResNet34 are so complex and have so many parameters that training on the CPU is very slow, so we train them for five epochs only and then estimate the time of 100 training epochs from the mean epoch time. Finally, we calculate the FPS of image recognition on both the CPU and GPU according to the recognition time of the 983 test images. FPS indicates the number of weather images that can be recognized per second. The experimental results are shown in Table 5, and the histogram drawn according to the FPS of the various methods on the GPU is shown in Figure 10(b). As indicated by the experimental results, the model size of ResNet15 is only one-eighth of that of ResNet50, and the training and test times of ResNet15 are the shortest among these methods on both the GPU and CPU. In addition, the FPS of image recognition reaches 4.4 on the CPU and 163.3 on the GPU. This means that our network model can recognize more than four weather images per second on the CPU and can be trained and tested even on an ordinary, inexpensive CPU. Therefore, this model is superior to the other methods in recognition speed and model size and can be widely applied to terminal equipment in the field of traffic, which gives it practical value.
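The two timing calculations described above can be sketched as follows. The function names and the per-epoch timings are hypothetical and are not taken from Table 5; only the formulas (images divided by total recognition time, and mean epoch time multiplied by the target number of epochs) follow the text.

```python
def frames_per_second(num_images, total_seconds):
    """FPS = number of recognized images / total recognition time."""
    return num_images / total_seconds

def estimate_training_time(measured_epoch_seconds, target_epochs=100):
    """Extrapolate total training time from the mean of a few measured epochs."""
    mean_epoch = sum(measured_epoch_seconds) / len(measured_epoch_seconds)
    return mean_epoch * target_epochs

# Hypothetical timings (in seconds) for five measured training epochs.
measured = [10.0, 12.0, 11.0, 9.0, 8.0]
estimated_total = estimate_training_time(measured)

# A total of about 223.4 s for the 983 test images corresponds to about 4.4 FPS.
cpu_fps = frames_per_second(983, 223.4)
```

The extrapolation assumes that epoch time stays roughly constant over training, which is reasonable when the batch size and the augmentation pipeline do not change between epochs.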

5. Conclusions

In this paper, for the task of weather recognition on traffic roads, a medium-scale dataset of weather images named “WeatherDataset-4,” covering most severe weather conditions, is established, and a new simplified model, ResNet15, is proposed based on ResNet50. The number of layers in the network model is reduced from 50 to 15, and the model size of ResNet15 is only one-eighth of that of ResNet50. In addition, the simplified model can work efficiently even on an ordinary, inexpensive CPU. Extensive experiments demonstrate that the ResNet15 model achieves desirable results on “WeatherDataset-4.” The average accuracy of weather recognition reaches 96.03%, while the FPS of image recognition reaches 4.4 on the CPU and 163.3 on the GPU, so this method is superior to traditional network models such as ResNet50 in recognition accuracy, recognition speed, and model size. Therefore, the proposed ResNet15 can largely meet the requirements of practical application and can be widely applied to terminal equipment in the field of traffic. At present, weather conditions are only roughly divided into four categories in our dataset. In future work, severe weather conditions should be subdivided more specifically, for example into light rain, moderate rain, and heavy rain. Moreover, this network model can only be used to recognize weather conditions on traffic roads in the daytime. We hope that, through subsequent research and improvement, this method can also recognize weather conditions at night.

Data Availability

The weather data used to support the findings of this study have been deposited on GitHub and are available from the corresponding author upon request. Examples from our dataset “WeatherDataset-4” are available at https://github.com/DaweiXuan/WeatherDataset-4.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was carried out with the support of the National Natural Science Foundation of China (41505017).