Vehicle classification using a real-time convolutional structure based on DWT pooling layer and SE blocks
Expert Systems with Applications (IF 7.5), Pub Date: 2021-06-15, DOI: 10.1016/j.eswa.2021.115420
Hossein Gholamalinejad, Hossein Khosravi

Real-time vehicle type classification has become more popular and significant in recent years owing to its potential applications. In this paper, a real-time Convolutional Neural Network (CNN) is proposed for vehicle type classification. This novel CNN structure combines classic CNN layers with squeeze-and-excitation (SE) modules and uses the Haar wavelet as the pooling layer. This structure improves the performance of the CNN classifier by emphasizing informative feature maps and decreasing the entropy of the network. A cross-entropy loss function is proposed for better performance. A new pooling method based on the Haar transform is also introduced. The network parameters and the number of layers are chosen so that the network runs in real time. Experimental results on two vehicle datasets show that the overall performance of this model, including recognition time and accuracy, is better than that of the compared models. For example, using DWT instead of max-pooling improved the recognition rate on the IRVD dataset from 97.12% to 99.06%. The proposed model has about 5 million trainable parameters, far fewer than popular networks such as VGG (128 M), ResNet152 (58 M), DarkNet (40 M), and Inception-V3 (24 M). This makes it much faster: its recognition time is only 42 ms on CPU, which is suitable for real-time applications.
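The two building blocks named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say which Haar subbands the pooling layer keeps, so the sketch assumes only the approximation (LL) subband of a one-level 2D Haar DWT is retained, and the SE step follows the usual squeeze (global average), excitation (two fully connected layers, ReLU then sigmoid) and rescale pattern with biases omitted. The names `haar_pool` and `se_scale`, and the weight matrices `w1` and `w2`, are hypothetical.

```python
import math


def haar_pool(fmap):
    """Pool one feature map (a list of rows) with a one-level 2D Haar DWT.

    Assumption: only the LL (approximation) subband is kept, which halves
    the height and width, as a 2x2 pooling layer would.
    """
    h, w = len(fmap), len(fmap[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # Orthonormal Haar LL coefficient of a 2x2 block: sum of the block / 2.
    return [
        [(fmap[i][j] + fmap[i][j + 1] + fmap[i + 1][j] + fmap[i + 1][j + 1]) / 2.0
         for j in range(0, w, 2)]
        for i in range(0, h, 2)
    ]


def se_scale(channels, w1, w2):
    """Rescale C feature maps with squeeze-and-excitation gates.

    channels: list of C feature maps (each a list of rows).
    w1: (C/r) x C weights of the reduction layer (hypothetical values).
    w2: C x (C/r) weights of the expansion layer (hypothetical values).
    """
    # Squeeze: global average of each channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in channels]
    # Excitation: FC + ReLU, then FC + sigmoid, giving one gate per channel.
    hidden = [max(0.0, sum(wt * s for wt, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(wt * hv for wt, hv in zip(row, hidden))))
        for row in w2
    ]
    # Rescale: multiply every value of a channel by that channel's gate.
    return [[[v * g for v in row] for row in ch] for ch, g in zip(channels, gates)]
```

In this sketch, `haar_pool` applied to a 4x4 map returns a 2x2 map, and `se_scale` leaves the spatial size unchanged while weighting each channel by a value between 0 and 1, which is one way to read the abstract's "emphasizing informative feature maps".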




Updated: 2021-06-18