Traffic control hand signal recognition using convolution and recurrent neural networks
Journal of Computational Design and Engineering ( IF 4.9 ) Pub Date : 2022-02-25 , DOI: 10.1093/jcde/qwab080
Taeseung Baek 1 , Yong-Gu Lee 1, 2

Abstract
Gesture understanding is one of the most challenging problems in computer vision. Within this area, traffic hand signal recognition requires attention to both processing speed and the validity of the commanding signal. The lack of available datasets is also a serious problem. Most classifiers approach these problems using the skeletons of target actors in an image. Extracting the three-dimensional coordinates of skeletons is simpler when depth information accompanies the images. However, depth cameras cost significantly more than RGB cameras. Furthermore, the skeleton must be extracted beforehand. Here, we show a hand signal detection algorithm that does not use skeletons. Instead, we use simple object detectors trained to acquire hand directions. The variance in the duration of gestures, mixed with random pauses and noise, is handled with a recurrent neural network (RNN). Furthermore, we have developed a flag sequence algorithm to assess the validity of the commanding signal. Overall, the computed hand directions are sent to the RNN, which identifies six types of hand signals given by traffic controllers and can cope with time variations and intermittent, randomly appearing noise. We constructed a hand signal dataset composed of 100,000 RGB images, which is made publicly available. We achieved correct recognition of the hand signals against various backgrounds at 91% accuracy. A processing speed of 30 FPS on FHD video streams, a 52% improvement over the best among previous works, was achieved. Despite the extra burden of deciding the validity of the hand signals, this method surpasses methods that solely use RGB video streams. Our method can operate from nonstationary viewpoints, such as those of moving vehicles. To accomplish this goal, we set a higher priority on the speed and validity assessment of the recognized commanding signals.
The collected dataset is made publicly available through the Korean government portal at data.go.kr/data/15075814/fileData.do.
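
The data flow described in the abstract (per-frame hand directions fed to an RNN that outputs one of six signal classes) can be illustrated with a minimal sketch. This is not the authors' implementation: the direction vocabulary, the hidden size, the plain Elman-style recurrence, and the untrained random weights are all assumptions made for illustration, since the abstract specifies none of them. The sketch only shows how a recurrent model can consume clips of different lengths, including pause frames, and still return a fixed-size class distribution.

```python
import math
import random

# Hypothetical per-frame direction labels; the abstract does not list the real ones.
DIRECTIONS = ["none", "up", "down", "left", "right", "front"]
NUM_SIGNALS = 6  # the abstract states six hand-signal types
HIDDEN = 8       # arbitrary hidden size, chosen only for illustration


def make_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these from data."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def one_hot(label):
    """Encode a direction label as a one-hot vector over DIRECTIONS."""
    return [1.0 if d == label else 0.0 for d in DIRECTIONS]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TinyRNN:
    """Elman-style RNN: h_t = tanh(W_in x_t + W_rec h_{t-1})."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(HIDDEN, len(DIRECTIONS), rng)
        self.w_rec = make_matrix(HIDDEN, HIDDEN, rng)
        self.w_out = make_matrix(NUM_SIGNALS, HIDDEN, rng)

    def classify(self, frames):
        """Return a probability for each signal class given a label sequence."""
        h = [0.0] * HIDDEN
        for label in frames:  # one recurrent step per video frame
            from_input = matvec(self.w_in, one_hot(label))
            from_state = matvec(self.w_rec, h)
            h = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        return softmax(matvec(self.w_out, h))


model = TinyRNN()
short_clip = ["left", "left", "up"]
# The same gesture performed more slowly, with "none" frames standing in for pauses.
long_clip = ["left", "none", "left", "left", "none", "up", "up"]
probs_short = model.classify(short_clip)
probs_long = model.classify(long_clip)
print(len(probs_short), len(probs_long))
```

Because the weights here are random, the printed probabilities carry no meaning; in a trained model the two clips would ideally map to the same signal class despite their different lengths.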


Updated: 2022-02-25