Deep Unsupervised Multi-Modal Fusion Network for Detecting Driver Distraction



Neurocomputing (IF 5.5), Pub Date: 2021-01-01, DOI: 10.1016/j.neucom.2020.09.023
Yuxin Zhang, Yiqiang Chen, Chenlong Gao

Abstract: The risk of road traffic crashes has increased year by year, and studies show that lack of attention while driving is one of the major causes of traffic accidents. To detect driver distraction (e.g., phone conversations, eating, texting), we introduce a deep unsupervised multi-modal fusion network, termed UMMFN. It is an end-to-end model composed of three main modules: multi-modal representation learning, multi-scale feature fusion, and unsupervised driver distraction detection. The first module learns low-dimensional representations of multiple heterogeneous sensors using embedding subnetworks. The multi-scale feature fusion module learns both the temporal dependencies within each modality and the spatial dependencies across modalities. The last module uses a ConvLSTM encoder-decoder model to perform an unsupervised classification task that is not affected by new types of driver behavior. During the detection phase, a fine-grained detection decision is made by computing the reconstruction error of UMMFN as a score for each captured test sample. We empirically compare the proposed approach with several state-of-the-art methods on our own multi-modal dataset of distracted driving behavior. Experimental results show that UMMFN outperforms the existing approaches.
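The detection step described in the abstract, scoring each test sample by its reconstruction error, can be illustrated with a minimal sketch. This is not the authors' UMMFN: the ConvLSTM encoder-decoder is replaced by a hypothetical placeholder reconstructor that always returns the average window seen in distraction-free training data, and the names `reconstruction_score`, `fit_threshold`, `is_distracted`, the 0.95 quantile, and the toy sensor values are all assumptions made for illustration.

```python
from statistics import mean
from typing import Callable, List, Sequence

Window = Sequence[float]
Reconstructor = Callable[[Window], Sequence[float]]


def reconstruction_score(window: Window, reconstruct: Reconstructor) -> float:
    """Mean squared error between a window and its reconstruction."""
    rebuilt = reconstruct(window)
    return mean((x - r) ** 2 for x, r in zip(window, rebuilt))


def fit_threshold(normal_windows: List[Window],
                  reconstruct: Reconstructor,
                  quantile: float = 0.95) -> float:
    """Pick a score threshold from distraction-free windows only.

    The quantile is an assumed hyperparameter, not a value from the paper.
    """
    scores = sorted(reconstruction_score(w, reconstruct) for w in normal_windows)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


def is_distracted(window: Window, reconstruct: Reconstructor,
                  threshold: float) -> bool:
    """Flag a window whose reconstruction error exceeds the threshold."""
    return reconstruction_score(window, reconstruct) > threshold


def make_mean_reconstructor(normal_windows: List[Window]) -> Reconstructor:
    """Placeholder for a trained encoder-decoder (assumption).

    It ignores its input and returns the per-position mean of the
    training windows, so unusual windows reconstruct poorly.
    """
    length = len(normal_windows[0])
    template = [mean(w[i] for w in normal_windows) for i in range(length)]
    return lambda window: template


# Toy single-sensor windows recorded during normal driving (made-up values).
normal_windows = [
    [1.0, 1.1, 0.9, 1.0],
    [0.9, 1.0, 1.1, 1.0],
    [1.0, 0.9, 1.0, 1.1],
    [1.1, 1.0, 1.0, 0.9],
]
reconstruct = make_mean_reconstructor(normal_windows)
threshold = fit_threshold(normal_windows, reconstruct)

unusual_window = [3.0, 0.2, 2.5, 0.1]
print(is_distracted(unusual_window, reconstruct, threshold))
```

Because the threshold is fitted only on normal driving data, no distraction labels are needed, which is the sense in which such a detector is unsupervised and can flag behaviors it has never seen.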


Updated: 2021-01-01
