Autonomous anomaly detection on traffic flow time series with reinforcement learning
Transportation Research Part C: Emerging Technologies (IF 7.6). Pub Date: 2023-03-15. DOI: 10.1016/j.trc.2023.104089. Dan He, Jiwon Kim, Hua Shi, Boyu Ruan
This study develops an autonomous artificial intelligence (AI) agent that detects anomalies in traffic flow time series data. The agent learns anomaly patterns from data without supervision, requiring neither ground-truth labels for model training nor a predefined threshold for anomaly definition. Specifically, our model is based on reinforcement learning: the agent is built from a Long Short-Term Memory (LSTM) model and the Q-learning algorithm, so that sequential information in the time series is incorporated into policy optimization. The key contribution of our model is a novel unsupervised reward learning algorithm that automatically learns the reward for an action taken by the agent from the distribution of the data, without requiring manual specification of a reward function. To test the performance of our model, we conduct a comprehensive set of experiments on both real-world data from Brisbane, Australia, and synthetic data simulated according to the distribution of the real-world data. We compare our model against three state-of-the-art models; the experimental results show that ours outperforms them across different parameter settings, achieving around 90% precision, 80% recall, and an 85% F1 score.
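To make the setup concrete, the sketch below illustrates the general idea described in the abstract, not the paper's actual method: an agent reads a traffic-flow series point by point, chooses an action (normal vs. anomaly), and receives a reward derived from the data distribution rather than from labels. The paper's LSTM Q-network is replaced here by a tabular Q-function over discretized z-score states, and the distribution-based reward is approximated by a simple 3-sigma rule; all names, window sizes, and hyperparameters are illustrative assumptions.

```python
import random
from statistics import mean, stdev

WINDOW = 20   # sliding window used to estimate the local data distribution
N_BINS = 6    # discretized |z-score| buckets -> states 0..5
ALPHA, EPS = 0.2, 0.1  # learning rate and epsilon-greedy exploration rate

def state_of(z):
    """Map an absolute z-score to a discrete state index (stand-in for an LSTM state)."""
    return min(int(abs(z)), N_BINS - 1)

def reward_of(z, action):
    """Distribution-based reward (no labels): reward agreeing with the
    'point is far from the local distribution' evidence, penalize disagreeing.
    The 3-sigma threshold is a crude stand-in for the paper's learned reward."""
    evidence = 1.0 if abs(z) > 3.0 else -1.0
    return evidence if action == 1 else -evidence

def train(series, episodes=30, seed=0):
    """Epsilon-greedy Q-learning over the series (simplified to a one-step update)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_BINS)]  # q[state][action], action 1 = anomaly
    for _ in range(episodes):
        for t in range(WINDOW, len(series)):
            window = series[t - WINDOW:t]
            mu, sigma = mean(window), stdev(window) or 1.0
            z = (series[t] - mu) / sigma
            s = state_of(z)
            a = rng.randrange(2) if rng.random() < EPS else max((0, 1), key=lambda a: q[s][a])
            q[s][a] += ALPHA * (reward_of(z, a) - q[s][a])
    return q

def detect(series, q):
    """Greedy pass with the learned Q-function; returns one 0/1 flag per scored point."""
    flags = []
    for t in range(WINDOW, len(series)):
        window = series[t - WINDOW:t]
        mu, sigma = mean(window), stdev(window) or 1.0
        s = state_of((series[t] - mu) / sigma)
        flags.append(max((0, 1), key=lambda a: q[s][a]))
    return flags
```

On a flat series with a single spike, the agent learns positive Q-values for flagging high-z states and for passing low-z states, so the spike alone is flagged. The real model replaces both the discretization and the fixed 3-sigma evidence with learned components.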