Weakly Supervised Video Summarization by Hierarchical Reinforcement Learning
arXiv - CS - Machine Learning. Pub Date: 2020-01-12. DOI: arxiv-2001.05864. Yiyan Chen, Li Tao, Xueting Wang, and Toshihiko Yamasaki
Conventional reinforcement-learning-based video summarization approaches suffer
from the problem that a reward is received only after the whole summary has
been generated. Such a reward is sparse and makes reinforcement learning hard
to converge. Another problem is that labelling each frame is tedious and
costly, which usually prohibits the construction of large-scale datasets. To
solve these problems, we propose a weakly supervised hierarchical reinforcement
learning framework that decomposes the whole task into several subtasks to
enhance the summarization quality. The framework consists of a manager network
and a worker network. For each subtask, the manager is trained to set a subgoal
using only a task-level binary label, which requires far fewer labels than
conventional approaches. Guided by the subgoal, the worker predicts importance
scores for the video frames in the subtask via policy gradient, using both a
global reward and newly defined sub-rewards to overcome the sparsity problem.
Experiments on two benchmark datasets show that our approach achieves the best
performance, surpassing even supervised approaches.
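The manager-worker decomposition described above can be sketched in a toy form. This is only an illustrative sketch, not the authors' implementation: the linear "networks" `W_m`/`W_w`, the variance-based sub-reward, the mean-score global reward, and all other names here are assumptions for demonstration, and the policy-gradient update itself is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_into_subtasks(features, n_subtasks):
    """Decompose the frame sequence into contiguous subtasks."""
    return np.array_split(features, n_subtasks)

def manager_subgoal(segment, W_m):
    """Manager sets a subgoal vector for a subtask (toy: linear map of mean feature)."""
    return W_m @ segment.mean(axis=0)

def worker_scores(segment, subgoal, W_w):
    """Worker predicts per-frame importance scores, guided by the subgoal."""
    logits = segment @ W_w @ subgoal        # one logit per frame
    return 1.0 / (1.0 + np.exp(-logits))    # sigmoid -> scores in (0, 1)

def sub_reward(scores):
    """Dense per-subtask reward (toy proxy: variance of the worker's scores)."""
    return float(np.var(scores))

# Toy data: 12 "frames" with 4-dim features, split into 3 subtasks.
features = rng.normal(size=(12, 4))
W_m = rng.normal(size=(4, 4))   # manager parameters
W_w = rng.normal(size=(4, 4))   # worker parameters

segments = split_into_subtasks(features, 3)
all_scores, sub_rewards = [], []
for seg in segments:
    g = manager_subgoal(seg, W_m)
    s = worker_scores(seg, g, W_w)
    all_scores.append(s)
    sub_rewards.append(sub_reward(s))   # dense signal, available per subtask

scores = np.concatenate(all_scores)
global_reward = float(scores.mean())    # stand-in for a summary-level reward
```

The point of the sketch is the reward structure: each subtask yields its own `sub_reward` immediately, so the worker is not limited to the single sparse `global_reward` that arrives only after all frames are scored.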
Updated: 2020-03-03