Video Decolorization Based on the CNN and LSTM Neural Network
ACM Transactions on Multimedia Computing, Communications, and Applications (IF 5.1), Pub Date: 2021-07-22, DOI: 10.1145/3446619
Shiguang Liu, Huixin Wang, Xiaoli Zhang

Video decolorization is the process of converting three-channel color videos into single-channel grayscale videos, which essentially amounts to decolorizing individual video frames. Most existing video decolorization algorithms directly apply image decolorization methods to the frames. However, considering only the single-frame decolorization result inevitably causes temporal inconsistency and flicker artifacts, meaning that the same local content in consecutive video frames may be mapped to different gray values. In addition, consecutive video frames often share similar local content features, which implies redundant information. To address these problems, this article proposes a novel video decolorization algorithm based on the convolutional neural network and the long short-term memory neural network. First, we design a local semantic content encoder to learn and extract the shared local content of consecutive video frames, which better preserves the contrast of video frames. Second, a temporal feature controller based on a bi-directional recurrent neural network with long short-term memory (LSTM) units refines the local semantic features, which largely maintains the temporal consistency of the video sequence and eliminates flicker. Finally, we take advantage of deconvolution to decode the features and produce the grayscale video sequence. Experiments indicate that our method preserves the local contrast of video frames and the temporal consistency of the sequence better than state-of-the-art methods.
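The abstract outlines a three-stage pipeline: a per-frame CNN encoder, a bi-directional LSTM that refines features across time, and a deconvolution decoder that emits grayscale frames. The paper's code is not reproduced on this page; the following is a minimal PyTorch sketch of that pipeline, where all layer widths, kernel sizes, the spatial pooling, and the gating scheme used to inject the LSTM output back into the spatial features are illustrative assumptions, not the authors' published architecture.

```python
# Minimal sketch of a CNN encoder + Bi-LSTM temporal controller +
# deconvolution decoder, assuming hypothetical layer sizes throughout.
import torch
import torch.nn as nn

class VideoDecolorizer(nn.Module):
    def __init__(self, feat_ch=64, hidden=128):
        super().__init__()
        # 1) Local semantic content encoder: shared 2-D CNN applied per frame.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(),
        )
        # 2) Temporal feature controller: bi-directional LSTM over per-frame
        #    feature summaries, to enforce consistency across the sequence.
        self.temporal = nn.LSTM(feat_ch, hidden, batch_first=True,
                                bidirectional=True)
        self.fuse = nn.Linear(2 * hidden, feat_ch)
        # 3) Decoder: transposed convolutions back to a 1-channel gray frame.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(feat_ch, feat_ch, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(feat_ch, 1, 4, stride=2, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, frames):  # frames: (B, T, 3, H, W)
        b, t, c, h, w = frames.shape
        feats = self.encoder(frames.reshape(b * t, c, h, w))  # (B*T, F, h', w')
        f, hh, ww = feats.shape[1:]
        # Pool each frame's features spatially, refine the sequence with the
        # Bi-LSTM, then gate the spatial features with the refined summary.
        pooled = feats.mean(dim=(2, 3)).reshape(b, t, f)       # (B, T, F)
        refined, _ = self.temporal(pooled)                     # (B, T, 2*hidden)
        gate = torch.sigmoid(self.fuse(refined))               # (B, T, F)
        feats = feats.reshape(b, t, f, hh, ww) * gate[..., None, None]
        gray = self.decoder(feats.reshape(b * t, f, hh, ww))   # (B*T, 1, H, W)
        return gray.reshape(b, t, 1, h, w)

clip = torch.rand(1, 8, 3, 64, 64)      # one 8-frame RGB clip
print(VideoDecolorizer()(clip).shape)   # torch.Size([1, 8, 1, 64, 64])
```

Gating the per-frame spatial features with an LSTM-refined summary is only one plausible way to realize the "temporal feature controller"; the point of the sketch is that the same local content receives the same modulation across neighboring frames, which is what suppresses flicker.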

Updated: 2021-07-22