Using Sensory Time-cue to enable Unsupervised Multimodal Meta-learning,arXiv - CS - Neural and Evolutionary Computing


arXiv - CS - Neural and Evolutionary Computing. Pub Date: 2020-09-16, DOI: arxiv-2009.07879
Qiong Liu, Yanxia Zhang

As data from IoT (Internet of Things) sensors become ubiquitous, state-of-the-art machine learning algorithms face many challenges in using sensor data directly. To overcome these challenges, methods must be designed to learn directly from sensors without manual annotations. This paper introduces Sensory Time-cue for Unsupervised Meta-learning (STUM). Unlike traditional learning approaches, which depend heavily either on labels or on time-independent feature extraction assumptions such as Gaussian-distributed features, the STUM system uses the time relations of inputs to guide feature space formation within and across modalities. Because STUM learns from a variety of small tasks, it may be placed in the camp of Meta-Learning. Unlike existing Meta-Learning approaches, STUM composes its learning tasks within and across multiple modalities based on time cues that co-exist with the IoT streaming data. In an audiovisual learning example, because consecutive visual frames usually contain the same object, this approach provides a unique way to organize features from the same object together. The same method can also organize visual object features together with the object's spoken-name features if the spoken name is presented at about the same time as the object. This cross-modality feature organization may further help organize visual features that belong to similar objects but were acquired at different locations and times. The evaluations yield promising results.
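The time-cue idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Event` record, the `time_cue_pairs` function, and the `window` threshold are all hypothetical names chosen for illustration. The sketch only shows how timestamps alone, with no labels, could group features within one modality (consecutive frames) and across modalities (a frame and a spoken word occurring at about the same time).

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    """Hypothetical record for one sensor observation."""
    t: float        # timestamp in seconds
    modality: str   # e.g. "visual" or "audio"
    name: str       # identifier of the associated feature vector


def time_cue_pairs(events, window=0.5):
    """Pair events whose timestamps differ by at most `window` seconds.

    Returns two lists of (name, name) pairs: pairs within the same
    modality and pairs across different modalities. The window value
    is an assumption, not a parameter taken from the paper.
    """
    within, across = [], []
    ordered = sorted(events, key=lambda e: e.t)
    for a, b in combinations(ordered, 2):
        if b.t - a.t > window:
            continue  # too far apart in time to be treated as related
        target = within if a.modality == b.modality else across
        target.append((a.name, b.name))
    return within, across


# Toy stream: two nearby frames, a spoken word, and a much later frame.
events = [
    Event(0.0, "visual", "frame0"),
    Event(0.1, "visual", "frame1"),
    Event(0.15, "audio", "word0"),
    Event(5.0, "visual", "frame2"),
]
within, across = time_cue_pairs(events, window=0.5)
```

In this toy stream, `frame0` and `frame1` are paired within the visual modality, both are paired with `word0` across modalities, and `frame2` is left unpaired because it arrives much later. A real system would presumably feed such pairs to a feature-learning objective; that step is outside what the abstract specifies.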


Updated: 2020-09-18
