A data reuse strategy based on deep learning for high dimensional data’s pattern and instance similarity
Computing (IF 3.7), Pub Date: 2021-06-13, DOI: 10.1007/s00607-021-00964-4
Feng Wu, Hongwei Lv, Tongrang Fan, Wenbin Zhao, Jiaqi Wang

Data reuse is an effective way to save storage space and improve data utilization in data management. Motivated by the success of deep learning in text mining, a deep-learning-based data reuse strategy is proposed that targets both pattern similarity and instance similarity in high-dimensional data. Pattern similarity between data dimensions is analyzed with traditional feature analysis combined with a convolutional neural network, in order to optimize the similar dimension pairs among high-dimensional data sets. For instance similarity, a semantic similarity model, IA-LSTM, is designed with an inner-attention mechanism; it builds an association mapping among data entities by computing the similarity of short texts. Using both kinds of similarity, the strategy discovers reusable data entities, and a column storage scheme is designed to improve data reuse efficiency.
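To make the idea of an association mapping built from short-text similarity concrete, here is a minimal sketch. It is not the paper's IA-LSTM model: as a stand-in, it scores similarity with cosine similarity over character bigrams. The function names (`char_ngrams`, `cosine`, `map_entities`) and the 0.6 threshold are assumptions chosen for illustration only.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=2):
    """Count character n-grams of a lower-cased, whitespace-normalized string."""
    s = " ".join(text.lower().split())
    if len(s) < n:
        # Too short for a full n-gram: use the string itself (or nothing if empty).
        return Counter([s]) if s else Counter()
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def map_entities(left, right, threshold=0.6):
    """Map each left entity to its most similar right entity.

    Stand-in for a learned similarity model: a pair is linked only when its
    score reaches the (assumed) threshold.
    """
    right_vecs = [(r, char_ngrams(r)) for r in right]
    mapping = {}
    for l in left:
        lv = char_ngrams(l)
        best, best_score = None, threshold
        for r, rv in right_vecs:
            score = cosine(lv, rv)
            if score >= best_score:
                best, best_score = r, score
        if best is not None:
            mapping[l] = best
    return mapping


if __name__ == "__main__":
    a = ["Beijing University", "abc"]
    b = ["beijing  university", "Hebei Hospital"]
    print(map_entities(a, b))
```

In a pipeline like the one the abstract describes, the `cosine` scorer would be replaced by the learned model's similarity output; the surrounding loop that turns pairwise scores into an entity mapping would stay structurally the same.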




Updated: 2021-06-14