A data reuse strategy based on deep learning for high dimensional data’s pattern and instance similarity

Wu, Feng; Lv, Hongwei; Fan, Tongrang; Zhao, Wenbin; Wang, Jiaqi

doi:10.1007/s00607-021-00964-4

Special Issue Article
Published: 13 June 2021

Computing (2021)

Feng Wu¹, Hongwei Lv², Tongrang Fan², Wenbin Zhao² & Jiaqi Wang²


Abstract

Data reuse is an effective way to save storage space and improve data utilization in data management. Motivated by the success of deep learning in text mining, this paper proposes a deep-learning-based data reuse strategy that exploits the pattern and instance similarity of high-dimensional data. Pattern similarity between data dimensions is analyzed by combining traditional feature analysis with a convolutional neural network, so as to optimize the similar dimension pairs among high-dimensional data sets. For instance similarity, a semantic similarity model, IA-LSTM, is designed by incorporating an inner-attention mechanism; it builds the association mapping among data entities by computing the similarity of short texts. Based on these pattern and instance similarities, reusable data entities are discovered, and a column storage scheme is designed to improve data reuse efficiency.
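The two-stage flow described in the abstract (first pair up similar dimensions, then map similar instances) can be illustrated with a minimal Python sketch. It is not the authors' method: a character-bigram cosine similarity stands in for both the convolutional pattern model and the IA-LSTM short-text model, and all function names, thresholds, and sample records are hypothetical.

```python
from collections import Counter
from math import sqrt


def bigrams(text: str) -> Counter:
    """Character-bigram counts of a lowercased, whitespace-normalized string."""
    s = " ".join(text.lower().split())
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


def text_similarity(x: str, y: str) -> float:
    """Cosine similarity of bigram counts (a stand-in, NOT IA-LSTM or a CNN)."""
    a, b = bigrams(x), bigrams(y)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similar_dimension_pairs(cols_a, cols_b, threshold=0.5):
    """Stage 1 (pattern similarity): pair each column with its closest name match."""
    pairs = []
    for a in cols_a:
        best = max(cols_b, key=lambda b: text_similarity(a, b))
        if text_similarity(a, best) >= threshold:
            pairs.append((a, best))
    return pairs


def reusable_entities(rows_a, rows_b, pairs, threshold=0.8):
    """Stage 2 (instance similarity): map rows whose paired fields agree closely."""
    if not pairs:
        return []
    matches = []
    for i, ra in enumerate(rows_a):
        for j, rb in enumerate(rows_b):
            # Average the short-text similarity over all paired dimensions.
            score = sum(text_similarity(ra[a], rb[b]) for a, b in pairs) / len(pairs)
            if score >= threshold:
                matches.append((i, j))
    return matches


# Hypothetical sample records from two data sets with different schemas.
rows_a = [{"title": "Deep learning for text mining", "author_name": "F. Wu"}]
rows_b = [
    {"paper_title": "Deep Learning for Text Mining", "author": "F. Wu"},
    {"paper_title": "Graph matching networks", "author": "K. Xu"},
]

pairs = similar_dimension_pairs(list(rows_a[0]), list(rows_b[0]))
matches = reusable_entities(rows_a, rows_b, pairs)
print(pairs)
print(matches)
```

The matched index pairs mark records in the second data set that duplicate records in the first, so they could reference the stored values instead of keeping a second copy. A learned similarity model would replace `text_similarity` in both stages.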



Acknowledgements

The authors acknowledge the National Natural Science Foundation of China (61373160); the research project “Research on Repetition Detection Technology of High Dimensional Data based on Deep Learning” of the Hebei Science and Technology Information Processing Laboratory; the research project “Research on recognition method of knowledge evolution path for sequential associated text based on graph neural network” of the Natural Science Foundation of Hebei Province; and the research project “Knowledge Graph Construction of Multi-Source Domain data based on Knowledge Representation learning” of the Education Department of Hebei Province.

Author information

Authors and Affiliations

Hebei Science and Technology Information Processing Laboratory, Hebei Institute of Science and Technology Information, Shijiazhuang, 050021, Hebei, China
Feng Wu
School of Information Science and Technology, Shijiazhuang Tiedao University, Shijiazhuang, 050043, Hebei, China
Hongwei Lv, Tongrang Fan, Wenbin Zhao & Jiaqi Wang

Corresponding author

Correspondence to Tongrang Fan.

About this article

Cite this article

Wu, F., Lv, H., Fan, T. et al. A data reuse strategy based on deep learning for high dimensional data’s pattern and instance similarity. Computing (2021). https://doi.org/10.1007/s00607-021-00964-4

Received: 30 January 2021
Accepted: 25 May 2021
Published: 13 June 2021
DOI: https://doi.org/10.1007/s00607-021-00964-4
