A multi-granularity knowledge association model of geological text based on hypernetwork

  • Research Article
  • Earth Science Informatics

Abstract

With the explosive growth of geological data, considerable research has focused on accurately retrieving specific information from massive datasets and on fully exploiting the latent knowledge in unstructured data. Current research on unstructured content retrieval mostly ignores semantic and knowledge associations, or considers associations at only a single granularity, which leads to the loss of concepts that share the same semantics but are expressed in different forms during retrieval. To address these problems, this paper makes the following contributions: (1) It defines a decision rule, splits unstructured geological survey data into content fragments, and constructs a more fine-grained semantic description model for geological text. (2) It presents a multi-constraint fusion feature-weighted model to extract thematic feature items from the content fragments. (3) Across three granularities (document, content item, and feature item), same-granularity and cross-granularity associations are merged to construct a multi-granularity geological text hypernetwork model. (4) Experiments verify that the proposed approaches can improve the precision and recall of unstructured content retrieval.
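
The idea of merging same-granularity and cross-granularity associations can be illustrated with a small hypergraph sketch. This is not the authors' implementation, which the abstract does not detail. The class name `TextHypernetwork`, the method names, the one-hop expansion strategy, and the sample documents and terms are all assumptions chosen for illustration. Each node is a `(granularity, name)` pair, and each hyperedge is a set of nodes that may span one or several granularities.

```python
from collections import defaultdict


class TextHypernetwork:
    """Toy hypergraph (hypothetical, for illustration only).

    Nodes are (granularity, name) pairs; a hyperedge is a set of nodes.
    """

    GRANULARITIES = ("document", "content_item", "feature_item")

    def __init__(self):
        self.hyperedges = {}               # edge id -> frozenset of nodes
        self.incidence = defaultdict(set)  # node -> ids of edges containing it

    def add_hyperedge(self, edge_id, nodes):
        nodes = frozenset(nodes)
        for granularity, _ in nodes:
            if granularity not in self.GRANULARITIES:
                raise ValueError(f"unknown granularity: {granularity}")
        self.hyperedges[edge_id] = nodes
        for node in nodes:
            self.incidence[node].add(edge_id)

    def neighbors(self, node, granularity=None):
        """Nodes sharing a hyperedge with `node`, optionally filtered by granularity."""
        found = set()
        for edge_id in self.incidence.get(node, ()):
            found |= self.hyperedges[edge_id]
        found.discard(node)
        if granularity is not None:
            found = {n for n in found if n[0] == granularity}
        return found

    def retrieve_documents(self, feature):
        """Documents linked to `feature` directly or via one associated feature item."""
        start = ("feature_item", feature)
        # Same-granularity step: expand the query to associated feature items.
        features = {start} | self.neighbors(start, "feature_item")
        # Cross-granularity step: follow hyperedges from features to documents.
        docs = set()
        for f in features:
            docs |= self.neighbors(f, "document")
        return sorted(name for _, name in docs)


net = TextHypernetwork()
# Cross-granularity hyperedges: document, content item, and feature item together.
net.add_hyperedge("e1", [("document", "doc_A"), ("content_item", "c1"),
                         ("feature_item", "granite")])
net.add_hyperedge("e2", [("document", "doc_B"), ("content_item", "c2"),
                         ("feature_item", "granitoid")])
# Same-granularity hyperedge: two feature items treated as semantically associated.
net.add_hyperedge("e3", [("feature_item", "granite"), ("feature_item", "granitoid")])

print(net.retrieve_documents("granite"))  # ['doc_A', 'doc_B']
```

In this toy example, a query for "granite" also returns `doc_B`, which only mentions "granitoid". A purely keyword-based lookup would miss that document, which is the kind of loss the abstract attributes to single-granularity retrieval.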

Acknowledgments

This work was funded by the National Natural Science Foundation of China (Grant No. 41671400) and the National Key Research and Development Program of China (Grant Nos. 2017YFB0503600, 2018YFB0505500, 2017YFC0602204, 2018YFB0505504). We thank the National Engineering Research Center of Geographic Information System for providing hardware support.

Author information

Corresponding author

Correspondence to Liang Wu.

Additional information

Communicated by: H. Babaie

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhuang, C., Li, W., Xie, Z. et al. A multi-granularity knowledge association model of geological text based on hypernetwork. Earth Sci Inform 14, 227–246 (2021). https://doi.org/10.1007/s12145-020-00534-w

  • DOI: https://doi.org/10.1007/s12145-020-00534-w
