A transformer-based deep learning model for recognizing communication-oriented entities from patents of ICT in construction

https://doi.org/10.1016/j.autcon.2021.103608

Highlights

  • Communication-oriented entities of ICT in construction were introduced.

  • A deep learning model to recognize the entities from the patents was proposed.

  • The model was structured based on the Transformer rather than RNN.

  • The Transformer enables the model to efficiently address the contextual information.

  • The model provides an effective way to perceive the communication functionalities.

Abstract

Patents of information and communication technology (ICT) in construction are valuable sources of technological solutions to communication problems in construction practice. However, it is often difficult for practitioners and stakeholders to identify the key communication functionalities from complicated expressions in the patent documents. To address this challenge, this study develops a deep learning model to enable automatic recognition of communication-oriented entities (CEs) from patent documents. The proposed model is structured based on the Transformer, consisting of feed-forward and self-attention neural networks to better recognize ambiguous and unknown entities by utilizing contextual information. The validation results showed that the proposed model outperforms traditional recurrent neural network (RNN)-based models in CE recognition, especially in recognizing ambiguous and unknown entities. Moreover, experimental results on some research literature and a real-life project report showed satisfactory performance of the model in CE recognition across different document types.

Introduction

Information and communication technology (ICT) is a broad concept, incorporating a wide range of technical approaches that mainly concentrate on communication functionalities [1]. The core benefit of ICT application in the construction industry is to enable and enhance communication, improving the coordination of data throughout the whole life cycle of construction projects [2,3]. Successful adoption of ICT relies on appropriate choices of technologies to enable desired communication functionalities according to specific objectives in construction practice [4,5]. To choose the right technologies for the problems they face, practitioners and stakeholders need to fully comprehend the communication functionalities embedded in ICTs [2]. Patents are a common source of up-to-date technologies, in which 95% of inventions can be found. Information on the communication functionalities of ICT is archived as raw text in patent documents [6,7]. Analyzing patent documents effectively is important for acquiring technological knowledge, linking potential solutions to problems, and inspiring innovation in the industry [8]. Therefore, exploiting the information underlying patent documents has attracted increasing interest from researchers, patent analysts, and practitioners [9].

In patent documents of ICT in construction, the hints of communication functionalities are hidden in complicated expressions that describe how construction data is transmitted through virtual or physical models and how it is coordinated among sites, users or stakeholders [5]. Examples of such expressions include “installation information was transferred from a radio frequency identification (RFID) tag to a construction item” in an RFID patent [10], and “the technology conveys geographic data to display devices that users could manipulate” in a geographic information system (GIS) patent [11]. To make this embedded information more accessible, this study seeks to develop a computer-aided system to automatically identify the communication-oriented entities (CEs) and categorize them into pre-defined types. This task is known as entity recognition in natural language processing (NLP) [12].

Although some patent analysis tools (e.g., TRIZ) have been developed to process patent documents, these approaches are designed for general purposes and are of limited use in solving specific problems [13]. Entity recognition offers a way to analyze patents based on customized problems or interests. An entity is a category of phrases that have similar properties, including rigid designators or members of a semantic class [14]. Mostly, the entities are “names” (e.g., drug names, disease names, chemical names) [14]. They usually have highly distinguishable spellings (e.g., the chemistry entity “Deuterium” can be easily recognized due to its unique combination of characters and the capitalized initial letter [15]). However, recognizing CEs from the patent documents of ICT in construction is a more complicated task. There are two main technical challenges. One is the ambiguity of CEs. An entity is ambiguous if its spelling appears as one entity type at one position and as a different entity type at another [16]. Communication functions in the patents are expressed not only by unique technical terms with distinguishable spellings, but also by words that are ordinarily common terms [17]. Thus, for recognizing ambiguous entities, it is important to incorporate the contextual information surrounding the candidate entities to discern their relevancy. The other challenge is unknown entities (entities that appear in the testing set but not in the training set). Previous studies attempted to address these problems by using additional linguistic materials, such as lexicons, dictionaries, gazetteers, ontologies, and knowledge graphs [18,19,20,21]. However, due to the unavoidable limitations in the coverage of lexical databases, these problems remain critical [22].
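The two challenges above can be made concrete with a small sketch. It assumes that annotations are available as (phrase, type) pairs, which is a hypothetical format rather than one described in this study, and the type labels are placeholders. Under that assumption, an ambiguous entity is a spelling that carries more than one label, and an unknown entity is a spelling found in the testing set but not in the training set.

```python
from collections import defaultdict


def find_ambiguous(annotations):
    """Return spellings that are labeled with more than one type."""
    types_by_phrase = defaultdict(set)
    for phrase, label in annotations:
        types_by_phrase[phrase.lower()].add(label)
    return {p: sorted(t) for p, t in types_by_phrase.items() if len(t) > 1}


def find_unknown(train, test):
    """Return test-set spellings that never occur in the training set."""
    seen = {phrase.lower() for phrase, _ in train}
    return sorted({phrase.lower() for phrase, _ in test} - seen)


# Toy data. "TYPE_A", "TYPE_B" and "O" (not an entity) are placeholder
# labels chosen for illustration, not the CE classes of the study.
train = [
    ("building data", "TYPE_A"),
    ("building data", "O"),
    ("RFID tag", "TYPE_B"),
]
test = [
    ("building data", "TYPE_A"),
    ("display device", "TYPE_B"),
]

ambiguous = find_ambiguous(train)
unknown = find_unknown(train, test)
print(ambiguous)
print(unknown)
```

A lookup of this kind can only flag the problem cases; deciding which label applies at a given position still requires the surrounding context, which is the motivation for the context-aware model.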

This study resorts to deep learning techniques to utilize the contextual information for recognizing the ambiguous and unknown CEs from the patents of ICT in construction. Rather than focusing on word-level information, a deep learning method can enhance the understanding of entities by incorporating surrounding texts. Recurrent neural network (RNN)-based models, such as long short-term memory (LSTM) and gated recurrent unit (GRU), are established deep learning approaches that have been widely adopted in many NLP tasks, including entity recognition, text classification, sentiment analysis, and machine translation [23,24]. In these models, bi-directional structures and convolutional neural networks (CNN) were adopted to achieve improved performance [25]. However, despite the elaborate architectures, the RNN-based models have limitations in addressing long-term dependencies. To remedy this deficiency, this study instead adopted a deep learning model based on Transformer-based neural networks (TBNN). Proposed in 2017 by the Google AI team [26], the Transformer enables the so-called “self-attention” mechanism, which computes the contextual representations in parallel rather than in sequence [27] and so captures both long- and short-term dependencies more effectively than the RNN-based models. Previous methods for recognizing communication functionalities from ICT patents relied mostly on manual searching, which is labor-intensive and time-consuming [28,29]. The TBNN model developed in this study provides an efficient alternative. It also has merit in utilizing contextual information, which is an important advancement for computer-aided systems to achieve intelligence in NLP tasks [16].
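As an illustration of the self-attention idea, the following is a minimal single-head sketch of scaled dot-product attention in plain Python. It is not the TBNN architecture of this study: the toy embeddings, the identity projection matrices, and the helper names are assumptions made for readability, and components such as multiple heads, positional encoding, and the feed-forward layers are omitted. Each output row depends only on the full input sequence, not on a previous step, which is why the rows can be computed in parallel.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over token vectors x."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    # Every token attends to every token, however far apart they are.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Three tokens with 2-dimensional toy embeddings (illustrative values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned projections
context, weights = self_attention(x, identity, identity, identity)

for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to one and describes how strongly a token draws on every other token when its contextual representation is built.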

The research procedure is shown in Fig. 1. First, based on the literature review, the main technical challenges were identified and the classes of CEs for recognition were illustrated. Second, the architecture of the proposed TBNN was illustrated in detail. Third, the validation of the model was conducted using the training and testing instances. Finally, the results and findings were discussed to report the performance of the proposed model compared with the baseline model.

Section snippets

Overview of entity recognition

Entity is an NLP concept that was first introduced in 1996 [18]. An entity is a phrase representing the elements that have similar properties. Entities are rigid designators or members of a semantic class that can be characterized by specific purposes [14]. Generally, entity recognition is used to automatically identify names of people, locations, and organizations using information extraction techniques. At the beginning, such a task was called “Named Entity Recognition”. It was rapidly

Definition of CE classes

CEs refer to the information units that describe communication functionalities in the patents of ICT in construction. They present approaches of virtual or physical transmission of data, or of data coordination among sites, users or stakeholders [5]. For example, the sentence “sensing the material information through RFID tags” indicates that the RFID technology can be applied to timely transmit information on construction materials [4]. This communication functionality involves two important

The objectives of the proposed model

The developed TBNN model utilizes contextual information to automatically identify and classify CEs in patent documents, addressing the aforementioned problems of ambiguous and unknown entities. As shown in Fig. 3, two examples of CE extraction illustrate the use of contextual information in recognizing ambiguous entities. In Fig. 3 (a), the entity “building data” was recognized as TI, because the surrounding text indicated that the “building data” is a type of

Empirical validation

This section reports the validation results of the proposed model compared with the baseline model. This study selected the bi-directional LSTM with CNN (abbreviated as BLC) as the baseline model, which is one of the most typical and best-performing deep learning models for entity recognition [57].

Conclusion

This study proposed a TBNN model to recognize CEs from patents of ICT in construction. It provides an efficient alternative for construction practitioners and stakeholders to better access and comprehend the complex specifications of communication functionalities embedded in the patent documents. The deep learning techniques were employed to overcome the challenges in recognizing ambiguous and unknown entities. The proposed model was based on the Transformer as the basic neural networks to form

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (NSFC) (No. 71771067, No. 71801159 and No. 52078302), the National Natural Science Foundation of Guangdong Province (No. 2018A030310534), and Youth Fund of Humanities and Social Sciences Research of the Ministry of Education (No. 18YJCZH090).

References (68)

  • R. Davies et al.

    Implementing ‘Site BIM’: A case study of ICT innovation on a large hospital project

    Automation in Construction

    (2013)
  • M. Majumder et al.

    A novel technique for name identification from homeopathy diagnosis discussion forum

    Procedia Technology

    (2012)
  • Y. Wang et al.

    Supervised methods for symptom name recognition in free-text clinical records of traditional Chinese medicine: an empirical study

    Journal of Biomedical Informatics

    (2014)
  • S.K. Saha et al.

    Feature selection techniques for maximum entropy based biomedical named entity recognition

    Journal of Biomedical Informatics

    (2009)
  • M. Riedmiller

    Advanced supervised learning in multi-layer perceptrons-from backpropagation to adaptive learning algorithms

    Computer Standards & Interfaces

    (1994)
  • Z. Huang et al.

    Bidirectional LSTM-CRF models for sequence tagging

    arXiv

    (2015)
  • K.-Y. Lin et al.

    Promoting transactions for A/E/C product information

    Automation in Construction

    (2006)
  • N. Mahmoudi et al.

    Deep neural networks understand investors better

    Decision Support Systems

    (2018)
  • Q.J. Qiu et al.

    Geoscience keyphrase extraction algorithm using enhanced word embedding

    Expert Systems with Applications

    (2019)
  • H. Niemann et al.

    Use of a new patent text-mining and visualization method for identifying patenting patterns over time: Concept, method and test application

    Technological Forecasting and Social Change

    (2017)
  • A. Garcia-Pablos et al.

    W2VLDA: Almost unsupervised system for Aspect Based Sentiment Analysis

    Expert Systems with Applications

    (2018)
  • M.E. Peters et al.

    Deep contextualized word representations

    arXiv

    (2018)
  • M. Sokolova et al.

    A systematic analysis of performance measures for classification tasks

    Information Processing & Management

    (2009)
  • X. Li et al.

    Integrating Building Information Modeling and Prefabrication Housing Production

    Automation in Construction

    (2019)
  • H. Baker et al.

    AI-based prediction of independent construction safety outcomes from universal attributes

    Automation in Construction

    (2020)
  • B. Zhong et al.

    Deep learning and network analysis: Classifying and visualizing accident narratives in construction

    Automation in Construction

    (2020)
  • Y.M. Goh et al.

    Construction accident narrative classification: An evaluation of text mining techniques

    Accident Analysis & Prevention

    (2017)
  • P. Mathur

    Technological Forms and Ecological Communication: A Theoretical Heuristic

    (2017)
  • J.M. Sardroud

    Perceptions of automated data collection technology use in the construction industry

    J. Civil Eng. Manag.

    (2015)
  • B. Heinzerling et al.

    Bpemb: Tokenization-free pre-trained subword embeddings in 275 languages

  • Y. El Ghazali et al.

    The potential of RFID as an enabler of knowledge management and collaboration for the procurement cycle in the construction industry

    Journal of Technology Management & Innovation

    (2012)
  • D. Nadeau et al.

    A survey of named entity recognition and classification

    Lingvisticae Investigationes

    (2007)
  • S.A. Akhondi et al.

    Chemical entity recognition in patents by combining dictionary-based and statistical approaches

    Database

    (2016)
  • W. El-Ghandour et al.

    Survey of information technology applications in construction

    Construction Innovation

    (2004)
