Construction machine pose prediction considering historical motions and activity attributes using gated recurrent unit (GRU)

doi:10.1016/j.autcon.2020.103444

Automation in Construction

Volume 121, January 2021, 103444

https://doi.org/10.1016/j.autcon.2020.103444

Highlights

• Proposed a GRU-based method for machine pose prediction using motion capture data
• Improved machine pose prediction results by keypoint-based activity recognition
• Considered non-geometric data like machine interaction and activity type
• A rollback method greatly reduced the influence of uncertainty in activity recognition
• Achieved an average accuracy over 90% for machine pose prediction

Abstract

Variation in construction machine poses is one of the main causes of interactive on-site safety issues such as struck-by hazards. To reduce such hazards, we propose a framework for predicting construction machine poses from historical motion data and activity attributes. After building a machine motion dataset, we develop a keypoint-based method that recognizes machine activities from working patterns and interaction characteristics. The recognized activity information is then combined with historical pose data to predict future machine poses using a Gated Recurrent Unit (GRU), a type of recurrent neural network (RNN). In experiments with excavators, our framework performs well on machine pose prediction, and incorporating activity information improves it further, reaching an average percentage of correct keypoints (PCK) of 90.22%. The results indicate the high potential of our framework for predicting construction machine poses and improving on-site safety.
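The PCK metric reported above can be illustrated with a minimal sketch. It assumes the common definition in which a predicted keypoint counts as correct when it lies within a fraction `alpha` of a reference length from the ground truth. The function name `pck`, the threshold `alpha=0.1`, the reference length and the keypoint coordinates are all hypothetical; the threshold actually used in the paper is not stated in this excerpt.

```python
import math


def pck(predicted, ground_truth, reference_length, alpha=0.1):
    """Fraction of keypoints whose error is within alpha * reference_length.

    The threshold convention is an assumption, not taken from the paper.
    """
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("need two non-empty keypoint lists of equal length")
    threshold = alpha * reference_length
    correct = sum(
        1 for p, g in zip(predicted, ground_truth) if math.dist(p, g) <= threshold
    )
    return correct / len(predicted)


# Hypothetical 2D keypoints of one excavator pose, in pixels.
truth = [(100.0, 200.0), (150.0, 120.0), (220.0, 90.0), (260.0, 160.0)]
pred = [(102.0, 201.0), (149.0, 118.0), (240.0, 95.0), (261.0, 158.0)]

# The third keypoint is about 20.6 px off, beyond the 10 px threshold.
score = pck(pred, truth, reference_length=100.0, alpha=0.1)
print(score)  # 0.75
```

An average PCK over a test set would then be the mean of this score across all predicted frames.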

Introduction

Construction sites have some of the highest hazard rates of all workplaces, which makes improving on-site safety a foremost concern. According to reports from the U.S., mainland China and the Hong Kong Special Administrative Region (SAR) [1], [2], [3], unsafe operation of construction machines is a major cause of fatal accidents on construction sites. In the U.S., over 38% of construction accidents are caused by interactions between construction resources (e.g. workers and machines); such interactions also account for more than 16% and 29% of construction accidents in mainland China and Hong Kong SAR, respectively. Hence, it is important to monitor the motion of construction machines on site. Accidents are typically likely to occur when two on-site objects move towards each other, so safety managers need to pay attention to the changing locations (i.e. trajectories) of on-site construction resources. In recent years, several efforts have been made to monitor the locations of construction resources automatically, using data captured from surveillance cameras and pre-installed devices, to assist site managers with traditional safety inspections, which are error-prone and tedious [4], [5], [6]. Nevertheless, a common but easily overlooked fact is that accidents may still occur when the location of a heavy construction machine remains unchanged, because its pose varies constantly as its deformable components are operated. Therefore, monitoring machine poses can be a necessary supplement to the safety management of construction projects.

To date, most site managers monitor on-site machine poses by watching surveillance videos and evaluating potential risks manually. Such manual observation of on-site safety conditions is error-prone and time-consuming because it depends heavily on the physical state and expertise of the inspector. To address these limitations, previous studies have attempted to automate pose monitoring of construction machines based on surveillance videos. For example, several studies have estimated past and current poses of construction machines from on-site videos, using both conventional computer vision techniques [7] and deep learning techniques [8,9], which have shown promising performance in many vision-based tasks. Another common and practicable approach to automating pose monitoring is to process signals collected from pre-installed devices [10]. Most previous studies focus only on estimating past and current machine poses, i.e. poses that have already occurred, yet understanding the current poses of construction machines is not sufficient to avoid potential hazards. Pose prediction (or pose forecasting) [11] of construction machines can instead provide more clues for preventing possible collisions and other accidents, but research on it is still lacking.

In a preliminary investigation of potential methods for predicting future poses, we found that both geometric and non-geometric information provide insights for prediction. On the one hand, future poses of construction machines are influenced by machine motions, which can be informed by geometric construction data such as the geometry of the machines and of the construction environment. On the other hand, non-geometric construction data, such as the task the machine is performing and its interaction with other objects, provide contextual information for predicting machine poses that has not been fully considered in other research. To reduce potential on-site hazards, we propose a framework that incorporates both geometric and non-geometric construction information to improve machine pose prediction. Here, geometric construction data refer to the dimensions (e.g. length, width, height) and coordinates of both construction machines and project terrain, while non-geometric data are semantic data such as the working tasks and working nature of the construction machine. The proposed framework consists of three modules: a motion capture module, an activity recognition module and a machine pose prediction module. Firstly, the machine motion capture module takes in geometric construction site information and provides historical motion data of the target construction machines. Next, the machine activity recognition module recognizes historical activities using non-geometric construction information together with the historical motion data from the motion capture module. Lastly, the machine pose prediction module generates future poses of construction machines by integrating the historical poses and activities. To validate the proposed framework, we adopted excavators as the experiment objects, because excavators have more deformable components and tend to have complex pose variations and interactions with surrounding objects. The experiments were performed on a motion capture dataset built by adding historical pose data to an existing video dataset. The proposed framework is expected to provide more comprehensive information to reduce potential hazards caused by varying machine poses and to protect on-site workers.
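The data flow of the pose prediction module can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the GRU cell uses random, untrained weights, each historical pose is concatenated with a one-hot activity label, and future poses are decoded autoregressively as displacements under the assumption that the last recognized activity continues. The names `GRUCell`, `predict_future_poses` and `W_out`, the number of keypoints, the number of activity classes and the hidden size are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GRUCell:
    """A single GRU cell with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        joint = input_size + hidden_size
        self.W_z = rng.normal(0.0, 0.1, (hidden_size, joint))  # update gate
        self.W_r = rng.normal(0.0, 0.1, (hidden_size, joint))  # reset gate
        self.W_h = rng.normal(0.0, 0.1, (hidden_size, joint))  # candidate state
        self.b_z = np.zeros(hidden_size)
        self.b_r = np.zeros(hidden_size)
        self.b_h = np.zeros(hidden_size)

    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(self.W_z @ xh + self.b_z)
        r = sigmoid(self.W_r @ xh + self.b_r)
        h_new = np.tanh(self.W_h @ np.concatenate([x, r * h]) + self.b_h)
        return (1.0 - z) * h + z * h_new


def predict_future_poses(history, activities, n_activities, horizon, cell, W_out):
    """history: (T, K, 2) keypoints; activities: T integer activity labels."""
    one_hot = np.eye(n_activities)
    h = np.zeros(W_out.shape[1])
    # Encode: feed each historical pose together with its activity label.
    for pose, act in zip(history, activities):
        h = cell.step(np.concatenate([pose.ravel(), one_hot[act]]), h)
    # Decode: predict a displacement per step and feed the result back in,
    # assuming (hypothetically) that the last recognised activity continues.
    pose, act, future = history[-1], activities[-1], []
    for _ in range(horizon):
        pose = pose + (W_out @ h).reshape(pose.shape)
        future.append(pose)
        h = cell.step(np.concatenate([pose.ravel(), one_hot[act]]), h)
    return np.stack(future)


rng = np.random.default_rng(0)
T, K, N_ACT, HIDDEN = 10, 6, 3, 16  # hypothetical sizes
history = rng.uniform(0.0, 1.0, (T, K, 2))  # stand-in for motion capture output
activities = [0] * 5 + [1] * 5  # stand-in for recognised activity labels
cell = GRUCell(K * 2 + N_ACT, HIDDEN, rng)
W_out = rng.normal(0.0, 0.1, (K * 2, HIDDEN))
future = predict_future_poses(history, activities, N_ACT, 5, cell, W_out)
print(future.shape)  # (5, 6, 2)
```

Because the weights are untrained, the predicted coordinates carry no meaning; the sketch only shows how historical poses and activity labels might enter a recurrent predictor together. In practice a trained model from a deep learning library would take the place of the hand-written cell.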

The rest of this paper is organized as follows. Section 2 reviews work related to motion analysis of the past and current states of construction machines, as well as machine motion prediction. Section 3 then introduces the proposed overall framework of machine pose prediction that considers both geometric and non-geometric construction information, including detailed descriptions and implementation approaches of each framework module. Next, taking excavators performing earthmoving tasks as an example, the proposed framework is verified in Section 4, followed by discussion of and insights into the experimental results. Finally, Section 5 concludes the paper.

Section snippets

Related works

Machine motion (i.e. locations, poses and movements) monitoring has attracted increasing attention in the construction industry in recent years because of the high on-site hazard rates resulting from moving construction machines. On construction sites, machine motion monitoring includes not only analyzing machine motions that have occurred in the past and at the current time, but also, on this basis, predicting potential future motions of construction machines. Therefore, this paper firstly

The proposed framework for construction machine pose prediction

The overall framework of machine pose prediction considering non-geometric construction information is illustrated in Fig. 1. Firstly, the motion capture module obtains historical motion data (i.e. locations, poses and movements) by processing geometric construction information captured by external devices such as surveillance cameras and pre-installed devices. Subsequently, the machine activity recognition module provides historical activity information considering both historical motions

Validation of the proposed framework

To validate the feasibility and effectiveness of the proposed framework for machine pose prediction, excavators are adopted as the experiment objects. One reason is that excavators are fundamental and essential machines in construction projects with the most deformable components. Besides, excavators tend to have more complex pose variations and interactions with the surrounding objects.

Conclusion

Safety is the first priority of all construction projects. Construction machines are the major source of safety issues on construction sites due to their frequent interactions with workers and other construction-related objects. Therefore, it is necessary to monitor states (i.e. locations, poses and movements) of construction machines for avoiding potential collisions and other accidents. Besides tracking the past and the current states of construction machines, evaluating the future states of

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

Financial support for this study from the Hong Kong PhD Fellowship Scheme (HKPFS) to Han LUO and Peter K. Y. WONG is gratefully acknowledged.

References (51)

M. Memarzadeh et al. Automated 2D detection of construction equipment and workers from site video streams using histograms of oriented gradients and colors. Autom. Constr. (2013)
M. Golparvar-Fard et al. Vision-based action recognition of earthmoving equipment using spatio-temporal features and support vector machine classifiers. Adv. Eng. Inform. (2013)
M.M. Soltani et al. Skeleton estimation of excavator by detecting its parts. Autom. Constr. (2017)
H. Luo et al. Full body pose estimation of construction equipment using computer vision and deep learning techniques. Autom. Constr. (2020)
S. Chae et al. Application of RFID technology to prevention of collision accident with heavy equipment. Autom. Constr. (2010)
T. Cheng et al. Performance evaluation of ultra wideband technology for construction resource location tracking in harsh environments. Autom. Constr. (2011)
H. Kim et al. Application of dynamic time warping to the recognition of mixed equipment activities in cycle time measurement. Autom. Constr. (2018)
Y. Li et al. Integrating field data and 3D simulation for tower crane activity monitoring and alarming. Autom. Constr. (2012)
F. Vahdatikhaki et al. Optimization-based excavator pose estimation using real-time location systems. Autom. Constr. (2015)
M. Haga et al. Digging control system for hydraulic excavator. Mechatronics (2001)
R. Akhavian et al. Construction equipment activity recognition for simulation input modeling using mobile sensors and machine learning classifiers. Adv. Eng. Inform. (2015)
J. Kim et al. Action recognition of earthmoving excavators based on sequential pattern analysis of visual features and operation cycles. Autom. Constr. (2019)
D. Roberts et al. End-to-end vision-based detection, tracking and activity analysis of earthmoving equipment filmed at ground level. Autom. Constr. (2019)
J. Kim et al. Interaction analysis for vision-based activity identification of earthmoving excavators and dump trucks. Autom. Constr. (2018)
J. Yoon et al. Spatial factors affecting the loading efficiency of excavators. Autom. Constr. (2014)
K.M. Rashid et al. Times-series data augmentation and deep learning for construction equipment activity recognition. Adv. Eng. Inform. (2019)
U.S. Bureau of Labor Statistics (BLS). National census of fatal occupational injuries in 2017
MOHURD. Report of Safety Accidents in China's Building Construction Activities in 2017
Hong Kong Labour Department. Occupational Safety and Health Statistics 2017
H. Kim et al. Detecting construction equipment using a region-based fully convolutional network and transfer learning. J. Comput. Civ. Eng. (2018)
C.-J. Liang et al. Stacked hourglass networks for markerless pose estimation of articulated construction robots
C. Zhang. Improving Crane Safety by Agent-Based Dynamic Motion Planning Using UWB Real-Time Location System (2010)
H.K. Chiu et al. Action-agnostic human pose forecasting
M.M. Soltani et al. Framework for location data fusion and pose estimation of excavators using stereo vision. J. Comput. Civ. Eng. (2018)
H. Vogt. Efficient object identification with passive RFID tags, in: Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2002)
