Construction machine pose prediction considering historical motions and activity attributes using gated recurrent unit (GRU)
Introduction
Construction sites have among the highest hazard rates of all workplaces, which makes improving on-site safety a foremost concern. According to reports from the U.S., mainland China and the Hong Kong Special Administrative Region (SAR) [1–3], unsafe operation of construction machines is a major cause of fatal accidents on construction sites. In the U.S., over 38% of construction accidents are caused by interactions between construction resources (e.g. workers and machines); such interactions also account for more than 16% and 29% of construction accidents in mainland China and Hong Kong SAR, respectively. Hence, it is important to monitor the motion of construction machines on site. Accidents are likely to occur when two on-site objects move towards each other, so safety managers need to pay attention to the changing locations (i.e. trajectories) of on-site construction resources. In recent years, several efforts have been made to automatically monitor the locations of construction resources using data captured from surveillance cameras and pre-installed devices, assisting site managers with safety inspections that are traditionally error-prone and tedious [4–6]. Nevertheless, a common but easily overlooked problem is that accidents may still occur when a heavy construction machine stays in one location while its pose varies constantly as its deformable components are operated. Therefore, monitoring machine poses can be a necessary supplement for the safety management of construction projects.
To date, most site managers monitor on-site machine poses by watching surveillance videos and evaluating potential risks manually. Such manual observation of on-site safety conditions is error-prone and time-consuming because it depends heavily on the physical condition and expertise of the inspector. To address these limitations, previous studies have attempted to automate pose monitoring of construction machines based on surveillance videos. For example, several studies have estimated past and current poses of construction machines from on-site videos, using not only conventional computer vision techniques [7] but also deep learning techniques [8,9], which have shown promising performance in many vision-based tasks. Another common and practicable approach to automating pose monitoring is to process signals collected from pre-installed devices [10]. However, most previous studies focus only on estimating past and current machine poses, and knowing the current poses of construction machines is not sufficient to avoid potential hazards. Pose prediction (or pose forecasting) [11] of construction machines can provide more clues for preventing possible collisions and other accidents, yet research on it is still lacking.
In a preliminary investigation of potential methods for predicting future poses, we found that both geometric and non-geometric information provide insights for prediction. On the one hand, future poses of construction machines are influenced by machine motions, which can be inferred from geometric construction data such as the geometry of the machines and of the construction environment. On the other hand, non-geometric construction data, such as the task the machine is working on and its interactions with other objects, provide contextual information for predicting machine poses that has not been fully considered in other research. With the aim of reducing potential on-site hazards, we propose a framework that incorporates both geometric and non-geometric construction information to improve the performance of machine pose prediction. Here, geometric construction data refer to the dimensions (e.g. length, width, height) and coordinates of both the construction machines and the project terrain, while non-geometric data are semantic data such as the working tasks and working nature of the construction machine. The proposed framework consists of three modules: a motion capture module, an activity recognition module and a machine pose prediction module. First, the motion capture module takes in geometric construction site information and provides historical motion data of the target construction machines. Next, the activity recognition module recognizes historical activities using non-geometric construction information together with the historical motion data from the motion capture module. Finally, the pose prediction module generates future poses of construction machines by integrating the historical poses and activities.
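To make the data flow of the pose prediction module concrete, the following is a minimal sketch of a GRU that reads a history of pose vectors, each concatenated with a one-hot activity label, and emits one future pose through a linear readout. It is an illustration under stated assumptions, not the authors' implementation: the class `GRUCell`, the function `predict_next_pose`, the pose dimension, the number of activity classes, the hidden size and the random untrained weights are all hypothetical, and the gate convention shown is only one of the common GRU formulations.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GRUCell:
    """Plain GRU cell with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        # Index 0: update gate z, 1: reset gate r, 2: candidate state n.
        self.W = rng.normal(0.0, 0.1, (3, hidden_size, input_size))
        self.U = rng.normal(0.0, 0.1, (3, hidden_size, hidden_size))
        self.b = np.zeros((3, hidden_size))

    def step(self, x, h):
        z = sigmoid(self.W[0] @ x + self.U[0] @ h + self.b[0])
        r = sigmoid(self.W[1] @ x + self.U[1] @ h + self.b[1])
        n = np.tanh(self.W[2] @ x + self.U[2] @ (r * h) + self.b[2])
        # Blend the candidate state with the previous hidden state.
        return (1.0 - z) * n + z * h


def predict_next_pose(poses, activities, n_activities, cell, W_out):
    """Run the GRU over the history and read out one future pose vector."""
    h = np.zeros(cell.U.shape[1])
    for pose, act in zip(poses, activities):
        one_hot = np.zeros(n_activities)
        one_hot[act] = 1.0
        # Each time step sees the pose together with its activity label.
        h = cell.step(np.concatenate([pose, one_hot]), h)
    return W_out @ h


# Hypothetical sizes: 6 keypoints x (x, y), 3 activity classes.
POSE_DIM, N_ACT, HIDDEN = 12, 3, 16
rng = np.random.default_rng(0)
cell = GRUCell(POSE_DIM + N_ACT, HIDDEN, rng)
W_out = rng.normal(0.0, 0.1, (POSE_DIM, HIDDEN))

history = rng.uniform(0.0, 1.0, (10, POSE_DIM))  # synthetic pose history
acts = [0] * 4 + [1] * 6                         # synthetic activity labels
pred = predict_next_pose(history, acts, N_ACT, cell, W_out)
print(pred.shape)
```

Because the weights are untrained, the predicted values carry no meaning; the sketch only shows how historical poses and activity attributes can enter one recurrent model as a single input per time step.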
To validate the proposed framework, we adopted excavators as the experimental subjects because they have more deformable components and tend to exhibit complex pose variations and interactions with surrounding objects. Experiments were performed on a motion capture dataset built by adding historical pose data to an existing video dataset. The proposed framework is expected to provide more comprehensive information to reduce potential hazards caused by varying machine poses and to protect on-site workers.
The rest of this paper is organized as follows. Section 2 reviews work related to motion analysis of the past and current states of construction machines, as well as machine motion prediction. Section 3 introduces the proposed overall framework for machine pose prediction, which considers both geometric and non-geometric construction information, including detailed descriptions and implementation approaches for each module. Section 4 verifies the framework using excavators performing earthmoving tasks as an example and discusses the experimental results. Section 5 concludes the paper.
Section snippets
Related works
Machine motion (i.e. locations, poses and movements) monitoring has attracted increasing attention in the construction industry in recent years because of the high on-site hazard rates resulting from moving construction machines. On construction sites, machine motion monitoring includes not only analyzing machine motions that have occurred in the past and at the current time, but also, on this basis, predicting potential future motions of construction machines. Therefore, this paper firstly
The proposed framework for construction machine pose prediction
The overall framework for machine pose prediction considering non-geometric construction information is illustrated in Fig. 1. First, the motion capture module obtains historical motion data (i.e. locations, poses and movements) by processing geometric construction information captured by external devices such as surveillance cameras and pre-installed devices. Subsequently, the machine activity recognition module provides historical activity information considering both historical motions
Validation of the proposed framework
To validate the feasibility and effectiveness of the proposed framework for machine pose prediction, excavators are adopted as the experimental subjects. One reason is that excavators are fundamental and essential machines in construction projects and have the most deformable components. In addition, excavators tend to have more complex pose variations and interactions with surrounding objects.
Conclusion
Safety is the first priority of all construction projects. Construction machines are the major source of safety issues on construction sites due to their frequent interactions with workers and other construction-related objects. Therefore, it is necessary to monitor the states (i.e. locations, poses and movements) of construction machines to avoid potential collisions and other accidents. Besides tracking the past and the current states of construction machines, evaluating the future states of
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
Financial support for this study from the Hong Kong PhD Fellowship Scheme (HKPFS) to Han LUO and Peter K. Y. WONG is gratefully acknowledged.
References (51)
- et al., Automated 2D detection of construction equipment and workers from site video streams using histograms of oriented gradients and colors, Autom. Constr. (2013)
- et al., Vision-based action recognition of earthmoving equipment using spatio-temporal features and support vector machine classifiers, Adv. Eng. Inform. (2013)
- et al., Skeleton estimation of excavator by detecting its parts, Autom. Constr. (2017)
- et al., Full body pose estimation of construction equipment using computer vision and deep learning techniques, Autom. Constr. (2020)
- et al., Application of RFID technology to prevention of collision accident with heavy equipment, Autom. Constr. (2010)
- et al., Performance evaluation of ultra wideband technology for construction resource location tracking in harsh environments, Autom. Constr. (2011)
- et al., Application of dynamic time warping to the recognition of mixed equipment activities in cycle time measurement, Autom. Constr. (2018)
- et al., Integrating field data and 3D simulation for tower crane activity monitoring and alarming, Autom. Constr. (2012)
- et al., Optimization-based excavator pose estimation using real-time location systems, Autom. Constr. (2015)
- et al., Digging control system for hydraulic excavator, Mechatronics (2001)
- Construction equipment activity recognition for simulation input modeling using mobile sensors and machine learning classifiers, Adv. Eng. Inform.
- Action recognition of earthmoving excavators based on sequential pattern analysis of visual features and operation cycles, Autom. Constr.
- End-to-end vision-based detection, tracking and activity analysis of earthmoving equipment filmed at ground level, Autom. Constr.
- Interaction analysis for vision-based activity identification of earthmoving excavators and dump trucks, Autom. Constr.
- Spatial factors affecting the loading efficiency of excavators, Autom. Constr.
- Times-series data augmentation and deep learning for construction equipment activity recognition, Adv. Eng. Inform.
- National census of fatal occupational injuries in 2017
- Report of Safety Accidents in China's Building Construction Activities in 2017
- Occupational Safety and Health Statistics 2017
- Detecting construction equipment using a region-based fully convolutional network and transfer learning, Journal of Computing in Civil Engineering
- Stacked hourglass networks for markerless pose estimation of articulated construction robots
- Improving Crane Safety by Agent-Based Dynamic Motion Planning Using UWB Real-Time Location System
- Action-agnostic human pose forecasting
- Framework for location data fusion and pose estimation of excavators using stereo vision, J. Comput. Civ. Eng.
- Efficient object identification with passive RFID tags, in: Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Cited by (28)
Automatic Fine-Grained BIM element classification using Multi-Modal deep learning (MMDL)
2024, Advanced Engineering Informatics
Human motion prediction for intelligent construction: A review
2022, Automation in Construction
Citation Excerpt: The variation of construction machine poses can cause interactive on-site safety issues such as struck-by hazards. Luo, et al. [112] used GRU to recognize machine activities considering working patterns and interaction characteristics to predict future machine poses. Sadeghian, et al. [109] combined deep neural network features from the scene semantic segmentation model and GAN using attention to model human trajectory.
Excavator joint node-based pose estimation using lightweight fully convolutional network
2022, Automation in Construction
Citation Excerpt: Inspired by human pose estimation, Liang et al. [40] proposed a deep learning method for excavator pose estimation based on a stacked hourglass network; the network parameters employed reached 6.4 million with a network computation of 6.67 G floating operations per second (FLOPs, when the input size was 384 × 256 × 3). Luo et al. proposed an excavator pose estimation method based on a modified stacked hourglass network with a cascaded pyramid network (CPN) [5] and used gated recurrent units to predict time-series excavator poses from historical motions [6], where the network parameters employed reached 27.1 Million with a network computation of 6.2 Giga FLOPs (when input size was 384 × 256 × 3). To solve the problem of the high cost of training data acquisition and annotation, researchers have begun to use virtual techniques to synthesize images for training dataset construction.
Feature-based sensor configuration and working-stage recognition of wheel loader
2022, Automation in Construction
Citation Excerpt: J. Kim et al. [16] combined the sequence mode with the action recognition of earthmoving excavators based on visual features, the action recognition accuracy can be up to 93.8%, which effectively assists in the analysis of the cycle time and the automation of the monitoring of excavator productivity. H. Luo et al. [17] further developed the gating recurrent unit and the machine action recognition method based on key points, the prediction framework of historical motion data, the attitude activity attributes of construction machinery is established to predict the future attitude. Because the attitude recognition based on machine vision changes over time, J. Yoon et al. [18] improved the loading efficiency of the excavator and dump truck by introducing spatial factors and studying the height difference, distance, and rotation angle between them.