3DSMDA-Net: An improved 3DCNN with separable structure and multi-dimensional attention for welding status recognition
Introduction
As arc welding is an important metal joining method, ensuring the quality of the weld is critical to improving the reliability of a workpiece [1]. Once the welding materials and welding method have been determined, controlling the welding process becomes the key to ensuring consistent welding quality [2]. A traditional welding workflow optimizes the craft through CAE simulation analysis before welding and destructive or non-destructive testing after welding [3], [4], [5], [6], [7]. However, this mode not only wastes resources but also lacks interaction with the welding process. In fact, skilled welders can dynamically adjust their welding craft by observing the welding process, such as the globule transition mode and the shape of the molten pool. Therefore, monitoring the welding status based on visual sensing and adjusting the welding craft through feedback is an effective way to improve welding quality [8], [9], [10]. In a vision-based welding status monitoring process, the common monitoring objects are the molten pool shape [11], the penetration degree [12] and the globule transition mode [13].
A typical robotic welding system is shown in Fig. 1. The controller is responsible for sending cooperative welding instructions to the robot and the positioner. After receiving a welding instruction, the wire feeding mechanism supplies welding wire to sustain the welding progress. In addition, to maintain a good welding environment, shielding gas is generally delivered to the base metal area. After the arc is ignited, a high temperature is generated and the welding wire melts. Under the action of gravity, the globule transfers to the base metal. Under the high temperature of the arc and the globule, the base metal melts; this melted area is the molten pool. After the molten pool cools, it solidifies into a weld bead that joins the base metal. Therefore, the globule transition state, molten pool shape and penetration degree can directly reflect the welding quality. In our experiment, a CCD camera was fixed on the front of the robotic arm and moved with it to capture the welding process in real time. Vision-based welding status recognition (WSR) can be regarded as a pattern classification task. For welding images, such a task faces the following challenges due to the characteristics of the welding process. Firstly, arc welding is accompanied by strong arc light and smoke interference, which makes it difficult for industrial cameras to obtain clear images of the welding process [14]. Furthermore, vibrations during welding cause motion blur in the captured images [15,16]. The interference and blurring directly limit the acquisition of high-quality weld images. Secondly, the differences among different classes of welding images are small, while the differences within one class are large. Therefore, the most discriminative features must be extracted to effectively identify the welding status [17].
Thirdly, in the context of the deep integration of information technology and manufacturing, a robotic welding system is equipped with multi-source perception modules (visual, acoustic, spectral and electrical signals, etc.) on top of the traditional modules (motion, craft, wire feeding, etc.). Such a system therefore produces a massive amount of real-time data, and the traditional cloud-based centralized computing model has gradually shifted to the edge computing model [18]. As a result, the storage and computing resources that can be allocated to the vision module in a welding edge node are very limited. Among the above three challenges, low image quality and subtle inter-class differences severely limit the accuracy of welding image pattern recognition, while the limited storage and computing resources place stringent lightweight requirements on the vision-based WSR model.
In reality, skilled welders rely not only on the current status but also on the previous status of welding to make judgements. In practical welding tests, we obtained the three types of globule transitions shown in Fig. 2 by changing the welding process. It is difficult to distinguish the three images at a single time ti. However, we can easily make a judgement by using the information of the images over ti-7 ∼ ti.
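The observation above suggests classifying each frame from the window of frames that precedes it. As a minimal sketch (the function name and window length handling are our own; the paper fixes the window to 8 frames), a sliding window over a frame stream can be built as follows, labeling each window by its last frame:

```python
from collections import deque

def sliding_windows(frames, window=8):
    """Yield (i, [frame_{i-window+1} .. frame_i]) for each full window.

    Each window is labeled by its last frame at index i, so a classifier
    judging the status at time t_i sees frames t_{i-7} .. t_i.
    """
    buf = deque(maxlen=window)
    for i, frame in enumerate(frames):
        buf.append(frame)
        if len(buf) == window:
            yield i, list(buf)

# Example: 10 dummy frames produce windows ending at t_7, t_8, t_9.
ends = [i for i, _ in sliding_windows(range(10), window=8)]
# ends == [7, 8, 9]
```

Consecutive windows overlap by 7 frames, which also acts as an indirect form of data augmentation, as noted in the conclusion.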
Previous studies on vision-based WSR have mainly focused on a single image, as shown in Fig. 3. To remedy this, our study considers the temporal correlation of a welding process, as illustrated in Fig. 2. However, DL models that incorporate temporal information are typically large, which poses a challenge for computation and storage at the edge. To meet the requirements of a welding system, we therefore also take the lightweightness of time sequence models into account.
Motivated by the above issues, we propose an improved three-dimensional convolutional neural network (3DCNN) with a separable structure and multi-dimensional attention (3DSMDA-Net) for WSR. To improve the accuracy of WSR, the proposed method uses Resnet18 [19] as the backbone network and 3DCNN to adaptively extract the complex spatiotemporal features contained in a welding process. Considering that 3DCNN models have a large number of parameters and are thus difficult to deploy at the edge, we further propose a 3DCNN-oriented separation method that lightens the model and alleviates the storage and computation pressure at the edge. The authors of [20] visualized the decision-making basis of a network through an explainable method, and their results show that well-designed network structures can accurately locate the target area in an image. Therefore, to compensate for the loss of accuracy caused by the separation operation, we incorporate a multi-dimensional attention mechanism (MDA) to explicitly model this capability according to the characteristics of the separation operation. To the best of our knowledge, this is the first work to address both time sequence information and model lightweightness in WSR.
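The exact separation scheme is defined in Section 3, which this snippet omits. A common way to separate a 3D convolution kernel, as in (2+1)D-style designs, is to factorize each k×k×k kernel into a spatial 1×k×k stage followed by a temporal k×1×1 stage. Assuming that form and the 64-channel width used in the backbone, the parameter saving can be counted directly:

```python
def conv3d_params(c_in, c_out, kt, kh, kw):
    # Weight count of one 3D convolution layer (bias terms omitted).
    return c_in * c_out * kt * kh * kw

c = 64  # channel width, matching the backbone described in Section 3
full = conv3d_params(c, c, 3, 3, 3)           # standard 3x3x3 kernel
separated = (conv3d_params(c, c, 1, 3, 3)     # spatial 1x3x3 stage
             + conv3d_params(c, c, 3, 1, 1))  # temporal 3x1x1 stage
# Per filter pair: 27 weights vs. 9 + 3 = 12, a 2.25x reduction.
ratio = full / separated  # 2.25
```

The same factorization also inserts an extra nonlinearity between the spatial and temporal stages, which is one reason such separations can trade a small accuracy loss for a large parameter saving.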
In summary, the contributions of this paper are as follows: 1. We incorporate historical information into deep learning (DL)-based WSR to enhance recognition accuracy; 2. We propose a 3DCNN-oriented convolution kernel separation method as a lightweight time sequence model; and 3. We propose a multi-dimensional attention mechanism that reduces the loss of accuracy caused by the separation operation. This is achieved by exploiting the characteristics of the separation operation without adding extra parameters.
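The actual MDA design appears in Section 3, which is not included in this snippet. Purely as an illustration of contribution 3 — attention over multiple tensor dimensions with zero added parameters — one can reweight slices of a C×T×H×W feature tensor by a softmax over their own mean activations (all names and the rescaling choice below are our own assumptions, not the paper's definition):

```python
import numpy as np

def axis_attention(x, axis):
    """Parameter-free attention along one axis: weight each slice by a
    softmax over its mean activation, then rescale by the axis length
    so the overall magnitude is roughly preserved. No learned weights."""
    other = tuple(a for a in range(x.ndim) if a != axis)
    energy = x.mean(axis=other)            # one scalar per slice
    w = np.exp(energy - energy.max())      # numerically stable softmax
    w /= w.sum()
    shape = [1] * x.ndim
    shape[axis] = -1
    return x * w.reshape(shape) * x.shape[axis]

def multi_dim_attention(feat):
    """Apply the attention along channel (0), temporal (1) and both
    spatial axes (2, 3) of a C x T x H x W tensor in turn."""
    for axis in range(feat.ndim):
        feat = axis_attention(feat, axis)
    return feat

feat = np.random.rand(4, 8, 16, 16)  # toy C x T x H x W tensor
out = multi_dim_attention(feat)       # same shape, no extra parameters
```

Because the weights are computed from the features themselves, the mechanism adds no parameters to the model, matching the constraint stated in contribution 3.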
The rest of this paper is organized as follows: Section 2 reviews the related work on vision-based WSR methods, DL-based sequence image recognition methods, and DL-oriented model lightweight methods. Section 3 presents our design of the overall architecture of the 3DSMDA-Net, the structure of 3DCNN with separable (3DS) operation, and the MDA mechanism for the lightweight method. Section 4 describes the experimental setup, followed by reporting numerical experiments of the proposed method on our self-built dataset and public dataset in Section 5. Finally, Section 6 concludes this paper and makes recommendations for future work.
Section snippets
Related work
In this section, we first review vision-based WSR methods and then summarize DL-based time sequence image pattern recognition methods. Because a robotic welding system has many components, and a welding status recognition model that incorporates time sequence information challenges the storage and computation capacity of welding edge nodes, we finally review lightweight methods for DL models.
The general framework of 3DSMDA-Net
Our general framework of the 3DSMDA-Net for WSR is illustrated in Fig. 4. In particular, 3DSMDA-Net uses the classical Resnet18 as its backbone network. The input is an image sequence with a frame number of 8, and the output is the label distribution of the image at the end of the sequence. The size of each image frame is 64*64*1 (width, height, and channel). This input first goes through a classical 3DCNN with a stride of 1 to produce a feature tensor of size 64*64*64*8. Then four 3DCNN with 3DS
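The shape bookkeeping of that first layer can be checked with the standard 3D convolution output formula (kernel size 3 and padding 1 are assumptions consistent with the "same"-size output stated above; the paper only specifies stride 1 and 64 output channels):

```python
def conv3d_out_shape(in_shape, out_channels, kernel=3, stride=1, padding=1):
    """Output shape (C, T, H, W) of a 3D convolution applied to a
    (C, T, H, W) input, using the standard size formula."""
    _, t, h, w = in_shape
    dim = lambda n: (n + 2 * padding - kernel) // stride + 1
    return (out_channels, dim(t), dim(h), dim(w))

# Input: 8 frames of 64x64x1 -> tensor (1, 8, 64, 64) in C,T,H,W order.
# First 3DCNN layer (stride 1, padding 1, 64 filters):
shape = conv3d_out_shape((1, 8, 64, 64), out_channels=64)
# -> (64, 8, 64, 64), i.e. the 64*64*64*8 feature tensor in the text
```

With stride 1 and padding 1, the spatial and temporal extents are preserved, so only the channel dimension grows from 1 to 64.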
Datasets
In our experiments, we use a self-built dataset and a public one. We construct a globule image dataset (GID) with three transition types: streaming transfer (ST), projected transfer (PT) and short circuit transfer (SCT). The data statistics of GID are described in Table 1, and examples of the GID dataset are shown in Fig. 9. For the public dataset, we use the molten pool image dataset SS304 published in [31] to further verify the performance of the proposed method. The data statistics and samples of
Performance evaluation
The loss and accuracies of the proposed method during training on GID are plotted in Fig. 11. All models are trained for 3 epochs. It can be seen from Fig. 10 that both 3DSMDA-Net and 3DCNN converge to near 1 after about 2200 iterations. The 3DS and CNN-LSTM methods converge to 0.985 after about 2600 iterations. The 2DCNN without time sequence information converges to only 0.945.
We use the validation set to test the performance of the proposed method. In particular, the performance of the proposed
Conclusion and future work
Inspired by how skilled welders observe a welding process, we introduced and made use of time sequence information in WSR in this paper. The use of time sequence information not only indirectly augments the data, but also strengthens the basis on which the model judges the current welding status. In terms of accuracy-related metrics, the MDA enhances the spatiotemporal and channel features learned by the model, and 3DSMDA-Net achieves results comparable to 3DCNN on a
Declaration of Competing Interest
The authors report no declarations of interest.
Acknowledgments
This work was supported by “the Fundamental Research Funds for the Central Universities and Graduate Student Innovation Fund of Donghua University” under Grant No. CUSF-DH-D-2020053.
References (52)
- A review on wire arc additive manufacturing: monitoring, control and a framework of automated system. J Manuf Syst (2020)
- A fundamental study on qualitatively viable sustainable welding process maps. J Manuf Syst (2018)
- An adaptive Bernstein-Bézier finite element method for heat transfer analysis in welding. Adv Eng Softw (2020)
- A learning-based approach for surface defect detection using small image datasets. Neurocomputing (2020)
- Microstructure and mechanical properties of A-TIG welded AISI 316L SS-Alloy 800 dissimilar metal joint. Mat Sci Eng A-Struct (2020)
- WeldANA: Welding decision support tool for conceptual design. J Manuf Syst (2019)
- Research evolution on intelligentized technologies for arc welding process. J Manuf Process (2014)
- Monitoring of welding status by molten pool morphology during high-power disk laser welding. Optik (2015)
- Dynamic features of plasma plume and molten pool in laser lap welding based on image monitoring and processing techniques. Opt Laser Technol (2019)
- Detecting dynamic development of weld pool using machine learning from innovative composite images for adaptive welding. J Manuf Process (2020)
- In-situ monitoring of the penetration status of keyhole laser welding by using a support vector machine with interaction time conditioned keyhole behaviors. Opt Laser Eng
- Effects of arc bubble behaviors and characteristics on droplet transfer in underwater wet welding using in-situ imaging method. Mater Des
- A robust weld seam recognition method under heavy noise based on structured-light vision. Robot Cim-Int Manuf
- A vision-based method for crack detection in gusset plate welded joints of steel bridges using deep convolutional neural networks. Automat Constr
- Vibration test on welding robot. Procedia Comput Sci
- Intelligent welding system technologies: state-of-the-art review and perspectives. J Manuf Syst
- An adaptive-network-based fuzzy inference system for classification of welding defects. NDT&E Int
- Automatic visual monitoring of welding procedure in stainless steel kegs. Opt Laser Eng
- Weld image deep learning-based on-line defects detection using convolutional neural networks for Al alloy in robotic arc welding. J Manuf Process
- Real-time penetration state monitoring using convolutional neural network for laser welding of tailor rolled blanks. J Manuf Syst
- Welding defects detection based on deep learning with multiple optical sensors during disk laser welding of thick plates. J Manuf Syst
- Online defect recognition of narrow overlap weld based on two-stage recognition model combining continuous wavelet transform and convolutional neural network. Comput Ind
- Automated defect classification of SS304 TIG welding process using visible spectrum camera and machine learning. NDT&E Int
- Application of sensing techniques and artificial intelligence-based methods to laser welding real-time monitoring: a critical review of recent literature. J Manuf Syst
- Deep learning-empowered digital twin for visualized weld joint growth monitoring and penetration control. J Manuf Syst
- 3D separable convolutional neural network for dynamic hand gesture recognition. Neurocomputing