Incorporating domain knowledge into reinforcement learning to expedite welding sequence optimization

https://doi.org/10.1016/j.engappai.2020.103612

Highlights

  • Welding sequence optimization using AI and ML techniques enables Industry 4.0.

  • Welding sequence optimization reduces structural deformation.

  • Domain knowledge lessens the computational complexity of a combinatorial optimization problem.

  • DKQRL is compared with other techniques with experimental validation.

Abstract

Welding Sequence Optimization (WSO) is very effective at minimizing structural deformation; however, selecting a proper welding sequence leads to a combinatorial optimization problem. State-of-the-art algorithms can take more than one week to compute the best sequence for an assembly of eight weld beads, which is unrealistic for the early stages of the Product Delivery Process (PDP). In this article, we develop and implement a novel Q-learning-based Reinforcement Learning (RL) algorithm for WSO in which structural deformation is used to compute the reward function. We use a thermo-mechanical Finite Element Analysis (FEA) to predict deformation. We tackle the exploration–exploitation dilemma by incorporating a domain-knowledge-driven ε-greedy algorithm into Q-learning-based RL (Q-RL), which helps to expedite WSO; we call this novel algorithm DKQRL. We run welding simulation experiments using the well-known Simufact® software on a typical, widely used mounting bracket that contains eight weld beads. DKQRL reduces structural deformation by up to 71% and substantially reduces computational time compared with Modified Lowest Cost Search (MLCS), Genetic Algorithm (GA), exhaustive search, and the standard RL algorithm. The welding simulation results show reasonable agreement with a real experiment in terms of structural deformation.

Introduction

Gas Metal Arc Welding (GMAW) is the most common technique for joining metal components. It is preferred for its versatility, speed, and relative ease of robotic automation, and it is extensively used in the automotive, shipbuilding, aerospace, construction, and heavy and earth-moving equipment industries (Masubuchi, 1980, Islam et al., 2014). However, structural deformation due to welding is a serious concern for industry since it incurs various additional costs such as constraints in the design phase, extra operations, cost of quality, and overall capital expenditure (Goldak and Akhlaghi, 2005). WSO is a highly cost-effective way to reduce welding-induced structural deformation significantly. The ad hoc industry practice is to select the best sequence by experience and sometimes to conduct a simplified design of experiments, which typically leads to a sequence that generates considerably more structural deformation than the optimal one (Biswas et al., 2011). Finding a better welding sequence in this way requires a large number of real welding experiments, which is both very expensive and time consuming. To alleviate this problem, the structural deformation caused by welding is predicted with welding simulation software based on Finite Element Analysis (FEA). Thermo-mechanical models are commonly used, and FEA achieves reasonable solutions for numerous welding conditions and geometric configurations (Tikhomirov et al., 2005). There are three different FEA-based models: (a) simplified: fast but less accurate; (b) thermo-mechanical: medium complexity with reasonable solutions; and (c) thermo-mechanical-metallurgical: computationally very expensive and time consuming but highly accurate (Islam et al., 2014).

Selecting the optimal welding sequence, i.e. the one that produces the least deformation, leads to a combinatorial optimization problem that is NP-hard by nature (Papadimitriou and Steiglitz, 1982). WSO can be mapped to the Traveling Salesman (TS) problem, which is very popular in Operations Research (OR). The TS problem can be described as follows: given a list of cities and the distances between each pair of cities, find the shortest possible route that visits each city exactly once and returns to the origin city. In a similar fashion, WSO can be described as follows: given a list of weld seams to be placed, along with all possible welding directions, find the welding sequence that produces the least structural deformation. The best welding sequence can certainly be found by executing the full factorial design of experiments. The total number of welding configurations for the full factorial design is N = n^r × r!, where n and r are the number of welding directions and the number of beads (seams or segments), respectively. This number grows faster than exponentially with the number of weld beads. For example, a complex weldment such as an aero-engine assembly might have 52–64 weld segments (Jackson and Darlington, 2011). Hence, the full factorial design is often practically infeasible for industrial applications, even using FEA, at the early stages of the Product Delivery Process (PDP) (Romero-Hdz et al., 2017).

To succeed in the rapidly evolving global manufacturing landscape, there is a pressing need to increase competitiveness in the welding industry. Moreover, quality and efficiency are the main drivers. Mega-trends such as the Internet of Things (IoT) and Industry 4.0, as well as the development and usage of advanced materials, will therefore be critical to future competitiveness (Lindgren, 2007). Process simulation enables the implementation of Artificial Intelligence (AI) and Machine Learning (ML) techniques, because these usually require a great amount of process output data, and “time to market” and “Do It Right The First Time” are pushing the industry to exploit virtual tools. Fig. 1 illustrates the deformation problem and the AI framework, in which a coupled FEA-AI virtual tool, rather than real experiments, is used to control the amount of deformation, keep the Geometric Dimensioning and Tolerancing (GD&T) features within tolerance, and ensure assemblability.
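To make the size of the full factorial design concrete, the short sketch below evaluates N = n^r × r! and converts a number of configurations into simulation time. The function names are illustrative only; the 30 min per simulation and the 40 evaluations are the figures quoted in the contributions of this article.

```python
from math import factorial

def full_factorial_count(n_directions: int, n_beads: int) -> int:
    """N = n^r * r!: r beads in any of r! orders, each with one of n directions."""
    return n_directions ** n_beads * factorial(n_beads)

MINUTES_PER_SIMULATION = 30  # average FEA run time reported in the article

def simulation_hours(n_configurations: int) -> float:
    """Total simulation time in hours if every configuration is run once."""
    return n_configurations * MINUTES_PER_SIMULATION / 60

exhaustive = full_factorial_count(n_directions=2, n_beads=8)
print(exhaustive)                    # 10321920
print(simulation_hours(exhaustive))  # 5160960.0 hours, i.e. several centuries
print(simulation_hours(40))          # 20.0 hours for 40 configurations
```

The count for eight beads and two directions matches the 10,321,920 configurations stated for the mounting bracket.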

In this research, we present a novel and efficient Reinforcement Learning (RL) algorithm for Welding Sequence Optimization (WSO) that improves weld quality, in which structural deformation is used to compute the reward function. We use thermo-mechanical FEA modeling to predict welding deformation. RL, in the context of AI, is a type of dynamic programming in which the agent makes decisions over time to maximize its reward and minimize its penalty. In the welding context, the agent is rewarded if the sequence (action) taken minimizes the overall structural deformation. The advantage of this approach is that it allows an AI program to learn without a programmer spelling out how the agent should perform the task. The agent learns in an interactive environment by trial and error, using feedback from its own actions and experiences (Sutton and Barto, 1998). In supervised learning, the feedback provided to the agent is the correct set of actions for a task, or an explicit description of how to perform it. RL, by contrast, learns without human intervention by using rewards and punishments as signals for positive and negative behavior, i.e., the agent receives rewards for performing correctly and penalties for performing incorrectly. Likewise, while the goal in unsupervised learning is to find similarities and differences between data points, the goal in reinforcement learning is to find a good behavior, a suitable action model or a label for each particular situation that maximizes the long-term benefit (cumulative reward) that the agent receives. RL algorithms have been used extensively in fields such as gaming, neuroscience, psychology, economics, engineering communications, engineering power systems, and robotics (Sutton and Barto, 1998).
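For reference, the textbook tabular Q-learning update (Sutton and Barto, 1998) is sketched below. The notation is the standard one and is an assumption here, since this section does not give the paper's own symbols: s_t and a_t are the state and action at step t, r_{t+1} is the reward received afterwards, α is the learning rate, and γ is the discount factor. In the welding setting described above, the reward would be derived from the deformation predicted by FEA.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```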

Here, we make the following technical contributions:

(A) Lessen computational complexity of a combinatorial optimization problem: We incorporate domain knowledge into the Q-learning algorithm to expedite convergence, and we call the result “DKQRL”. The proposed DKQRL algorithm markedly reduces the computational effort compared with exhaustive search. We conducted the experiment on a mounting bracket that includes eight weld seams, each of which can be welded in two directions. In this scenario, the total number of welding configurations for exhaustive search is 10,321,920, whereas DKQRL converges after 40 welding configurations. The average execution time for each welding configuration using FEA simulation software is 30 min, so the reduction in computational time is considerable.

(B) Solve the Exploration–Exploitation Dilemma of RL through Domain Knowledge: An RL algorithm can be accelerated, and its convergence ensured, by suitably balancing exploration and exploitation at each stage. According to welding domain experts, it is advisable to weld the bead near the Center of Mass (CM) first to lessen the structural deformation due to welding. In the first step of the DKQRL algorithm, if the weld seam near the CM causes minimum deformation, we allow more exploration than exploitation throughout the process. In addition, once one bead of each part of the system has been welded, the rigidity of the whole system increases, which allows more exploration since high rigidity resists structural deformation (Park and An, 2016). Domain knowledge thus controls the ratio of exploration to exploitation throughout the RL algorithm and hence expedites WSO.

(C) Traveling Salesman Problem and Welding Sequence Optimization: We cast WSO as a TS problem. The TS problem consists of visiting each city only once with minimum cost. Similarly, WSO consists of welding each seam only once. As soon as one bead is welded, we remove the bead from the set of allowable states and the corresponding welding directions from the set of allowable actions. Mapping WSO to TS facilitates implementing RL in WSO and provides a realistic solution to the combinatorial optimization problem.

(D) State-of-the-art performance: We conducted the simulation experiment of Gas Metal Arc Welding (GMAW) with the well-known welding simulation software Simufact®. Each welding configuration took 30 min on average on a workstation with two Intel® Xeon® processors @2.40 GHz, 48 GB of RAM and 4 GB of dedicated video memory. The study case is a typical mounting bracket that is widely used in telescopic jibs (Derlukiewicz and Przybyłek, 2008) and in the automotive industry (Subbiah et al., 2011, Romeo et al., 2016). We validated the simulation results through a real shop-floor welding experiment. The results showed high agreement between simulation and real experiment in terms of structural deformation. They also showed that the best welding sequence reduces structural deformation significantly (by 71%) relative to the worst sequence. The DKQRL-based approach substantially reduces computational time compared with standard RL, Genetic Algorithm (GA) and exhaustive search.
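The TS-style mapping in (C) can be sketched as tabular Q-learning in which a welded bead is removed from the allowable actions. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_deformation` is a made-up stand-in for an FEA run, the problem has four beads instead of eight, ε is held fixed rather than set by domain knowledge, and all parameter values are hypothetical.

```python
import random

BEADS = (0, 1, 2, 3)   # toy case with 4 beads (the bracket in the article has 8)
DIRECTIONS = (0, 1)    # two welding directions per bead

def toy_deformation(sequence):
    """Hypothetical stand-in for an FEA simulation; the values are made up."""
    return sum((pos + 1) * (bead + 1) * (1.0 + 0.1 * direction)
               for pos, (bead, direction) in enumerate(sequence))

def allowed_actions(sequence):
    """(bead, direction) pairs for beads not welded yet: each seam is welded once."""
    welded = {bead for bead, _ in sequence}
    return [(b, d) for b in BEADS if b not in welded for d in DIRECTIONS]

def q_learning(episodes=300, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {}  # maps (state, action) to a value; the state is the sequence so far
    best_seq, best_def = None, float("inf")
    for _ in range(episodes):
        state = ()
        while len(state) < len(BEADS):
            actions = allowed_actions(state)
            if rng.random() < epsilon:   # explore
                action = rng.choice(actions)
            else:                        # exploit
                action = max(actions, key=lambda a: q.get((state, a), 0.0))
            next_state = state + (action,)
            done = len(next_state) == len(BEADS)
            # Reward only at the end of the sequence: negative deformation.
            reward = -toy_deformation(next_state) if done else 0.0
            future = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in allowed_actions(next_state))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            state = next_state
        deformation = toy_deformation(state)
        if deformation < best_def:
            best_seq, best_def = state, deformation
    return best_seq, best_def

best_seq, best_def = q_learning()
print(best_seq, best_def)
```

Because every complete sequence costs one simulation, the number of episodes is the quantity that a better exploration strategy would aim to reduce.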

The paper is organized as follows. Section 2 presents the literature review. The proposed domain-knowledge-driven reinforcement learning algorithm is presented in Section 3. Results are presented in Section 4. Section 5 concludes this work. Relevant references are listed at the end of the paper.

Literature review

The literature review is organized into three parts. First, we summarize state-of-the-art optimization techniques implemented in fields related to welding that can be used for further research in WSO, such as manufacturing process parameter optimization and mechanical and structural design optimization. Second, we describe Q-learning and RL approaches. Finally, we describe the domain knowledge for WSO.

Methodology

Here we present a novel RL algorithm in which the welding domain knowledge discussed in the previous section is incorporated to accelerate WSO by resolving the exploration–exploitation dilemma through an adapted ε-greedy algorithm. In this section, we first outline the optimization framework, then describe the implementation of the welding domain knowledge exploited in this study, and lastly detail the proposed DKQRL tailored for WSO.
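One way to picture a domain-knowledge-driven ε-greedy step is sketched below. The rule and the numeric values are assumptions made for illustration; this excerpt does not specify how DKQRL actually sets ε. The two boolean cues correspond to the domain knowledge described in the introduction: whether the seam near the center of mass gave the minimum deformation, and whether every part already carries one weld.

```python
import random

# Hypothetical exploration rates; the article excerpt gives no numeric values.
EPS_HIGH, EPS_LOW = 0.6, 0.2

def domain_epsilon(cm_bead_gave_min_deformation: bool,
                   every_part_has_a_weld: bool) -> float:
    """Assumed rule: explore more when either domain cue holds."""
    if cm_bead_gave_min_deformation or every_part_has_a_weld:
        return EPS_HIGH
    return EPS_LOW

def epsilon_greedy(q_values: dict, epsilon: float, rng: random.Random):
    """Pick a random action with probability epsilon, else the highest-valued one."""
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=q_values.get)

rng = random.Random(0)
eps = domain_epsilon(cm_bead_gave_min_deformation=True, every_part_has_a_weld=False)
print(eps, epsilon_greedy({"bead_3_forward": -1.2, "bead_5_reverse": -0.8}, eps, rng))
```

Keeping the ε rule in its own function separates the welding-specific heuristics from the generic action selection, so either can be changed without touching the other.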

Experimental results and discussions

In this section, first we illustrate the study case. Then, we discuss FEA based simulation experiment conducted for welding deformation prediction. Subsequently, we illustrate the results of the FEA for the best and worst sequence found by the proposed DKQRL algorithm. After that, we demonstrate the effects of welding sequence on WSO. Next, we demonstrate a comparative study among Modified Lowest Cost Search (MLCS) (Romero-Hdz et al., 2016a, Romero-Hdz et al., 2016b), single objective Genetic

Conclusions and future work

Welding sequence optimization has a considerable effect on structural deformation. In this study, the maximum structural deformation is exploited as the Q-function of the RL algorithm for WSO. RL significantly reduces the search space over exhaustive search. We incorporated domain knowledge and expedited the RL algorithm for WSO by resolving the exploration–exploitation dilemma. Welding simulation software was used to compute the structural deformation using FEA. Proposed DKQRL algorithm for

CRediT authorship contribution statement

Baidya Nath Saha: Formal Analysis. Seiichiro Tsutsumi: Project administration. Riccardo Fincato: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors gratefully acknowledge the support provided by Osaka University through the JWRI in Japan and The National Council of Science and Technology of Mexico through CIDESI and CIMAT.

References (41)

  • Goldak, J., 2013. Web based simulation of welding and welded structures. In: CWA Conference 2013, p....
  • Goldak, J., et al., 2005. Computational Welding Mechanics.
  • Jackson, K., et al., 2011. Advanced engineering methods for assessing welding distortion in aero-engine assemblies. IOP Conf. Ser.: Mater. Sci. Eng.
  • Kim, H.-J., et al., 2005. Scheduling for an arc-welding robot considering heat-caused distortion. J. Oper. Res. Soc.
  • Kim, K.-Y., et al., 2002. Robot arc welding task sequencing using genetic algorithms. IIE Trans.
  • Lindgren, L., 2007. Computational Welding Mechanics: Thermomechanical and Microstructural Simulations.
  • Masubuchi, K., 1980. Analysis of Welded Structures: Residual Stresses, Distortion, and Their Consequences. International Series on Materials Science and Technology.
  • Okumoto, Y., et al., 2007. Optimization of welding route by automatic machine using reinforcement learning method. J. Japan Soc. Naval Archit. Ocean Eng.
  • Papadimitriou, C.H., et al., 1982. Combinatorial Optimization: Algorithms and Complexity.
  • Park, J.-U., et al., 2016. Effect of welding sequence to minimize fillet welding distortion in a ship’s small component fabrication using joint rigidity method. Proc. Inst. Mech. Eng. B.