Multistage MR-CART: Multiresponse optimization in a multistage process using a classification and regression tree method
Introduction
A multistage process that consists of a series of sequential stages is a common structure in manufacturing lines. Most manufacturing industries require several stages to complete their final products, such as semiconductor, printed circuit board, chemical and telecommunication manufacturing processes (Pan, Li, & Wu, 2016). Fig. 1 describes a representative multistage manufacturing process consisting of stages. The rectangles represent stages, and each stage is followed by its inspection stage shown as a circle. The raw materials enter Stage 1 and become a final product through the stages. and in Fig. 1 are vectors of input variables and response variables, respectively, at the th stage.
In the multistage process, each stage has multiple responses and is affected by its preceding stage while also affecting the following stage. Several methods have been proposed to solve the multiresponse problem in the multistage process. Mukherjee and Ray (2008) employed desirability functions for optimizing multiple responses in a two-stage process. In this approach, empirical models for the multiple responses are fitted, and desirability functions are constructed from the empirical models. Then, the optimal condition for the input variables is obtained by maximizing the desirability functions. To search for the optimal condition, several metaheuristics such as genetic algorithms, simulated annealing, and tabu search were employed. Later, Bera and Mukherjee (2016) extended the method of Mukherjee and Ray (2008) from the two-stage process to the multistage process. Hejazi, Seyyed-Esfahani, and Mahootchi (2015) suggested a mathematical programming method and a metaheuristic algorithm using iterative seemingly unrelated regression for optimizing the multiple responses in the multistage process. Recently, Yin, He, Niu, and Li (2018) suggested a method for optimizing a coal preparation production system, which is a particular multistage process. In this method, a forward iterative modeling method based on support vector regression was presented to consider the interdependency between neighboring stages. In addition, a goal-oriented, backward iterative optimization approach based on a genetic algorithm was proposed to determine the globally optimal operating conditions of the coal preparation system.
The above methods all build empirical models for the multiple responses and obtain the optimal setting for the input variables based on these models. Although these approaches are attractive, they require a large number of experiments to build the empirical models. In the multistage process, there are various relationships between stages, and between the input variables and responses, that should be investigated for the optimization. A large number of experiments must be conducted to build empirical models that explain these relationships, which requires large amounts of resources (time, material, machine, etc.).
Alternatively, process operational data gathered from manufacturing lines can be used instead of conducting a large number of experiments. Recently, many manufacturing companies have been able to obtain a large volume and variety of operational data from their manufacturing lines thanks to network sensors and the IoT (Internet of Things). This large and varied body of operational data may contain meaningful information. Data mining methods are attractive when dealing with a large volume and variety of operational data. Classification and regression tree (CART) and patient rule induction method (PRIM) are representative data mining methods applicable to process optimization. Recently, Lee, Kim, Kim, Kim, and Zhen (2021) suggested applying CART for optimizing multiple responses in a single-stage process. However, no existing method has employed CART to optimize the multistage process. The proposed method extends the scope of CART-based optimization from a single stage to multiple stages by considering the relationship between stages. For this purpose, a backward sequential optimization procedure is suggested. In this procedure, optimization is conducted sequentially from the last stage to the first stage. Additionally, the proposed method employs the desirability function method (Derringer & Suich, 1980) as the objective function of CART for simultaneously optimizing the multiple responses. For each stage, the proposed method obtains the subregions of the input variable space where a high desirability function value is obtained.
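The desirability function method mentioned above can be illustrated with a minimal sketch. It assumes the standard one-sided forms of Derringer and Suich (1980) and combines the individual desirabilities through their geometric mean. The function names, bounds, and response values are hypothetical and chosen only for illustration; they are not taken from the paper's case study.

```python
import math

def desirability_larger_is_better(y, low, target, weight=1.0):
    """One-sided desirability: 0 at or below `low`, 1 at or above `target`."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight

def desirability_smaller_is_better(y, target, high, weight=1.0):
    """One-sided desirability: 1 at or below `target`, 0 at or above `high`."""
    if y <= target:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - target)) ** weight

def overall_desirability(individual):
    """Geometric mean of the individual desirabilities (0 if any is 0)."""
    if any(d == 0.0 for d in individual):
        return 0.0
    return math.exp(sum(math.log(d) for d in individual) / len(individual))

# Hypothetical two-response example (values are illustrative assumptions).
d1 = desirability_larger_is_better(75.0, low=50.0, target=100.0)
d2 = desirability_smaller_is_better(4.0, target=2.0, high=10.0)
overall = overall_desirability([d1, d2])
```

Because the geometric mean is zero whenever any individual desirability is zero, a single unacceptable response makes the whole setting unacceptable, which is why this aggregate is a natural single objective for a tree-based search.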
The rest of the paper is organized as follows. Section 2.1 reviews CART, which is employed in the proposed method, and compares it with PRIM. Section 2.2 reviews the desirability function method, which is also employed in the proposed method. The proposed method is presented in Section 3 and is illustrated with a case study in Section 4. Finally, a discussion and concluding remarks are given in Section 5.
Section snippets
Review of CART and PRIM
In this section, we review CART and PRIM, which are applicable to process optimization. CART was first introduced by Breiman et al. (1984). It is a binary recursive partitioning procedure that finds the subregion in the input variable space where the performance of the response is considerably better. When the response variable is nominal, the result is a classification tree; when it is continuous, the result is a regression tree. CART has the advantage of being able to process various types of data (Lee, Jeong, & Lee, 2016).
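A single step of binary recursive partitioning for a continuous response can be sketched as follows. This is a generic regression-tree split search that minimizes the within-node sum of squared errors; it is not the modified CART algorithm of the proposed method, and the function names and toy data are assumptions made for illustration.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty node)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(rows, responses):
    """Search every feature and midpoint threshold for the binary split
    with the smallest total SSE; returns (feature_index, threshold, sse)."""
    best = None
    for j in range(len(rows[0])):
        levels = sorted({row[j] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(levels, levels[1:]):
            threshold = (lo + hi) / 2.0
            left = [y for row, y in zip(rows, responses) if row[j] <= threshold]
            right = [y for row, y in zip(rows, responses) if row[j] > threshold]
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (j, threshold, total)
    return best

# Toy data: two input variables and one continuous response per observation.
rows = [(1.0, 5.0), (2.0, 3.0), (3.0, 8.0), (4.0, 1.0)]
responses = [0.2, 0.3, 0.9, 0.8]
split = best_split(rows, responses)
```

Applying the same search recursively to each child node grows the tree; the child with the higher mean response then identifies the subregion of the input space where the response performs better.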
Proposed multistage MR-CART
In this section, the proposed multistage MR-CART is presented. As mentioned in Section 1, it is important to obtain reliable response surface models in MRSO because the optimal solution is obtained by analyzing the response surface models. Nevertheless, it is not easy to obtain reliable response surface models, especially when dealing with a large amount of data. This is because not only might the form of the functional relationships between input variables and responses be unclear, but also
Step 1. Prepare the data
We have a total of 5609 observations denoted by for . The numbers of input variables of Stage 1 and 2, denoted by and , are 13 and 11, respectively. The numbers of response variables of Stage 1 and 2, denoted by and , are two for each, as shown in Fig. 5. Thus, every observation includes 28 values (i.e., 28 = 13 + 11 + 2 + 2).
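The record layout described above can be made concrete with a short sketch. The block sizes (13, 11, 2, 2) come from the text; the column order (Stage 1 inputs, Stage 1 responses, Stage 2 inputs, Stage 2 responses) and all names are assumptions made only for illustration.

```python
# Block sizes from the text: inputs and responses of Stage 1 and Stage 2.
N_X1, N_Y1, N_X2, N_Y2 = 13, 2, 11, 2
RECORD_LENGTH = N_X1 + N_Y1 + N_X2 + N_Y2  # 28 values per observation

def split_observation(obs):
    """Slice one flat record into per-stage input and response blocks.
    The column order used here is an assumption, not the paper's layout."""
    if len(obs) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} values, got {len(obs)}")
    bounds = [0, N_X1, N_X1 + N_Y1, N_X1 + N_Y1 + N_X2, RECORD_LENGTH]
    names = ["x1", "y1", "x2", "y2"]
    return {
        name: obs[start:stop]
        for name, start, stop in zip(names, bounds, bounds[1:])
    }

# Hypothetical record filled with placeholder values 0..27.
parts = split_observation(list(range(28)))
```

Keeping the stage blocks separate in this way makes it straightforward to fit one tree per stage while still passing the preceding stage's responses forward as context.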
Step 2. Split the data
The entire 5609 observations are randomly divided into training and test datasets at a ratio of 4:1 to
Concluding remarks
In this paper, we proposed a systematic procedure for optimizing the multiple responses in the multistage manufacturing process using CART. In the multistage process, the performance of each stage needs to be considered in the context of the relationship between the stages since each stage is influenced by its preceding stage, and it also affects the stage that follows. We consider this property by modifying the CART algorithm and employing the desirability function method, which optimizes
CRediT authorship contribution statement
Dong-Hee Lee: Conceptualization, Methodology, Writing – original draft. So-Hee Kim: Software. Kwang-Jae Kim: Supervision.
Acknowledgements
This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07049412). Also, this work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (NRF-2019R1A2C1007834).
References (29)
- et al. (2010). PRIM versus CART in subgroup discovery: When patience is harmful. Journal of Biomedical Informatics.
- et al. (2016). A multistage and multiple response optimization approach for serial manufacturing system. European Journal of Operational Research.
- et al. (1986). Partial least-squares regression: A tutorial. Analytica Chimica Acta.
- et al. (1997). Direct marketing modeling with CART and CHAID. Journal of Direct Marketing.
- et al. (2008). Optimal process design of two-stage multiple responses grinding processes using desirability functions and metaheuristic technique. Applied Soft Computing.
- et al. (2016). A new approach to detecting the process changes for multistage systems. Expert Systems with Applications.
- et al. (2016). A new Bayesian approach to multi-response surface optimization integrating loss function with posterior probability. European Journal of Operational Research.
- et al. (2018). A hybrid intelligent optimization approach to improving quality for serial multistage and multi-response coal preparation production systems. Journal of Manufacturing Systems.
- et al. (1984). Classification and regression trees.
- et al. (2014). Process optimization for multiple responses utilizing the Pareto front approach. Quality Engineering.
- A data mining approach to process optimization without an explicit quality function. IIE Transactions.
- A balancing act: Optimizing a product's properties. Quality Progress.
- Simultaneous optimization of several response variables. Journal of Quality Technology.
- Data envelopment analysis with classification and regression tree: A case of banking efficiency. Expert Systems.