Introduction

Forming processes are often the most economical part of the value-added chain. In times of energy and raw material shortages they are a key factor, because they utilize material optimally and require less specific energy than subtractive or additive manufacturing processes (Lange, 1985). Despite these advantages, forming processes are under high cost pressure. This is mainly caused by the low margins per part, which are directly affected by short-term machine downtimes. Blanking processes are especially affected, as their high stroke speeds and required accuracy can lead to unplanned downtimes of tool and machine components (Lange, 1985). To achieve high productivity under these circumstances, machine downtimes must be reduced and the degree of utilization thereby maximized. This requires real-time predictions of the current and future process conditions, even at high production rates. In blanking in particular, condition monitoring based on time series captured by sensors is difficult due to the complexity of the process. Blanking is influenced by more than forty parameters related to process-specific uncertainties (wear, varying specifications of the semi-finished product, etc.) as well as machine- and tool-specific variables (clearance, cutting edge radii, etc.) (Hirsch et al. 2011). Furthermore, considering the data availability that comes with the progressive digitalization of manufacturing processes, resilient process control and comprehensive condition monitoring by qualified personnel or conventional monitoring systems are difficult to achieve. To make real-time predictions about the process and thus avoid unplanned machine downtimes, plan maintenance and identify deviations in product quality at an early stage, data-driven process monitoring is necessary. 
Especially in times of digitization of the value-added chain, a lack of qualified personnel, reduced physical contact with the machine caused by an ongoing automation and a required remote access, trustworthy data sets play a crucial role to guarantee a resilient model to describe or predict the state of process. Industry 4.0 (also known as Smart Manufacturing or Smart Factory) provides an approach to this, combining elements of artificial intelligence (AI), new types of sensors as well as state-of-the-art information technologies (Oztemel & Gursev, 2020). Central to this approach are data which are more and more available to companies due to high-performance processors and sensors and cross-linking of processes (Moyne & Iskandar, 2017). Industry 4.0 aims to use this data available over the entire product life cycle to improve the value chain and build manufacturing intelligence (O’Donovan et al. 2015). This steadily increasing amount of data offers the possibility of making predictions about process correlations that were not possible before. However, the value of this data is not only determined by its quantity, but also by the knowledge and information hidden in it. AI and the use of machine learning (ML) algorithms, as a key technology in Industry 4.0, offer exactly this knowledge for manufacturing process (Penumuru et al. 2020). ML algorithms provide pattern recognition and the identification of correlations to large datasets and the derivation of predictive models (Wuest et al. 2016). Combining high-performance computing with valid data sets, ML algorithms build a basis for data-driven process monitoring and thus the real-time prediction of process conditions (Tsai et al. 2015). 
While current literature focuses on the transfer of new ML algorithms to classify or predict the process condition, little work has been done so far to draw conclusions about the influence of the quality of the data set and related to this the influence of data acquisition, preprocessing and transformation on the performance of ML models. Especially in the blanking process the data set should be considered when using AI-supported condition monitoring. This is caused by the nonlinearity of the process as a result of transient elastic and plastic deformation of the sheet metal as well as the static and dynamic behavior of the press. Furthermore, over forty process variables can be identified that interact with one another and are related to uncertainties of the process (Jin & Shi, 2000). Even the smallest changes in these process variables can significantly influence the process. In contrast, small variations of process variables influence the time signals recorded by sensors less strongly. This leads to high demands on the quality of the acquired data even at high stroke rates in order to be able to physically differentiate effects in the process from noise even at minor variations (Groche et al. 2019).

Therefore, this work aims to describe the influence of the data acquisition, preprocessing and transformation steps on the performance of a multiclass support vector machine (mSVM) model. As a use case, the abrasive wear during a blanking process is classified. The influence of these three steps on the performance of the mSVM is quantified by the accuracy and by the separability between classes, measured by the Mahalanobis distance. To transfer these findings to industrial blanking processes, the resilience of the model in predicting the current wear state is validated with industry-related data sets at different stroke speeds.
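
The idea of class separability in a two-feature space can be illustrated with the Mahalanobis distance between two class centroids. The following is a minimal sketch, assuming two features per stroke and a covariance matrix pooled over both classes; the function names and the pooling choice are assumptions made for illustration and are not taken from the study.

```python
import math


def mean_vector(points):
    """Return the component-wise mean of a list of (f1, f2) feature tuples."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def pooled_covariance(class_a, class_b):
    """Pooled 2x2 sample covariance, returned as (s11, s12, s22).

    Assumption: both classes share one covariance matrix.
    """
    s11 = s12 = s22 = 0.0
    for points in (class_a, class_b):
        m1, m2 = mean_vector(points)
        for f1, f2 in points:
            s11 += (f1 - m1) ** 2
            s12 += (f1 - m1) * (f2 - m2)
            s22 += (f2 - m2) ** 2
    dof = len(class_a) + len(class_b) - 2
    return (s11 / dof, s12 / dof, s22 / dof)


def mahalanobis_between_classes(class_a, class_b):
    """Mahalanobis distance between the centroids of two classes."""
    mean_a, mean_b = mean_vector(class_a), mean_vector(class_b)
    s11, s12, s22 = pooled_covariance(class_a, class_b)
    det = s11 * s22 - s12 ** 2
    if det <= 0.0:
        raise ValueError("covariance matrix is singular")
    d1, d2 = mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]
    # d^T S^-1 d, using the closed-form inverse of a symmetric 2x2 matrix
    squared = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return math.sqrt(squared)


# Hypothetical feature values for two wear classes (not measured data)
sharp = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
worn = [(10.0, 0.0), (12.0, 0.0), (10.0, 2.0), (12.0, 2.0)]
distance = mahalanobis_between_classes(sharp, worn)
```

A larger distance indicates that the two classes overlap less relative to their scatter, which is the sense in which separability is used here.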

Data-driven monitoring of blanking processes

As shown in Fig. 1, a blanking process can be divided into three phases according to the force–displacement curve. In the punch-phase (I), the punch impacts on the sheet metal and starts to elastically deform the material. If the stresses that occur exceed the maximum shear strength of the material, it tends to deform plastically. When the shearing stress finally exceeds the shear fracture limit, the material tears and the stored elastic energy is released abruptly. In the following push-phase (II), the component or the grid-shaped discard is completely pushed out of the die and the punch passes through the bottom dead center. Finally, the punch is pulled out of the die in the withdraw-phase (III) and withdrawal forces occur as a result of jamming between the sheet and the punch.

Fig. 1

Force–displacement curve (Hoppe et al. 2019) (a) and partition of the cutting edge during blanking of sheet metals into characteristic areas according to the VDI 2906-2 (b)

The quality of blanked parts can be described by the characteristic areas of the cutting edge surface (see Fig. 1b). According to VDI 2906-2, these features are defined as a percentage of the blanked surface orthogonal to the cutting line and are divided into rollover zone \({h}_{\mathrm{e}}\), shear zone \({h}_{\mathrm{l}}\), rupture zone \({h}_{\mathrm{f}}\) as well as the burr height \({h}_{\mathrm{b}}\). They are directly influenced by the wear state and reflect the current tool condition. Adhesive and abrasive wear occurs in the contact zone between punch, die and sheet metal and is classified into three types: flank wear, face wear and tip wear (Hambli, 2001; Hernández et al. 2006). According to Hernández et al., flank wear occurs especially during the push- and withdraw-phase of a blanking cycle (Fig. 1, phases II and III). Due to the longer wear length, normal stresses between the lateral surface of punch, die and sheet metal lead to increased flank wear, especially for large sheet thicknesses. Cutting soft steel grades (\({R}_{\mathrm{m}}<\) 350 MPa) reinforces this effect (Hohmann et al. 2017). For example, Hohmann et al. showed that a cold-rolled steel (DC03) tends toward adhesive flank wear. This leads to higher frictional forces between punch and sheet metal, which directly influence the force signal in the push- and withdraw-phase of a blanking process. In contrast, high-strength steel grades (\({R}_{\mathrm{m}}>\) 600 MPa) tend toward abrasive tip wear, which leads to a rounded cutting edge radius (Klingenberg & Singh, 2004). Face wear, in turn, occurs especially when blanking thin sheets, due to the high surface pressures between the face surface of the punch and the sheet metal. In addition, temperature peaks in the contact zone lead to local melting and microstructural changes of the punch material, which enhances micro-machining processes (Toussaint, 2000). 
The combination of these three wear states finally changes the geometry of the punch and die, as shown in Fig. 2. In this work we focus on the rounding of the cutting edge radii by abrasive wear. Hambli, Klingenberg and Singh as well as Kubik et al. showed in their work that exactly this abrasive wear is the most common cause of wear in blanking and has a significant influence on the quality of the component (Hambli, 2001; Klingenberg & Singh, 2004; Kubik et al. 2021).

Fig. 2

Geometry of the active element worn surface in generic case of tool wear (Hernández et al. 2006)

Over the last decade, many experimental and empirical studies have been conducted to understand wear in the blanking process. On the one hand, they investigated the influence of blanking parameters (clearance (Mucha, 2010), cutting edge radii (Klingenberg & Boer, 2008), sheet-metal thickness (Hambli, 2001) and the number of blanking cycles (Cheon & Kim, 2016)) on the wear evolution. On the other hand, the influence of occurring wear states on the quality of the component was investigated (Hernández et al. 2006; Makich et al. 2008). In addition to these empirical and experimental investigations, great progress has been made in the 2D and 3D numerical simulation of wear phenomena during blanking. Using computational methods, several authors tried to predict tool wear and the resulting form errors on the blanked parts (Hambli et al. 2009) and to optimize process parameters to reduce these errors (Faura et al. 1997). According to Hernández et al., these numerical simulations are very complex, due to large deformations, ductile fractures and crack propagation occurring during the process of blanking (Hernández et al. 2006). Due to the highly non-linear behavior of blanking processes, a detailed description of the wear state by means of empirical or numerical models is only possible to a limited extent. For this reason, real-time monitoring of the wear state is difficult to implement in an industrial production environment and is based on diagnostic approaches that focus on the detection, isolation and identification of defects as they occur (Jemielniak et al. 2012). In these systems, empirical knowledge is necessary to determine thresholds and envelopes to detect erroneous process conditions. The potential of the data is not fully exploited and is limited to the identification of binary faults (Isermann, 2011). Identification of error-cause relationships or the quantification of the size of wear does not take place in the industrial production environment. 
In contrast, a large number of studies in the literature have investigated data-driven approaches to fully exploit the potential of given data sets for a detailed description of wear conditions in manufacturing processes. Especially in the field of machining, authors have applied ML models for predicting wear states in recent decades. They focus on predicting wear-related quality features of machined parts [27, 28], quantifying wear of machining tools (Shen et al. 2020; Zhou et al. 2020) or predicting tool life [31]. ML models for blanking processes, by comparison, are mainly used to describe discrete fault states, while the prediction and quantification of wear conditions is hardly found in recent literature. Jin and Shi developed a wavelet transform technique for feature-preserving data compression of tonnage information during a stamping process. Based on these features, a segmental thresholding strategy similar to a decision tree is presented to identify a change in the thickness of the material (Jin & Shi, 2001). Lee et al. investigated an architecture of an automatic supervisory system for monitoring wear during a blanking process. The system employs an autoregressive time-series model to predict the online measured maximum blanking force. Based on the coefficients of the autoregressive model, the wear-related state of the tool (sharp vs. worn) is predicted using a linear discriminant function (Lee et al. 1997). In a similar way, Ge et al. extracted features from a force signal of a blanking process by means of an autoregressive model and used them in a Hidden Markov Model to classify faulty conditions such as misfeeding of the workpiece, slug pulling or variation of the sheet thickness (Ge et al. 2004). Bassiuny et al. used an empirical mode decomposition to extract features from the strain signals of a sheet metal stamping process. Based on these features, a Hilbert marginal spectrum is calculated, which is used to classify faulty process conditions (misfeed and material too thick) by a learning vector quantization network (Bassiuny et al. 2007). Jin and Shi extracted features from the tonnage signal of a stamping process by a principal component analysis and performed a regression analysis to model the relationship between features and process variables. Hierarchical classifiers and a cross-validation method are used for root-cause determination to predict the binary state (normal / abnormal) of the process variable (Jin & Shi, 2000). Hambli presented a backpropagation neural network algorithm for predicting burr height formation on blanked parts, considering the tool clearance and its wear state. The inputs of the artificial neural network (ANN) are generated by a finite element analysis of the circular blanking process. Validation of the ANN results showed good agreement, with a deviation of 10% between predicted and experimental burr height (Hambli, 2001). 
Next to the presented regression models and ANNs, support vector machines are becoming increasingly attractive for data-driven analysis of manufacturing processes, in particular for monitoring machinery (health) condition (Goyal et al. 2020; Liu et al. 2017) or for quality classification of machined, additively manufactured or welded parts (Baturynska & Martinsen, 2021; Çaydaş & Ekici, 2012; He & Li, 2016). In this regard, the support vector machine (SVM) is not only one of the most powerful and robust classification and regression algorithms, but has also significantly improved the handling of multi-class classification problems and unbalanced data sets (Cervantes et al. 2020). Despite these advantages, the use of SVMs in sheet metal forming processes is mainly limited to the classification of discrete errors. Ge et al. presented SVMs with different kernel functions to classify erroneous states during a blanking process. The experimental results showed that the SVM can achieve a success rate of 99% in detecting these faults (Ge et al. 2002). Qiu et al. presented a one-class SVM for detecting abnormal health conditions (ok / nok) of a progressive stamping machine. To achieve a desirable performance, the hyperparameters of the SVM need to be optimized with regard to the trade-off between the false alarm rate and the false detection rate (Qiu et al. 2020). Zhou and Jin used an SVM to identify a missing part in one of the die stations of a progressive tool using tonnage signals (Zhou & Jin, 2005). Considering this literature, SVM approaches are only used for binary classification. The extension of this binary SVM to a multiclass scenario is not yet state of the art in metal forming and is also limited in other domains.

Common to all these ML approaches is the need for a valid data set, either experimental or generated by simulation, that is used to train and validate the model. Little attention has been paid to how the data are acquired and pre-processed, although pre-processing is one of the most time-consuming steps in applying an ML model to scientific problems. Investigations into the extent to which sensor configuration and positioning, as well as the data pre-processing step, influence the model performance for predicting the process state or the quality of manufactured parts are not to be found in the literature. Especially in blanking, which takes place at production rates of up to 1000 strokes per minute (spm), careful selection and integration of the sensor technology as well as the preparation of the data before the modeling step are of crucial importance. The high stroke rates lead to short tool engagement times, complex stresses and strains, and a nonlinear course of the time series measured by the sensors (Hambli, 2001). In addition, the process is affected by vibrations caused by dynamic effects. During a blanking cycle, impacts occur when the blank holder and the punch hit the sheet metal, when the sheet metal breaks and when the punch is pulled out of the sheet metal. Each impulse induces vibrations in the tool which directly influence the acquired time signal. These vibrations are superimposed on the time signal, and their influence increases significantly with higher stroke speeds. Hirsch et al. as well as Slavič et al. identify an increase of the maximum process force of approximately 14% when the stroke speed is increased by 200 spm (Hirsch et al. 2011; Slavič et al. 2014).

A summary of the literature shows that ML models can fully exploit the potential of existing data to describe, evaluate and predict process states in manufacturing processes. However, due to the complexity and dynamics of blanking processes, as well as the nonlinear and transient characteristics of the measured time series, only a few studies applying ML models to the process can be found in the literature. For wear prediction in particular, the literature is limited to the identification of binary states. Furthermore, in order to apply a robust ML model to describe wear states during blanking, it is essential to consider the steps of data acquisition (sensor position and sensor type), data preprocessing and data transformation. Therefore, in this study an SVM is applied to classify the abrasive wear states in a blanking process, and the influence of the data acquisition and data preparation steps on the performance of the model is quantified. To achieve this objective, Sect. 3 presents the procedure used in this work: the experimental setup is presented, the force signals are descriptively analyzed and the SVM used to classify the wear states in the presented blanking process is explained. In Sect. 4, the qualitative and quantitative evaluation of the SVM is carried out depending on the sensor configuration (type and position), the preprocessing (filtering) and the transformation (extraction and selection). The resilience of the model is validated using different stroke speeds and the associated uncertainties due to dynamic effects.

Experimental setup and procedure

In order to quantify the influence of data acquisition, preprocessing and transformation on the performance of the mSVM for predicting the wear states, the experimental procedure shown in Fig. 3 is followed. In the first step, the force signals are acquired and the different stages of the abrasively worn punches are measured. The force signals are acquired during each stroke cycle with varying sensor positions and types (Table 2). Afterwards, feature engineering and a principal component analysis (PCA) are conducted to extract features from these force signals. The wear states are quantified by the optically measured cutting edge radii of the punches. As demonstrated in Sect. 3.3, even two features derived from the force signals represent over 95% of the total variance in the data set. Therefore, the two major principal axes (\({f}_{1}\) and \({f}_{2}\)) serve as input, while the five wear states serve as output of the mSVM. The acquired process and quality data (features \({f}_{\mathrm{i}}\) and wear states \({r}_{\mathrm{i}}\)) are divided into training and test data sets. Finally, the mSVM is trained and validated on these data. Figure 3 shows the required steps for predicting the wear state during blanking, including data collection, feature extraction, feature evaluation and application of the mSVM.

Fig. 3

Procedure for predicting abrasive wear states during a blanking process including the steps of data acquisition, feature extraction, feature selection and mSVM application

Press and experimental tool

All experiments were carried out on a high-speed press from Bruderer AG (BSTA 810). The press provides a nominal force of 810 kN and stroke rates of up to 1000 spm at a stroke height of 35 mm. The machine parameters for the experiments were set to a stroke distance of 35 mm and stroke speeds of 200 spm, 300 spm, 400 spm and 500 spm. A cylindrical punch with a diameter of 6 mm was chosen. The gap between punch and die is set to 0.15 mm, which, in relation to the 2 mm sheet thickness of the semi-finished product, results in a clearance of 7.5%. Table 1 shows the tool and press parameters as well as the properties of the sheet metal.
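
The stated clearance follows directly from the ratio of the gap to the sheet thickness. As a short worked step, with the symbols \(u\) (relative clearance), \(c\) (gap between punch and die) and \(t\) (sheet thickness) introduced here only for illustration:

$$ u = \frac{c}{t} \cdot 100\% = \frac{0.15\,\mathrm{mm}}{2\,\mathrm{mm}} \cdot 100\% = 7.5\% $$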

Table 1 Tool and press parameters and the properties of the sheet metal for conducting the experiments

The multisensory tool consists of a lower and an upper part, which are connected to each other by four guide columns. The cylindrical punch is connected to the adapter plate in the upper tool via a plunger. A piezo-electrical force washer is integrated into the direct force flux. In the lower tool, the die is integrated into an adapter plate and connected to the base plate via plungers. Here, four force washers are integrated into the direct force flux. In addition, four linear strain gauges combined to a full bridge are attached to the plunger, a measuring pin is mounted in a hole of the adapter plate in the upper tool and a strain sensor is screwed into the indirect force flux at the columns of the press. Figure 4 shows the detailed design of the tool, its integration into the press and the positioning of the sensors.

Fig. 4

Sectional view of a multisensory blanking tool with respect to the positioning of the force sensors

Acquisition of process and quality data

To measure the force signals, various sensor types were integrated at different positions in the tool and the press (Fig. 4). Table 2 summarizes these sensor types and positions. Four sensor types are differentiated: resistive strain gauges, strain sensors, piezo-electrical force washers and piezo-electrical measuring pins. In the upper tool, a uniaxial piezo-electrical force washer (Kistler 9051C) with a nominal force of 120 kN was integrated into the direct force flux. In addition, the force signals are measured in the direct force flux of the lower tool by four triaxial piezo-electrical force washers (Kistler 9047C) with a nominal force of 80 kN. The symmetrical arrangement of the sensors allows the cutting force to be spatially resolved. For the further investigations in this work, the four force signals parallel to the stroke path of the ram are summed to a total force. In addition to these piezo-electrical force washers, a piezo-electrical quartz transverse measuring pin (Kistler 9240 A) is integrated on the adapter plate in the upper tool. Its size of 8 \(\times \) 14.4 mm² allows strains from 0 to 500 \(\mu \varepsilon \) to be measured even with limited space in the tool. The charge resulting from a change of the load on the force washers and the measuring pin is converted into a voltage signal by a charge amplifier (Kistler 5073A). In addition to the piezo-electrical sensors, low-cost strain gauges and strain sensors are integrated into the tool and the press. Four linear strain gauges (VPG C4A-06-125SL-350) combined to a full bridge are installed between the punch holder and the adapter plate. Strain sensors (Baumer DSRT 23DF) are already mounted on the columns of the press and are used to monitor the nominal forces to protect the press from overloading. 
The force washers, the quartz transverse measuring pin and the charge amplifier as well as the strain gauges were calibrated according to DIN EN ISO 376. For this purpose, the tool was integrated into a universal testing machine (Zwick Roell Z100) and cyclically loaded at different force levels. The calibration factor of each sensor is determined by calculating the slope of the calibration lines. Since the strain sensors were already integrated into the press, they had already been calibrated by Bruderer AG. The voltage signals of the force sensors and the eddy current sensor are recorded by a CompactRIO system with integrated measuring modules NI 9220 (analog input ± 10 V), NI 9215 (analog input ± 10 V) and NI 9237 (analog input ± 25 mV/V).

Table 2 Various sensor types and their position inside the tool/press for acquiring the force signals

Since the blanking process is strongly influenced by the dynamic effects resulting from the high stroke rates, the sampling frequency has to be set sufficiently high. The required sampling frequency depends on the selected stroke rate. A complete stroke cycle of the Bruderer press from top dead center to bottom dead center and back to top dead center corresponds to 360°. The actual tool engagement only takes place in a limited angular range. With an assumed stroke rate of 500 spm and an angular range of 160° to 210°, the tool is in contact with the sheet metal for approximately 0.017 s. As the tool engagement time is further divided into the three phases (Fig. 1), a time window of 3.3 ms results for the punch-phase. In order to still resolve the dynamic effects under these circumstances, a sampling frequency of \({f}_{\mathrm{s}} = 50\) kHz was selected. To keep the amount of data as small as possible, only a part of the entire stroke is recorded. Data recording starts when the tool passes an inductive proximity sensor. During the reverse stroke, the tool passes the inductive proximity sensor again and the measurement is stopped. This results in a measuring range of 120°–240°, or a punch penetration of 13 mm, and a time series with 2070 samples for the highest stroke speed of 500 spm. In addition to the acquired force signals, quality data (abrasive wear state of the punch) are optically measured by a confocal white light microscope (NanoFocus AG type μsurf explorer). It is assumed that abrasive wear causes the rounding of the cutting edge radii of the punch (Falconnet et al. 2015; Hohmann et al. 2017). In order to approximate reproducible abrasive wear conditions without the need for time-consuming long-term experiments, the cutting edge radii are mechanically rounded by a post-machining process. The cutting edge is varied in five steps, starting from a sharp cutting edge radius \({r}_{0}\). 
Table 3 shows the five optically measured wear states, represented by the cutting edge radii \({r}_{\mathrm{i}}\) and the abrasive wear volume (Fig. 5).
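
The timing considerations behind the chosen sampling frequency can be checked with a few lines of arithmetic. The sketch below assumes a constant angular speed of the press drive, so that time scales linearly with the angle; the function names are hypothetical. Under this idealization the 120° measuring window at 500 spm corresponds to about 2000 samples at 50 kHz, which is of the same order as the 2070 samples reported.

```python
def cycle_time_s(strokes_per_minute):
    """Duration of one full 360 degree stroke cycle in seconds."""
    if strokes_per_minute <= 0:
        raise ValueError("stroke rate must be positive")
    return 60.0 / strokes_per_minute


def window_time_s(strokes_per_minute, start_deg, end_deg):
    """Time spent between two angles, assuming constant angular speed."""
    if end_deg <= start_deg:
        raise ValueError("end angle must be larger than start angle")
    return cycle_time_s(strokes_per_minute) * (end_deg - start_deg) / 360.0


def samples_in_window(strokes_per_minute, start_deg, end_deg, sampling_hz):
    """Idealized number of samples recorded inside an angular window."""
    return round(window_time_s(strokes_per_minute, start_deg, end_deg) * sampling_hz)


SAMPLING_HZ = 50_000   # sampling frequency stated in the text
STROKE_RATE_SPM = 500  # highest stroke speed used in the experiments

engagement_s = window_time_s(STROKE_RATE_SPM, 160, 210)
recorded_samples = samples_in_window(STROKE_RATE_SPM, 120, 240, SAMPLING_HZ)
punch_phase_samples = round(3.3e-3 * SAMPLING_HZ)  # 3.3 ms punch-phase window
```

With these values the punch-phase is covered by roughly 165 samples per stroke, which indicates why a lower sampling frequency would leave few points to describe the dynamic effects.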

Table 3 Quantitative characterization of the abrasive wear states by optically measuring the cutting edge radii \({r}_{\mathrm{i}}\)
Fig. 5

Quantitative and qualitative characterization of the abrasive wear states by optically measuring the cutting edge radii \({r}_{\mathrm{i}}\)

Based on these five wear states, the experimental procedure is shown in Table 4. 100 stroke cycles at four different stroke speeds are acquired for each wear state, resulting in twenty single experiments and 2000 time series for all experiments. Considering the number of sensor types and positions, 10,000 time series must be evaluated.
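
The counts given above follow from a full factorial combination of wear states and stroke speeds. The sketch below enumerates this plan; the labels for wear states and sensor signals are illustrative assumptions (the four force washers in the lower tool are counted as one summed signal), not identifiers taken from Table 4.

```python
from itertools import product

# Illustrative labels: five wear states and five evaluated sensor signals
WEAR_STATES = ["r0", "r1", "r2", "r3", "r4"]
STROKE_SPEEDS_SPM = [200, 300, 400, 500]
SENSOR_SIGNALS = [
    "force washer upper tool",
    "force washer lower tool (sum of four)",
    "piezo pin",
    "strain gauge punch",
    "strain sensor",
]
STROKES_PER_EXPERIMENT = 100


def experiment_plan():
    """Return every (wear state, stroke speed) combination."""
    return list(product(WEAR_STATES, STROKE_SPEEDS_SPM))


def count_time_series():
    """Return (experiments, time series per sensor signal, time series overall)."""
    experiments = len(experiment_plan())
    per_signal = experiments * STROKES_PER_EXPERIMENT
    return experiments, per_signal, per_signal * len(SENSOR_SIGNALS)


n_experiments, n_per_signal, n_total = count_time_series()
```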

Table 4 Experimental procedure for acquiring force signals considering sensor positions and types, stroke speeds \({v}_{\mathrm{i}}\) and wear states \({r}_{\mathrm{i}}\)

Descriptive analyses of force signals

As mentioned above, the time series of the measured force signals are nonlinear, transient and affected by dynamic effects. Therefore, in the following section the force signal quality is quantified by means of descriptive statistics. Differences between the sensor positions and types, as well as the influence of increasing stroke speed and the related dynamic effects, have to be quantified. To compare the influence of different sensor positions and types on the quality of the acquired force signals, this descriptive analysis is conducted by way of example with an unworn punch (\({r}_{0}\)) and a stroke speed of 200 spm. Figure 6 shows the scaled and offset-adjusted force signals of 100 single strokes for each sensor described in Table 2. In the further investigations, the force washer in the upper tool is used as a reference. In Fig. 6, the mean of 100 strokes of the reference sensor is shown as a red line to compare it with the 100 force signals of each sensor type.

Fig. 6

Force signals of the sensor types for 100 single strokes (wear state \(r_{0}\) and stroke speed 200 spm) compared to the mean force signal of the reference sensor

The qualitative analysis of the force signals shows that the strain gauge sensors in particular are prone to noise, which is visible as a wide scatter band. While the piezo-electrical sensors show less noise, the signal of the force washer in the lower tool in particular is superimposed by vibrations. In addition, the signal of the strain sensor mounted on the columns of the press is superimposed by vibrations, but with a lower frequency of 80.4 \(\pm\) 17.3 Hz for all stroke speeds, caused by the eigenfrequency of the press. Looking at the time signals, it is obvious that the sensor types and positions influence the deviation of the measured time series. In order to quantify this deviation and to explore its cause, the standard deviation \(s\) of the force signal at each sample point is determined for each sensor type. Let \(x_{{\text{i}}}\) be the i-th sample, \(i \in \left\{ {1, \ldots , n} \right\}\), of the force signal of a specific sensor type and \(\overline{x}\) the mean of the force signal of that sensor type. The sample standard deviation is then calculated as follows

$$ s = \sqrt {\frac{1}{n - 1}\mathop \sum \limits_{{{\text{i}} = 1}}^{n} (x_{{\text{i}}} - \overline{x})^{2} } $$
(1)
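
One way to read Eq. (1) is that it is evaluated separately at every sample point across the recorded strokes, so that \(n\) is the number of strokes. The sketch below follows this reading, which is an assumption, and uses hypothetical function names; `statistics.stdev` applies the same \(n - 1\) normalization as Eq. (1).

```python
import statistics


def pointwise_std(strokes):
    """Sample standard deviation at each sample point across all strokes.

    strokes: list of force signals, one equally long list of floats per stroke.
    """
    if len(strokes) < 2:
        raise ValueError("at least two strokes are required")
    if len({len(signal) for signal in strokes}) != 1:
        raise ValueError("all strokes must have the same number of samples")
    # zip(*strokes) groups the values of all strokes at the same sample index
    return [statistics.stdev(values) for values in zip(*strokes)]


def average_std(strokes):
    """Mean of the pointwise standard deviations over the whole signal."""
    return statistics.fmean(pointwise_std(strokes))


# Two short hypothetical strokes (arbitrary units, not measured data)
example_strokes = [[1.0, 2.0, 3.0], [3.0, 2.0, 5.0]]
std_curve = pointwise_std(example_strokes)
std_mean = average_std(example_strokes)
```

Averaging the pointwise values gives a single number per sensor and experiment, which makes different sensor types and stroke speeds easier to compare.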

Figure 7 shows the standard deviations calculated from 100 stroke cycles for each sensor type.

Fig. 7

Standard deviation of the sensor types for 100 single strokes (wear state \(r_{0}\) and stroke speed 200 spm) compared to the standard deviation of the force signal of the reference sensor

The red line shows the standard deviation of the reference sensor (force washer upper tool). The results show a high average deviation over the entire stroke cycle for the strain-based sensors (strain gauge punch and strain sensor). In contrast, the mean deviation of the piezo-electrical sensors (force washer lower and piezo pin) is negligible over the entire stroke cycle, with the piezo pin showing a slightly higher mean standard deviation than the force washer in the lower tool. Furthermore, the piezo-electrical sensors show a peak of the standard deviation in the sample ranges from 845 to 855 (P1) and from 1150 to 1160 (P2).

Taking the sensor type and sensor position as well as the stroke speed into account allows the conclusion that the standard deviation of each experiment (including 100 strokes) is mainly caused by electrical noise and dynamic effects. The dynamic effects result from the static and dynamic behavior of the press as well as from process-related vibrations (P1 and P2). At peak P2, the material breakage is initiated and the elastic energy stored in the system is abruptly released. Due to the masses located between the forming zone and the measuring position, inertia forces are induced in the system and superimpose vibrations on the force signal. In particular, the comparison between the piezo-electrical sensors in the upper and lower tool (Fig. 6) shows vibrations superimposed on the force signal in the lower tool, caused by an uneven mass distribution. In this case, the mass of the components between the sensors and the forming zone is ten times higher in the lower tool than in the upper tool. In contrast, peak P1 results from the punch hitting the sheet metal and introducing an impact into the structure of the press. Since the impact of the punch on the sheet is a short impulse, the product of inertial forces and exposure time is approximated by a Dirac impulse. This Dirac impulse is superimposed on the actual force signal and leads to an infinite, continuous frequency spectrum. This impulse is particularly recognizable in the time series of the strain sensor (Fig. 6).
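
The statement about the frequency spectrum can be made explicit with the Fourier transform of an ideal impulse. In the following short derivation, the impulse strength \(A\) and the impact time \(t_{0}\) are symbols introduced here only for illustration:

$$ X(f) = \int_{ - \infty }^{\infty } A\,\delta (t - t_{0})\,e^{ - {\text{j}}2\pi ft} \,{\text{d}}t = A\,e^{ - {\text{j}}2\pi ft_{0} } , \qquad \left| {X(f)} \right| = A \quad {\text{for all }} f $$

The magnitude is constant over frequency, so an idealized impact excites all frequencies equally, including the eigenfrequencies of the tool and the press. A real impact of finite duration only approximates this behavior up to a limiting frequency.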

In addition to the dynamic effects, the deviation of the force signal depends on the electrical noise. Strain gauges in particular, which measure the force indirectly via a change in resistance, are affected by this. On the one hand, the change of resistance is caused by the elongation of the measuring grid. On the other hand, the resistance is influenced by the electrical contacting of the sensors, the length of the cables and electrostatic as well as electrodynamic phenomena (e.g. ground loops, thermal, shot, flicker, burst and transit-time noise), which are summarized in the following under the term electrical noise.

Next to the dynamic effects and the electrical noise, deviations in the force signals result from a time shift of the force signal by as little as a few samples. Although the signal between consecutive strokes is absolutely repeatable, even a small offset of a few samples can cause large deviations, especially with strongly fluctuating signals. This effect is demonstrated in particular by the shift of the points in time at which the material breaks and the punch hits the sheet metal, caused by varying local properties of the material (tensile strength, elongation at break, crystallographic defects) or of the semi-finished product (sheet thickness, lubrication, etc.). However, not only the material properties but also the hardware of the measuring chain (A/D converter, charge amplifier, measuring modules, CPU performance, etc.) contributes to a time shift between consecutive force signals. In particular, triggering the start and end point of the measurement is affected by uncertainties of the magnetic field detected by the eddy current sensor. Because of the uncertainties of this inductive proximity sensor, a shift on the time axis occurs even if the signal between consecutive strokes is absolutely repeatable. To summarize, the deviations in the force signal are strongly influenced by vibrations and electrical noise, which are amplified by a time shift (a few samples) of the force signals from stroke to stroke due to the properties of the material as well as the design of the measuring chain.

In order to quantify the influence of the sensor type and sensor position as well as the stroke speed on the deviation of the signal in each stroke cycle by one characteristic value, the average standard deviation \(\overline{s }\) is determined. Equation (2) is used to calculate the confidence interval with a statistical certainty of 99%. Error bars are estimated by the value \(\mu\), which consists of the arithmetic mean \(\overline{x}\) and the confidence limits. These confidence limits are calculated from the Student factor \(t\), the standard deviation of the sample \(s\) and the number of observations \(n\).

$$ \mu = \overline{x} \pm \frac{t \cdot s}{{\sqrt n }} $$
(2)
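As a minimal sketch of Eq. (2), the interval can be computed from a sample once the Student factor is known. The function name, the toy values and the choice to pass \(t\) as an argument (read from a t-table for the desired certainty and \(n - 1\) degrees of freedom) are assumptions made for illustration.

```python
import math
import statistics

def confidence_interval(values, t_factor):
    """Return (lower, upper) limits of the mean according to Eq. (2)."""
    n = len(values)
    mean = statistics.fmean(values)
    s = statistics.stdev(values)          # sample standard deviation
    half_width = t_factor * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# Illustrative values only (not measured data).
values = [9.8, 10.1, 10.0, 10.2, 9.9]
# Student factor for a two-sided 99 % interval and n - 1 = 4 degrees of
# freedom, taken from a t-table (approx. 4.604).
lower, upper = confidence_interval(values, t_factor=4.604)
```

For large samples such as 1000 strokes the Student factor approaches the normal quantile of roughly 2.58, so the interval narrows mainly through the \(\sqrt{n}\) term.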

Table 5 summarizes the averaged standard deviations \(\overline{s}\) for an unworn punch (\(r_{0}\)), the four stroke speeds and all sensor types. It shows qualitatively that the averaged standard deviations increase with higher stroke speeds, progressive distance from the forming zone, a direct measurement compared to an indirect measurement and the use of a resistive sensor compared to a piezo electrical sensor.

Table 5 Averaged mean standard deviation of the model for all sensor types over all stroke speeds for an unworn punch (\(r_{0}\))

Data transformation

The experimental procedure shown in Table 4 results in a data set that has to be prepared before the mSVM is applied. A major step of this data transformation is the reduction of the dimension of the data set by removing redundant data (Liu & Motoda, 1998). A large data set results in a high model complexity, which leads to poor generalizability of the model and to high computational times. A trade-off between the accuracy of the model and the computational effort can be achieved by extracting features from the data set. Since the performance of the ML algorithm is significantly influenced by this data transformation, the feature extraction is a key factor for successful ML projects (Domingos, 2012). In engineering applications, features are usually extracted from time series measured by sensors. According to Li, features can be extracted either from the time domain, the frequency domain or the time–frequency domain (Wang & Gao, 2006). Features from the time domain can be derived directly from the sensor signals. Mostly these are statistical parameters such as maximum values, mean values, standard deviations, skewness, kurtosis or the root mean square. Furthermore, Hoppe et al. as well as Kubik et al. were able to show in their studies that engineered features from the time domain represent effective parameters for describing process conditions during blanking (see Fig. 8) (Hoppe et al. 2019; Kubik et al. 2021). For this purpose, the force signal is initially divided into three phases, and characteristic points which define the start and end points as well as extrema during each phase are identified. Finally, the features are derived from these characteristic points and can be described as the length \(l_{{{\text{j}},{\text{ i}}}}\), maximum force \(F_{{{\text{j}},{\text{ i}}}}\) and work done \(W_{{{\text{j}},{\text{ i}}}}\) in each phase. In this case the index \(j\) describes the respective phase of the cutting process (punch-phase (\({\text{p}}\)), push-phase (\({\text{pu}}\)), withdraw-phase (\({\text{w}}\))) and \(i\) the number of variations in the experiments.

Fig. 8 Features extracted from force signal
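The three engineered features per phase can be sketched as follows. This is an illustrative assumption about how they might be computed: the phase boundaries are passed as sample indices, the length is taken along a displacement signal, and the work is approximated with the trapezoidal rule. The function name, arguments and toy data are hypothetical and not taken from the cited studies.

```python
def phase_features(force, displacement, start, end):
    """Length, maximum force and work for one phase of a force signal.

    force, displacement: equally long sample lists.
    start, end: indices of the characteristic points bounding the phase.
    """
    f = force[start:end + 1]
    x = displacement[start:end + 1]
    length = x[-1] - x[0]
    max_force = max(f)
    # Trapezoidal rule for the work W = integral of F dx.
    work = sum(0.5 * (f[k] + f[k + 1]) * (x[k + 1] - x[k])
               for k in range(len(f) - 1))
    return length, max_force, work

# Illustrative triangular force curve over an evenly sampled displacement.
force = [0.0, 2.0, 4.0, 2.0, 0.0]
displacement = [0.0, 1.0, 2.0, 3.0, 4.0]
length, max_force, work = phase_features(force, displacement, 0, 4)
```

Applying such a function once per phase (punch, push, withdraw) would give the nine engineered features per stroke.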

For extracting features from the frequency domain, spectral and frequency analyses are performed. While frequency domain analyses (Hilbert-Huang transformation, short-time Fourier transformation, etc.) are already widespread in domains like health and condition monitoring of machining processes, these methods are rarely used in forming technology (Aydin et al., 2012). A representative of the time–frequency domain techniques is the wavelet transform, which is particularly suitable for the investigation of non-stationary time series (Fan et al. 2001). The most important technique for feature extraction in engineering applications is the principal component analysis (PCA) (Addison et al. 2003). In production engineering applications, PCA is used for the purpose of quality control (Wu & Chyu, 2004), condition monitoring (Rato & Reis, 2020) and predictive maintenance (Lewin, 1995). PCA is a multivariate statistical technique that handles large amounts of data via orthogonal projection. It reduces the dimensionality of a data set by projecting the original data into a lower dimensional orthogonal space defined by a few significant eigenvectors. Applying this method to a blanking process means searching for a compact representation of the measured force signals that still contains the information about relevant variations. In this paper PCA is used to extract features from experimentally acquired force signals, which are denoted as a matrix \({\varvec{X}} \in {\mathbb{R}}^{{{\text{m}} \times {\text{p}}}}\), in which each row vector \({\varvec{x}}_{{\text{i}}}\) is a complete cycle of a force signal with \(p\) measurement points and \(m\) is the total number of observations. Therefore, the first step of the PCA is to compute the covariance matrix

$${\varvec{\varSigma}}= {\text{cov}}\left( {\overline{{\varvec{X}}}} \right) = \left[ {\begin{array}{*{20}c} {\sigma_{11} } & \cdots & {\sigma_{1p} } \\ \vdots & \ddots & \vdots \\ {\sigma_{p1} } & \cdots & {\sigma_{pp} } \\ \end{array} } \right] $$
(3)

of a zero-mean set of \(N\) measurement series

$$ \overline{{\varvec{X}}} = {\varvec{X}} - {\varvec{\mu}}\quad {\text{assuming}}\quad \mu_{{\text{i}}} = \frac{{\mathop \sum \nolimits_{j = 1}^{p} x_{i,j} }}{p} $$
(4)

to obtain the variance at each point in time and also the joint variability with all other points in time. The principal components are computed by solving the eigenvalue problem of the covariance matrix \({\varvec{\varSigma}}\in {\mathbb{R}}^{{{\text{p}} \times {\text{p}}}}\)

$${\varvec{\varSigma}}\cdot {\varvec{v}}_{{\text{j}}} = \lambda_{{\text{j}}} \cdot {\varvec{v}}_{{\text{j}}} $$
(5)

The vector \({\varvec{v}}_{{\text{j}}} \in {\mathbb{R}}^{{{\text{p}} \times 1}}\) (\(j = 1, \ldots , p\)) is the normalized eigenvector of the sample covariance matrix \({\varvec{\varSigma}}\) of \(\overline{{\varvec{X}}}\). Using these principal axes \({\varvec{v}}_{{\text{j}}}\), each force signal \({\varvec{x}}_{{\text{i}}}\) (\(i = 1, \ldots , m\)) can be transformed into a set of features

$$ f_{{{\text{j}},{\text{i}}}} = {\varvec{v}}_{{\text{j}}}^{{\text{T}}} \cdot {\varvec{x}}_{{\text{i}}} $$
(6)
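The chain from covariance matrix to projected feature can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it centers each measurement point over all observations (the usual convention for a sample covariance matrix, and an assumption about how Eq. (4) is applied), and it swaps in plain power iteration for a full eigenvalue solver, which only yields the leading principal axis. All function names and the toy matrix are hypothetical.

```python
import math

def center_columns(X):
    """Subtract the mean of every measurement point (column) from X."""
    m = len(X)
    means = [sum(col) / m for col in zip(*X)]
    return [[x - mu for x, mu in zip(row, means)] for row in X]

def covariance(Xc):
    """Sample covariance matrix (p x p) of centered data Xc (m x p)."""
    m, p = len(Xc), len(Xc[0])
    return [[sum(row[a] * row[b] for row in Xc) / (m - 1) for b in range(p)]
            for a in range(p)]

def leading_eigenpair(S, iterations=200):
    """Largest eigenvalue and normalized eigenvector by power iteration."""
    p = len(S)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(S[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    lam = sum(v[a] * sum(S[a][b] * v[b] for b in range(p)) for a in range(p))
    return lam, v

def project(v, x):
    """Feature f = v^T x as in Eq. (6)."""
    return sum(a * b for a, b in zip(v, x))

# Illustrative data only: four "signals" with three measurement points each.
X = [[1.0, 1.1, 0.9], [2.0, 2.1, 1.9], [3.0, 2.9, 3.1], [4.0, 4.0, 4.0]]
Xc = center_columns(X)
lam1, v1 = leading_eigenpair(covariance(Xc))
f1 = [project(v1, row) for row in Xc]
```

Power iteration is chosen here only to keep the sketch dependency-free; it assumes the start vector is not orthogonal to the leading eigenvector. For real signals with hundreds of measurement points a library eigenvalue routine would be the natural choice.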

When a data set is projected onto its eigenvectors, it is often found that only the first few eigenvectors, corresponding to the largest eigenvalues, are associated with the systematic process variations. All remaining eigenvectors reflect the variations of the process noise (Jin & Shi, 2000). This noise is caused by random disturbances, electrical noise, uncontrollable process variations or dynamic effects. Therefore, the dimension \(p\) of the original PCA can be reduced to a smaller dimension \(p_{{{\text{opt}}}}\) (\(p_{{{\text{opt}}}} < p\)). To do this, the principal axes are first sorted by the size of their eigenvalues, and the dimension is then reduced as far as possible while the selection index \(\eta_{{{\text{opt}}}}\) does not fall below a defined value.

$$ \eta_{{{\text{opt}}}} = \frac{{\mathop \sum \nolimits_{j = 1}^{{p_{{{\text{opt}}}} }} \lambda_{{\text{j}}} }}{{\mathop \sum \nolimits_{j = 1}^{p} \lambda_{{\text{j}}} }} $$
(7)

Since there is no methodical approach to define this selection index, \(\eta_{{{\text{opt}}}}\) is set to 95%. This ensures that more than 95% of the variance in the force signal can be explained by the \(p_{{{\text{opt}}}}\) largest eigenvalues. Depending on the stroke speed, every data set consists of \(N\) samples, with \(N = f\left( {v_{{\text{i}}} } \right)\), and 500 dimensions which are related to the experiments conducted for each stroke speed. Since force signals from five different sensors are recorded, the data set is dimensioned as \({\varvec{X}}_{{\text{i}}} \in {\mathbb{R}}^{{500 \times {\text{N}}}}\) with \(i \in \left\{ {1,2,3,4,5} \right\}\). The five corresponding wear states are captured in the output vector \({\varvec{Y}} \in {\mathbb{R}}^{{1 \times {\text{N}}}}\) with the same dimensions. Looking at the selection index \(\eta_{{{\text{opt}}}}\), the first two eigenvectors explain 96.2% of the total variance in the signal. For further investigations only these two eigenvectors will be used as input variables for the mSVM. To investigate the influence on the performance of the model, three additional features (\(f_{3} , f_{4}\) and \( f_{5}\)) from PCA as well as nine engineered features (Fig. 8) are provided. To ensure the comparability of the models, even when PCA and engineered features are combined, the features are normalized by the Z-score.
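The selection of \(p_{{{\text{opt}}}}\) via Eq. (7) can be sketched as a cumulative sum over the sorted eigenvalues. The function name is hypothetical, and the eigenvalues below are invented so that two components reach an index of 96.2%; they are not the eigenvalues from the experiments.

```python
def select_dimension(eigenvalues, threshold=0.95):
    """Smallest p_opt whose selection index (Eq. 7) reaches the threshold.

    Returns (p_opt, eta_opt).
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for p_opt, lam in enumerate(ordered, start=1):
        cumulative += lam
        eta = cumulative / total
        if eta >= threshold:
            return p_opt, eta
    return len(ordered), 1.0

# Illustrative eigenvalues only (not taken from the experiments).
p_opt, eta_opt = select_dimension([0.3, 7.0, 2.62, 0.08])
```

With these toy values the first component alone explains 70% of the variance, so the second one is needed to pass the 95% threshold.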

The machine learning method of SVM

As described in Sect. 2, there are many different types of machine learning methods that can be applied to classification tasks. Therefore, the most suitable method for classifying the wear states during blanking has to be determined by a grid search. In this study a two-step grid search was executed: the first step identifies the best machine learning method, and the second step optimizes the parameter configuration of the chosen ML method. The ML models were trained and validated on the force signal of the strain gauge (VPG C4A) on the punch. The performance of the model was evaluated based on the classification accuracies for predicting the wear conditions. For training the ML algorithm, the first two principal axes (process data) are available as input data and the wear states (quality data) as output data for each time series.

To keep the model as simple as possible, mainly linear classification models were compared. It should be noted, however, that the use of non-linear classification functions can improve the model performance. Since the focus of this study is to show how data acquisition, data preprocessing and data transformation affect the model performance, a simple ML model is chosen. Therefore, a linear discriminant analysis (LDA), random forest (RF), mSVM with a linear kernel function, Naive Bayes classifier (NB) and k-nearest neighbor classifier (k-NN) were tested for classifying the abrasive wear states. Fig. 9a shows the accuracy achieved with a fivefold cross validation for the best configuration of each machine learning method. The accuracy values of the classification results from the NB and k-NN are poor and only depicted for completeness. The best prediction results were obtained from the mSVM with a linear kernel function. In this simplest case of linearly separable classes, an optimal hyperplane is sought that maximizes the distance between the separating plane and the nearest data points. Considering a dataset given as [(\({\varvec{x}}_{1}\), \(y_{1}\)), (\({\varvec{x}}_{2}\), \(y_{2}\)), …, (\({\varvec{x}}_{{\text{i}}}\), \(y_{{\text{i}}}\))] consisting of \(m\) training samples, where \({\varvec{x}}_{{\text{i}}} \in {\mathbb{R}}^{{\text{m}}}\) is an \(m\)-dimensional feature vector representing the \(i\)-th training tuple and \({\varvec{y}}_{{\text{i}}} \in \left\{ { - 1,1} \right\}\) is the corresponding class label, the optimal hyperplane can be found by the following optimization problem:

$$ \begin{array}{*{20}l} {\text{minimize:}} \hfill & {L\left( {\varvec{w}} \right) = \frac{1}{2}\left\| {\varvec{w}} \right\|^{2} } \hfill \\ {{\text{subject}}\;{\text{to:}}} \hfill & {y_{{\text{i}}} \left( {{\varvec{w}}^{{\text{T}}} {\varvec{x}}_{{\text{i}}} + b} \right) \ge 1} \hfill \\ {} \hfill & {i = 1, \ldots , m} \hfill \\ \end{array} $$
(8)
Fig. 9 Classification results for each tested ML method (a) and optimized hyperparameters for the mSVM with linear kernel function (b)

The hyperplane in the feature space can be described by the equation \({\varvec{w}}^{{\text{T}}} {\varvec{x}} + b = 0\), where \({\varvec{w}} \in {\mathbb{R}}^{{\text{m}}}\) and \(b \in {\mathbb{R}}\). In the case of linearly nonseparable classes, no hyperplane is found that is capable of correctly classifying every training sample. In this case the optimization problem can be described by setting a soft margin, which introduces a slack variable \(\xi_{{\text{i}}}\) and a tuning parameter \(C\), the box constraint. This parameter allows the training algorithm a certain amount of misclassification in the training set and applies costs to this misclassification. The higher the box constraint, the higher the cost for the misclassified points, leading to a stricter separation of the data (Cherkassky & Ma, 2004). In this simplest type, the SVM divides the data points linearly into two classes. In real-world problems, however, more than two classes usually occur. In this paper five abrasive wear states (cutting edge radius \(r_{{\text{i}}}\)) have to be distinguished. Therefore, the mSVM is broken down into multiple binary SVMs. Two commonly used techniques, one-vs-rest (OVR) and one-vs-one (OVO), can be found in the literature. While the OVO approach splits the dataset into one dataset for each class versus every other class, the OVR approach splits the dataset into one binary dataset for each class (Kijsirikul & Ussivakul, 2002). Considering an M-class problem, where \({\varvec{y}}_{{\text{i}}} \in \left\{ {1, \ldots , M} \right\}\) are the corresponding class labels, the \(i\)-th SVM optimization problem can be described as follows:

$$ \begin{array}{*{20}l} {\text{minimize:}} \hfill & {L\left( {{\varvec{w}}_{{\text{i}}} ,\xi_{{\text{j}}}^{{\text{i}}} } \right) = \frac{1}{2}\left\| {{\varvec{w}}_{{\text{i}}} } \right\|^{2} + C\mathop \sum \limits_{j = 1}^{m} \xi_{{\text{j}}}^{{\text{i}}} } \hfill \\ {{\text{subject}}\;{\text{to:}}} \hfill & {y_{{\text{j}}} \left( {{\varvec{w}}_{{\text{i}}}^{{\text{T}}} {\varvec{x}}_{{\text{j}}} + b_{{\text{i}}} } \right) \ge 1 - \xi_{{\text{j}}}^{{\text{i}}} } \hfill \\ {} \hfill & {\xi_{{\text{j}}}^{{\text{i}}} \ge 0} \hfill \\ \end{array} $$
(9)
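To make the role of the slack variables and the box constraint concrete, the soft-margin objective can be evaluated for a given hyperplane. This is a minimal sketch for a single binary SVM with labels in \(\{-1, 1\}\), assuming the margin constraint is evaluated for every training sample; the function name and the toy data are hypothetical.

```python
def soft_margin_objective(w, b, X, y, C):
    """Objective of Eq. (9) for one binary SVM with labels y in {-1, +1}.

    The slack of each sample is the smallest value that satisfies the
    margin constraint: xi_j = max(0, 1 - y_j * (w^T x_j + b)).
    """
    slacks = []
    for x_j, y_j in zip(X, y):
        margin = y_j * (sum(wk * xk for wk, xk in zip(w, x_j)) + b)
        slacks.append(max(0.0, 1.0 - margin))
    objective = 0.5 * sum(wk * wk for wk in w) + C * sum(slacks)
    return objective, slacks

# Illustrative data: the second sample lies inside the margin.
X = [[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]]
y = [1, 1, -1]
objective, slacks = soft_margin_objective([1.0, 0.0], 0.0, X, y, C=10.0)
```

Only the sample inside the margin contributes a non-zero slack, and its cost grows linearly with \(C\), which is why a larger box constraint enforces a stricter separation.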

The mSVM used in this work deals with the classification of five different wear states by treating the problem as a collection of binary classification problems using the OVO approach. As mentioned above, in the second step of the grid search the hyperparameters of the mSVM with a linear kernel are optimized and set to \(\xi_{{\text{i}}} = \) 0.65 for the slack variable and \( C = \) 413 for the tuning parameter. Figure 9b shows the optimized hyperparameters for the mSVM.
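The OVO decomposition can be sketched as a majority vote over all class pairs. In the sketch below the pairwise deciders are hypothetical stand-ins for trained binary SVMs (simple nearest-center rules on one feature), and the integer class labels are placeholders for the five wear states.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(x, classes, binary_classifiers):
    """Majority vote over all pairwise classifiers.

    binary_classifiers maps a class pair (a, b) to a function that
    returns either a or b for the feature vector x.
    """
    votes = Counter(binary_classifiers[pair](x)
                    for pair in combinations(classes, 2))
    return votes.most_common(1)[0][0]

classes = [1, 2, 3, 4, 5]
pairs = list(combinations(classes, 2))   # 5 * 4 / 2 = 10 binary problems

# Hypothetical stand-in for trained binary SVMs: each pair is separated by
# a threshold halfway between two class centers on a single feature.
centers = {c: float(c) for c in classes}
binary_classifiers = {
    (a, b): (lambda x, a=a, b=b:
             a if abs(x[0] - centers[a]) <= abs(x[0] - centers[b]) else b)
    for a, b in pairs
}
predicted = one_vs_one_predict([3.2], classes, binary_classifiers)
```

For five classes this gives ten binary problems, each trained only on the data of its two classes.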

With the principal axes extracted by the PCA and the engineered features, fifteen features are available as input data and five wear states as output data for the mSVM. The model is trained for every combination of stroke speed and sensor type, resulting in 200 classification models. For each model the input data set is reduced to two significant parameters, which always consist of the first two features from the PCA. Different feature combinations are used only for the investigation of the influence of data transformation on the model performance (Sect. 4.3). To train the model, 80% (\({\varvec{X}}^{2 \times 400}\)) of the data is used, while 20% (\({\varvec{X}}^{2 \times 100} )\) is applied to ten-fold cross-validation of the model. For the final evaluation of the model and its performance, the accuracy as well as the overlap of the probability density functions between the classes separated by the SVM are considered. The accuracy describes the percentage of correct predictions and is defined as the number of observations correctly assigned to a class divided by the number of all observations. In addition to this accuracy, the performance of the model is determined by calculating the Mahalanobis distance \(d_{{\text{M}}}\). This distance quantifies the separability of classes by considering the distance between the centroids of these classes as well as the variance inside each class. As the results of this study will show, the mSVM is suitable for the classification of abrasive wear states during blanking and achieves close to 100% accuracy, depending on the sensor position, the sensor type and the stroke speed. The Mahalanobis distance is therefore of particular interest for the separation of closely adjacent wear states. In the further investigations of this paper, the Mahalanobis distance is calculated only for the separation between the wear conditions ‘little (low) wear’ (\(r_{2}\)) and ‘medium wear’ (\(r_{3}\)).
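Both evaluation measures can be sketched for two-dimensional features. The text does not state which covariance estimate enters the Mahalanobis distance, so the sketch below assumes the pooled covariance of the two classes; the function names and the toy point clouds are hypothetical.

```python
import math

def accuracy(y_true, y_pred):
    """Share of observations assigned to the correct class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def centroid(points):
    """Mean position of a list of 2-D points."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(2)]

def scatter(points, c):
    """2 x 2 matrix of summed squared deviations from the centroid c."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - c[0], p[1] - c[1]]
        for a in range(2):
            for b in range(2):
                s[a][b] += d[a] * d[b]
    return s

def mahalanobis_between(class_a, class_b):
    """Distance between two class centroids, scaled by the pooled covariance."""
    ca, cb = centroid(class_a), centroid(class_b)
    sa, sb = scatter(class_a, ca), scatter(class_b, cb)
    dof = len(class_a) + len(class_b) - 2
    cov = [[(sa[a][b] + sb[a][b]) / dof for b in range(2)] for a in range(2)]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    d = [ca[0] - cb[0], ca[1] - cb[1]]
    q = sum(d[a] * inv[a][b] * d[b] for a in range(2) for b in range(2))
    return math.sqrt(q)

# Illustrative point clouds only (not measured features).
class_2 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
class_3 = [[4.0, 0.0], [5.0, 0.0], [4.0, 1.0], [5.0, 1.0]]
d_m = mahalanobis_between(class_2, class_3)
acc = accuracy([1, 2, 3, 3], [1, 2, 3, 2])
```

Two models can reach the same accuracy while differing in \(d_{{\text{M}}}\): tighter clusters at the same centroid spacing give a larger distance, which is what makes the measure useful for adjacent wear states.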

Evaluation of model performance

In the following section the influence of data acquisition, preprocessing and transformation on the performance of the mSVM is shown. The model performance is evaluated by the accuracy and the Mahalanobis distance. In order to quantify the resilience of the classification model, the performance of the mSVM is analyzed at different stroke speeds (see Table 4). To quantify the influence of data acquisition, preprocessing and transformation, the trained models are compared with a reference model for each experiment. This reference model is obtained by training the mSVM with the force signals of the piezo electrical sensor in the upper tool (force washer upper tool) at a maximum stroke speed of 500 spm. The reference sensor in the upper tool was used to train the mSVM because the sensor is placed in the direct force flux close to the forming zone, is only slightly affected by superimposed vibrations in the system and generates high quality data reflecting the actual physical state of the process.

Model performance depending on sensor position and type

The performance of the classification model depends on the quality of the data set and thus on the procedure used to acquire it (Calmano et al. 2013; Groche et al. 2019). It is decisive which sensor types are selected, which measuring method is used and which position is chosen inside the tool or the press. Figure 10 shows the result of the mSVM for a piezo electrical force washer (force washer lower tool: direct measurement), a piezo electrical measuring pin (piezo pin: indirect measurement) and a resistive strain gauge sensor (strain gauge punch: indirect measurement).

Fig. 10 Qualitative visualization of the mSVM classifying five wear states comparing piezo electrical force washer (a), piezo electrical measuring pin (b) and a resistive strain gauge on the punch (c)

Considering these results qualitatively, piezo electrical sensor types in particular, both in direct and indirect force flux, significantly improve the performance of the classification model. They make it possible to measure dynamic effects without losing the information stored in the high frequency domain and to classify the wear state even at stroke speeds of up to 500 spm, which results in high model accuracy as shown in Table 6.

Table 6 Influence of different sensor types and positions on accuracy in predicting wear states with a multiclass SVM and Mahalanobis distance to quantify the separability of class 2 and 3

The accuracy of the model is close to 100% over the entire stroke speed range for both piezo electrical sensors. Although both sensors have comparable accuracy, they show differences in the separability of their classes. However, to compare a piezo electrical sensor with a resistive sensor, the signal of the resistive sensor, which is susceptible to electrical noise, must be filtered. Therefore, the signal of the resistive strain gauge sensor is low pass filtered (\(f_{{{\text{c}},{\text{opt}}}}\) = 7.5 kHz). The results show that this type of sensor can correctly detect all wear states with 100% accuracy, even at high stroke speeds. In contrast, the resistive sensor provides a poorer separability of the classes, which decreases with higher stroke speed because of its limited dynamic properties and its susceptibility to noise.

As described by Calmano et al., in addition to the sensor type, sensor position also affects the quality of the acquired data and thus the classification performance of the mSVM (Calmano et al. 2013). Comparing the performance of the classification models for two sensors of the same type at different positions in the process, this hypothesis can be confirmed. Therefore, Fig. 11 shows the classification results for two low pass filtered resistive sensors (strain sensor frame and strain gauge punch). The strain sensor in the columns has a comparable accuracy of 100% to the strain gauge in the punch for classifying the wear states at lower stroke speeds. However, as the stroke speed increases, a resilient description of the wear states is no longer possible and the accuracy drops to 71% at a stroke speed of 500 spm. A similar tendency can be seen in the separability, which decreases from 114.08 to 71.75 for the strain gauge punch and even from 100.53 to 3.24 for the strain sensor frame.

Fig. 11 Qualitative visualization of multiclass SVM classifying five wear states comparing different sensor positions of the strain gauge mounted at the punch (a) and the strain sensor (b) integrated into the columns of the press

Comparing the performance of the classification models for two piezo electrical force washers (force washer upper tool and force washer lower tool), which have the same physical distance to the forming zone but are integrated in the upper and lower tool respectively, differences in the model performance become apparent. The classification model based on the force washer in the upper tool shows lower deviations within each class, which is also reflected by the Mahalanobis distance. The Euclidean distance between the centroids of class 2 and class 3 is about \(d_{{\text{E}}}\) = 0.65 \(\pm\) 0.05 for both models. In contrast, the Mahalanobis distance takes the variances into account and shows a difference of 13.6%, averaged over the stroke speeds, between the force sensor in the upper tool and the one in the lower tool. The accuracy of the model is 100% for both piezoelectric-type sensors over all stroke speeds (see Fig. 12).

Fig. 12 Qualitative visualization of the multiclass SVM classifying five wear states comparing the integration of piezo electrical force washers in lower (a) and upper tool (b)

Model performance depending on data preprocessing

In addition to the data acquisition step and the related generation of valid data sets, the preprocessing of the data plays an important role in the application of ML models, since time signals acquired by sensors usually consist of a signal superimposed by noise. The noise is characterized by a stochastic, unpredictable behavior of the acquired variable caused by electrical or external sources. Noise is a combination of electrical noise (thermal, shot, flicker, burst and transit-time noise) and background noise (electromagnetic and acoustic noise as well as mechanical vibrations) (Computing & Corporation, Data Acquisition Handbook, 2004). From the perspective of signal processing, force signals in this study are described as follows:

$$ F\left( t \right) = F_{{{\text{sig}}}} \left( t \right) + F_{{{\text{n}},{\text{ el}}}} \left( t \right) + F_{{{\text{n}},{\text{tool}}}} + F_{{{\text{n}},{\text{process}}}} $$
(10)

\(F_{{{\text{sig}}}} \left( t \right)\) describes the relevant part of the acquired force signal, which represents the physical state of the process. The electrical noise \(F_{{{\text{n}},{\text{el}}}} \left( t \right)\) is composed of internal, physical effects such as thermal or transit-time noise and external sources like electrostatic noise (a voltage is induced in a conductor that is exposed to a time-varying electric field) and electromagnetic noise (a current is induced in a conductor that is exposed to a time-varying magnetic field). Resistance measurements in particular (e.g. strain gauges) are influenced by varying voltages and currents caused by electrostatic and electromagnetic noise. In addition, the resistance is influenced by the hardware of the measuring chain, namely the length of the connection cable, its shielding and insulation, the electrical contacting, the temperature and the integration process. In contrast to strain gauges, piezo electrical sensors are less affected by this electrical noise. They are mainly affected by a drift of the signal related to a time-dependent shift of charge in the piezo electrical crystal. In addition to the electrical noise, mechanically caused noise is superimposed on the actual force signal. It is divided into process noise \(F_{{{\text{n}},{\text{process}}}}\) and press and tool noise \(F_{{{\text{n}},{\text{tool}}}}\). Process noise is related to vibrations caused by the impact of the punch on the sheet metal, by the material breakage, and by the pushing and withdrawing of the punch through the sheet metal. This process-related noise may contain valuable information about the state of the process and is related to physical effects in the process. In contrast, press- and tool-related noise is caused by the static and dynamic effects of the press (inertial forces resulting from the high accelerations of the ram) and of the connected peripherals (inertial forces and vibrations of the feed unit, which accelerates and decelerates the sheet metal strip during each stroke cycle). While the mechanically caused noise strongly depends on the dynamics of the blanking process and is therefore related to the stroke speed, the electrical noise mainly depends on the sensor type (piezo electrical or resistive), the measuring method (direct or indirect measurement) and the measuring position (distance to the forming zone). Filter operations are a common method to reduce both noise effects (Jackson, 1996). In order to quantify the influence of filtering and of the filter design on the performance of the classification model, the time signals of a resistive sensor and a piezo electrical sensor are analyzed. In particular, the influence of a selectable filter design parameter on model performance is demonstrated. A distinction is made between an optimally designed filter parameter, a poorly designed filter parameter and no filter. As a filter design, a third order Butterworth filter with a normalized cutoff frequency \(w_{{\text{i}}}\) is used. The normalized cutoff frequency is calculated by Eq. 11 with the sampling frequency \(f_{{\text{s}}}\) of 50 kHz and the cutoff frequency \(f_{{{\text{c}},{\text{i}}}}\) as design parameters.

$$ w_{{\text{i}}} = \frac{{2 \cdot f_{{{\text{c}},{\text{i}}}} }}{{f_{{\text{s}}} }} $$
(11)
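As a quick check of Eq. (11), the normalized cutoff frequencies for the strain gauge follow directly from the stated sampling frequency of 50 kHz. The helper name and the range check are assumptions added for illustration.

```python
def normalized_cutoff(f_c, f_s):
    """Normalized cutoff frequency w = 2 * f_c / f_s (Eq. 11)."""
    if not 0 < f_c < f_s / 2:
        raise ValueError("cutoff must lie between 0 and the Nyquist frequency")
    return 2.0 * f_c / f_s

F_S = 50_000.0  # sampling frequency in Hz, as stated in the text
w_opt = normalized_cutoff(7_500.0, F_S)   # strain gauge, optimal design
w_def = normalized_cutoff(8_500.0, F_S)   # strain gauge, deficient design
```

This normalization expresses the cutoff as a fraction of the Nyquist frequency, here 0.30 and 0.34. Filter design routines that follow the same convention, for example `scipy.signal.butter(3, w_opt)`, accept such a value directly; the text does not name the software that was actually used.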

However, special attention has to be paid to the selection of the filter design parameter \(f_{{{\text{c}},{\text{i}}}}\). For filtering the force signal of the resistive strain gauge sensor on the punch, an optimal cutoff frequency \(f_{{{\text{c}},{\text{opt}}}}\) = 7.5 kHz was used, while the cutoff frequency for the deficient case, \(f_{{{\text{c}},{\text{def}}}}\) = 8.5 kHz, was determined based on previous investigations (see Fig. 13). Comparing these filter designs, the optimal cutoff frequency leads to a significantly better performance of the model. Looking at the frequency domain of the strain gauge sensor, distinct frequencies in the range from 7.8 to 8.1 kHz and low frequencies in the range up to 1.2 kHz are detected. While the high frequencies are related to electrical noise, the lower frequencies depend on the static and dynamic behavior of the press and the physics of the process.

Fig. 13 Visualization of the force signal by the strain gauge on the punch filtered in the time domain (a) and frequency range of the raw signal (b)

An optimally designed filter achieves a model accuracy of up to 100% and significantly increases the separability of the classes even at stroke speeds of 500 spm. In contrast, the cutoff frequency of the deficient filter design (\(f_{{{\text{c}},{\text{def}}}}\) = 8.5 kHz) lies above the frequency range in which the electrical noise is concentrated. Therefore, an improvement of the signal quality and, related to this, of the model performance is not expected. Instead, the deficient filtering removes physically relevant parts of the data and worsens the accuracy of the classification model to 82% even at low stroke speeds, as shown in Table 7 and qualitatively visualized in Fig. 14. A similar effect can be seen in the class separability, which improves only insignificantly with the deficient filter design parameter compared to the unfiltered signal.

Table 7 Influence of different filter operations on accuracy in predicting wear states with a multiclass SVM and Mahalanobis distance to quantify the separability of class 2 and 3
Fig. 14 Qualitative visualization of the multiclass SVM classifying five wear states comparing the non-filtered force signals from strain gauge on the punch (a) with the optimal \(f_{{{\text{c}},{\text{opt}}}}\) (b) and deficient \(f_{{{\text{c}},{\text{def}}}}\) (c) filtered signal

Looking at the frequency domain of the force washer, the signal frequencies of major impact are located in a lower range up to 1.2 kHz (Fig. 15b). Electrical noise in a high frequency range, as it occurs with resistive sensors, cannot be identified. In order to reduce the influence of these mechanical vibrations, the cutoff frequency is set to 0.7 kHz. However, selecting the cutoff frequency in this case requires caution, since the physically relevant characteristics of the force signal are removed if the cutoff frequency is set to a value smaller than or equal to 0.5 kHz. This can be seen in the qualitative visualization of the classification model (Fig. 16).

Fig. 15 Visualization of the force signal by the force washer in the lower tool filtered in the time domain (a) and frequency range of the raw signal (b)

Fig. 16 Qualitative visualization of the multiclass SVM classifying five wear states comparing the non-filtered force signals from force washer in the upper tool (a) with the optimal \(f_{{{\text{c}},{\text{opt}}}}\) (b) and deficient \(f_{{{\text{c}},{\text{def}}}}\) (c) filtered signal

While the accuracy for both the unfiltered and the optimally filtered signal reaches 100%, the accuracy for the deficient filter design drops to 91%. In contrast to the accuracy, a better separability of the classes is achieved with the lower cutoff frequency of \(f_{{{\text{c}},{\text{def}}}}\) = 0.2 kHz (see Table 7). Due to mechanical vibrations in connection with a stroke-related time shift, the deviation of the force signals increases, which worsens the separability. Filtering the force signal with a cutoff frequency lower than 0.2 kHz reduces these deviations and increases the Mahalanobis distance, especially at higher stroke speeds, as shown in Table 7.

Consequently, the results show that signal filtering can significantly improve the model quality, but the selection of the filter operation and its design is of crucial importance. On the one hand, systematic investigations of the frequency spectrum help with the design. On the other hand, it is essential to include process knowledge in the design procedure. In particular, process knowledge is needed to identify the frequency range of the force signal that contains relevant information. In addition, it could be shown for the filter design procedure that low frequencies contain important information for a resilient implementation of the classification model, while high frequency ranges caused by electrical noise are negligible (Table 8).
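The trade-off between a suitable and a too-low cutoff frequency can be illustrated with a minimal sketch. The filter type is not restated in this section, so the Hamming-windowed sinc low-pass below is an assumption, as are the sampling rate of 20 kHz and the synthetic signal (a 400 Hz component standing in for process-relevant content and a 3 kHz component standing in for a disturbance). It is not the filter design used in the experiments.

```python
import math

def lowpass_fir(cutoff_hz, fs_hz, num_taps=401):
    """Hamming-windowed sinc low-pass coefficients, normalized to unity DC gain."""
    fc = cutoff_hz / fs_hz                      # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]

def apply_fir(signal, taps):
    """Centered (zero-phase) convolution; samples beyond the edges count as zero."""
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            idx = i + j - half
            if 0 <= idx < len(signal):
                acc += t * signal[idx]
        out.append(acc)
    return out

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

FS = 20_000                                     # assumed sampling rate in Hz
t = [n / FS for n in range(2000)]
relevant = [math.sin(2 * math.pi * 400 * ti) for ti in t]            # assumed process content
disturbance = [0.5 * math.sin(2 * math.pi * 3000 * ti) for ti in t]  # assumed disturbance
raw = [r + d for r, d in zip(relevant, disturbance)]

core = slice(200, 1800)                         # skip samples affected by the zero-padded edges
suitable = apply_fir(raw, lowpass_fir(700, FS)) # cutoff above the relevant content
too_low = apply_fir(raw, lowpass_fir(200, FS))  # cutoff below the relevant content

# Deviation from the relevant component for each variant
err_raw = rms([a - b for a, b in zip(raw[core], relevant[core])])
err_suitable = rms([a - b for a, b in zip(suitable[core], relevant[core])])
err_too_low = rms([a - b for a, b in zip(too_low[core], relevant[core])])
```

In this synthetic setting, the suitable cutoff removes the disturbance while retaining the relevant component, whereas the too-low cutoff removes the relevant component itself, so `err_too_low` exceeds even the error of the unfiltered signal.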

Model performance depending on feature selection

In addition to data acquisition and preprocessing, the transformation of data has a significant influence on the performance of ML algorithms (Blum & Langley, 1997). Therefore, we aim to reach the accuracy of an ML model trained on the entire data set by selecting relevant features from the given time series with a minimal loss of information while reducing the computational overhead. In the following, the performance of the classification model is evaluated by comparing the features extracted by a PCA and by a feature engineering approach (Sect. 3.3). These features are composed of the two major principal axes (\(f_{1}\) and \(f_{2}\)) extracted by the PCA and nine engineered features (\(F_{{\text{i}}}\), \(W_{{\text{i}}}\) and \(l_{{\text{i}}}\) with \(i \in\) {punch, push, withdraw}). In a first step, a correlation matrix was created to remove redundant features. The corresponding parameters and their Pearson correlation coefficients were determined for \(N\) = 500 single strokes for each experiment and are shown in Fig. 17.

Fig. 17

Correlation matrix of the engineered features
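A correlation-based redundancy check of this kind can be sketched as follows. The feature values and the threshold of 0.95 are hypothetical and serve only to illustrate the mechanics; the greedy rule (keep a feature only if it is not highly correlated with a feature already kept) is an assumption and not necessarily the selection procedure applied in this study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_redundant(features, threshold=0.95):
    """Keep a feature only if |r| with every previously kept feature is below threshold."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical feature values for six strokes (arbitrary units)
features = {
    "F_push": [1.0, 1.2, 1.5, 1.9, 2.4, 3.0],
    "W_push": [0.52, 0.59, 0.77, 0.94, 1.22, 1.49],  # nearly proportional to F_push
    "l_punch": [0.40, 0.47, 0.43, 0.52, 0.48, 0.55],
}

selected = drop_redundant(features)
```

With these values, `W_push` is discarded because it is almost perfectly correlated with `F_push`, while `l_punch` is kept. Such a filter is unsupervised, so it cannot tell whether two correlated features nevertheless carry complementary information about the wear state.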

Evaluating the matrix shows a significant correlation between the maximal force (\(F_{{{\text{punch}}}}\)), the length (\(l_{{{\text{punch}}}}\)) and the work done (\(W_{{{\text{punch}}}}\)) in the punch-phase. As a result of progressive abrasive wear, the cutting edge becomes rounded (cutting edge radius \(r_{{\text{i}}}\)), stress peaks in the forming zone are reduced compared to a sharp cutting edge, and the plastic deformation phase is extended. Caused by this extended deformation, the percentage of the shear zone and the related phenomena increase with the length of the punch-phase (Klingenberg & Boer, 2008). Assuming that the progressive wear has only an insignificant influence on the maximal force of the punch-phase, an extension of the length of the punch-phase leads to an increased amount of work in this phase. Thus, redundancies between the work done and the length of the punch-phase are identified. In contrast, the work done in the push- and withdraw-phase mainly depends on the maximal force in each phase (\(F_{{{\text{push}}}}\) and \(F_{{{\text{with}}}}\)). Hohmann et al. show in their work that progressive wear leads to increased frictional forces between the punch and the sheet metal and thus to higher maximal forces in the push- and withdraw-phase (Hohmann et al. 2017). Therefore, a redundancy between the maximal forces and the work done in both phases is detected, which is also confirmed by high correlation coefficients of 0.99 for the push- and 0.97 for the withdraw-phase. The work done in all three phases is thus redundant to the maximal forces (\(F_{{{\text{push}}}}\) and \(F_{{{\text{with}}}}\)) and the length of the punch-phase (\(l_{{{\text{punch}}}}\)). These features are therefore negligible for the further training of the classification model. Furthermore, the literature indicates that abrasive wear on the cutting edge only marginally affects the maximal force in the punch-phase.
In addition, previous investigations within the scope of this work have shown that a resilient identification of the start and end points as well as the maximum force in the punch-phase is difficult to automate due to the limited formation of this phase. Since the determination of the minimum force in the withdraw-phase is more resilient than the determination of the length of this phase, the features \(l_{{{\text{punch}}}}\) and \(F_{{{\text{with}}}}\) are retained for further investigations. This selection is supported by the correlation matrix, which shows a moderate correlation of -0.83 between both features. At this point it should be mentioned that the correlation coefficients can be used for an unsupervised feature selection that excludes highly correlated features. However, this does not exclude the possibility that two correlated features both contain important information that is crucial for the further identification of the wear states (Hoppe et al. 2019). Figure 18 shows the qualitative influence of different feature combinations on the performance of the multiclass SVM. The training of the model is based on (a) the two major principal axes of the PCA (\(f_{1}\) and \(f_{2}\)), (b) two engineered features (\(l_{{{\text{punch}}}}\) and \(F_{{{\text{with}}}}\)) and (c) a hybrid approach (\(f_{1}\) and \(l_{{{\text{punch}}}}\)). Force signals are acquired with the force washer in the upper tool.

Fig. 18

Qualitative visualization of the multiclass SVM classifying five wear states comparing the feature selection approaches based on PCA features (a), engineered features (b) and a hybrid approach (c)

The hybrid approach provides the best separability of the classes. This is shown qualitatively in the visualization of the classification models and quantitatively in the calculated Mahalanobis distance, which is 35.2% higher in the hybrid approach than in the isolated PCA approach and 9.3% higher than in the isolated engineered features approach. This leads to the conclusion that feature selection can improve the performance of the model. The availability of domain knowledge and process expertise for the extraction and selection of the features plays a crucial role, especially in the feature engineering approach (Table 8).

Table 8 Influence of different feature selection approaches on the accuracy in predicting wear states with a multiclass SVM and Mahalanobis distance to quantify the separability of class 2 and 3
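How a Mahalanobis distance can quantify the separability of two classes in a two-dimensional feature space may be sketched as follows. The exact definition used for Tables 7 and 8 is not restated in this section, so the variant below (distance between the two class means, scaled by the pooled covariance of both classes) is an assumption, and the sample points are hypothetical.

```python
import math

def mean_vec(points):
    """Mean of a list of 2D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]

def scatter(points, mu):
    """2x2 scatter matrix: sum of outer products of the deviations from mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mu[0], p[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def mahalanobis_between(class_a, class_b):
    """Distance between class means, scaled by the pooled covariance (assumed definition)."""
    mu_a, mu_b = mean_vec(class_a), mean_vec(class_b)
    sa, sb = scatter(class_a, mu_a), scatter(class_b, mu_b)
    dof = len(class_a) + len(class_b) - 2
    cov = [[(sa[i][j] + sb[i][j]) / dof for j in range(2)] for i in range(2)]
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    inv = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]
    d = [mu_a[0] - mu_b[0], mu_a[1] - mu_b[1]]
    q = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    return math.sqrt(q)

# Hypothetical feature points of two wear classes
tight_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
tight_b = [(3.0, 0.0), (4.0, 0.0), (3.0, 1.0), (4.0, 1.0)]
# Same class means, but twice the spread within each class
wide_a = [(-0.5, -0.5), (1.5, -0.5), (-0.5, 1.5), (1.5, 1.5)]
wide_b = [(2.5, -0.5), (4.5, -0.5), (2.5, 1.5), (4.5, 1.5)]

d_tight = mahalanobis_between(tight_a, tight_b)
d_wide = mahalanobis_between(wide_a, wide_b)
```

Doubling the spread within the classes while keeping the class means fixed halves the distance in this example, which mirrors the observation that a higher variance in the features reduces the separability.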

Discussion

The results demonstrate that abrasive wear states can be classified by an mSVM, but data acquisition, data preprocessing and data transformation affect the performance of the model. Depending on these three steps as well as the number of strokes, the accuracy can range from 71 to 100%. It was found that the variance within the acquired data sets, described by the mean standard deviation \(\overline{s}\), correlates strongly with the performance of the classification model. Figure 19a shows the model performance as a double logarithmic function of the mean standard deviation for different sensor positions and sensor types and a stroke rate of 300 spm. With a decreasing mean standard deviation, as observed for the piezoelectric force sensors close to the forming zone, the accuracy of the model rises. A similar effect is seen for the separability represented by the Mahalanobis distance. In addition to the sensor position and the sensor type, the stroke speed influences the mean standard deviation. As a result of higher stroke speeds, dynamic effects are superimposed on the force signal. This increases the mean standard deviation of the time signals, regardless of which sensor is used or where the sensor is positioned. As an example of this effect, Fig. 19b shows, for the strain sensor in the press frame, the influence of the stroke speed on the model accuracy and separability correlated with the mean standard deviation as a double logarithmic plot.

Fig. 19

Dependency of the model performance on the mean standard deviation for all sensors at 200 spm (a) and the correlation between stroke speed and model accuracy depending on the mean standard deviation (b)
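A mean standard deviation of a set of stroke-wise force curves can be computed as in the following sketch. The definition of \(\overline{s}\) is not restated in this section; the version below (sample standard deviation across strokes at each sample point, averaged over the stroke) is an assumption, and the force curves are hypothetical.

```python
import statistics

def mean_standard_deviation(strokes):
    """Average the point-wise sample standard deviation over one stroke length.

    strokes: list of force curves with identical length, one per stroke.
    """
    per_sample_std = [statistics.stdev(values) for values in zip(*strokes)]
    return statistics.mean(per_sample_std)

# Hypothetical force curves (arbitrary units): same shape, one curve shifted by one sample
base = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0]

s_aligned = mean_standard_deviation([base, base, base])
s_shifted = mean_standard_deviation([base, shifted, base])
```

Even though all three curves have the same shape, a shift of a single sample in one of them raises the mean standard deviation from zero to a clearly non-zero value, which illustrates how time shifts and superimposed dynamic effects inflate the variance in the data set.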

From a Machine Learning point of view, a correlation can be established between the variance in a time signal and the performance of the mSVM. This variance in the data set is mainly caused by the sensor type and the sensor position (physical distance between sensor and forming zone). Looking at the commonly used sensor types for monitoring of forming processes (piezoelectric and resistive force sensors), the resistive sensors in particular show a significantly higher variance in the data set than the piezoelectric sensors. This is mainly due to the fact that resistive sensors tend to produce noise. Resistive sensors measure forces indirectly by a change in resistance caused by an elongation of the measuring grid. This change in resistance is not only determined by the elastic deformation of the measuring grid, but also by electrical noise caused by the electrical contacts, the length of the cable and the mounting process. In addition, resistive sensors are limited in representing high frequency ranges due to their mechanical design and inertia. In particular, using this sensor type for acquiring forces in a blanking process results in a loss of physically valuable information, as the resistive sensor is not capable of representing dynamic effects.

As mentioned in the "Data-driven monitoring of blanking processes" section, blanking processes run at over 1000 spm. Impacts are triggered when the punch hits the sheet metal and the elastic energy stored in the system is released abruptly. This causes oscillations in the process which, due to the Dirac-shaped impulse, distribute the physical information in the measured force signals over a wide frequency range. This effect is amplified as the stroke speed increases. In contrast to resistive sensors, piezoelectric sensors are able to improve the representation of dynamic effects during blanking due to their higher stiffness. In addition, they tend to be less susceptible to noise due to their compact design (protected from environmental influences such as vibrations, heat transfer, mechanical shocks, etc.) and their good electrical shielding, including the wires. Furthermore, compared to resistive sensors, piezoelectric sensors are less sensitive to temporal and spatial self-heating effects and the associated thermal noise.

In addition to the sensor type, it was shown that the performance of the mSVM is influenced by the physical distance of the sensor from the actual forming zone and its actual position in the tool. Depending on the distance between the sensor and the forming zone, the acquired force signal is influenced by the static and dynamic behavior of the press. The high accelerations generate inertial forces in the press frame, which have to be compensated by the mass balance of the press. In particular, the mass balance system of the Bruderer press (BSTA 810) is designed for large tools (\(<\) 250 kg), which causes unbalanced inertial forces when the experimental tool (\(\sim\) 100 kg) is used. As a result, an inertial force opposite to the movement of the ram is generated during the downward movement of the ram, which is superimposed on the signal of the blanking process. These inertial forces are especially noticeable in the strain sensor of the press frame. Figure 20 shows a comparison of the time series of the force sensor in the press frame (strain sensor frame) and in the upper tool (force washer upper tool).

Fig. 20

Comparison of the force signals of the piezoelectric force washers integrated in the upper and lower tool and of the strain sensor in the press frame at a stroke speed of 300 spm and with an unworn punch (\(r_{0}\))

From a Machine Learning perspective, the data set is enriched with information about the dynamic behavior of the press, which is not useful for predicting abrasive wear states. Moreover, even a small change in the data set by adding or removing data points can significantly influence the results of ML models (Nguyen et al. 2015). This effect is also seen in the positioning of sensors in the blanking tool. With increasing distance to the forming zone, more and more unwanted information of the overall system is superimposed on the physical information of the blanking process (e.g. inertial forces). Not only the physical distance to the forming zone but also the position inside the tool plays a crucial role for the performance of an ML model. Looking at the force washers (force washer upper tool and force washer lower tool) in Fig. 4, it is evident that the physical distance to the forming zone is unchanged, while the position in the tool is different. Nevertheless, the mean standard deviation over the entire stroke rate range for the force washer in the lower tool is approximately 62% higher than the mean standard deviation of the force washer in the upper tool, as shown in Table 5. This is caused by the uneven distribution of the masses between the force washer and the forming zone. The mass of the lower tool components \(m_{{\text{l}}}\) (adapter plate, columns and die holder) is greater than the mass of the upper tool components \(m_{{\text{u}}}\) (adapter plate and punch holder), resulting in a mass ratio of \(m_{{\text{l}}}/m_{{\text{u}}} \approx 10\). Due to the greater mass in the lower tool, vibrations caused by the impact of the punch on the sheet metal and by the material breakage increase the dynamic forces, as shown in Fig. 20.

In summary, these findings show that the variance in a time signal depends on the sensor type and its positioning in the tool and correlates with the performance of ML models. In order to generate a high-performance ML model, the variance in the input data has to be minimized. In the case of blanking, dynamic effects caused by the high stroke speeds in combination with electrical noise and physically induced vibrations are the key factors influencing this variance. In order to minimize this variance in the signal, there are two ways of adjusting it. Firstly, the sensor position and type must be explicitly designed for the application in a blanking process to directly minimize the variance in the input data. It could therefore be shown that piezo electrical force washers near the forming zone in the upper tool are suitable to monitor the wear state in order to improve the quality of the input variables and thus the performance of the mSVM. Secondly, the variance in the input data can be indirectly affected by the step of data preprocessing and data transformation. Filtering offers the possibility to remove undesired characteristics from the force signal, but on the other hand it bears the risk of losing physically relevant information. Therefore, it is crucial to design filter operations based on expert knowledge, process knowledge and empirical experience in a way that unwanted properties are removed from the signal without losing physically relevant signal components (see “Model performance depending on data preprocessing” section).

The feature selection procedure can also improve the model performance. For this purpose, it is necessary to select features that are robust against deviations in the time signal. As the results show, time domain features such as extreme values, first to fourth order statistical moments, and engineered features are affected by the variance in the signal. These findings are confirmed by Fig. 18, which shows a small value of the Mahalanobis distance for separating classes using only the engineered features length of the punch-phase \(l_{{{\text{p}},{\text{i}}}}\) and work done in the withdraw-phase \(W_{{{\text{w}},{\text{i}}}}\). In particular, such features are very sensitive to shifts on the time axis. Even an offset of a few samples, caused by vibrations or uncertainties in the measurement chain (shift of the starting point of the measurement), leads to significant differences in the value of the feature. Figure 21 shows the deviation of the engineered feature \(l_{{{\text{p}},{\text{i}}}}\) depending on the start and end point of the punch-phase. Even if the end point varies by only ± 2 samples, the length of the punch-phase varies by ± 0.05 mm. A similar effect can be seen when using PCA features, since the PCA does not distinguish between variance due to the measurement (e.g. electrical noise) and variance due to real underlying signal variations. Even if an estimate of the measurement variance is available, this information is not used in the construction of the eigenvectors, e.g. by weighting noisy data. Therefore, selecting robust features and combining them, as shown in the “Model performance depending on feature selection” section, can improve the performance of ML models.

Fig. 21

Influence of small shifts on the time axis on the value of the engineered feature length of the punch-phase \(l_{{{\text{p}},{\text{i}}}}\)
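The reported sensitivity can be retraced with a short calculation. If the ram travel per sample is assumed to be roughly constant over the punch-phase, a deviation of ± 0.05 mm for ± 2 samples corresponds to about 0.025 mm per sample. The sketch below uses this inferred value; the start and end indices are hypothetical and only serve to illustrate the effect.

```python
def punch_length_mm(start_idx, end_idx, mm_per_sample):
    """Length of the punch-phase from its start and end sample indices."""
    return (end_idx - start_idx) * mm_per_sample

# Inferred from the text: +/- 2 samples correspond to +/- 0.05 mm (assumes constant ram travel per sample)
MM_PER_SAMPLE = 0.05 / 2

# Hypothetical start and end indices of the punch-phase
START_IDX, END_IDX = 100, 140

nominal = punch_length_mm(START_IDX, END_IDX, MM_PER_SAMPLE)
deviations = {
    shift: punch_length_mm(START_IDX, END_IDX + shift, MM_PER_SAMPLE) - nominal
    for shift in (-2, -1, 0, 1, 2)
}
```

For the hypothetical nominal length of 1 mm, a misplacement of the end point by two samples already changes the feature value by 5%, which makes clear why such a feature reacts strongly to small shifts on the time axis.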

Summary and conclusion

This study examines the influence of data acquisition, preprocessing and transformation on the performance of an mSVM to classify abrasive wear states during blanking based on force signals. The performance of the model was quantitatively evaluated based on the model accuracy and the separability of the classes. In particular, the extent of variance in the time signals significantly influences the performance of the classification model. This variance depends on the sensor types and positions as well as the measuring method (direct measurement vs. indirect measurement) and can be reduced by data preprocessing as well as a suitable extraction and selection of features. Therefore, the acquisition of a valid and trustworthy data set combined with data preprocessing and transformation assisted by domain knowledge supports the successful implementation of AI projects.

Furthermore, this study shows for the first time a successful procedure for predicting wear states during blanking, even at high stroke speeds and under industrial conditions. In particular, the use of piezoelectric sensors in the direct force flux close to the forming zone provides suitable results from a technical and economical point of view. A subsequent preprocessing step that filters the data with a filter operation designed on the basis of domain knowledge further improves the model performance. The data transformation can also improve the model performance, especially through the combination of different parameters (PCA features and engineered features) and the selection of an optimal feature space. The methods investigated so far are limited to domains like machining or process engineering considering linear transformations in the time domain. Since blanking is a nonlinear and highly dynamic process, the extension of extraction methods to nonlinear features from the frequency or time–frequency domain is necessary. In addition to force signals, further process variables (e.g. acceleration, acoustic emissions (AE), etc.) and sensor types have to be investigated with respect to their suitability for the identification of process states during blanking. Furthermore, the transferability of the quantified influences of data acquisition, preprocessing and transformation on the performance of ML algorithms to other manufacturing processes should be demonstrated.