1 Introduction

Susceptibility of steels and alloys to hydrogen embrittlement (HE) is a problem of many aspects. Depending on the material microstructure, stress state, hydrogen diffusivity, and solubility, the mechanism of hydrogen-induced damage and HE varies. A number of HE mechanisms have been proposed for structural steels, such as hydrogen-enhanced decohesion (HEDE) [1, 2], stress-induced hydride formation and cleavage [3, 4], hydrogen-enhanced localized plasticity (HELP) [5,6,7], adsorption-induced dislocation emission (AIDE) [8, 9] and hydrogen-enhanced stress-induced vacancy (HESIV) [10,11,12] mechanisms. However, in most cases a combination of mechanisms is involved in the hydrogen-assisted damage of steels, which complicates the modeling and prediction of hydrogen-induced crack nucleation and growth. Hydrogen can diffuse into steels and accumulate during all stages of their lifecycle, namely during production at the mills, manufacturing of the structural components, and exploitation. For example, quenching of steels in water or oil, and even air quenching, can raise the total hydrogen concentration to levels sufficient for hydrogen-induced cracking [13, 14] of high-strength steels. In the continuous galvanizing process, the steels are prepared for coating in an H2–N2 furnace. Hydrogen uptake after galvanizing increases with the hydrogen content of the annealing atmosphere and depends significantly on the steel microstructure [15]. Hydrogen uptake is controlled mainly by diffusion; however, a possible chemical effect during coating of the steel cannot be excluded. Many other processes, such as chemical surface cleaning, electroplating, electrochemical machining, pickling, cathodic protection, welding and carburizing, also cause hydrogen uptake into the steel as raw material or components.

The effect of hydrogen-assisted damage can be significantly decreased by heat treatment at some stages of steel processing, which reduces the total hydrogen concentration in solid solution [14, 16, 17]. It is worth noting that the diffusivity of hydrogen in steels typically increases with tempering, because recovery and recrystallization reduce the density of defects that affect hydrogen detrapping and diffusion [18, 19]. However, tempering at about 500 °C may cause some decrease in hydrogen diffusivity, apparently associated with precipitation of nonmetallic inclusions (NMI), as observed by Sakamoto et al. [18] for martensitic type 403 stainless steel. Greater resistance to HE was observed in X37CrMoV5-1 steel after austempering in the bainitic transformation zone compared to the tempered martensite structure of the same material [18]. The improved performance of the steel in the presence of hydrogen was attributed to the large density of retained-austenite interfaces, which trap hydrogen and reduce its diffusivity and permeability through the steel [20,21,22]. During exploitation, hydrogen can originate from corrosion reactions [23,24,25], from transmutation reactions [26], or from exposure to a hydrogen-enriched environment [27, 28].

From the above, one can conclude that the diffusivity of hydrogen in steels is a key parameter controlling HE. At the same time, the diffusivity depends significantly on the microstructure of the steel. Interfaces between dissimilar phases or NMIs in steels, as well as other defects of high density, can suppress hydrogen diffusion toward the stress-affected zone, increasing the time until hydrogen-assisted damage occurs. However, some defects may act as stress concentrators, facilitating hydrogen-assisted damage in the presence of stress. One can assume that the susceptibility of steels and alloys to hydrogen embrittlement can be evaluated from knowledge of the hydrogen diffusivity and microstructure of the studied steel. A practical tool for evaluating the sensitivity of steels to hydrogen can significantly improve the reliability and durability of steel components in many engineering applications, such as heat exchangers, boiler tubes, steam pressure vessels, and hydrogen storage and transport, by selecting the most appropriate steel from different grades, or from different batches of the same grade, at the very beginning of the construction work.

The use of artificial neural networks (ANN) is a growing trend in the materials science and engineering research community [29,30,31,32,33]. The large body of experimental results collected during the last decades is being used in ANN models that target a variety of engineering problems [29]. In many applications, the chemical composition of steels and alloys, together with their mechanical properties and other experimental parameters, is used as the input data for the ANN model [29,30,31]. The present work aims to predict the hydrogen-induced ductility reduction using only hydrogen thermal desorption spectroscopy (TDS) data, assuming that the TDS data contain the required information about hydrogen diffusion, trapping and detrapping, which in turn depend on the microstructure of the studied materials. One can assume that the relationship between the thermal desorption spectroscopy data and the ductility reduction of steel is a classic problem of fuzzy logic (FL) [34]. In the field of artificial intelligence, neuro-fuzzy systems were proposed as a combination of ANN and FL [35, 36]. The main strength of neuro-fuzzy systems is that they can interpret if–then rules, where thermal desorption spectroscopy data and ductility reduction are labels of fuzzy sets that can be characterized by an appropriate membership function. Neuro-fuzzy systems were designed to incorporate a human-like reasoning style; however, we assume that the relationship between hydrogen thermal desorption spectroscopy and the hydrogen susceptibility of steels has a complex mathematical basis that a properly developed ANN system can help to reveal. The present work focuses on the use of simple nonlinear artificial neurons as units of the ANN model. Advanced fuzzy logic algorithms will be considered in future work [37, 38].
Interest in applying ANN to spectroscopy data analysis is growing among chemists, while in materials science this approach is novel [32, 33]. Nevertheless, the use of ANN is a promising way to test the hypothesis of a correlation between hydrogen thermal desorption spectroscopy data and hydrogen-induced degradation of the mechanical properties of steels and alloys.

The objective of this paper is to introduce a conceptual approach for evaluating the susceptibility of steels to HE using the ANN model. In the future, such an approach will enable lifetime assessment of steels by predicting the hydrogen-induced degradation of the mechanical properties and microstructure without time-consuming, costly and labor-intensive experimental research (see Fig. 1).

Fig. 1

Overview for quantitative ANN-coupled TDS-based analysis of hydrogen-induced steel properties degradation

2 Experimental

2.1 Materials

Austenitic stainless steels (ASS), ferritic stainless steels (FSS) and ferritic-martensitic high-strength steels (FMHSS) of different microstructures and mechanical properties were chosen for development and validation of the ANN models for evaluation of the hydrogen sensitivity parameter (HSP). Names and chemical compositions of the studied steels are listed in Table 1. The HSS grades VA1000_M05, VA1000_TM05, VA1200_MTM, VA1400_TM and VA1400_MTM have the same chemical composition but were subjected to different heat treatment procedures during manufacturing, which affect the microstructure and mechanical properties. The mechanical properties, microstructure, and heat treatment procedures of the studied high-strength steels are described in detail by Hickel et al. [39].

Table 1 Chemical composition of the steels selected for training and validation of the ANN models, wt%

The steels with different susceptibility to HE were selected to develop the ANN-based method for evaluation of hydrogen sensitivity parameter of steels using hydrogen thermal desorption spectroscopy (TDS) as input data. The experimental procedure can be divided into three main steps: mechanical testing, thermal desorption spectroscopy and artificial neural network modeling.

2.2 Mechanical testing

The studied steels were tested in tensile mode by constant extension rate tests (CERT) with a constant extension rate of \(10^{-4}\,{\text{s}}^{-1}\) in the as-supplied condition and during continuous electrochemical hydrogen charging. Tensile specimens were cut from the steel plates using electrical discharge machining (EDM). The shape and dimensions of the tensile specimens are shown in Fig. 2. The electrochemical hydrogen charging was performed in 1 N H2SO4 solution with 20 mg/l of thiourea. Specimens tested during continuous hydrogen charging were pre-charged to approach a homogeneous hydrogen distribution through the 1-mm specimen thickness. The parameters of the electrochemical hydrogen charging defined for each steel grade are summarized in Table 2.

Fig. 2

Schematic view of the CERT specimen. Thickness of the specimen is 1 mm. All sizes are in mm

Table 2 Hydrogen charging parameters for the studied steels

Hydrogen charging increases the hydrogen concentration in the studied steels, which degrades their mechanical properties. The hydrogen-induced reduction of the elongation to fracture was chosen as the parameter for evaluating the sensitivity of the steels to hydrogen. The hydrogen sensitivity parameter (HSP) was calculated for each steel grade as:

$${\text{HSP}} = \frac{{\left( {\varepsilon - \varepsilon_{\text{H}} } \right)}}{\varepsilon } \cdot 100\%$$
(1)

where \(\varepsilon\) is the elongation to fracture of the as-supplied specimen and \(\varepsilon_{\text{H}}\) is the elongation to fracture of the H-charged specimen.
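Equation 1 can be illustrated with a minimal Python sketch. The function name and the elongation values below are hypothetical and chosen only for illustration; they are not taken from Table 3.

```python
def hydrogen_sensitivity_parameter(elongation_as_supplied, elongation_h_charged):
    """Return the HSP in percent, following Eq. 1."""
    if elongation_as_supplied <= 0:
        raise ValueError("elongation of the as-supplied specimen must be positive")
    return (elongation_as_supplied - elongation_h_charged) / elongation_as_supplied * 100.0


# Hypothetical elongations to fracture in percent (illustrative only)
hsp_example = hydrogen_sensitivity_parameter(20.0, 5.0)
print(f"HSP = {hsp_example:.1f} %")
```

An HSP of 0% means that hydrogen charging did not reduce the elongation to fracture, while values approaching 100% indicate an almost complete loss of ductility.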

2.3 Thermal desorption spectroscopy

TDS specimens with a size of about 1 × 5 × 15 mm were cut from the studied steel plates using EDM. The specimens were ground with 350-grit emery paper using a Struers LaboPol-25 grinding machine and cleaned with acetone before measurement. All specimens were tested in the as-supplied condition, so that only the metallurgical hydrogen accumulated in the steels during their production was considered.

The hydrogen desorption rate was measured using the TDS technique in the temperature range from room temperature (RT) to 1070 K with a linear heating rate of 10 K/min. The air-lock chamber 1 hosts the specimen until an intermediate low pressure is achieved, as shown schematically in Fig. 3. Then, the specimen is transferred to the furnace located in the ultra-high vacuum (UHV) chamber 2. The measurement is performed in the UHV chamber by mass spectrometer 3, starting from an ultimate pressure of about \(2 \times 10^{-8}\) mbar.

Fig. 3

Schematic representation of the TDS technique. 1—air-lock chamber, 2—ultra-high vacuum chamber (UHV) with furnace, 3—mass spectrometer

2.4 Artificial neural network (ANN) modeling

The thermal desorption spectra were used as the inputs for training the regression ANN models. The target output is the HSP. A database of thermal desorption spectra was created; it contains a dataset for each material, associated with the HSP measured experimentally by CERT. The basic element of an ANN is the neuron, or node, depicted in Fig. 4.

Fig. 4

The elements of a neuron of the artificial neural network

The input data (\(x_{1} \ldots x_{n}\)) are multiplied by the weights (\(\omega_{1} \ldots \omega_{n}\)), and the bias (b) is added to their sum. The weights and bias transform the input data linearly. A complex nonlinear transformation between the input data and the output is made possible by the activation function, which enables the ANN to learn faster and more efficiently [40].
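The operation of a single neuron described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a ReLU activation; the function names and the numerical values of the inputs, weights and bias are hypothetical.

```python
import numpy as np


def relu(z):
    """Rectified linear unit: passes positive values, clips negatives to zero."""
    return np.maximum(0.0, z)


def neuron(x, w, b, activation=relu):
    """Weighted sum of the inputs plus bias, followed by the activation."""
    return activation(np.dot(w, x) + b)


# Hypothetical inputs, weights and bias (illustrative only)
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.2, 0.4, 0.1])
b = 0.3

linear_part = float(np.dot(w, x) + b)   # linear transformation of the input
output = float(neuron(x, w, b))         # nonlinear output of the neuron
print(linear_part, output)
```

Without the activation function, stacking such neurons would only compose linear maps, so the network as a whole would remain linear in its input.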

Development, training and validation of the ANN were performed in Python programming language using Keras open-source neural network library running on top of TensorFlow software for machine learning applications. Figure 5 shows the general view of the deep learning process of the developed regression ANN model. The ANN model is feed-forward and comprises a number of densely connected layers, where each neuron receives the input from all the neurons of the previous layer [41].

Fig. 5

General view of the deep learning process of the ANN model

The input vector contains the hydrogen desorption rate data of the thermal desorption spectra (see Fig. 6a). A normalization procedure is typically applied to the input data to scale them to the range from 0 to 1 [29, 41]. However, the TDS input data do not go beyond this range, so normalization was omitted. The number of neurons in the first layer of the ANN model corresponds to the input vector size [29, 41]. The thermal desorption spectra were interpolated to optimize the number of data points and to equalize the size of the TDS data, forming an input database of 66 data pairs for the ANN learning process.
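The interpolation step can be sketched as follows. This is only an illustration under stated assumptions: linear interpolation onto a uniform temperature grid is assumed (the interpolation scheme is not specified in the text), and the spectrum is synthetic. The function name `resample_spectrum` is hypothetical; only the vector size of 381 comes from the text.

```python
import numpy as np


def resample_spectrum(temperature_k, desorption_rate, n_points=381):
    """Linearly interpolate a spectrum onto a uniform grid of n_points."""
    grid = np.linspace(temperature_k[0], temperature_k[-1], n_points)
    return grid, np.interp(grid, temperature_k, desorption_rate)


# Synthetic spectrum with irregular sampling (not measured data)
rng = np.random.default_rng(0)
t_raw = np.sort(rng.uniform(300.0, 1070.0, size=1200))
rate_raw = np.exp(-((t_raw - 600.0) / 80.0) ** 2)  # single peak, arbitrary units

grid, s = resample_spectrum(t_raw, rate_raw)

# Check whether the input already lies in [0, 1]; if so, no scaling is needed
needs_scaling = bool(s.min() < 0.0 or s.max() > 1.0)
print(s.shape, needs_scaling)
```

Resampling every spectrum to the same length gives all input vectors an identical size, which a network with a fixed number of input neurons requires.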

Fig. 6

Description of the architecture of the proposed ANN models. Thermal desorption spectra are interpolated to produce the input of the neural network as a vector S of hydrogen desorption rate (a). Size of the input vector is 381 and 762 for DANN and CANN models, respectively. DANN model consists of five densely connected layers (b). CANN model consists of the input layer followed by a sequence of Conv1D layers and MaxPooling1D layers (c). The densely connected layer comprising a single neuron is added at the end as the output layer in both proposed models

Two ANN models were developed (see Fig. 6b, c). The first is a regression ANN model consisting of four densely connected layers (hereafter DANN). The input vector size of the DANN model is 381. The input layer comprises 381 neurons with rectified linear unit (ReLU) activation [42]. The hidden part of the network (see Fig. 5) comprises two densely connected layers of 512 and 1536 neurons, respectively, with the same activation function. The network ends with the output layer (see Figs. 5, 6b), comprising a single neuron with the sigmoid activation function [41, 42]. In general, the DANN topology can be summarized as 381-512-1536-1. The second model (hereafter CANN) is a convolutional neural network, a type of model often used for time-series comparison, classification or forecasting [41], consisting of three convolution (Conv1D) layers. The input vector size of the CANN model is 762. The three Conv1D layers comprise 100, 160 and 160 filters, respectively, with the same kernel size of 30 and ReLU activation. The first convolutional layer is fed with the TDS data frame of 762 data points. After each Conv1D layer, a max-pooling operation is added using MaxPooling1D layers with a pool size of 2 and the same stride [41]. A densely connected layer (Dense) comprising a single neuron is added at the end, as in the DANN model. Detailed information about the utilized algorithms is presented in Ref. [41].
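The stated topologies can be made concrete with a small NumPy sketch. It is not the Keras code used in this work: it is a stand-in forward pass of the 381-512-1536-1 DANN with random, untrained weights, plus the sequence-length arithmetic of the CANN. The weight initialization, the `valid` convolution padding with stride 1, and all function names are assumptions made for illustration.

```python
import numpy as np

INPUT_SIZE = 381
DANN_TOPOLOGY = (381, 512, 1536, 1)  # neurons per dense layer, as stated in the text


def init_dense_layers(input_size, topology, seed=0):
    """Create (weights, bias) pairs with random weights (assumed initialization)."""
    rng = np.random.default_rng(seed)
    layers, fan_in = [], input_size
    for units in topology:
        w = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, units))
        layers.append((w, np.zeros(units)))
        fan_in = units
    return layers


def dann_forward(s, layers):
    """ReLU on all layers except the last, which uses a sigmoid."""
    a = s
    for w, b in layers[:-1]:
        a = np.maximum(0.0, a @ w + b)
    w, b = layers[-1]
    return 1.0 / (1.0 + np.exp(-(a @ w + b)))


def conv_pool_length(n, kernel=30, pool=2, blocks=3):
    """Sequence length after repeated Conv1D + MaxPooling1D blocks."""
    for _ in range(blocks):
        n = n - kernel + 1  # 'valid' convolution with stride 1 (assumption)
        n = n // pool       # pooling with pool size equal to stride
    return n


layers = init_dense_layers(INPUT_SIZE, DANN_TOPOLOGY)
n_params = sum(w.size + b.size for w, b in layers)
y = dann_forward(np.full(INPUT_SIZE, 0.5), layers)
cann_length = conv_pool_length(762)
print(n_params, y.shape, cann_length)
```

Since a sigmoid output lies between 0 and 1, the sketch implicitly assumes that the HSP target is expressed as a fraction rather than in percent. With more than a million trainable weights and only 46 training pairs, such a network overfits easily, which motivates the validation strategy discussed in the text.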

The supervised learning process is intended to map input examples of TDS data to known HSP targets by changing the connection weights, so that the ANN generates outputs as close as possible to the true targets. In order to evaluate the ANN model performance, the existing database of input TDS data and HSP targets was split into training, validation and test sets. Because of the limited amount of training and validation data, different validation approaches were considered. K-fold cross-validation with shuffling is promoted as the most promising method when relatively little training data is available [41]. However, with this approach the developed ANN models overfitted, showing a relatively high misfit of about 10% on the validation set. The simple hold-out validation approach implemented in this research was found to be more efficient [41]. The mean squared error (MSE) loss function was applied in the developed models, as it is the most commonly used for regression problems [29, 41]. The MSE value used as a loss score between the experimental and predicted HSP data is calculated by Eq. 2:

$${\text{MSE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left( {Y_{i} - Y_{i}^{{\prime }} } \right)^{2}$$
(2)

where n is the number of predictions, \(Y_{i}\) is the experimental output data, and \(Y_{i}^{{\prime }}\) is the output data predicted by the developed ANN. Based on the loss score, the ANN learning process was performed by the RMSprop adaptive learning rate method, a form of stochastic gradient descent proposed by Geoff Hinton [43, 44].
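The interplay of the MSE loss (Eq. 2) and an RMSprop-type update can be sketched on a toy problem. The example below fits a small linear model rather than the ANN of this work. The learning rate, decay factor `rho`, `eps`, the number of epochs and the synthetic data are illustrative assumptions, not the settings used in the study.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error, Eq. 2."""
    return float(np.mean((y_true - y_pred) ** 2))


def rmsprop_fit(x, y, lr=0.001, rho=0.9, eps=1e-7, epochs=3000):
    """Fit y ~ x @ w by full-batch RMSprop on the MSE loss."""
    w = np.zeros(x.shape[1])
    avg_sq_grad = np.zeros_like(w)
    for _ in range(epochs):
        residual = x @ w - y
        grad = 2.0 * x.T @ residual / len(y)  # gradient of the MSE w.r.t. w
        # Running average of squared gradients scales the step per weight
        avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad ** 2
        w -= lr * grad / (np.sqrt(avg_sq_grad) + eps)
    return w


# Synthetic regression data (46 samples, mirroring the training set size only)
rng = np.random.default_rng(1)
x = rng.normal(size=(46, 3))
y = x @ np.array([0.5, -0.2, 0.1])

w_fit = rmsprop_fit(x, y)
loss_before = mse(y, x @ np.zeros(3))
loss_after = mse(y, x @ w_fit)
print(loss_before, loss_after)
```

Dividing each gradient component by the root of its running mean square gives every weight a step of comparable size, regardless of the raw gradient magnitude.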

Mean absolute error (MAE) is the metric that is monitored during the training of the ANN models to measure the absolute value of the difference between the predictions and the targets. MAE is a common regression metric that allows evaluating the performance of the regression ANN model [29, 41]. MAE is calculated using Eq. 3:

$${\text{MAE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left| {Y_{i}^{{\prime }} - Y_{i} } \right|$$
(3)

where n is the number of predictions, \(Y_{i}\) is the experimental output data, and \(Y_{i}^{{\prime }}\) is the output data predicted by the developed ANN model [45].
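Equations 2 and 3 can be compared on a small worked example. The HSP values below are hypothetical and serve only to show how the two scores differ.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Eq. 2: average of the squared deviations."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


def mean_absolute_error(y_true, y_pred):
    """Eq. 3: average of the absolute deviations."""
    return float(np.mean(np.abs(np.asarray(y_pred) - np.asarray(y_true))))


# Hypothetical measured and predicted HSP values in percent
hsp_measured = np.array([60.0, 70.0, 80.0])
hsp_predicted = np.array([62.0, 67.0, 80.0])

mae_value = mean_absolute_error(hsp_measured, hsp_predicted)  # (2 + 3 + 0) / 3
mse_value = mean_squared_error(hsp_measured, hsp_predicted)   # (4 + 9 + 0) / 3
print(mae_value, mse_value)
```

The MAE is expressed in the same units as the HSP, which makes it convenient as a monitored metric, whereas the MSE penalizes large deviations more strongly and is therefore used as the loss.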

3 Results and discussion

The elongation to fracture and HSP of the studied steels tested in as-supplied and H-charged conditions were obtained from CERT results and summarized in Table 3. The calculated HSP values were linked with corresponding thermal desorption spectra for all the studied steels.

Table 3 The elongation to fracture and HSP obtained from CERT (Note: three separate batches of VA1400_MTM steel grade were studied)

MAE-based validation of the developed models was carried out to define the appropriate parameters of the ANN model (the hyperparameters), such as the number of neurons in the hidden layers, the number of hidden layers, and the number of training iterations (epochs) [41]. The validation dataset was separated from the training dataset used for training the developed ANN models. The training dataset contains 46 data pairs, while 10 data pairs each were reserved for the validation and test datasets. The validation MAE was calculated after each epoch of the ANN training to evaluate the accuracy of the ANN model (see Fig. 7). The number of hidden layers and the corresponding number of neurons were selected by trial and error, since a general-purpose method for defining the ANN topology does not yet exist [41]. First, an ANN model with one hidden layer containing many neurons was designed so that the model overfitted in just a few epochs. Then, the number of neurons was decreased gradually to find a model topology with a low overfitting rate. The number of hidden layers and the corresponding number of neurons were adjusted to decrease the MAE on the validation dataset. There are, however, some recommendations to use a single hidden layer for a small training dataset, which improves the capability of the ANN model to train successfully and predict the results [46]. The validation MAE decreases during the first 500 epochs, as shown in Fig. 7. However, after 500 epochs the validation MAE starts to grow, evidencing overfitting of the ANN model [41]. The best validation MAE was obtained by training the ANN model for about 500 epochs with early stopping of the training process, programmed to save the best model parameters corresponding to the minimum MAE on the validation dataset.
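The hold-out split and the selection of the best epoch can be sketched as follows. The split sizes (46/10/10) follow the text. The random shuffling, the function names and the synthetic validation-MAE curve are assumptions made for illustration and do not reproduce Fig. 7.

```python
import numpy as np


def hold_out_split(n_samples, n_val=10, n_test=10, seed=0):
    """Shuffle sample indices and split them into train/validation/test."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return train, val, test


def best_epoch(val_mae_history):
    """Return the 1-based epoch with the lowest validation MAE and its value."""
    history = np.asarray(val_mae_history, dtype=float)
    epoch = int(np.argmin(history))
    return epoch + 1, float(history[epoch])


train_idx, val_idx, test_idx = hold_out_split(66)

# Synthetic validation-MAE curve: decreases, then grows as overfitting sets in
epochs = np.arange(1, 1001)
val_mae = 2.0 + ((epochs - 500) / 250.0) ** 2

stop_epoch, stop_mae = best_epoch(val_mae)
print(len(train_idx), len(val_idx), len(test_idx), stop_epoch, stop_mae)
```

In practice, the model weights are saved whenever the validation MAE improves, so that the parameters from the best epoch can be restored after training.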

Fig. 7

Validation MAE by epoch calculated for the developed DANN (a) and CANN (b) models

The HSP values were predicted from the TDS validation dataset using the chosen DANN topology and training parameters. The HSP predicted from the validation dataset shows a good correlation with the experimental data (see Fig. 8). The observed correlation evidences that the DANN model can be trained to predict the HSP using hydrogen thermal desorption spectroscopy data as the input. The Pearson correlation coefficient (R) was calculated to assess the accuracy of the developed DANN model. The R-value is a statistical measure of the linear correlation between the target variable and the predicted variable. The R-value calculated for the experimental and predicted HSP values is 0.99 on the validation dataset (see Fig. 8). This result indicates that the developed DANN model captures a statistical relationship between the rate of hydrogen thermal desorption and the reduction of the elongation to fracture caused by hydrogen.
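The Pearson correlation coefficient used above can be computed as in the following sketch. The measured and predicted HSP values are hypothetical and are not taken from Fig. 8.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson correlation coefficient between two equally sized samples."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    da, db = a - a.mean(), b - b.mean()
    return float(np.sum(da * db) / np.sqrt(np.sum(da ** 2) * np.sum(db ** 2)))


# Hypothetical measured and predicted HSP values in percent
measured = np.array([10.0, 30.0, 50.0, 70.0, 90.0])
predicted = np.array([12.0, 29.0, 52.0, 68.0, 91.0])

r_value = pearson_r(measured, predicted)
print(f"R = {r_value:.4f}")
```

Note that R measures only linear association: predictions with a constant offset or scale error would still give R close to 1, which is why the MAE is reported alongside it.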

Fig. 8

Correlation between the experimental and predicted HSP calculated on the validation dataset using the DANN model

Considering the validation dataset, the developed DANN model can predict the HSP with high accuracy for inputs that lie within the probabilistic distribution of the training data. However, every time the hyperparameters of the DANN model are adjusted to improve the performance on the validation dataset, some information about the validation data leaks into the model [41]. In general, such tuning is also a learning process, comprising the search for a good ANN model configuration in some parameter space. This information leak may result in overfitting of the developed DANN model to the validation dataset. Use of a completely new, never-before-seen dataset (test dataset) gives a more reliable evaluation of the DANN model performance [41]. HSP values of the studied steels were predicted on the test dataset using the developed DANN model and compared with the experimental data, as shown in Fig. 9. The MAE of the developed DANN model on the test dataset was about 2.8%, and the R-value was about 0.98.

Fig. 9

Correlation between the experimental and predicted HSP calculated on the test dataset using the DANN model

The developed DANN model performs better on the validation dataset than on the test dataset, evidencing some overfitting due to the information leak caused by the hyperparameter search. The difference, however, is not significant, evidencing that the developed DANN model with topology 381-512-1536-1, trained with the backpropagation algorithm, can predict the effect of hydrogen on ductility in the form of HSP for specified experimental conditions that lie within the probabilistic distribution of the training data. It is worth noting that the DANN model was trained, validated, and tested using limited training, validation, and test datasets. Increasing the amount of TDS data in the training and validation datasets for each steel grade can mitigate the overfitting and improve the efficiency of the ANN model. One can also observe from Fig. 9 that the deviation of the HSP values predicted on the test dataset is much smaller for the austenitic and ferritic stainless steel grades (specimens 1–3) than for the high-strength steels (specimens 4–10). Such behavior can be caused by the considerably high microstructural inhomogeneity, coupled with the low hydrogen content, of the studied high-strength steels, leading to some changes in hydrogen diffusivity, trapping, and detrapping.

The CANN model was applied to study the correlation between the input TDS data and the output HSP values. A hydrogen thermal desorption spectrum can be represented as a sequence of measurement data obtained within a certain period of time. Since the heating rate applied during the measurement is linear (10 K/min), the temperature axis maps onto the time axis with a factor of 10 K/min. Figure 10 shows examples of the hydrogen TDS spectra obtained from austenitic stainless steel, ferritic stainless steel, and high-strength ferritic-martensitic steel specimens. The TDS spectra of each material reveal a specific pattern. The CANN model was used to identify the specific patterns within the hydrogen TDS spectra that correspond to the sensitivity of the studied steel to HE.
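The correspondence between the time and temperature axes under a linear heating ramp can be written out as a short sketch. The heating rate of 10 K/min and the end temperature of 1070 K come from the text; the start temperature of 293 K is an assumed value for room temperature, and the function names are hypothetical.

```python
HEATING_RATE_K_PER_MIN = 10.0   # linear heating rate stated in the text
ASSUMED_RT_K = 293.0            # assumed room temperature (not given in the text)


def temperature_at(time_min, start_temperature_k=ASSUMED_RT_K):
    """Specimen temperature after time_min minutes of linear heating."""
    return start_temperature_k + HEATING_RATE_K_PER_MIN * time_min


def time_to_reach(temperature_k, start_temperature_k=ASSUMED_RT_K):
    """Minutes of linear heating needed to reach temperature_k."""
    return (temperature_k - start_temperature_k) / HEATING_RATE_K_PER_MIN


ramp_duration_min = time_to_reach(1070.0)
print(f"ramp duration: {ramp_duration_min:.1f} min")
```

Because this mapping is linear, a spectrum sampled uniformly in temperature is also sampled uniformly in time, so it can be treated as a time series by a one-dimensional convolutional network.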

Fig. 10

Examples of the color plot of the TDS spectra. Color bar shows the range of the hydrogen thermal desorption rate

The developed CANN model performs slightly better on the validation dataset than the DANN model, as one can see from Fig. 11. The MAE and R-value were found to be about 1.1% and 0.99, respectively. However, the accuracy of the CANN model decreases significantly on the test dataset, as shown in Fig. 12. The MAE and R-value calculated on the test dataset were about 4.5% and 0.93, respectively. One can assume that the CANN model is overfitted to the validation dataset; however, attempts to improve the generalization of the CANN model did not significantly improve the HSP prediction. It is worth noting that the CANN model performs well on the TDS data of the austenitic and ferritic steel grades. The major misfit with the experimental data was obtained for the high-strength steel grades. The probable reason is the relatively high dispersion of the TDS data obtained for the HSS grades compared to that for the austenitic and ferritic steel grades. Enlarging the training and validation datasets for the HSS grades may improve the performance of the CANN model. One can conclude that the developed DANN model performs better than the CANN model.

Fig. 11

Correlation between the experimental and predicted HSP calculated on the validation dataset using the CANN model

Fig. 12

Correlation between the experimental and predicted HSP calculated on the test dataset using the CANN model

To study the versatility of the DANN model, HSP values were predicted for steels whose TDS data were not exposed to the developed DANN model during the training process. The list of chosen steels with the corresponding HSP values predicted using the DANN model is shown in Table 4. The HSP values measured experimentally for AISI 409 and AISI 441 steels increase from 66.3% to 73.3% as the chromium content increases from 12% to 18% (see Tables 1, 3). The HSP predicted for 1319L2 ferritic stainless steel, containing 21% of chromium, follows this trend, increasing to 78.7%. Usually, hydrogen-induced fracture initiates at nonmetallic inclusions (NMI) of Al/Ti oxides or Nb carbides and propagates transgranularly, forming a quasi-cleavage fracture surface [47, 48]. However, the concentration of chromium in solid solution and/or the formation of chromium carbides/oxides may change the hydrogen diffusion, trapping and detrapping, increasing the susceptibility of the steel to HE.

Table 4 The HSP values predicted by DANN for the steels not exposed to the model during training

Using TDS data obtained from AISI 316 austenitic stainless steel, the developed DANN model predicted an HSP of about 64%, which is close to the HSP measured experimentally for AISI 304 austenitic steel (61%). AISI 316 steel is less susceptible to stress corrosion cracking (SCC) than AISI 304 steel due to the addition of molybdenum [49, 50]. However, the susceptibility of the AISI 304 and 316 steel grades to HE was found to be similar, which supports the obtained results [51].

Hydrogen TDS data of F82H martensitic steel were processed using the developed DANN model, which predicted an HSP of about 82%. The effect of hydrogen on the ductility reduction of F82H steel was studied by Beghini et al. [52], who considered the reduction of area at failure as a measure of the hydrogen sensitivity. From their experimental results, the HSP can be calculated to be 79.8–99.6%, depending on the tempering procedure and the concentration of hydrogen in the steel [52]. The predicted HSP is comparable with that measured for the F82H martensitic steel heat-treated for 2 h at 750 °C followed by cooling in air [52].

From the above, one can conclude that the developed DANN model provides a reasonable prediction of the HSP that correlates well with the experimental data. However, the developed model is probably not applicable to steels subjected to strain-induced phase transformation, because of the corresponding change in hydrogen diffusion, trapping and detrapping. Changes of the steel microstructure during exploitation must also be considered. Nevertheless, systematic improvement of the ANN model may result in the development of a powerful new tool for characterizing the susceptibility of steels to HE. The effect of hydrogen on the mechanical properties, namely the elongation to fracture, yield point, reduction of area, tensile strength, and formability, can be predicted using an appropriate learning procedure [29]. At the same time, the use of hydrogen thermal desorption spectra as the input data for the ANN model is a promising way to characterize steels of similar chemical composition and different microstructural properties. The developed method of hydrogen sensitivity characterization can effectively complement the steel manufacturing process, reducing the costs that companies incur on defective products.

4 Conclusions

The relationship between hydrogen thermal desorption spectroscopy data and the effect of hydrogen on the reduction of elongation to fracture of austenitic and ferritic stainless steels and high-strength steel grades was studied. The results evidence a good correlation (R = 0.98) between the experimentally measured HSP values of the studied steels and the HSP values predicted using the developed DANN model. The DANN model performs better than the CANN model on the available datasets. The DANN model was successfully validated using never-seen-before TDS data of the steels used for the training of the DANN model, as well as of steels that were not exposed to the model during the training process. The developed DANN model is able to predict the HSP of steels of different microstructural properties whose TDS data lie within the probabilistic distribution of the training data.

The ability to predict the HSP from the TDS data is evidence of a strong effect of hydrogen diffusion and material microstructure on hydrogen-induced damage in steels and alloys. Although the DANN model shows good results in predicting the HSP, at present human supervision of the DANN model is needed to prevent misuse of the approach caused by factors such as nonequilibrium hydrogen distribution and microstructural differences between samples of the same material under exploitation, which affect hydrogen diffusion and trapping.