1 Introduction

In solid mechanics, the mechanical behavior of a material under external loads is represented by constitutive equations. These material models depend on so-called material parameters. To predict the structural behavior of components or even entire structures, the model has to be calibrated to defined experiments. Traditionally, tensile tests were carried out to obtain more or less homogeneous stress and strain states. In this case, the external forces measured by the testing machine's force gauge and the displacements of the testing machine's clamps are used to determine stresses and strains within the specimen under consideration. Alternatively, strain gauges were applied to determine the strains locally in certain regions of the specimen. Today, where optical access to the specimen's surface is given, digital image correlation (DIC) systems serve as an adequate tool to detect the surface displacements in a sub-region of the entire specimen and, accordingly, the strain field in that region; see, for example, [76] or [17] and the literature cited therein. With such data, one can assess whether the assumption of a homogeneous deformation state is justified, or whether inhomogeneities dominate the deformation. Moreover, it is possible to deliberately choose specimens that provide inhomogeneous deformations. Then, however, it becomes necessary to solve the underlying partial differential equations (equilibrium conditions).

To determine the parameters, which represents an inverse problem, we follow a non-linear least-square method (NLS) minimizing the difference between the model response and the experimental data. Commonly, the solution of boundary-value problems in solid mechanics is computed by means of finite elements (FEM). A first approach of such a procedure was published by Schnur and Zabaras [70], where the goal was to detect the location and the Young's modulus of an inclusion in a matrix material. Following this concept of combining NLS and FEM, a first extension is the use of discrete force-displacement curves, see, for example [30, 57, 75]. An enhancement of such an approach is the application of optical information, either by a grating on the specimen's surface [51] or by Moiré patterns [47]. In regard to discrete displacement information, very few points on the specimen's surface or the contour data can serve for identification purposes as well [29]. The entire material parameter identification process using finite elements and full-field measurement data was mainly driven by the works of [2, 53, 54]; for a comprehensive overview see [52] as well. There, a non-linear least-square method minimizes the residual between the displacement field of the DIC-data (or manufactured simulation data) and the displacements on a sub-region of the FE-surface. Based on this approach, which employs gradient-based optimization schemes or BFGS-type approaches, further publications on different applications followed, see, for example [6, 44,45,46, 63, 66].
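The core of such an NLS-FEM procedure can be illustrated in miniature. The following sketch uses a hypothetical two-parameter relaxation model standing in for an expensive finite element simulation; a damped Gauss-Newton iteration minimizes the squared residual between the model response and synthetic, noise-free data:

```python
import numpy as np

# Minimal sketch of the NLS idea: a damped Gauss-Newton iteration fitting a
# hypothetical two-parameter relaxation model (a stand-in for an expensive
# finite element simulation) to synthetic, noise-free "experimental" data.
def model(kappa, t):
    E, tau = kappa
    return E * np.exp(-t / tau)

t = np.linspace(0.0, 10.0, 50)
data = model(np.array([2.0, 3.0]), t)            # synthetic data

def jacobian(kappa, t):                          # analytical sensitivities
    E, tau = kappa
    return np.column_stack((np.exp(-t / tau),
                            E * t / tau**2 * np.exp(-t / tau)))

kappa = np.array([1.5, 2.5])                     # initial guess
for _ in range(50):
    r = model(kappa, t) - data                   # residual vector
    J = jacobian(kappa, t)                       # sensitivity matrix
    dk = np.linalg.solve(J.T @ J, -J.T @ r)      # Gauss-Newton step
    s = 1.0                                      # simple damping for robustness
    while (np.linalg.norm(model(kappa + s * dk, t) - data)
           > np.linalg.norm(r) and s > 1e-4):
        s *= 0.5
    kappa = kappa + s * dk
    if np.linalg.norm(s * dk) < 1e-12:
        break
print(np.round(kappa, 6))   # recovers the true parameters [2., 3.]
```

In the real problem, each residual evaluation requires a full finite element computation, which is why the cost of obtaining the sensitivity matrix dominates the discussion below.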

An alternative approach to identify material parameters is provided by the virtual field method, see, for example [59], where the NLS is circumvented. In this case, the virtual displacements are chosen in such a manner that a system of equations for the material parameters is obtained directly. For an overview of different methods to identify material parameters see [3], where the procedure applying FEM and displacement data is called FEM-U. A further alternative approach is discussed in [65], where the parameters are obtained by a direct approach on the DIC-data and compared to the NLS-approach. Apart from gradient-based methods, gradient-free schemes are applied as well, for example, in [37, 38] using neural networks, or in [29] with a direct search method.

We call the schemes using a NLS-method together with full-field measurements by a DIC-system and simulations by means of finite elements the NLS-FEM-DIC approach. The fundamental differences in applications of the NLS-FEM-DIC approach lie in the choice of material models, the calculation of the sensitivities, and the question of which measurement data are included in the identification. The starting point for constitutive models are either linear elastic [34] or non-linear elastic material properties. In the case of inelastic material properties, the mathematical problem changes. For this case, the connection of the NLS-FEM-DIC approach to the schemes of Numerical Mathematics has hardly been discussed; a connection can be drawn to the methods compiled in [69]. This was addressed on a theoretical level in [24] by comparing the three approaches of simultaneous simulation equations (SSE), internal numerical differentiation (IND), and external numerical differentiation (END). In this article, we extend the discussion in [24] to high-order time integration and a real application. For this purpose, we draw on a particular interpretation of the finite element method using material models of evolutionary-type. The computation of finite elements on the basis of constitutive equations of evolutionary-type, such as models of viscoelasticity, rate-independent plasticity, and viscoplasticity, turned out to be interpretable as the solution of differential-algebraic equations (DAEs) after applying the spatial discretization using finite elements, see [15, 16, 77]. Drawing on this interpretation, consistent algorithms to solve a system of DAEs can be applied. In [15] this is discussed for von Mises-type plasticity, and in [22] for the case of viscoelasticity, where singly diagonally implicit Runge–Kutta (SDIRK) methods are applied.
These schemes have the advantage that time-adaptivity (step-size control) is provided without considerable additional numerical cost, and Backward–Euler-like implementations to integrate the constitutive equations are embedded in the scheme, so that no additional implementation work has to be done. In addition, it was found that the procedures are very efficient for long-term processes such as relaxation or creep. For further reading on similar approaches, see [8, 39, 64, 67].

If the NLS-FEM-DIC approach is followed with a gradient-based method, END (numerical differentiation) is the first choice in most publications [9, 55, 60, 65]. Although this offers large variability and flexibility, the computational costs are higher than for providing analytical derivatives. Only very few publications calculate the sensitivities analytically (IND) [40, 53, 66]. Here, we draw on AceGen to generate the codes containing the derivatives with respect to the material parameters [41,42,43]. It will be shown that this substantially reduces the computational times.
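The cost difference between END and IND can be seen in miniature on a toy response (hypothetical, chosen to be nonlinear in one parameter): external numerical differentiation needs one extra "simulation" per parameter, whereas the analytical sensitivity comes from a single evaluation:

```python
import numpy as np

# Contrast of END and IND on a toy response s(kappa) = mu*(lam^a - 1)
# (hypothetical example): forward differences require one additional model
# evaluation per parameter; the analytical derivative does not.
def s(kappa, lam):
    mu, a = kappa
    return mu * (lam**a - 1.0)

def ds_analytical(kappa, lam):          # IND-style: exact derivatives
    mu, a = kappa
    return np.array([lam**a - 1.0, mu * lam**a * np.log(lam)])

def ds_fd(kappa, lam, h=1e-6):          # END-style: forward differences
    base = s(kappa, lam)
    grad = np.empty(len(kappa))
    for i in range(len(kappa)):
        kp = kappa.copy()
        kp[i] += h                      # one extra "simulation" per parameter
        grad[i] = (s(kp, lam) - base) / h
    return grad

kappa, lam = np.array([1.5, 2.0]), 1.2
print(np.max(np.abs(ds_fd(kappa, lam) - ds_analytical(kappa, lam))))
```

For a finite element model, each of those extra evaluations is a complete nonlinear simulation over the entire loading history, which explains the observed difference in computational times.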

Regarding the evaluation of the experimental data, it is very common to evaluate only the displacement data, which has the disadvantage that rigid-body motions in the experiments cannot be represented by the finite element simulation. Thus, the surface strains have to be compared instead, which requires knowledge about the strain computations in both the FEM-simulation and the DIC-evaluation. In the case of plane problems, the strains can be obtained directly from the finite element program [9]. However, if the surface is curvilinear, most finite element programs cannot provide the in-plane surface strains, so that a comparison to the 3D-DIC data is not possible. Here, we draw on the strain determination scheme proposed in [28]. This is applied not only to the motion of the DIC-coordinates but to the nodal coordinates of the FE surface as well; the nodes are treated in the same manner as the DIC-data. With regard to the sensitivity computations, the derivatives of these equations with respect to the material parameters are consequently provided as well.
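A minimal sketch of such a triangulation-based strain evaluation (assuming plane triangles; the actual scheme of [28] may differ in detail) computes the in-plane deformation gradient of each triangle from its reference and current nodal coordinates, for DIC points and FE surface nodes alike:

```python
import numpy as np

# In-plane deformation gradient of one triangle from its nodal positions;
# a simplified sketch of triangulation-based strain evaluation that can be
# applied identically to DIC points and FE surface nodes.
def tri_def_gradient(X, x):
    # X, x: 3x2 arrays of reference / current nodal coordinates
    dX = np.column_stack((X[1] - X[0], X[2] - X[0]))
    dx = np.column_stack((x[1] - x[0], x[2] - x[0]))
    return dx @ np.linalg.inv(dX)                # F maps dX onto dx

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x = np.array([[0.0, 0.0], [1.2, 0.0], [0.0, 0.9]])   # stretched triangle
F = tri_def_gradient(X, x)
C = F.T @ F                                  # right Cauchy-Green tensor
stretches = np.sqrt(np.linalg.eigvalsh(C))   # principal in-plane stretches
print(stretches)                             # ~[0.9, 1.2]
```

Because the principal stretches are invariant under superimposed rigid-body motions, this evaluation removes the sensitivity to machine compliance mentioned above.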

Fig. 1: Experimental setup and typical strain distributions in horizontal and vertical directions

Furthermore, it is very common not to consider reaction forces (or torques), although they can be determined by finite elements as well. An approach using force data is discussed in [60], where END is drawn on. However, the use of IND requires a clear description of the analytical sensitivities. This gap was filled in [24] using the Lagrange-multiplier method within the DAE-interpretation of finite elements. Although the computation of the reaction forces is frequently performed by a node-wise formulation of the equilibrium conditions, we follow the Lagrange-multiplier procedure [31]. This is necessary if other time integration schemes are applied, for example, Rosenbrock-type methods [26]. Furthermore, it is a clear variational concept for displacement control and reaction force computation, and is helpful in step-size control for considering local error estimation of the forces as well.

Moreover, most contributions do not discuss quality measures describing the resulting material parameters, such as the confidence interval, or the correlation between the parameters. Additionally, we follow the concept of local identifiability proposed in [4, 7] which was studied in [25, 71] in the field of solid mechanics.

If all these concepts are combined, the entire NLS-FEM-DIC concept emerges. This will be demonstrated for biaxial tensile tests in this article, where, additionally, a comparison of numerical differentiation (equivalent to END) with analytical derivatives (IND) is provided. Specifically, the connection to the Multilevel-Newton algorithm, commonly applied in finite elements, is discussed [23]. Rubber is chosen as the test material.

The notation in use is defined in the following manner: geometrical vectors are symbolized by \(\vec {a} \), second-order tensors \({\mathbf {A}} \) by bold-faced Roman letters, and calligraphic letters \({\mathcal {A}}\) define fourth order tensors. Furthermore, we introduce matrices at global level symbolized by bold-faced italic letters and matrices on local level (Gauss-point level) using bold-faced Roman letters \({{\mathbf {\mathsf{{A}}}}}\).

2 Experimental data

First, we carried out experiments using a biaxial testing device, see [33] for more details and references. We chose a natural rubber according to [5]. In view of the identifiability of all material parameters, we have to avoid equibiaxial testing paths. Fig. 1a shows the testing device, and Fig. 1b, c represent typical strain distributions determined by a digital image correlation system. Here, we draw on the 3D-DIC system Aramis of the company GOM, Brunswick, Germany. It turned out that rigid-body motions resulting from the compliance of the entire testing machine have a significant influence on the parameter identification concept. Thus, principal strains (or stretches) are determined using the displacement data of the system. In our tests, the clamp displacements are prescribed, and the resulting forces are measured and compared. The specimens under consideration are shown in Fig. 2. The bulge serves for a better fixation in the clamps.

Fig. 2: Geometry of the cross-like specimen (thickness 6 mm)

Since we are applying a model of overstress-type viscoelasticity, particular loading paths are required to obtain the basic properties of the equilibrium stress state by the termination points of relaxation, see [35] for a discussion. We prescribe rate-dependent loading paths, see Fig. 3a, and a multi-step relaxation path, see Fig. 3b.

Fig. 3: Displacement loading paths in the biaxial tensile testing machine

Horizontally, a displacement is applied which is four times larger than in the vertical direction (in order to ensure identifiability of the parameters, see the discussion in [33]). The displacement rates of the clamps are horizontally \({\dot{u}}_h(t) = 0.004, 0.01, 0.1, 1\, \hbox {mm s}^{-1}\). Rates slower than 0.004 mm s\(^{-1}\) were not achievable with the testing machine. As a result, the vertical displacement rates are \({\dot{u}}_v(t) = {\dot{u}}_h(t)/4\). The force-displacement results (displacements of the holder) are shown in Fig. 7 together with the calibrated model.

3 Method of vertical lines

As discussed in the introduction, we follow the method of vertical lines, which represents a semi-discretization of solving partial differential equations (PDEs), first in space and subsequently in time, see [18, 68]. In our case these are the coupled partial differential equations stemming from the local balance of linear momentum (quasi-static case) and the stress-strain relation coupled with the evolution equations for internal variables,

$$\begin{aligned} \begin{aligned}&{{\,\mathrm{Div}\,}}({\mathbf {F}} (\vec {X} ,t) {\tilde{\mathbf {T}}} (\vec {X} ,t)) + \rho _{\text {R}}(\vec {X} ) \vec {k} = \vec {0} , \\&{\tilde{\mathbf {T}}} (\vec {X} ,t) = {\tilde{\mathbf {h}}} ({\mathbf {C}} (\vec {X} ,t),{\mathbf {\mathsf{{q}}}}(\vec {X} ,t)),\\&\dot{{\mathbf {\mathsf{{q}}}}}(\vec {X} ,t) = \tilde{{{\mathbf {\mathsf{{r}}}}}}({\mathbf {C}} (\vec {X} ,t),{\mathbf {\mathsf{{q}}}}(\vec {X} ,t)). \end{aligned} \end{aligned}$$
(1)

This is accompanied by initial and boundary conditions for the displacements, the tractions, and the internal variables. \({\mathbf {F}} (\vec {X} ,t) = {{\,\mathrm{Grad}\,}}\vec {\chi } {}_{\text {R}}(\vec {X} ,t)\) represents the deformation gradient of the motion \(\vec {x} =\vec {\chi } {}_{\text {R}}(\vec {X} ,t)\), where the material point with the position vector \(\vec {X} \) in the initial configuration occupies the place \(\vec {x} \) at time t. \({\mathbf {C}} = {{\mathbf {F}}}^{T} {\mathbf {F}} \) is the right Cauchy-Green tensor, whereas \({\tilde{\mathbf {T}}} = (\det {\mathbf {F}} ) {{\mathbf {F}}}^{-1} {\mathbf {T}} {{\mathbf {F}}}^{-T} \) represents the second Piola-Kirchhoff stress tensor, and \({\mathbf {T}} \) the Cauchy stress tensor. \(\rho _{\text {R}}\) defines the density in the reference configuration, and \(\vec {k} \) is the acceleration of gravity. \({{\,\mathrm{Div}\,}}{\mathbf {A}} = \partial A_{ij}/\partial X_j \vec {e} _i\) is the divergence operator applied to a second-order tensor \({\mathbf {A}} \) using the partial derivatives with respect to the coordinates in the reference configuration. Here, we draw on a model of finite strain viscoelasticity, see [35, 50, 62], with particular modifications. Eqs. (1)\(_{2,3}\) depend on internal variables \({\mathbf {\mathsf{{q}}}} {\, \in {\mathbb {R}}}^{{n_{\text {q}}}}\), which can be scalar- or tensor-valued. In the entire procedure here, we assume that the time derivative in Eq. (1)\(_3\) acts on quantities defined relative to the reference configuration. Thus, variables with relative derivatives such as Oldroyd- or Jaumann derivatives have to be transformed back to referential quantities. In the presentation here, the vector \({\mathbf {\mathsf{{q}}}}\) is composed of the six independent components of the (symmetric) viscous right Cauchy–Green tensor \({\mathbf {\mathsf{{q}}}} \leftarrow {\mathbf {C}} {}_{\text {v}}\), i.e. we have \({n_{\text {q}}}= 6\).
\({\mathbf {C}} {}_{\text {v}}= {{\mathbf {F}}}^{T} {}_{\text {v}}{\mathbf {F}} {}_{\text {v}}\) stems from the multiplicative decomposition of the deformation gradient into an elastic and a viscous part, \({\mathbf {F}} = {\mathbf {F}} {}_{\text {e}}{\mathbf {F}} {}_{\text {v}}\). The second Piola–Kirchhoff tensor decomposes into three parts,

$$\begin{aligned}&{\tilde{\mathbf {T}}} = J \rho _{\text {R}}U'(J) {\mathbf {C}} ^{-1} + {\tilde{\mathbf {T}}} {}_{\text {eq}}^{\text {iso}} + {\tilde{\mathbf {T}}} {}_{\text {ov}}\nonumber \\&\quad \text {with} \quad U'(J) = \frac{K}{10} (J^4 - J^{-6}). \end{aligned}$$
(2)

The first part is connected to the volumetric deformation \(J {:}{=} \det {\mathbf {F}} \), the second part represents the isochoric equilibrium stress state, and the last term defines the overstress part. The specific strain energy function \(U(J) = K/50 (J^5 + J^{-5} - 2)\) is chosen to avoid the nonphysical behavior in tension and compression that is common in most models, see the discussion in [13, 27]. The isochoric equilibrium stress part reads

$$\begin{aligned} {\tilde{\mathbf {T}}} {}_{\text {eq}}^{\text {iso}} = \varphi _1 {\mathbf {I}} + \varphi _2 {\overline{\mathbf {C}}} + \varphi _3 {{\overline{\mathbf {C}}}}^{-1} \end{aligned}$$
(3)

with

$$\begin{aligned} \varphi _1= & {} \frac{2 \rho _{\text {R}}}{(\det {\overline{\mathbf {C}}} )^{1/3}} (w_1 + w_2 \text {I}_{{\overline{\mathbf {C}}} }), \quad \varphi _2 =-\frac{2 \rho _{\text {R}}}{(\det {\overline{\mathbf {C}}} )^{1/3}} w_2, \nonumber \\ \varphi _3= & {} -\frac{2 \rho _{\text {R}}}{3 (\det {\overline{\mathbf {C}}} )^{1/3}} (w_1 \text {I}_{{\overline{\mathbf {C}}} }+ 2 w_2 \text {II}_{{\overline{\mathbf {C}}} }) \end{aligned}$$
(4)

and

$$\begin{aligned} w_1 = \frac{\displaystyle \partial w{}_{\text {eq}}}{\displaystyle \partial \text {I}_{{\overline{\mathbf {C}}} }} = c_{10}, \quad w_2 = \frac{\displaystyle \partial w{}_{\text {eq}}}{\displaystyle \partial \text {II}_{{\overline{\mathbf {C}}} }} = c_{01} \frac{3}{2} \text {II}_{{\overline{\mathbf {C}}} }^{1/2} \end{aligned}$$
(5)

for

$$\begin{aligned} w{}_{\text {eq}}(\text {I}_{{\overline{\mathbf {C}}} },\text {II}_{{\overline{\mathbf {C}}} }) = c_{10} (\text {I}_{{\overline{\mathbf {C}}} }- 3) + c_{01} (\text {II}_{{\overline{\mathbf {C}}} }^{3/2} - 3 \sqrt{3}). \end{aligned}$$
(6)

In a certain sense, this is the extension of the Mooney-Rivlin model to polyconvexity, see [27]. \({\overline{\mathbf {C}}} = {\overline{\mathbf {F}}} ^T {\overline{\mathbf {F}}} \) defines the unimodular right Cauchy–Green tensor depending on \({\overline{\mathbf {F}}} = (\det {\mathbf {F}} )^{-1/3} {\mathbf {F}} \) with \(\det {\overline{\mathbf {F}}} = 1\). Here, we have the invariants \(\text {I}_{{\overline{\mathbf {C}}} }= \mathrm {tr}\,{\overline{\mathbf {C}}} \) and \(\text {II}_{{\overline{\mathbf {C}}} }= ((\mathrm {tr}\,{\overline{\mathbf {C}}} )^2 - \mathrm {tr}\,{\overline{\mathbf {C}}} ^2)/2\). The overstress part is given by

$$\begin{aligned} {\tilde{\mathbf {T}}} {}_{\text {ov}}= 2 \rho _{\text {R}}\mu _0 \frac{(\det {\mathbf {C}} {}_{\text {v}})^{1/3}}{(\det {\mathbf {C}} )^{1/3}} \left( {{\mathbf {C}}}^{-1} {}_{\text {v}}- \frac{1}{3}({\mathbf {C}} \cdot {{\mathbf {C}} _{\text {v}}^{-1}}) {{\mathbf {C}}}^{-1} \right) \end{aligned}$$
(7)

depending on the evolving viscous right Cauchy–Green tensor

$$\begin{aligned} {\dot{\mathbf {C}}} {}_{\text {v}}= \frac{4 \rho _{\text {R}}\mu _0}{\eta } \frac{(\det {\mathbf {C}} {}_{\text {v}})^{1/3}}{(\det {\mathbf {C}} )^{1/3}} \left( {\mathbf {C}} - \frac{1}{3}({\mathbf {C}} \cdot {{\mathbf {C}} _{\text {v}}^{-1}}) {\mathbf {C}} {}_{\text {v}}\right) . \end{aligned}$$
(8)

Particularly, a process-dependent viscosity

$$\begin{aligned} \eta = \eta _{0} \exp \left( -s_0 \sqrt{{\mathbf {C}} {\tilde{\mathbf {T}}} {}_{\text {ov}}\cdot {\tilde{\mathbf {T}}} {}_{\text {ov}}{\mathbf {C}} }\right) \end{aligned}$$
(9)

is necessary to reflect rate dependence. In some situations, we express the function depending on the Green-strain tensor \({\mathbf {E}} = ({\mathbf {C}} - {\mathbf {I}} )/2\), which does not essentially affect the principal equations.
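With illustrative parameter values (not the identified ones), Eqs. (7)-(9) can be evaluated in a few lines; here the scalar product \(\mathbf{A} \cdot \mathbf{B}\) is implemented as the double contraction \(A_{ij} B_{ij}\):

```python
import numpy as np

# Sketch of Eqs. (7)-(9) with illustrative parameter values only.
rho, mu0, eta0, s0 = 1.0, 0.5, 10.0, 0.1

def overstress(C, Cv):
    # Eq. (7): second Piola-Kirchhoff overstress
    fac = 2.0 * rho * mu0 * (np.linalg.det(Cv) / np.linalg.det(C))**(1.0 / 3.0)
    Cvi = np.linalg.inv(Cv)
    return fac * (Cvi - np.tensordot(C, Cvi) / 3.0 * np.linalg.inv(C))

def viscosity(C, Tov):
    # Eq. (9): process-dependent viscosity
    return eta0 * np.exp(-s0 * np.sqrt(np.tensordot(C @ Tov, Tov @ C)))

def Cv_rate(C, Cv):
    # Eq. (8): evolution of the viscous right Cauchy-Green tensor
    Tov = overstress(C, Cv)
    fac = (4.0 * rho * mu0 / viscosity(C, Tov)
           * (np.linalg.det(Cv) / np.linalg.det(C))**(1.0 / 3.0))
    return fac * (C - np.tensordot(C, np.linalg.inv(Cv)) / 3.0 * Cv)

C = np.diag([1.44, 0.81, 1.0 / (1.44 * 0.81)])   # isochoric deformation state
print(np.round(Cv_rate(C, np.eye(3)), 4))        # deviatoric driving term
```

Two structural properties are visible directly: the overstress vanishes for \({\mathbf {C}} {}_{\text {v}}= {\mathbf {C}} \), and at \({\mathbf {C}} {}_{\text {v}}= {\mathbf {I}} \) the driving term of Eq. (8) is trace-free.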

In this model, we have six material parameters, which are determined step-wise. The parameter vector determines the equilibrium stress state, and controls the overstress behavior. Here, we follow the common procedure of connecting the density \(\rho _{\text {R}}\) with the material parameters \(K \leftarrow K \rho _{\text {R}}\), \(c_{10} \leftarrow c_{10} \rho _{\text {R}}\), \(c_{01} \leftarrow c_{01} \rho _{\text {R}}\), and \(\mu _0 \leftarrow \mu _0 \rho _{\text {R}}\). Obviously, Eq. (2) represents the concrete formulation of Eq. (1)\(_2\), and Eq. (8) the evolution Eq. (1)\(_3\).

Equation (1)\(_1\) is transferred into a weak form represented by either the classical principle of virtual displacements or by a mixed formulation, for example the three-field formulation proposed in [74]. Both weak formulations have the disadvantage that the calculation of the reaction forces at those degrees of freedom where displacements are prescribed is not possible (they do not contribute to the virtual work). Of course, local equilibrium formulations intuitively lead to results, but this is not embedded in the theory itself, and no consistent mathematical description is provided. This can be circumvented by analytical considerations regarding the Lagrange multiplier method, see [31], leading to the DAE-system

(10)

Here, we draw on the abbreviation of the unknowns and their initial conditions

(11)

consists of the unknown nodal displacements , where forces (or surface tractions) are applied, and of the degrees of freedom , where the known displacements are given. In the Lagrange multiplier approach to obtain the reaction forces, it is assumed that all nodal displacements are unknown. This implies the constraint equation

(12)

The displacements are assigned by the incidence matrix to the unknown nodal displacements . Further, represents the Lagrange multiplier vector and is interpreted as the vector containing the reaction forces, which are necessary to enforce the prescribed displacements .

(13)

contains all internal variables, , evaluated at all \({n_{\text {G}}}\) spatial integration points, which are usually the Gauss-points. \({n_{\text {el}}}\) is the number of elements, and \({n_{\text {GP}}^e}\) represents the number of Gauss-points within element e. defines all internal variables of the entire mesh. In this sense, the “global” ODEs

(14)

hold as well. The vector contains all equations resulting from the weak formulation

(15)

with the incidence matrices and assembling all element contributions into a large system of equations. They are not explicitly programmed; they represent the assembling procedure and contain only the entries 1 and 0. \({\mathbf {\mathsf{{C}}}}^{e(j)} {\, \in {\mathbb {R}}}^{6}\) represents the column vector representation of the symmetric part of the right Cauchy–Green tensor (Voigt-notation) which depends non-linearly on the element nodal displacements \({\mathbf {\mathsf{{u}}}}^{e} {\, \in {\mathbb {R}}}^{{n_{\text {u}}^e}}\), \({\mathbf {\mathsf{{C}}}}^{e(j)} = \hat{{{\mathbf {\mathsf{{C}}}}}}^{e(j)}({\mathbf {\mathsf{{u}}}}^{e})\), with . \({n_{\text {u}}^e}\) is the number of element nodal displacement degrees of freedom, \(w^{e(j)}\) the weighting factors of the Gauss-integration in an element, are the number of Gauss-points within one element, and defines the strain-displacement matrix of element e evaluated at the j-th Gauss-point, , which depends on the element nodal displacements \({\mathbf {\mathsf{{u}}}}^e\) as well. The Gauss points have the local coordinates \(\varvec{\xi }^{(j)}{\, \in {\mathbb {R}}}^{3}\) (we only consider three-dimensional continua). Furthermore, \({{\mathbf {\mathsf{{J}}}}}^{e(j)} {\, \in {\mathbb {R}}}^{3 \times 3}\) symbolizes the Jacobi-matrix of the coordinate transformation between reference element coordinates and global coordinates, and defines the given equivalent nodal force vector. The symmetric part of the stress tensor (1)\(_2\) is transferred into a vector \(\tilde{{{\mathbf {\mathsf{{T}}}}}} {\, \in {\mathbb {R}}}^{6}\), \(\tilde{{{\mathbf {\mathsf{{T}}}}}} = \tilde{{{\mathbf {\mathsf{{h}}}}}}({\mathbf {\mathsf{{C}}}}^{e(j)},{\mathbf {\mathsf{{q}}}}^{e(j)})\), which is evaluated at Gauss-point \(\varvec{\xi }^{(j)}\) and, via the right Cauchy–Green tensor, depends on the displacements and the internal variables at that Gauss point.

In the case of elasticity, where the equations have to be evaluated in the absence of internal variables, the DAE-system (10) degenerates to the system of non-linear equations

(16)

where t stands for a parameter similar to a generalized continuation method. Here, the non-linear system (16) has to be fulfilled at each loading step t.

In implicit finite element approaches, the DAE-system (10) is often solved using a Backward–Euler method. This yields the non-linear system

(17)

with

(18)

at each point in time \(t_{n+1}\), \(0 \le t_{n+1}\le T\), \(t_{n+1}= t_{n}+ \Delta t_n\), where \(\Delta t_n\) is the step-size. In the final iterated solution, is reached—which is not necessarily the case in other integration schemes, see [26]. In this case, the system (17) can be written as ( is assumed to be known)

(19)

Equations (19)\(_{1,3}\) are solved using the Multilevel-Newton algorithm, and Eq. (19)\(_2\) represents a downstream step since it is explicit in . Regarding the Multilevel-Newton algorithm, see [61] for the original scheme, and [15, 23] for the discussion in finite elements. Since we assume a DAE-system with consistent initial conditions, the non-linear system (19) is fulfilled for the initial conditions at time \(t_0\).
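The nested structure of the Multilevel-Newton algorithm can be illustrated with a one-dof toy analogue of system (19) (an illustrative stand-in, not the FE implementation): an inner step solves the discretized evolution equation for the internal variable at frozen displacement, and the outer Newton iteration uses the consistent tangent obtained from the implicit function theorem:

```python
import numpy as np

# One-dof toy analogue of system (19): global "equilibrium" g(u, q) = 0
# coupled with a backward-Euler discretized evolution equation for the
# internal variable q (hypothetical model, illustrative values).
k, tau, f = 2.0, 5.0, 1.0

def g(u, q):
    # global level: elastic force + "overstress" q - external load
    return k * u + q - f

def local_solve(u, qn, dt):
    # local (Gauss-point) level: solve q - qn - dt*(u - q)/tau = 0 for q;
    # linear here, an inner Newton iteration in general
    q = (qn + dt * u / tau) / (1.0 + dt / tau)
    dq_du = (dt / tau) / (1.0 + dt / tau)   # implicit function theorem
    return q, dq_du

def step(un, qn, dt):
    u = un
    for _ in range(20):                     # outer (global) Newton iteration
        q, dq_du = local_solve(u, qn, dt)
        G = g(u, q)
        if abs(G) < 1e-12:
            break
        dG = k + dq_du                      # consistent tangent dg/du + dg/dq dq/du
        u -= G / dG
    return u, q

u, q = 0.0, 0.0
for _ in range(100):                        # march towards the equilibrium state
    u, q = step(u, q, dt=1.0)
print(round(u, 4), round(q, 4))             # both tend to f/(k+1) = 1/3
```

The separation into a local solve per Gauss point and a global Newton iteration with consistent tangent is exactly the structure exploited in the sensitivity computations later on.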

The same algorithmic structure is given by diagonal-implicit Runge–Kutta methods (DIRK-methods), where at each stage \(T_{ni}= t_{n}+ c_i \Delta t_n\), \(\Delta t_n= t_{n+1}-t_{n}\), the non-linear system

(20)

with

(21)

has to be solved. \(c_i\), \(i=1,\ldots ,s\), and \(a_{ij}\) (\(a_{ij}= 0\) for \(j<i\)), are the coefficients of the Butcher array containing the given weighting factors of the method under consideration, see [19, 20]. In the final solution, condition (20)\(_3\) is fulfilled so that we can simplify system (20) to

(22)

The stage quantities are used to determine the stage derivatives

(23)

required for the starting vectors

(24)

The unknown stage quantities are the nodal displacements , the reaction forces (negative Lagrange multipliers), and all internal variables from all Gauss-points in the structure. Alternatively, it is possible to apply a rate formulation in which the stage-derivatives are the unknowns. We choose the first version to obtain the same implementation as in “classical” implicit finite elements.

The final values at time \(t_{n+1}\) are

(25)

where the \(b_i\), \(i=1,\ldots ,s\), are additional weighting factors of the Butcher array. For stiffly accurate methods, the condition \(a_{sj} = b_j\) holds—so that the stage quantities at the s-stage are identical to the final result, , , and .

For \(s=1\) (one stage), \(c_1=b_1=a_{11}=1\), the Backward–Euler approach is embedded in the more general DIRK-methods. The advantage of the DIRK-methods is that step-size selection is obtained with very few extra computations, see [15], which is necessary for physical problems with different local time-scales. This has clear advantages with regard to computational cost, especially in relaxation- or creep-dominated problems. The higher order of the methods permits larger time steps for the same accuracy as the Backward–Euler approach. This has been demonstrated for problems in crystal plasticity, viscoplasticity, diffusion-driven mechanics, thermo-mechanical or electro-thermo-mechanical coupling, curing processes, and thermal fluid-structure interaction. According to our experience, a two-stage, second-order method is sufficient for more accurate solutions and time-adaptivity.
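The stage structure of Eqs. (20)-(25) can be sketched for a scalar relaxation-type ODE. The coefficients (\(\alpha = 1 - \sqrt{2}/2\), \(a_{11}=a_{22}=\alpha\), \(a_{21}=b_1=1-\alpha\), \(b_2=\alpha\), \(c=(\alpha,1)\)) give a two-stage, second-order, stiffly accurate SDIRK method; the error indicator below simply compares against a backward-Euler step and is a simplified stand-in for the embedded estimators discussed in [15]:

```python
import numpy as np

# Two-stage, second-order, stiffly accurate SDIRK method with elementary
# step-size control for q' = (c(t) - q)/tau (illustrative scalar problem).
alpha = 1.0 - np.sqrt(2.0) / 2.0
tau = 2.0
c = lambda t: np.sin(t)                    # illustrative driving process
rhs = lambda t, q: (c(t) - q) / tau

def implicit_stage(t, q_start, a_dt):
    # solves Q = q_start + a_dt * rhs(t, Q); linear here, Newton in general
    return (q_start + a_dt * c(t) / tau) / (1.0 + a_dt / tau)

def sdirk_step(t, q, dt):
    Q1 = implicit_stage(t + alpha * dt, q, alpha * dt)       # stage 1
    q2 = q + (1.0 - alpha) * dt * rhs(t + alpha * dt, Q1)    # starting value
    return implicit_stage(t + dt, q2, alpha * dt)            # stage 2 = q_{n+1}

t, q, dt, tol = 0.0, 0.0, 0.1, 1e-4
while t < 10.0:
    dt = min(dt, 10.0 - t)
    q_new = sdirk_step(t, q, dt)
    err = abs(q_new - implicit_stage(t + dt, q, dt))   # vs. backward Euler
    if err < tol:
        t, q = t + dt, q_new                           # accept the step
    dt *= min(2.0, max(0.2, 0.9 * np.sqrt(tol / max(err, 1e-16))))
print(round(q, 4))   # close to the exact value 0.2295
```

In the FE setting, each call of `implicit_stage` corresponds to one solve of the coupled system (22), and the rejected steps are where the adaptivity pays off in relaxation-dominated loading histories.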

In the case of non-linear elastic problems (absence of ordinary-differential part), the entire approach can be interpreted as a continuation method, i.e. the Newton-Raphson scheme is applied to

(26)

which has to be fulfilled at each time \(t_{n+1}\), see Eq. (16). The starting (guess) vector of the Newton-Raphson method is commonly chosen to be the previous solution , or some estimation, see [32]. The latter estimation is strongly recommended (also for the problem (20) for some parts of the non-linear system) since it essentially stabilizes the computations.

4 Least-square approach

In the following, the non-linear least-square problem is discussed in detail. First, the basic NLS is recapped. Then the fully analytical computation of the sensitivity matrices is explained, and the numerical differentiation technique is summarized. Since we are interested in investigating quality measures, which is also connected to the concept of local identifiability for the identified material parameters, this is explained afterwards.

4.1 Basic problem

Least-square methods minimize the square of the distance between the model and the experimental data. Before this is explained in more detail, the experimental data has to be discussed. In the case of full-field measurement, we obtain from each experiment E, \(E=1,\ldots ,{n_{\text {exp}}}\), a data vector . For \({n_{\text {N}}^{(E)}}\) (temporal) load-steps, each load-step consists of \({n_{\text {d}}^{(E)}}\) entries (spatially distributed discrete displacements or stretches, forces, ...), i.e. \({n_{\text {exp}}^{(E)}}= {n_{\text {N}}^{(E)}}{n_{\text {d}}^{(E)}}\). The experimental evaluation times are \(t_m\), \(m=1,\ldots ,{n_{\text {N}}^{(E)}}\). These data vectors are assembled into the vector , where contains the information from one experiment. For all tests under consideration, this leads to the entire data vector , .

For each experiment E, the finite element model provides displacements ( and ), and reaction forces for the temporal points \(t_{n}^{\,(E)}\), \(n=1,\ldots ,N^{\,(E)}\). In general, the temporal and the spatial points of the experiment and the finite element model do not coincide. Thus, a temporal and a spatial interpolation of the model data to the experimental data have to be carried out. We choose the (temporal) linear interpolation of the experimental data to the model time data since the sensitivities of the simulations cannot be interpolated. In this sense, the size of the data vectors adapts to the information from the finite element computations. For the spatially distributed data, several aspects have to be considered. Firstly, a DIC-system describes only surface data on a sub-region of the finite element mesh. Thus, only a subset of the model data has to be compared. Secondly, a (commercial) DIC-system can provide either the motion of material points, i.e. the displacements of the material points, or, by a black-box interpolation of the DIC-evaluation program, strain data at these points. This statement holds for commercial finite element programs as well. It has turned out in several applications that rigid-body motions in the experimental tests, resulting from the compliance of the testing machine, essentially influence the comparability of both data sets. Although there are algorithmic possibilities to minimize the rigid-body motions in the DIC-data, this does not help in all cases. Thus, the strain data evaluation is, in our experience, of greater interest. In this case, however (and in view of a fully analytical sensitivity analysis, see Sect. 4.2), the strain evaluation procedure has to be known. Thus, we follow the interpolation concept proposed in [28] based on triangulation, see [36, 58] for a similar approach. This concept can be applied to both the DIC-data as well as to the finite element nodal displacement data.
Thus, the same interpolation scheme and strain evaluation is applied for both systems (DIC and FEM). Moreover, displacements and strains can be computed at any point of the DIC-region and of the finite element model, which leads to considerable flexibility in the evaluation. Finally, the forces, which are commonly recorded as well, should be considered in the identification process, since we are interested in calibrating stress-strain models, which requires some force information.
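The temporal part of this mapping amounts to a one-liner per data channel; the sketch below uses made-up measurement times and values and maps the experimental record onto the FE output times, in line with the direction chosen above (experiment to model times, since simulation sensitivities cannot be interpolated):

```python
import numpy as np

# Linear interpolation of an experimental record (made-up values) onto the
# FE output times; the data vector thus adapts to the FE computation.
t_exp = np.array([0.0, 0.5, 1.3, 2.0, 3.1])        # measurement times
d_exp = np.array([0.0, 0.9, 2.1, 3.0, 4.4])        # measured quantity
t_fem = np.array([0.0, 1.0, 2.0, 3.0])             # FE output times

d_exp_on_fem = np.interp(t_fem, t_exp, d_exp)
print(d_exp_on_fem)                                # [0.   1.65 3.   4.2727...]
```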

As discussed before, only a subset of the finite element nodal displacements can be compared to the DIC-data. If in-plane strains or stretches are considered, an interpolation scheme is required. Instead of the common letter \(\lambda \) for a stretch, we use \(\nu \) to avoid confusion with the Lagrange multiplier . In this case the simulation component

(27)

is indirectly dependent on the parameter set . In the case of a principal strain measure, it reads .

In order to extract the required nodal displacements, the incidence matrix is introduced. Regarding the forces, in our application of a biaxial tensile test there are two, and . The vectors , \(k=1,2\), are chosen to extract the nodal reaction forces required to determine the scalar values. The indices 1, 2 indicate the horizontal and the vertical direction.

In conclusion, we have the entire vector of all experimental data with different physical properties (e.g. forces, displacements, strains, ...), and the finite element result depending on the parameter set . Since different physical quantities and different amounts of data are available (few forces and a huge amount of displacement-based data), a weighting technique is applied to the residual . Here, we draw on the concepts proposed in [21]. The entire weighting matrix reads

(28)

Within the weighting matrix

(29)

we have the same weighting factors

$$\begin{aligned} w_F^{(E)} = \frac{\displaystyle 1}{\displaystyle \max _{n=1,\ldots ,{n_{\text {N}}^{(E)}}} \big |F_n^{(E)}\big |} \end{aligned}$$
(30)

for the force data in vertical and horizontal direction of the biaxial tensile tests. The weighting factors of the strain data are defined by

(31)
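A minimal Python sketch of this weighting concept follows; the function name is ours, and the exact block structure of the matrix in Eq. (28) is not reproduced. Each data channel (a force component or the strain data of one experiment) is scaled by the reciprocal of its maximum absolute value, as in Eq. (30), so that differently scaled physical quantities contribute comparably to the residual.

```python
import numpy as np

def build_weight_matrix(channels):
    """Diagonal weighting matrix for several data channels.

    `channels` is a list of 1-D arrays (e.g. two force records and a
    strain record); each channel is weighted by 1 / max |value|.
    The result acts on the stacked residual vector.
    """
    weights = [np.full(c.size, 1.0 / np.max(np.abs(c))) for c in channels]
    return np.diag(np.concatenate(weights))
```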

Now, the NLS-problem reads

(32)

i.e. the square of the weighted residual should be a minimum, and the necessary condition for the minimum in the solution is given by

(33)

representing a system of non-linear equations to determine the material parameters . No inequalities were necessary in the applications of this article.
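To illustrate the structure of the NLS-problem (32), the following Python sketch minimizes a weighted residual with scipy's `least_squares`. The scalar relaxation-type model standing in for the finite element response is purely hypothetical, and the data are manufactured; in the article, the role of this optimizer is played by the Matlab tool lsqnonlin.m.

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical stand-in for the finite element response s(kappa).
def model(kappa, t):
    return kappa[0] * (1.0 - np.exp(-kappa[1] * t))

t = np.linspace(0.0, 2.0, 21)
kappa_true = np.array([3.0, 1.5])
d = model(kappa_true, t)                  # manufactured "experimental" data
W = np.eye(t.size) / np.max(np.abs(d))    # weighting in the spirit of Eq. (30)

def weighted_residual(kappa):
    # residual of the weighted NLS-problem (32)
    return W @ (d - model(kappa, t))

sol = least_squares(weighted_residual, np.array([1.0, 1.0]))
```

The optimizer drives the weighted residual toward a (local) minimum; as discussed below, reaching a local minimum says nothing about global optimality or identifiability.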

Here, the functional matrix

(34)

is required; it is frequently called the sensitivity matrix (or, in short, sensitivity). This holds for Gauss-Newton-like algorithms as well [48, 56, 69]. In this respect, we need the derivatives of both the resulting forces and the displacements with respect to the parameter vector ,

(35)

The derivatives of the principal strains (or stretches) with respect to the parameters are obtained using the chain-rule, e.g.

(36)

see Eq. (35)\(_2\) and “Appendix A”. For the case of overstress-type models, the equilibrium stress part is identified first, i.e. we need the sensitivities of the parameterized non-linear system (26). Since the solution of the non-linear system (26) depends on the parameters , the derivatives are defined implicitly, and the system reads

(37)

The sensitivities are obtained by differentiating Eq. (37)\(_1\) with respect to (the indices \(n+1\) and (E) are omitted for brevity in the following)

(38)

In other words, after each load step n, a system of linear equations with \({n_{\kappa }}\) right-hand sides has to be solved. The matrix represents the tangential stiffness matrix. For moderately sized systems and direct solution schemes, one LU-decomposition and \({n_{\kappa }}\) back-substitutions are required (commonly, the LU-decomposition has already been carried out to solve the system (37)\(_1\)). Then, the sensitivities of the Lagrange multipliers read

(39)

requiring the solution of Eq. (38). This implies only a matrix-matrix product. The matrix is also a part of the entire tangential stiffness matrix.
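The reuse of one factorization for all \(n_{\kappa }\) right-hand sides can be sketched as follows (Python/scipy; a random, well-conditioned matrix stands in for the tangential stiffness matrix, and the right-hand sides are illustrative):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n, n_kappa = 6, 3
K = rng.standard_normal((n, n)) + n * np.eye(n)  # stand-in tangential stiffness
B = rng.standard_normal((n, n_kappa))            # n_kappa right-hand sides

# In practice, the factorization is reused from the Newton solve of (37)_1;
# here we factorize once and back-substitute for all parameters at once.
lu, piv = lu_factor(K)
dU_dkappa = lu_solve((lu, piv), B)               # sensitivity columns
```

The cost per load step is thus dominated by \(n_{\kappa }\) back-substitutions rather than \(n_{\kappa }\) full solves.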

In the case of DAEs, we follow the concepts summarized in [69]. The sensitivities can be determined either on the level of the DAE-system (10), yielding the so-called simultaneous simulation of sensitivities, see [12, 49], or on the time-discretized level—leading to two approaches, namely internal numerical differentiation (IND) and external numerical differentiation (END). The resulting simultaneous sensitivity equations (SSE) can be shown to be equivalent to IND under special conditions (same integrator to obtain the sensitivities). Since the implementation of the SSE yields—in the general case—systems that are too large for our applications, it is not recommended. We follow the sensitivity computation using IND and END as discussed in the following.

4.2 Internal numerical differentiation

In the case of internal numerical differentiation, the DAE-system (10) is discretized in time first. All sensitivities are provided by analytical computations. The solution (25) depends on the parameters ,

(40)

Since we draw on stiffly accurate diagonally-implicit Runge–Kutta methods, the final solution at each time \(t_{n+1}\) is directly provided by the non-linear system, , , . Thus, we need the derivatives of the stage derivatives and the starting values with respect to ,

(41)

with

(42)

Thus, the derivatives (41) must be stored additionally, and we obtain the dependencies

(43)

We seek the derivatives and , which can be determined—similarly to the Multilevel-Newton algorithm—by applying the implicit function theorem (here, we briefly recap the theory of [24]). We assume that the conditions of the implicit function theorem are fulfilled, i.e. the stage quantities of the internal variables can be represented by a function . This function is inserted into Eq. (43)\(_1\),

(44)

and calculate the derivative with respect to (we again omit the temporal index)

(45)

On the left, the coefficient matrix is again the tangential stiffness matrix of the non-linear solver in the FE-program. The matrix is obtained from the second equation (integration step for the internal variables)

(46)

by applying the chain rule again, see Eq. (21) for the fulfilled constraint (20)\(_3\),

(47)

where the first matrix on the left (term within the brackets) vanishes in the Multilevel-Newton algorithm, see [24, Eq.(23)]. With , the linear system

(48)

has to be solved in each stage. Since we only formally assembled these equations, see Eqs. (13)-(14), they can be decoupled in this step. In other words, after each stage, small linear systems have to be solved on Gauss-point level (in our case with 6 internal variables, of size \(6\times {n_{\kappa }}\)). This result is inserted into Eq. (45).

Since the Lagrange-multipliers are explicitly determinable,

(49)

the sensitivity is based only on quantities derived before,

(50)

The “classical” derivatives within the Multilevel-Newton algorithm are provided in an analytical form in [22], whereas the sensitivities required for the numerical optimizer are computed by means of the program Acegen [41,42,43].

4.3 External numerical differentiation

External numerical differentiation, i.e. the computation of the sensitivities (34) by means of numerical differentiation, is the usual procedure in the NLS-FEM-DIC approach. In this case, the derivatives are approximated by

(51)

with the vectors (all entries are zero except for a 1 in row j). This holds similarly for the Lagrange multipliers, , as well. To determine these derivatives, the finite element solver, i.e. the DAE-solution (22), is computed with and . The derivatives and from all time steps \(t_{n+1}\) must be stored. END has an essential advantage if the finite element program is treated as a black-box solver, as is the case, for example, for commercial programs. Disadvantages are the computational time required and the lack of knowledge of the exact implementation.
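A generic Python sketch of this forward-difference approximation, treating the solver as a black box, follows; the function name and step size are illustrative.

```python
import numpy as np

def end_sensitivity(solver, kappa, h=1e-6):
    """External numerical differentiation in the spirit of Eq. (51):
    forward differences of a black-box solver u(kappa).
    Costs one extra solver call per parameter."""
    u0 = solver(kappa)
    J = np.empty((u0.size, kappa.size))
    for j in range(kappa.size):
        e_j = np.zeros_like(kappa)
        e_j[j] = 1.0                                 # unit vector e_j
        J[:, j] = (solver(kappa + h * e_j) - u0) / h
    return J
```

Note that each column requires a complete forward simulation, which explains the runtime factor of END over IND reported below.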

4.4 Quality measures

Today, there are many tools to identify material parameters using a least-square approach, and they are implemented very stably. The \(R^2\)-value, \(R^2 = 1- \big (\sum _{i=1}^{{n_{\text {d}}}}(d_i-s_i)^2\big )/\big (\sum _{i=1}^{{n_{\text {d}}}}(d_i-{\overline{d}})^2\big )\) with \({\overline{d}} = (1/{n_{\text {d}}}) \sum _{i=1}^{{n_{\text {d}}}} d_i\), \(R^2 \le 1\), gives a hint on how well the model reflects the course of the experimental data. However, it does not say anything about the quality of the parameters found. Either the iterative scheme stops in a local minimum of the non-linear least-square problem, or a subset of the material parameters is not uniquely determinable. In the latter case, it is possible to obtain infinitely many combinations of the parameters with the same quality regarding the experimental curves. In this context, the physical meaning of the parameters becomes questionable. This was mentioned in [4, 7] for the general case of parameter identification, and studied in solid mechanics in [25, 33, 71]. Since this is only a local concept at the solution found by some optimizer, it is merely an indicator (so far, however, it appears to be quite a useful one). In its theoretical development, an expansion is carried out about the solution, and the curvature is investigated to find out whether a local “valley” of the goal function exists. In this case, the Hessian

(52)

is exploited, or its approximation

(53)

We check the condition to verify whether there might be a unique solution; see [4, 7] for details.
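A sketch of this check based on the Gauss-Newton approximation (53) of the Hessian can look as follows (Python; the interface is ours). A determinant near zero, or an extreme condition number, hints at a non-identifiable parameter subset.

```python
import numpy as np

def identifiability_indicator(J, W=None):
    """Gauss-Newton approximation of the Hessian, H = (W J)^T (W J),
    built from the sensitivity matrix J and optional weighting W.
    Returns its determinant and condition number as indicators."""
    if W is not None:
        J = W @ J
    H = J.T @ J
    return np.linalg.det(H), np.linalg.cond(H)
```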

In our experience, the investigation of identifiability should be done using a re-identification procedure. This means that we perform a direct computation with the parameters found by the optimizer, using the same path and boundary conditions as the experiment. Then, we try to re-identify the parameters and evaluate the Hessian. The reason for this procedure is that the scatter of real experiments can essentially influence the value of the determinant, so that misinterpretations of the results can occur. A certain drawback of this indicator is that, in real applications, it is never exactly zero, so that the question arises how small this measure must be before the results become questionable. Unfortunately, this essentially depends on the problem under consideration (size of , number of parameters, influence of weighting factors, ...). The smaller it becomes, the more sensitive the parameter identification process becomes. Thus, it is an experience-based quantity.

A measure of the quality of the parameters is also provided by the confidence interval—here, we draw on the real experimental data

(54)

which is based on the diagonal components of the covariance matrix,

$$\begin{aligned} \Delta \kappa _i = \sqrt{P_{ii}}, \quad i = 1,\ldots , {n_{\kappa }}, \end{aligned}$$
(55)

with

(56)

Here, we have the standard deviation

(57)

These investigations can be carried out after a parameter set has been determined and are totally independent of the parameter identification method. In view of analytical and numerical parameter identification studies, we refer to [34]. A quantity derived from the covariance matrix is the correlation matrix

(58)

This matrix provides a hint whether parameters are correlated, see, for example [11].
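The mapping from covariance to correlation matrix in the sense of Eq. (58), \(R_{ij} = P_{ij}/\sqrt{P_{ii}P_{jj}}\), is straightforward to sketch; entries near \(\pm 1\) off the diagonal flag strongly correlated parameter pairs.

```python
import numpy as np

def correlation_matrix(P):
    """Correlation matrix from a covariance matrix P:
    R_ij = P_ij / sqrt(P_ii * P_jj)."""
    s = np.sqrt(np.diag(P))
    return P / np.outer(s, s)
```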

5 Identification at biaxial tensile test

To identify the material parameters and , we proceed as follows. In a first step, only the termination points of relaxation resulting from the loading process in Fig. 3b are chosen to determine . In the second identification step, we draw on the loading with four different displacement rates, see Fig. 3a, and the multi-step relaxation loading path shown in Fig. 3b. These tests are required to determine the material parameter set .

The geometry of the biaxial tensile specimen is shown in Fig. 2. We exploit triple symmetry, i.e. only one-eighth of the specimen is discretized, and draw on 20-noded mixed hexahedral elements (Q2P1-elements of [74]); see also [22] for details. Figure 4 shows the mesh and the boundary conditions concerned. The triangulated surface regions of both the DIC-data points and the finite element nodal points are compared in Fig. 5. Here, we apply the triangulation tool proposed in [72, 73].

Fig. 4
figure 4

Mesh using 20-noded Q2P1-elements and the boundary conditions concerned (\({n_{\text {u}}}= 8472\))

Fig. 5
figure 5

Triangulation of surface data for strain determination

5.1 Identification of equilibrium stress part

With regard to the findings on unique identification of the material parameters in biaxial experiments from [33], one approach is to apply a much larger deformation in one direction than in the other. We prescribe a four times larger displacement in the vertical direction, see Fig. 3b. Since we have an overstress-type model, the material parameters of the equilibrium stress part are determined in a first step by the termination points of relaxation, \(\tilde{\mathbf {T}}_{\text {ov}} = {\mathbf {0}}\) in Eq. (2). We store the maximum and minimum strain state of the DIC-data as well as the vertical and horizontal forces of the testing machine’s force gauge at the termination points of relaxation. These data points are compared with the finite element solution in a least-square sense. In this case, there are no internal variables, so that we have only a NLS-problem applied to systems of non-linear equations, see Eq. (37), considering either IND, see Eqs. (38)-(39) for the functional matrices required by the Matlab tool lsqnonlin.m, or END, as discussed in Sect. 4.3. Here, we compare both the surface strains of the strain measure \({\mathbf {E}}^{(1)} = {\mathbf {U}} - {\mathbf {1}}\), where \({\mathbf {U}}\) is the right stretch tensor of the polar decomposition of the deformation gradient \({\mathbf {F}} = {\mathbf {R}}\,{\mathbf {U}}\), and the force-displacement curves provided by the termination points of relaxation.

The resulting material parameters are compiled in Table 1.

Table 1 Identified material parameters of equilibrium stress part and overstress part
Fig. 6
figure 6

Force-displacement identification result and relative error in principal strains

We obtain the results in Fig. 6a with an \(R^2\)-value of 0.994. The computed strain distribution at the final point is compared with the experimental strain distribution. Here, we consider the maximum and minimum principal strains. Figure 6b shows the relative error of the maximum principal strains

$$\begin{aligned} e = \frac{|\varepsilon _{\text {max,exp}} - \varepsilon _{\text {max,sim}}|}{|\varepsilon _{\text {max,exp}}|} \times 100. \end{aligned}$$
(59)

Here, the relative error is around 20% in the center region of the specimen. The confidence intervals of the material parameters indicate that only \(c_{10}\) is sensitive, which is due to the fact that the identification primarily considers the middle region of the sample and not so much the heavily loaded sample arms. It is well known that the first and second invariants \(\text {I}_{{\overline{\mathbf {C}}} }\) and \(\text {II}_{{\overline{\mathbf {C}}} }\) are strongly coupled in pure tensile tests. In the biaxial tensile test this is less pronounced, but visible in the correlation matrix:

(60)

There is a strong correlation between the parameters \(c_{10}\) and \(c_{01}\), see Eq. (58), which is known since the invariants \(\text {I}_{{\overline{\mathbf {C}}} }\) and \(\text {II}_{{\overline{\mathbf {C}}} }\) are not independent of each other in the small strain range. Here, we have a square of the standard deviation of \(s^2 = 0.053\).

The computations require 25 iterations with the starting vector . In the case of END, 104 function calls are needed, and it requires approximately 3.4 times the computational time of IND. It should be noted that the number of iterations depends on the starting vector. Moreover, only a local minimum is found, i.e. there is no statement about a global minimum with the selected numerical scheme. In our experience so far, the concept of local identifiability provides relatively stable, unique solutions.

Fig. 7
figure 7

Force-displacement results of identification of rate dependence and relaxation

To find out whether the material parameters are uniquely identifiable, a re-identification concept is applied. In this case, a forward computation is performed with the material parameters found, and the simulation data are chosen as virtual experimental data. Then, the determinant of the Hessian (53) is evaluated. If the parameters are not uniquely identifiable, the determinant is zero. In our case, holds. Thus, identifiability is guaranteed according to the chosen indicator, see the discussion in [33], since it is only moderately small.

5.2 Identification of overstress part

In the viscoelastic case, we have \({n_{\text {u}}}=8472\) unknown nodal displacements and \({n_{\text {Q}}}= 194400\) unknown internal variables in the whole structure. Here, we fix the material parameters , see Table 1, and determine the parameters on the basis of the four displacement rates in the loading paths of Fig. 3a and the multi-step relaxation path of Fig. 3b. Once again, we make use of the measured horizontal and vertical force data as well as the maximum and minimum strain data, adding up to significantly more data than for the parameter identification of the equilibrium stress part. The time-adaptive integrator is the second-order scheme of Ellsiepen, see [10, 14], which is based on the second-order method of [1]. The result of the identification of the force-displacement curves is shown in Fig. 7 (the strain data are not shown here). We have \(R^2 = 0.9903\), and the parameters are assembled in Table 1, resulting from the starting values of the iterative scheme . Since the confidence interval is larger than the value itself, the parameter \(s_0\), required to incorporate rate dependence, is not of high quality. This is also indicated by the correlation matrix

(61)

where the viscosity \(\eta _0\) and \(s_0\) are strongly correlated. The square of the standard deviation, \(s^2= 0.0573\), is, however, small. The re-identification procedure yields the determinant of the Hessian , indicating identifiability.

The comparison of the computations with analytical derivatives (IND) and numerical differentiation (END) again yields a factor of 3.5, i.e. END requires considerably more time for the identification process.

A brief comparison of a Backward–Euler computation with the step-size controlled, second-order SDIRK-method of Ellsiepen yields, for one computation of the multi-step relaxation path of Fig. 3b, the step-size behavior shown in Fig. 8. Both schemes are computed adaptively; however, the step-size control of the BE-method relies on the number of global Newton iterations within the Multilevel-Newton algorithm: we reduce the step-size by a factor of 0.5 if the number of iterations exceeds 15, and increase the load step by a factor of 1.2 if the iteration number is less than 4.

Fig. 8
figure 8

Step-size behavior of Backward–Euler method and second-order SDIRK-scheme for multi-step-relaxation path

Although the time-steps become larger for the BE-scheme during the relaxation path, it requires only a factor of 1.12 in computational time for this process. However, this depends essentially on the error tolerances assumed for the SDIRK time-step estimation. After each change of the process path, the implementation of the step-size control starts again with a step-size of \(\Delta t_n= 10^{-4}\hbox {s}\).

6 Conclusions

In this paper, a non-linear least-square method of Matlab, which requires gradients, is applied to the residual between full-field measurement data and finite element computations. The usual approach of including only displacements has been extended here by the consideration of reaction forces and by a three-dimensional surface strain tool applied to the finite element nodal data. This concerns, in particular, both the derivation of the equations by the method of Lagrange multipliers and the three-dimensional strain calculation tool, which are necessary for the analytical calculation of the sensitivities. Both are required since considering forces improves the identification process, while the strain information circumvents the rigid-body motions that are commonly inherent in displacement data. In particular, the in-plane strain analysis tool is a simpler approach for curved three-dimensional surfaces, whose treatment is difficult to implement in finite element programs. Furthermore, higher-order diagonally-implicit Runge–Kutta methods are chosen as time integrators for the resulting DAE-system. As a consequence of using a thorough matrix notation and of interpreting the finite element method as the solution of DAE-systems, it is possible to use methods of numerical mathematics and to arrive at a general approach. In this context, a comparison of internal numerical differentiation (the sensitivities are computed with analytical expressions provided by the code-generation tool Acegen) and external numerical differentiation is provided. Although the latter is more flexible, it requires more than three times the computational time in our applications (depending on the number of material parameters). This is demonstrated using data for rubber based on biaxial tensile experiments.
Additionally, it is shown that measures such as the confidence interval, the correlation matrix, and the concept of identifiability are helpful indicators of the quality of the results of the optimization tool. It turns out that not all parameters of the applied finite strain viscoelasticity model are necessary to reproduce the biaxial tensile test data obtained using digital image correlation.