Introduction

Modular robot manipulators (MRMs) [1, 2] are built from standard modules and adapt to severe working conditions by changing their configurations and by adding or removing modules. Owing to their modularity and light weight, MRMs have potential in numerous unmanned and complex environments, such as aerospace exploration, search-and-rescue operations, and medical assistance. Effective control strategies are therefore needed to ensure safety and low energy consumption.

According to the recent literature, the tracking control strategies of MRMs can be classified into centralized control [3, 4], distributed control [5, 6], and decentralized control [7,8,9]. Centralized and distributed controllers use the information of all subsystems or of neighboring subsystems, which conflicts with the original intention of the mechanical design of MRMs and increases the communication load. Indeed, the primary characteristic of MRMs is that joint modules can be added, removed, and replaced without redesigning the controller. From the “modularization” point of view, it is therefore preferable to develop a controller that uses only the information of its own subsystem. Decentralized control, which suits MRMs well, has accordingly attracted the interest of many scholars. In addition, most robot control strategies rely on accurate model dynamics. However, the complete dynamics of MRMs cannot be obtained, because their configurations change as modules are added or removed for different task environments. To mitigate the influence of model uncertainties, Zhu et al. [10] introduced a first-order Takagi–Sugeno fuzzy logic system to approximate the unknown dynamics and proposed a decentralized adaptive fuzzy sliding mode control scheme for MRMs. Zhou et al. [11] developed a torque-sensorless decentralized force/position control scheme by utilizing radial basis function neural networks (RBFNNs). Another feasible option is to adopt the joint torque feedback (JTF) technique [12, 13] to reduce the complexity of the dynamics and to improve generality in practice. Zhang et al. [14] presented a modular distributed control technique for MRMs in which the model uncertainties associated with link and payload masses were compensated using joint torque sensor measurements.
Nevertheless, the methods mentioned above neglect the joint optimization of control performance and power consumption, as well as the high energy cost caused by long-running computation and communication. To the best of our knowledge, there have been very few attempts to develop decentralized optimal tracking control methods for robots, especially decentralized tracking control that integrates adaptive dynamic programming (ADP) with an event-triggered algorithm for MRMs.

Optimal control has received widespread attention from both researchers and engineers since the mid-1950s. As an effective way to solve optimal control problems of nonlinear systems, the ADP algorithm, which was first proposed by Werbos [15], can avoid the “curse of dimensionality”. Recently, ADP-based methods have been utilized to design optimal controllers for continuous-time [16, 17] and discrete-time [18, 19] nonlinear systems with input/output constraints [20, 21], external disturbances [22, 23], and mismatched interconnections [24, 25]. As the optimal control problems of nonlinear systems have gradually been solved, ADP-based optimal control approaches [26] have been applied to various fields [27, 28]. Nevertheless, all the aforementioned control methods were developed on the basis of the time-triggered mechanism, which incurs a large amount of unnecessary computation, communication, and energy cost over a long working time. In the last few years, the event-triggered mechanism [29, 30] has been employed to address these problems. Kyriakos et al. [31] proposed a novel optimal adaptive event-triggered control algorithm for nonlinear continuous-time systems. Yang et al. [32] tackled the optimal event-triggered control problem of nonlinear continuous-time systems subject to asymmetric control constraints. For interconnected systems, Vignesh et al. [33] presented an approximate optimal distributed control scheme for nonzero-sum games. He et al. [34] designed a decentralized event-triggered control method for nonlinear systems with matched interconnections. For MRM systems, Dong et al. [35] proposed a time-triggered decentralized robust optimal control for MRMs via a critic-identifier structure-based ADP approach. Zhao et al. [36] developed an event-triggered decentralized optimal tracking control approach by employing a local neural network (NN) observer to estimate the unknown model dynamics.
In general, since the components of each MRM module are basically identical in practice, the dynamics of MRMs are usually partially known, for example, the actuator specifications and the reduction ratio. Besides, training an NN needs a large amount of online or offline data, which consumes computation, communication, and energy resources. These costs should therefore be taken into account to extend the service time of MRMs. Unfortunately, few ADP-based event-triggered decentralized tracking control approaches for MRMs have been investigated, especially ones that consider model-based real-time compensation of model uncertainties.

Inspired by the above literature, this paper presents an event-triggered decentralized tracking control approach with a compensator-critic structure for MRMs. First, the dynamic model of MRMs, which is described as the integration of all subsystems associated with coupling dynamics, is formulated based on the JTF technique. Then, a model-based real-time robust compensator is implemented to deal with the model uncertainties. Next, the performance index function, which contains the tracking error and the control torque, is defined, and the system state is sampled according to the event-triggering condition. Based on the ADP algorithm, the event-triggered Hamilton–Jacobi–Bellman equation (HJBE) is solved by the critic NN, and the event-triggered approximate decentralized optimal tracking control policy is then obtained. By utilizing the Lyapunov stability theorem, the tracking error of the closed-loop manipulator system is proved to be uniformly ultimately bounded (UUB) under the proposed control method. Finally, the effectiveness of the proposed compensator-critic structure-based event-triggered decentralized optimal tracking control method is verified by experimental results.

The main contributions of this paper are summarized as follows.

  1.

    We address the ADP-based event-triggered decentralized tracking control problem of MRMs with a compensator-critic structure. On the basis of the JTF technique, the model-based robust compensator and the critic NN are designed to mitigate model uncertainties in real time and to approximate the optimal compensation tracking control policy, respectively.

  2.

    Unlike existing time-triggered control methods [37, 38], which ignore the conservation of limited energy resources, this paper proposes a novel compensator-critic structure-based event-triggered decentralized tracking control method for MRMs. It not only makes the actual trajectory of each joint module follow its desired one, but also reduces the computational burden and saves communication and energy resources.

The remainder of this paper is arranged as follows. “Dynamic model and preliminaries” sketches the dynamic model and preliminaries of MRM subsystems. In “Compensator-critic structure-based event-triggered decentralized tracking control”, the compensator-critic structure-based event-triggered decentralized tracking control of MRMs is proposed, and the stability analysis is given. In “Experimental results”, experiments verify the effectiveness of the developed method. “Conclusion” summarizes this paper.

Dynamic model and preliminaries

We consider an n-degree-of-freedom (DOF) serial MRM in which each module consists of a rotary joint with a direct current (DC) motor, a speed reducer, and a joint torque sensor, as shown in Fig. 1. Based on the JTF technique [39], the dynamics of the ith joint subsystem can be modeled as:

$$\begin{aligned} {I_{ri}}{\gamma _i}{\ddot{q}_i} + \frac{{{\tau _{ti}}}}{{{\gamma _i}}} + {f_{ri}}\left( {{q_i},{{\dot{q}}_i}} \right) + {Z_i}\left( {q,\dot{q},\ddot{q}} \right) = {\tau _i}, \end{aligned}$$
(1)

where \(I_{ri}\) denotes the rotor moment of inertia about the axis of rotation, \({\gamma _i}\) refers to the reduction ratio of the speed reducer, \({q_i}\) is the joint position, \({{\dot{q}}_i}\) and \({\ddot{q}_i}\) are the joint velocity and acceleration, respectively, \({\tau _{ti}}\) represents the measurement of the joint torque sensor, \({f_{ri}}\left( {{q_i},{{\dot{q}}_i}} \right) \) is the joint friction torque, \({Z_i}\left( {q,\dot{q},\ddot{q}} \right) \) indicates the dynamic coupling torque among the subsystems, and \({\tau _i}\) is the control input torque, which is also the motor output torque.

Fig. 1 Installation structure diagram of MRM

The joint friction torque \({f_{ri}}\left( {{q_i},{{\dot{q}}_i}} \right) \) mainly reflects the friction of the motor and speed reducer. Motivated by [40, 41], it is assumed to be a function of the joint position and joint velocity as:

$$\begin{aligned} {f_{ri}}\left( {{q_i},{{\dot{q}}_i}} \right)= & {} { f_{bi}}{\dot{q}_i} + \left( {{ f}_{si}}{e^{\left( { - {{ f}_{\tau i}}{{\left( {{{\dot{q}}_i}} \right) }^2}} \right) }} \right. \nonumber \\&\left. + {{ f}_{ci}} \right) {\mathrm{sgn}} \left( {{{\dot{q}}_i}} \right) + {f_{pi}}\left( {{q_i},{{\dot{q}}_i}} \right) , \end{aligned}$$
(2)

where \({f_{bi}}\) represents the viscous friction coefficient, \({f_{si}}\) is the static friction parameter, \(f_{\tau i}\) denotes a positive parameter corresponding to the Stribeck effect, \(f_{ci}\) is the Coulomb friction parameter, \({f_{pi}}\left( {{q_i},{{\dot{q}}_i}} \right) \) denotes the position dependency of friction and other friction modeling errors, and \({\mathrm{sgn}}(\cdot )\) is the sign function.
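To make the structure of the friction model (2) concrete, the following minimal Python sketch evaluates it for a single joint. The function names and the numerical parameter values are hypothetical, and the non-parametric term \(f_{pi}\) is simply passed in as a number rather than modeled.

```python
import math


def sign(v: float) -> int:
    """Sign function: returns -1, 0, or +1."""
    return (v > 0) - (v < 0)


def friction_torque(qd, f_b, f_s, f_tau, f_c, f_p=0.0):
    """Evaluate the friction model (2) for joint velocity qd.

    f_b: viscous coefficient, f_s: static friction parameter,
    f_tau: Stribeck parameter, f_c: Coulomb friction parameter,
    f_p: value of the non-parametric term (assumed given here).
    """
    stribeck = f_s * math.exp(-f_tau * qd ** 2)
    return f_b * qd + (stribeck + f_c) * sign(qd) + f_p


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    for qd in (-1.0, -0.1, 0.0, 0.1, 1.0):
        print(qd, friction_torque(qd, f_b=0.1, f_s=0.2, f_tau=5.0, f_c=0.3))
```

With \(f_{pi}=0\) the parametric part is odd in the velocity, and the Stribeck term decays toward the Coulomb level as the speed grows.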

Supposing that the nominal values of \({f_{bi}}, {f_{si}}, {f_{\tau i}}\) and \({f_{ci}}\) are close to their actual values, then according to the linearization scheme [41], the friction model (2) can be approximated by:

$$\begin{aligned}&{f_{ri}}\left( {{q_i},{{\dot{q}}_i}} \right) \approx {{\hat{f}}_{bi}}{\dot{q}_i} + \left( {{{{\hat{f}}}_{si}}{e^{\left( { - {{{\hat{f}}}_{\tau i}}{{\left( {{{\dot{q}}_i}} \right) }^2}} \right) }} + {{{\hat{f}}}_{ci}}} \right) {\mathrm{sgn}} \left( {{{\dot{q}}_i}} \right) \nonumber \\&\qquad \qquad \qquad + {f_{pi}}\left( {{q_i},{{\dot{q}}_i}} \right) + {A_i}\left( {{{\dot{q}}_i}} \right) {{\tilde{F}}_{ri}}, \end{aligned}$$
(3)

where \({{\hat{f}}_{bi}}, {{\hat{f}}_{si}}, {{\hat{f}}_{\tau i}}\), and \({{\hat{f}}_{ci}}\) are the approximate values of \({f_{bi}}, {f_{si}}, {f_{\tau i}}\), and \({f_{ci}}\), respectively, and

$$\begin{aligned} {A_i}\left( {{{\dot{q}}_i}} \right)= & {} \left[ {{\dot{q}}_i},{\mathrm{sgn}} \left( {{{\dot{q}}_i}} \right) ,{e^{\left( { - {{{\hat{f}}}_{\tau i}}{{\left( {{{\dot{q}}_i}} \right) }^2}} \right) }}{\mathrm{sgn}} \left( {{{\dot{q}}_i}} \right) ,\right. \\&\left. - {{{\hat{f}}}_{si}}{e^{\left( { - {{{\hat{f}}}_{\tau i}}{{\left( {{{\dot{q}}_i}} \right) }^2}} \right) }}{{\left( {{{\dot{q}}_i}} \right) }^2}{\mathrm{sgn}} \left( {{{\dot{q}}_i}} \right) \right] ,\\ {{\tilde{F}}_{ri}}= & {} {\left[ {{f_{bi}} - {{{\hat{f}}}_{bi}},{f_{ci}} - {{{\hat{f}}}_{ci}},{f_{si}} - {{{\hat{f}}}_{si}},{f_{\tau i}} - {{{\hat{f}}}_{\tau i}}} \right] ^T}. \end{aligned}$$
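The approximation (3) can be read as a first-order expansion of the parametric friction terms about the nominal parameters. The sketch below checks this numerically under that reading; it assumes the fourth entry of \(A_i\) is the partial derivative of the Stribeck term with respect to \(f_{\tau i}\), which involves the squared joint velocity. All function names and parameter values are hypothetical.

```python
import math


def sign(v: float) -> float:
    return float((v > 0) - (v < 0))


def parametric_friction(qd, f_b, f_c, f_s, f_tau):
    """Parametric part of (2), without the non-parametric term f_p."""
    return f_b * qd + (f_s * math.exp(-f_tau * qd ** 2) + f_c) * sign(qd)


def regressor(qd, f_s_hat, f_tau_hat):
    """Row vector A_i(qd): sensitivities w.r.t. (f_b, f_c, f_s, f_tau)."""
    e = math.exp(-f_tau_hat * qd ** 2)
    s = sign(qd)
    return [qd, s, e * s, -f_s_hat * e * qd ** 2 * s]


def linearized_friction(qd, hat, err):
    """Nominal friction plus A_i * F_tilde; hat and err are ordered
    (f_b, f_c, f_s, f_tau) to match the ordering of F_tilde."""
    a = regressor(qd, hat[2], hat[3])
    return parametric_friction(qd, *hat) + sum(x * d for x, d in zip(a, err))


if __name__ == "__main__":
    hat = (0.1, 0.3, 0.2, 5.0)        # hypothetical nominal values
    err = (0.01, -0.01, 0.02, 0.1)    # hypothetical small parameter errors
    actual = [h + e for h, e in zip(hat, err)]
    exact = parametric_friction(0.3, *actual)
    print(exact, linearized_friction(0.3, hat, err))
```

The residual between the two printed values is of second order in the parameter errors, which is why the nominal values need to be close to the actual ones.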

Remark 1

In practice, the joint friction torque \({f_{ri}}\) is bounded, and its parameters are nearly constant, being affected only slightly by temperature and lubrication. Thus, it is reasonable to assume that the estimation error term \({{\tilde{F}}_{ri}}\) is also bounded elementwise as \(|{{\tilde{F}}_{ri}}| \le {\beta _{Fbi}}\), where \({\beta _{Fbi}}\) denotes the positive constant bound on the bth element with \(b = 1,2,3,4\). The non-parametric friction term \({f_{pi}}\left( {{q_i},{{\dot{q}}_i}} \right) \) has an upper bound as \(|| {{f_{pi}}\left( {{q_i},{{\dot{q}}_i}} \right) } || \le {\beta _{pi}}\) with \({\beta _{pi}}\) a positive constant.

On the basis of the dynamic model in [42], the dynamic coupling torque \({Z_i}\left( {q,\dot{q},\ddot{q}} \right) \) can be obtained by:

$$\begin{aligned} {Z_i}\left( {q,\dot{q},\ddot{q}} \right)&= {I_{ri}}\sum \limits _{j = 1}^{i - 1} {c_{ri}^T{c_j}{{\ddot{q}}_j}} + {I_{ri}}\sum \limits _{j = 2}^{i - 1} {\sum \limits _{k = 1}^{j - 1} {c_{ri}^T\left( {{c_k} \times {c_j}} \right) {{\dot{q}}_k}{{\dot{q}}_j}} } \nonumber \\&\buildrel \varDelta \over = {I_{ri}}\sum \limits _{j = 1}^{i - 1} {\varPhi _j^i{{\ddot{q}}_j}} + {I_{ri}}\sum \limits _{j = 2}^{i - 1} {\sum \limits _{k = 1}^{j - 1} {\varPsi _{kj}^i{{\dot{q}}_k}{{\dot{q}}_j}} } \nonumber \\&= {\sum \limits _{j = 1}^{i - 1} {\left[ {\begin{array}{*{20}{c}} {{I_{ri}}{\hat{\varPhi }} _j^i}&{{I_{ri}}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {{{\ddot{q}}_j}}&{{\tilde{\varPhi }} _j^i{{\ddot{q}}_j}} \end{array}} \right] } ^T} \nonumber \\&\quad + {\sum \limits _{j = 2}^{i - 1} {\sum \limits _{k = 1}^{j - 1} {\left[ {\begin{array}{*{20}{c}} {{I_{ri}}{\hat{\varPsi }} _{kj}^i}&{{I_{ri}}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {{{\dot{q}}_k}{{\dot{q}}_j}}&{{\tilde{\varPsi }} _{kj}^i{{\dot{q}}_k}{{\dot{q}}_j}} \end{array}} \right] }^T}}, \end{aligned}$$
(4)

where \(c_{ri}\), \(c_{j}\), and \(c_{k}\) represent unit vectors along the rotation axes of the ith, the jth, and the kth joint, respectively. Accordingly, we define \(\varPhi _j^i=c_{ri}^T{c_j}\) and \(\varPsi _{kj}^i = c_{ri}^T\left( {{c_k} \times {c_j}} \right) \). Moreover, we have \(\varPhi _j^i = {\hat{\varPhi }} _j^i + {\tilde{\varPhi }} _j^i\) and \(\varPsi _{kj}^i = {\hat{\varPsi }} _{kj}^i + {\tilde{\varPsi }} _{kj}^i\), where \({\hat{\varPhi }} _j^i\) and \({\hat{\varPsi }} _{kj}^i\) are the estimates of \(\varPhi _j^i\) and \(\varPsi _{kj}^i\), and \({\tilde{\varPhi }} _j^i\) and \({\tilde{\varPsi }} _{kj}^i\) denote the corresponding alignment errors.
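As an illustration of the first line of (4), the following sketch evaluates the coupling torque directly from the joint axis vectors. It is a plain-Python sketch under the assumption that all axis vectors are expressed in a common frame; the function names and the example geometry are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def coupling_torque(i, I_ri, c_ri, c, qd, qdd):
    """Coupling torque Z_i from the first line of (4).

    i is the 1-based joint index; c, qd, qdd hold the axis unit vectors,
    velocities, and accelerations of joints 1..i-1 (list index 0 = joint 1).
    """
    # Acceleration coupling: sum over j = 1..i-1 of Phi_j^i * qdd_j.
    accel = sum(dot(c_ri, c[j]) * qdd[j] for j in range(i - 1))
    # Velocity coupling: sum over j = 2..i-1, k = 1..j-1 of Psi_kj^i * qd_k * qd_j.
    vel = sum(dot(c_ri, cross(c[k], c[j])) * qd[k] * qd[j]
              for j in range(1, i - 1) for k in range(j))
    return I_ri * (accel + vel)


if __name__ == "__main__":
    # Hypothetical third joint along z, with lower joints along x and y.
    z = coupling_torque(3, 0.5, (0, 0, 1), [(1, 0, 0), (0, 1, 0)],
                        [2.0, 3.0], [7.0, 11.0])
    print(z)
```

For the first joint both sums are empty, so the coupling torque vanishes, which matches the "joint by joint" view of the lower joints.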

Remark 2

From the dynamic coupling torque (4), since \(c_{ri}\), \({c_j}\), and \({c_k}\) are unit vectors, the terms \(\varPhi _j^i\) and \(\varPsi _{kj}^i\) are bounded as \(||{\varPhi _j^i} || = || {c_{ri}^T{c_j}} || \le 1\) and \(|| {\varPsi _{kj}^i} ||= || {c_{ri}^T\left( {{c_k} \times {c_j}} \right) } || \le 1\), respectively. Moreover, we also conclude that if the jth and the kth \((1<j,k<i-1)\) joints are assembled below the ith joint, then the dynamic coupling term \({Z_i}\left( {q,\dot{q},\ddot{q}} \right) \) is bounded as \(|| {{Z_i}\left( {q,\dot{q},\ddot{q}} \right) } || \le {\beta _{Zi}}\) with \({\beta _{Zi}}\) a positive constant. Accordingly, the MRM can be controlled “joint by joint”, such that the lower joints are all controlled when the current joint is controlled.

According to (1) and (3), the dynamic model of the MRM subsystem is described by:

$$\begin{aligned} {\dot{x}_i} = \left\{ \begin{array}{l} {{\dot{x}}_{1i}} = {x_{2i}}\\ {{\dot{x}}_{2i}} = {\varGamma _{fi}} + {\varTheta _i}+ {D_i}{u_i}, \end{array} \right. \end{aligned}$$
(5)

where \({x_i} = {\left[ {\begin{array}{*{20}{c}} {{x_{1i}}}&{{x_{2i}}} \end{array}} \right] ^T} = {\left[ {\begin{array}{*{20}{c}} {{q_i}}&{{{\dot{q}}_i}} \end{array}} \right] ^T} \in {{{\mathbb {R}}}^{2}}\), \({D_i} = {\left( {{I_{ri}}{\gamma _i}} \right) ^{ - 1} \in {{{\mathbb {R}}}^+}}\), \({\varGamma _{fi}}= -{D_i}\) \(\left( {{{{\hat{f}}}_{bi}}{x_{2i}} + \left( {{{{\hat{f}}}_{si}}{e^{\left( { - {{{\hat{f}}}_{\tau i}}{{\left( {{x_{2i}}} \right) }^2}} \right) }} + {{{\hat{f}}}_{ci}}} \right) {\mathrm{sgn}} \left( {{x_{2i}}} \right) }+{\frac{{{\tau _{ti}}}}{{{\gamma _i}}}}\right) \), \(u_{i}=\tau _{i}\) represents the ith joint control torque, and the model uncertainty which includes the friction model error and the interconnection joint coupling can be given as:

$$\begin{aligned} {\varTheta _i}= - {D_i}\left( {{ {f_{pi}}\left( {{x_{1i}},{x_{2i}}} \right) + {A_i}\left( {{x_{2i}}} \right) {{{\tilde{F}}}_{ri}}} +{Z_i}\left( {x,\dot{x},\ddot{x}} \right) } \right) . \end{aligned}$$
(6)
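The state-space form (5) can be sketched as a function that returns the state derivative of one joint subsystem. This is only an illustrative sketch: the uncertainty \(\varTheta _i\) is passed in as a number (it is unknown in practice), and the function name and numerical values are hypothetical.

```python
import math


def subsystem_derivative(x, u, tau_t, I_r, gamma, f_hat, theta=0.0):
    """Right-hand side of (5) for one joint.

    x = (x1, x2) is (position, velocity), u is the control torque,
    tau_t is the joint torque sensor reading, f_hat = (f_b, f_s, f_tau, f_c)
    are nominal friction parameters, and theta stands in for Theta_i.
    """
    x1, x2 = x
    f_b, f_s, f_tau, f_c = f_hat
    D = 1.0 / (I_r * gamma)
    sgn = (x2 > 0) - (x2 < 0)
    # Known dynamics Gamma_fi: nominal friction plus scaled sensor torque.
    gamma_f = -D * (f_b * x2
                    + (f_s * math.exp(-f_tau * x2 ** 2) + f_c) * sgn
                    + tau_t / gamma)
    return (x2, gamma_f + theta + D * u)


if __name__ == "__main__":
    # Hypothetical values: at rest, u balances the reflected sensor torque.
    print(subsystem_derivative((0.0, 0.0), u=0.5, tau_t=1.0,
                               I_r=0.5, gamma=2.0,
                               f_hat=(0.1, 0.2, 5.0, 0.3)))
```

Because the torque sensor term appears inside \(\varGamma _{fi}\), the link-side dynamics never need to be modeled explicitly in this subsystem description.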

Assumption 1

The nonlinear system dynamics (5) are Lipschitz continuous for the state \({x_i} \in \varOmega \), each subsystem is controllable, and \({x_i} = 0\) is an equilibrium of the system.

In this paper, we propose a compensator-critic structure-based event-triggered decentralized tracking control of MRMs based on ADP algorithm. The aim is to find a decentralized near-optimal control policy \(u_i\) to guarantee the stability of the closed-loop MRM subsystem. For the subsystem (5), the improved infinite horizon performance index function is defined as:

$$\begin{aligned} {\varXi _i}\left( {{\vartheta _i}\left( {{x_i}} \right) } \right)= & {} \int \limits _0^\infty \left( \vartheta _i^T\left( {{x_i}\left( t \right) } \right) {Q_i}{\vartheta _i}\left( {{x_i}\left( t \right) } \right) \right. \nonumber \\&\left. + u_i^T\left( {\vartheta _i}\left( {{x_i}\left( t \right) } \right) \right) {R_i}{u_i}\left( {\vartheta _i}\left( {{x_i}\left( t \right) } \right) \right) \right) \mathrm{{d}}t, \end{aligned}$$
(7)

where \({\vartheta _i}\left( {{x_i}} \right) = {x_{2i}} - {x_{2di}} + {a_{ei}}\left( {{x_{1i}} - {x_{1di}}} \right) \) is the hybrid error function, which includes the position error and the velocity error, with \({\vartheta _{i0}}\left( {{x_i}\left( 0 \right) } \right) = {\vartheta _i}\left( 0 \right) \); \({a_{ei}}\) is a positive constant; \({x_{1di}}\) and \({x_{2di}}\) denote the desired position and velocity trajectories, respectively; \({Q_i}\) and \({R_i}\) are positive definite matrices; and \({N_i}\left( {{\vartheta _i},{u_i}\left( {\vartheta _i} \right) } \right) = \vartheta _i^T{Q_i}{\vartheta _i} + u_i^T\left( {{\vartheta _i}} \right) {R_i}{u_i}\left( {{\vartheta _i}} \right) \ge 0\) is the utility function with \({N_i}\left( {0,0} \right) = 0\). Here, \({u_i}\left( {{\vartheta _i}} \right) ={u_i}\left( {\vartheta _i}\left( {{x_i}\left( t \right) } \right) \right) \) is designed to realize the decentralized tracking control by transforming \({u_i}\left( {{x_i}} \right) \) into \({u_i}\left( {{\vartheta _i}} \right) \).
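The hybrid error and the utility function can be illustrated with a short sketch for one joint, where \(\vartheta _i\), \(u_i\), \(Q_i\), and \(R_i\) are scalars. The finite-horizon rectangle-rule sum is an assumption made here only to approximate the infinite-horizon integral in (7); all names and values are hypothetical.

```python
def hybrid_error(x1, x2, x1d, x2d, a_e):
    """Hybrid error: velocity error plus a_e times position error."""
    return (x2 - x2d) + a_e * (x1 - x1d)


def utility(theta, u, Q, R):
    """Utility N_i for scalar error theta and scalar control u."""
    return Q * theta ** 2 + R * u ** 2


def performance_index(thetas, us, Q, R, dt):
    """Rectangle-rule approximation of (7) over sampled data.

    Truncating to a finite horizon is an assumption of this sketch.
    """
    return sum(utility(t, u, Q, R) for t, u in zip(thetas, us)) * dt


if __name__ == "__main__":
    theta = hybrid_error(x1=1.5, x2=0.2, x1d=1.0, x2d=0.0, a_e=2.0)
    print(theta, utility(theta, u=0.3, Q=1.0, R=0.5))
```

A larger \(a_{ei}\) weights the position error more heavily within \(\vartheta _i\), while the ratio of \(Q_i\) to \(R_i\) trades tracking accuracy against control effort.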

Then, we develop the time-triggered HJBE for subsystem (5) as:

$$\begin{aligned} {H_i}\left( {{\vartheta _i},{u_i}\left( {{\vartheta _i}} \right) ,\nabla {\varXi _i}\left( {{\vartheta _i}} \right) } \right)&= {{{N_i}\left( {{\vartheta _i},{u_i}\left( {{\vartheta _i}} \right) } \right) }} + \nabla \varXi _i^T\left( {{\vartheta _i}} \right) {{\dot{\vartheta }}_i}\nonumber \\&= \vartheta _i^T{Q_i}{\vartheta _i} + u_i^T\left( {{\vartheta _i}} \right) {R_i}{u_i}\left( {{\vartheta _i}} \right) \nonumber \\&\quad + \nabla \varXi _i^T\left( {{\vartheta _i}} \right) \nonumber \\&\quad \left( {{\varGamma _{fi}} + {D_i}{u_i}\left( {{\vartheta _i}} \right) + {\varTheta _i}+ {\upsilon _i}} \right) , \end{aligned}$$
(8)

where \(\nabla {\varXi _i}\left( {{\vartheta _i}} \right) \) denotes the partial derivative of \({\varXi _i}\left( {{\vartheta _i}} \right) \) with respect to \({\vartheta _i}\), i.e., \(\nabla {\varXi _i}\left( {{\vartheta _i}} \right) = \partial {\varXi _i}\left( {{\vartheta _i}} \right) /\partial {\vartheta _i}\), and \({\upsilon _i}= - {{\dot{x}}_{2di}} + {a_{ei}}\left( {{{\dot{x}}_{1i}} - {{\dot{x}}_{1di}}} \right) \), which follows from differentiating \({\vartheta _i}\) along (5). The optimal performance index function is described by:

$$\begin{aligned} \varXi _i^*\left( {{\vartheta _i}} \right) = \mathop {\min }\limits _{{u_i}\left( {{\vartheta _i}} \right) } \int \limits _0^\infty {\left( {{N_i}\left( {{\vartheta _i},{u_i}\left( {{\vartheta _i}} \right) } \right) } \right) \mathrm{{d}}t}. \end{aligned}$$
(9)

Substituting (9) into the HJBE (8), we obtain:

$$\begin{aligned} 0 = \mathop {\min }\limits _{{u_i}\left( {{\vartheta _i}} \right) } {H_i}\left( {{\vartheta _i},{u_i}\left( {{\vartheta _i}} \right) ,\nabla \varXi _i^*\left( {{\vartheta _i}} \right) } \right) . \end{aligned}$$
(10)

Kalman [43] rigorously demonstrated that if the optimal performance index function \(\varXi _i^*\left( {{\vartheta _i}} \right) \) is continuously differentiable and satisfies (10), then the optimal control policy \(u_i^*\left( {{\vartheta _i}} \right) \) of the corresponding nonlinear continuous-time system exists as the solution of the HJBE and can be formulated as:

$$\begin{aligned} u_i^*\left( {{\vartheta _i}} \right) = - \frac{1}{2}R_i^{ - 1}D_i^T\nabla \varXi _i^*\left( {{\vartheta _i}} \right) . \end{aligned}$$
(11)
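The step from (10) to (11) can be made explicit as a first-order stationarity condition. The following is a brief sketch, assuming that the minimizer of the Hamiltonian (8) is an interior point and that \({\varGamma _{fi}}\), \({\varTheta _i}\), and \({\upsilon _i}\) do not depend explicitly on \(u_i\):

$$\begin{aligned} \frac{{\partial {H_i}}}{{\partial {u_i}}} = 2{R_i}{u_i}\left( {{\vartheta _i}} \right) + D_i^T\nabla \varXi _i^*\left( {{\vartheta _i}} \right) = 0 \quad \Rightarrow \quad u_i^*\left( {{\vartheta _i}} \right) = - \frac{1}{2}R_i^{ - 1}D_i^T\nabla \varXi _i^*\left( {{\vartheta _i}} \right) . \end{aligned}$$

Because \({R_i}\) is positive definite, the Hamiltonian is strictly convex in \(u_i\), so this stationary point is its minimizer.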

Then, the HJBE can be presented as:

$$\begin{aligned} 0&= {{N_i}\left( {{\vartheta _i},{u_i^*}\left( {{\vartheta _i}} \right) }\right) } \nonumber \\&\quad + {\left( {\nabla \varXi _i^*\left( {{\vartheta _i}} \right) } \right) ^T}\left( {{\varGamma _{fi}} + {\varTheta _i} + {D_i}{u_{i}^*\left( {\vartheta _i}\right) }+ {\upsilon _i}} \right) . \end{aligned}$$
(12)

To compensate the subsystem dynamics by utilizing the partially known model information, we rewrite the optimal control \(u_i^*\) as:

$$\begin{aligned} u_i^*\left( {\vartheta _i}\right) =u_{1i}\left( {\vartheta _i}\right) +u_{2i}^*\left( {\vartheta _i}\right) , \end{aligned}$$
(13)

where \(u_{1i}\left( {\vartheta _i}\right) \) is used to deal with the dynamic model terms \({\varGamma _{fi}}\) and \({\varTheta _i}\), and \(u_{2i}^*\left( {\vartheta _i}\right) \) is used to realize the optimal tracking control. Thus, combining (12) with (13), we have

$$\begin{aligned} 0&= {{N_i}\left( {{\vartheta _i},{u_i^*}\left( {{\vartheta _i}} \right) }\right) } + {\left( {\nabla \varXi _i^*\left( {{\vartheta _i}} \right) } \right) ^T}\nonumber \\&\quad \left( {{\varGamma _{fi}} + {\varTheta _i} + {D_i}\left( {u_{1i}\left( {\vartheta _i}\right) +u_{2i}^*\left( {\vartheta _i}\right) }\right) + {\upsilon _i}} \right) . \end{aligned}$$
(14)

Remark 3

On the basis of optimization theory and the dynamic model analysis, the decentralized tracking control problem of MRMs is transformed in this paper into an optimal compensation control problem (13), which consists of the model-based robust control \(u_{1i}\left( {\vartheta _i}\right) \) and the ADP-based optimal control \(u_{2i}^*\left( {\vartheta _i}\right) \). Inspired by the previous works [41, 44], the decentralized optimal tracking control is developed with a compensator-critic structure, which can not only mitigate model uncertainties in real time but also achieve satisfactory tracking performance for MRMs.

According to [38, 45], the HJBE (8) can be solved by the time-triggered ADP algorithm. However, as mentioned in [46], time-triggered optimal control strategies not only suffer from a heavy computational and communication burden, but also waste limited energy resources. To address these shortcomings, a compensator-critic structure-based event-triggered decentralized tracking control is designed for MRMs as follows.

Compensator-critic structure-based event-triggered decentralized tracking control

In this section, the detailed design procedure of compensator-critic structure-based event-triggered decentralized tracking control for MRMs is described.

The model-based robust compensator

In practice, the dynamics of each joint module are partially known. Inspired by [13, 47, 48], we present a robust compensator \({u_{1i}}\), which consists of the model-based measurable term \(u_{1mi}\) and the compensation term for dynamic uncertainties \(u_{1ui}\), and can be expressed by:

$$\begin{aligned} {u_{1i}}= & {} {u_{1mi}} + {u_{1ui}}, \end{aligned}$$
(15)
$$\begin{aligned} {u_{1mi}}= & {} {{\hat{f}}_{bi}}{x_{2i}} + \left( {{{{\hat{f}}}_{si}}{e^{\left( { - {{{\hat{f}}}_{\tau i}}{{\left( {{x_{2i}}} \right) }^2}} \right) }} + {{{\hat{f}}}_{ci}}} \right) {\mathrm{sgn}}\nonumber \\&\left( {{x_{2i}}} \right) - D_i^{ - 1}{\upsilon _i} + {\beta _{pi}} + \frac{{{\tau _{ti}}}}{{{\gamma _i}}}, \end{aligned}$$
(16)
$$\begin{aligned} {u_{1ui}}= & {} {A_i}\left( {{x_{2i}}} \right) {u_{fi}} + {u_{zi}}, \end{aligned}$$
(17)

where the compensation term \(u_{1ui}\) is designed to deal with the approximated friction model error term \({{\tilde{F}}_{ri}}\) and the dynamic coupling term \({Z_i}\left( {x,\dot{x},\ddot{x}} \right) \). Based on our previous works [49, 50], \(u_{fi}\) and \(u_{zi}\) are presented to compensate \({{\tilde{F}}_{ri}}\) and \({Z_i}\left( {x,\dot{x},\ddot{x}} \right) \), respectively. The robust compensator \(u_{fi}\) can be designed as:

$$\begin{aligned} {u_{fi}}= & {} - {\kappa _{fi}}\left( {\int \limits _0^t {A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}\mathrm{{d}}\tau } } \right) \nonumber \\&- \left\{ {\begin{array}{*{20}{c}} {{\xi _{Fib}}\frac{{A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}}}{{\left| {A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}} \right| }},\quad \mathrm{{if}}\left| {A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}} \right| > {\kappa _{fib}}}\\ {{\xi _{Fib}}\frac{{A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}}}{{{\kappa _{fib}}}},\quad \mathrm{{if}}\left| {A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}} \right| \le {\kappa _{fib}}} \end{array}} \right. , \end{aligned}$$
(18)

where \({\kappa _{fi}}\) and \({\kappa _{fib}}\) are positive parameters with \(b = 1,2,3,4\). To facilitate the analysis of the dynamic coupling term \({Z_i}\left( {x,\dot{x},\ddot{x}} \right) \), one can rewrite the term as:

$$\begin{aligned} {Z_i}\left( {x,\dot{x},\ddot{x}} \right)&= {\sum \limits _{j = 1}^{i - 1} {\left[ {\begin{array}{*{20}{c}} {{I_{ri}}{\hat{\varPhi }} _j^i}&{{I_{ri}}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {{{\ddot{q}}_j}}&{{\tilde{\varPhi }} _j^i{{\ddot{q}}_j}} \end{array}} \right] ^T} } \nonumber \\&\quad + {\sum \limits _{j = 2}^{i - 1} {\sum \limits _{k = 1}^{j - 1} {\left[ {\begin{array}{*{20}{c}} {{I_{ri}}{\hat{\varPsi }} _{kj}^i}&{{I_{ri}}} \end{array}} \right] \left[ {\begin{array}{*{20}{c}} {{{\dot{q}}_k}{{\dot{q}}_j}}&{{\tilde{\varPsi }} _{kj}^i{{\dot{q}}_k}{{\dot{q}}_j}} \end{array}} \right] ^T} } }\nonumber \\&\buildrel \varDelta \over = \sum \limits _{j = 1}^{i - 1} {{\bar{\varPhi }} _j^iU_j^i} + \sum \limits _{j = 2}^{i - 1} {\sum \limits _{k = 1}^{j - 1} {{\bar{\varPsi }} _{kj}^iZ_{kj}^i} }. \end{aligned}$$
(19)

The robust compensator \(u_{zi}=u_{z1i}+u_{z2i}\) is developed to compensate the terms \(\sum \nolimits _{j = 1}^{i - 1} {{\bar{\varPhi }} _j^iU_j^i}\) and \( \sum \nolimits _{j = 2}^{i - 1} {\sum \nolimits _{k = 1}^{j - 1} {{\bar{\varPsi }} _{kj}^iZ_{kj}^i} }\) in (19) as:

$$\begin{aligned} {u_{z1i}}= & {} - {\kappa _{1i}}\int \limits _0^t {{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}} \mathrm{{d}}\tau \nonumber \\&+ \left\{ {\begin{array}{*{20}{c}} { - {\xi _{1oi}}\frac{{{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}}}{{\left| {{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}} \right| }},if\left| {{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}} \right| > {\kappa _{z1oi}}}\\ { - {\xi _{1oi}}\frac{{{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}}}{{{\kappa _{z1oi}}}},if\left| {{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}} \right| \le {\kappa _{z1oi}}} \end{array}} \right. , \end{aligned}$$
(20)
$$\begin{aligned} {u_{z2i}}= & {} - {\kappa _{2i}}\int \limits _0^t {{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}} \mathrm{{d}}\tau \nonumber \\&+ \left\{ {\begin{array}{*{20}{c}} { - {\xi _{2oi}}\frac{{{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}}}{{\left| {{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}} \right| }},if\left| {{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}} \right| > {\kappa _{z2oi}}}\\ { - {\xi _{2oi}}\frac{{{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}}}{{{\kappa _{z2oi}}}},if\left| {{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}} \right| \le {\kappa _{z2oi}}} \end{array}} \right. , \end{aligned}$$
(21)

where \({\kappa _{1i}}\), \({\kappa _{2i}}\), \({\kappa _{z1oi}}\), \({\kappa _{z2oi}}\), \({\xi _{1oi}}\), and \({\xi _{2oi}}\) are positive parameters with \(o=1,2\). According to (15), (16), (17), (18), (20) and (21), the robust compensator \(u_{1i}\) can be presented as:

$$\begin{aligned} {u_{1i}} \left( {\vartheta _i} \right) = {u_{1mi}} + {A_i}\left( {{x_{2i}}} \right) {u_{fi}} + {u_{z1i}} +{u_{z2i}}. \end{aligned}$$
(22)
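The compensators (18), (20), and (21) share one pattern: an integral term plus a switching term that is replaced by a linear function inside a boundary layer. The sketch below shows this pattern for a single scalar channel. The class and function names are hypothetical, and the rectangle-rule integration with a fixed step is an assumption of the sketch rather than part of the design above.

```python
def saturated_switch(s, xi, kappa):
    """Switching term: -xi * s/|s| outside the boundary layer |s| > kappa,
    and the linear law -xi * s/kappa inside it."""
    if abs(s) > kappa:
        return -xi * s / abs(s)
    return -xi * s / kappa


class RobustTerm:
    """One scalar channel of a compensator with the structure of (18)."""

    def __init__(self, gain, xi, kappa):
        self.gain = gain      # integral gain (kappa_fi, kappa_1i, or kappa_2i)
        self.xi = xi          # switching amplitude
        self.kappa = kappa    # boundary-layer width
        self.integral = 0.0

    def update(self, s, dt):
        """s is the channel signal, e.g. one entry of A_i^T * theta_i."""
        self.integral += s * dt  # rectangle-rule integration (assumption)
        return -self.gain * self.integral + saturated_switch(s, self.xi, self.kappa)


if __name__ == "__main__":
    term = RobustTerm(gain=2.0, xi=0.5, kappa=0.25)
    for _ in range(3):
        print(term.update(s=1.0, dt=0.5))
```

The two branches agree at the boundary, so the compensation signal is continuous there, which is the usual motivation for a boundary layer in place of a pure sign function.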

Theorem 1

Consider an MRM working in free space whose subsystem dynamics (5) are subject to the model uncertainties in (3) and (4). The tracking errors are guaranteed to be UUB under the robust compensation control law (22).

Proof

Choose the Lyapunov function candidate for the MRM subsystem as:

$$\begin{aligned} {V_{mi}(t)}= & {} \frac{1}{2}D_i^{ - 1}\vartheta _i^2 + \frac{1}{2}{\kappa _{fi}}M_{fi}^T{M_{fi}} \nonumber \\&+ \frac{1}{2}{\kappa _{1i}}\phi _{1i}^T{\phi _{1i}} + \frac{1}{2}{\kappa _{2i}}\varphi _{2i}^T{\varphi _{2i}}, \end{aligned}$$
(23)

where

$$\begin{aligned} {M_{fi}}&= \frac{1}{{{\kappa _{fi}}}}{{\tilde{F}}_{ri}} + \int \limits _0^t {A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}\mathrm{{d}}\tau },\nonumber \\ {\phi _{1i}}&= \sum \limits _{j = 1}^{i - 1} {\left( {\frac{1}{{{\kappa _{1i}}}}U_j^i + \int \limits _0^t {{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}} \mathrm{{d}}\tau } \right) }, \nonumber \\ {\varphi _{2i}}&= \sum \limits _{j = 2}^{i - 1} {\sum \limits _{k = 1}^{j - 1} {\left( {\frac{1}{{{\kappa _{2i}}}}Z_{kj}^i + \int \limits _0^t {{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}} \mathrm{{d}}\tau } \right) } } . \end{aligned}$$
(24)

In (24), the actual terms \({{\tilde{F}}_{ri}}\), \( {U_j^i}\), and \( {Z_{kj}^i} \) are all constants. Therefore, the time derivative of (23) is expressed as:

$$\begin{aligned} {{\dot{V}}_{mi}}&= - {\vartheta _i}\left( {{{\hat{f}}}_{bi}}{x_{2i}} + \left( {{{{\hat{f}}}_{si}}{e^{\left( { - {{{\hat{f}}}_{\tau i}}{{\left( {{x_{2i}}} \right) }^2}} \right) }} + {{{\hat{f}}}_{ci}}} \right) {\mathrm{sgn}} \left( {{x_{2i}}} \right) \right. \nonumber \\&\left. \quad + {f_{pi}}\left( {{x_{1i}},{x_{2i}}} \right) + \frac{{{\tau _{ti}}}}{{{\gamma _i}}} \right) - {\vartheta _i}{A_i}\left( {{x_{2i}}} \right) {{{\tilde{F}}}_{ri}} \nonumber \\&\quad + {\kappa _{fi}}{\vartheta _i}{A_i}\left( {{x_{2i}}} \right) \left( {\frac{1}{{{\kappa _{fi}}}}{{{\tilde{F}}}_{ri}} + \int \limits _0^t {A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}\mathrm{{d}}\tau } } \right) \nonumber \\&\quad - {\vartheta _i}{Z_i}\left( {x,\dot{x},\ddot{x}} \right) \nonumber \\&\quad + {\kappa _{1i}}{\vartheta _i}{\bar{\varPhi }} _j^i\left( {\sum \limits _{j = 1}^{i - 1} {\left( {\frac{1}{{{\kappa _{1i}}}}U_j^i + \int \limits _0^t {{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}} \mathrm{{d}}\tau } \right) } } \right) \nonumber \\&\quad + {\kappa _{2i}}{\vartheta _i}{\bar{\varPsi }} _{kj}^i\left( \sum \limits _{j = 2}^{i - 1} \sum \limits _{k = 1}^{j - 1} \left( \frac{1}{{{\kappa _{2i}}}}Z_{kj}^i \right. \right. \nonumber \\&\left. \left. \quad + \int \limits _0^t {{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}} \mathrm{{d}}\tau \right) \right) + {\vartheta _i}\left( {{u_{1i}} + {\upsilon _i}} \right) . \end{aligned}$$
(25)

According to the robust compensator \(u_{1i}\) in (15), \(u_{1mi}\) and \(u_{1ui}\) are employed to deal with the known dynamics and the uncertainties, respectively. Through (16) and (22), we obtain

$$\begin{aligned} {{\dot{V}}_{mi}}&\le {\vartheta _i}{\upsilon _i} + \left( {\kappa _{fi}}{\vartheta _i}{A_i}\left( {{x_{2i}}} \right) \right. \nonumber \\&\left. \quad \times \int \limits _0^t {A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}\mathrm{{d}}\tau } + {\vartheta _i}{A_i}\left( {{x_{2i}}} \right) {u_{fi}} \right) \nonumber \\&\quad + \sum \limits _{j = 1}^{i - 1} {\left( {{\kappa _{1i}}{\vartheta _i}{\bar{\varPhi }} _j^i\int \limits _0^t {{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}} \mathrm{{d}}\tau + {\vartheta _i}{\bar{\varPhi }} _j^i{u_{z1i}}} \right) } \nonumber \\&\quad + \sum \limits _{j = 2}^{i - 1} {\sum \limits _{k = 1}^{j - 1} {\left( {{\kappa _{2i}}{\vartheta _i}{\bar{\varPsi }} _{kj}^i\int \limits _0^t {{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}} \mathrm{{d}}\tau + {\vartheta _i}{\bar{\varPsi }} _{kj}^i{u_{z2i}}} \right) } } . \end{aligned}$$
(26)

Combining (18), (20) and (21) with (26), we have

$$\begin{aligned} {{\dot{V}}_{mi}}&\le {\vartheta _i}{\upsilon _i} + {\vartheta _i}{A_i}\left( {{x_{2i}}} \right) \left\{ {\begin{array}{*{20}{c}} { - {\xi _{Fib}}\frac{{A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}}}{{\left| {A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}} \right| }},\quad \mathrm{{if}}\left| {A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}} \right|> {\kappa _{fib}}}\\ { - {\xi _{Fib}}\frac{{A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}}}{{{\kappa _{fib}}}},\quad \mathrm{{if}}\left| {A_i^T\left( {{x_{2i}}} \right) {\vartheta _i}} \right| \le {\kappa _{fib}}} \end{array}} \right. \nonumber \\&\quad + \sum \limits _{j = 1}^{i - 1} {{\vartheta _i}{\bar{\varPhi }} _j^i\left\{ {\begin{array}{*{20}{c}} { - {\xi _{1oi}}\frac{{{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}}}{{\left| {{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}} \right| }},\quad \mathrm{{if}}\left| {{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}} \right|> {\kappa _{z1oi}}}\\ { - {\xi _{1oi}}\frac{{{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}}}{{{\kappa _{z1oi}}}},\quad \mathrm{{if}}\left| {{{\left( {{\bar{\varPhi }} _j^i} \right) }^T}{\vartheta _i}} \right| \le {\kappa _{z1oi}}} \end{array}} \right. } \nonumber \\&\quad + \sum \limits _{j = 2}^{i - 1} {\sum \limits _{k = 1}^{j - 1} {{\vartheta _i}{\bar{\varPsi }} _{kj}^i\left\{ {\begin{array}{*{20}{c}} { - {\xi _{2oi}}\frac{{{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}}}{{\left| {{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}} \right| }},\quad \mathrm{{if}}\left| {{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}} \right| > {\kappa _{z2oi}}}\\ { - {\xi _{2oi}}\frac{{{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}}}{{{\kappa _{z2oi}}}},\quad \mathrm{{if}}\left| {{{\left( {{\bar{\varPsi }} _{kj}^i} \right) }^T}{\vartheta _i}} \right| \le {\kappa _{z2oi}}} \end{array}} \right. } }. \end{aligned}$$
(27)
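As an illustration of the piecewise terms in (27), the following minimal Python sketch implements the saturated (boundary-layer) switching law for a scalar signal. The function name `saturated_switch` and the scalar treatment are assumptions made for illustration and are not part of the original design.

```python
def saturated_switch(s, xi, kappa):
    """Boundary-layer switching term of the form used in (27).

    s     : scalar signal, e.g. A_i^T(x_2i) * vartheta_i
    xi    : positive gain (role of xi_Fib, xi_1oi or xi_2oi)
    kappa : positive boundary-layer width (role of kappa_fib, kappa_z1oi, kappa_z2oi)
    """
    if xi <= 0 or kappa <= 0:
        raise ValueError("xi and kappa must be positive")
    if abs(s) > kappa:
        # Outside the boundary layer: unit-magnitude switching.
        return -xi * s / abs(s)
    # Inside the boundary layer: linear interpolation, continuous at |s| = kappa.
    return -xi * s / kappa
```

In both branches the product of `s` and the returned value is non-positive, which is the property exploited when (27) is bounded.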

By a simple transformation, and assuming that \(\left\| {{A_i}\left( {{x_{2i}}} \right) A_i^T\left( {{x_{2i}}} \right) } \right\| \le {A_{Mi}}\) and \(\left\| {{\upsilon _i}} \right\| \le {\upsilon _{Mi}}\), we rewrite (27) as:

$$\begin{aligned} {\dot{V}_{mi}}\le & {} - \left( \left( \sum \limits _{b = 1}^4 {\frac{{{\xi _{Fib}}}}{{{\kappa _{fib}}}}{A_{Mi}}} + \sum \limits _{o = 1}^2 \left( \frac{{{\xi _{1oi}}}}{{{\kappa _{z1oi}}}}\right. \right. \right. \nonumber \\&\left. \left. \left. + \frac{{{\xi _{2oi}}}}{{{\kappa _{z2oi}}}}\right) \right) \left\| {{\vartheta _i}} \right\| - {\upsilon _{Mi}} \right) \left\| {{\vartheta _i}} \right\| . \end{aligned}$$
(28)

According to Lyapunov’s direct method, the tracking error \({\vartheta _i}\) is guaranteed to be UUB, since \({\dot{V}_{mi}} < 0\) whenever \({\vartheta _i}\) lies outside the compact set:

$$\begin{aligned} {\varOmega _{ci}}= & {} \left\{ {\vartheta _i}:\left\| {{\vartheta _i}} \right\| \le {\upsilon _{Mi}}/\left( \sum \limits _{b = 1}^4 {\frac{{{\xi _{Fib}}}}{{{\kappa _{fib}}}}{A_{Mi}}}\right. \right. \\&\left. \left. + \sum \limits _{o = 1}^2 \left( {\frac{{{\xi _{1oi}}}}{{{\kappa _{z1oi}}}} + \frac{{{\xi _{2oi}}}}{{{\kappa _{z2oi}}}}}\right) \right) \right\} . \end{aligned}$$

This completes the proof. \(\square \)
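To make the size of the set \({\varOmega _{ci}}\) concrete, the sketch below evaluates its radius from (28) for given gains. The function name and all numerical values are hypothetical and serve only as an illustration.

```python
def tracking_error_radius(upsilon_M, A_M, xi_F, kappa_f, xi_1, kappa_z1, xi_2, kappa_z2):
    """Radius of Omega_ci: upsilon_M divided by the aggregate gain in (28).

    xi_F, kappa_f                 : sequences indexed by b = 1..4
    xi_1, kappa_z1, xi_2, kappa_z2: sequences indexed by o = 1..2
    """
    gain = sum(x / k * A_M for x, k in zip(xi_F, kappa_f))
    gain += sum(a / b + c / d for a, b, c, d in zip(xi_1, kappa_z1, xi_2, kappa_z2))
    if gain <= 0:
        raise ValueError("aggregate gain must be positive")
    return upsilon_M / gain


# Hypothetical gains: every xi equals 1 and every kappa equals 0.5.
radius = tracking_error_radius(
    upsilon_M=0.2, A_M=1.0,
    xi_F=[1.0] * 4, kappa_f=[0.5] * 4,
    xi_1=[1.0] * 2, kappa_z1=[0.5] * 2,
    xi_2=[1.0] * 2, kappa_z2=[0.5] * 2,
)
```

With these values the aggregate gain is 16, so the radius is 0.2/16 = 0.0125. Larger gains \(\xi\) or narrower boundary layers \(\kappa\) shrink the set.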

Decentralized tracking control based on event-triggered mechanism

The event-triggered mechanism is effective in reducing the computational burden and energy cost. Under this mechanism, the decentralized tracking control input is updated only when the triggering condition is violated. Suppose that \(\left\{ {{t_l}} \right\} _{l = 0}^{ + \infty }\) is a monotonically increasing sequence of triggering instants, where \(t_{l}\) satisfies \(0< {t_l} < {t_{l + 1}}\) and \(\mathop {\lim }\nolimits _{l \rightarrow \infty } {t_l} = \infty \) for \(l \in \left\{ {0,1,2, \ldots } \right\} \). The sampled state is given by:

$$\begin{aligned} {{\hat{\vartheta }} _{li}}\left( {{{{\hat{x}}}_{li}}} \right) = {\vartheta _i}\left( {{x_i}\left( {{t_l}} \right) } \right) , \end{aligned}$$
(29)

where \({{\hat{\vartheta }} _{li}}\left( {{{{\hat{x}}}_{li}}} \right) \) is the sampled data for \(t \in \left[ {{t_l},{t_{l + 1}}} \right) \). To obtain the proper event-triggering condition, the gap function between the sampled state and the actual state is defined as:

$$\begin{aligned} {E_{li}}\left( t \right) = {{\hat{\vartheta }} _{li}}\left( {{{{\hat{x}}}_{li}}} \right) - {\vartheta _i}\left( {{x_i}} \right) ,t \in \left[ {{t_l},{t_{l + 1}}} \right) . \end{aligned}$$
(30)

Based on the event-triggering mechanism, the control policy is updated as \({\vartheta _i}\left( {{x_i}} \right) = \left\{ \begin{array}{l} {{{\hat{\vartheta }} }_{li}}\left( {{{{\hat{x}}}_{li}}} \right) , t \in \left[ {{t_l},{t_{l + 1}}} \right) \\ {\vartheta _i}\left( {{x_i}} \right) , t=t_{l+1} \end{array} \right. \). In this situation, a zero-order hold turns the decentralized tracking control input into a piecewise continuous-time signal, which is formulated as:

$$\begin{aligned} u_i\left( \vartheta _i\left( x_i\left( t_l\right) \right) \right) =u_i\left( {\hat{\vartheta }}_{li}\left( {\hat{x}}_{li}\right) \right) \end{aligned}$$
(31)

during the time interval \([t_l,t_{l+1})\). Based on (11), the event-triggered decentralized optimal control can be formulated as:

$$\begin{aligned} u_i^*\left( {{{{\hat{\vartheta }} }_{li}}\left( {{{{\hat{x}}}_{li}}} \right) } \right)= & {} u_{1i}\left( {{{{\hat{\vartheta }} }_{li}}\left( {{{{\hat{x}}}_{li}}} \right) } \right) \nonumber \\&- \frac{1}{2}R_i^{ - 1}D_i^T\nabla \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}\left( {{{{\hat{x}}}_{li}}} \right) } \right) ,\quad t \in \left[ {{t_l},{t_{l + 1}}} \right) . \end{aligned}$$
(32)

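Note that \(u_i^*\left( {{{{\hat{\vartheta }} }_{li}}\left( {{{{\hat{x}}}_{li}}} \right) } \right) \) is computed only at the aperiodic sampling instants and is therefore a discrete sequence of values; the zero-order hold keeps each value between consecutive triggering instants, so that the control signal applied to the joint is defined for all continuous time.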
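The sampling and hold logic of (29)–(31) can be sketched as follows for a scalar error \({\vartheta _i}\). This is only an illustrative sketch: the class name `EventTriggeredHold` is hypothetical, and a fixed threshold `threshold_sq` stands in for the state-dependent triggering condition derived later in the paper.

```python
class EventTriggeredHold:
    """Zero-order hold on the sampled error, refreshed when the gap grows too large."""

    def __init__(self, vartheta0):
        self.sampled = float(vartheta0)  # hat{vartheta}_{li}, held on [t_l, t_{l+1})
        self.events = 0                  # number of triggering instants so far

    def gap(self, vartheta):
        # E_li(t) = hat{vartheta}_{li} - vartheta_i(t), as in (30)
        return self.sampled - vartheta

    def update(self, vartheta, threshold_sq):
        """Resample when |E_li|^2 exceeds threshold_sq; return the held value."""
        if self.gap(vartheta) ** 2 > threshold_sq:
            self.sampled = float(vartheta)
            self.events += 1
        return self.sampled


# Hypothetical error trajectory and a fixed threshold |E|^2 <= 0.01.
hold = EventTriggeredHold(1.0)
held = [hold.update(v, threshold_sq=0.01) for v in (1.0, 0.95, 0.8, 0.75, 0.4)]
```

For this trajectory the held values are `[1.0, 1.0, 0.8, 0.8, 0.4]` and two events are triggered, so the controller would be recomputed twice instead of five times.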

Substituting (32) into (14), we establish the event-triggered HJBE as:

$$\begin{aligned} 0&= \vartheta _i^T{Q_i}{\vartheta _i} + {\left( {u_i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right) ^T}{R_i}u_i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) + {\left( {\nabla \varXi _i^*\left( {{\vartheta _i}} \right) } \right) ^T}{{{\dot{\vartheta }} }_i} \nonumber \\&={\left( {\nabla \varXi _i^*\left( {{\vartheta _i}} \right) } \right) ^T}\left( {{\varGamma _{fi}} + {\varTheta _i} + {D_i}\left( {u_{i}\left( {{{{\hat{\vartheta }} }_{li}}}\right) }\right) {+ \upsilon _i}} \right) \nonumber \\&\quad +\vartheta _i^T{Q_i}\vartheta _i + {\left( {u_{1i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right) ^T}{R_i}u_{1i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \nonumber \\&\quad + \frac{1}{4}{\left( {\nabla \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right) ^T}{D_i}R_i^{ - 1}D_i^T\nabla \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) . \end{aligned}$$
(33)

Assumption 2

The decentralized tracking control \(u_i^*\) is Lipschitz continuous for every state \({\vartheta _i}\), \({{\hat{\vartheta }} _{li}} \in \varOmega \), i.e., there exists a positive constant \({m_{li}}\), such that

$$\begin{aligned} \left\| {u_i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) -u_i^*\left( {{\vartheta _i}} \right) } \right\| \le {m_{li}}\left\| {{{{\hat{\vartheta }} }_{li}} - {\vartheta _i}} \right\| = {m_{li}}\left\| {{E_{li}}} \right\| . \end{aligned}$$
(34)

Remark 4

In the event-triggered decentralized tracking control policy (32), the subsystem error function \({\vartheta _i}\left( {{x_i}} \right) \) is substituted by \({{\hat{\vartheta }} _{li}}\left( {{{{\hat{x}}}_{li}}} \right) \) to determine the triggering time instant \(t_{l}\), and the decentralized tracking control policy is updated by \(u_i^*\left( {{\vartheta _i}\left( {{x_i}\left( {{t_l}} \right) } \right) } \right) = u_i^*\left( {{{{\hat{\vartheta }} }_{li}}\left( {{{{\hat{x}}}_{li}}} \right) } \right) \) within \(t \in \left[ {{t_l},{t_{l + 1}}} \right) \).

Critic-based event-triggered decentralized tracking control

To solve the event-triggered HJBE, a neural network (NN), which has a powerful learning ability, is utilized to approximate the performance index function \(\varXi _i^*\left( {{\vartheta _i}} \right) \) as:

$$\begin{aligned} \varXi _i^*\left( {{\vartheta _i}} \right) = W_{ci}^T{\delta _{ci}}\left( {{\vartheta _i}} \right) + {\varepsilon _{ci}}\left( {{\vartheta _i}} \right) , \end{aligned}$$
(35)

where \({W_{ci}} \in {\mathbb {R}}{^K}\) is the desired weight vector, K is the number of neurons in the hidden layer, \({\delta _{ci}}\left( {{\vartheta _i}} \right) \) is the activation function, and \({\varepsilon _{ci}}\left( {{\vartheta _i}} \right) \) is the critic NN approximation error. Thus, the partial derivative of \(\varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) \) is:

$$\begin{aligned} \nabla \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right)= & {} {\left. {\frac{{\partial \varXi _i^*\left( {{\vartheta _i}} \right) }}{{\partial {\vartheta _i}}}} \right| _{{\vartheta _i} = {{{\hat{\vartheta }} }_{li}}}}\nonumber \\= & {} \nabla \delta _{ci}^T\left( {{{{\hat{\vartheta }} }_{li}}} \right) {W_{ci}} + \nabla {\varepsilon ^T _{ci}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) , \end{aligned}$$
(36)

where \(\nabla \delta _{ci}\left( {{{{\hat{\vartheta }} }_{li}}}\right) = {\left. {\frac{{\partial \delta _{ci}\left( {{\vartheta _i}} \right) }}{{\partial {\vartheta _i}}}} \right| _{{\vartheta _i}= {{{\hat{\vartheta }} }_{li}}}}\) and \(\nabla {\varepsilon _{ci}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) = {\left. {\frac{{\partial \varepsilon _{ci}\left( {{\vartheta _i}} \right) }}{{\partial {\vartheta _i}}}} \right| _{{\vartheta _i}= {{{\hat{\vartheta }} }_{li}}}}\). According to [51], it is reasonable to assume \(||\nabla \delta _{ci}\left( {{{{\hat{\vartheta }} }_{li}}}\right) ||\le {\delta _{cid}}\) and \(||\nabla {\varepsilon _{ci}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) ||\le {\varepsilon _{cid}}\) with \({\delta _{cid}}\) and \({\varepsilon _{cid}}\) positive constants. Through Assumption 2, we have \(\left\| \nabla \delta _{ci}\left( {{{ \vartheta }_{i}}}\right) -\nabla \delta _{ci}\left( {{{{\hat{\vartheta }} }_{li}}}\right) \right\| \) \(\le P_{i}\left\| E_{li}\right\| \), where \(P_{i}\) is a positive constant. Combining (32) with (36), we can obtain

$$\begin{aligned} u_{2i}^*\left( {{{{\hat{\vartheta }} }_{li}}} \right)&= - \frac{1}{2}R_i^{ - 1}D_i^T\nabla \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) \nonumber \\&= - \frac{1}{2}R_i^{ - 1}D_i^T\left( \nabla \delta _{ci}^T\left( {{{{\hat{\vartheta }} }_{li}}} \right) {W_{ci}} + \nabla {\varepsilon ^T_{ci}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \right) . \end{aligned}$$
(37)

Therefore, the event-triggered HJBE can be rewritten as:

$$\begin{aligned}&{H_i} \left( {{\vartheta _i},u_i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) ,\nabla \varXi _i^*\left( {{\vartheta _i}} \right) } \right) \nonumber \\&\quad = \vartheta _i^T{Q_i}{\vartheta _i} + {\left( {u_i^{*T}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right) }{R_i}u_i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) \nonumber \\&\qquad + \left( {{W^T_{ci}}\nabla \delta _{ci}\left( {{\vartheta _i}} \right) } \right) \left( {\varGamma _{fi}} + {D_i}u_i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) + {\varTheta _i} + {\upsilon _i}\right) \nonumber \\&\quad = { - \nabla {\varepsilon _{ci}}\left( {{\vartheta _i}} \right) \left( {{\varGamma _{fi}} + {D_i}u_i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) + {\varTheta _i}} + {\upsilon _i} \right) } \nonumber \\&\quad \buildrel \varDelta \over = {\varepsilon _{Hi}}, \end{aligned}$$
(38)

where \({u_i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) } = u_{1i} + u_{2i}^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) \) is the ideal control torque. Since the desired weight vector \(W_{ci}\) is unavailable, the critic NN can be approximated by:

$$\begin{aligned} {{\hat{\varXi }} _i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) = {\hat{W}}_{ci}^T{\delta _{ci}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) , \end{aligned}$$
(39)

and the partial derivative of \({{\hat{\varXi }} _i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \) can be expressed by \(\nabla {{\hat{\varXi }} _i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \) \( = \nabla \delta _{ci}^T\left( {{{{\hat{\vartheta }} }_{li}}} \right) \) \({{\hat{W}}_{ci}}\). The event-triggered approximate decentralized tracking control strategy \({{\hat{u}}_{i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \) is presented as:

$$\begin{aligned} {{\hat{u}}_i\left( {{{{\hat{\vartheta }} }_{li}}} \right) }&= u_{1i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) +{\hat{u}}_{2i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \nonumber \\&= u_{1i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) - \frac{1}{2}R_i^{ - 1}D_i^T\left( {{\nabla \delta ^T _{ci}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) {\hat{W}}_{ci}} \right) . \end{aligned}$$
(40)
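A minimal numerical sketch of the control law (40) is given below for a single joint with a scalar error. The activation vector \({\delta _{ci}}(\vartheta ) = [\vartheta ^2, \vartheta ^4]^T\), the function names, and all numerical values are assumptions chosen for illustration; the paper does not fix them in this section.

```python
import numpy as np


def grad_activation(vartheta):
    """Gradient of the hypothetical activation delta_c(v) = [v**2, v**4], shape (K, 1)."""
    v = float(vartheta)
    return np.array([[2.0 * v], [4.0 * v ** 3]])


def approx_control(vartheta_hat, u1, W_hat, D, R):
    """Evaluate (40): u_hat = u1 - 0.5 * R^{-1} D^T (grad_delta^T W_hat)."""
    grad_xi_hat = grad_activation(vartheta_hat).T @ W_hat   # critic gradient, shape (1,)
    u2_hat = -0.5 * np.linalg.solve(R, D.T @ grad_xi_hat)   # solve instead of forming R^{-1}
    return u1 + u2_hat


# Hypothetical values: sampled error 0.5, compensator output 0.3, two critic weights.
u_hat = approx_control(0.5, np.array([0.3]), np.array([1.0, 2.0]),
                       D=np.array([[2.0]]), R=np.array([[1.0]]))
```

Here the critic gradient is 1·1 + 0.5·2 = 2, so the critic term equals −0.5·2·2 = −2 and the total torque is 0.3 − 2 = −1.7. Because the sampled error is held between events, this value only needs to be recomputed at triggering instants.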

Remark 5

Different from traditional ADP-based optimal control approaches that rely on actor NNs, critic NNs, and even model NNs, this paper proposes a compensator-critic structure-based event-triggered decentralized tracking control method for MRMs, which consists of a model-based robust compensator and an approximate optimal controller that uses critic NNs only.

Through (38), (39) and (40), the approximate event-triggered Hamiltonian is:

$$\begin{aligned}&{{{\hat{H}}}_i} \left( {{\vartheta _i},{{{\hat{u}}}_{i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) ,\nabla {{{\hat{\varXi }} }_i}\left( {{\vartheta _i}} \right) } \right) \nonumber \\&\quad = \vartheta _i^T{Q_i}{\vartheta _i} + {\left( {{{{\hat{u}}}_i^T}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right) }{R_i}{{{\hat{u}}}_i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \nonumber \\&\qquad + {\left( {\nabla {{{\hat{\varXi }}}_i^T}\left( {{\vartheta _i}} \right) } \right) }{{{\dot{\vartheta }} }_i} \nonumber \\&\quad = \vartheta _i^T{Q_i}{\vartheta _i} + {\left( {{{{\hat{u}}}_i^T}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right) }{R_i}{{{\hat{u}}}_i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \nonumber \\&\qquad + \left( {{{{\hat{W}}}^T_{ci}}\nabla \delta _{ci}\left( {{\vartheta _i}} \right) } \right) \left( {{\varGamma _{fi}} + {D_i}{{{\hat{u}}}_{i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) + {\varTheta _i}}+ {\upsilon _i}\right) \nonumber \\&\quad \buildrel \varDelta \over = {\varepsilon _{cHi}}. \end{aligned}$$
(41)

Comparing (38) with (41), the NN weight approximation error can be defined as \({{\tilde{W}}_{ci}} = {W_{ci}} - {{\hat{W}}_{ci}}\), and the residual error \({\varepsilon _{cHi}}\) is:

$$\begin{aligned} {\varepsilon _{cHi}}&= {\tilde{W}}_{ci}^T\nabla {\delta _{ci}}\left( {{\vartheta _i}} \right) \left( {\varGamma _{fi}} + {D_i}{{{\hat{u}}}_i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \right. \nonumber \\&\left. \quad + {\varTheta _i} + {\upsilon _i} \right) + {\varepsilon _{Hi}} + {\varepsilon _{ZiM}}, \end{aligned}$$
(42)

where \({\varepsilon _{cHi}}\) is bounded as \(||{\varepsilon _{cHi}} ||\le {\varepsilon _{cHiM}}\) with \({\varepsilon _{cHiM}}\) a positive constant, and \({\varepsilon _{ZiM}}\) is the upper bound of \({\varepsilon _{Zi}}\) as:

$$\begin{aligned} {\varepsilon _{Zi}} = { \left( \varepsilon _{ui}\right) ^T}{R_i}\left( \varepsilon _{ui} \right) + {\left( {\nabla {{{\hat{\varXi }}}_i^T}\left( {{\vartheta _i}} \right) } \right) }{D_i}\left( \varepsilon _{ui} \right) , \end{aligned}$$
(43)

with \(\varepsilon _{ui}={{{{\hat{u}}}_i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) - {u_i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) }= -\frac{1}{2}R_i^{ - 1}D_i^T\left( {{\nabla {\delta ^T _{ci}}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \tilde{W}_{ci}}+{\nabla \varepsilon ^T _{ci}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \right) \).

Fig. 2 Structural diagram of the proposed optimal control method

To adjust the critic NN weight vector \({ {{\hat{W}}}}_{ci}\), we minimize the objective function \({E_{ci}} = \frac{1}{2}\varepsilon _{cHi}^T{\varepsilon _{cHi}}\) by the gradient descent algorithm, which yields the updating law:

$$\begin{aligned} {{\dot{{\hat{W}}}}_{ci}}&= - {\alpha _{ci}}\frac{{\partial {E_{ci}}}}{{\partial {{{\hat{W}}}_{ci}}}} = - {\alpha _{ci}}{\varepsilon _{cHi}}\frac{{\partial {\varepsilon _{cHi}}}}{{\partial {{{\hat{W}}}_{ci}}}} \nonumber \\&= - {\alpha _{ci}}{\sigma _{ci}}\left( {\vartheta _i^T{Q_i}{\vartheta _i} + {{\left( {{{{\hat{u}}}_i^T}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right) }}{R_i}{{{\hat{u}}}_i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) + \sigma _{ci}^T{{{\hat{W}}}_{ci}}} \right) , \end{aligned}$$
(44)

where \({\sigma _{ci}}{=} \nabla {\delta _{ci}}\left( {{\vartheta _i}} \right) \left( {{\varGamma _f}\left( {{x_i}} \right) {+} {D_i}{{{\hat{u}}}_i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) {+} {\varTheta _i}\left( x \right) {+} {\upsilon _i}} \right) \). Therefore, the weight approximation error can be updated by:

$$\begin{aligned} {\dot{{\tilde{W}}}_{ci}} = - {\dot{{\hat{W}}}_{ci}} = - {\alpha _{ci}}{\sigma _{ci}}\left( {\sigma _{ci}^T{{{\tilde{W}}}_{ci}} - {\varepsilon _{cHi}}} \right) . \end{aligned}$$
(45)
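The learning law (44) can be illustrated by a single explicit-Euler step, as sketched below. The Euler discretization, the step size `dt`, the function name, and the numerical values are assumptions for illustration only; the paper states the law in continuous time.

```python
import numpy as np


def critic_step(W_hat, sigma, utility, alpha, dt):
    """One explicit-Euler step of the gradient-descent law (44).

    utility : vartheta^T Q vartheta + u_hat^T R u_hat (a scalar)
    sigma   : regressor sigma_ci defined after (44), shape (K,)
    Returns the updated weights and the residual before the update.
    """
    residual = utility + sigma @ W_hat            # approximate Hamiltonian residual, see (41)
    W_next = W_hat - dt * alpha * sigma * residual
    return W_next, residual


# Hypothetical data: two weights, fixed regressor and utility.
W0 = np.zeros(2)
sigma = np.array([1.0, -2.0])
W1, e0 = critic_step(W0, sigma, utility=0.5, alpha=2.0, dt=0.05)
_, e1 = critic_step(W1, sigma, utility=0.5, alpha=2.0, dt=0.05)
```

With these values the residual drops from 0.5 to 0.25 after one step, which is the expected effect of descending the gradient of \({E_{ci}}\) when the regressor and utility are held fixed.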

Remark 6

The critic NN is constructed to approximate the decentralized optimal compensation control based on the powerful learning ability of NNs. Note that the critic NN weight learning law (44) is designed using the local joint modular state without relying on the event-triggered conditions.

Thus, the event-triggered approximate decentralized tracking control policy \({{{\hat{u}}}_i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \), which is applied to the MRM as the control torque, is given by (40). The structural diagram of the proposed compensator-critic structure-based event-triggered decentralized tracking control strategy for MRM systems is illustrated in Fig. 2.

Theorem 2

Consider the n-DOF MRM whose subsystem dynamics are described by (5). The weight estimation error \( {\tilde{W}}_{ci}\) of the critic NN is guaranteed to be UUB under the weight updating law (44).

Proof

Select the Lyapunov function candidate as:

$$\begin{aligned} {{ V}_{ci}} =&{\frac{1}{2}{\tilde{W}}_{ci}^T{{{\tilde{W}}}_{ci}}}. \end{aligned}$$
(46)

Suppose that \({{{\sigma _{ci}}\sigma _{ci}^T} \le \lambda _{\max }}\left( {{\sigma _{ci}}\sigma _{ci}^T} \right) \buildrel \varDelta \over = {\sigma _{ciM}}\) with a positive constant \({\sigma _{ciM}}\), where \({\lambda _{\max }}\left( \cdot \right) \) denotes the maximal eigenvalue of a matrix. Then, according to the critic NN weight updating law (44) and Young’s inequality, the time derivative of (46) is calculated as:

$$\begin{aligned} {{\dot{V}}_{ci}}&= {{\tilde{W}}_{ci}^T}{\dot{{\tilde{W}}}_{ci}} \nonumber \\&= - {\alpha _{ci}}{\tilde{W}}_{ci}^T{\sigma _{ci}}\sigma _{ci}^T{{{\tilde{W}}}_{ci}} + {\alpha _{ci}}{\tilde{W}}_{ci}^T{\sigma _{ci}}{\varepsilon _{cHi}} \nonumber \\&\le - \left( {{\alpha _{ci}} - \frac{1}{2}} \right) {\sigma _{ciM}}{\left\| {{{{\tilde{W}}}_{ci}}} \right\| ^2} + \frac{{\alpha _{ci}^2}}{2}\varepsilon _{cHiM}^2. \end{aligned}$$
(47)

Thus, with \({\alpha _{ci}} > \frac{1}{2}\), the weight approximation error \( {\tilde{W}}_{ci}\) is UUB, since \({{\dot{V}}_{ci}} < 0\) whenever \( {\tilde{W}}_{ci}\) lies outside the compact set:

$$\begin{aligned} {\varOmega _{Wi}} = \left\{ {{\tilde{W}}_{ci}}:\left\| {{\tilde{W}_{ci}}} \right\| \le \sqrt{\frac{{\alpha _{ci}^2}\varepsilon _{cHiM}^2}{\left( {{2\alpha _{ci}} - 1} \right) {\sigma _{ciM}}}}\right\} . \end{aligned}$$

\(\square \)
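As a small worked example, the radius of \({\varOmega _{Wi}}\) can be evaluated numerically. The function name and the numerical values below are hypothetical.

```python
import math


def weight_error_radius(alpha, eps_M, sigma_M):
    """Radius of Omega_Wi: sqrt(alpha^2 * eps_M^2 / ((2*alpha - 1) * sigma_M))."""
    if alpha <= 0.5:
        raise ValueError("the bound requires alpha > 1/2")
    if sigma_M <= 0:
        raise ValueError("sigma_M must be positive")
    return math.sqrt(alpha ** 2 * eps_M ** 2 / ((2.0 * alpha - 1.0) * sigma_M))


# Hypothetical values: learning rate 1, residual bound 0.2, regressor bound 4.
r = weight_error_radius(alpha=1.0, eps_M=0.2, sigma_M=4.0)
```

Here the radius is \(\sqrt{0.04/4} = 0.1\). As a function of the learning rate alone, the factor \(\alpha _{ci}^2/(2\alpha _{ci} - 1)\) is smallest at \({\alpha _{ci}} = 1\), so in this expression both very small and very large learning rates enlarge the set.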

Remark 7

Unlike existing works that present time-triggered tracking controllers [37, 38], this paper introduces the event-triggered mechanism to develop the compensator-critic structure-based decentralized tracking control strategy with the ADP approach, which accounts for optimal performance while reducing the computational burden and saving communication and energy resources.

Stability analysis of the closed-loop MRM system

In this part, the stability analysis of the closed-loop MRM system under the developed compensator-critic structure-based event-triggered decentralized tracking control is provided using the Lyapunov stability theorem.

Theorem 3

Consider the n-DOF MRM whose subsystem dynamics are described by (5), together with Assumptions 1 and 2. The closed-loop MRM system is UUB under the approximate compensator-critic structure-based event-triggered decentralized tracking control law (40) if the following condition is satisfied:

$$\begin{aligned}&{\left\| {{E_{li}}} \right\| ^2} \le \, \frac{\big ({\left( {\lambda _{\min }}\left( {{Q_i}} \right) -1\right) {\left\| {{\vartheta _i}} \right\| }-{\upsilon _{Mi}}}\big )\left\| {\vartheta _i}\right\| +\left( {{\lambda _{\min }}\left( {{R_i}} \right) }-\frac{1}{2}{D_i}\right) {\left\| {{{{\hat{u}}}_{2i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2}}{\frac{1}{2}{D_i}{m_{l1i}}+2{{\lambda _{\min }}\left( {{R_i}} \right) }{m_{l2i}^2}} \nonumber \\&+\, \frac{{{\lambda _{\min }}\left( {{R_i}} \right) }{{\left\| {{{u}_{1i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| }^2}+{{\lambda _{\min }}\left( {{R_i^{ - 1}}} \right) }{{D_i^2} }\left( { \delta _{cid}^2 W_{cid}^2 + \varepsilon _{cid}^2} \right) }{\frac{1}{2}{D_i}{m_{l1i}}+2 {{\lambda _{\min }}\left( {{R_i}} \right) }{m_{l2i}^2}}\nonumber \\&\buildrel \varDelta \over =\, {\left\| {{E_{Li}}} \right\| ^2}. \end{aligned}$$
(48)

Proof

Select the Lyapunov function candidate for the MRM subsystem as:

$$\begin{aligned} {V_i}\left( t \right) = \underbrace{\frac{1}{2}{\vartheta _i^T}{\vartheta _i}+\varXi _i^*\left( {{\vartheta _i}} \right) }_{{V_{1i}}} + \underbrace{\varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) }_{{V_{2i}}} . \end{aligned}$$
(49)

(1) The events are not triggered, i.e., \(t \in \left[ {{t_l},{t_{l + 1}}} \right) \). Calculating the time derivative of (49) \({\dot{V}_{i}}={\dot{V}_{1i}}+{\dot{V}_{2i}}\), the first term is:

$$\begin{aligned} {\dot{V}_{1i}}&= {\vartheta _i^T}\left( {{\varGamma _{fi}} +{D_i}{{{\hat{u}}}_i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) + {\varTheta _i} + {\upsilon _i}}\right) \nonumber \\&\quad +{\left( {\nabla \varXi _i^{*T}\left( {{\vartheta _i}} \right) } \right) }\left( {{\varGamma _{fi}} + {D_i}{{{\hat{u}}}_i}\left( {{{{\hat{\vartheta }} }_{li}}} \right) + {\varTheta _i} + {\upsilon _i}} \right) . \end{aligned}$$
(50)

In light of the time-triggered HJBE (14) and optimal control law (32), we obtain:

$$\begin{aligned}&{\left( {\nabla \varXi _i^{*T}\left( {{\vartheta _i}} \right) } \right) } \left( {{\varGamma _f}\left( {{x_i}} \right) + {\varTheta _i}\left( x \right) + {\upsilon _i}} \right) \nonumber \\&\quad = - \vartheta _i^T{Q_i}{\vartheta _i} - u_{1i}^T\left( {{{ \vartheta }_{i}}} \right) {R_i}{u_{1i}}\left( {{{ \vartheta }_{i}}} \right) \nonumber \\&\qquad -{\left( {\nabla \varXi _i^{*T}\left( {{\vartheta _i}} \right) } \right) }\left( {D_i}{u_{i}^*}\left( {{{ \vartheta }_{i}}} \right) \right) \nonumber \\&\qquad + \frac{1}{4}{\left( {\nabla \varXi _i^{*T}\left( {{\vartheta _i}} \right) } \right) }{D_i}R_i^{ - 1}D_i^T\nabla \varXi _i^*\left( {{\vartheta _i}} \right) , \end{aligned}$$
(51)

and

$$\begin{aligned} {\left( {\nabla \varXi _i^{*T}\left( {{\vartheta _i}} \right) } \right) }{D_i} = - 2u_{2i}^{*T}\left( {{\vartheta _i}} \right) {R_i}. \end{aligned}$$
(52)

Then, substituting (51) and (52) into \({\dot{V}_{1i}}\), we have

$$\begin{aligned} {{\dot{V}}_{1i}}&= {\vartheta _i^T}\left( {\varGamma _{fi}} + {D_i}{u_{1i}}\left( {{{ \vartheta }_{i}}} \right) + {\varTheta _i}\right. \nonumber \\&\left. \quad + {\upsilon _i}+{D_i}{{{\hat{u}}}_{i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) - {D_i}{u_{1i}}\left( {{{ \vartheta }_{i}}} \right) \right) \nonumber \\&\quad - \vartheta _i^T{Q_i}{\vartheta _i} - u_{1i}^T\left( {{{ \vartheta }_{i}}} \right) {R_i}{u_{1i}}\left( {{{ \vartheta }_{i}}} \right) \nonumber \\&\quad + 2{{u_{2i}^{*T}\left( {{\vartheta _i}} \right) } }{R_i}{u_{1i}}\left( {{{ \vartheta }_{i}}} \right) + {{u_{2i}^{*T}\left( {{\vartheta _i}} \right) } }{R_i}{u_{2i}^*}\left( {{{ \vartheta }_{i}}} \right) \nonumber \\&\quad - 2{{u_{2i}^{*T}\left( {{\vartheta _i}} \right) } }{R_i}{{ u}_{1i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) - 2{{u_{2i}^{*T}\left( {{\vartheta _i}} \right) } }{R_i}{{{\hat{u}}}_{2i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) . \end{aligned}$$
(53)

According to Theorem 1 and Assumption 2, through Young’s inequality, (53) becomes:

$$\begin{aligned} {{\dot{V}}_{1i}}&\le {\vartheta _i^T}\left( {\upsilon _i} +{D_i}{m_{l1i}}\left\| {E_{li}}\right\| +{D_i}{{{\hat{u}}}_{2i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \right) \nonumber \\&\quad - \vartheta _i^T{Q_i}{\vartheta _i} - u_{1i}^T\left( {{{{\hat{\vartheta }} }_{li}}} \right) {R_i}{u_{1i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) \nonumber \\&\quad + {R_i}{\left\| {u_{2i}^*\left( {{\vartheta _i}} \right) - {{{\hat{u}}}_{2i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2}- {R_i}{\left\| {{{{\hat{u}}}_{2i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2} \nonumber \\&\le -\vartheta _i^T\left( {Q_i}-1\right) {\vartheta _i}+{\vartheta _i^T} {\upsilon _i}\nonumber \\&\quad +\frac{1}{2}{D_i}{m_{l1i}}\left\| {E_{li}}\right\| ^2- {R_i}{\left\| {{{u}_{1i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2} \nonumber \\&\quad -\left( {R_i}-\frac{1}{2}{D_i}\right) {\left\| {{{{\hat{u}}}_{2i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2}\nonumber \\&\quad + {R_i}{\left\| {u_{2i}^*\left( {{\vartheta _i}} \right) - {{{\hat{u}}}_{2i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2} \nonumber \\&\le -\left( {Q_i}-1\right) \left\| {\vartheta _i}\right\| ^2+{\upsilon _i}\left\| {\vartheta _i}\right\| \nonumber \\&\quad +\frac{1}{2}{D_i}{m_{l1i}}\left\| {E_{li}}\right\| ^2- {R_i}{\left\| {{{u}_{1i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2} \nonumber \\&\quad + {R_i}\left\| {m_{l2i}}\left\| {E_{li}}\right\| \right. \nonumber \\&\left. 
\quad - \frac{1}{2}R_i^{ - 1}D_i^T\left( {\nabla \delta _{ci}^T\left( {{\vartheta _{li}}} \right) {{{\tilde{W}}}_{ci}} + \nabla {\varepsilon _{ci}}\left( {{\vartheta _{li}}} \right) } \right) \right\| ^2\nonumber \\&\quad -\left( {R_i}-\frac{1}{2}{D_i}\right) {\left\| {{{{\hat{u}}}_{2i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2} \nonumber \\&\le -\left( {Q_i}-1\right) \left\| {\vartheta _i}\right\| ^2+{\upsilon _i}\left\| {\vartheta _i}\right\| \nonumber \\&\quad +\left( \frac{1}{2}{D_i}{m_{l1i}}+2 {R_i}{m_{l2i}^2}\right) \left\| {E_{li}}\right\| ^2 \nonumber \\&\quad - {R_i}{\left\| {{{u}_{1i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2}-\left( {R_i}-\frac{1}{2}{D_i}\right) {\left\| {{{{\hat{u}}}_{2i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2} \nonumber \\&\quad +R_i^{ - 1}{{D_i^2} }\left( { \delta _{cid}^2{\tilde{W}}_{ci}^2 + \varepsilon _{cid}^2} \right) . \end{aligned}$$
(54)

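For the second term \(\dot{V}_{2i}\), the sampled state \({{\hat{\vartheta }} _{li}}\) is held constant on \(\left[ {{t_l},{t_{l + 1}}} \right) \), so \({\dot{V}_{2i}} = {\left( {\nabla \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right) ^T}{\dot{{\hat{\vartheta }}} _{li}} = 0\).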

Applying (49) and (54), we obtain the time derivative of the Lyapunov function candidate \( \dot{V} \left( t \right) \) of the MRM system as:

$$\begin{aligned} \dot{V}\left( t \right)&= \sum \limits _{i = 1}^n {{{\dot{V}}_i}\left( t \right) } = \sum \limits _{i = 1}^n \left( {{{\dot{V}}_{1i}}\left( t \right) + {{\dot{V}}_{2i}}\left( t \right) }\right) \nonumber \\&\le \sum \limits _{i = 1}^n \bigg ({}\bigg .-\left( \left( {\lambda _{\min }}\left( {{Q_i}} \right) -1\right) {\left\| {{\vartheta _i}} \right\| }-{\upsilon _{Mi}}\right) \left\| {\vartheta _i}\right\| \nonumber \\&\quad - {{\lambda _{\min }}\left( {{R_i}} \right) }{\left\| {{{u}_{1i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2} \nonumber \\&\quad -\left( {{\lambda _{\min }}\left( {{R_i}} \right) }-\frac{1}{2}{D_i}\right) {\left\| {{{{\hat{u}}}_{2i}}\left( {{{{\hat{\vartheta }} }_{li}}} \right) } \right\| ^2} \nonumber \\&\quad +\left( \frac{1}{2}{D_i}{m_{l1i}}+2 {{\lambda _{\min }}\left( {{R_i}} \right) }{m_{l2i}^2}\right) \left\| {E_{li}}\right\| ^2 \nonumber \\&\quad +{{\lambda _{\min }}\left( {{R_i}} \right) }^{ - 1}{\left( {D_i^2} \right) }\left( { \delta _{cid}^2 W_{cid}^2 + \varepsilon _{cid}^2} \right) \bigg .{}\bigg ). \end{aligned}$$
(55)

Hence, (55) implies that \(\dot{V}(t)\le 0\) if the triggering condition (48) is satisfied when \(\vartheta _i\) lies outside the compact set \({\varOmega _{ui}} = \left\{ {\vartheta _i}:\left\| {{\vartheta _i}} \right\| \le \frac{{\upsilon _{Mi}}}{ {\lambda _{\min }}\left( {{Q_i}} \right) -1}\right\} \).

(2) When the events are triggered, i.e., \(\forall t = {t_{l + 1}}\), the difference of (49) is presented as:

$$\begin{aligned} {E_{Vi}}\left( t \right)&= {V_i}\left( {{{{\hat{\vartheta }} }_{li}}\left( {{t_{l + 1}}} \right) } \right) - {V_i}\left( {{{{\hat{\vartheta }} }_{li}}\left( {t_{l + 1}^ - } \right) } \right) \nonumber \\&= {{\vartheta _i^T}\left( {{t_{l + 1}}} \right) } {{\vartheta _i}\left( {{t_{l + 1}}} \right) }\nonumber \\&\quad -{{\vartheta _i^T}\left( {t_{l + 1}^ - } \right) }{{\vartheta _i}\left( {t_{l + 1}^ - } \right) }+\varXi _i^*\left( {{\vartheta _i}\left( {{t_{l + 1}}} \right) } \right) \nonumber \\&\quad - \varXi _i^*\left( {{\vartheta _i}\left( {t_{l + 1}^ - } \right) } \right) {+} \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}\left( {{t_{l + 1}}} \right) } \right) {-} \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}\left( {{t_l}} \right) } \right) , \end{aligned}$$
(56)

where \({{\hat{\vartheta }} _{li}}\left( {t_{l + 1}^ - } \right) = \mathop {\lim }\nolimits _{{\rho _i} \rightarrow 0} \left( {{{{\hat{\vartheta }} }_{li}}\left( {{t_{l + 1}} - {\rho _i}} \right) } \right) \) and \({\rho _i}\) is a small positive constant. From (49), (55), and (56), we obtain that \({\dot{V}_i}\left( t \right) \le 0\) when the events are not triggered, i.e., \(t \in \left[ {{t_l},{t_{l + 1}}} \right) \). Then, we have

$$\begin{aligned} {E_{1{\vartheta _i}}}\left( t \right)&= {{\vartheta _i^T}\left( {{t_{l + 1}}} \right) } {{\vartheta _i}\left( {{t_{l + 1}}} \right) }-{{\vartheta _i^T}\left( {t_{l + 1}^ - } \right) }{{\vartheta _i}\left( {t_{l + 1}^ - } \right) } \le 0, \end{aligned}$$
(57)
$$\begin{aligned} {E_{2{\vartheta _i}}}\left( t \right)&= \varXi _i^*\left( {{\vartheta _i}\left( {{t_{l + 1}}} \right) } \right) - \varXi _i^*\left( {{\vartheta _i}\left( {t_{l + 1}^ - } \right) } \right) \le 0, \end{aligned}$$
(58)
$$\begin{aligned} {E_{{{{\hat{\vartheta }} }_{li}}}}\left( t \right)&= \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}\left( {{t_{l + 1}}} \right) } \right) - \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}\left( {{t_l}} \right) } \right) \nonumber \\&\le - {O_i}\left( {\left\| {{E_{\left( {l + 1} \right) i}}\left( {{t_l}} \right) } \right\| } \right) , \end{aligned}$$
(59)

where \({O_i}\left( \cdot \right) \) is a class-\(\mathcal {K}\) function [52], and \({E_{\left( {l + 1} \right) i}}\left( {{t_l}} \right) = {{\hat{\vartheta }} _{\left( {l + 1} \right) i}} - {{\hat{\vartheta }} _{li}}\). Then, (56) becomes

$$\begin{aligned} {E_{Vi}}\left( t \right)\le & {} \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}\left( {{t_{l + 1}}} \right) } \right) - \varXi _i^*\left( {{{{\hat{\vartheta }} }_{li}}\left( {{t_l}} \right) } \right) \nonumber \\\le & {} - {O_i}\left( {\left\| {{E_{\left( {l + 1} \right) i}}\left( {{t_l}} \right) } \right\| } \right) . \end{aligned}$$
(60)

Therefore, if event-triggered condition (48) holds, the closed-loop MRM system is UUB. This completes the proof.

Exclusion of Zeno behaviors

Since the MRM system is a continuous-time system, the minimum trigger interval \({t_{\min }} = \min \left\{ { {{t_{l + 1}} - {t_l}} } \right\} \) could in principle be zero, which is the so-called Zeno behavior. Thus, it is necessary to prove that \({t_{\min }}\) has a positive lower bound.

Theorem 4

Consider the dynamics of the MRM subsystem (5), the triggering condition (48), and the compensator-critic structure-based event-triggered decentralized tracking control strategy (40). Then, the minimum trigger interval \({t_{\min }}\) has a positive lower bound given by:

$$\begin{aligned} {t_{\min }} \ge \frac{1}{{{S_{iz}}}}\ln \left( {1 + {\varPi _{l,\min }}} \right) > 0, \end{aligned}$$
(61)

where \({\varPi _{l,\min }} = \min \left( {\left\| {{E_{li}}\left( {{{{\hat{\vartheta }} }_{li}}\left( {t_{l + 1}^ - } \right) } \right) } \right\| /\left( {\left\| {{{{\hat{\vartheta }} }_{li}}} \right\| + {\varpi _i}} \right) } \right) > 0\) and \({\varpi _i}\) is a positive constant.

Proof

The time derivative of the event-triggered error (30) is:

$$\begin{aligned} \frac{{\mathrm{{d}}\left( {{E_{li}}} \right) }}{{\mathrm{{d}}t}} = {{\dot{E}}_{li}} = {\dot{{\hat{\vartheta }}} _{li}}\left( {{{{\hat{x}}}_l}} \right) - {{\dot{\vartheta }}_i}\left( {{x_i}} \right) = - {{\dot{\vartheta }} _i}\left( {{x_i}} \right) . \end{aligned}$$
(62)

According to Assumptions 1 and 2, the upper bound of \({{\dot{\vartheta }} _i}\left( {{x_i}} \right) \) is derived as:

$$\begin{aligned} \left\| {{{{\dot{\vartheta }} }_i}\left( {{x_i}} \right) } \right\| = \left\| {{\dot{x}_{2i}} + {a_{ei}}{\dot{x}_{1i}}} \right\| \le {S_{iz}}\left\| {{x_i}} \right\| + {S_{iz}}{\varpi _i}. \end{aligned}$$
(63)

Combining (30), (62), and (63), we obtain:

$$\begin{aligned} \left\| {{E_{li}}} \right\|&\le \int _{{t_l}}^t {{e^{\left( {{S_{iz}}\left( {t - \mu } \right) } \right) }}} {S_{iz}}\left( {\left\| {{{{\hat{\vartheta }} }_{li}}} \right\| + {\varpi _i}} \right) \mathrm{{d}}\mu \nonumber \\&\le \left( {\left\| {{{{\hat{\vartheta }} }_{li}}} \right\| + {\varpi _i}} \right) \left( {{e^{\left( {{S_{iz}}\left( {t - {t_l}} \right) } \right) }} - 1} \right) . \end{aligned}$$
(64)
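
As a short worked step, the integral in (64) can be evaluated in closed form. This assumes, as (62) suggests, that the held state \({{\hat{\vartheta }} _{li}}\) is constant between two triggering instants, so that \(\left\| {{{\hat{\vartheta }} _{li}}} \right\| + {\varpi _i}\) can be taken outside the integral:

$$\begin{aligned} \int _{{t_l}}^t {e^{{S_{iz}}\left( {t - \mu } \right) }}{S_{iz}}\,\mathrm{{d}}\mu = \left[ { - {e^{{S_{iz}}\left( {t - \mu } \right) }}} \right] _{\mu = {t_l}}^{\mu = t} = {e^{{S_{iz}}\left( {t - {t_l}} \right) }} - 1. \end{aligned}$$

Under that assumption, the second relation in (64) holds with equality, and the bound vanishes at \(t = {t_l}\), as expected immediately after an event.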

When \(t = {t_{l + 1}}\), the event-triggered condition satisfies:

$$\begin{aligned} \left\| {{E_{li}}\left( {{{{\hat{\vartheta }} }_{li}}\left( {t_{l + 1}^ - } \right) } \right) } \right\| = \left\| {{E_{Li}}\left( {t_{l + 1}^ - } \right) } \right\| . \end{aligned}$$
(65)

According to (64) and (65), the \(l\)th triggering interval \(\varDelta {t_l}\) is bounded from below by:

$$\begin{aligned} \varDelta {t_l} = {t_{l + 1}} - {t_l} \ge \frac{1}{{{S_{iz}}}}\ln \left( {1 + \frac{{\left\| {{E_{li}}\left( {{{{\hat{\vartheta }} }_{li}}\left( {t_{l + 1}^ - } \right) } \right) } \right\| }}{{\left\| {{{{\hat{\vartheta }} }_{li}}} \right\| + {\varpi _i}}}} \right) . \end{aligned}$$
(66)

It can be seen from (66) that, for \(t \in \left[ {{t_l},{t_{l + 1}}} \right) \), the ratio \(\left\| {{E_{li}}} \right\| /\left( {\left\| {{{{\hat{\vartheta }} }_{li}}} \right\| + {\varpi _i}} \right) \) increases from zero to the positive value \({\varPi _{l,\min }} = \min \left( {\left\| {{E_{li}}\left( {{{{\hat{\vartheta }} }_{li}}\left( {t_{l + 1}^ - } \right) } \right) } \right\| /\left( {\left\| {{{{\hat{\vartheta }} }_{li}}} \right\| + {\varpi _i}} \right) } \right) \). Therefore, the minimum triggering interval \({t_{\min }}\) satisfies the condition (61), so that \({t_{\min }}\) has a positive lower bound for arbitrary state \({\vartheta _i}\left( {{x_i}} \right) \). \(\square \)
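
To make the bound concrete, the following minimal sketch evaluates the right-hand sides of (64) and (66) numerically. The function names and all numerical values (`S_IZ`, `VARPI`, `HELD_NORM`, `ERR_AT_TRIGGER`) are hypothetical and chosen only for illustration; they are not taken from Table 1 or from the experiments.

```python
import math


def error_bound(dt, held_norm, s_iz, varpi):
    """Right-hand side of (64): bound on ||E_li|| a time dt after the last event."""
    return (held_norm + varpi) * math.expm1(s_iz * dt)


def interval_lower_bound(err_norm, held_norm, s_iz, varpi):
    """Right-hand side of (66): lower bound on the inter-event time."""
    if s_iz <= 0.0 or varpi <= 0.0:
        raise ValueError("s_iz and varpi must be positive")
    return math.log1p(err_norm / (held_norm + varpi)) / s_iz


# Hypothetical values (assumptions, not parameters from the paper).
S_IZ, VARPI = 4.0, 0.1
HELD_NORM, ERR_AT_TRIGGER = 0.5, 0.02

dt_min = interval_lower_bound(ERR_AT_TRIGGER, HELD_NORM, S_IZ, VARPI)
print(f"inter-event time is at least {dt_min:.6f}")
```

Because (66) is obtained by inverting (64) at the triggering instant, feeding `dt_min` back into `error_bound` recovers the error level at which the event fires. The bound is strictly positive whenever that error level is positive, which is the content of Theorem 4.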

Fig. 3 Experimental platform of 2-DOF MRM

Experimental results

Establishment of experimental platform

A 2-DOF MRM experimental platform has been established, which is composed of two sets of joint modules and connecting rods, as shown in Fig. 3. Each joint module contains a motor, an incremental encoder, a speed reducer, an absolute encoder, and a torque sensor. A brushed DC motor from Maxon Inc. provides the power to drive the MRM, and each joint motor is driven by a linear power amplifier (LPA). The incremental encoder and the absolute encoder measure the displacement of the motor and the position of the link module, respectively. The speed reducer, with a gear ratio of 100:1, increases the motor output torque by reducing the motor speed. The joint torque sensor is mounted between the link module and the joint module to measure the joint torque. Experimental data acquisition and processing rely on the QPIDe data acquisition device and the Matlab/Simulink software installed on the host computer, respectively. The designed control system is built in Simulink, and the packaged QUARC module establishes the communication between the host computer and the QPIDe device to realize real-time control of the 2-DOF MRM.

Table 1 Parameter setting

In this paper, tracking experiments on a 2-DOF MRM are conducted to verify the effectiveness of the proposed compensator-critic structure-based event-triggered decentralized tracking control strategy. We select the desired trajectories for each joint as \( {q_{1d}} = \frac{{\pi }}{{4}}\sin \left( {\frac{\pi }{{45}}} \right) t+0.05\), \({q_{2d}} = \frac{{\pi }}{{2}}\sin \left( {\frac{\pi }{{45}}} \right) t+0.1\). For the critic NN, we choose the radial basis function neural network (RBFNN) to approximate the optimal performance index function. A 1-5-1 NN structure is selected, with 1 input neuron, 5 hidden neurons, and 1 output neuron for each joint. The NN weights are defined as \({{\hat{W}}_{ci}} = {\left[ {{{{\hat{W}}}_{1ci}},{{{\hat{W}}}_{2ci}},{{{\hat{W}}}_{3ci}},{{{\hat{W}}}_{4ci}},{{{\hat{W}}}_{5ci}}} \right] ^T}\) with the initial value \({{\hat{W}}_{0ci}} = {\left[ {0.3,0.1,0.3,0.1,0.3} \right] ^T}\). The activation function is chosen as the radial basis function \({\delta _{jci}}\left( {{\vartheta _i}} \right) = \exp \left( { - \frac{{{{\left\| {{\vartheta _i} - {c_i}} \right\| }^2}}}{{2b_j^2}}} \right) \), with \(b_{j}=1.5\), \(j=1,2,3,4,5\). The centers are \(c_{1}= [-1, -0.5,0,0.5,1]^T\) and \(c_{2}= [-2, -1,0,1,2]^T\). Other model parameters, control parameters, and upper bound parameters are listed in Table 1. Note that the real-time state of MRMs can only be obtained by sensor sampling; therefore, we take the sampled state from the time-triggered mechanism as the system state in the event-triggered control.
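
The critic settings above can be sketched in a few lines of Python. The widths, centers, and initial weights are the values quoted in the text; the function names are hypothetical, and the linear-in-weights output \({\hat{W}}_{ci}^T\delta _{ci}\left( {\vartheta _i} \right) \) is assumed here as the usual RBFNN form rather than taken from the authors' implementation. Each entry of \(c_i\) is treated as the center of one hidden neuron.

```python
import math

B_WIDTH = 1.5  # b_j, identical for all five hidden neurons
CENTERS = {
    1: [-1.0, -0.5, 0.0, 0.5, 1.0],  # c_1, joint 1
    2: [-2.0, -1.0, 0.0, 1.0, 2.0],  # c_2, joint 2
}
W0 = [0.3, 0.1, 0.3, 0.1, 0.3]  # initial critic weights for each joint


def rbf_activations(theta, centers, width=B_WIDTH):
    """Gaussian activations exp(-(theta - c_j)^2 / (2 b^2)) for a scalar input."""
    return [math.exp(-((theta - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def critic_output(theta, weights, centers):
    """Assumed critic form: weighted sum of the hidden-layer activations."""
    acts = rbf_activations(theta, centers)
    return sum(w * a for w, a in zip(weights, acts))


for joint in (1, 2):
    value = critic_output(0.0, W0, CENTERS[joint])
    print(f"joint {joint}: initial critic output at theta = 0 is {value:.4f}")
```

With a common width of 1.5 and centers spaced 0.5 (joint 1) or 1 (joint 2) apart, neighboring neurons overlap strongly, so the approximated index varies smoothly over the covered range of \({\vartheta _i}\).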

Experimental results and analysis

Experimental results under the proposed control method are shown in Figs. 4, 5, 6, 7 and 8, and are compared with those of the ADP-based time-triggered decentralized tracking control method [37].

Fig. 4 Position tracking curves under the proposed event-triggered tracking control method

Fig. 5 Tracking error curves under the proposed event-triggered tracking control method

Fig. 6 Optimal control torque curves under the time-triggered control method and the proposed event-triggered control method

Fig. 7 Triggering error and threshold curves under the proposed control method

Figure 4 shows the position tracking curves of each joint under the proposed control method. The red and blue dashed lines represent the desired trajectory and the actual trajectory, respectively. From this figure, one observes that the actual trajectories converge to the desired ones in a very short time. Figure 5 shows that the position tracking errors of each joint remain within an acceptable range (within \( \pm 5\times 10^{- 3}\) rad) under the proposed event-triggered tracking control approach, which illustrates the effectiveness of the presented control scheme.

Figure 6 presents the joint control torque curves of MRMs under the conventional and the proposed control methods. The red and blue lines show the torques under the time-triggered control method [37] and the proposed event-triggered control method, respectively. The proposed joint control torques are updated only when the event-triggered condition is satisfied, and thus have a lower updating frequency. Figure 7 illustrates the triggering error \(\left\| E_{li}\right\| \) and the triggering threshold \(\left\| E_{Li} \right\| \). In Figs. 6 and 7, we can see that the control torque curve within the time interval [20, 30] is piecewise constant because of the zero-order hold.
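
The update-on-event behavior with a zero-order hold can be illustrated by the following minimal sketch. It is not the authors' implementation: the triggering condition (48) is replaced by a hypothetical constant threshold, the function `event_triggered_hold` is an illustrative name, and the sampled sine wave is only a test signal.

```python
import math


def event_triggered_hold(samples, threshold):
    """Zero-order hold that refreshes its value only when an event fires.

    samples:   sequence of time-triggered (periodically sampled) scalar states
    threshold: callable giving the triggering threshold for the current state
               (a stand-in for the paper's condition (48))
    Returns the held sequence and the number of updates.
    """
    held = samples[0]  # the first sample is always transmitted
    held_values = [held]
    updates = 1
    for x in samples[1:]:
        # Triggering error: held state minus current sampled state.
        if abs(held - x) > threshold(x):
            held = x  # event fires: refresh the hold, the error resets to zero
            updates += 1
        held_values.append(held)
    return held_values, updates


# Hypothetical test signal and threshold (assumptions for illustration only).
DT = 0.01
samples = [math.sin(math.pi / 45.0 * k * DT) for k in range(1000)]
held, n_event = event_triggered_hold(samples, lambda x: 0.01)
n_time = len(samples)
print(f"time-triggered updates: {n_time}, event-triggered updates: {n_event}")
```

Between events the held value is constant, which produces the staircase shape of the torque curve, and the gap between the held and the current state never exceeds the threshold. The ratio `n_time / n_event` plays the role of the update-count comparison reported for the experiments, although the numbers produced by this toy signal are not comparable to the experimental ones.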

The cumulative numbers of sampled states used in the time-triggered control method [37] and the proposed event-triggered control method are shown in Fig. 8. The number of control updates under the time-triggered control method is nearly five times that under the event-triggered one.

The experimental results show that the developed compensator-critic structure-based event-triggered decentralized tracking control is effective for MRMs. It not only maintains satisfactory control accuracy, but also effectively reduces the computational burden and saves communication and energy consumption.

Fig. 8 Control torque updating times of each joint under the time-triggered control method and the proposed event-triggered control method

Conclusion

This paper addresses the event-triggered decentralized tracking control problem for MRMs with a compensator-critic structure-based ADP algorithm. With the help of the JTF technique, the subsystem dynamic model of the MRM is established. The model-based robust compensator is utilized to suppress the influence of dynamic uncertainties. The performance index function is constructed to reflect the position error, the velocity error, and the control torque. Thus, the event-triggered decentralized tracking control is obtained, consisting of the model-based robust controller and the ADP-based optimal compensation controller. Then, a critic NN is constructed to solve the improved event-triggered HJBE, and the event-triggered approximate decentralized optimal compensation tracking control torque is derived directly. The Lyapunov stability theorem is utilized to prove that the tracking error of the closed-loop MRM system is UUB. In contrast to the time-triggered optimal controller, the proposed compensator-critic structure-based event-triggered decentralized tracking control method not only maintains satisfactory control accuracy, but also effectively reduces the computational burden and saves communication and energy cost.