Adaptive Reinforcement Learning Strategy with Sliding Mode Control for Unknown and Disturbed Wheeled Inverted Pendulum

  • Regular Papers
  • Intelligent Control and Applications
  • International Journal of Control, Automation and Systems

Abstract

This paper develops a novel adaptive integral sliding-mode control (SMC) technique to improve the tracking performance of a wheeled inverted pendulum (WIP) system, which belongs to a class of continuous-time systems with input disturbances and/or unknown parameters. The proposed algorithm combines the advantages of online adaptive reinforcement learning control with the high robustness of an integral SMC law. The main objective is to find a general structure for the integral SMC law that guarantees the system state reaches a sliding surface in finite time. An adaptive (approximate) optimal controller based on approximate/adaptive dynamic programming (ADP) is responsible for the asymptotic stability of the closed-loop system. Furthermore, the convergence of the proposed output-feedback optimal control is established without requiring the convergence of an additional state observer. Finally, theoretical analysis and simulation results validate the performance of the proposed control structure.
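
The two-part structure described in the abstract (a nominal stabilizing law plus an integral sliding-mode term that rejects matched input disturbance) can be illustrated with a minimal sketch. This is not the paper's WIP controller: it uses a scalar toy plant x' = a·x + u + d(t), a fixed linear gain `k` standing in for the ADP-learned optimal law, and the standard textbook integral sliding variable s = x − x(0) − ∫(a·x + u0)dτ. The function name `simulate` and all numerical values (`a`, `k`, `rho`, the disturbance 0.5·sin(2t)) are assumptions chosen for illustration only.

```python
import math


def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


def simulate(use_ismc, a=1.0, k=3.0, rho=1.0, x0=1.0, dt=1e-3, t_end=8.0):
    """Euler simulation of the toy plant x' = a*x + u + d(t).

    u = u0 + u1, where u0 = -k*x is the nominal law (a stand-in for a
    learned optimal controller) and u1 = -rho*sign(s) is the integral
    sliding-mode term. Returns max |x| over the final second.
    """
    x, z, t = x0, 0.0, 0.0           # z integrates the nominal dynamics
    worst = 0.0
    for _ in range(int(round(t_end / dt))):
        d = 0.5 * math.sin(2.0 * t)   # matched disturbance, |d| <= 0.5 < rho
        u0 = -k * x                   # nominal stabilizing feedback
        s = x - x0 - z                # integral sliding variable, s(0) = 0
        u1 = -rho * sign(s) if use_ismc else 0.0
        x_dot = a * x + u0 + u1 + d   # true (disturbed) dynamics
        z_dot = a * x + u0            # nominal (disturbance-free) dynamics
        x += dt * x_dot
        z += dt * z_dot
        t += dt
        if t >= t_end - 1.0:
            worst = max(worst, abs(x))
    return worst


print("max |x| in final second, nominal law only:", simulate(False))
print("max |x| in final second, with integral SMC:", simulate(True))
```

In this sketch s' = u1 + d, so choosing `rho` larger than the disturbance bound keeps s near zero and the state follows the nominal closed-loop dynamics x' = (a − k)·x. The discontinuous `sign` term chatters at the step size; practical designs often smooth it, which is outside what this example shows.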

Author information

Corresponding author

Correspondence to Phuong Nam Dao.

Additional information

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Recommended by Editor Jessie (Ju H.) Park. This work was supported in part by the Ministry of Science and Technology (MOST), Taiwan, under grants MOST 108-2636-E-006-007 and MOST 109-2636-E-006-019 (Young Scholar Fellowship Program).

Phuong Nam Dao received his Ph.D. degree in electrical engineering from Hanoi University of Science and Technology, Hanoi, Vietnam, in 2013. He is currently a lecturer at Hanoi University of Science and Technology, Vietnam. His research interests include control of robotic systems and robust, adaptive, and optimal control.

Yen-Chen Liu received his B.S. and M.S. degrees in mechanical engineering from National Chiao Tung University, Hsinchu, Taiwan, in 2003 and 2005, respectively, and his Ph.D. degree in mechanical engineering from the University of Maryland, College Park, MD, USA, in 2012. He is currently an Associate Professor with the Department of Mechanical Engineering, National Cheng Kung University, Tainan, Taiwan. His research interests include control of networked robotic systems, bilateral teleoperation, multi-robot systems, semiautonomous systems, and human-robot interaction. He received the MOST Ta-You Wu Memorial Award in 2016, the Kwoh-Ting Li Researcher Award from National Cheng Kung University, Taiwan, in 2018, and the MOST Young Scholar Fellowship in 2019.

About this article

Cite this article

Dao, P.N., Liu, YC. Adaptive Reinforcement Learning Strategy with Sliding Mode Control for Unknown and Disturbed Wheeled Inverted Pendulum. Int. J. Control Autom. Syst. 19, 1139–1150 (2021). https://doi.org/10.1007/s12555-019-0912-9

  • DOI: https://doi.org/10.1007/s12555-019-0912-9
