Robustifying a reinforcement learning agent-based bionic reflex controller through an adaptive sliding mode control

Hirakjyoti Basumatary; Daksh Adhar; Shyamanta M. Hazarika

doi:10.1017/S0263574724001838

Published online by Cambridge University Press: 08 November 2024

Affiliation (Hirakjyoti Basumatary*, Daksh Adhar and Shyamanta M. Hazarika):
Biomimetic Robotics and Artificial Intelligence Laboratory (BRAIL), Mechanical Engineering Department, Indian Institute of Technology, Guwahati, India
*Corresponding author: Hirakjyoti Basumatary; Email: [email protected]

Abstract

Maintaining a stable grasp on an object is a central challenge in robotic manipulation and upper-limb prosthetics. External perturbations frequently disrupt grasp stability and cause the object to slip. Moreover, if the grasping force applied while controlling the slip is not optimal, the object may deform. This study investigates the robustification of a reinforcement learning (RL) policy for intelligent bionic reflex control, i.e., the prevention of both slip and deformation of grasped objects. RL-derived policies are prone to failure in environments whose dynamics vary. To mitigate this vulnerability, we propose incorporating an adaptive sliding mode controller into a pre-trained RL policy. By exploiting the inherent invariance property of the sliding mode algorithm in the presence of uncertainties, our approach strengthens the robustness of RL policies against diverse and dynamic variations. Numerical simulations substantiate the efficacy of our approach in robustifying RL policies trained in simulated environments.
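
The idea of augmenting a fixed policy with an adaptive sliding mode term can be illustrated with a toy example. The sketch below is not the controller from the article: the first-order error dynamics, the proportional law standing in for a pre-trained RL policy, and all names and gains (`nominal_policy`, `gamma`, `phi`, the constant disturbance) are assumptions chosen only to show the mechanism. The switching gain `k_hat` is adapted online and a saturation function replaces the sign function to limit chattering.

```python
def nominal_policy(error):
    """Hypothetical stand-in for a pre-trained RL policy (a fixed proportional law)."""
    return -1.0 * error


def saturate(x):
    """Clamp x to [-1, 1]; a smooth-ish replacement for sign() inside the boundary layer."""
    return max(-1.0, min(1.0, x))


def simulate(use_asmc, disturbance=2.0, e0=1.0, dt=1e-3, steps=10_000,
             gamma=5.0, phi=0.05):
    """Euler simulation of the assumed toy plant e_dot = u + disturbance.

    use_asmc: add the adaptive sliding mode correction to the nominal action.
    gamma:    adaptation rate of the switching gain (assumed value).
    phi:      boundary-layer half-width (assumed value).
    Returns the final error and the final adapted gain.
    """
    e, k_hat = e0, 0.0
    for _ in range(steps):
        s = e  # sliding variable; for this first-order toy plant it is the error itself
        u = nominal_policy(e)
        if use_asmc:
            # Robust correction added on top of the unchanged nominal action.
            u -= k_hat * saturate(s / phi)
            # Grow the gain only while outside the boundary layer (dead zone
            # prevents the gain from drifting once the error is small).
            if abs(s) > phi:
                k_hat += gamma * abs(s) * dt
        e += (u + disturbance) * dt
    return e, k_hat


e_nominal, _ = simulate(use_asmc=False)
e_robust, k_final = simulate(use_asmc=True)
print("final error, nominal policy only:", e_nominal)
print("final error, with adaptive SMC  :", e_robust)
print("adapted switching gain          :", k_final)
```

In this toy setting the nominal law alone settles at a nonzero offset under the constant disturbance, whereas the augmented law drives the error into the boundary layer without any knowledge of the disturbance magnitude. The nominal policy itself is left untouched, which mirrors the "pre-trained policy plus robust term" structure described in the abstract.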

Keywords

grasping; control of robotic systems; force control; novel applications of robotics; robotic hands

Type: Research Article
Information: Robotica, First View, pp. 1–24

DOI: https://doi.org/10.1017/S0263574724001838
Copyright: © The Author(s), 2024. Published by Cambridge University Press
