
J Shanghai Jiaotong Univ Sci, 2025, Vol. 30, Issue (6): 1125-1133. doi: 10.1007/s12204-023-2666-z
• Automation & Computer Technologies •
LI Chunyang (李春阳), ZHU Xiaoqing (朱晓庆), RUAN Xiaogang (阮晓钢), LIU Xinyuan (刘鑫源), ZHANG Siyuan (张思远)
Received: 2023-02-20; Accepted: 2023-03-14; Online: 2025-11-21; Published: 2023-11-06
LI Chunyang, ZHU Xiaoqing, RUAN Xiaogang, LIU Xinyuan, ZHANG Siyuan. Gait Learning Reproduction for Quadruped Robots Based on Experience Evolution Proximal Policy Optimization[J]. J Shanghai Jiaotong Univ Sci, 2025, 30(6): 1125-1133.
[1] YU Xinyi, XU Siyu, FAN Yuehai, OU Linlin. Self-Adaptive LSAC-PID Approach Based on Lyapunov Reward Shaping for Mobile Robots [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(6): 1085-1102.
[2] CHENG Hongyu, ZHANG Han, WANG Shuang, XIE Le. Design of a 6-DOF Master Robot for Robot-Assisted Minimally Invasive Surgery [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(4): 658-667.
[3] WANG Wei, ZHOU Cheng, JIANG Jinlei, CUI Xinyuan, YAN Guozheng, CUI Daxiang. Optimization of Wireless Power Receiving Coil for Near-Infrared Capsule Robot [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(3): 425-432.
[4] LI Tao, ZHAO Zhigang, ZHU Mingtong, ZHAO Xiangtang. Cable Vector Collision Detection Algorithm for Multi-Robot Collaborative Towing System [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(2): 319-329.
[5] FU Yujia, ZHANG Jian, ZHOU Liping, LIU Yuanzhi, QIN Minghui, ZHAO Hui, TAO Wei. Passive Binocular Optical Motion Capture Technology Under Complex Illumination [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(2): 352-362.
[6] NIE Wei, LIANG Xinwu. Efficient Fully Convolutional Network and Optimization Approach for Robotic Grasping Detection Based on RGB-D Images [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(2): 399-416.
[7] ZHAO Yanfei (赵艳飞), XIAO Peng (肖鹏), WANG Jingchuan (王景川), GUO Rui (郭锐). Semi-Autonomous Navigation Based on Local Semantic Map for Mobile Robot [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(1): 27-33.
[8] FU Hang (傅航), XU Jiangchang (许江长), LI Yinwei (李寅炜), ZHOU Huifang (周慧芳), CHEN Xiaojun (陈晓军). Augmented Reality Based Navigation System for Endoscopic Transnasal Optic Canal Decompression [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(1): 34-42.
[9] ZHOU Hanwei (周涵巍), ZHU Xinping (朱心平), MA Youwei (马有为), WANG Kundong (王坤东). Low Latency Soft Fiberoptic Choledochoscope Robot Control System [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(1): 43-52.
[10] HE Guisong (贺贵松), HUANG Xuegong (黄学功), LI Feng (李峰). Coordination Design of a Power-Assisted Ankle Exoskeleton Robot Based on Active-Passive Combined Drive [J]. J Shanghai Jiaotong Univ Sci, 2025, 30(1): 197-208.
[11] LIU Yuesheng (刘月笙), HE Ning (贺宁), HE Lile (贺利乐), ZHANG Yiwen (张译文), XI Kun (习坤), ZHANG Mengrui (张梦芮). Self-Tuning of MPC Controller for Mobile Robot Path Tracking Based on Machine Learning [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(6): 1028-1036.
[12] DONG Yubo (董玉博), CUI Tao (崔涛), ZHOU Yufan (周禹帆), SONG Xun (宋勋), ZHU Yue (祝月), DONG Peng (董鹏). Reward Function Design Method for Long Episode Pursuit Tasks Under Polar Coordinate in Multi-Agent Reinforcement Learning [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(4): 646-655.
[13] DU Haikuo (杜海阔), GUO Zhengyu (郭正玉), ZHANG Lulu (章露露), CAI Yunze (蔡云泽). Multi-Objective Loosely Synchronized Search for Multi-Objective Multi-Agent Path Finding with Asynchronous Actions [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(4): 667-677.
[14] DONG Dejin (董德金), DONG Shiyin (董诗音), ZHANG Lulu (章露露), CAI Yunze (蔡云泽). Multi-AGVs Scheduling with Vehicle Conflict Consideration in Ship Outfitting Items Warehouse [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(4): 725-736.
[15] LI Shuyi (李舒逸), LI Minzhe (李旻哲), JING Zhongliang (敬忠良). Multi-Agent Path Planning Method Based on Improved Deep Q-Network in Dynamic Environments [J]. J Shanghai Jiaotong Univ Sci, 2024, 29(4): 601-612.