Pushing the Limit of Quadruped Running Speed with Autonomous Curriculum Learning
The project aims to explore curriculum learning techniques to push the limits of quadruped running speed using reinforcement learning. By systematically designing and implementing curricula that guide the learning process, the project seeks to develop a quadruped controller capable of achieving the fastest possible forward locomotion. This involves not only optimizing the learning process but also ensuring the robustness and adaptability of the learned policies across various running conditions.
Keywords: curriculum learning, fast locomotion
Quadruped robots have shown remarkable versatility in navigating diverse terrains, demonstrating capabilities ranging from basic locomotion to complex maneuvers. However, achieving high-speed forward locomotion remains a challenging task due to the intricate dynamics and control requirements involved. Traditional reinforcement learning (RL) approaches have made significant strides in this area, but they often face issues related to sample efficiency, convergence speed, and stability when applied to tasks with high degrees of freedom like quadruped locomotion.
Curriculum learning (CL), a concept inspired by the way humans and animals learn progressively from simpler to more complex tasks, offers a promising solution to these challenges. In the context of reinforcement learning, curriculum learning involves structuring the learning process by starting with simpler tasks and gradually increasing the complexity as the agent's proficiency improves. This approach can lead to faster convergence and better generalization by enabling the agent to build foundational skills before tackling more difficult scenarios.
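As a rough illustration of what an autonomous curriculum for this task could look like, the minimal Python sketch below widens the range of commanded forward velocities only once the policy tracks the current commands sufficiently well. The class and parameter names (e.g. `VelocityCurriculum`, `tracking_threshold`) and the placeholder tracking-reward signal are illustrative assumptions, not part of the project specification.

```python
import numpy as np

class VelocityCurriculum:
    """Minimal adaptive curriculum sketch: widen the commanded forward-velocity
    range whenever the policy tracks the currently unlocked commands well enough."""

    def __init__(self, v_init=1.0, v_max=6.0, step=0.5, tracking_threshold=0.8):
        self.v_limit = v_init                          # current upper bound on commands (m/s)
        self.v_max = v_max                             # hard cap of the curriculum (m/s)
        self.step = step                               # expansion per promotion (m/s)
        self.tracking_threshold = tracking_threshold   # normalized tracking reward needed to promote

    def sample_command(self, rng):
        # Sample a forward-velocity command from the currently unlocked range.
        return rng.uniform(0.0, self.v_limit)

    def update(self, mean_tracking_reward):
        # Promote the curriculum once the agent masters the current range.
        if mean_tracking_reward > self.tracking_threshold:
            self.v_limit = min(self.v_limit + self.step, self.v_max)


# Hypothetical usage inside a training loop:
rng = np.random.default_rng(0)
curriculum = VelocityCurriculum()
for iteration in range(3):
    command = curriculum.sample_command(rng)
    # ... roll out the policy with this command and compute its normalized
    # velocity-tracking reward (placeholder value used here) ...
    mean_tracking_reward = 0.9
    curriculum.update(mean_tracking_reward)
    print(f"iter {iteration}: command {command:.2f} m/s, unlocked up to {curriculum.v_limit:.1f} m/s")
```

A hand-crafted baseline would instead advance `v_limit` on a fixed schedule; the autonomous variant studied in this project would advance it from a measure of the agent's current competence.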
**Work packages**
Literature research
Development of autonomous curriculum
Comparison with baselines (no curriculum, hand-crafted curriculum)
**Requirements**
Strong programming skills in Python
Experience in reinforcement learning
**Publication**
This project will focus primarily on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences that highlight outstanding robotic performance.
**Related literature**
This project and the following literature will make you a master of curriculum, active, and open-ended learning.
Oudeyer, P.Y., Kaplan, F. and Hafner, V.V., 2007. Intrinsic motivation systems for autonomous mental development. IEEE Transactions on Evolutionary Computation, 11(2), pp.265-286.
Baranes, A. and Oudeyer, P.Y., 2009. R-IAC: Robust intrinsically motivated exploration and active learning. IEEE Transactions on Autonomous Mental Development, 1(3), pp.155-169.
Wang, R., Lehman, J., Clune, J. and Stanley, K.O., 2019. Paired Open-Ended Trailblazer (POET): Endlessly generating increasingly complex and diverse learning environments and their solutions. arXiv preprint arXiv:1901.01753.
Pitis, S., Chan, H., Zhao, S., Stadie, B. and Ba, J., 2020. Maximum entropy gain exploration for long horizon multi-goal reinforcement learning. In International Conference on Machine Learning (pp. 7750-7761). PMLR.
Portelas, R., Colas, C., Hofmann, K. and Oudeyer, P.Y., 2020. Teacher algorithms for curriculum learning of deep RL in continuously parameterized environments. In Conference on Robot Learning (pp. 835-853). PMLR.
Margolis, G.B., Yang, G., Paigwar, K., Chen, T. and Agrawal, P., 2024. Rapid locomotion via reinforcement learning. The International Journal of Robotics Research, 43(4), pp.572-587.
Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.
**Chenhao Li**
https://breadli428.github.io/
chenhli@ethz.ch
**Marco Bagatella**
https://marbaga.github.io/
mbagatella@ethz.ch
Please include your CV and transcript in the submission.