Autonomous Curriculum Learning for Increasingly Challenging Tasks
While the history of machine learning so far largely encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. Such a process would in effect build its own diverse and expanding curricula, and the solutions to problems at various stages would become stepping stones towards solving even more challenging problems later in the process.
Consider the realm of legged locomotion: Training a robot via reinforcement learning to track a velocity command illustrates this concept. Initially, tracking a low velocity is simpler due to algorithm initialization and environmental setup. By manually crafting a curriculum, we can start with low-velocity targets and incrementally increase them as the robot demonstrates competence. This method works well when the difficulty correlates clearly with the target, as with higher velocities or more challenging terrains.
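Such a hand-crafted schedule can be sketched in a few lines. The function and threshold names below are illustrative assumptions, not part of the project specification; the idea is simply to raise the commanded velocity once the robot tracks the current one well enough.

```python
# Minimal sketch of a hand-crafted velocity curriculum.
# All names and thresholds are illustrative, not from the project description.

def update_velocity_target(target, tracking_error,
                           error_threshold=0.1, increment=0.25, max_target=3.0):
    """Raise the commanded velocity once tracking the current one succeeds."""
    if tracking_error < error_threshold:
        target = min(target + increment, max_target)
    return target

# Example progression: the target grows only while tracking stays accurate.
target = 0.5
for error in [0.05, 0.2, 0.08, 0.04]:
    target = update_velocity_target(target, error)
```

This works precisely because difficulty increases monotonically with the target velocity, which is the assumption the paragraphs below call into question.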
However, challenges arise when the relationship between task difficulty and control parameters is unclear. For instance, if a parameter dictates various human dance styles for the robot to mimic, it's not obvious whether jazz is easier than hip-hop. In such scenarios, the difficulty distribution does not align with the control parameter. How, then, can we devise an effective curriculum?
In the conventional RSL training setup for locomotion over challenging terrains, a handcrafted schedule likewise dictates increasingly hard terrain levels, but it applies uniformly across multiple terrain types. With a smart autonomous curriculum learning algorithm, can we instead progress through the separate terrain types asynchronously, and thereby achieve better overall performance or higher data efficiency?
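One way to picture the asynchronous variant: each terrain type keeps its own difficulty level, advanced or regressed independently from per-terrain success statistics. This is a hypothetical sketch under assumed names and thresholds, not the project's prescribed method.

```python
# Sketch of an asynchronous per-terrain curriculum: each terrain type's
# difficulty level is updated independently from its own success rate.
# All names, thresholds, and levels are illustrative assumptions.

def advance_levels(levels, success_rates, promote=0.8, demote=0.4, max_level=9):
    """Promote or demote each terrain type's level based on its success rate."""
    for terrain, rate in success_rates.items():
        if rate > promote:
            levels[terrain] = min(levels[terrain] + 1, max_level)
        elif rate < demote:
            levels[terrain] = max(levels[terrain] - 1, 0)
    return levels

levels = {"stairs": 2, "slopes": 2, "rough": 2}
levels = advance_levels(levels, {"stairs": 0.9, "slopes": 0.5, "rough": 0.3})
```

Compared to a single shared level, this lets the policy push ahead on terrain types it has already mastered while revisiting the ones it still fails.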
**Work packages**
Literature research
Development of autonomous curriculum
Comparison with baselines (no curriculum, hand-crafted curriculum)
**Requirements**
Strong programming skills in Python
Experience in reinforcement learning
**Publication**
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to robotics or machine learning conferences where outstanding robotic performances are highlighted.
**Related literature**
This project and the following literature will make you a master in curriculum/active/open-ended learning.
Oudeyer, P.Y., Kaplan, F. and Hafner, V.V., 2007. Intrinsic motivation systems for autonomous mental development. IEEE transactions on evolutionary computation, 11(2), pp.265-286.
Baranes, A. and Oudeyer, P.Y., 2009. R-IAC: Robust intrinsically motivated exploration and active learning. IEEE Transactions on Autonomous Mental Development, 1(3), pp.155-169.
Wang, R., Lehman, J., Clune, J. and Stanley, K.O., 2019. Paired open-ended trailblazer (POET): Endlessly generating increasingly complex and diverse learning environments and their solutions. arXiv preprint arXiv:1901.01753.
Pitis, S., Chan, H., Zhao, S., Stadie, B. and Ba, J., 2020, November. Maximum entropy gain exploration for long horizon multi-goal reinforcement learning. In International Conference on Machine Learning (pp. 7750-7761). PMLR.
Portelas, R., Colas, C., Hofmann, K. and Oudeyer, P.Y., 2020, May. Teacher algorithms for curriculum learning of deep rl in continuously parameterized environments. In Conference on Robot Learning (pp. 835-853). PMLR.
Margolis, G.B., Yang, G., Paigwar, K., Chen, T. and Agrawal, P., 2024. Rapid locomotion via reinforcement learning. The International Journal of Robotics Research, 43(4), pp.572-587.
Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.
Please include your CV and transcript in the submission.
**Chenhao Li**
https://breadli428.github.io/
chenhli@ethz.ch
**Marco Bagatella**
https://marbaga.github.io/
mbagatella@ethz.ch