Learning World Models for Legged Locomotion
Model-based reinforcement learning learns a world model from which an optimal control policy can be extracted. Understanding and predicting the forward dynamics of legged systems is crucial for effective control and planning. Forward dynamics involve predicting the next state of the robot given its current state and the applied actions. While traditional physics-based models can provide a baseline understanding, they often struggle with the complexities and non-linearities inherent in real-world scenarios, particularly due to the varying contact patterns of the robot's feet with the ground.
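As a minimal sketch of what a one-step forward-dynamics model computes, the following uses a hand-written linear model of a point mass in place of the learned network (the matrices `A`, `B` and the time step `dt` are illustrative assumptions, not part of the project specification):

```python
import numpy as np

def forward_dynamics(state, action, A, B):
    """One-step forward model: predict s_{t+1} from (s_t, a_t).
    A linear placeholder stands in for the neural network here."""
    return A @ state + B @ action

# toy 2-D state (position, velocity) and 1-D action (acceleration)
dt = 0.1
A = np.array([[1.0, dt],
              [0.0, 1.0]])   # Euler-integrated double integrator
B = np.array([[0.0],
              [dt]])

s = np.array([0.0, 1.0])     # at the origin, moving with velocity 1
a = np.array([0.5])          # constant acceleration command
s_next = forward_dynamics(s, a, A, B)  # → [0.1, 1.05]
```

A learned model would replace the fixed `(A, B)` map with a network fit to data; the contact-dependent, non-linear nature of legged systems is exactly what such a linear baseline fails to capture.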
The project aims to develop and evaluate neural network-based models for predicting the dynamics of legged environments, with a focus on varying contact patterns and non-linearities. This involves collecting and preprocessing data from experiments in various simulation environments, designing neural network architectures with appropriate inductive biases, and exploring hybrid models that combine physics-based predictions with neural network corrections. The models will be trained and evaluated on autoregressive prediction accuracy, with an emphasis on robustness and generalization across different noise perturbations. By the end of the project, the goal is to achieve an accurate, robust, and generalizable predictive model for the forward dynamics of legged systems.
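The two evaluation ideas above, a hybrid physics-plus-correction model and autoregressive rollouts that feed predictions back as inputs, can be sketched as follows. The point-mass `physics_prior` and the zero `correction` are illustrative stand-ins, not the project's actual models:

```python
import numpy as np

def physics_prior(s, a, dt=0.1):
    """Crude Euler step for a point mass; ignores contact effects."""
    pos, vel = s
    return np.array([pos + dt * vel, vel + dt * a[0]])

def hybrid_step(s, a, correction):
    """Hybrid model: physics baseline plus a learned residual correction."""
    return physics_prior(s, a) + correction(s, a)

def rollout(step, s0, actions):
    """Autoregressive rollout: each prediction becomes the next input,
    so one-step errors compound over the horizon."""
    states = [s0]
    for a in actions:
        states.append(step(states[-1], a))
    return np.stack(states)

# with a zero correction, the hybrid model reduces to the physics prior
zero_corr = lambda s, a: np.zeros_like(s)
traj = rollout(lambda s, a: hybrid_step(s, a, zero_corr),
               np.array([0.0, 0.0]),
               [np.array([1.0])] * 3)   # 3-step rollout → 4 states
```

Evaluating on such rollouts rather than single-step error is what distinguishes autoregressive accuracy: a model with small one-step error can still drift badly over long horizons.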
**Work packages**
Literature research
Forward dynamics training with various models
Evaluation of prediction accuracy and efficiency
**Requirements**
Strong programming skills in Python
Experience in machine learning frameworks
**Publication**
This project will mostly focus on simulated environments. Promising results will be submitted to machine learning conferences, where the method will be thoroughly evaluated and tested on a range of systems, from simple MuJoCo environments to complex systems such as quadrupeds and bipeds.
**Related literature**
Hafner, D., Lillicrap, T., Ba, J. and Norouzi, M., 2019. Dream to control: Learning behaviors by latent imagination. arXiv preprint arXiv:1912.01603.
Hafner, D., Lillicrap, T., Norouzi, M. and Ba, J., 2020. Mastering atari with discrete world models. arXiv preprint arXiv:2010.02193.
Hafner, D., Pasukonis, J., Ba, J. and Lillicrap, T., 2023. Mastering diverse domains through world models. arXiv preprint arXiv:2301.04104.
Li, C., Stanger-Jones, E., Heim, S. and Kim, S., 2024. FLD: Fourier Latent Dynamics for Structured Motion Representation and Learning. arXiv preprint arXiv:2402.13820.
Song, Y., Kim, S. and Scaramuzza, D., 2024. Learning Quadruped Locomotion Using Differentiable Simulation. arXiv preprint arXiv:2403.14864.
Please include your CV and transcript in the submission.
**Chenhao Li**
https://breadli428.github.io/
chenhli@ethz.ch
**Victor Klemm**
https://scholar.google.ch/citations?user=-3pMVPUAAAAJ&hl=de
vklemm@ethz.ch
**Fang Nan**
https://scholar.google.com/citations?user=pbTEA7AAAAAJ&hl=en
fannan@ethz.ch