The concept of solving difficult optimization problems through machine learning has been present in the literature for decades [1-3]. Partially inspired by impressive results in simulation, where learning system dynamics and then predicting with the learned model can be much faster than traditional methods [4], we want to explore this idea in the robotics domain, applied to the trajectory optimization (TO) problem.
TO is a flavor of optimal control that generates motion plans that are optimal in some sense (e.g., minimum time or energy) and satisfy constraints (e.g., joint limits, torque bounds, …).
Such formulations typically result in large nonlinear optimization problems. Gradient-based solvers are prone to local minima and may fail to find a feasible solution at all; hence they work best when initialized close to the optimal solution. We would like to learn a near-optimal solution of the TO problem and use it to initialize a standard numerical solver. We hope this will significantly speed up computation, possibly enable MPC-style control, and help the solver reliably return feasible solutions.
There exist approaches that learn an optimal policy through reinforcement learning while being guided by TO solutions [5]. While these imitation-learning approaches are related and can provide useful insights, the goal in this project is to obtain an entire trajectory. We therefore approach the problem differently, formulating it as a supervised learning problem with data generated by our optimizer.
[1] Smith, Neural Networks for Combinatorial Optimization: A Review of More Than a Decade of Research, INFORMS Journal on Computing, 1999
[2] Malek et al., A Neural Network Model for Solving Nonlinear Optimization Problems with Real-Time Applications, ISNN, 2009
[3] Andrychowicz et al., Learning to learn by gradient descent by gradient descent, CoRR, 2016
[4] https://youtu.be/55rsJI11FOA
[5] Levine and Koltun, Guided Policy Search, ICML, 2013
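To make the proposed pipeline concrete, here is a minimal sketch, assuming a hypothetical 1-D double-integrator toy problem: (1) solve TO from scratch to generate training data, (2) fit a simple predictor from task parameters to trajectories (plain least squares here, standing in for a learned model), and (3) use the prediction to warm-start the numerical solver. All names and the toy cost are illustrative, not part of the project description.

```python
import numpy as np
from scipy.optimize import minimize

N, DT = 10, 0.1  # horizon length and time step

def cost(u, goal):
    """Control effort plus terminal position/velocity error for a 1-D double integrator."""
    pos, vel = 0.0, 0.0
    for ui in u:
        vel += ui * DT
        pos += vel * DT
    return 1e-2 * np.dot(u, u) + (pos - goal) ** 2 + vel ** 2

def solve(goal, u0):
    # Standard gradient-based solver; u0 is the initial guess (warm start).
    res = minimize(cost, u0, args=(goal,), method="L-BFGS-B")
    return res.x, res.fun, res.nit

# 1) Dataset: solve from a zero initial guess for a range of sampled goals.
goals = np.linspace(-1.0, 1.0, 21)
solutions = np.array([solve(g, np.zeros(N))[0] for g in goals])

# 2) Learn a map goal -> trajectory (least squares as a stand-in for a NN).
features = np.column_stack([goals, np.ones_like(goals)])
W, *_ = np.linalg.lstsq(features, solutions, rcond=None)
predict = lambda g: np.array([g, 1.0]) @ W

# 3) Warm-start the solver on an unseen goal and compare with a cold start.
g_new = 0.37
u_warm, f_warm, it_warm = solve(g_new, predict(g_new))
u_cold, f_cold, it_cold = solve(g_new, np.zeros(N))
```

On this toy problem the cost is quadratic in the controls, so the optimal trajectory is linear in the goal and the least-squares predictor is near-exact; the warm-started solve then needs far fewer iterations than the cold start, which is exactly the effect the project aims for on harder, nonconvex TO problems.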
- Literature review to evaluate which methods and learning architectures would be suitable for this problem
- Suggest a learning algorithm/setup, demonstrate its feasibility on a toy problem and analyze its performance. This may require defining suitable problem formulation and parametrization and the development of a measure to assess the quality of a trajectory (feasibility, stability, speed/efficiency, …)
- If time permits: Implement the method for a full robotic system in simulation and/or reality
We are looking for an independent and highly motivated student who takes ownership of this project and demonstrates persistence in making their algorithms work on a real system.
We expect candidates to have some knowledge in the following areas:
- Prior experience with implementing learning algorithms
- Strong programming skills in C++; knowledge of ROS (Robot Operating System) is helpful
- Background in trajectory optimization or numerical optimization is beneficial
Please contact Jan Carius (jan.carius@mavt.ethz.ch). Your application should include a brief statement of motivation, grade transcript, and your CV.