Approximate Dynamic Programming on Manifolds for Robotics
This project extends the so-called "LP approach" to approximate dynamic programming to the case where the state is constrained to a manifold. Such constraints appear frequently in robotics applications, particularly for jointed robots or systems described in quaternion form.
Approximate Dynamic Programming (ADP) is a family of methods that derive approximations for the value functions (VFs) of control and reinforcement learning problems. A popular method in recent years has been to cast the VF estimation problem as an infinite-dimensional linear program (LP), subject to constraints derived from the well-known Bellman optimality condition. The resulting VF can then be used to make control decisions online, by minimising the sum of the stage cost for the current time step and the VF evaluated at the next state. This is useful in cases where a full sequence of control decisions would otherwise be difficult to compute, for example where there are constraints and nonlinear dynamics.
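For orientation, the LP approach and the online decision rule described above can be written as follows. This is a sketch of the standard discounted-cost formulation, not necessarily the exact variant used in this project. The notation is assumed here and does not come from the project description: a deterministic model $x^+ = f(x,u)$, stage cost $\ell$, discount factor $\gamma \in (0,1)$, state-relevance weighting $c$, and a function class $\mathcal{F}$ for the VF.

```latex
\begin{align}
\max_{V \in \mathcal{F}} \quad & \int_{\mathcal{X}} V(x)\, c(\mathrm{d}x) \\
\text{s.t.} \quad & V(x) \le \ell(x,u) + \gamma\, V\big(f(x,u)\big)
    \quad \forall\, (x,u) \in \mathcal{X} \times \mathcal{U}, \\[4pt]
\pi(x) \in \arg\min_{u \in \mathcal{U}} \; & \ell(x,u) + \gamma\, V\big(f(x,u)\big).
\end{align}
```

The inequality constraint is the relaxed Bellman condition, so any feasible $V$ under-estimates the optimal VF. The last line is the one-step minimisation carried out online.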
Real-time control of robotic systems (e.g. jointed robots, or vehicles with non-holonomic dynamics) is a challenging task that still lacks a reliable solution. These systems are often characterised by additional algebraic constraints on the states. For example, the tip of an arm moving around a fixed pivot is constrained to a circle or sphere. Similarly, a quadcopter, whose dynamics usually contain numerous trigonometric functions, can be represented in quaternions as a polynomial system subject to an additional manifold constraint (the quaternion must have unit norm). The LP approach to ADP is well developed for polynomial systems; however, no theory or algorithms have yet been developed to incorporate the manifold constraints, which would make the approach attractive for robotics.
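The quaternion remark can be illustrated with a minimal sketch, assuming the standard attitude kinematics q_dot = 0.5 * q * (0, omega) with a body-frame angular velocity omega. The function names are hypothetical and chosen for illustration only. The right-hand side is polynomial (bilinear) in q and omega, and the velocity is tangent to the unit-norm manifold.

```python
import math

def quat_mul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def quat_rate(q, omega):
    """Assumed attitude kinematics: q_dot = 0.5 * q * (0, omega).

    Every component is a product of state and input entries, so the
    model is polynomial and contains no trigonometric functions.
    """
    return tuple(0.5 * c for c in quat_mul(q, (0.0, *omega)))

def manifold_residual(q):
    """Algebraic constraint h(q) = |q|^2 - 1, which is zero on the manifold."""
    return sum(c * c for c in q) - 1.0

# Illustrative values (not from the project description).
q = (math.cos(0.3), math.sin(0.3), 0.0, 0.0)   # a unit quaternion
omega = (0.1, -0.4, 0.7)                       # body angular velocity in rad/s
q_dot = quat_rate(q, omega)

# Half the time derivative of |q|^2; zero means the flow stays on the manifold.
tangency = sum(a * b for a, b in zip(q, q_dot))
```

Here `tangency` vanishes up to rounding error, which is the property a VF estimated directly on the manifold could exploit.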
The objective of this project is to adapt the well-studied LP approach to ADP to incorporate these additional algebraic constraints, and estimate the VF directly on the curved manifolds over which the dynamics evolve. This will yield deeper insights into the role of value functions in such control problems, with the end goal of performing complex tasks (see e.g. http://www.mujoco.org) with minimal online computation. The project will be based around the following steps, adapted to the background of the student:
1. Review literature on ADP, in particular the LP approach.
2. Identify a simple system with a manifold constraint (e.g. the pivoted arm described above) and derive a variant of the LP approach that can represent state evolution on this manifold.
3. Simulate the state and control evolution of the controlled system based on this approach, and compare with other simple methods, such as discretised DP.
4. Review textbook material on quaternions if necessary, and extend the approach to the quadcopter system.
5. If time allows, implement the resulting controller on IfA's CrazyFlie quadcopter platform.
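As a point of reference for step 3, a discretised DP baseline for a pivoted arm might look like the sketch below. Everything in it is an assumption made for illustration: a purely kinematic arm whose tip moves one grid cell per step on the unit circle, a grid of 64 points, a discount factor of 0.95, and a stage cost that is polynomial in the embedded coordinate x. It is not the project's method, only a simple comparison point.

```python
import math

N = 64                 # number of grid points on the circle (assumed)
GAMMA = 0.95           # discount factor (assumed)
CONTROLS = (-1, 0, 1)  # one cell clockwise, stay, one cell counter-clockwise

def embed(i):
    """Map grid index i to the point (x, y) on the unit circle."""
    theta = 2.0 * math.pi * i / N
    return math.cos(theta), math.sin(theta)

def stage_cost(i, u):
    """Cost (1 - x) penalises distance from the target (1, 0); 0.1*u^2 penalises effort."""
    x, _ = embed(i)
    return (1.0 - x) + 0.1 * u * u

def step(i, u):
    """Dynamics: the index wraps around, so the state never leaves the circle."""
    return (i + u) % N

def value_iteration(sweeps=500):
    """Repeatedly apply the Bellman operator on the grid."""
    V = [0.0] * N
    for _ in range(sweeps):
        V = [min(stage_cost(i, u) + GAMMA * V[step(i, u)] for u in CONTROLS)
             for i in range(N)]
    return V

def greedy_control(V, i):
    """Online decision: minimise stage cost plus discounted value of the next state."""
    return min(CONTROLS, key=lambda u: stage_cost(i, u) + GAMMA * V[step(i, u)])

V = value_iteration()
u_quarter = greedy_control(V, N // 4)  # control chosen a quarter turn from the target
```

The grid lives on the manifold itself, so no constraint handling is needed here. The cost of this convenience is that the number of grid points grows quickly with the manifold dimension, which is the limitation an LP-based VF approximation aims to avoid.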
Paul Beuchat (beuchatp@control.ee.ethz.ch)
Joe Warrington (warrington@control.ee.ethz.ch)