Reinforcement Learning Control with Probabilistic Safety
When controlling a system we typically aim to make it carry out specific tasks, such as remaining in a set of states, reaching a set of states, or both. Recent advances make it possible to formulate controllers via dynamic programming that optimally trade off such specifications against costs, such as energy consumption. However, these methods rely on full model knowledge; the aim of this project is to explore learning-based algorithms for achieving these objectives. The approach will be validated on the Ball-on-a-Plate system, a mechanically actuated plate with a ball on it.
Keywords: Machine Learning, Reinforcement Learning, Control Theory, Safety, Stochastic Systems
The problem of finding controllers that maximize the probability of remaining in a set (invariance), reaching specific states (reachability), or both (reach-avoidance) has a rich history in Dynamic Programming (https://arxiv.org/pdf/2211.07544.pdf). Moreover, most systems also feature a physical cost objective, and trading this cost off against the probability of satisfying one of the above specifications yields a problem formulation that has often been considered intractable. While recent developments propose a computationally effective approach, they require full knowledge of the stochastic model dynamics (https://arxiv.org/pdf/2312.10495v1.pdf, https://arxiv.org/pdf/2402.19360v1.pdf). This project aims to lift this restriction using learning-based approaches, e.g., Reinforcement Learning.
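For a finite-state abstraction, the reach-avoid probability above satisfies a simple dynamic-programming recursion: the value is 1 on the target set, 0 on the unsafe set, and elsewhere the best expected value over actions. The sketch below illustrates this backward recursion for a tabular MDP; the function name and tensor layout are illustrative, not taken from the cited papers.

```python
import numpy as np

def reach_avoid_value_iteration(P, target, unsafe, horizon):
    """Maximal probability of reaching `target` within `horizon` steps
    while avoiding `unsafe`, for a finite MDP.

    P: transition tensor of shape (n_actions, n_states, n_states),
       P[a, x, y] = Pr(next state = y | state = x, action = a).
    target, unsafe: boolean arrays of shape (n_states,).
    Returns the value function V and a greedy policy.
    """
    n_actions, n_states, _ = P.shape
    V = target.astype(float)      # terminal value: 1 on target, 0 elsewhere
    policy = np.zeros(n_states, dtype=int)
    for _ in range(horizon):
        Q = P @ V                 # Q[a, x] = E[V(next state) | x, a]
        policy = Q.argmax(axis=0)
        V_next = Q.max(axis=0)
        # Absorbing semantics: target states stay won, unsafe states lose.
        V = np.where(target, 1.0, np.where(unsafe, 0.0, V_next))
    return V, policy
```

Trading this probability off against a physical cost amounts to augmenting the backup with a stage-cost term; the project's learning-based methods must achieve the same trade-off without access to `P`.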
The Ball-on-a-Plate system, a mechanically actuated plate that balances a ball in its middle, will act as a benchmark. The overall goal is to maneuver the ball to a sequence of sets as fast as possible while retaining sufficiently high reliability. The system has recently been equipped with a simple-to-use Python interface, allowing easy deployment of control algorithms.
The goal is to
1. Design learning-based algorithms to trade off invariance, reachability and reach-avoid specifications against cost objectives.
2. Implement and validate the algorithm on the Ball-on-a-Plate system, including creating a good visual presentation of the approach.
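As a starting point for goal 1, the reach-avoid recursion can be made model-free: a tabular Q-learning variant bootstraps from sampled transitions instead of known dynamics, backing up 1 on reaching the target and 0 on entering the unsafe set. This is a minimal sketch under an assumed environment interface (`env.reset`, `env.step` returning the next state plus target/unsafe flags); it is not the project's prescribed algorithm and omits the cost trade-off.

```python
import random

def reach_avoid_q_learning(env, n_states, n_actions, episodes,
                           alpha=0.1, eps=0.2):
    """Tabular Q-learning for the maximal reach-avoid probability.

    Assumed (hypothetical) interface:
      env.reset() -> initial state index
      env.step(a) -> (next_state, in_target, in_unsafe)
    """
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        x = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda u: Q[x][u])
            x_next, in_target, in_unsafe = env.step(a)
            if in_target:          # success: probability-one backup
                backup, done = 1.0, True
            elif in_unsafe:        # failure: probability-zero backup
                backup, done = 0.0, True
            else:                  # bootstrap from the next state's value
                backup = max(Q[x_next])
            Q[x][a] += alpha * (backup - Q[x][a])
            x = x_next
    return Q
```

Since all backups lie in [0, 1], the learned Q-values remain valid probability estimates throughout training.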
Please send your resume/CV (including lists of relevant publications/projects) and transcript of records via email to nikschmid@ethz.ch, mfochesato@ethz.ch.
A basic familiarity with reinforcement learning or dynamic programming is expected.