Approximate dynamic programming for an Artificial Intelligence billiard player
This project is part of a new IfA project to develop a fully automated billiard-playing robot, and is concerned in particular with the robot's strategic AI.
Strategy in billiard games is complex because a) the outcome of a particular shot is uncertain, b) the range of legal shots is large and continuous-valued, and c) it is sometimes difficult for a player to decide where they would want the balls to be after the shot has been taken. This project will break the decision problem down using methods from approximate dynamic programming. Dynamic programming (DP) represents all future rewards (in this case, points scored relative to the opponent) as a function of the system state, so that the decision maker does not have to plan ahead for a full match with every shot.
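As a rough illustration of the value-function idea, a generic finite-horizon DP recursion could be written as below. All symbols are assumptions introduced here for illustration and do not come from the project description: x is the ball arrangement, u the shot parameters chosen from the legal set U(x), w the random execution error, f the table dynamics, r the points scored by the shot, and V_k the value of an arrangement with N - k shots remaining. The sketch also ignores the change of turn between players, which a full billiard model would have to include.

```latex
V_N(x) = 0, \qquad
V_k(x) = \max_{u \in U(x)} \; \mathbb{E}_{w}\!\left[\, r(x,u,w) + V_{k+1}\big(f(x,u,w)\big) \right],
\quad k = N-1, \dots, 0 .
```

Once V_{k+1} is known, even approximately, choosing a shot only requires a one-step maximisation over u rather than a search over every possible continuation of the match.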
The steps of this project will be as follows:
1) Write down a dynamic programming recursion for the billiard problem.
2) Identify the difficulties in solving this problem directly, and develop a simplified approach for the case where all shots are executed perfectly.
3) Visualise in simulation the expected reward from the control policy (parameters of the chosen shot), and the cost-to-go (value of the ball arrangement after the shot has been taken).
4) Implement on the experimental setup developed in the lab.
5) If time allows, derive a first-principles stochastic extension, which accounts for the fact that difficult shots will not always succeed.
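To give a feel for the deterministic case in which all shots are executed perfectly, the following is a minimal sketch on a deliberately simplified toy model. It is not the project's method. The model is entirely hypothetical: balls sit at integer positions on a line, a shot is legal only if the target ball lies within `MAX_REACH` of the cue ball, every legal shot pots its ball for one point, and the cue ball then stops where the potted ball was. The names `MAX_REACH`, `legal_shots`, `value` and `best_shot` are illustrative assumptions.

```python
from functools import lru_cache

# Hypothetical toy parameter: the farthest ball the cue ball can reach.
MAX_REACH = 3


def legal_shots(cue, remaining):
    """Balls that can be potted from the current cue-ball position."""
    return sorted(b for b in remaining if abs(b - cue) <= MAX_REACH)


@lru_cache(maxsize=None)
def value(cue, remaining):
    """Cost-to-go: most points still obtainable from this arrangement.

    `remaining` must be a frozenset so that states can be memoised.
    """
    shots = legal_shots(cue, remaining)
    if not shots:
        return 0  # no legal shot, so the run of pots ends here
    # DP recursion: immediate reward (1 point) plus value of the next state.
    return max(1 + value(b, remaining - {b}) for b in shots)


def best_shot(cue, remaining):
    """Greedy policy with respect to the value function (None if stuck)."""
    shots = legal_shots(cue, remaining)
    if not shots:
        return None
    return max(shots, key=lambda b: 1 + value(b, remaining - {b}))


if __name__ == "__main__":
    balls = frozenset({2, 7, 10, 13})
    print(value(4, balls))      # cost-to-go from the starting arrangement
    print(best_shot(4, balls))  # shot chosen by the DP policy
```

With the cue ball at 4 and balls at 2, 7, 10 and 13, the nearest ball is at 2, but potting it leaves no further legal shot. The DP policy instead pots the ball at 7, which keeps the balls at 10 and 13 reachable in turn. This is the sense in which the cost-to-go encodes where the player would want the balls to be after the shot.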
Joe Warrington (warrington@control.ee.ethz.ch), Nikos Kariotoglou (karioto@control.ee.ethz.ch)