This opportunity is not published. No applications will be accepted.
Improving Feedback in Human-in-the-loop Reinforcement Learning
This project explores whether an attention or saliency visualization technique can improve teacher feedback in human-in-the-loop reinforcement learning (RL).
Approaches that use human feedback to guide RL agents scale poorly. Furthermore, it is hard for a user to make an informed feedback decision about whether an agent took its actions for the right reasons. We therefore want to study whether a visual interpretation of an agent's policy, in the form of an attention or saliency mechanism, helps users make better feedback decisions. Can a user distinguish a good policy from a bad one when given such a visualization? The main task is to find a suitable visualization technique and run a comparative study on RL agent policies. As an extension, the goal is to test whether augmenting the observations in this way improves the performance of human-in-the-loop RL algorithms in game-based environments such as Atari.
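To make the idea of a saliency visualization concrete, here is a minimal sketch of one candidate technique, occlusion-based saliency: mask one image patch at a time and measure how much the agent's action distribution shifts. This is an illustration only, not the method chosen for the project. The names `occlusion_saliency` and `make_toy_policy` are hypothetical, and the toy linear policy stands in for a trained RL agent.

```python
import numpy as np


def occlusion_saliency(policy_fn, obs, patch=4, fill=0.0):
    """Score each patch by how much masking it changes the action distribution."""
    base = policy_fn(obs)
    height, width = obs.shape
    saliency = np.zeros((height, width), dtype=float)
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            masked = obs.copy()
            masked[top:top + patch, left:left + patch] = fill
            # Total variation distance between original and perturbed policies.
            shift = 0.5 * np.abs(policy_fn(masked) - base).sum()
            saliency[top:top + patch, left:left + patch] = shift
    return saliency


def make_toy_policy(height, width, n_actions=3, seed=0):
    """Hypothetical stand-in for a trained agent (assumption, not a real model).

    Linear logits followed by a softmax. Only the top-left quadrant has
    non-zero weights, so the region the policy depends on is known.
    """
    rng = np.random.default_rng(seed)
    weights = np.zeros((n_actions, height, width))
    weights[:, :height // 2, :width // 2] = rng.normal(
        size=(n_actions, height // 2, width // 2)
    )

    def policy_fn(obs):
        logits = (weights * obs).sum(axis=(1, 2))
        exp = np.exp(logits - logits.max())  # numerically stable softmax
        return exp / exp.sum()

    return policy_fn


obs = np.ones((8, 8))
policy = make_toy_policy(8, 8)
saliency_map = occlusion_saliency(policy, obs, patch=4)
```

In a user study, a map like `saliency_map` could be overlaid on the game frame shown to the teacher. With this toy policy, only the top-left quadrant receives a non-zero score, which matches the region the policy actually reads.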
Sammy Christen (sammy.christen@inf.ethz.ch)
Dr. David Lindlbauer (david.lindlbauer@inf.ethz.ch)