This opportunity is not published. No applications will be accepted.

Improving Feedback in Human-in-the-loop Reinforcement Learning

In this project, we would like to explore whether an attention or saliency visualization technique can improve teacher feedback in a human-in-the-loop reinforcement learning scenario.

Keywords: Reinforcement Learning, Computer Vision, Saliency, Attention, Human-in-the-loop

Description
Approaches that use human feedback to guide RL agents scale poorly. Furthermore, it is hard for a user to make an informed feedback decision about whether an agent took an action for the right reasons. We therefore want to study whether a visual interpretation of an agent’s policy, in the form of an attention or saliency mechanism, can help users make better feedback decisions. Can a user distinguish a good policy from a bad one when given such a visualization? The main task is to find a suitable visualization technique and run a comparative study on RL agent policies. In an extended step, the goal is to see whether augmenting the observations in this way can improve the performance of human-in-the-loop RL algorithms in game-based environments such as Atari.
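To make the idea of a saliency visualization concrete, the following is a minimal sketch of one possible technique, occlusion-based saliency, and not necessarily the method this project would adopt. It masks one image patch at a time and records how much the policy's action distribution changes. The names occlusion_saliency and toy_policy, the patch size, and the 8x8 observation are all assumptions made for illustration; a real study would plug in a trained agent and actual game frames.

```python
import numpy as np


def occlusion_saliency(policy, obs, patch=4, baseline=0.0):
    """Score each patch of a 2D observation by how much masking it
    changes the policy's action probabilities (total variation distance)."""
    probs = policy(obs)
    height, width = obs.shape
    saliency = np.zeros((height, width), dtype=float)
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            masked = obs.copy()
            masked[top:top + patch, left:left + patch] = baseline
            change = 0.5 * np.abs(policy(masked) - probs).sum()
            saliency[top:top + patch, left:left + patch] = change
    return saliency


def toy_policy(obs):
    """Hypothetical stand-in for a trained agent: a two-action softmax
    policy that only looks at the top-left 4x4 corner of the observation."""
    signal = obs[:4, :4].mean()
    logits = np.array([signal, -signal])
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


obs = np.ones((8, 8))
saliency = occlusion_saliency(toy_policy, obs)
```

In this toy setup only the top-left patch receives a non-zero score, since that is the only region the hypothetical policy reads. Overlaying such a map on the frame shown to the teacher is one way the visualization could be presented during feedback.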
Goal
Not specified
Contact Details
Sammy Christen (sammy.christen@inf.ethz.ch)

Dr. David Lindlbauer (david.lindlbauer@inf.ethz.ch)

Calendar

Earliest start	2020-03-02
Latest end	No date

Location

Advanced Interactive Technologies (ETHZ)

Labels

Semester Project
Bachelor Thesis
Master Thesis
CLS Student Project [managed by Max Planck ETH Center for Learning Systems]
ETH Zurich (ETHZ)

Topics

Engineering and Technology
Behavioural and Cognitive Sciences