Learning Object Handling for Loading a Dishwasher
Leveraging recent successes in data-driven approaches, this project aims to learn policies that combine visual and haptic information for robustly inserting arbitrary objects into a dishwasher rack. The project will be conducted under joint supervision from Intel Research and the Robotic Systems Lab.
Keywords: reinforcement learning, representation learning, robot manipulation, sim-to-real transfer
Even everyday tasks such as loading and unloading a dishwasher (object insertion) demand continuous scene understanding and decision-making. Prior works in this direction combine learning with vision or expert demonstrations [1, 2]. However, these approaches struggle with visual occlusions, distribution shifts, and noise in robot state estimation, which makes them unsuitable for mobile manipulators that rely mainly on onboard sensing. A natural question is: _Can vision be augmented with other sensory inputs for contact-rich manipulation tasks?_
As a first step in this direction, this project investigates the utility of haptic information for contact-rich manipulation tasks. By simulating diverse scenarios in current state-of-the-art simulators, we aim to train an end-to-end manipulation policy that reasons about contact with the environment and learns to perform the everyday task of loading objects (such as plates and mugs) into a dishwasher. Through a collaboration with Intel Research, the goal is to make the policy robust to scene variations (such as clutter and objects of different shapes) and to perform sim-to-real transfer in a realistic kitchen in the lab, using a 6-DoF robotic arm mounted on ANYmal. Additionally, we would like to explore combining visual and haptic information [2] to further improve the system's performance.
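To make the multimodal idea concrete, below is a minimal, illustrative PyTorch sketch of a late-fusion policy of the kind such a project could start from: separate encoders for camera images, wrench (force/torque) readings, and joint states, fused by concatenation. All class names, dimensions, layer sizes, and the fusion scheme are assumptions for illustration, not project specifications.

```python
import torch
import torch.nn as nn


class MultimodalPolicy(nn.Module):
    """Illustrative late-fusion policy over vision, haptics, and proprioception.

    All dimensions are placeholders: a 64x64 RGB image, a 6-D wrench
    (force/torque), a 12-D joint state, and a 6-D end-effector action.
    """

    def __init__(self, haptic_dim: int = 6, proprio_dim: int = 12, action_dim: int = 6):
        super().__init__()
        # Small CNN encoder for the camera image (depth could be a 4th channel).
        self.vision_encoder = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(128), nn.ReLU(),
        )
        # MLP encoders for the wrench and joint positions/velocities.
        self.haptic_encoder = nn.Sequential(nn.Linear(haptic_dim, 64), nn.ReLU())
        self.proprio_encoder = nn.Sequential(nn.Linear(proprio_dim, 64), nn.ReLU())
        # Late fusion by concatenation; the head outputs bounded action deltas.
        self.head = nn.Sequential(
            nn.Linear(128 + 64 + 64, 128), nn.ReLU(),
            nn.Linear(128, action_dim), nn.Tanh(),
        )

    def forward(self, image, wrench, joint_state):
        z = torch.cat([
            self.vision_encoder(image),
            self.haptic_encoder(wrench),
            self.proprio_encoder(joint_state),
        ], dim=-1)
        return self.head(z)


# Shape check on random inputs (batch of 8).
policy = MultimodalPolicy()
action = policy(torch.randn(8, 3, 64, 64), torch.randn(8, 6), torch.randn(8, 12))
print(action.shape)  # torch.Size([8, 6])
```

In a reinforcement-learning setup, such a network would serve as the actor trained in simulation; Lee et al. [2] instead learn a self-supervised multimodal representation first and train the policy on top of it, which is one of the directions this project could explore.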
References:
1. Schoettler, Gerrit, et al. "Deep reinforcement learning for industrial insertion tasks with visual inputs and natural rewards." IEEE/RSJ IROS (2020).
2. Lee, Michelle A., et al. "Making sense of vision and touch: Learning multimodal representations for contact-rich tasks." IEEE Transactions on Robotics (2020).
Work Packages:
- Literature research on combining multi-modal inputs and learning-based manipulation
- Learning framework for object insertion tasks using haptic and proprioceptive information
- Evaluation of the policy in simulation and comparison to baselines with absent modalities (see the ablation sketch after this list)
- Deployment in a realistic kitchen environment on hardware (6-DoF robotic arm on ANYmal)
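As a hedged illustration of the modality-ablation comparison in the third work package: one cheap baseline zeroes out an input stream at evaluation time (the stricter baseline retrains the policy without that modality). The helper below is hypothetical and assumes the `MultimodalPolicy` sketch shown earlier; it is not a prescribed evaluation protocol.

```python
import torch


@torch.no_grad()
def ablated_action(policy, image, wrench, joint_state, drop=()):
    """Query a policy with selected input streams zeroed out.

    `policy` is any callable taking (image, wrench, joint_state) tensors,
    e.g. the MultimodalPolicy sketch above. Zero-masking is a quick ablation;
    retraining without the modality is the stricter baseline.
    """
    if "vision" in drop:
        image = torch.zeros_like(image)
    if "haptic" in drop:
        wrench = torch.zeros_like(wrench)
    if "proprio" in drop:
        joint_state = torch.zeros_like(joint_state)
    return policy(image, wrench, joint_state)


# Example on dummy inputs: compare full-input vs. vision-ablated actions.
img, f, q = torch.randn(1, 3, 64, 64), torch.randn(1, 6), torch.randn(1, 12)
policy = MultimodalPolicy()  # from the sketch above
a_full = ablated_action(policy, img, f, q)
a_no_vision = ablated_action(policy, img, f, q, drop=("vision",))
```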
Requirements:
- Strong motivation for the topic
- Knowledge of 3D vision and reinforcement learning
- Programming experience with Python and PyTorch
- Experience working with robot hardware is a plus
Please e-mail your application, including your most recent transcript and a short resume, to the following address:
- Mayank Mittal (mittalma@ethz.ch)