Multi-task robotic manipulation with a single language-conditioned policy
Can a robotic manipulator perform a diverse set of tasks? Can it do so based purely on natural language descriptions (e.g., "pick up the apple", "unfold the cloth")? Can the learned policy generalize to unseen tasks? In this project, we aim to tackle these questions by developing a language-conditioned multi-task policy for robotic manipulators.
Recently, there has been great progress on multi-task robotic manipulation through imitation learning [1, 2]. BC-Z [2] achieves promising results, but its framework requires thousands of demonstrations (2,759 for the bin-emptying task). Shridhar et al. [1] instead combine Transporter Networks [3] and CLIP [4] into a single framework, CLIPort. CLIPort achieves an 80% success rate on some tasks with only 10 demonstrations. Furthermore, the authors found that the multi-task CLIPort can even outperform single-task policies on most tasks, and generalizes well to unseen object attributes (e.g., a new color such as pink). This project aims to implement CLIPort on our own robots (e.g., Panda, DynaArm, ALMA) and to improve upon it.
[1] Shridhar, Mohit, Lucas Manuelli, and Dieter Fox. "CLIPort: What and Where Pathways for Robotic Manipulation." Conference on Robot Learning. PMLR, 2022.
[2] Jang, Eric, et al. "BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning." Conference on Robot Learning. PMLR, 2022.
[3] Zeng, Andy, et al. "Transporter Networks: Rearranging the Visual World for Robotic Manipulation." Conference on Robot Learning. PMLR, 2021.
[4] Radford, Alec, et al. "Learning Transferable Visual Models from Natural Language Supervision." International Conference on Machine Learning. PMLR, 2021.
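To make the idea of language conditioning concrete, below is a minimal PyTorch sketch of a policy that fuses a text embedding with visual features via FiLM conditioning and predicts a dense pick-affordance map. This is not CLIPort's actual two-stream architecture; the class name, dimensions, and the random stand-in embedding are illustrative assumptions, and in practice the embedding would come from a frozen CLIP text encoder.

```python
import torch
import torch.nn as nn

class LanguageConditionedPolicy(nn.Module):
    """Simplified sketch: condition visual features on a text embedding
    (e.g., from a frozen CLIP text encoder) via FiLM, then predict a
    dense pick-affordance heatmap over the input image."""

    def __init__(self, text_dim: int = 512, vis_channels: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, vis_channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(vis_channels, vis_channels, 3, padding=1), nn.ReLU(),
        )
        # FiLM: map the text embedding to per-channel scale and shift.
        self.film = nn.Linear(text_dim, 2 * vis_channels)
        self.head = nn.Conv2d(vis_channels, 1, 1)  # per-pixel pick logits

    def forward(self, rgb: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # rgb: (B, 3, H, W) top-down image; text_emb: (B, text_dim)
        feats = self.backbone(rgb)
        gamma, beta = self.film(text_emb).chunk(2, dim=-1)
        feats = gamma[..., None, None] * feats + beta[..., None, None]
        return self.head(feats)  # (B, 1, H, W) affordance logits

# Usage: the argmax pixel of the affordance map gives the pick location.
policy = LanguageConditionedPolicy()
rgb = torch.randn(1, 3, 224, 224)
text_emb = torch.randn(1, 512)  # stand-in for a CLIP embedding of "pick up the apple"
logits = policy(rgb, text_emb)
pick_index = logits.flatten(1).argmax(-1)  # flattened pixel index
```

CLIPort itself uses two streams (a semantic CLIP stream and a spatial Transporter stream) with separate pick and place heads; the single-stream version above only illustrates the conditioning mechanism.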
- Literature review on relevant topics (manipulation, language embedding, imitation learning, etc.)
- Collect teleoperated demonstrations for multiple tasks
- Develop a language-conditioned multi-task policy for a fixed-base manipulator with imitation learning (see the behavior-cloning sketch after this list)
- Extend the work to a mobile manipulator (wheeled or legged)
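For the imitation-learning work package, a behavior-cloning update for a spatial policy like the one sketched above can be as simple as cross-entropy between the predicted per-pixel logits and the demonstrated pick pixel. The function name and batch format below are hypothetical, standing in for whatever the teleoperated-demonstration pipeline provides.

```python
import torch
import torch.nn.functional as F

def bc_train_step(policy, optimizer, batch):
    """One behavior-cloning step: treat pick prediction as classification
    over image pixels, supervised by the demonstrated pick location."""
    rgb, text_emb, pick_rc = batch     # pick_rc: (B, 2) long tensor of (row, col)
    logits = policy(rgb, text_emb)     # (B, 1, H, W)
    _, _, H, W = logits.shape
    target = pick_rc[:, 0] * W + pick_rc[:, 1]  # (row, col) -> flat pixel index
    loss = F.cross_entropy(logits.flatten(1), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage, assuming the LanguageConditionedPolicy sketch above is in scope
# (random tensors stand in for a batch of demonstrations):
policy = LanguageConditionedPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
batch = (torch.randn(8, 3, 224, 224),
         torch.randn(8, 512),
         torch.randint(0, 224, (8, 2)))
loss = bc_train_step(policy, optimizer, batch)
```

Place prediction, end-effector rotation, and the full Transporter-style cross-correlation are omitted here; see [1, 3] for the complete formulation.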
- Theoretical background in robot kinematics and dynamics
- Knowledge of machine learning
- Experience with Python and deep learning frameworks (e.g., PyTorch)
- Highly motivated and research-oriented
Kaixian Qu (kaixqu@ethz.ch). Please include your CV and an up-to-date transcript.