Investigation of the Sim-to-Real gap for the grasping task using a robotic arm
This thesis focuses on enhancing robots' ability to grasp objects precisely in various contexts using deep neural networks, addressing challenges like the sim-to-real gap. The project compares different sensors (RGBD, Time-of-Flight) and introduces a novel grasping method based on reinforcement learning in simulation.
Robotic grasping is a crucial research area today due to its potential impact on various industries. As robots become more integrated into our lives, their ability to handle objects with precision becomes vital. This skill can revolutionize manufacturing, logistics, healthcare, and everyday tasks. However, creating robots that can grasp objects effectively in diverse situations requires advancements in fields like computer vision and mechanical engineering.
This thesis focuses on the perception part. Vision-based robotic grasping involves three key tasks: object localization, object pose estimation, and grasp estimation [1]. Machine learning techniques, particularly deep neural networks, are employed to train models that predict optimal grasp points from object attributes and sensor data.
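To make the grasp-estimation step concrete, the sketch below (assuming PyTorch; the architecture, input resolution, and output parametrization are illustrative choices, not the method developed in this thesis) shows a small convolutional network that maps an RGB-D frame to a planar grasp: pixel position, gripper rotation, and opening width.

```python
# Minimal sketch of a planar grasp predictor (illustrative only):
# maps a 4-channel RGB-D image to grasp parameters (x, y, theta, width).
import torch
import torch.nn as nn

class GraspNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # Four outputs: pixel coordinates (x, y), rotation theta, gripper width.
        self.head = nn.Linear(128, 4)

    def forward(self, rgbd):  # rgbd: (batch, 4, H, W)
        features = self.backbone(rgbd).flatten(1)
        return self.head(features)

model = GraspNet()
dummy = torch.rand(1, 4, 224, 224)  # one RGB-D frame
print(model(dummy).shape)           # torch.Size([1, 4])
```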
Reinforcement learning and simulation-based training are used to refine grasp policies through trial and error [2,3]. A notable challenge, however, is the sim-to-real gap: models trained in simulated environments struggle to perform as effectively in the real world, because simulation lacks the complexity and variability of real-world scenarios, which leads to poor generalization. Bridging this gap remains a central concern in advancing robotic grasping capabilities, requiring techniques that enhance model transferability and adaptability across different contexts.
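A widely used technique for narrowing this gap is domain randomization: simulation parameters are resampled every episode so the learned policy cannot overfit to one fixed simulated world. Below is a minimal sketch; the `sim` handle, its attribute names, and the ranges are hypothetical placeholders, not a specific simulator's API.

```python
# Sketch of domain randomization for sim-to-real transfer (illustrative):
# at the start of each episode, physics and sensor parameters are
# resampled so the policy cannot overfit to one fixed simulation.
import random
from types import SimpleNamespace

def randomize_sim(sim):
    """Perturb simulator parameters; attribute names and ranges are
    hypothetical placeholders, not a specific simulator's API."""
    sim.object_friction = random.uniform(0.4, 1.2)      # sliding friction
    sim.object_mass_scale = random.uniform(0.8, 1.2)    # +-20% object mass
    sim.camera_pos_noise = [random.gauss(0.0, 0.005) for _ in range(3)]  # m
    sim.depth_noise_std = random.uniform(0.001, 0.01)   # depth noise, m
    sim.light_intensity = random.uniform(0.5, 1.5)      # relative brightness

# Stand-in for a real simulation handle, just to make the sketch runnable.
sim = SimpleNamespace()
for episode in range(3):
    randomize_sim(sim)          # new parameters every episode
    print(episode, vars(sim))   # then: obs = env.reset(); run the policy ...
```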
In this project, you will first investigate the state of the art in grasping and implement an existing approach to serve as a baseline. The focus will be on approaches that run on battery-operated embedded systems without a cloud connection. Next, different sensors (RGBD, Time-of-Flight) and their placement will be compared and evaluated, in particular with respect to their sim-to-real gap. Building on these insights, a new method will be developed in simulation and tested in the real world.
**Prerequisites**
- Solid programming experience (Python)
- Background knowledge in computer vision
- Optional: experience with ROS
**Character**
- 20% Literature study
- 40% Algorithmic Design and Implementation
- 30% Platform implementation and Testing
- 10% Report and Presentation
**Tasks**
- Review the literature on grasping with cameras and other sensors
- Get familiarized with the robotic arm and sensors
- Implement an existing algorithm as a baseline
- Compare RGBD and Time-of-Flight sensors in terms of their sim-to-real gap (a simple gap metric is sketched after this list)
- Develop and evaluate a new grasping method based on reinforcement learning in simulation
- Deploy and test the final approach on an embedded platform such as the Nvidia Jetson
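As a concrete starting point for the sensor comparison above, the sim-to-real gap can be quantified per sensor as the drop in grasp success rate from simulated to real trials. A minimal sketch follows; all trial outcomes below are placeholder values, not measured results.

```python
# Sketch of a simple per-sensor sim-to-real gap metric (illustrative):
# the gap is taken as the drop in grasp success rate from simulation
# to real-world trials. The outcome lists are placeholders, not results.

def success_rate(outcomes):
    """Fraction of successful grasps in a list of boolean trial outcomes."""
    return sum(outcomes) / len(outcomes)

def sim2real_gap(sim_outcomes, real_outcomes):
    """Absolute drop in success rate from simulation to the real robot."""
    return success_rate(sim_outcomes) - success_rate(real_outcomes)

trials = {
    "RGBD": {"sim": [True] * 9 + [False], "real": [True] * 7 + [False] * 3},
    "ToF":  {"sim": [True] * 8 + [False] * 2, "real": [True] * 7 + [False] * 3},
}
for sensor, data in trials.items():
    print(sensor, f"gap = {sim2real_gap(data['sim'], data['real']):.2f}")
```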
**References**
[1] G. Du, K. Wang, S. Lian, and K. Zhao, “Vision-based Robotic Grasping From Object Localization, Object Pose Estimation to Grasp Estimation for Parallel Grippers: A Review,” Artif Intell Rev, vol. 54, no. 3, pp. 1677–1734, Mar. 2021
[2] C. Eppner, A. Mousavian, and D. Fox, “A Billion Ways to Grasp: An Evaluation of Grasp Sampling Schemes on a Dense, Physics-based Grasp Data Set.” arXiv, Dec. 11, 2019. Accessed: Aug. 09, 2023
[3] C. Sun et al., “Fully Autonomous Real-World Reinforcement Learning with Applications to Mobile Manipulation.” arXiv, Dec. 06, 2021. Accessed: Jan. 17, 2023
**Contact**
Steven Marty (martyste@pbl.ee.ethz.ch)
Davide Plozza (davide.plozza@pbl.ee.ethz.ch)