Towards Learning-based Visual Human Robot Interaction
The goal of this project is to create a full pipeline for visual, collaborative human-robot interaction. To this end, methods from classical computer vision and robotics will be combined with modern deep learning techniques.
Keywords: deep learning, human robot interaction, computer vision, segmentation, robotics
While the ultimate goal in robotics is full autonomy, robots will remain limited to predefined tasks and applications for the foreseeable future. To facilitate the deployment of robots in more challenging applications and environments, we expect robots to work alongside and collaborate with humans. In this project we will work towards a vision-based pipeline for understanding human gestures and expressions, in order to allow for safe operation next to humans and accurate command following.
To detect humans in the scene in 3D, we will build on our lab's prior work on panoptic segmentation [1] and mapping. Given the person's 3D location, the inspection head with its zoom camera is then used to obtain a high-resolution full-body observation of the subject. We will then fit a simplified [2] or a more comprehensive [3] body model to the observed person in order to obtain a lower-dimensional representation of the body pose.
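As a rough illustration of the step from a 3D person location to a full-body zoom observation, the following minimal Python sketch computes pan and tilt angles and a focal length under a pinhole camera model. It is not the project's implementation: the function name, the frame convention (x forward, y left, z up, origin at the inspection head), the assumed person height, the sensor height, and the fill fraction are all illustrative assumptions.

```python
import math


def aim_zoom_camera(person_xyz, person_height_m=1.8,
                    sensor_height_mm=5.6, fill_fraction=0.8):
    """Return (pan_rad, tilt_rad, focal_mm) for a full-body view.

    Assumptions (hypothetical, not from the project):
    - person_xyz is the person's body center in the camera-head frame,
      with x forward, y left, z up, in meters.
    - The camera follows a pinhole model: an object of height H at
      distance d projects to a height of f * H / d on the sensor.
    """
    x, y, z = person_xyz
    ground_dist = math.hypot(x, y)
    distance = math.sqrt(x * x + y * y + z * z)
    if distance == 0.0:
        raise ValueError("person location coincides with the camera")

    # Rotate about the vertical axis towards the person, then tilt up/down.
    pan = math.atan2(y, x)
    tilt = math.atan2(z, ground_dist)

    # Choose the focal length so the person covers fill_fraction of the
    # sensor height: f * H / d = fill_fraction * sensor_height.
    focal_mm = fill_fraction * sensor_height_mm * distance / person_height_m
    return pan, tilt, focal_mm


if __name__ == "__main__":
    pan, tilt, focal = aim_zoom_camera((10.0, 0.0, 0.0))
    print(f"pan={pan:.3f} rad, tilt={tilt:.3f} rad, focal={focal:.1f} mm")
```

In practice the resulting focal length would have to be clamped to the zoom range of the actual camera, and the person height could come from the segmented 3D map rather than a fixed default.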
Depending on the time frame of the project, further components could include classifying the intended commands and using them in an application, for example guiding the excavator to grasp stones or dig trenches.
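To make the command classification idea concrete, here is a deliberately simple, rule-based Python sketch that maps 2D body keypoints to a command label. Everything in it is an assumption for illustration: the keypoint names, the gesture rules, and the mapping of gestures to the labels "stop", "dig", and "grasp" are hypothetical, and a learned classifier on the fitted body model may well replace such rules in the project.

```python
def classify_command(keypoints):
    """Map 2D keypoints to a hypothetical command label.

    keypoints: dict from joint name to (x, y) in image pixels, with the
    y axis pointing down (so a smaller y means higher in the image).
    The joint names and gesture-to-command rules are illustrative only.
    """
    left_shoulder = keypoints["left_shoulder"]
    right_shoulder = keypoints["right_shoulder"]
    left_wrist = keypoints["left_wrist"]
    right_wrist = keypoints["right_wrist"]

    # A hand counts as raised when the wrist is above its shoulder.
    left_up = left_wrist[1] < left_shoulder[1]
    right_up = right_wrist[1] < right_shoulder[1]

    if left_up and right_up:
        return "stop"   # both hands raised
    if right_up:
        return "dig"    # only right hand raised
    if left_up:
        return "grasp"  # only left hand raised
    return "none"       # no recognized gesture


if __name__ == "__main__":
    pose = {
        "left_shoulder": (220, 200), "right_shoulder": (180, 200),
        "left_wrist": (230, 120), "right_wrist": (170, 110),
    }
    print(classify_command(pose))
```

Rules like these are easy to debug during hardware experiments, but they ignore depth and temporal information, which a model trained on a collected dataset could exploit.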
[1] Carion et al. "End-to-End Object Detection with Transformers"
[2] Cao et al. "OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields"
[3] Feng et al. "Collaborative Regression of Expressive Bodies using Moderation"
- Literature research
- Creation of the code pipeline from provided components
- Dataset collection
- Testing during hardware experiments and on datasets
- Potentially: development of an application making use of the developed pipeline
- Interest in and fascination for software integration and machine learning
- Experience in C++, Python, ROS
- Computer Vision and Geometry experience
- Julian Nubert: nubertj@ethz.ch
- Gabriel Waibel: waibelg@ethz.ch