Leveraging Human Motion Data from Videos for Humanoid Robot Motion Learning
Humanoid robotics has advanced to a stage where mimicking complex human motions with high accuracy is crucial for tasks ranging from entertainment to human-robot interaction in dynamic environments. Traditional approaches to motion learning for humanoid robots rely heavily on motion capture (MoCap) data. However, acquiring large amounts of high-quality MoCap data is both expensive and logistically challenging. In contrast, video footage of human activities, such as sports events or dance performances, is widely available and offers an abundant source of motion data.
Building on recent advances in extracting and utilizing human motion from videos, such as WHAM (see the paper "WHAM: Reconstructing World-Grounded Humans with Accurate 3D Motion") and the approach of "Learning Physically Simulated Tennis Skills from Broadcast Videos", this project aims to develop a system that extracts human motion from videos and applies it to teach a humanoid robot to perform similar actions. The primary focus will be on extracting dynamic and expressive motions from videos, such as soccer player celebrations, and using these extracted motions as reference data for reinforcement learning (RL) and imitation learning on a humanoid robot.
**Work packages**
Literature research
Global motion reconstruction from videos.
Learning from reconstructed motion demonstrations with reinforcement learning on a humanoid robot.
**Requirements**
Strong programming skills in Python
Experience in computer vision and reinforcement learning
**Publication**
This project will mostly focus on algorithm design and system integration. Promising results will be submitted to machine learning / computer vision / robotics conferences.
**Related literature**
Yuan, Y., Iqbal, U., Molchanov, P., Kitani, K. and Kautz, J., 2022. GLAMR: Global occlusion-aware human mesh recovery with dynamic cameras. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 11038-11049).
Zhang, H., Yuan, Y., Makoviychuk, V., Guo, Y., Fidler, S., Peng, X.B. and Fatahalian, K., 2023. Learning physically simulated tennis skills from broadcast videos. ACM Transactions on Graphics (TOG), 42(4), pp.1-14.
Shin, S., Kim, J., Halilaj, E. and Black, M.J., 2024. WHAM: Reconstructing world-grounded humans with accurate 3D motion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2070-2080).
Peng, X.B., Abbeel, P., Levine, S. and Van de Panne, M., 2018. DeepMimic: Example-guided deep reinforcement learning of physics-based character skills. ACM Transactions on Graphics (TOG), 37(4), pp.1-14.
The objective of this project is to develop a robust system for extracting human motions from video footage and transferring them to a humanoid robot using learning-from-demonstration techniques. The system will be designed to handle the noisy data typical of video-based motion extraction and to ensure that the humanoid robot replicates the extracted motions with high fidelity while respecting physical constraints.
**Proposed Methodology**
**Video Data Collection and Motion Extraction**:
- Collect video footage of soccer player celebrations and other dynamic human activities.
- Start from existing monocular human pose/motion estimation algorithms to extract 3D motion data from the videos.
- Incorporate physics-based corrections, similar to those employed in WHAM, to address issues like jitter, foot sliding, and ground penetration in the extracted motion data (see the sketch after this list).
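As a concrete starting point for the correction step, below is a minimal sketch of such post-hoc cleanup, assuming the extraction stage yields per-frame 3D joint positions as a (T, J, 3) array with z-up coordinates in meters. The joint indices, contact threshold, and filter parameters are illustrative assumptions, not part of the project specification.

```python
# Minimal sketch of post-hoc motion cleanup. Assumes per-frame 3D joint
# positions of shape (T, J, 3), z-up, in meters; the indices and
# thresholds below are illustrative placeholders.
import numpy as np
from scipy.signal import savgol_filter

FOOT_JOINTS = [7, 8]   # hypothetical indices of the left/right ankle joints
CONTACT_VEL = 0.1      # m/s: below this speed, treat the foot as in contact
FPS = 30               # assumed video frame rate

def clean_motion(joints: np.ndarray) -> np.ndarray:
    """Reduce jitter, ground penetration, and foot sliding in raw estimates."""
    # 1) Temporal smoothing suppresses per-frame estimation jitter.
    motion = savgol_filter(joints, window_length=9, polyorder=3, axis=0)

    # 2) Clamp ground penetration: no joint may dip below the ground plane.
    motion[..., 2] = np.maximum(motion[..., 2], 0.0)

    # 3) Suppress foot sliding: while a foot moves slower than the contact
    #    threshold, pin its horizontal position to where contact began.
    for j in FOOT_JOINTS:
        speed = np.linalg.norm(np.diff(motion[:, j, :2], axis=0), axis=-1) * FPS
        in_contact = np.concatenate([[False], speed < CONTACT_VEL])
        anchor = None
        for t in range(motion.shape[0]):
            if in_contact[t]:
                if anchor is None:
                    anchor = motion[t, j, :2].copy()
                motion[t, j, :2] = anchor
            else:
                anchor = None
    return motion
```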
**Motion Learning**:
- Apply existing learning-from-demonstration algorithms in a simulated environment to replicate the kinematic motions reconstructed from the videos while respecting physical constraints, using reinforcement learning (a reward sketch follows below).
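For the tracking objective, a DeepMimic-style imitation reward (Peng et al., 2018, cited in the related literature above) is a natural reference point. The sketch below reproduces the paper's weighted exponential tracking terms; the weights and scales follow the paper, while the flat-array inputs are a simplification for illustration (the original measures pose differences on quaternions).

```python
# DeepMimic-style imitation reward (Peng et al., 2018): the policy is
# rewarded for tracking the reference motion reconstructed from video.
# Weights and scales follow the paper; flattened-array inputs simplify
# the original quaternion-based pose distance.
import numpy as np

def imitation_reward(q, q_ref, v, v_ref, ee, ee_ref, com, com_ref) -> float:
    """One time step. q: joint rotations, v: joint velocities,
    ee: end-effector positions, com: center of mass; the *_ref
    variants come from the reference motion. All flat numpy arrays."""
    r_pose = np.exp(-2.0 * np.sum((q_ref - q) ** 2))       # joint orientations
    r_vel = np.exp(-0.1 * np.sum((v_ref - v) ** 2))        # joint velocities
    r_ee = np.exp(-40.0 * np.sum((ee_ref - ee) ** 2))      # hands/feet targets
    r_com = np.exp(-10.0 * np.sum((com_ref - com) ** 2))   # center of mass
    return 0.65 * r_pose + 0.10 * r_vel + 0.15 * r_ee + 0.10 * r_com
```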
**Implementation on Humanoid Robot**:
- Deployment on real hardware is encouraged: our humanoid robot is in the lab, waiting for you.
Please include your CV and transcript in the submission.
**Manuel Kaufmann**
https://ait.ethz.ch/people/kamanuel
kamanuel@inf.ethz.ch
**Chenhao Li**
https://breadli428.github.io/
chenhli@ethz.ch