ETH Competence Center for Materials and Processes (MaP)
Acronym: MaP
Homepage: http://www.map.ethz.ch/
Country: Switzerland
ZIP, City: 8093 Zürich
Address: Leopold-Ruzicka-Weg 4
Phone: +41 44 633 37 53
Type: Academy
Parent organization: ETH Zurich
Current organization: ETH Competence Center for Materials and Processes (MaP)
Open Opportunities

We are developing an acoustofluidic platform that can increase the efficiency of microtissue histology. However, most steps in this long workflow are currently performed manually. To achieve high throughput, we are interested in developing a 3-axis linear manipulator, compatible with the established acoustofluidic-enhanced histology workflow, that automates most of these steps. - Mechanical Engineering, Robotics and Mechatronics
- ETH Zurich (ETHZ), Master Thesis
| Robotics is dominated by on-policy reinforcement learning: the paradigm of training a robot controller by iteratively interacting with the environment and maximizing some objective. A crucial idea that makes this work is the advantage function. On each policy update, algorithms typically sum up the gradients of the log-probabilities of all actions taken in the robot simulation. The advantage function increases or decreases the probabilities of these taken actions by comparing their “goodness” against a baseline. Current advantage estimation methods use a value function to aggregate robot experience and hence decrease variance. This improves sample efficiency at the cost of introducing some bias.
Stably training large language models via reinforcement learning is well known to be a challenging task. A line of recent work [1, 2] has used Group-Relative Policy Optimization (GRPO) to achieve this feat. In GRPO, a group of answers is generated for each query-answer pair. The advantage is calculated from how much better a given answer is than the average answer to the same query. In this formulation, no value function is required.
Can we adapt GRPO to robot learning? Value functions are known to cause issues with training stability [3] and to result in biased advantage estimates [4]. We are in the age of GPU-accelerated RL [5], training policies by simulating thousands of robot instances simultaneously. This makes a new Monte Carlo (MC) approach to RL timely, feasible, and appealing. In this project, the student will be tasked with investigating the limitations of value-function-based advantage estimation. Using GRPO as a starting point, the student will then develop MC-based algorithms that exploit the GPU's parallel simulation capabilities for stable, unbiased variance reduction in RL training while maintaining a competitive wall-clock time.
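As a minimal sketch of the idea (not the project's prescribed algorithm): with thousands of parallel simulation instances, the group-relative advantage can be computed by standardizing each rollout's return against its group, with no learned value function. Function and variable names here are illustrative.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages for a group of rollouts of the same task.

    Each rollout's advantage is its reward standardized against the
    group mean and standard deviation -- a pure Monte Carlo baseline,
    with no value function involved.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Example: returns from 8 parallel simulation instances of one task.
adv = group_relative_advantages([1.0, 0.5, 2.0, 0.0, 1.5, 1.0, 0.5, 1.5])
# Rollouts above the group mean get positive advantages, those below negative.
```

In an actual policy-gradient update, these advantages would weight the summed log-probability gradients of each rollout's actions, exactly where a GAE-style estimate would otherwise appear.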
- Intelligent Robotics, Knowledge Representation and Machine Learning, Robotics and Mechatronics
- Bachelor Thesis, Master Thesis, Semester Project
| Plastic waste and the resulting environmental pollution are major challenges of our time. One of the problems is the mismatch between degradability and durability in plastics. Single-use plastics like packaging materials should be easy to degrade to facilitate recycling after use. However, these single-use plastics are often very stable and hard to recycle. Performance plastics need to last through their lifetime without a significant decrease in material properties, but aging in these materials eventually leads to material failure and replacement. Both situations generate plastic waste. Therefore, we want to synthesize a material that can degrade on demand yet exhibits enhanced durability on longer timescales, satisfying the needs of single-use plastics and performance plastics, respectively. - Organic Chemical Synthesis
- Master Thesis, Semester Project
| This project aims to develop a deep learning-powered microscope capable of imaging and classifying various white blood cell (WBC) subtypes from blood smears. - Engineering and Technology
- Internship, Master Thesis, Semester Project
| The advancement in humanoid robotics has reached a stage where mimicking complex human motions with high accuracy is crucial for tasks ranging from entertainment to human-robot interaction in dynamic environments. Traditional approaches in motion learning, particularly for humanoid robots, rely heavily on motion capture (MoCap) data. However, acquiring large amounts of high-quality MoCap data is both expensive and logistically challenging. In contrast, video footage of human activities, such as sports events or dance performances, is widely available and offers an abundant source of motion data.
Building on recent advancements in extracting and utilizing human motion from videos, such as the method proposed in WHAM (refer to the paper "Learning Physically Simulated Tennis Skills from Broadcast Videos"), this project aims to develop a system that extracts human motion from videos and applies it to teach a humanoid robot how to perform similar actions. The primary focus will be on extracting dynamic and expressive motions from videos, such as soccer player celebrations, and using these extracted motions as reference data for reinforcement learning (RL) and imitation learning on a humanoid robot. - Engineering and Technology
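One common way to turn video-extracted poses into an RL objective (used, e.g., in DeepMimic-style imitation learning; this is a generic sketch, not the project's fixed method, and the weight `w` is an assumed hyperparameter) is an exponentiated tracking error between the robot's pose and the reference pose:

```python
import numpy as np

def imitation_reward(q, q_ref, w=2.0):
    """Per-timestep imitation reward: exponentiated negative squared
    joint-angle error between the robot pose q and the reference pose
    q_ref extracted from video. Values lie in (0, 1]; perfect tracking
    gives 1, and larger deviations decay smoothly toward 0."""
    err = np.sum((np.asarray(q) - np.asarray(q_ref)) ** 2)
    return float(np.exp(-w * err))
```

This term is typically summed with task rewards (balance, foot contact) so the policy reproduces the extracted motion while remaining physically stable.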
- Master Thesis
| Agility and rapid decision-making are vital for humanoid robots to safely and effectively operate in dynamic, unstructured environments. In human contexts—whether in crowded spaces, industrial settings, or collaborative environments—robots must be capable of reacting to fast, unpredictable changes in their surroundings. This includes not only planned navigation around static obstacles but also rapid responses to dynamic threats such as falling objects, sudden human movements, or unexpected collisions. Developing such reactive capabilities in legged robots remains a significant challenge due to the complexity of real-time perception, decision-making under uncertainty, and balance control.
Humanoid robots, with their human-like morphology, are uniquely positioned to navigate and interact with human-centered environments. However, achieving fast, dynamic responses—especially while maintaining postural stability—requires advanced control strategies that integrate perception, motion planning, and balance control within tight time constraints.
The task of dodging fast-moving objects, such as balls, provides an ideal testbed for studying these capabilities. It encapsulates several core challenges: rapid object detection and trajectory prediction, real-time motion planning, dynamic stability maintenance, and reactive behavior under uncertainty. Moreover, it presents a simplified yet rich framework to investigate more general collision avoidance strategies that could later be extended to complex real-world interactions.
In robotics, reactive motion planning for dynamic environments has been widely studied, but primarily in the context of wheeled robots or static obstacle fields. Classical approaches focus on precomputed motion plans or simple reactive strategies, often unsuitable for highly dynamic scenarios where split-second decisions are critical.
In the domain of legged robotics, maintaining balance while executing rapid, evasive maneuvers remains a challenging problem. Previous work on dynamic locomotion has addressed agile behaviors like running, jumping, or turning (e.g., Hutter et al., 2016; Kim et al., 2019), but these movements are often planned in advance rather than triggered reactively. More recent efforts have leveraged reinforcement learning (RL) to enable robots to adapt to dynamic environments, demonstrating success in tasks such as obstacle avoidance, perturbation recovery, and agile locomotion (Peng et al., 2017; Hwangbo et al., 2019). However, many of these approaches still struggle with real-time constraints and robustness in high-speed, unpredictable scenarios.
Perception-driven control in humanoids, particularly for tasks requiring fast reactions, has seen advances through sensor fusion, visual servoing, and predictive modeling. For example, integrating vision-based object tracking with dynamic motion planning has enabled robots to perform tasks like ball catching or blocking (Ishiguro et al., 2002; Behnke, 2004). Yet, dodging requires a fundamentally different approach: instead of converging toward an object (as in catching), the robot must predict and strategically avoid the object’s trajectory while maintaining balance—often in the presence of limited maneuvering time.
Dodgeball-inspired robotics research has been explored in limited contexts, primarily using wheeled robots or simplified agents in simulations. Few studies have addressed the challenges of high-speed evasion combined with the complexities of humanoid balance and multi-joint coordination. This project aims to bridge that gap by developing learning-based methods that enable humanoid robots to reactively avoid fast-approaching objects in real time, while preserving stability and agility.
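To make the perception-to-decision step concrete, here is a minimal sketch of the trajectory-prediction trigger described above, assuming drag-free ballistic flight and a tracked initial position and velocity (the threshold, horizon, and names are illustrative assumptions, not the project's specification):

```python
import numpy as np

G = np.array([0.0, 0.0, -9.81])  # gravitational acceleration, m/s^2

def predict_ball(p0, v0, t):
    """Ballistic position of a tracked ball at time t (drag neglected)."""
    p0, v0 = np.asarray(p0, float), np.asarray(v0, float)
    return p0 + v0 * t + 0.5 * G * t ** 2

def must_dodge(p0, v0, robot_pos, radius=0.5, horizon=1.0, dt=0.02):
    """True if the predicted trajectory passes within `radius` meters of
    the robot inside the planning horizon -- the trigger for an evasive
    maneuver, checked at the controller's time step dt."""
    for t in np.arange(0.0, horizon, dt):
        if np.linalg.norm(predict_ball(p0, v0, t) - np.asarray(robot_pos)) < radius:
            return True
    return False
```

A real system would add measurement noise handling and re-predict at every perception update; the point of the sketch is that the dodge decision reduces to a fast closest-approach test over a short horizon.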
- Engineering and Technology
- Master Thesis
| Humanoid robots, designed to mimic the structure and behavior of humans, have seen significant advancements in kinematics, dynamics, and control systems. Teleoperation of humanoid robots involves complex control strategies to manage bipedal locomotion, balance, and interaction with environments. Research in this area has focused on developing robots that can perform tasks in environments designed for humans, from simple object manipulation to navigating complex terrains.
Reinforcement learning has emerged as a powerful method for enabling robots to learn from interactions with their environment, improving their performance over time without explicit programming for every possible scenario. In the context of humanoid robotics and teleoperation, RL can be used to optimize control policies, adapt to new tasks, and improve the efficiency and safety of human-robot interactions. Key challenges include the high dimensionality of the action space, the need for safe exploration, and the transfer of learned skills across different tasks and environments.
Integrating human motion tracking with reinforcement learning on humanoid robots represents a cutting-edge area of research. This approach involves using human motion data as input to train RL models, enabling the robot to learn more natural and human-like movements. The goal is to develop systems that can not only replicate human actions in real time but also adapt and improve their responses over time through learning. Challenges in this area include ensuring real-time performance, dealing with the variability of human motion, and maintaining the stability and safety of the humanoid robot.
- Information, Computing and Communication Sciences
- Master Thesis
| Acoustic standing waves can be used to manipulate physical objects in both gas and liquid environments. This project investigates their effects on particle flow and selectivity in air, considering various particle sizes and weights. Through modeling, simulations, and experimental validation, we aim to characterize the selectivity of these waves and develop a compact driver circuit for practical implementation. The student will work closely with Honeywell engineers on test setups, electronic designs, and prototyping. A successful outcome may lead to a subsequent R&D phase or PhD project in collaboration with Honeywell to further develop these findings. - Electrical and Electronic Engineering, Mechanical and Industrial Engineering, Physics
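As background for the modeling task: in a 1-D standing wave, small dense particles in air are pushed toward the pressure nodes, which are spaced half a wavelength apart. A back-of-the-envelope sketch (node offsets from the boundary are simplified here; the exact positions depend on the transducer/reflector boundary conditions):

```python
def pressure_node_positions(freq_hz, length_m, c=343.0):
    """Approximate positions of pressure nodes in a 1-D standing wave
    across a cavity of the given length, assuming nodes spaced half a
    wavelength apart from one end. c is the speed of sound in the
    medium (343 m/s for air at ~20 degrees C)."""
    half_wavelength = c / freq_hz / 2.0
    n_nodes = int(length_m / half_wavelength)
    return [i * half_wavelength for i in range(1, n_nodes + 1)]

# A 40 kHz transducer in air: trapping sites every ~4.3 mm.
nodes = pressure_node_positions(40e3, 0.02)
```

Whether a given particle actually collects at a node depends on its size and density relative to the medium (via the acoustic radiation force), which is exactly the selectivity this project aims to characterize.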
- Master Thesis
| Humanoid robots hold the promise of navigating complex, human-centric environments with agility and adaptability. However, training these robots to perform dynamic behaviors such as parkour—jumping, climbing, and traversing obstacles—remains a significant challenge due to the high-dimensional state and action spaces involved. Traditional Reinforcement Learning (RL) struggles in such settings, primarily due to sparse rewards and the extensive exploration needed for complex tasks.
This project proposes a novel approach to address these challenges by incorporating loosely guided references into the RL process. Instead of relying solely on task-specific rewards or complex reward shaping, we introduce a simplified reference trajectory that serves as a guide during training. This trajectory, often limited to the robot's base movement, reduces the exploration burden without constraining the policy to strict tracking, allowing the emergence of diverse and adaptable behaviors.
Reinforcement Learning has demonstrated remarkable success in training agents for tasks ranging from game playing to robotic manipulation. However, its application to high-dimensional, dynamic tasks like humanoid parkour is hindered by two primary challenges:
- Exploration Complexity: The vast state-action space of humanoids leads to slow convergence, often requiring millions of training steps.
- Reward Design: Sparse rewards make it difficult for the agent to discover meaningful behaviors, while dense rewards demand intricate and often brittle design efforts.
By introducing a loosely guided reference—a simple trajectory representing the desired flow of the task—we aim to reduce the exploration space while maintaining the flexibility of RL. This approach bridges the gap between pure RL and demonstration-based methods, enabling the learning of complex maneuvers like climbing, jumping, and dynamic obstacle traversal without heavy reliance on reward engineering or exact demonstrations.
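As an illustrative sketch of what "loosely guided" could mean in reward terms (the weights and length scale are assumed hyperparameters, not the project's final design): the sparse task reward is augmented with a soft guidance term over the base position only, so the policy is attracted toward the reference flow without being penalized into strict tracking.

```python
import numpy as np

def guided_reward(task_reward, base_pos, ref_pos, sigma=0.5, w_guide=0.3):
    """Combine a sparse task reward with a loose guidance bonus.

    The guidance term rewards staying near the reference base trajectory
    but decays smoothly: sigma controls how 'loose' the guidance is, and
    the small weight w_guide keeps the task reward dominant, leaving the
    policy free to discover its own limb-level behavior."""
    dist2 = np.sum((np.asarray(base_pos) - np.asarray(ref_pos)) ** 2)
    guidance = float(np.exp(-dist2 / (2.0 * sigma ** 2)))
    return task_reward + w_guide * guidance
```

Because the guidance only shapes the base trajectory, climbing and jumping styles can still emerge from exploration rather than from exact demonstrations.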
- Information, Computing and Communication Sciences
- Master Thesis
| Piezoelectric elements are widely used for particle manipulation and atomization, with applications in humidification, cooling, and medical aerosol generation. However, temperature and environmental factors can impact the efficiency of vaporization and the properties of the generated droplets. Additionally, the heat generated by piezo elements affects particle size and flux, requiring careful control.
This project will investigate the effect of piezoelectric elements on liquid and gel atomization, optimizing power consumption, repeatability, and calibration. A proof-of-concept demonstrator will be developed to study these parameters under controlled conditions. A successful outcome may lead to a subsequent R&D phase or PhD project in collaboration with Honeywell to further develop these findings. - Electrical and Electronic Engineering, Mechanical and Industrial Engineering, Physics
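For a first-order feel of how drive frequency sets droplet size, one widely used estimate is Lang's correlation for ultrasonic atomization (a rough capillary-wave model; real median diameters also depend on viscosity, power, and temperature, which is part of what the demonstrator would characterize):

```python
import math

def lang_droplet_diameter(freq_hz, surface_tension=0.0728, density=998.0):
    """Median droplet diameter from ultrasonic atomization, estimated
    with Lang's correlation: d = 0.34 * (8*pi*sigma / (rho * f**2))**(1/3).
    Defaults are for water at ~20 degrees C (sigma in N/m, rho in kg/m^3);
    result is in meters."""
    return 0.34 * (8.0 * math.pi * surface_tension /
                   (density * freq_hz ** 2)) ** (1.0 / 3.0)

# Droplet diameter shrinks as f**(-2/3): higher frequency, finer mist.
d = lang_droplet_diameter(1.7e6)  # ~3 micrometers for a 1.7 MHz element
```

The f^(-2/3) scaling is why medical nebulizers run in the MHz range while coarse humidifiers can use much lower frequencies.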
- Master Thesis