Learning and Adaptive Systems
Open Opportunities
| The project aims to explore curriculum learning techniques to push the limits of quadruped running speed using reinforcement learning. By systematically designing and implementing curricula that guide the learning process, the project seeks to develop a quadruped controller capable of achieving the fastest possible forward locomotion. This involves not only optimizing the learning process but also ensuring the robustness and adaptability of the learned policies across various running conditions. - Engineering and Technology
- Master Thesis
| The advancement in humanoid robotics has reached a stage where mimicking complex human motions with high accuracy is crucial for tasks ranging from entertainment to human-robot interaction in dynamic environments. Traditional approaches in motion learning, particularly for humanoid robots, rely heavily on motion capture (MoCap) data. However, acquiring large amounts of high-quality MoCap data is both expensive and logistically challenging. In contrast, video footage of human activities, such as sports events or dance performances, is widely available and offers an abundant source of motion data.
Building on recent advancements in extracting and utilizing human motion from videos, such as the method proposed in WHAM (refer to the paper "Learning Physically Simulated Tennis Skills from Broadcast Videos"), this project aims to develop a system that extracts human motion from videos and applies it to teach a humanoid robot how to perform similar actions. The primary focus will be on extracting dynamic and expressive motions from videos, such as soccer player celebrations, and using these extracted motions as reference data for reinforcement learning (RL) and imitation learning on a humanoid robot. - Engineering and Technology
- Master Thesis
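As an illustration of how video-extracted motions could serve as reference data for RL, the sketch below shows one common form of imitation reward: an exponential of the squared joint-position error between the robot and the reference pose. The function name `tracking_reward`, the `scale` parameter, and the choice of joint positions as the tracked quantity are assumptions made for this example, not details from the project description.

```python
import math
from typing import Sequence


def tracking_reward(
    joint_pos: Sequence[float],
    ref_joint_pos: Sequence[float],
    scale: float = 2.0,
) -> float:
    """Reward in (0, 1] that is 1 when the robot matches the reference pose.

    joint_pos and ref_joint_pos are joint angles in radians for the same
    ordered set of joints. `scale` (hypothetical default) controls how
    quickly the reward decays as the pose error grows.
    """
    if len(joint_pos) != len(ref_joint_pos):
        raise ValueError("robot and reference poses must have the same length")
    squared_error = sum((q - r) ** 2 for q, r in zip(joint_pos, ref_joint_pos))
    return math.exp(-scale * squared_error)


# A perfect match scores 1.0; a pose that is off scores strictly less.
perfect = tracking_reward([0.1, -0.3], [0.1, -0.3])
imperfect = tracking_reward([0.4, -0.3], [0.1, -0.3])
print(perfect, imperfect)
```

In practice such a term would typically be one of several (for example, velocity or end-effector tracking), and the reference poses would first have to be retargeted from the human skeleton to the robot's joints.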
| In the burgeoning field of deep reinforcement learning (RL), agents autonomously develop complex behaviors through a process of trial and error. Yet, the application of RL across various domains faces notable hurdles, particularly in devising appropriate reward functions. Traditional approaches often resort to sparse rewards for simplicity, though these prove inadequate for training efficient agents. Consequently, real-world applications may necessitate elaborate setups, such as employing accelerometers for door interaction detection, thermal imaging for action recognition, or motion capture systems for precise object tracking. Despite these advanced solutions, crafting an ideal reward function remains challenging due to the propensity of RL algorithms to exploit the reward system in unforeseen ways. Agents might fulfill objectives in unexpected manners, highlighting the complexity of encoding desired behaviors, like adherence to social norms, into a reward function.
An alternative strategy, imitation learning, circumvents the intricacies of reward engineering by having the agent learn through the emulation of expert behavior. However, acquiring a sufficient number of high-quality demonstrations for this purpose is often impractically costly. Humans, in contrast, learn with remarkable autonomy, benefiting from intermittent guidance from educators who provide tailored feedback based on the learner's progress. This interactive learning model holds promise for artificial agents, offering a customized learning trajectory that mitigates reward exploitation without extensive reward function engineering. The challenge lies in ensuring the feedback process is both manageable for humans and rich enough to be effective. Despite its potential, the implementation of human-in-the-loop (HiL) RL remains limited in practice. Our research endeavors to significantly lessen the human labor involved in HiL learning, leveraging both unsupervised pre-training and preference-based learning to enhance agent development with minimal human intervention. - Engineering and Technology, Information, Computing and Communication Sciences
- Master Thesis
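To make the preference-based learning mentioned above more concrete, here is a minimal sketch of one common formulation, a Bradley-Terry model over segment returns: a human compares two trajectory segments, and a learned reward model is trained so that the preferred segment receives the higher predicted return. The function names and the tie label of 0.5 are assumptions for this example; the project description does not specify the exact method.

```python
import math
from typing import Sequence


def segment_return(rewards: Sequence[float]) -> float:
    """Sum of the reward model's per-step predictions over one segment."""
    return sum(rewards)


def preference_probability(
    rewards_a: Sequence[float], rewards_b: Sequence[float]
) -> float:
    """Probability that segment A is preferred over segment B.

    This is the logistic function of the return difference, written in a
    form that avoids overflow for large differences.
    """
    diff = segment_return(rewards_a) - segment_return(rewards_b)
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def preference_loss(
    rewards_a: Sequence[float], rewards_b: Sequence[float], label: float
) -> float:
    """Cross-entropy between the human label and the model's preference.

    label is 1.0 if the human preferred A, 0.0 if B, and 0.5 for a tie
    (the tie convention is an assumption of this sketch).
    """
    p = preference_probability(rewards_a, rewards_b)
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(p + eps) + (1.0 - label) * math.log(1.0 - p + eps))


# The loss is small when the model agrees with the human, large otherwise.
agree = preference_loss([1.0, 1.0], [0.0, 0.0], label=1.0)
disagree = preference_loss([1.0, 1.0], [0.0, 0.0], label=0.0)
print(agree, disagree)
```

Minimizing this loss over many labeled pairs fits the reward model; reducing how many such labels are needed is where unsupervised pre-training would come in.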
| Reinforcement learning (RL) can potentially solve complex problems in a purely data-driven manner. Still, the state of the art in applying RL to robotics relies heavily on high-fidelity simulators. While learning in simulation circumvents the sample-complexity challenges that are common in model-free RL, even a slight distribution shift (the "sim-to-real gap") between the simulation and the real system can cause these algorithms to fail. Recent advances in model-based reinforcement learning have led to superior sample efficiency, enabling online learning without a simulator. Nonetheless, online learning must not cause any damage and should adhere to safety requirements. The proposed project aims to demonstrate how existing safe model-based RL methods can be used to address these challenges. - Engineering and Technology
- Master Thesis
| While the history of machine learning so far largely encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. Such a process would in effect build its own diverse and expanding curricula, and the solutions to problems at various stages would become stepping stones towards solving even more challenging problems later in the process.
Consider the realm of legged locomotion: Training a robot via reinforcement learning to track a velocity command illustrates this concept. Initially, tracking a low velocity is simpler due to algorithm initialization and environmental setup. By manually crafting a curriculum, we can start with low-velocity targets and incrementally increase them as the robot demonstrates competence. This method works well when the difficulty correlates clearly with the target, as with higher velocities or more challenging terrains.
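The hand-crafted velocity curriculum described above can be sketched as follows. This is a minimal illustration, assuming a scalar tracking score in [0, 1] is available after each evaluation; the class name `VelocityCurriculum` and all numeric defaults are hypothetical and not taken from any specific training setup.

```python
from dataclasses import dataclass


@dataclass
class VelocityCurriculum:
    """Hand-crafted curriculum over the commanded forward velocity (m/s)."""

    max_velocity: float = 0.5        # current upper bound on sampled commands
    step: float = 0.25               # increase applied on each promotion
    velocity_cap: float = 3.0        # hardest target the curriculum will reach
    success_threshold: float = 0.8   # tracking score required for promotion

    def update(self, tracking_score: float) -> float:
        """Raise the velocity bound once the policy tracks well enough.

        tracking_score is assumed to lie in [0, 1], with 1 meaning the
        robot followed the commanded velocity perfectly.
        """
        if tracking_score >= self.success_threshold:
            self.max_velocity = min(self.max_velocity + self.step, self.velocity_cap)
        return self.max_velocity


# Simulated sequence of evaluation scores: the bound only grows after
# evaluations in which the policy demonstrates competence.
curriculum = VelocityCurriculum()
scores = [0.5, 0.85, 0.9, 0.6, 0.95]
history = [curriculum.update(s) for s in scores]
print(history)  # [0.5, 0.75, 1.0, 1.0, 1.25]
```

The design relies on difficulty growing monotonically with the target velocity, which is what makes a single scalar bound sufficient here.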
However, challenges arise when the relationship between task difficulty and control parameters is unclear. For instance, if a parameter dictates various human dance styles for the robot to mimic, it's not obvious whether jazz is easier than hip-hop. In such scenarios, the difficulty distribution does not align with the control parameter. How, then, can we devise an effective curriculum?
In the conventional RSL training setting for locomotion over challenging terrains, there is also a handcrafted learning schedule that dictates increasingly hard terrain levels, applied uniformly across multiple different terrain types. With a smart autonomous curriculum learning algorithm, can we master separate terrain types asynchronously and thus achieve better overall performance or higher data efficiency?
- Engineering and Technology
- Master Thesis
| Humanoid robots, designed to replicate human structure and behavior, have made significant strides in kinematics, dynamics, and control systems. Research aims to develop robots capable of performing tasks in human-centric settings, from simple object manipulation to navigating complex terrains. Reinforcement learning (RL) has proven to be a powerful method for enabling robots to learn from their environment, enhancing their performance over time without explicit programming for every possible scenario. In the realm of humanoid robotics, RL is used to optimize control policies, adapt to new tasks, and improve the efficiency and safety of human-robot interactions. However, one of the primary challenges is the high dimensionality of the action space, where handcrafted reward functions fall short of generating natural, lifelike motions.
Incorporating motion priors into the learning process of humanoid robots addresses these challenges effectively. Motion priors can significantly reduce the exploration space in RL, leading to faster convergence and reduced training time. They ensure that learned policies prioritize stability and safety, reducing the risk of unpredictable or hazardous actions. Additionally, motion priors guide the learning process towards more natural, human-like movements, improving the robot's ability to perform tasks intuitively and seamlessly in human environments. Therefore, motion priors are crucial for efficient, stable, and realistic humanoid locomotion learning, enabling robots to better navigate and interact with the world around them. - Information, Computing and Communication Sciences
- Master Thesis
| Model-based reinforcement learning learns a world model from which an optimal control policy can be extracted. Understanding and predicting the forward dynamics of legged systems is crucial for effective control and planning. Forward dynamics involve predicting the next state of the robot given its current state and the applied actions. While traditional physics-based models can provide a baseline understanding, they often struggle with the complexities and non-linearities inherent in real-world scenarios, particularly due to the varying contact patterns of the robot's feet with the ground.
The project aims to develop and evaluate neural network-based models for predicting the dynamics of legged environments, focusing on accounting for varying contact patterns and non-linearities. This involves collecting and preprocessing data from experiments in various simulation environments, designing neural network architectures that incorporate the necessary structure, and exploring hybrid models that combine physics-based predictions with neural network corrections. The models will be trained and evaluated on autoregressive prediction accuracy, with an emphasis on robustness and generalization across different noise perturbations. By the end of the project, the goal is to achieve an accurate, robust, and generalizable predictive model for the forward dynamics of legged systems. - Engineering and Technology
- Master Thesis
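The hybrid-model idea and the autoregressive evaluation described in this project can be illustrated on a toy problem. The sketch below uses a 1-D point mass instead of a legged robot, and a hand-written `damping_residual` function stands in for a trained neural network correction; all names, the damping coefficient, and the time step are assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, float]  # (position, velocity) of a toy 1-D system
StepFn = Callable[[State, float], State]

DT = 0.01  # integration time step in seconds (hypothetical)


def true_step(state: State, action: float) -> State:
    """Ground-truth dynamics with velocity damping the physics model ignores."""
    pos, vel = state
    return (pos + vel * DT, vel + (action - 0.5 * vel) * DT)


def physics_step(state: State, action: float) -> State:
    """Idealized physics model: explicit Euler without damping."""
    pos, vel = state
    return (pos + vel * DT, vel + action * DT)


def damping_residual(state: State, action: float) -> State:
    """Stand-in for a learned correction; here it encodes the missing damping."""
    _, vel = state
    return (0.0, -0.5 * vel * DT)


def hybrid_step(state: State, action: float) -> State:
    """Physics prediction plus the (learned) residual correction."""
    pos, vel = physics_step(state, action)
    d_pos, d_vel = damping_residual(state, action)
    return (pos + d_pos, vel + d_vel)


def autoregressive_rollout(
    initial: State, actions: Sequence[float], step_fn: StepFn
) -> List[State]:
    """Feed each prediction back in as the next input, so errors can compound."""
    states = [initial]
    for action in actions:
        states.append(step_fn(states[-1], action))
    return states


def rollout_error(predicted: Sequence[State], reference: Sequence[State]) -> float:
    """Mean absolute state error over the whole rollout."""
    total = 0.0
    for (p_pos, p_vel), (r_pos, r_vel) in zip(predicted, reference):
        total += abs(p_pos - r_pos) + abs(p_vel - r_vel)
    return total / len(predicted)


actions = [1.0] * 200
reference = autoregressive_rollout((0.0, 0.0), actions, true_step)
physics_err = rollout_error(
    autoregressive_rollout((0.0, 0.0), actions, physics_step), reference
)
hybrid_err = rollout_error(
    autoregressive_rollout((0.0, 0.0), actions, hybrid_step), reference
)
print(physics_err, hybrid_err)
```

Because the rollout is autoregressive, the uncorrected physics model drifts further from the reference at every step, while the corrected model stays close. For a legged system, the residual would instead be a network trained on simulation data, and the unmodeled effects would include contact switching.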