Extracting simple rules from deep reinforcement learning building control policies
In this work, we aim to derive simple control rules from existing building control policies obtained using deep reinforcement learning.
Keywords: Deep reinforcement learning, Rule-based control, Building control
Control of indoor comfort and energy flows in modern buildings is an increasingly important societal problem, and at the same time increasingly complex to optimise. Buildings account for 32% of global primary energy consumption and one-quarter of all greenhouse gas emissions. Moreover, over the last two decades, buildings have become more complex, integrating photovoltaics, batteries, heat pumps, heating/cooling storage, and electric vehicle (EV) chargers. Optimal control of buildings has therefore become a challenging task for both industry and academia.
Both conventional rule-based (RB) controllers and advanced model-based controllers, such as Model Predictive Control (MPC), show limitations when controlling modern buildings. RB controllers need manual tuning and are suitable only for single control loops, and thus cannot provide system-wide optimisation. MPC controllers, on the other hand, can provide system-wide optimal performance, but require a model of the building, which is expensive to obtain. Recent advances in reinforcement learning (RL), and in particular deep RL (DRL), have attracted growing research interest among building control engineers and demonstrated the potential to enhance building performance while addressing some limitations of advanced control techniques [1].
We have successfully developed a fully black-box, data-driven pipeline to obtain a control policy for building control problems. We used historical data of different units at the NEST demonstration building at our campus at Empa Dübendorf [2] and developed neural network models of the rooms within these units. Further, we have successfully adapted and tested several DRL algorithms, such as Deep Deterministic Policy Gradient (DDPG), on room temperature control for different units. The DDPG control agents achieved around 20% energy savings over the 2019/2020 heating season and, at the same time, better comfort satisfaction, i.e. fewer comfort-bound violations. However, for this line of work to be attractive to the building automation industry, one promising direction is to develop methods to extract simple control rules from the obtained building control policies.
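As a rough illustration of the modelling step, a one-step room-temperature predictor can be sketched as a small neural network fitted to logged data. Everything below is an assumption for illustration only — the toy energy balance standing in for NEST measurements, the input variables, and the network size are not the actual models used in the pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for historical building logs (assumption): next room
# temperature generated from a toy discrete-time energy balance.
n = 2000
room = rng.uniform(18.0, 26.0, n)     # current room temperature [deg C]
outdoor = rng.uniform(-5.0, 15.0, n)  # outdoor temperature [deg C]
valve = rng.uniform(0.0, 1.0, n)      # heating valve position, 0..1
next_room = room + 0.05 * (outdoor - room) + 1.5 * valve

# Standardise inputs and target for stable training.
X = np.column_stack([room, outdoor, valve])
Xn = (X - X.mean(0)) / X.std(0)
y_mu, y_sd = next_room.mean(), next_room.std()
yn = (next_room - y_mu) / y_sd

# One-hidden-layer network trained with full-batch gradient descent --
# a minimal sketch of a black-box room model, not the actual architecture.
W1 = rng.normal(0.0, 0.5, (3, 16)); b1 = np.zeros(16)
W2 = rng.normal(0.0, 0.5, (16, 1)); b2 = np.zeros(1)
lr = 0.1
for _ in range(2000):
    h = np.tanh(Xn @ W1 + b1)                  # hidden activations
    pred = (h @ W2 + b2).ravel()               # predicted (normalised) temp
    err = pred - yn
    dh = (err[:, None] @ W2.T) * (1.0 - h**2)  # backprop through tanh
    W2 -= lr * (h.T @ err[:, None]) / n
    b2 -= lr * err.mean()
    W1 -= lr * (Xn.T @ dh) / n
    b1 -= lr * dh.mean(0)

rmse = float(np.sqrt(np.mean(err**2))) * y_sd  # back to deg C
print(f"one-step prediction RMSE: {rmse:.2f} C")
```

Once such a model is available, it can serve as a simulation environment in which a DRL agent such as DDPG is trained before deployment on the real building.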
In this work, we would like to derive simple control rules from existing building control policies obtained using deep reinforcement learning. We have already obtained several building control policies for temperature control of rooms at NEST. This work will start by analysing the actions of these control policies, as applied in the past or live at NEST. The main objective is to propose a formal methodology (one that could eventually be fully automated) to extract simple rules from these policies. The final step would be to implement these simple rules at NEST and compare the results with the existing control policies.
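One candidate approach for the rule-extraction step is to imitate the DRL policy with a simple threshold rule fitted to logged state-action pairs. The sketch below is a hypothetical illustration: the stand-in policy, the variables, and the resulting threshold are assumptions for demonstration, not NEST results:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained DRL policy: heat fully when the room
# is cold, and also when it is mildly cold outside (the real policy would
# be a neural network queried on logged or live NEST states).
def drl_policy(room_temp, outdoor_temp):
    return (room_temp < 21.0) | ((room_temp < 22.0) & (outdoor_temp < 5.0))

# 1) Log state-action pairs from the policy.
room = rng.uniform(18.0, 26.0, size=5000)     # room temperature [deg C]
outdoor = rng.uniform(-5.0, 15.0, size=5000)  # outdoor temperature [deg C]
actions = drl_policy(room, outdoor)

# 2) Search for the single-threshold rule "heat if room_temp < t" that
#    best imitates the logged actions.
candidates = np.linspace(18.0, 26.0, 161)
agreement = [np.mean((room < t) == actions) for t in candidates]
best_t = float(candidates[int(np.argmax(agreement))])
best_acc = float(max(agreement))

print(f"rule: heat if room_temp < {best_t:.2f} C "
      f"(imitates policy on {best_acc:.0%} of states)")
```

The same imitation idea extends to richer rule classes, for example shallow decision trees over several state variables, where the tree's branching thresholds are read out directly as if-then rules.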
Required qualifications for the MSc student are prior experience with control algorithms, model-free or model-based, ideally including reinforcement learning, and good programming skills, preferably in Python for compatibility with related projects. The candidate should be proficient in English.
This project will take place in the ehub team [3] at Empa, Dübendorf. For further enquiries, please contact Dr. Bratislav Svetozarevic, bratislav.svetozarevic@empa.ch.