Deep reinforcement learning based occupant-centred control
In this work, we would like to integrate real-time feedback from the occupants of one of the apartments at NEST into our DRL pipeline to obtain an occupant-centred control policy.
Keywords: Deep reinforcement learning, Room temperature control, Thermal comfort
Controlling indoor comfort and energy flows in modern buildings is becoming increasingly important for society and, at the same time, increasingly complex to optimise. Buildings account for 32% of global primary energy consumption and one-quarter of all greenhouse gas emissions. Moreover, over the last two decades buildings have become more complex, integrating photovoltaics, batteries, heat pumps, heating/cooling storage, and electric vehicle (EV) chargers. Optimal control of buildings has therefore become a challenging task for both industry and academia.
Many advanced building control algorithms focus mainly on optimising the building's energy use while satisfying predefined, fixed comfort bounds. However, these bounds are defined by HVAC standards and do not take the individual comfort needs of occupants into account. The results of laboratory studies and field investigations are summarised in the ASHRAE 55 standard (last updated in 2017). The standard contains a variety of methods for determining the indoor climate from measurable parameters. Interestingly, however, laboratory tests and field investigations clearly contradict each other in some cases [1]. It is therefore difficult to reduce the feeling of comfort to just a few physical factors: individual factors such as experience, expectation, health or mood are neglected by this approach. In summary, occupants' need for comfort must be taken into account in order to reduce the energy consumption of buildings sustainably.
We have already developed a fully black-box, data-driven pipeline that uses deep reinforcement learning (DRL) algorithms to obtain control policies for building control problems, and we have achieved satisfactory energy savings in different units of the NEST demonstration building at Empa, Dübendorf [2]. So far, however, we have only addressed satisfying predefined, fixed comfort bounds.
In this work, we would like to integrate real-time feedback from the occupants of one of the apartments at NEST into our DRL pipeline to obtain an occupant-centred control policy. To accomplish this, the student will have to analyse the measurements available in the building and select some of them as real-time inputs to the DRL algorithm. Given these occupant inputs, a suitable reward needs to be defined for DRL training so that satisfactory occupant thermal comfort is achieved while energy consumption is minimised. The main challenge of the project is to learn the control policy quickly, given the limited amount of occupant feedback on comfort satisfaction. Due to the nature of the problem, some simulation work can support the development, but the resulting control policy shall be tested experimentally with occupants at NEST.
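To make the reward-design task more concrete, the following is a minimal Python sketch of one possible per-step reward that trades off occupant discomfort against energy use. It is an illustration only, not the reward used in the existing pipeline. The names `RewardWeights` and `occupant_reward`, the weight values, and the assumption that occupants report comfort as a vote on a 7-point scale from -3 (too cold) to +3 (too hot) are all hypothetical. Steps without any occupant vote contribute no comfort penalty, which reflects the sparsity of feedback mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical trade-off weights; the values are placeholders."""
    comfort: float = 1.0  # weight on normalised occupant discomfort
    energy: float = 0.1   # weight per kWh consumed in the control step


def occupant_reward(energy_kwh: float,
                    vote: Optional[int],
                    weights: RewardWeights = RewardWeights()) -> float:
    """Return the (non-positive) reward for one control step.

    energy_kwh: energy consumed during the step (assumed non-negative).
    vote: occupant feedback from -3 (too cold) to +3 (too hot), with 0
          meaning comfortable, or None if no feedback was given.
    """
    if energy_kwh < 0:
        raise ValueError("energy_kwh must be non-negative")
    if vote is not None and not -3 <= vote <= 3:
        raise ValueError("vote must lie in [-3, 3]")

    # No vote means no comfort information, so no comfort penalty.
    discomfort = 0.0 if vote is None else abs(vote) / 3.0

    # Both terms are costs, so the reward is their negated weighted sum.
    return -(weights.comfort * discomfort + weights.energy * energy_kwh)
```

With such a reward, a step in which the occupant reports discomfort is penalised more than a step with the same energy use and a neutral vote. Choosing the weights, and deciding how to treat steps without feedback, would be part of the thesis work.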
The eligible student for this MSc thesis should have prior experience with control algorithms (model-free or model-based), ideally also prior experience with reinforcement learning, and good programming skills, preferably in Python for compatibility with other projects. The candidate should be proficient in English.
This project will take place in the ehub team [3] at Empa, Dübendorf. It will be part of a larger project in this area on which several researchers from Empa are working together with industry partners. For further enquiries, please contact Dr. Bratislav Svetozarevic, bratislav.svetozarevic@empa.ch.