This opportunity is not published. No applications will be accepted.
Simulated clinical decision-making environment
This Master's thesis will revolve around simulating a clinical decision-making environment for patient trajectory modeling and treatment recommendation systems. The project will create and validate a generative model for clinical time-series with causal constraints.
The development of machine learning methods for supporting clinical decision-making has the potential to transform 21st-century healthcare. Given the overload of heterogeneous sources of diagnostic information and the multiplication of treatment and clinical investigation options, such algorithms open the possibility of automating decision-making and facilitating patient analysis. Efforts to achieve this are emerging in the literature but remain largely limited by the difficulty of evaluating novel treatment recommendation systems. Experimenting with treatment choices on live patients is impractical and often unethical -- reliable systems to estimate the effects of such choices are therefore crucial to progress in clinical machine learning.
Existing benchmarks for policy learning methods have been adapted to the offline setting, with examples such as the D4RL suite. Still, these remain poorly representative of the clinical context, where policy learning is made challenging by sparse or ambiguous reward definition (patient outcomes), partial observability of the patient state, and the heterogeneous, high-dimensional nature of clinical time-series. In addition, prior clinical simulation efforts are not concerned with obtaining realistic treatment effects on patient outcomes -- crucial to the evaluation of treatment policies.
Hence, we wish to develop a comprehensive simulation of patient evolution in the intensive care unit (ICU) -- allowing for reliable evaluation of treatment effects (causal inference) and clinical policy learning models. Another application of interest is data augmentation to improve learning on time series.
The overarching goal of this Master’s thesis will be to develop a realistic simulation of patient trajectories under a variety of possible treatment options in the intensive care unit.
Following a review of the relevant literature, an initial investigation will focus on training generative models for multivariate time series data. For instance, a discriminative approach could be adopted, following prior work from our group on medical time series models. Datasets will consist of time series of observations, lab values and treatment assignments from real patients in intensive care units.
Particular focus will be placed on designing useful evaluation metrics for simulated patient trajectories. In addition to producing realistic, physiologically plausible time series, we will assess whether such models capture expected treatment effects, ranging from the effect of vasopressors on blood pressure to more complex and subtle medical effects. Another important part of this work will be to estimate and calibrate the uncertainty of our generative model in producing realistic trajectories, to avoid simulating unknown patient states.
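As one illustration of what a calibration check could look like, the sketch below computes the empirical coverage of predictive intervals: the fraction of observed values that fall inside the intervals produced by a model. This is only an assumed example of a calibration metric, not the metric chosen for the project; the function name `interval_coverage` and the toy numbers are hypothetical.

```python
def interval_coverage(lower, upper, observed):
    """Fraction of observed values that fall inside [lower, upper].

    Hypothetical helper: for a well-calibrated model, nominal 90%
    predictive intervals should contain roughly 90% of held-out values.
    """
    if not (len(lower) == len(upper) == len(observed)):
        raise ValueError("lower, upper and observed must have equal length")
    if not observed:
        raise ValueError("at least one observation is required")
    hits = sum(lo <= y <= hi for lo, hi, y in zip(lower, upper, observed))
    return hits / len(observed)


# Toy usage with made-up blood pressure values (mmHg), not real patient data.
lower_bounds = [60.0, 62.0, 58.0, 65.0]
upper_bounds = [80.0, 82.0, 78.0, 85.0]
observed_map = [70.0, 90.0, 60.0, 85.0]
coverage = interval_coverage(lower_bounds, upper_bounds, observed_map)
```

A coverage far below the nominal level would suggest overconfident intervals, while a coverage far above it would suggest intervals that are too wide to be informative.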
Subsequent refinements will focus on addressing the shortcomings of existing methods on these metrics. Inductive biases will be introduced to more reliably simulate treatment effects, based on pharmacokinetic models or randomised controlled trials. To start with, this could include data augmentation strategies to overcome the bias of the treatment policy in the observational datasets. If time allows, explicit causal inference methods could be implemented.
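To make the idea of a pharmacokinetic inductive bias concrete, here is a minimal sketch of a standard one-compartment model with intravenous bolus dosing, where each dose decays exponentially and contributions are summed. The choice of this particular model, the function name `concentration`, and the parameter values (`volume`, `k_elim`) are assumptions for illustration and are not taken from the project description.

```python
import math


def concentration(dose_times, doses, t, volume=10.0, k_elim=0.5):
    """Drug concentration at time t under a one-compartment IV bolus model.

    Each dose given at or before time t contributes
    (dose / volume) * exp(-k_elim * (t - t_dose)); contributions are summed.
    volume (L) and k_elim (1/h) are illustrative values, not clinical ones.
    """
    total = 0.0
    for t_dose, dose in zip(dose_times, doses):
        if t >= t_dose:
            total += (dose / volume) * math.exp(-k_elim * (t - t_dose))
    return total


# Concentration right after a single 100 mg dose, and one half-life later.
half_life = math.log(2) / 0.5
c_start = concentration([0.0], [100.0], 0.0)
c_half = concentration([0.0], [100.0], half_life)
```

In a learned simulator, a curve of this shape could serve as a prior on how a treatment's influence on the patient state fades over time, rather than leaving that decay to be inferred entirely from observational data.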
As a final development, the project should lead to a simulation model for clinical time series. Under a random or specific treatment policy, the algorithm should produce realistic patient trajectories -- useful for both training and evaluation of clinical policy models. A comparison with other clinical simulation efforts will be paramount at all stages of the project, to benchmark our work and guarantee its added value.
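The interface such a simulator might expose can be sketched as a rollout loop: a policy maps the current patient state to a treatment, and the simulator returns the next state. Everything below is a toy under stated assumptions: the single state variable (mean arterial pressure), the linear dynamics in `step`, and both policies are hypothetical placeholders and are not clinically validated or part of the described project.

```python
import random


def step(map_mmhg, dose, rng, drift=-2.0, effect=5.0, noise_sd=1.0):
    """Toy dynamics: pressure drifts down, a dose raises it, plus Gaussian noise."""
    return map_mmhg + drift + effect * dose + rng.gauss(0.0, noise_sd)


def random_policy(map_mmhg, rng):
    """Give or withhold the treatment uniformly at random."""
    return rng.choice([0, 1])


def threshold_policy(map_mmhg, rng, target=65.0):
    """Treat only when pressure is below an assumed target value."""
    return 1 if map_mmhg < target else 0


def rollout(policy, initial_map=70.0, horizon=24, seed=0):
    """Simulate one trajectory as a list of (state, dose, next_state) tuples."""
    rng = random.Random(seed)
    state = initial_map
    trajectory = []
    for _ in range(horizon):
        dose = policy(state, rng)
        next_state = step(state, dose, rng)
        trajectory.append((state, dose, next_state))
        state = next_state
    return trajectory


random_trajectory = rollout(random_policy, seed=1)
specific_trajectory = rollout(threshold_policy, seed=1)
```

Keeping the policy as a plain function argument means the same simulator can generate training data under a random policy and evaluate a specific learned policy without modification.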
This project should appeal to students interested in generative modelling for time series, representation learning and causal inference methods. Note that both the successful development of such a simulated clinical benchmark and the resulting opportunities for algorithm evaluation could lead to publication in high-impact machine learning conferences.
Supervisor: Prof Gunnar Rätsch, D-INFK.
Advisor: Alizée Pace, D-INFK + ETH AI Center.
Other BMI group members/AI center fellows may offer relevant supervision.
Please contact Alizée at alizee.pace@ai.ethz.ch for further information on this project.