Computational Methods for Protein Design and Fitness Optimization
We are currently looking for Master’s students with a background in machine learning (or a related computational field) for a project on protein fitness optimization.
Keywords: protein design, fitness optimization, computational biology, generative models
Protein fitness optimization aims to improve a protein’s functionality by modifying its amino acid sequence to enhance a specific property, such as stability or binding affinity. There are various computational approaches to tackle this problem that enable in-silico pipelines to suggest new protein candidates. One of them involves generative models, which aim to capture the distribution of protein sequence data and propose new sequence mutants based on the learned distribution.
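To make the generative-model idea concrete, here is a minimal sketch of the simplest possible case: a site-independent model that estimates per-position amino acid frequencies from aligned sequences and samples new sequences from them. The function names, the pseudocount, and the toy sequences are illustrative assumptions, not part of the project; the methods compared in practice would be considerably more expressive.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def fit_site_frequencies(sequences, pseudocount=1.0):
    """Estimate per-position amino acid frequencies from equal-length sequences.

    Assumes every sequence uses only the 20 standard amino acids.
    """
    length = len(sequences[0])
    total = len(sequences) + pseudocount * len(AMINO_ACIDS)
    freqs = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        freqs.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return freqs


def sample_sequence(freqs, rng):
    """Draw one sequence by sampling each position independently."""
    return "".join(
        rng.choices(AMINO_ACIDS, weights=[site[aa] for aa in AMINO_ACIDS])[0]
        for site in freqs
    )


# Toy data (hypothetical): three aligned sequences of length 4.
toy_sequences = ["ACDE", "ACDF", "GCDE"]
site_freqs = fit_site_frequencies(toy_sequences)
proposal = sample_sequence(site_freqs, random.Random(0))
```

Because positions are treated independently, this model cannot capture interactions between residues, which is one motivation for looking at richer models.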
We are interested in investigating the performance of different methods on deep mutational scanning (DMS) datasets of specific proteins [1]. The main tasks will be to:
- set up the optimization task on a DMS dataset,
- implement and compare different existing methods, and
- investigate the applicability of more expressive and powerful protein language models, such as ESM2 [2].
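As a rough illustration of the first task, the sketch below treats a DMS dataset as a lookup table from sequence to measured fitness and runs a greedy walk over single-substitution neighbors that appear in the table. The dictionary layout, the `greedy_walk` helper, and the toy scores are assumptions made for illustration; real DMS data and the baselines to be compared will differ.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def single_mutants(seq, alphabet=AMINO_ACIDS):
    """Yield every sequence that differs from `seq` by one substitution."""
    for i, wild_type in enumerate(seq):
        for aa in alphabet:
            if aa != wild_type:
                yield seq[:i] + aa + seq[i + 1:]


def greedy_walk(start, dms_scores, max_steps=10):
    """Repeatedly move to the best-scoring measured neighbor until no improvement.

    `dms_scores` maps sequence -> fitness and acts as the oracle;
    `start` must itself be a key of `dms_scores`.
    """
    current = start
    path = [current]
    for _ in range(max_steps):
        candidates = [s for s in single_mutants(current) if s in dms_scores]
        if not candidates:
            break
        best = max(candidates, key=dms_scores.get)
        if dms_scores[best] <= dms_scores[current]:
            break
        current = best
        path.append(current)
    return path


# Toy scores (hypothetical), standing in for a real DMS table.
toy_dms = {"ACD": 0.1, "ACE": 0.5, "ACF": 0.3, "GCE": 0.9}
path = greedy_walk("ACD", toy_dms)
```

Restricting moves to measured variants keeps the evaluation honest, since the dataset is the only source of ground-truth fitness in this setup.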
[1] Notin, Pascal, et al. "ProteinGym: Large-scale benchmarks for protein fitness prediction and design." *Advances in Neural Information Processing Systems* 36 (2023): 64331-64379.
[2] Lin, Zeming, et al. "Evolutionary-scale prediction of atomic-level protein structure with a language model." *Science* 379.6637 (2023): 1123-1130.
You can apply by sending a CV to Lea Bogensperger (lea.bogensperger@uzh.ch).