CMU-CB-23-100
Ray and Stephanie Lane Computational Biology Department
School of Computer Science, Carnegie Mellon University




Sequential Strategies for Automated Science
and Protein Engineering

Trevor S. Frisby

May 2023

Ph.D. Thesis

CMU-CB-23-100.pdf


Keywords: NA

Many scientific processes depend upon sequential decision making. Choosing which experiments to run next, how to alter an experimental design, or how to reconfigure experimental instrumentation affects not just the accuracy or quality of the experiment itself, but also the efficiency with which optimal experimental conditions are identified. As the automation of experimental components becomes more prevalent, practical algorithms that can guide this kind of experimental decision making are more important than ever. In this dissertation, we use machine learning to address such sequential decision-making problems in two emerging biological domains: general laboratory experimentation via a Cloud Lab, and protein engineering. For the first setting, we introduce PROTOCOL, a first-of-its-kind deterministic algorithm that improves experimental protocols via asynchronous, parallel Bayesian optimization. For the second setting, we describe two methods for selecting protein engineering experiments. First, we show how to formulate Directed Evolution as a regularized Bayesian optimization problem in which the regularization term reflects evolutionary or structure-based constraints. Second, we demonstrate how to use a deep Transformer Protein Language Model to select lead sequences from nanobody repertoires, as well as how to select beneficial single-site mutagenesis experiments that optimize targeted protein functions.

185 pages

Thesis Committee:
Christopher James Langmead (Chair)
Russell Schwartz
David Koes (University of Pittsburgh)
Austin Rice (Amgen)

Russell Schwartz, Head, Computational Biology Department
Martial Hebert, Dean, School of Computer Science


