Stepping up the pace of protein design

Summer 2017

Computer models based on molecular energy budgets have made it easier than ever to imagine the myriad ways a protein structure could be folded. Yet it has still been difficult to determine which of those patterns are stable enough to exist in nature, so that their properties could be studied formally.

David Baker and researchers at the University of Washington in Seattle have developed a high-throughput system to clone these proteins in yeast cells, where many copies can be captured and tagged with fluorescent antibodies, which will remain visible if the fold is stable. The process has dramatically reduced the cost of making these samples and opened up the prospect of sorting through thousands of proteins, as well as conducting iterations of the process to increase the proportion of stable outcomes each time around.

That still leaves the problem of determining what these folded proteins actually look like, which is why the American investigators teamed up with members of the Structural Genomics Consortium (SGC) at the University of Toronto. This group was among the pioneers of detailed molecular imaging, which became a priority some 15 years ago as the volume of genetic sequencing began to grow exponentially.

“The challenge is usually obtaining a protein that is well-behaved for structural biology,” says Cheryl Arrowsmith, a member of the SGC and the University Health Network. Her lab specializes in sophisticated nuclear magnetic resonance (NMR) techniques, which it used to rapidly determine the structures of candidate proteins provided by the Washington researchers.

“Because the Baker lab had already identified the stable proteins, we were able to apply our high-throughput methods for assessing the protein by NMR and determining its structure,” she explains.

Arrowsmith co-authored a paper with the Baker lab describing the results, which appeared in Science in July. Starting with models based on four distinct folding topologies, the researchers identified 2,788 stable proteins, many in regions of protein sequence space that had never been made before. The researchers suggest that in addition to providing a platform for better understanding the dynamics of folding, these novel structures could find applications in bioengineering or synthetic biology.

“We have entered a new era of iterative, data-driven de novo protein design and modeling,” they conclude.