
Back in 2020, we discussed DeepMind’s impressive advances in protein structure prediction, as evinced by their showing at CASP14. They’ve finally made good on their promise to publish their work on AlphaFold2! In the interim, though, David Baker’s group at the University of Washington – perennially one of the top academic groups in the field – got impatient and re-engineered their own protein structure prediction approaches to come up with RoseTTAFold, which appears to be almost as good (and rather less computationally intensive).

Looking for slightly more comprehensible explanations of how it works? The usual science and tech sites (Ars Technica, Stat News, TechCrunch, etc.) have some basic explanations. For more technical explanations that still provide solid background on both the biology and the computational side, we’ve got a YouTube explanation that’s not half bad, if you’re a fan of video, and an article on an AI-focused site that’s actually a bit heavier on the bio background. For commentary from researchers closer to the field, check out more from a member of the Oxford Protein Informatics Group (expanding on 2020 thoughts), and from Mohammed AlQuraishi, who also had some ruminations on the CASP14 results (“it feels like one’s child has left home”). Broadly, there’s no One Weird Trick that makes AlphaFold2 work: it takes a lot of extant ideas from the bioinformatics (multiple sequence alignments and evolutionary co-variation), protein modeling, and machine learning worlds, and engineers them into something really impressive.
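To make "evolutionary co-variation" a little more concrete, here is a minimal Python sketch that scores pairs of alignment columns by mutual information, a classic and very simple co-variation measure. This is not how AlphaFold2 or RoseTTAFold process alignments (both learn from the MSA with neural networks). The function names and the toy alignment are made up purely for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `msa` is a list of equal-length aligned sequences (hypothetical input format).
    """
    n = len(msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


def top_covarying_pairs(msa, k=3):
    """Return the k column pairs with the highest mutual information."""
    length = len(msa[0])
    scores = {
        (i, j): column_mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


# Toy alignment, invented for illustration: columns 1 and 3 always change
# together (K pairs with E, R pairs with D), as do columns 0 and 2.
toy_msa = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
    "GKVE",
    "GRVD",
]

if __name__ == "__main__":
    for (i, j), score in top_covarying_pairs(toy_msa):
        print(f"columns {i} and {j}: {score:.3f} bits")
```

On this toy input, columns 1 and 3 (0-indexed) score highest at exactly 1 bit, while columns 0 and 1 vary independently and score zero. Real pipelines have to correct for phylogenetic bias and small sample sizes, which raw mutual information ignores.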

How will this change structural biology? As a comment in Nature Structural & Molecular Biology describes, it’s actually likely to be a huge boon for people tackling tough experimental structural techniques, who have long used protein models when working with and solving structures from experimental data. And as EMBL-EBI points out, it’s going to be helpful for hypothesis generation in a lot of areas of basic research, particularly for people working with protein families that had little or no existing structural information. But as Derek Lowe from In the Pipeline notes, and has discussed previously, it’s also possible to oversell what this will mean for, say, drug discovery: determining protein structures is important and has historically often been quite challenging, but there are many parts of biomedical research for which it wasn’t really the rate-limiting step. AlphaFold2 and RoseTTAFold also have some very real limits, as highlighted in this FEBS post – their ability to predict protein complexes is limited, and they can’t handle proteins that bind cofactors or non-protein things like nucleic acids, or that have post-translationally modified amino acids, or that adopt several different conformations in the actual cell. Given that bioinorganic chemists estimate that a quarter of proteins bind metals – only one of the classes of cofactors under discussion! – this does mean that a significant subset of proteins is going to be less well-predicted than simpler cofactor-free monomeric proteins until AlphaFold2, RoseTTAFold, and similar algorithms undergo more development.

Want to predict some protein structures yourself? For researchers working on model organisms, DeepMind has helpfully already predicted structures for nearly every protein in those organisms’ proteomes and made them available to the public via EMBL-EBI. For everyone working on something odder, both projects are up on GitHub (AlphaFold2, RoseTTAFold), but the Baker group’s also implemented RoseTTAFold on their Robetta server, while DeepMind’s put a (slightly limited) version of AlphaFold2 on a public Colab notebook. Researchers have, of course, been digging into these tools and figuring out how to adapt them to work on more complicated problems, like this Colab notebook (dubbed ColabFold on GitHub), which tackles protein-protein complexes. (One of the lead people on that – Sergey Ovchinnikov – has a talk available online discussing both how AlphaFold2 works and their Colab setup.) Despite the caveats, it’s all quite exciting.