What is the protein motion problem in biology?

The motion problem refers to the challenge of predicting not just a protein's static shape but the full range of shapes it moves through during its function. A protein's job is done by what physicists call its conformational ensemble (the set of shapes it visits) rather than by any single frozen structure. Solving this is considered structural biology's next major frontier after AlphaFold.

What are the limitations of AlphaFold for protein research?

AlphaFold predicts a single static 3D structure for a protein, essentially a frozen pose, rather than capturing how proteins flex and shift between shapes as they function. For a large fraction of biology and medicine, the critical action happens in transitional shapes AlphaFold never shows. The tool answered 'what shape' but cannot answer 'which shapes, and how often.'

How does diffusion AI work for protein design?

Diffusion-based protein design starts with a cloud of random atomic noise and lets a trained model gradually denoise it into a coherent structure guided by a specified goal, such as building a backbone that binds a cancer protein. A second network then writes the amino acid sequence most likely to fold into that backbone. This approach, pioneered in David Baker's lab, has turned protein binder design from a coin-flip into a more reliable engineering process.

Why is protein binder design important for medicine?

Protein binders are custom-built molecules engineered to grip specific spots on disease-related proteins, forming the basis of new drugs, diagnostics, and biological sensors. New AI-driven binder design platforms in 2026 are achieving high enough success rates to target proteins once considered undruggable, including disordered regions with no fixed shape. Nearly every drug ever developed works by interacting with a protein in motion, making this research directly relevant to treating diseases like Alzheimer's, Parkinson's, and cancer.

What is slowing down AI progress in structural biology?

A 2026 survey of more than 360 biological foundation models found that progress had decelerated since around 2021 due to a shortage of diverse, well-curated experimental data rather than a lack of clever AI architectures. New high-throughput experimental methods that generate over 10 million data points from a single experiment are starting to address this bottleneck. The key insight is that a tight loop between a model that proposes and an experiment that delivers fresh data is more valuable than the smartest model alone.

Dispatch · Biology & Machine Intelligence

The Motion Problem

AlphaFold gave us the photograph. But proteins are not statues — they breathe, flex, and shapeshift to do their work. Teaching AI to see them move is biology's next great leap.

June 27, 2026 · By Lisa Pedrosa · 9 min read · AI Science · Biology

Six years ago, a piece of software did something biologists had dreamed of for half a century: it looked at the chain of amino acids that makes up a protein and predicted, with uncanny accuracy, the intricate three-dimensional shape it would fold into. AlphaFold was, by any measure, one of the great achievements of computational science. It also told a small but consequential lie, not on purpose but by omission. It showed us proteins as if they were sculptures, frozen in a single pose. They are nothing of the kind. A working protein is a restless, twitching machine, constantly flexing between shapes, and in 2026 the most exciting question in structural biology is no longer “what does this protein look like?” but “how does it move?”

This is what researchers have started calling the motion problem, and solving it is the field's new summit. A protein at body temperature is in perpetual conversation with the molecules around it. An enzyme clamps shut around its target and springs open. A receptor on a cell's surface twists when a signal arrives, relaying a message inward. A channel breathes, opening and closing a pore thousands of times a second. The single “folded structure” AlphaFold gives you is really just the most common frame of a movie — and for a huge fraction of biology and medicine, the action is in the frames it never shows.

~50yr · The folding problem stood open before AI

13% · Accuracy gain from physics-AI hybrids

10M · Data points per single new experiment

360+ · Biological foundation models surveyed

Why Motion Is So Much Harder Than Shape

To appreciate the leap, it helps to know why predicting a static shape was hard enough. A modest protein of a few hundred amino acids has an astronomical number of ways it could fold, yet in nature it reliably finds one. AlphaFold learned the patterns linking sequence to that final shape by studying the well over a hundred thousand structures that crystallographers and other experimentalists had painstakingly solved over decades. It was, in essence, a brilliant student of a very large textbook of frozen poses.

Motion breaks that arrangement. There is no comparable textbook of moving proteins, because capturing a protein mid-flex is brutally difficult — the shifts happen in millionths of a second and span energies that are maddening to measure. The traditional way to simulate them, called molecular dynamics, models every atom's jostling step by tiny step. It is physically rigorous and gorgeous to watch, and it is so computationally expensive that simulating even a few microseconds of a single protein's life can consume a supercomputer for weeks. You cannot brute-force your way across all of biology like that.
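To make that cost concrete, here is a rough back-of-envelope sketch. The 2-femtosecond timestep is a typical choice for all-atom molecular dynamics; the throughput figure is purely an illustrative assumption, not a benchmark of any real machine, and the function names are made up for this example.

```python
# Back-of-envelope cost of all-atom molecular dynamics.
# Assumption: a 2 fs integration timestep, typical for all-atom MD.
# Assumption: STEPS_PER_SECOND is an illustrative throughput, not a measured benchmark.

FEMTOSECONDS_PER_MICROSECOND = 1_000_000_000  # 1 microsecond = 1e9 femtoseconds
TIMESTEP_FS = 2
STEPS_PER_SECOND = 500  # hypothetical: integration steps computed per wall-clock second


def steps_needed(simulated_microseconds, timestep_fs=TIMESTEP_FS):
    """Number of integration steps needed to cover the requested simulated time."""
    return round(simulated_microseconds * FEMTOSECONDS_PER_MICROSECOND / timestep_fs)


def wall_clock_days(simulated_microseconds, steps_per_second=STEPS_PER_SECOND):
    """Wall-clock days required at the assumed throughput."""
    seconds = steps_needed(simulated_microseconds) / steps_per_second
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    for microseconds in (1, 5, 1000):
        print(
            f"{microseconds:>5} microseconds -> {steps_needed(microseconds):,} steps, "
            f"about {wall_clock_days(microseconds):,.1f} days"
        )
```

Under these assumed numbers, a single simulated microsecond already takes half a billion steps, each of which must update every atom in the system. Faster hardware shifts the constant, not the shape of the problem.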

A protein's job is rarely done by one shape. It's done by the set of shapes it visits — what physicists call its conformational ensemble. Predict the ensemble, and you predict the function.

So the new generation of AI tools is attempting something cleverer than brute force. Rather than grinding out every atomic wobble, generative models are learning to sample the landscape of shapes a protein is likely to occupy: to produce, in seconds, a population of plausible conformations weighted roughly the way nature would weight them. The technical goal researchers describe is to approximate what's known as the Boltzmann distribution, the physicist's recipe for how often a molecule visits each of its possible states (lower-energy states exponentially more often than higher-energy ones), at a tiny fraction of the cost of simulating it. If AlphaFold answered “what shape,” these models aim to answer “which shapes, and how often.”
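The Boltzmann recipe itself is short enough to write down. The sketch below is a toy, not any published model: the three named states and their energies are invented for illustration, and a real ensemble has vastly more states whose energies are not known in advance. Each state's weight is proportional to exp(-E / kT).

```python
import math
import random

K_B_KCAL_PER_MOL_K = 0.0019872  # Boltzmann (gas) constant in kcal/(mol*K)

# Hypothetical relative energies (kcal/mol) for three made-up conformational states.
ENERGIES = {"closed": 0.0, "intermediate": 1.0, "open": 2.0}


def boltzmann_weights(energies, temperature_k=310.0):
    """Return the probability of each state: exp(-E / kT), normalized to sum to 1."""
    kt = K_B_KCAL_PER_MOL_K * temperature_k
    lowest = min(energies.values())  # shift by the minimum for numerical stability
    unnormalized = {s: math.exp(-(e - lowest) / kt) for s, e in energies.items()}
    partition = sum(unnormalized.values())
    return {s: w / partition for s, w in unnormalized.items()}


def sample_ensemble(energies, n, temperature_k=310.0, seed=0):
    """Draw n states at random, each with its Boltzmann probability."""
    weights = boltzmann_weights(energies, temperature_k)
    states = list(weights)
    rng = random.Random(seed)
    return rng.choices(states, weights=[weights[s] for s in states], k=n)


if __name__ == "__main__":
    for state, p in boltzmann_weights(ENERGIES).items():
        print(f"{state:>12}: {p:.3f}")
    draws = sample_ensemble(ENERGIES, n=1000)
    print({s: draws.count(s) for s in ENERGIES})
```

With a handful of known energies this is trivial. The hard part the generative models take on is producing samples with roughly these frequencies when nobody can enumerate the states or their energies.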

“We spent fifty years learning to photograph the protein. Now we have to learn to film it.”

— The structural-biology view of the post-AlphaFold era

Designing Proteins That Don't Exist

The second frontier braided into this one is even bolder: not just predicting how natural proteins move, but designing entirely new ones from scratch to do a chosen job. The most coveted target is the protein binder — a small, custom-built molecule engineered to grip a specific spot on a disease-related protein, the way an antibody does, but designed atom by atom by a model. Done well, binders become the basis of new drugs, diagnostics, and biological sensors.

For years this was an art with low yields; most computer-designed binders simply failed when made in the lab. That is changing fast. Building on tools that emerged from David Baker's lab in Seattle and from other groups (diffusion models that “hallucinate” new backbones, then redesign their sequences), 2026's binder-design platforms are reporting success rates high enough that the work is starting to feel less like fishing and more like engineering. New approaches scale the design process with the same generative-pretraining-plus-test-time-compute recipe that powers large language models, and a wave of all-atom design systems is pushing toward binders that hit targets once considered undruggable, including the floppy, shapeless regions of proteins that have no fixed structure at all.

From reading shape, to reading motion, to writing new proteins on purpose. Source: Nature Communications Biology review; arXiv binder-design papers, 2026.

The Wall: Biology Is Data-Poor

Here is the honest complication. For all the momentum, the field has run into the same obstacle slowing AI across the life sciences: not enough of the right data. A 2026 survey of more than 360 biological foundation models found that progress had actually decelerated since around 2021 — and the culprit was not a shortage of clever architectures but a shortage of diverse, well-curated experimental measurements to train them on. You can build a model that imagines how a protein moves, but you can only trust it as far as you can check it against reality, and the reality checks are scarce, slow, and expensive.

This is why one of the quietest breakthroughs of the past year may matter most. New high-throughput experimental methods can now generate more than ten million data points from a single experiment, measuring how millions of amino-acid changes alter a protein's behavior all at once, in days rather than years, and feed them straight into the models. The pattern echoes the drug-discovery story: the prize is not the smartest model alone, but the tightest loop between a model that proposes and an experiment that delivers fresh, abundant truth back to it. Pair a generative designer with a lab that can test millions of its ideas at once, and the data wall starts to crack.

“The bottleneck is no longer the algorithm. It's the measurement the algorithm has never seen.”

— On the data wall facing biological AI, 2026

How the New Tools Actually Work

It is worth lifting the hood, because the methods are more intuitive than their names suggest. The dominant approach to designing a new protein borrows a trick from the AI image generators that fill the internet: diffusion. Start with a cloud of random noise — in this case, a jumble of atoms in space — and let a trained model gradually denoise it into a coherent structure, guided toward a goal you specify, such as “build a backbone that wraps around this patch on a cancer protein.” A second network then reads that backbone and writes the amino-acid sequence most likely to fold into it. The pipeline runs design and verification in tandem, and the rise of these diffusion methods, pioneered in David Baker's lab and now spreading across the field, is the main reason binder design has gone from a coin-flip to a craft.
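The denoising loop can be sketched in a few lines. This is a deliberately crude toy, not the method used by any real design tool: the “denoiser” here is a stand-in that already knows the answer (a made-up helix), whereas a real system uses a trained network that infers the clean structure from the noisy one and the design goal. Every function name and constant below is hypothetical.

```python
import math
import random


def target_backbone(n_points=20):
    """Hypothetical goal structure: points along a helix, standing in for a designed backbone."""
    return [(math.cos(0.6 * i), math.sin(0.6 * i), 0.15 * i) for i in range(n_points)]


def toy_denoiser(noisy_coords, target):
    """Stand-in for a trained network: it simply returns the known target.
    A real model would predict the clean structure from the noisy input."""
    return target


def reverse_diffusion(target, n_steps=50, seed=0):
    """Start from random noise and step toward the denoiser's prediction.
    Returns the list of coordinate sets, from pure noise to the final structure."""
    rng = random.Random(seed)
    coords = [tuple(rng.gauss(0.0, 3.0) for _ in range(3)) for _ in target]
    trajectory = [coords]
    for step in range(n_steps):
        predicted = toy_denoiser(coords, target)
        blend = 1.0 / (n_steps - step)                        # grows to 1.0 on the last step
        noise_scale = 0.1 * (1.0 - (step + 1) / n_steps)      # shrinks to 0.0 on the last step
        coords = [
            tuple(c + blend * (p - c) + rng.gauss(0.0, noise_scale) for c, p in zip(point, pred))
            for point, pred in zip(coords, predicted)
        ]
        trajectory.append(coords)
    return trajectory


def rmsd(a, b):
    """Root-mean-square deviation between two equally long lists of 3D points."""
    squared = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return math.sqrt(squared / len(a))


if __name__ == "__main__":
    goal = target_backbone()
    path = reverse_diffusion(goal)
    for step in (0, 10, 25, 50):
        print(f"step {step:>2}: RMSD to goal = {rmsd(path[step], goal):.3f}")
```

What the toy preserves is the shape of the procedure: many small steps, each nudging a noisy cloud toward the model's current best guess while injecting less and less fresh noise. Everything that makes the real tools work lives inside the denoiser that this sketch fakes.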

The newest systems add a further idea imported straight from large language models: spend more computing power at the moment of design, not just during training. By letting a model generate and quietly evaluate many candidate structures before committing — a strategy researchers call test-time compute — the best 2026 platforms push success rates higher and reach targets that earlier tools missed entirely. Among the hardest of those targets are the so-called intrinsically disordered regions: stretches of protein that have no fixed shape at all, flickering through a haze of conformations, long considered impossible to drug. Designing a binder that can grab one of these moving shadows is exactly the kind of problem that only becomes tractable once a model understands motion rather than snapshots — which is why the two frontiers, predicting movement and designing proteins, keep converging into one.
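In its simplest form, test-time compute is a best-of-N search: generate many candidates, score each one computationally, and keep the winner. The sketch below assumes that simplest form; the 2026 platforms may use more elaborate search strategies. The generator and the scoring function are hypothetical placeholders for a real generative model and a real in-silico quality predictor.

```python
import random


def propose_candidate(rng):
    """Hypothetical generator: a 'design' is just a list of eight numbers here."""
    return [rng.uniform(-1.0, 1.0) for _ in range(8)]


def score_candidate(candidate):
    """Hypothetical in-silico score, higher is better.
    Stand-in for a learned predictor of how well a design would bind."""
    return -sum(x * x for x in candidate)


def best_of_n(n, seed=0):
    """Spend more compute at design time: sample n candidates and keep the best-scoring one."""
    rng = random.Random(seed)
    candidates = [propose_candidate(rng) for _ in range(n)]
    return max(candidates, key=score_candidate)


if __name__ == "__main__":
    for n in (1, 8, 64, 512):
        print(f"n = {n:>3}: best score = {score_candidate(best_of_n(n)):.4f}")
```

Because the seed is fixed, each larger batch contains the smaller ones, so the best score can only stay the same or improve as n grows. The catch in practice is that the search is only as good as the scorer: a candidate that fools the scoring model still fails in the lab.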

Accuracy is climbing on the prediction side as well. A tool called D-I-TASSER, which marries machine learning with old-fashioned physics-based simulation, predicts complex structures with roughly 13 percent greater accuracy than earlier methods — a reminder that the winning recipe is rarely pure AI but a marriage of learned pattern and physical law. The pattern recognizes what is plausible; the physics enforces what is possible. Neither alone is enough, and the labs making the fastest progress are the ones refusing to choose between them.

Why It Matters Beyond the Lab

Some of the credit belongs to the experimentalists, not just the algorithms. A revolution in cryo-electron microscopy — flash-freezing proteins and imaging them by the millions — has begun to capture not one structure but several, glimpses of the same molecule caught in different poses. Those multiple snapshots are precisely the training data a motion-predicting model needs: not a single answer key but a sampling of the states a protein really visits. The interplay is the whole story of modern structural biology in miniature. Better microscopes feed better models; better models tell the microscopists where to look. Neither discipline is replacing the other, and the breakthroughs are landing in the seam between them.

It is fair to ask why a non-specialist should care whether a protein wiggles. The answer is that nearly every drug you have ever taken works by interacting with a protein in motion. Diseases like Alzheimer's, Parkinson's, and many cancers are, at root, stories of proteins folding wrong, moving wrong, or sticking where they shouldn't. A model that can predict not just a protein's resting shape but the full repertoire of poses it visits — and the fleeting, druggable pockets that open only mid-motion — would hand medicine a map it has never had. Whole categories of targets long dismissed as “undruggable” are undruggable precisely because their key vulnerabilities appear only when they move.

The reach extends well past the pharmacy. Designed proteins that fold and flex on cue are already being explored as biosensors that light up in the presence of a toxin, as enzymes engineered to chew through plastic or capture carbon, and as the scaffolding for vaccines that present a virus to the immune system at exactly the right angle. Each of these depends on getting the motion right, because a protein that cannot move correctly cannot do its job. The same understanding that lets a model design a better drug lets it design a better tool — and increasingly, the line between studying life's machinery and building new pieces of it is dissolving. We are not just learning to read the language of proteins. We are beginning to compose in it.

Stand back, and a pattern comes into focus. AlphaFold was a destination; the motion problem is a road. The first told us what proteins are. The second is teaching us what they do — and, increasingly, letting us design new ones to do what we need. If the 2020s opened with a machine that could read the frozen language of life, they may close with machines that can read it in motion, and write new sentences of their own. The statue is learning to dance, and for the first time, we are getting to watch.

Sources

Nature Communications Biology — “The latest AI breakthroughs in structural biology: protein binder design and conformational state prediction,” 2026.
Phys.org — “When AI meets physics: Unlocking complex protein structures to accelerate biomedical breakthroughs,” Feb. 2026 (D-I-TASSER).
Phys.org — “This protein-engineering breakthrough generates over 10M data points and turbocharges AI in just three days,” Apr. 2026.
Frontiers in Molecular Biosciences — “Protein structure prediction powered by artificial intelligence: from biochemical foundations to practical applications,” 2026.
arXiv — “Scaling Atomistic Protein Binder Design with Generative Pretraining and Test-Time Compute,” 2026.
arXiv — “SeedProteo: Accurate De Novo All-Atom Design of Protein Binders,” 2026.
Science — “Design of intrinsically disordered region binding proteins,” 2026.
Nature Communications — Bennett et al. (Baker lab), “Improving de novo protein binder design with deep learning.”
Cell / Structure — “Code to complex: AI-driven de novo binder design,” 2025–26.
arXiv — “Foundation Models for AI-Enabled Biological Design” (survey of 360+ biological foundation models).
PMC — “De novo protein design: a transformative frontier in clinical protein applications,” 2026.
NCBI / PMC — “When artificial intelligence meets protein research,” 2026.

