We built AI to find the physics we've never seen. A new study warns that the physics it already knows may be exactly what's standing in the way.
For three years the promise has been seductive in its simplicity: point a powerful enough neural network at the data, and it will find the physics we couldn't. In June 2026, a quieter and more unsettling result complicated that promise. The trouble, a team of cosmologists discovered, is not that the machine knows too little. It is that it knows too much — and cannot bring itself to forget.
The setting was about as far from the laboratory bench as physics gets: the faint, cold glow left over from the birth of the cosmos, and the gravitational scaffolding of galaxies stretched across billions of light-years. To search this data for signs of new physics — effects that would mean our reigning model of the universe is incomplete — researchers increasingly lean on a technique called transfer learning. You train a model on a vast bank of simulated universes that obey the standard theory, then fine-tune it on real observations, hoping it can spot the deviations. It is the same trick that lets a model trained on millions of photographs learn to read a single medical scan. It is fast, it is cheap, and, the new work shows, it carries a hidden cost.
That cost has a name in machine learning, dry and clinical: negative transfer. It is what happens when the knowledge a model brings to a new problem actively makes it worse at solving it. And in the hunt for physics beyond the standard model, negative transfer turns out to be not a rare edge case but a structural temptation — the machine reaching for the familiar explanation precisely when the unfamiliar one is what matters.
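The definition can be made concrete with a deliberately tiny toy, far simpler than anything in the study. Suppose we estimate a single number from a handful of noisy measurements, either from scratch (the plain sample mean) or by blending the sample mean with a value carried over from pretraining. The blending weight, the function names and every number below are assumptions made for illustration, not quantities from the paper.

```python
def mse_from_scratch(noise_var, n):
    """Expected squared error of the plain sample mean of n measurements."""
    return noise_var / n


def mse_with_transfer(truth, pretrained, noise_var, n, weight):
    """Expected squared error of weight*pretrained + (1 - weight)*sample_mean.

    The carried-over value adds bias when it is wrong, and removes
    variance because less weight rests on the noisy data.
    """
    bias_sq = (weight * (pretrained - truth)) ** 2
    variance = (1.0 - weight) ** 2 * noise_var / n
    return bias_sq + variance


noise_var, n, weight, pretrained = 1.0, 10, 0.5, 0.0

scratch = mse_from_scratch(noise_var, n)
near = mse_with_transfer(0.1, pretrained, noise_var, n, weight)  # truth close to pretrained value
far = mse_with_transfer(1.0, pretrained, noise_var, n, weight)   # truth far from pretrained value

print(f"from scratch: {scratch:.4f}")
print(f"transfer, truth near prior: {near:.4f}")
print(f"transfer, truth far from prior: {far:.4f}")
```

In this sketch the carried-over value helps when reality sits close to it and hurts when reality has moved away. The second case is negative transfer in miniature: the same prior knowledge, the same data, a worse answer.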
To understand why this matters, you have to appreciate how strange our best theory of the universe actually is. The standard model of cosmology — physicists call it ΛCDM, for the cosmological constant lambda plus cold dark matter — describes the large-scale cosmos with uncanny precision. It predicts the temperature ripples in the cosmic microwave background, the way galaxies cluster, the rate at which the universe is flying apart. Feed it a handful of numbers and it reproduces the heavens.
And yet the theory is built almost entirely out of things we cannot see or explain. Roughly a quarter of the universe is dark matter, a substance that does not interact with light at all and has never been caught in a detector. Around seventy percent is dark energy, the placeholder name for whatever is accelerating cosmic expansion. Add it up and something like ninety-five percent of everything is described by terms that are, at bottom, confessions of ignorance dressed in equations. ΛCDM works beautifully and explains almost nothing about what the universe is made of. That is the paradox modern physics lives inside.
So cosmologists go hunting for cracks — small, telling discrepancies that might reveal what lies beneath. Does the neutrino have mass, and how much? Is dark energy truly constant, or does it drift over cosmic time? These are the questions where a genuine discovery is hiding, and they are exactly the questions researchers hoped a well-trained AI could help answer faster than any human poring over the data.
The June study, published in the Journal of Cosmology and Astroparticle Physics, set out to test transfer learning as a shortcut for these searches. The logic was sound. Simulating universes is computationally brutal; pretraining a model on a library of standard-model simulations, then adapting it to the specific signal you care about, should slash the cost. The researchers expected to report a tidy efficiency win. Instead they ran into a wall they hadn't predicted.
The clearest example involved the mass of the neutrino, one of the most coveted unknowns in physics. A neutrino with mass leaves a faint fingerprint on the way cosmic structure forms — it gently smooths out the clumping of matter on certain scales. The problem is that this smoothing looks almost identical to the effect of an ordinary, already-known parameter of the standard model: σ8, a number that measures how lumpy the universe is. To the eye of a model steeped in ΛCDM, a real signal of new physics and a small tweak to an old parameter are near-twins.
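To see how two different causes can leave near-identical fingerprints, here is a minimal sketch using an invented toy signal. The power law, the damping term and the range of scales are all assumptions chosen for illustration, not a real cosmological model and not the study's pipeline. `amplitude` stands in for a familiar lumpiness parameter in the spirit of σ8, and `suppression` for a new, scale-dependent smoothing effect. The code measures how each parameter shifts the signal and how closely the two shifts align.

```python
import numpy as np


def toy_spectrum(k, amplitude, suppression):
    """Toy clustering signal: a power law times a smooth small-scale damping.

    Invented for illustration; not a real cosmological model.
    """
    damping = 1.0 - suppression * k**2 / (1.0 + k**2)
    return amplitude * k**-1.5 * damping


def fractional_response(k, params, name, step=1e-4):
    """Central-difference derivative with respect to one parameter,
    divided by the fiducial signal."""
    up = dict(params, **{name: params[name] + step})
    down = dict(params, **{name: params[name] - step})
    diff = toy_spectrum(k, **up) - toy_spectrum(k, **down)
    return diff / (2.0 * step) / toy_spectrum(k, **params)


k = np.linspace(0.5, 1.0, 50)  # a limited band of scales (arbitrary units)
fiducial = {"amplitude": 1.0, "suppression": 0.1}

d_amp = fractional_response(k, fiducial, "amplitude")
d_sup = fractional_response(k, fiducial, "suppression")

# Cosine similarity: values near +1 or -1 mean the two shifts are nearly parallel.
cosine = d_amp @ d_sup / (np.linalg.norm(d_amp) * np.linalg.norm(d_sup))
print(f"cosine similarity of the two responses: {cosine:.3f}")
```

In this toy the similarity comes out close to -1: over the chosen band, turning up the new smoothing effect looks almost exactly like turning down the familiar amplitude. Observing a wider range of scales would pull the two apart, which is one reason such degeneracies depend on what the data actually covers.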
And here is where the pretraining betrays the machine. Having spent its formative training absorbing the categories of the standard model, the network does what any overconfident expert does when shown an ambiguous case: it reaches for the explanation it already trusts. It files the neutrino's fingerprint under σ8, shrugs, and moves on. The novelty is not missed because the data is too noisy. It is missed because the machine has a prior conviction about what it is looking at.
The researchers reached for a human analogy that lands uncomfortably well. Imagine a medical student who has memorized the introductory textbook and then meets a patient with a rare disease whose early symptoms mimic a common one. The diligent student, pattern-matching against everything they've studied, confidently diagnoses the ordinary illness. Their knowledge, far from helping, is precisely what leads them astray. A less-trained mind, with fewer ready-made categories to fall back on, might at least pause at the strangeness.
It would be easy to file this under the narrow concerns of dark-energy specialists. That would be a mistake. The whole strategy of modern AI-for-science rests on transfer: you pretrain a large model on a mountain of existing knowledge, then point it at a new frontier. That is how AI is being deployed to design proteins, predict materials, and read genomes. In every one of those domains, the model arrives carrying the assumptions baked into its training data — and in every one, a genuine discovery is, almost by definition, the thing that does not fit those assumptions.
This is the quiet irony at the heart of using yesterday's knowledge to find tomorrow's. The more thoroughly a model has internalized the current paradigm, the more fluent and useful it becomes at ordinary work — and the more it is predisposed to explain away the very anomalies that would overturn that paradigm. Scientific revolutions have always required someone willing to take an inconvenient result seriously instead of smoothing it into the existing picture. We are now building machines optimized to do the smoothing.
"A discovery is the data that refuses to fit the theory. We have trained our machines, above all else, to make data fit the theory."
(The structural tension in AI-driven science)
None of this means the models are useless — far from it. The same family of techniques has produced real wins. Earlier in 2026, AI helped surface genuinely new behavior in plasma, the so-called fourth state of matter, a story I told in The Fourth State. Other groups have shown that neural networks, given enough freedom, can rediscover the foundations of particle physics from raw data, and even propose human-readable solutions to thorny quantum experiments. The machine can be a real engine of discovery. The June result is not a refutation of that; it is a warning about the conditions under which the engine quietly stalls.
If the problem is too much prior conviction, the obvious remedy is to dilute it — and that is where the research points. One approach is to train models that hold their assumptions more loosely, weighting their pretrained knowledge less heavily when the new data starts to disagree with it. Another is to deliberately build in what you might call productive ignorance: architectures designed to flag the ambiguous cases, the ones where a new effect and an old parameter look alike, rather than resolving them by default toward the familiar.
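The second idea, flagging look-alike cases instead of resolving them toward the familiar, can be sketched in a few lines. This is a hypothetical illustration, not the architecture the researchers propose: the two templates, the noise level and the `margin` threshold are all assumptions, and the data are noise-free to keep the example deterministic. Each explanation is fitted on its own, and the code declines to choose when neither clearly wins.

```python
import numpy as np


def single_template_chi2(deviation, template, sigma):
    """Chi-square left over after scaling one template to best match the data."""
    scale = (template @ deviation) / (template @ template)
    residual = deviation - scale * template
    return float(np.sum((residual / sigma) ** 2))


def classify(deviation, familiar, novel, sigma, margin=4.0):
    """Return 'ambiguous' when neither explanation clearly beats the other.

    The margin is an arbitrary illustrative threshold.
    """
    chi2_familiar = single_template_chi2(deviation, familiar, sigma)
    chi2_novel = single_template_chi2(deviation, novel, sigma)
    if abs(chi2_familiar - chi2_novel) < margin:
        return "ambiguous"
    return "familiar" if chi2_familiar < chi2_novel else "novel"


k = np.linspace(0.5, 1.0, 50)
familiar = np.ones_like(k)     # overall rescaling: stand-in for a known parameter
novel = -k**2 / (1.0 + k**2)   # scale-dependent smoothing: stand-in for a new effect
sigma = 0.05                   # assumed per-point measurement error

weak = classify(0.02 * novel, familiar, novel, sigma)
strong = classify(0.5 * novel, familiar, novel, sigma)
print(weak, strong)
```

Here a faint new effect is returned as "ambiguous" because a simple rescaling fits it nearly as well, while a strong one is identified as "novel". The design choice that matters is the third label: the default is to report the tie, not to break it in favor of the familiar explanation.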
There is a deeper philosophical point lurking here, and the researchers seem alert to it. Human science advances not only by accumulating knowledge but by periodically setting some of it aside — by being willing to ask whether the framework itself is wrong. Thomas Kuhn called these moments paradigm shifts, and they are notoriously hard even for brilliant humans, who cling to their training as fiercely as any neural network clings to its weights. The aspiration now is to design AI that can do the harder thing: hold its knowledge firmly enough to be useful, and loosely enough to be surprised.
This result arrives at a moment when the scientific community is taking a harder, more sober look at what machine learning actually delivers. The first flush of excitement — the headlines about AI cracking problems that stumped humans for decades — is giving way to a more granular accounting of where these tools help, where they merely impress, and where they quietly mislead. Reviews in Nature through 2026 have begun cataloguing not just the triumphs but the failure modes: models that reproduce known results spectacularly while struggling to extrapolate beyond them, confident outputs that turn out to encode the biases of their training sets.
Negative transfer fits squarely into that reckoning. It is not a bug to be patched in a single update; it is a feature of how learning from examples works. A system that generalizes from the past will, by construction, be most confident about the futures that resemble the past. For most applications that is exactly what you want. For the specific, precious task of discovering something genuinely new, it is a liability that has to be designed around with care.
"We wanted a machine that could see what we've never seen. We may first have to teach it the harder skill: to doubt what it has already learned."
(On the next phase of AI-for-science)
The practical upshot for working scientists is a dose of healthy suspicion. An AI that returns "nothing new here" is no longer a result you can take at face value, because the model may simply have lacked the room to recognize newness in the first place. Confirming a non-detection now demands the same scrutiny once reserved for a bold claim — checking whether the tool was even capable of seeing the thing it failed to find. That is a subtle but important shift in how evidence gets weighed.
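One common way to check whether a tool could have seen a signal at all is an injection test: plant a fake signal of known size in simulated data and count how often the tool recovers it. The sketch below is a toy under stated assumptions. The two detectors, the noise model and the thresholds are invented for illustration and do not come from the study. The second detector first fits away a free offset, standing in for a familiar parameter that can soak up the signal.

```python
import numpy as np


def recovery_rate(detector, strength, n_trials=200, n_samples=100, seed=0):
    """Fraction of trials in which `detector` flags a signal of known size
    injected into unit-variance Gaussian noise."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        data = rng.normal(0.0, 1.0, n_samples) + strength  # injected offset
        hits += bool(detector(data))
    return hits / n_trials


def open_detector(data, threshold=3.0):
    """Flags data whose mean lies more than `threshold` standard errors from zero."""
    stderr = 1.0 / np.sqrt(data.size)
    return abs(data.mean()) > threshold * stderr


def absorbing_detector(data, threshold=3.0):
    """Same test, but only after fitting and removing a free offset,
    so the injected signal is absorbed before it can be noticed."""
    residual = data - data.mean()
    stderr = 1.0 / np.sqrt(data.size)
    return abs(residual.mean()) > threshold * stderr


strengths = [0.0, 0.5, 1.0]
open_rates = [recovery_rate(open_detector, s) for s in strengths]
blind_rates = [recovery_rate(absorbing_detector, s) for s in strengths]

print("open detector:     ", open_rates)
print("absorbing detector:", blind_rates)
```

The first detector recovers strong injections almost every time. The second never fires, however large the planted signal, so its "nothing new here" carries no information. A non-detection means something only if the same tool demonstrably recovers signals that are known to be there.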
For the rest of us, the lesson is broader and oddly humbling. We have spent several years asking whether machines can match human expertise. This study points at a different and more interesting frontier: whether they can match human open-mindedness — the capacity to be genuinely surprised, to let an inconvenient observation unsettle a cherished belief. That turns out to be one of the harder things a mind can do, biological or artificial. The next generation of scientific AI will be measured not only by how much it knows, but by how gracefully it can hold that knowledge up to doubt. The universe, after all, has spent ninety-five percent of itself hiding in the dark. Finding the rest may depend less on how much our machines have learned than on what we can teach them to forget.
