AI Science · The New Minds
FutureHouse's Robin didn't just assist researchers. It read the literature, formed hypotheses, designed experiments, and identified a candidate treatment for a leading cause of blindness, with humans needed only to run the lab work. The closed-loop AI scientist has arrived.
Picture a researcher arriving at the lab on a Monday morning. Over the weekend, their AI colleague read 551 scientific papers, identified a promising therapeutic mechanism, proposed a series of experiments, and returned a list of validated drug candidates — including one that had never been proposed for this disease before. This isn't a thought experiment. It happened.
In May 2026, FutureHouse, an AI research nonprofit based in San Francisco, published in Nature the results of a system that may mark a turning point in how science gets done. Their platform, called Robin, is a multi-agent AI that autonomously ran the complete arc of scientific inquiry: from a blank-slate question to a validated discovery. The paper sent a ripple through the biomedical community, partly because of the specific finding, which matters enormously to millions of people, but mostly because of what the architecture implies. Robin doesn't just answer scientific questions. It asks them.
The researchers gave Robin a single starting prompt: "dry age-related macular degeneration." Nothing more. No prior hypotheses. No candidate drugs. No suggested mechanism. Just the name of a disease.
Age-related macular degeneration (AMD) is a leading cause of irreversible blindness in the developed world, affecting an estimated 196 million people globally. Most cases are the dry form (dAMD). Unlike the "wet" form of AMD, which has effective VEGF-inhibitor treatments, dry AMD has no approved therapy that meaningfully slows its progression. Millions of patients are told, essentially, to wait.
Robin began by doing what any researcher would: reading the literature. But at a pace no human can match. In approximately 30 minutes, Robin's literature agent processed 551 relevant papers — a task that FutureHouse estimates would take a skilled researcher around 540 hours of manual reading and synthesis. It didn't skim titles and abstracts. It tracked mechanisms, identified conflicting claims, and mapped where the evidence was converging.
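To put those figures side by side, here is a back-of-the-envelope calculation. It uses only the three numbers quoted above (551 papers, roughly 30 minutes, roughly 540 hours); the per-paper rates and the ratio are derived here and are not figures from FutureHouse.

```python
# Figures quoted in the article; everything below is derived from them.
PAPERS = 551
ROBIN_MINUTES = 30   # "approximately 30 minutes"
HUMAN_HOURS = 540    # FutureHouse's estimate for a skilled researcher

human_minutes_per_paper = HUMAN_HOURS * 60 / PAPERS
robin_seconds_per_paper = ROBIN_MINUTES * 60 / PAPERS
reading_speedup = (HUMAN_HOURS * 60) / ROBIN_MINUTES

print(f"Human: {human_minutes_per_paper:.1f} min per paper")   # 58.8
print(f"Robin: {robin_seconds_per_paper:.1f} s per paper")     # 3.3
print(f"Speedup on the reading step: {reading_speedup:.0f}x")  # 1080x
```

The roughly thousandfold ratio applies to the reading step alone. It is a different quantity from the 200-fold reduction in researcher time that FutureHouse reports for the workflow as a whole.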
From that synthesis, Robin generated a hypothesis: that enhancing retinal pigment epithelium (RPE) phagocytosis — the retinal cells' ability to clear photoreceptor debris — represented a promising therapeutic target. RPE cells are the unsung workhorses of the retina, continuously recycling the spent photoreceptor tips that enable vision. When that recycling slows, debris accumulates, and the photoreceptors die. AMD follows. This mechanism was not unknown to researchers, but identifying it as the most tractable entry point — from a blank slate — required synthesizing a body of literature that no individual scientist could hold in their head at once.
Robin then took the step that separates it from earlier AI tools: it identified a specific existing drug as a candidate for repurposing, and proposed the experiments to test it. The drug was ripasudil, a rho kinase (ROCK) inhibitor approved in Japan for glaucoma. Ripasudil had never previously been proposed as a treatment for dry AMD.
The insight wasn't a database lookup or a similarity score. Robin traced a mechanistic chain from ROCK inhibition through cytoskeletal regulation to RPE phagocytic capacity — a connection distributed across multiple papers that no single researcher had assembled. This is synthesis at a scale that individual human cognition cannot achieve, not because the reasoning is beyond human ability, but because the reading is.
Crucially, Robin's discovery didn't end with a prediction. FutureHouse researchers then ran the laboratory experiments Robin had designed. The results confirmed the hypothesis: ripasudil measurably enhanced RPE phagocytosis under laboratory conditions. A drug with roughly a decade of clinical use in humans appeared to have a new, potentially important application for a leading cause of blindness in the developed world. The result is a laboratory finding, not yet evidence of benefit in patients.
"Robin is the first multi-agent system for discovery in biology that integrates novel hypothesis generation with experimental data analysis in one continuous workflow."— FutureHouse, Nature 2026
Robin is not a single AI model. It's an orchestrated network of three specialized sub-agents, each assigned a specific cognitive task in the research pipeline. Crow handles literature search and synthesis. Falcon manages experimental design and scientific evaluation. Finch analyzes the experimental data that comes back from the lab. Together, they cover the full arc from question to answer — passing structured outputs between them rather than operating as a monolith.
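What "passing structured outputs" between stages might look like can be sketched in a few lines. This is an illustration of the handoff pattern only, not FutureHouse's code: the class names, function names, and stub bodies are hypothetical, and the numeric readings are placeholders rather than data from the paper.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class LiteratureSynthesis:
    """Handoff from the literature stage (Crow's role in the article)."""
    disease: str
    mechanism: str
    papers_read: int


@dataclass(frozen=True)
class ExperimentPlan:
    """Handoff from the design stage (Falcon's role in the article)."""
    hypothesis: str
    candidate: str
    assay: str


@dataclass(frozen=True)
class AnalysisReport:
    """Handoff from the data-analysis stage (Finch's role in the article)."""
    candidate: str
    effect: float  # treated mean minus control mean
    supports_hypothesis: bool


def literature_stage(disease: str) -> LiteratureSynthesis:
    # Stub: a real agent would search for and read papers here.
    return LiteratureSynthesis(disease, "RPE phagocytosis", papers_read=551)


def design_stage(synthesis: LiteratureSynthesis) -> ExperimentPlan:
    # Stub: a real agent would rank candidates and draft a protocol.
    return ExperimentPlan(
        hypothesis=f"Enhancing {synthesis.mechanism} is a target for {synthesis.disease}",
        candidate="ripasudil",
        assay="RPE phagocytosis assay",
    )


def analysis_stage(plan: ExperimentPlan, control: list[float],
                   treated: list[float]) -> AnalysisReport:
    # Toy analysis: a positive difference in means counts as support.
    effect = mean(treated) - mean(control)
    return AnalysisReport(plan.candidate, effect, supports_hypothesis=effect > 0)


synthesis = literature_stage("dry age-related macular degeneration")
plan = design_stage(synthesis)
# Placeholder readings, NOT data from the paper. Humans run the assay in the lab.
report = analysis_stage(plan, control=[1.0, 1.1, 0.9], treated=[1.4, 1.5, 1.3])
print(report)
```

Each stage consumes only the typed record produced by the previous one. That is the practical meaning of a structured handoff: any stage can be inspected, swapped, or re-run without touching the others.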
The system ran on a combination of OpenAI's o4-mini and Anthropic's Claude 3.7, with FutureHouse providing the orchestration layer and experimental scaffolding. The underlying models were not fine-tuned for this specific task; the architecture itself (the division of labor, the structured handoffs, the iterative feedback loop) is FutureHouse's contribution.
The loop is explicitly iterative. If experimental results are ambiguous, Robin generates revised hypotheses and proposes follow-up experiments, updating its model of the problem through successive cycles of abductive reasoning. This is the same cognitive rhythm that characterizes good human science: form a hypothesis, test it, learn from what the test reveals, form a better hypothesis. Robin does it faster and without fatigue.
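The hypothesize, test, revise rhythm can be written down as a small control loop. The sketch below is a toy under stated assumptions: `discovery_loop`, `propose`, and the three-valued outcome (supported, refuted, ambiguous) are hypothetical names chosen for illustration, and the demo hypotheses are dummies rather than anything Robin produced.

```python
from typing import Callable, Optional


def discovery_loop(
    propose: Callable[[list[str]], str],
    run_experiment: Callable[[str], Optional[bool]],
    max_cycles: int = 5,
) -> tuple[Optional[str], list[str]]:
    """Iterate hypothesis -> test -> revise until a result supports one.

    run_experiment returns True (supported), False (refuted),
    or None (ambiguous). Every outcome is logged so that the next
    proposal can take the earlier results into account.
    """
    history: list[str] = []
    for _ in range(max_cycles):
        hypothesis = propose(history)
        outcome = run_experiment(hypothesis)
        history.append(f"{hypothesis}: {outcome}")
        if outcome is True:
            return hypothesis, history
    return None, history  # cycle budget spent without a supported hypothesis


# Toy demo: dummy hypotheses with pre-set outcomes.
candidates = ["hypothesis A", "hypothesis B", "hypothesis C"]
outcomes = {"hypothesis A": None, "hypothesis B": False, "hypothesis C": True}


def propose(history: list[str]) -> str:
    # A real agent would reason over the history; this just takes the next one.
    return candidates[len(history)]


accepted, history = discovery_loop(propose, outcomes.get)
print(accepted)      # hypothesis C
print(len(history))  # 3
```

In the system the article describes, the `run_experiment` step is where human researchers currently sit: they execute the proposed assay and hand the data back for analysis.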
The efficiency statistics from the Nature paper deserve emphasis, because they reframe what "faster" means in biomedical research. FutureHouse estimates a 200-fold reduction in researcher time compared to a conventional discovery workflow. The entire process from the first prompt to paper submission took 2.5 months. A comparable human-led effort, they estimate, would have taken multiple years — not because the science would be different, but because the reading and synthesis steps are fundamentally bottlenecked by human bandwidth.
This isn't a marginal improvement. A 200-fold acceleration, if it proves reproducible across disease areas, would be among the most significant productivity shifts in the history of biomedical research. For context: the total number of biomedical papers published annually is now approaching two million. No individual researcher, or even a large team, can meaningfully track more than a fraction of what's relevant to their work. Robin's architecture is designed precisely for this condition — a world in which the knowledge base has outgrown the cognitive capacity of the researchers trying to use it.
"The pharmaceutical industry is rapidly betting that AI can do the scientific reasoning, not just the administrative processing."— Lisa Pedrosa
FutureHouse's Robin is not operating in isolation. In June 2026, Chai Discovery announced a partnership with Pfizer, licensing its Chai-3 generative AI platform — trained on Pfizer's proprietary data — for drug discovery. This represents a different approach (generative molecular design rather than literature-driven hypothesis generation), but points to the same industrial conclusion: top-tier pharmaceutical companies are now deploying AI not as a search tool, but as a scientific reasoner.
The implications stretch well beyond finding new uses for approved drugs. FutureHouse's broader ambition is explicit: to accelerate discovery across all of biology and medicine. The bottleneck they're attacking is not experimental capacity — we have plenty of sequencers, microscopes, and robotic liquid handlers. The bottleneck is the human cognitive capacity to read, synthesize, hypothesize, and design. Robin addresses all four.
There are serious open questions. Autonomous AI systems running scientific workflows raise real concerns about reliability and the propagation of errors. What happens when Robin generates a flawed hypothesis that sends researchers down a dead end for months? The current system still depends on human researchers to physically execute the experiments Robin proposes, which provides a natural checkpoint on its reasoning. But as lab automation advances — closed-loop bioreactors, robotic high-throughput screening, automated imaging analysis — that human checkpoint may itself become optional. A fully closed loop, in which an AI system not only proposes but runs its own experiments, is not a distant prospect. FutureHouse itself describes this as a direction of travel.
What FutureHouse demonstrated with Robin is a qualitative shift in what AI can do in science. For a decade, AI in drug discovery meant tools — better ways to predict protein binding, faster compound screening, smarter genomic analysis. These were instruments that researchers wielded. Robin is something different. It's an agent: a system that pursues questions, adjusts strategy when evidence demands it, and produces outputs that were genuinely unknown at the start. The drug it identified, ripasudil for dry AMD, was not on any human researcher's list when Robin began. It is now in clinical consideration.
Whether that makes Robin a scientist in any philosophically meaningful sense is a question for another day. The practical consequences are already visible. A system that can autonomously identify novel therapeutic candidates in 2.5 months changes the economics of pharmaceutical research — and eventually, it will change our answer to the question of who, or what, does science.
"The drug Robin identified was not on any human researcher's list when it began. It is now in clinical consideration."— Lisa Pedrosa
[Timeline: Milestones, AI from Tool to Scientist, 2020–2026]