In April 2025, a language model ran on a neuromorphic chip for the first time. It matched the accuracy of a comparable GPU-based model while using half the energy. The chip it ran on, Intel's Loihi 2, contains one million artificial neurons connected by 120 million synthetic synapses, all packed onto a processor that fits in the palm of your hand. The experiment lasted a few hours. Its implications are still unfolding.
A Billion Neurons in a Data Centre
In April 2024, Intel announced that it had built the world's largest neuromorphic computing system. The system, codenamed Hala Point and deployed at Sandia National Laboratories in New Mexico, contains 1.15 billion artificial neurons and 128 billion synthetic synapses — distributed across 140,544 neuromorphic processing cores running on 1,152 Loihi 2 chips. At peak load it consumes 2,600 watts of power. For context, that is roughly equivalent to a domestic kettle left running continuously. A conventional computing system of comparable neural capacity would require orders of magnitude more energy.
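The system-level figures are easier to judge when broken down per chip. The short Python sketch below does only that division, using the numbers quoted above. It assumes, for simplicity, that the whole 2,600-watt budget is spread evenly across the chips, which ignores whatever share goes to supporting hardware.

```python
# Figures quoted in the text for Hala Point.
NEURONS = 1.15e9
SYNAPSES = 128e9
CORES = 140_544
CHIPS = 1_152
PEAK_WATTS = 2_600

# Per-chip breakdown (simple division, no other data involved).
cores_per_chip = CORES / CHIPS
neurons_per_chip = NEURONS / CHIPS
synapses_per_chip = SYNAPSES / CHIPS

# Assumption: peak power is shared evenly by the chips alone.
watts_per_chip = PEAK_WATTS / CHIPS
neurons_per_watt = NEURONS / PEAK_WATTS

print(f"cores per chip:    {cores_per_chip:.0f}")
print(f"neurons per chip:  {neurons_per_chip:,.0f}")
print(f"synapses per chip: {synapses_per_chip:,.0f}")
print(f"watts per chip:    {watts_per_chip:.2f}")
print(f"neurons per watt:  {neurons_per_watt:,.0f}")
```

The per-chip neuron count comes out just under one million, which is consistent with the Loihi 2 figure given at the start of the article.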
The scale of Hala Point was a statement, not a product. Intel was demonstrating that neuromorphic systems could reach the density of biological neural networks — the human brain contains approximately 86 billion neurons — while maintaining the radical energy efficiency that makes the architecture worth building in the first place. The system at Sandia is being used for research into brain-inspired AI, scientific computing, and optimisation problems that prove intractable for conventional processors.
Then, in January 2026, Intel released Loihi 3, a third-generation neuromorphic processor fabricated on a 4-nanometre process that packs 8 million neurons and 64 billion synapses onto a single chip. That is an eightfold increase over Loihi 2 in neurons per chip. Unlike its predecessors, which communicated exclusively in binary spikes (a signal either fires or it does not), Loihi 3 supports 32-bit graded spikes. This matters because it allows the chip to run the deep neural networks that power current AI systems alongside the spiking neural networks that define the neuromorphic paradigm. The wall between conventional AI and brain-inspired computing has become, for the first time, permeable.
Why the Brain Computes So Cheaply
The human brain operates on roughly 20 watts of power. Your smartphone requires far more than that to run a large language model inference. A single ChatGPT query consumes approximately 10 watt-hours of energy — roughly 30 times what Google needs to return a search result, and approximately 33,000 times what your brain uses to answer the same question from memory. This gap is not an engineering oversight. It reflects a deep architectural difference between how biological neural networks compute and how silicon processors do.
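The comparison above mixes power (watts) with energy (watt-hours), so it helps to put everything in joules. The sketch below takes the quoted figures at face value and works out what they imply. The variable names are illustrative, and the "implied" values are derived from the article's ratios, not independently measured.

```python
# Figures quoted in the text, taken at face value.
BRAIN_WATTS = 20.0       # power drawn by the human brain
QUERY_WH = 10.0          # energy per ChatGPT query, in watt-hours
SEARCH_RATIO = 30        # query energy relative to one web search
BRAIN_RATIO = 33_000     # query energy relative to the brain answering

JOULES_PER_WH = 3600.0   # 1 Wh = 1 W sustained for 3600 s

query_joules = QUERY_WH * JOULES_PER_WH

# Energy per web search implied by the 30x ratio.
implied_search_wh = QUERY_WH / SEARCH_RATIO

# Energy the brain is implied to spend on the same question.
implied_brain_joules = query_joules / BRAIN_RATIO

# How long a 20 W brain takes to spend that much energy.
implied_brain_seconds = implied_brain_joules / BRAIN_WATTS

print(f"query energy:        {query_joules:.0f} J")
print(f"implied search:      {implied_search_wh:.2f} Wh")
print(f"implied brain cost:  {implied_brain_joules:.2f} J")
print(f"implied recall time: {implied_brain_seconds * 1000:.0f} ms")
```

Read this way, the 33,000-fold figure corresponds to the brain spending about one joule on the answer, or roughly a twentieth of a second at 20 watts.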
Almost every processor you have ever used, from the chip in your laptop to the GPUs training the world's largest AI systems, is built on the von Neumann architecture, named for the mathematician John von Neumann, who formalised it in 1945. In this architecture, the processor (CPU or GPU) and the memory (RAM) are physically separated. To perform any computation, data must be continuously shuttled from memory to the processor and back. This transfer, often called the von Neumann bottleneck or the "memory wall", consumes energy, creates latency, and sets a ceiling on how efficiently the system can run.
The brain has no such bottleneck. In biological neural networks, memory and computation are co-located — the synapse, which stores information (as a weighted connection between neurons), is also the site where computation occurs (when a signal crosses it). There is no bus. There is no transfer. The biology that stores a memory and the biology that uses it to compute are the same biology.
Neuromorphic processors attempt to recreate this architecture in silicon. The key innovation is the artificial synapse: a circuit element that stores a weighted connection between two artificial neurons and modifies its resistance when signals pass through it, persistently changing the connection's strength. This is the silicon analogue of synaptic plasticity, the mechanism by which biological neurons learn, often summarised as "neurons that fire together, wire together."
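The "fire together, wire together" idea can be sketched in a few lines of Python. This is a textbook Hebbian update, not the learning rule of any particular chip. The function name, learning rate and weight cap are all assumptions chosen for illustration, and real neuromorphic hardware typically uses more elaborate, timing-dependent rules.

```python
def hebbian_update(weight, pre, post, learning_rate=0.1, w_max=1.0):
    """Strengthen a connection when both neurons are active together.

    pre and post are 1 if the neuron fired in this step, else 0.
    The weight only grows when both are 1, and is capped at w_max.
    """
    weight += learning_rate * pre * post
    return min(weight, w_max)


# Hypothetical activity trace: (pre fired?, post fired?) per time step.
activity = [(1, 1), (1, 0), (0, 1), (1, 1)]

w = 0.2
for pre, post in activity:
    w = hebbian_update(w, pre, post)

print(f"final weight: {w:.2f}")
```

Only the first and last steps, where both neurons fire, change the weight. The two steps where a single neuron fires leave it untouched, which is the whole point of the rule: the stored value and the update both live at the connection itself.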
The second key innovation is the spiking neural network (SNN). Conventional deep learning runs on continuous, high-precision floating-point numbers: every neuron in the network emits a value on every forward pass, generating enormous amounts of computation even when nothing meaningful is happening. A spiking neural network is asynchronous and event-driven: a neuron fires, producing a "spike", only when its accumulated input crosses a threshold. Between spikes it is silent and consumes almost no dynamic energy. Computation is sparse, because only information-bearing events generate processing. This is, to a first approximation, how biological neurons behave.
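The threshold behaviour described here is commonly modelled as a leaky integrate-and-fire (LIF) neuron. The Python sketch below is a minimal version under assumed parameters: the threshold, leak factor and input trace are illustrative, not taken from any chip. It also steps through time in a loop for readability, whereas real neuromorphic hardware reacts to events as they arrive.

```python
def simulate_lif(inputs, threshold=1.0, leak=0.9, reset=0.0):
    """Return the time steps at which a leaky integrate-and-fire neuron spikes.

    At each step the membrane potential decays by the leak factor,
    the incoming current is added, and the neuron fires (then resets)
    if the potential reaches the threshold.
    """
    potential = reset
    spike_times = []
    for t, current in enumerate(inputs):
        potential = leak * potential + current
        if potential >= threshold:
            spike_times.append(t)
            potential = reset
    return spike_times


# Hypothetical input current per time step.
inputs = [0.0, 0.6, 0.6, 0.0, 0.0, 0.3, 0.0, 1.2, 0.0, 0.0]
spikes = simulate_lif(inputs)
print(spikes)  # prints [2, 7]
```

Ten time steps produce two spikes. The weak input at step 5 leaks away without ever reaching the threshold, so it generates no output and no downstream work.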
The energy numbers this produces are remarkable. On a video classification task, a Loihi 2 chip running a sparsified deep learning algorithm used one hundred and fiftieth the energy of a GPU running the standard version. A 2024 continual learning benchmark showed 5,600 times better energy efficiency and 70 times better latency for online learning tasks. These are not marginal improvements. They are differences of several orders of magnitude — the kind of gap that makes previously impossible applications suddenly viable.
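One way to see where such gaps can come from is to count operations. The sketch below compares a dense layer, which touches every weight on every pass, with an event-driven layer that only does work for inputs that actually spiked. The layer sizes and the 2% activity level are assumptions chosen for illustration, and operation counts are only a rough proxy for energy.

```python
def dense_ops(n_inputs, n_outputs):
    """Multiply-accumulates for one pass of a fully connected layer."""
    return n_inputs * n_outputs


def event_driven_ops(spikes, n_outputs):
    """Accumulates needed when only spiking inputs trigger work."""
    return sum(spikes) * n_outputs


n_inputs, n_outputs = 1000, 100

# Assumed activity: one input in fifty spikes in this time step (2%).
spikes = [1 if i % 50 == 0 else 0 for i in range(n_inputs)]

dense = dense_ops(n_inputs, n_outputs)
sparse = event_driven_ops(spikes, n_outputs)

print(f"dense ops:        {dense}")
print(f"event-driven ops: {sparse}")
print(f"reduction:        {dense // sparse}x")
```

With 2% of inputs active, the event-driven layer does one fiftieth of the work. The real measured gains quoted above also depend on memory placement, precision and the task itself, so this toy count should not be read as a prediction.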
Thirty Years of a Dream
The term "neuromorphic" was coined by Carver Mead, a Caltech electrical engineer, in a 1990 paper describing circuits that mimicked the analogue behaviour of biological neural systems. Mead's circuits were not digital — they operated on continuous voltages that behaved like the membrane potentials of real neurons. His insight was that biology had solved, through four billion years of evolution, a set of engineering problems that silicon had barely begun to address: low power, real-time sensory processing, learning from sparse data, and robust operation in noisy environments.
The field spent two decades in academic laboratories, producing exquisite demonstrations and limited commercial interest. IBM released TrueNorth in 2014 — a chip with one million neurons and 256 million synapses running on 70 milliwatts — as a research platform. Intel released Loihi in 2017 and its successor, Loihi 2, in 2021. Progress was steady but the gap between proof-of-concept and real-world deployment remained wide. Neuromorphic hardware could outperform conventional processors on highly specific tasks while underperforming on everything else. The ecosystem of software, tools, and trained engineers needed to deploy it at scale did not exist.
"Neuromorphic computing is not about building a better GPU. It is about building a fundamentally different kind of computer — one whose architecture is aligned with the structure of intelligence rather than the structure of calculation."
PNAS, 2025, "Can neuromorphic computing help reduce AI's high energy cost?"
What changed the trajectory was not a single breakthrough but a collision of pressures. The energy cost of large-scale AI became, by 2023 and 2024, impossible to ignore. Training GPT-4 consumed an estimated 51 gigawatt-hours of electricity, roughly equivalent to the annual energy consumption of 4,700 American homes. Inference costs compounded this: by 2025, the AI industry's electricity consumption had grown to a scale that was reshaping data centre construction, power grid planning, and national energy policy. The question of whether AI could be made radically more energy-efficient moved from research curiosity to genuine industrial priority.
| Chip | Released | Neurons | Key Advantage |
|---|---|---|---|
| IBM TrueNorth | 2014 | 1 million | 70 mW power; first large-scale demonstration of SNN efficiency |
| Intel Loihi | 2017 | 130,000 | On-chip learning — first chip with programmable synaptic plasticity |
| Intel Loihi 2 | 2021 | 1 million | 100× GPU efficiency on specific tasks; LLM demonstrated (2025) |
| IBM NorthPole | 2023 | N/A (DNN accelerator) | 25× more efficient than NVIDIA V100; 42,460 frames/joule |
| Intel Loihi 3 | Jan 2026 | 8 million | 4 nm process; 32-bit graded spikes; bridges SNN and DNN; 1,000× less power than traditional processors |
Figure 1 — Neuromorphic hardware timeline: key chips and milestones
IBM's NorthPole, released in October 2023, took a different approach. Rather than using spiking neural networks, it adapted brain-inspired principles of co-located memory and computation to build an extremely efficient accelerator for conventional deep neural network inference. NorthPole does not attempt to replicate biological spiking behaviour. It borrows the biological insight — that memory and compute should be unified — and implements it with digital precision. The result: 42,460 frames per joule, making it 25 times more efficient than NVIDIA's V100 GPU and 5 times more efficient than the H100. For the narrow task of inference (running a trained model, rather than training it), the energy argument became hard to dispute.
Where the Brain Goes Next
Juniper Research, publishing in January 2026, forecast that the first commercially scalable neuromorphic chips would reach the market within the year. Intel's timeline for Loihi 3 targets initial commercial availability in Q4 2026, initially directed at research institutions and specialised embedded applications, with consumer device integration beginning in 2027. The global neuromorphic computing market, estimated at between 8 and 9.5 billion dollars in 2025, is forecast to reach between 47 and 59 billion dollars by 2033. The range in the projections reflects genuine uncertainty about the speed of ecosystem development, not uncertainty about whether the market exists.
The applications divide naturally across three scales. At the smallest scale, the edge, neuromorphic chips offer a path to AI on battery-powered devices that have never been able to sustain meaningful intelligence locally. Examples include smart sensors, wearable health monitors, hearing aids that process speech in real time, and robotic limbs that respond to muscle signals without a cloud connection. These applications require exactly what neuromorphic chips provide: very low latency, very low energy consumption, and the ability to learn and adapt from live sensory data without requiring continuous network access.
At the middle scale, robotics offers perhaps the most compelling near-term use case. A robot navigating a real environment processes continuous streams of sensory data — visual, tactile, proprioceptive — and must respond in real time without the luxury of sending data to a server and waiting for a reply. The computational load of real-time sensorimotor integration has historically required either enormous power budgets or severe limitations on the robot's capabilities. Neuromorphic processors offer a different trade-off: the chip itself handles the sensory processing loop locally, with the energy efficiency that sustained mobile operation requires.
At the largest scale, the most tantalising application is science itself. Neuromorphic systems capable of simulating hundreds of millions or billions of neurons at biological time scales could function as tools for understanding cognition — not just building artificial intelligence but studying natural intelligence. Intel's Hala Point, at Sandia, is already being used for this kind of research. A system that can simulate the dynamics of a cortical column in real time, testing hypotheses about memory consolidation or perceptual binding that no clinical trial could ever directly address, would represent a fundamentally new instrument for neuroscience.
Neuromorphic chips are not a drop-in replacement for GPUs. They require different software — spiking neural networks must be designed and trained differently from conventional deep networks, and the tools for doing this at scale are still maturing. The research community is substantially smaller than the GPU ecosystem, and the knowledge transfer from conventional deep learning to neuromorphic computing is a genuine engineering challenge, not a straightforward migration.
There is also the question of what neuromorphic hardware does well versus what it merely does differently. The 5,600× efficiency gain on continual learning tasks is real and extraordinary. The performance of neuromorphic chips on large-scale language modelling — the task that currently defines the frontier of AI capability — is still substantially below GPU baselines. Loihi 3's 32-bit graded spikes narrow this gap, but closing it entirely will require years of hardware and algorithm co-development.
The first language model to run on a neuromorphic chip used half the energy of its GPU equivalent. That number will improve. The chip it ran on had a million neurons. The successor chip has eight million. The system at Sandia has a billion. The trajectory is not ambiguous. What is still being resolved is the question of what, precisely, you build when you stop designing chips that calculate and start designing chips that think.