When you interact with an LLM today, you do so through an API that exchanges tokens with a model. With generative computing, IBM Research has replaced the API with a runtime that’s equipped with programming abstractions. These abstractions can create safety guardrails, outline structured requirements (including explicit calls that software at runtime can check and enforce), and use instructions to implement a generation strategy.
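To make the idea concrete, here is a minimal sketch of what a runtime-enforced requirement might look like. All names here (`requires_json_keys`, `generate_with_requirement`, the stub model) are hypothetical illustrations, not IBM's actual Granite runtime API: instead of hoping a prose instruction like "respond in JSON" is followed, the requirement is an explicit check the runtime applies to every generation, retrying until it passes.

```python
import json

def requires_json_keys(keys):
    """Return a checker verifying that output is JSON containing the given keys."""
    def check(text):
        try:
            data = json.loads(text)
        except ValueError:
            return False
        return all(k in data for k in keys)
    return check

def generate_with_requirement(model, prompt, check, max_attempts=3):
    """Call the model until its output satisfies the requirement, then return it."""
    for _ in range(max_attempts):
        output = model(prompt)
        if check(output):
            return output
    raise RuntimeError("requirement not satisfied after retries")

# Stub standing in for a real LLM: fails the check once, then complies.
attempts = []
def stub_model(prompt):
    attempts.append(prompt)
    if len(attempts) == 1:
        return "Sure! Here is the data you asked for."
    return '{"answer": "42", "confidence": "high"}'

result = generate_with_requirement(
    stub_model,
    "Answer as JSON with keys 'answer' and 'confidence'.",
    requires_json_keys(["answer", "confidence"]),
)
```

The point of the sketch is the division of labor: the check is ordinary software the runtime can execute and enforce, rather than an instruction the model may or may not honor.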
And this strategy will pay off for enterprises. Generative computing will make it possible for much smaller models to achieve the same level of accuracy that prompting delivers today, Raghavan said.
The runtime can also be used to detect hallucination, bias, and prompt injection, he explained. “As opposed to hoping that a piece of English will get correctly interpreted, you have well-trained adaptors that are going to implement security checks.”
Generative computing also enables portability, because the runtime’s structured abstractions aren’t tailored to the quirks of any specific model.
Demonstrating the organization’s commitment to generative computing, Raghavan announced that IBM is releasing a Granite runtime, as well as the next generation of Granite 4.0 models (including one small enough to fit on a single GPU), which will come later this summer. The models combine state-space and transformer architectures with a mixture-of-experts approach, and early benchmarks suggest they can perform inference two to five times faster than comparable models, Raghavan said.
“We’re really making cutting-edge technologies more performant and more consistent,” added Nazario, emphasizing what the new runtime and models will mean for enterprises. “We’re making them more cost effective, more flexible.”
The announcements were no less groundbreaking on the quantum side. “We believe quantum advantage will actually happen in 2026,” said IBM Research VP of Quantum Computing Jay Gambetta from the 2025 Think stage.
Reaching quantum advantage depends on cooperation between the quantum computing and high-performance computing communities, in the form of quantum-centric supercomputing, a compute paradigm that combines classical and quantum computing. “It’s not about classical versus quantum computing,” Gambetta said. “It’s about quantum plus classical.”
The belief that IBM will achieve quantum advantage by 2026 is rooted in IBM’s leadership in and approach to quantum computing, treating it as an engineering problem rather than a science project. As evidence of that leadership, Gambetta showed off IBM’s series of quantum processors and the packaging technologies behind them, backed by IBM’s long history of expertise in semiconductors.
In line with this, Gambetta’s team is working on new algorithms for quantum-centric supercomputers, such as sample-based quantum diagonalization (SQD), to shift what is possible with today’s pre-fault-tolerant quantum devices. Working with RIKEN in Japan, IBM Research was able to use the SQD technique to accurately simulate the ground state energy of [4Fe-4S], a 77-qubit problem beyond the scale of what's amenable to exact diagonalization methods on classical computers.
IBM Quantum expects to see advantage first in chemistry or materials science, then in optimization, and finally in mathematical problems. With quantum advantage just around the corner, the IBM Quantum roadmap continues to push out toward fault-tolerant quantum computing with IBM Quantum Starling in 2029.