Generative AI is changing everything. But what’s left when the hype is gone?
It was clear that OpenAI was on to something. In late 2021, a small team of researchers was playing around with an idea at the company’s San Francisco office. They’d built a new version of OpenAI’s text-to-image model, DALL-E, an AI that converts short written descriptions into pictures: a fox painted by Van Gogh, perhaps, or a corgi made of pizza. Now they just had to figure out what to do with it.
“Almost always, we build something and then we all have to use it for a while,” Sam Altman, OpenAI’s cofounder and CEO, tells MIT Technology Review. “We try to figure out what it’s going to be, what it’s going to be used for.”
Not this time. As they tinkered with the model, everyone involved realized this was something special. “It was very clear that this was it—this was the product,” says Altman. “There was no debate. We never even had a meeting about it.”
But nobody—not Altman, not the DALL-E team—could have predicted just how big a splash this product was going to make. “This is the first AI technology that has caught fire with regular people,” says Altman.
DALL-E 2 dropped in April 2022. In May, Google announced (but did not release) two text-to-image models of its own, Imagen and Parti. Then came Midjourney, a text-to-image model made for artists. And August brought Stable Diffusion, an open-source model that the UK-based startup Stability AI has released to the public for free.
The doors were off their hinges. OpenAI signed up a million users in just 2.5 months. More than a million people started using Stable Diffusion via its paid-for service DreamStudio in less than half that time; many more used Stable Diffusion through third-party apps or installed the free version on their own computers. (Emad Mostaque, Stability AI’s founder, says he’s aiming for a billion users.)
And then in October we had Round Two: a spate of text-to-video models from Google, Meta, and others. Instead of just generating still images, these can create short video clips, animations, and 3D pictures.
The pace of development has been breathtaking. In just a few months, the technology has inspired hundreds of newspaper headlines and magazine covers, filled social media with memes, kicked a hype machine into overdrive—and set off an intense backlash.
“The shock and awe of this technology is amazing—and it’s fun, it’s what new technology should be,” says Mike Cook, an AI researcher at King’s College London who studies computational creativity. “But it’s moved so fast that your initial impressions are being updated before you even get used to the idea. I think we’re going to spend a while digesting it as a society.”
Artists are caught in the middle…