The core principles of academic artificial intelligence have remained largely unchanged over the past few decades: the underlying neural network technology was theorized and put into practice as far back as the 1990s, and early AI chatbots like Eliza date to 1964. Today’s advances in AI image generation stem not from new ideas but from the vast amounts of data and computing power amassed by tech companies, which have turned those old neural networks into tools capable of creating complex images. Non-profits like LAION and Common Crawl have assembled enormous datasets of public images and text descriptions, inadvertently sweeping up private material, as the artist Lapine discovered when she found her own medical photographs in the AI training data. This underscores concerns about the appropriation of individual creators' work, with much of AI output relying on uncredited and unpaid human labor. The text critiques the commercialization of these technologies, arguing that they exploit personal and cultural resources for the profit of a few tech giants while failing to deliver the transformative experiences promised to society.
Write me a 5-sentence summary of the text below.
The fundamental concepts of academic artificial intelligence have not changed in the last couple of decades. The underlying technology of neural networks – a method of machine learning based on the way physical brains function – was theorised and even put into practice back in the 1990s. You could use them to generate images then, too, but they were mostly formless abstractions, blobs of colour with little emotional or aesthetic resonance. The first convincing AI chatbots date back even further. In 1964, Joseph Weizenbaum, a computer scientist at the Massachusetts Institute of Technology, developed a chatbot called Eliza. Eliza was modelled on a “person-centred” psychotherapist: whatever you said, it would mirror back to you. If you said “I feel sad”, Eliza would respond with “Why do you feel sad?”, and so on. (Weizenbaum actually wanted his project to demonstrate the superficiality of human communication, not to be a blueprint for future products.)
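Eliza's mirroring was not understanding but pattern-matching: a handful of templates that turned statements back into questions. A toy sketch in Python, using hypothetical rules rather than Weizenbaum's actual 1964 script, conveys the idea:

```python
import re

# A few illustrative reflection rules in the spirit of Eliza's
# "person-centred" mirroring. These patterns are hypothetical,
# not Weizenbaum's original program.
RULES = [
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bI want (.+)", re.IGNORECASE), "What would it mean to you to get {0}?"),
]

def respond(utterance: str) -> str:
    """Mirror the user's statement back as a question, Eliza-style."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(match.group(1).rstrip(".!"))
    return "Please tell me more."

if __name__ == "__main__":
    print(respond("I feel sad"))  # -> "Why do you feel sad?"
```

The point of the sketch is how shallow the mechanism is: no model of the world, just text reflected back at the speaker.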
Early AIs didn’t know much about the world, and academic departments lacked the computing power to exploit them at scale. The difference today is not intelligence, but data and power. The big tech companies have spent 20 years harvesting vast amounts of data from culture and everyday life, and building vast, energy-hungry data centres filled with ever more powerful computers to churn through it. What were once creaky old neural networks have become super-powered, and the gush of AI we’re seeing is the result.
AI image generation relies on the assembly and analysis of millions upon millions of tagged images; that is, images that come with some kind of description of their content already attached. These images and descriptions are then processed through neural networks that learn to associate particular and deeply nuanced qualities of the image – shapes, colours, compositions – with certain words and phrases. These qualities are then layered on top of one another to produce new arrangements of shape, colour and composition, based on the billions of differently weighted associations produced by a simple prompt. But where did all those original images come from?
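Before turning to that question, a minimal sketch shows how little stands between a trained model and a finished picture. It assumes the open-source Hugging Face diffusers library and the publicly released runwayml/stable-diffusion-v1-5 checkpoint of Stable Diffusion, the model discussed in the next paragraph; both the library and the checkpoint are assumptions here, and the prompt is purely illustrative:

```python
# Prompt-to-image generation with an already-trained model.
# Assumes the Hugging Face "diffusers" library and the public
# "runwayml/stable-diffusion-v1-5" checkpoint (assumptions, not
# anything specified in the text above).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # half precision to fit on a consumer GPU
)
pipe = pipe.to("cuda")

# The prompt is matched against the model's learned text-image
# associations, and the denoising process assembles a new arrangement
# of shapes, colours and composition weighted by those associations.
image = pipe("a small pig in the style of a children's book illustration").images[0]
image.save("piggy.png")
```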
The datasets released by LAION, a German non-profit, are a good example of the kind of image-text collections used to train large AI models (they provided the basis for both Stable Diffusion and Google’s Imagen, among others). For more than a decade, another non-profit web organisation, Common Crawl, has been indexing and storing as much of the public world wide web as it can access, filing away as many as 3bn pages every month. Researchers at LAION took a chunk of the Common Crawl data and pulled out every image with an “alt” tag, a line or so of text meant to be used to describe images on web pages. After some trimming, links to the original images and the text describing them are released in vast collections: LAION-5B, released in March 2022, contains more than five billion text-image pairs. These images are “public” images in the broadest sense: any image ever published on the internet may be gathered up into them, with exactly the kind of strange effects one may expect.
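The harvesting step itself is almost trivially simple, which is part of why these collections grew so large. A toy sketch of the alt-tag extraction idea for a single page, assuming the requests and BeautifulSoup libraries, might look like the following; LAION's real pipeline over Common Crawl adds filtering, deduplication and enormous scale not shown here:

```python
# Toy illustration of alt-tag harvesting: scan one HTML page for <img>
# elements and keep (image URL, alt text) pairs. Not LAION's actual code.
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def extract_image_text_pairs(page_url: str) -> list[tuple[str, str]]:
    """Return (absolute image URL, alt text) pairs found on one web page."""
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for img in soup.find_all("img"):
        alt = (img.get("alt") or "").strip()
        src = img.get("src")
        if src and alt:  # keep only images that come with a description
            pairs.append((urljoin(page_url, src), alt))
    return pairs

if __name__ == "__main__":
    for url, alt in extract_image_text_pairs("https://example.com"):
        print(alt, "->", url)
```

Run across billions of crawled pages, this kind of indiscriminate extraction is how private material can end up in a public training set.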
In September 2022, a San Francisco–based digital artist named Lapine was using a tool called Have I Been Trained, which allows artists to see if their work is being used to train AI image generation models. Have I Been Trained was created by the artists Mat Dryhurst and Holly Herndon, whose own work led them to explore the ways in which artists’ labour is coopted by AI. When Lapine used it to scan the LAION database, she found an image of her own face. She was able to trace this image back to photographs taken by a doctor when she was undergoing treatment for a rare genetic condition. The photographs were taken as part of her clinical documentation, and she signed documents that restricted their use to her medical file alone. The doctor involved died in 2018. Somehow, these private medical images ended up online, then in Common Crawl’s archive and LAION’s dataset, and were finally ingested into the neural networks as they learned about the meaning of images, and how to make new ones. For all we know, the mottled pink texture of our Saint-Exupéry-style piggy could have been blended, however subtly, from the raw flesh of a cancer patient.
“It’s the digital equivalent of receiving stolen property. Someone stole the image from my deceased doctor’s files and it ended up somewhere online, and then it was scraped into this dataset,” Lapine told the website Ars Technica. “It’s bad enough to have a photo leaked, but now it’s part of a product. And this goes for anyone’s photos, medical record or not. And the future abuse potential is really high.” (According to her Twitter account, Lapine continues to use tools like Dall-E to make her own art.)
The entirety of this kind of publicly available AI, whether it works with images or words, as well as the many data-driven applications like it, is based on this wholesale appropriation of existing culture, the scope of which we can barely comprehend. Public or private, legal or otherwise, most of the text and images scraped up by these systems exist in the nebulous domain of “fair use” (permitted in the US, but questionable if not outright illegal in the EU). Like most of what goes on inside advanced neural networks, it’s really impossible to understand how they work from the outside, rare encounters such as Lapine’s aside. But we can be certain of this: far from being the magical, novel creations of brilliant machines, the outputs of this kind of AI are entirely dependent on the uncredited and unremunerated work of generations of human artists.
AI image and text generation is pure primitive accumulation: expropriation of labour from the many for the enrichment and advancement of a few Silicon Valley technology companies and their billionaire owners. These companies made their money by inserting themselves into every aspect of everyday life, including the most personal and creative areas of our lives: our secret passions, our private conversations, our likenesses and our dreams. They enclosed our imaginations in much the same manner as landlords and robber barons enclosed once-common lands. They promised that in doing so they would open up new realms of human experience, give us access to all human knowledge, and create new kinds of human connection. Instead, they are selling us back our dreams repackaged as the products of machines, with the only promise being that they’ll make even more money advertising on the back of them.