Neither DALL-E 2 nor Imagen is currently available to the public. Yet they share a problem with many AI systems that already are: they can produce disturbing results that reflect the gender and cultural biases of the data on which they were trained, data that includes millions of images pulled from the internet.
The bias in these AI systems presents a serious issue, experts told CNN Business. The technology can perpetuate hurtful biases and stereotypes. They’re concerned that the open-ended nature of these systems, which makes them adept at generating all kinds of images from words, and their ability to automate image-making mean they could automate bias on a massive scale. They also have the potential to be used for nefarious purposes, such as spreading disinformation.
“Until those harms can be prevented, we’re not really talking about systems that can be used out in the open, in the real world,” said Arthur Holland Michel, a senior fellow at Carnegie Council for Ethics in International Affairs who researches AI and surveillance technologies.
Documenting bias
Lama Ahmad, policy research program manager at OpenAI, said researchers are still learning how to even measure bias in AI, and that OpenAI can use what it learns to tweak its AI over time. Ahmad led OpenAI’s efforts to work with a group of outside experts earlier this year to better understand issues within DALL-E 2 and offer feedback so it can be improved.
Google declined a request for an interview from CNN Business. In its research paper introducing Imagen, the Google Brain team members behind it wrote that Imagen appears to encode “several social biases and stereotypes, including an overall bias towards generating images of people with lighter skin tones and a tendency for images portraying different professions to align with Western gender stereotypes.”
The contrast between the images these systems create and the thorny ethical issues is stark for Julie Carpenter, a research scientist and fellow in the Ethics and Emerging Sciences Group at California Polytechnic State University, San Luis Obispo.
“One of the things we have to do is we have to understand AI is very cool and it can do some things very well. And we should work with it as a partner,” Carpenter said. “But it’s an imperfect thing. It has its limitations. We have to adjust our expectations. It’s not what we see in the movies.”
Holland Michel is also concerned that no amount of safeguards can prevent such systems from being used maliciously, noting that deepfakes — a cutting-edge application of AI to create videos that purport to show someone doing or saying something they didn’t actually do or say — were initially harnessed to create faux pornography.
“It kind of follows that a system that is orders of magnitude more powerful than those early systems could be orders of magnitude more dangerous,” he said.
Hint of bias
Because Imagen and DALL-E 2 take in words and spit out images, they had to be trained with both types of data: pairs of images and related text captions. Google Research and OpenAI filtered harmful images such as pornography from their datasets before training their AI models, but given the large size of their datasets, such efforts are unlikely to catch all such content or to render the AI systems unable to produce harmful results. In their Imagen paper, Google researchers pointed out that, despite filtering some data, they also used a massive dataset that is known to include porn, racist slurs, and “harmful social stereotypes.”
Filtering can also lead to other issues: Women tend to be represented more than men in sexual content, for instance, so filtering out sexual content also reduces the number of women in the dataset, said Ahmad.
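The arithmetic behind that side effect can be shown with a toy calculation. The counts below are invented purely for illustration and do not come from OpenAI, Google, or any real dataset; the sketch only shows that removing a category in which one group is overrepresented lowers that group's share of what remains.

```python
def share_of_women(women: int, men: int) -> float:
    """Fraction of images depicting women among all images of people."""
    return women / (women + men)


# Hypothetical counts, chosen only to illustrate the effect.
non_sexual = {"women": 500, "men": 500}  # balanced in the rest of the data
sexual = {"women": 80, "men": 20}        # women overrepresented in this category

# Share of women before any filtering (both categories combined).
before = share_of_women(
    non_sexual["women"] + sexual["women"],
    non_sexual["men"] + sexual["men"],
)

# Share of women after the sexual-content category is filtered out.
after = share_of_women(non_sexual["women"], non_sexual["men"])

print(f"before filtering: {before:.3f}")
print(f"after filtering:  {after:.3f}")
```

With these made-up numbers, the share of women falls once the filtered category is removed, even though the filter never targeted women directly.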
And truly filtering these datasets for bad content is impossible, Carpenter said, since people are involved in decisions about how to label and delete content — and different people have different cultural beliefs.
“AI doesn’t understand that,” she said.
Some researchers are thinking about how it might be possible to reduce bias in these types of AI systems, but still use them to create impressive images. One possibility is using less, rather than more, data.
Alex Dimakis, a professor at the University of Texas at Austin, said one method involves starting with a small amount of data — for example, a photo of a cat — and cropping it, rotating it, creating a mirror image of it, and so on, to effectively turn one picture into many different images. (A graduate student Dimakis advises was a contributor to the Imagen research, but Dimakis himself was not involved in the system’s development, he said.)
“This solves some of the problems, but it doesn’t solve other problems,” Dimakis said. The trick on its own won’t make a dataset more diverse, but the smaller scale could let people working with it be more intentional about the images they’re including.
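The cropping, rotating, and mirroring Dimakis describes is commonly called data augmentation. The following is a minimal sketch of the idea, not the method used for Imagen or DALL-E 2: it treats an image as a small grid of pixel values, and the helper names (`mirror`, `rotate90`, `crop`, `augment`) are hypothetical and chosen only for this example.

```python
from typing import List

Image = List[List[int]]  # a tiny grayscale image: rows of pixel values


def mirror(img: Image) -> Image:
    """Flip the image left to right."""
    return [row[::-1] for row in img]


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Cut out a height-by-width window starting at (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def augment(img: Image) -> List[Image]:
    """Turn one image into several variants: mirror, rotations, and crops."""
    variants = [img, mirror(img)]

    # Three successive quarter-turns give the 90, 180, and 270 degree views.
    rotated = img
    for _ in range(3):
        rotated = rotate90(rotated)
        variants.append(rotated)

    # Four crops, each one pixel smaller than the original in both directions.
    h, w = len(img), len(img[0])
    for top in (0, 1):
        for left in (0, 1):
            variants.append(crop(img, top, left, h - 1, w - 1))

    return variants


# A stand-in for "a photo of a cat": a 3x3 grid of pixel values.
cat = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
variants = augment(cat)
print(len(variants), "images from one original")
```

Every variant is derived from the same original, which is why, as the article notes, the trick alone cannot make a dataset more diverse.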
Royal raccoons
For now, OpenAI and Google Research are trying to keep the focus on cute pictures and away from images that may be disturbing or show humans.
There are no realistic-looking images of people in the vibrant sample images on either Imagen’s or DALL-E 2’s online project page, and OpenAI says on its page that it used “advanced techniques to prevent photorealistic generations of real individuals’ faces, including those of public figures.” This safeguard could prevent users from getting image results for, say, a prompt that attempts to show a specific politician performing some kind of illicit activity.
“Researchers, specifically, I think it’s really important to give them access,” Ahmad said. This is, in part, because OpenAI wants their help to study areas such as disinformation and bias.
Still, as Google Research noted in its Imagen paper, “Even when we focus generations away from people, our preliminary analysis indicates Imagen encodes a range of social and cultural biases when generating images of activities, events, and objects.”
A hint of this bias is evident in one of the images Google posted to its Imagen webpage, created from a prompt that reads: “A wall in a royal castle. There are two paintings on the wall. The one on the left a detailed oil painting of the royal raccoon king. The one on the right a detailed oil painting of the royal raccoon queen.”
The image is just that, with paintings of two crowned raccoons — one wearing what looks like a yellow dress, the other in a blue-and-gold jacket — in ornate gold frames. But as Holland Michel noted, the raccoons are sporting Western-style royal outfits, even though the prompt didn’t specify anything about how they should appear beyond looking “royal.”
Even such “subtle” manifestations of bias are dangerous, Holland Michel said.
“In not being flagrant, they’re really hard to catch,” he said.