OpenAI Is Opening DALL·E 2
Here's an informative analysis for you to get the most out of it.
The long-awaited news is here.
OpenAI has announced they’re opening the DALL·E 2 beta. In a few weeks, everyone on the waitlist will have access to the model. For three and a half months, OpenAI has kept the system in research mode to assess its potential harms. But, as Sam Altman said on April 6th, they wanted to launch a product in the summer. The wait is now over.
Let’s see how the DALL·E 2 beta is going to work, how much it’ll cost you, what you can and can’t do with your creations, and what the immediate consequences are beyond the beta, with a few links to help you navigate the world of AI-powered creativity.
DALL·E 2 open beta
In case you’re new here and don’t know anything about DALL·E 2, I wrote an in-depth non-technical review that covers how it works, what it can do, and its inherent (technical and social) issues. I also recommend looking up DALL·E 2’s official Instagram, the subreddit r/dalle2, and the Twitter hashtag #dalle to understand just how amazing this tech is.
If you don’t want to read that much, the basic idea you need to know is that DALL·E 2 allows you to create images from words. You input a sentence (prompt) and DALL·E 2 outputs a set of original images it associates with the words you used. The normal mode (text → image) gives you 4 images per prompt. DALL·E 2 can also edit and make variations (text+image → image) on generated or uploaded images. These modes give you 3 images per prompt.
Now, let’s see how much it’ll cost you to play with the most advanced AI visual generator publicly available.
OpenAI has established a credit system to use DALL·E 2. One credit = one generation/edit/variation. That means that one credit gives you either 4 or 3 images, depending on the mode.
Each account receives 50 free credits in the first month and 15 free credits in the subsequent months. In case you want more credits, you can purchase packages of 115 credits for $15 ($0.13/credit). There lies the secret of the DALL·E 2 business model. Let’s understand why.
If you haven’t tried DALL·E 2 (or any other AI art generator, for that matter), I can tell you now that 15 credits — which is 15 prompts — is a very low number.
Let’s see an example. I decided to use Midjourney (DALL·E 2’s cousin) to create the cover image for my previous article on The Algorithmic Bridge:
It’s not a perfect result (that’s obvious) and it still took me around half an hour and multiple trial-and-error attempts. I tried three or four prompts before settling on “a typewriter with eyes, black and white, in a symbolic and meaningful style, artstation, --ar 16:9” (details on prompt engineering later). For each prompt, I made a few variations and upscaled the images a few times to get a better result.
That was, in total, around 20 requests. To get a similar image with DALL·E 2, I might need to use up my whole free monthly quota.
And I only stopped there because I got tired of trying things. Digital artists can dedicate entire days to experimenting with prompts. They could easily spend a year’s worth of credits on a single image. I’m not exaggerating: they can be real perfectionists, and once you get your hands on DALL·E 2, you may become one too.
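To put the free allowance in numbers, here is a minimal sketch. It assumes that “a year’s worth of credits” means the free credits of an account’s first twelve months, and that the 50-credit first month is granted only once; the function name is just for illustration.

```python
# Free-credit allowance as described above (first month vs. later months).
FIRST_MONTH_FREE = 50
LATER_MONTH_FREE = 15


def free_credits(months: int) -> int:
    """Total free credits granted over the first `months` months of an account.

    Assumption: the 50-credit bonus applies to the first month only.
    """
    if months <= 0:
        return 0
    return FIRST_MONTH_FREE + LATER_MONTH_FREE * (months - 1)


# Rough request count from the Midjourney cover-image example.
midjourney_requests = 20

print(free_credits(1))    # 50
print(free_credits(12))   # 215
# A regular (non-first) month would not cover that single cover image:
print(midjourney_requests > LATER_MONTH_FREE)  # True
```

Under those assumptions, a year of free use adds up to 215 credits, which a determined artist could plausibly burn through on a single piece.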
To overcome this notable limitation, OpenAI offers packages of 115 credits for $15. Taking a conservative estimate, and assuming that most people aren’t excellent prompters, I’d say 115 credits can turn into 5–10 decent images.
This is key to understanding the implications of the payment model OpenAI wants to implement. To get a better estimate of the expenses, we should think in terms of dollars per “good result” instead of dollars per attempt. $15 for 10 good results (15 if you’re really skilled), which works out to $1.00–1.50 per image you’d actually use, is fairly expensive.
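The arithmetic behind that claim can be sketched in a few lines. The package price and credit count come from the announcement; the number of “good results” per package is my own rough guess from above, not a measured figure, and the function names are only illustrative.

```python
# Paid package as announced: 115 credits for $15.
PACK_PRICE_USD = 15.0
PACK_CREDITS = 115


def cost_per_credit() -> float:
    """Dollars per credit (one generation, edit, or variation)."""
    return PACK_PRICE_USD / PACK_CREDITS


def cost_per_image(images_per_credit: int) -> float:
    """Dollars per raw image: 4 per credit for text-to-image, 3 for edits/variations."""
    return cost_per_credit() / images_per_credit


def cost_per_good_result(good_results: int) -> float:
    """Dollars per usable image, given a guessed number of keepers per package."""
    return PACK_PRICE_USD / good_results


print(f"per credit: ${cost_per_credit():.2f}")          # per credit: $0.13
print(f"per raw image (x4): ${cost_per_image(4):.3f}")  # per raw image (x4): $0.033
print(f"per raw image (x3): ${cost_per_image(3):.3f}")  # per raw image (x3): $0.043

# Guessed keepers per package: 5 (pessimistic), 10, 15 (skilled prompter).
for keepers in (5, 10, 15):
    print(f"{keepers} good results: ${cost_per_good_result(keepers):.2f} each")
```

The gap between roughly three cents per raw image and a dollar or more per usable image is the whole point: you pay for attempts, but you only care about keepers.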
Two things may soften that, though. First, OpenAI says at the end of the announcement that they’d subsidize access for “qualifying artists”. That is, those artists who depend on DALL·E 2 for their work (in contrast to people like me who plan to use it sporadically) and are “in need of financial assistance” could use the system without paying that much money.
I find this option very reasonable for anyone who could argue that DALL·E 2 may impact their job in one way or another (either because it’s threatening or because it’s a key tool for inspiration or enhancement).
If you think you meet the requirements, you can fill out this form.
Second, and more generally important, OpenAI says that “as we learn more and gather user feedback, we plan to explore other options that will align with users’ creative processes.”
This means they may modify the pricing system if they receive feedback that asks for a change. The two alternatives that come to mind are pay-per-prompt and subscription models. The first case is similar to what they use with GPT-3. You pay for each image you generate (it could be something like $0.05–0.10). This is interesting for casual users who plan to just play around with DALL·E 2 to see what the fuss is about.
A subscription model would make sense for people who plan to use the service a lot. People who don’t want to feel pressure when it comes to experimenting. Creativity doesn’t flourish if you are worried about spending too much money.
A subscription model would certainly help those users best positioned to give the most useful feedback to the company. Those people, who may not qualify for the subsidy, would eventually amortize the upfront payment.
But there’s a reason why I don’t think OpenAI is considering this business model, at least right now. It’s the one least correlated with GPU usage, which makes up the bulk of the company’s costs.
Anyway, feel free to give them feedback and you may see it change toward a business model that better fits your needs.
Understanding what you can and can’t do with DALL·E 2 is paramount, because ignoring the rules is the best way to get your account banned and your access revoked, maybe forever, depending on the infringement.
OpenAI researchers have been working hard to adapt DALL·E 2 to the current general understanding of what makes an AI model safe. First, they used a red team to assess its limitations and potential harms. Then, once they opened the research beta, they gave access slowly in small batches to gather feedback and check for possible issues they may have overlooked.
Now, with the open beta access, they establish three main policy guidelines for safety.
Curbing misuse. They don’t allow for uploading faces, generating faces of famous people, or “photorealistic generations of real individuals’ faces.” This means you can’t upload a selfie and you can’t ask DALL·E 2 to generate a pic of Trump doing something ridiculous.
Preventing harmful images. Users can’t generate images that fit into one of the prohibited categories as defined by OpenAI’s content policy (e.g. hate, sex, violence). They’ve implemented content filters and have reduced the amount of this type of data in DALL·E 2’s training set.
Reducing bias. Now, DALL·E 2 “more accurately reflect[s] the diversity of the world’s population” leveraging a new technique. With this approach, the company wants to avoid situations where, for instance, prompting “CEO” gives you only pics of White/Asian men in suits.
Beyond the beta
OpenAI opening the DALL·E 2 beta is the beginning of a lot of changes that will affect all corners of society. The main reason is they’ve decided to grant creators full usage rights over the generated images. Here’s a brief overview of the most imminent consequences.
OpenAI — contrary to what I originally thought, I have to confess — will allow users to leverage DALL·E 2 for commercial purposes.
From the announcement:
“Starting today, users get full usage rights to commercialize the images they create with DALL·E, including the right to reprint, sell, and merchandise.”
This is the most important news.
Let me illustrate why with my particular case. When I started writing I realized that good cover images were critical for a well-performing article. I started using free repositories of images like Unsplash and Pexels but soon found out those were very limited in what they could offer me. I decided to purchase a yearly subscription to Shutterstock. I’ve been using the service for a year now and it’s given me some of the best cover images I’ve found.
Once I can use DALL·E 2 for my articles (as Casey Newton has been doing), I’ll never buy a subscription to a stock image library again. For $15/month, I can easily create 10 images that perfectly match what I want, whereas any good stock image company will charge me $30/month or more for 10 images that I not only have to search for, but that simply can’t compete in precision and creativity with what I can get with DALL·E 2.
Stock image services are already dead.
But the consequences don’t stop there.
The end of graphic designers?
Not long ago, award-winning director Karen X. Cheng used DALL·E 2 to create the cover for Cosmopolitan magazine. It was the first time an AI was used for this kind of work but it won’t be the last. This was an experiment but once these AI visual generators get good enough to depict humans faithfully (hands with all their fingers and eyes that look in the correct direction) even the biggest contractors of human graphic designers — like magazines — will use AI.
But she doesn’t think DALL·E 2 will “replace humans,” as she explained in a Twitter thread. “It took hundreds of attempts … Hours and hours of prompt generating and refining before getting the perfect image.” Many people were involved in creating the Cosmopolitan cover, but once these systems are refined, a single person will be able to replace entire teams of designers, creating better art faster and more efficiently.
Humans will still be in the loop, that’s for sure. But how many humans will remain, compared to the times before DALL·E 2, is a different question.
The last bit I wanted to touch on is about communication between humans and AI.
Since GPT-3, people have realized that the way you communicate with AI systems matters a lot for the quality you get in return. You shouldn’t think of these systems as diviners. They can’t read your mind. They are great at creating new things with just a bit of help, but that help is critical. And that’s on you. You have to learn how to get the best out of them; otherwise, they can be disappointing.
That’s why researchers came up with the term prompt engineering. It reflects the fact that learning how to communicate with these AIs is a skill. People can play for days and days with GPT-3 or DALL·E 2 and realize they’re not getting better results because they never learned the correct techniques. Others may find out that even after months, they’re still improving at tweaking words and concepts here and there and getting increasingly higher-quality outputs.
This is important for two reasons.
First, digital artists, designers, and illustrators who are aware of the advent of these technologies can, and should, get ahead of the curve and develop prompting skills to stay relevant. Most people will remain, at most, casual users of these systems. People with excellent prompting skills will be able to get the best out of DALL·E 2-like AIs, while everyone else will still need to rely on those professionals to get decent art.
This is the most powerful argument against the idea that AI visual generators will take the jobs of artists or designers. They will have to update their skill set, yes. But companies won’t use DALL·E 2 directly. They will look for people who know how to use it. People with great prompt engineering skills.
And I’m quite sure those who are well-versed in visual creativity and up to date on the latest AI trends are the best positioned to fill those spots.
Let me emphasize that. Don’t be fooled: just because you can use DALL·E 2 doesn’t mean you’ll be able to do artistic magic. Any digital artist could confirm that communicating with DALL·E 2 could be considered an art in and of itself.
Just as oil painting and digital drawing require a particular set of skills, so does creating with DALL·E 2.
How difficult it is to acquire those skills will certainly define how much competition tech-savvy artists will face. For now, they have the edge.
OpenAI’s announcement isn’t unexpected. We knew this was coming. It isn’t the beginning of something within the AI field, but a continuation of already existing trends.
However, it’s definitely an inflection point. People who would have never come into explicit contact with AI otherwise will do so through DALL·E 2 creations.
It will reach the most distant corners of the non-tech-savvy world and many more people will start to become aware of AI and its influence on the world and their lives.
And what better way to achieve that than through art, creativity, and imagination?