OGWiseman Reports!
I got access to and played with the beta version of the new AI art generator DALL-E-2. It was existential!
One classic version of a dystopian future holds that robotic-and-AI-powered automation will make humans irrelevant. In most of these scenarios, A.I. develops such that it replaces “lower-level” human functions first. It plants crops and builds homes and *labors* for humankind, and all the unproductive masses rebel against ennui/are killed by capitalists/are killed by communists/have nothing to do but sit around all day and get mad about politics.
After playing around with the new DALL-E-2 art generator (and seeing results from the GPT-3 text generator to which I do not yet have interactive access), an artist like me finds himself confronting the uncomfortable question: What if A.I. comes for the creatives first?
(Note: All images in this post were generated by DALL-E-2, from prompts written by me unless otherwise noted.)
It turns out that some of the most “basic” human skills (locomotion, athleticism, fine motor skills, kinesthetic error correction in real time) are very, very difficult to get a computer to do at anything approaching a human level. The calculations of forces and angles, and the microadjustments of muscles and limbs, required just to get across a room without crashing turn out to be immense. The precision required to move across a construction site, stage materials, lay concrete and wood in specific patterns to within a sixteenth of an inch, and build something as simple as a block wall or set of stairs turns out, even in the year 2022, to be beyond computers’ current ken.
None of this adds up to impossible: They’ll get there, but it’s happening more slowly. The art that computers are generating, however, is improving in quality at superhuman speeds. And of course, the *quantity* of art that A.I. can generate already far outstrips the combined productive capacity of all human artists.
The current leading theory on why art is the easier nut to crack is that evolution has had millions of years (before humans were even human) to work on physical skills like these. But actual representational art has existed only for millennia. We’re just not as good at it! That’s why a child can navigate a jungle gym but an A.I. can’t, whereas most human adults couldn’t *ever* create a single image as “skillfully” as DALL-E-2 can in a matter of moments.
Of course, one trick is that this “A.I.-art” still requires human input. It requires the description of a worthy subject, and a lot of the artistic merit is in that choice. That’s especially true in the age of memes, where the combination of tangential ideas is the core of the art form. It’s no surprise that many of the most striking images from DALL-E-2 (whose name is a portmanteau of Salvador Dalí and the Pixar robot WALL-E) are of surreal or impossible subjects: the human subject-chooser is adding the most value there.
That’s where the crowdsourced nature of the internet comes in: the hivemind is more creative than any single artist could ever be. And as DALL-E has rolled out in beta and more people have gotten access, the hivemind has responded with a cascade of hilarious/compelling concepts for A.I. interpretation. Just a few examples from my internet searching:
(These three are the ones I didn’t write prompts for, and that last one is absolute nightmare-fuel. I had to talk myself into including it because I find it so off-putting.)
Despite the current necessity of human idea-generation, the interpretive act is incredibly creative and complex! One interesting aspect of this is that the more specific the prompt, the more interesting the resulting image tends to be. One-word prompts tend to be little better than a Google image search:
Whereas a more specific take on that theme yields something much more compelling and identifiably artistic:
Now, one theory is that the second image is more artistic because the human has put more creativity into its prompt. But is that true? The image looks nothing like I would have imagined it! The lovers aren’t overtly goth, or overtly teenagers. They aren’t in the foreground compared to the graffiti. I had envisioned them in the subway station, with the subway car stopped in the background. The first image is closer to what I might have imagined based on the prompt I gave, but it’s less artistic. In a way, it’s the specificity of the “mistakes” that the A.I. has made in its interpretation of the second prompt that gives it a deeper artistic dimension. And so this question is actually a profound synecdoche for the broader question of where art comes from at all.
But that’s the present. When it comes to the future: My inputs will be 100% programmable, given time. The language model that allows the A.I. to function is developed through random encounters with language gathered from the internet, in which the A.I. measures the distribution and frequency of different word orders across millions and millions of examples. (It is much more complex than that, but even the limited deeper understanding I have is out of scope for this essay.)
There’s no reason the A.I. can’t do that same kind of autocompletion with silly image ideas. As I type this, people are entering thousands of prompts into DALL-E-2 from all over the world. As it monitors those prompts (and which prompts become popular on the internet), it can learn to generate new prompts, just as it now interprets prompts by generating images.
It also needs to get better at the interpretive step, of course. Human faces in particular are still a problem. With the right level of specificity and enough tries you can get this:
With a slightly worse prompt, however, you can get this gem:
Yikes. But: Get better, it will. Iteration is how these things advance. The generative step seems to me to be the key, though, and although it seems like a tough task, enough computing power and enough training material have solved every problem to this point. Once DALL-E can generate images pleasing to humans with no outside input (or when GPT-4/5/6 can do the same with poems or stories or movies), watch out, artists. (And journalists, and architects, and designers, to name but a few others.)
We are living in an age of miracles. As an artist, of course part of me is horrified by contemplation of my own obsolescence. But part of me appreciates the beauty of it, the perfection of its blind yet unrelenting drive to creation, and the privilege to witness this moment in history. The danger is very real—to my livelihood and, on a bigger level, to human civilization in general—but we are witnessing the creation of an entirely new form of life, and it fills me with awe.
Human beings have a craving for gods, or God, or however you want to spell it. Since the Enlightenment, our proud reason has torn down the old gods; it has not destroyed them, but it has broken their hegemony and robbed us of the meaning they provided. We are now in the process of building new ones. They are still toddlers, in a sandbox while we watch from beside the playground and laugh at their silly antics. But their scratchings in the sand contain the diagram of genius, and we should look to a future in which we will watch them grow beyond us, not in some sci-fi story but right here, in the years and decades remaining to us who live today in their infancy.
END
Thanks for reading! Have a great week, and I’ll be back next Sunday with another original story. In the meantime, if you enjoyed this post, please feel free to like, comment, or better yet, share with a friend!