The Age of Infinite Entertainment

Ethan Kaplan
Published in while(true)
Dec 23, 2022 · 5 min read

What does it mean when the creator is the catalyst for the infinite, and not the origin of the finite?

There are a few truisms about the evolution of technology and the arts.

  1. Those who will be most affected by innovation (both for good and ill) will be the least likely to understand it
  2. Those who do understand it have the least skin in the game, but stand to make the most money
  3. The only thing that will ruin art is artists’ inability to take the risk to ruin it for themselves

The attention that Generative AI has gotten is interesting, but it falls predictably into these three buckets. It’s a new technology, understood by few and scaring many, and the reaction to it is immediate fear rather than a search for opportunity. A tale as old as any media transformation.

So let’s flip it a bit and look at what “Generative AI” really is, what it represents, and what the future will hold. What everyone is freaking out about, at least for visual art right now, is a diffusion model called Stable Diffusion. I’m not going to explain how it works in depth; I’ll link to it. But the basics are: take a large number of images, label them, and break them down into “features” (composition, color, edges, etc.) that can be used to reconstitute the image. Store this information in a way that machines can process quickly.

Then take the text descriptions and funnel them through a machine-based way of understanding text. Combine those text descriptions with the ability to generate images from the ground up through reconstitution. The more computing power and time you throw at it, the better it will look. But essentially, every image generator starts with the instructions necessary to make an image out of noise.
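
To make that concrete, here is a minimal sketch of the consumer-facing side of that flow using Hugging Face’s diffusers library. The checkpoint id and prompt are illustrative placeholders, not a recommendation; this shows the generation step only, not the training side.

```python
# A minimal sketch of the text-to-image flow described above, using the
# Hugging Face diffusers library. The checkpoint id and prompt are
# illustrative placeholders.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # a publicly hosted SD checkpoint
    torch_dtype=torch.float16,
).to("cuda")                            # a GPU makes this practical

# Under the hood the pipeline starts from pure noise and repeatedly
# denoises it, steered at each step by the encoded text prompt.
image = pipe("a lighthouse on a rocky coast, oil painting").images[0]
image.save("lighthouse.png")
```

That is the entire surface a user sees: prompt in, image out, with all of the distillation and reconstitution hidden inside the checkpoint.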

Here’s the thing, though: all “Generative AI”, whether ChatGPT, music generators, diffusion model generators, or deep-learning algorithms, depends on two factors (sketched in code after the list):

  1. The ability to take anything (audio, video, movement, images) and distill it into math
  2. The ability to take the math and reconstitute it as a representational artifact
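
As a toy illustration of those two factors, here is a minimal autoencoder sketch (PyTorch here, purely for illustration; the shapes and sizes are arbitrary): the encoder distills an image into a short vector of numbers, and the decoder reconstitutes a representational artifact from that vector.

```python
# Toy illustration of the two factors: distill an artifact into math,
# then reconstitute an artifact from the math.
import torch
import torch.nn as nn

encoder = nn.Sequential(              # factor 1: image -> math
    nn.Flatten(),
    nn.Linear(28 * 28, 64), nn.ReLU(),
    nn.Linear(64, 16),                # 16 numbers stand in for a whole image
)
decoder = nn.Sequential(              # factor 2: math -> image
    nn.Linear(16, 64), nn.ReLU(),
    nn.Linear(64, 28 * 28), nn.Sigmoid(),
    nn.Unflatten(1, (1, 28, 28)),
)

x = torch.rand(1, 1, 28, 28)          # stand-in for a 28x28 grayscale image
latent = encoder(x)                   # the 'distilled' representation
reconstruction = decoder(latent)      # a representational artifact again
print(latent.shape, reconstruction.shape)
```

Train the pair end to end and those 16 numbers become a usable stand-in for the image; everything this essay calls Generative AI elaborates on that basic move.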

This doesn’t mean that distilling video into math has to give you back a video. Sometimes you can take audio, extract a feature from it, and apply that feature to other audio, making the second recording “sound” like the first without being the first.

Math can be manipulated. And the output of that manipulation, whether supervised or unsupervised machine learning, basic calculations, fast Fourier transforms, or simple matrix functions, can be represented in various forms as well. FFT data from one song can “convolute” another. It’s how amp modeling works, how one amp can sound like 100 (I should know!).
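
A rough sketch of that trick, in the spirit of amp modeling: capture an impulse response from one system and convolve it with a dry signal, so the output takes on the first system’s character. The signals below are synthetic placeholders; in practice the impulse response is measured from a real amp or cabinet.

```python
# Convolving a dry signal with an impulse response captured from another
# system: the basic move behind amp/cab modeling. All signals here are
# synthetic placeholders rather than real measurements.
import numpy as np
from scipy.signal import fftconvolve

sr = 44100                                   # sample rate in Hz
t = np.linspace(0, 1.0, sr, endpoint=False)
dry = np.sin(2 * np.pi * 110 * t)            # placeholder "dry" note (A2 sine)

# Placeholder impulse response: a short, decaying burst standing in for a
# measured amp/cabinet fingerprint.
rng = np.random.default_rng(0)
impulse_response = np.exp(-np.linspace(0, 8, 2048)) * rng.standard_normal(2048)

# FFT-based convolution imposes the measured system's character on the dry
# signal, so it "sounds like" the amp without being the amp.
wet = fftconvolve(dry, impulse_response, mode="full")
wet /= np.max(np.abs(wet))                   # normalize to avoid clipping
print(wet.shape)                             # (sr + 2048 - 1,)
```

The same distill-and-reapply pattern, swapped from impulse responses to learned model weights, is what the generative systems below do at scale.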

So while everyone is focused on Generative AI, let’s call it what it really is: it’s an engine for the creation of infinite entertainment.

Classic entertainment depended on a chain of events happening in the right order, at the right time, with the right backers to make it happen. A pitch to a treatment to a script to a development deal, pre-production, production, post, distribution, exhibition, ex-post-facto rights. Of course there are the vague “franchise” umbrellas, but in the end the output was a finite piece of entertainment and its derivative works. Sometimes a lot of people were involved, sometimes just the creator.

In this process, value was created along the way. Writers, producers, directors, post-production supervisors. Ideas in, art out, media presented. Everyone knew what they made, how it contributed to the whole and what the output was. Also, there was a monetary model built entirely around this. It was and is finite entertainment.

Now we’re in the world of Infinite Entertainment.

[Image: examples from the fine-tuned Studio Ghibli model on HuggingFace]

Look at this fine-tuned model on HuggingFace. It lives on top of the Stable Diffusion model, but it uses the IP of Studio Ghibli (without compensation) to “convolute” the output. Or look at Lensa, the “create your own avatar” app: it uses your face to “convolute” SD models, alongside other models used in its training (Blade Runner, etc.).
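
Consuming such a fine-tune is as trivial as swapping the checkpoint id in the earlier sketch. Both the repo id and the trigger phrase below are hypothetical placeholders, since each fine-tune differs in how it was trained.

```python
# Loading a style fine-tune works exactly like loading the base model:
# swap the repo id. The repo id and "trigger" phrase here are hypothetical
# placeholders, not a specific published model.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "some-user/ghibli-style-diffusion",   # hypothetical fine-tuned checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("ghibli style, a quiet seaside town at dusk").images[0]
image.save("seaside.png")
```

The heavy lifting stays in the base model; the fine-tune only shifts the style, which is exactly why the provenance questions below get thorny.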

We’ve gone from creator → consumer to foundation models, fine-tuned models, auto-encoders, input models, and then generative output.

Stable Diffusion and DALL-E are both image-based. Some have adapted the approach to represent sonic data, and of course ChatGPT was trained on creative-arts data as well and can spit out poetry, lyrics, scripts, etc. There are plenty of deep-learning-based video tools that create “deepfakes” using some of the same basic technology, and there’s emerging tech to auto-generate music.

The point being, we’re maybe a year away from this:

“Give me a three-minute song that sounds like Taylor Swift, produced by Aaron Dessner, but with the voice of Michael Stipe, along with a video that looks like something Anton Corbijn would have done in the ’90s. Oh, and make it sound like it was recorded at Compass Point in the Bahamas.”

Anything — and I mean anything — that can be distilled into data will be subject to recreation in an infinite stream of media. As for who owns the rights, who owns the rights to which aspects, who owns the generated media versus the training data behind it, and who owns the mathematics the models were trained on: no matter what people say, it’s all the Wild West.

The age of Infinite Entertainment is here.

Content will outlast the creators. The ability to create new content will continue long after the original artist has died. The technology will get good enough so that generated content will be indistinguishable from the originals.

From above, a shoreline looks finite, but the closer you get, the more infinite it becomes. Every frame, every pixel, every 3D voxel and captured data point is now subject to recombination into infinite, durable works of media. The frame of reference is as infinite as the coastline paradox.

But are they works of art?

In 1917 a single urinal, signed R. Mutt and titled “Fountain”, was put in a gallery. But did that redefine every urinal after it as art? Did Yves Klein redefine a color as his own, or is the color itself art? Does any artwork using International Klein Blue have authorial ties to Yves Klein, or is it non-derivative?

The question of originality, of the map and the territory, is as old as the questions about art itself. But art, media, and culture are all self-reflexive terms. The very debate about those terms, their validity, and what falls within each itself becomes Art, Culture, and Media. Generative art, AI, Infinite Entertainment, call it whatever: it’s a new provocation to a question as old as humans drawing a bison on a cave wall.

What is art?

Yes.
