There's some Balenciaga in this AI-cinema, Mr. Frodo, and it's worth fighting for
Plus: ChatGPT-Sims and the coming of autonomous AI / Join me on Substack Notes
The Balenciaga of AI-cinema
Rebecca Jennings at Vox writes about the viral AI-videos in the Balenciaga-saga, which I featured in this newsletter here (the Harry Potter one) and there (the Breaking Bad one), and tries her hand at generating her own version of a Lord of the Rings-movie directed by Wes Anderson. She says her clip is “really bad”, but I think it’s okay for a first try at the tech with some five-buck tools found on the web.
She also says that “the world of on-demand bizarro fanfic is far away from being something that we actually need to worry about”, and while that’s true for now, I wonder how long that holds up and how long IP-owners will keep watching this show.
When Midjourney released v4 of their image generator and people went wild with AI-cinema, I collected some of the best efforts into a three-part series here, here and here. Right now we are entering the phase where all of this becomes viral snack-video fodder: barely animated, one-minute visions of movies from a parallel universe. I expect those to be longer and more sophisticated by the end of the year, and soon you’ll be able to play and generate stuff like this, endlessly, on any device. It’s gonna take some time to get from this to a true movie-generator though, because the devil is in the details. Getting sound and synchronicity right may take a while, and very possibly the details are too fine-grained to ever work with consumer-friendly zero-shot prompting, but we’ll see.
More crucial is the question of how long Balenciaga and Warner Brothers, which owns the Harry Potter-IP, will tolerate technology that can interpolate between their IP and any other IP out there. As I wrote in my piece on the Stable Diffusion-lawsuits: “All Stable Diffusion-Checkpoint-files ‘know’ what [Harry Potter] looks like, because Stability and LAION used tons of [Harry Potter]-images, without paying a dime to Warner”. We have no clue how the legal system will react to a technology which can interpolate between IP and put Batman into Lord of the Rings, but I don’t think many of us will like the outcome. Keep in mind that the legal status of fanfiction is debatable, is decided by courts on a case-by-case basis, and is for the most part merely tolerated by rightsholders. That’s one side of the argument.
But I also think that there is a larger point to be made about my freedom to imagine whatever. In my takes on AI-cinema, and especially in a piece on digital lucid dreaming, I wrote about a world in which an AI-system visualizes my thoughts and wishes in real time, possibly through a Brain-Computer-Interface in a Virtual Reality environment. In that world, I can imagine Donald Duck dancing with Betty Boop in The Matrix and just see it before my own eyes, generating wake-state lucid dreams which I control. Do we want copyright to regulate imagination? I don’t. The same point can be made for current image generators: I can, right now, draw Batman on a piece of paper. Nobody controls that and it’s my freedom to draw him in any way I can imagine. I may not be free to publish that drawing, but I can absolutely put it on paper. Producing a pattern matching system that looks at all images on the web and democratizes image making, so that anyone can create any image for private purposes, should then not be illegal.
It gets tricky from here on, because drawing is not the same as generating. In the former you aim at a visual goal composed of all kinds of influences, transform that into motor functions to move your arm, and use technical skills to create lines on a piece of paper (or pixels on a screen). In the latter, you command a stochastic library to find an image in an interpolatable latent space, and you don’t know exactly what you’ll get. Regulating interpolation will be a tough nut to crack, and legal experts are already considering new, innovative ways of thinking about this, for instance by combining Copyleft-principles with institutional oversight. It will take a decade minimum until this stuff is really figured out.
Until then, there’s a lot of runway for Harry Potter on the catwalk.
So, here’s the Balenciaga-saga in reverse chronological order, with recent entries tackling Breaking Bad, the history of computer science (Steve Jobs, Marc Andreessen and Bill Gates are spot on) and Lord of the Rings.
ChatGPT-Sims are a thing now
In Generative Agents: Interactive Simulacra of Human Behavior, researchers at Google and Stanford created a village of 25 ChatGPT-characters, gave them memory, history and an identity, and switched on the simulation. Here’s the project page with a replay of the game, and here’s a good writeup on Ars Technica: Surprising things happen when you put 25 AI agents together in an RPG town. It’s basically The Sims powered by ChatGPT.
In their paper, the researchers write that they created "believable individual and emergent social behaviors," which led to "believable simulations of human behavior" in an "artificial society".
From the paper: "In this work, we demonstrate generative agents by populating a sandbox environment, reminiscent of The Sims, with twenty-five agents. Users can observe and intervene as agents plan their days, share news, form relationships, and coordinate group activities."
Twitter-user Nonmayorpete compared the simulation to Westworld and has a good writeup about the details:
They preloaded 25 characters with a starting persona, including: An identity (name, occupation, priorities); Information about other characters; Relationships with other characters; Some intention about how to spend their day.
What happened is pretty remarkable. Pattern 1: They shared information with each other. Sam has decided to run for mayor and shares that with Tom. John also hears the news. Later in the day, Tom and John separately bring up Sam's candidacy and chances of winning!
Similarly, Isabella starts the day with a plan to host a Valentine's Day party. She spreads the word, and by the end of the simulation, 12 characters know about the party. Much like humans, 7 of them flaked - 3 of them had "other plans" and the other 4 just didn't show.
Pattern 2: They form new relationships and remember them. Sam and Latoya don't know each other at the start. They meet at a park, and Latoya says she's working on a photography project. When Sam and Latoya meet again later, Sam says: "Hi, Latoya. How is your project going?"
Pattern 3: They coordinate with each other. Researchers gave Isabella (the v-day party host) and Maria two pieces of info. Isabella: You will throw a party. Maria: You have a crush on Klaus. Without any further instruction, Isabella invites people she sees to the party, decorates the venue, asks Maria for help. Meanwhile, Maria jumps the opportunity to get closer to Klaus by inviting him along as well. (…)
The characters can also reflect. Periodically, they review their memory log and form new insights from these memories. They can also reflect on previous reflections!
Finally, the agents can form and revise plans. They start by forming broad plans with 5-8 "chunks". Then, they break these plans into 1-hour increments, then 5-15 minute increments. We humans also change plans regularly, and so do these characters. New observations - changes in the environment, other characters being present, etc. - can trigger a character to change their plan. These changes also rely on their memory and reflections.
In summary, the researchers created a world where NPCs lived with minds of their own, with personalities, memories, plans, relationships and more. Basically, a mini-Westworld.
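The plan-refinement step described in the thread can be sketched roughly like this. It is a toy illustration under my own assumptions, not the researchers' code: `split_block` is a hypothetical stand-in for asking ChatGPT to refine a plan, the `Block` and `Agent` types are made up for this example, and the "replan" rule is deliberately crude.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Block:
    """One planned activity (hypothetical type for this sketch)."""
    start: int      # minutes since midnight
    minutes: int    # duration
    activity: str


def split_block(block: Block, step: int) -> list[Block]:
    """Cut a block into consecutive pieces of at most `step` minutes.

    Stand-in for the language-model call: a real agent would also
    rewrite the activity text for every piece.
    """
    pieces = []
    offset = 0
    while offset < block.minutes:
        length = min(step, block.minutes - offset)
        pieces.append(Block(block.start + offset, length, block.activity))
        offset += length
    return pieces


def refine(plan: list[Block], step: int) -> list[Block]:
    """Refine every block of a plan to the given granularity."""
    return [piece for block in plan for piece in split_block(block, step)]


@dataclass
class Agent:
    name: str
    memory: list[str] = field(default_factory=list)  # the memory log
    plan: list[Block] = field(default_factory=list)

    def observe(self, now: int, event: str) -> None:
        """Log an observation and replan from `now` on.

        Assumption: every observation triggers a replan. Blocks that
        already started are kept, the rest is replaced by a reaction.
        """
        self.memory.append(event)
        started = [b for b in self.plan if b.start < now]
        self.plan = started + [Block(now, 15, f"react to: {event}")]


# Broad chunks first, then 1-hour blocks, then 15-minute blocks.
broad = [Block(9 * 60, 180, "decorate the cafe"), Block(12 * 60, 60, "lunch")]
hourly = refine(broad, 60)
fine = refine(hourly, 15)

isabella = Agent("Isabella", plan=fine)
isabella.observe(10 * 60, "Maria offers to help")
```

The point of the sketch is only the shape of the data: a coarse plan gets cut into finer and finer blocks, and a new observation lands in memory and throws out whatever part of the plan has not started yet.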
More in depth, here’s Seth Herd on LessWrong talking about how Agentized LLMs will change the alignment landscape:
These techniques use an LLM as a central cognitive engine, within a recursive loop of breaking a task goal into subtasks, working on those subtasks (including calling other software), and using the LLM to prioritize subtasks and decide when they're adequately well done. They recursively check whether they're making progress on their top-level goal.
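The loop from that quote can be sketched in a few lines. This is a minimal sketch under my own assumptions, not the code of any actual agent framework: `decompose`, `execute` and `is_done` are hypothetical callbacks that would each wrap an LLM call (or other software) in a real system, and here they are replaced by trivial stubs so the example runs on its own.

```python
from typing import Callable


def run_agent(
    goal: str,
    decompose: Callable[[str], list[str]],
    execute: Callable[[str], str],
    is_done: Callable[[str, str], bool],
    max_steps: int = 20,
) -> list[tuple[str, str]]:
    """Work through subtasks of `goal` until none remain or the budget is spent."""
    queue = list(decompose(goal))
    finished = []
    for _ in range(max_steps):
        if not queue:
            break
        task = queue.pop(0)
        result = execute(task)
        if is_done(task, result):
            finished.append((task, result))
        else:
            # Not adequately done: split the task further, pieces go first.
            # A task that cannot be split any further is simply dropped.
            queue = list(decompose(task)) + queue
    return finished


# Stubs standing in for the LLM (all hypothetical).
TREE = {
    "book movie tickets": ["pick a film", "pay"],
    "pay": ["enter card details", "confirm order"],
}

results = run_agent(
    goal="book movie tickets",
    decompose=lambda task: TREE.get(task, []),
    execute=lambda task: f"did: {task}",
    # Assumption for the stub: tasks that still have subtasks are never
    # "adequately done" in a single step.
    is_done=lambda task, result: task not in TREE,
)
```

The `max_steps` budget is my addition. Without it, a loop like this has no natural stopping point if the model keeps inventing new subtasks.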
Now. I recently signed the AI-moratorium-letter, not for the reasons the writers of that letter suggested, but because there is a real AI-risk of a synthetic Theory of Mind: people develop a theory of mind for algorithms, which then become “real” social players in their thinking, and this, in turn, can be hijacked by all sorts of hacks and jailbreaks. Imagine you form a bond with Eliza, and I manipulate your chatbot with indirect prompt injection on my website.
The researchers behind these ChatGPT-agents also address this as a potential psychological threat of generative AI-agents that simulate something like authentic human behaviour: “One risk is people forming parasocial relationships with generative agents even when such relationships may not be appropriate”. And: "Generative agents may be vulnerable to prompt hacking, memory hacking – where a carefully crafted conversation could convince an agent of the existence of a past event that never occurred – and hallucination, among other things."
Imagine trolls from 4chan hacking a popular website and hiding prompt injections in the source code, manipulating the top-level goal of the AI-system to convince vulnerable users to commit suicide. I’m very sure our dear edgelords in the collective basement already thought about this.
The other point, which I mention from time to time, is that I don’t want autonomous AI, period. I get some funny feels thinking about an Open Source ChatGPT-clone being able to order tickets for a movie and doing all the steps necessary on its own. Maybe I’m a bit of a control freak, but in the case of autonomous black boxes, I really want to know and control each step they take in the real world.
There are already tons of open source autonomous agents in the works, and while I’m in awe of the sheer speed of development in the field and am aware that you can’t put that genie back in the bottle, I can’t shake the feeling that autonomy and AI don’t go well together.
Twitter user SullyOmarr has a good thread explaining autonomous AI. It is an “AI system that, given a task, runs in a loop until the task is solved. Basically the AI gets assigned a goal, figures out what it needs to do to accomplish that goal (on its own), and then spawns more AI to do it.” I don’t want to be near an autonomous AI-agent that figures out the steps necessary, stumbles on a website with a hidden prompt injection and goes haywire in the process. A funny credit card invoice may be one of the more lighthearted outcomes of this.
This is not some future development. This stuff is deployed right now. Here’s a good video about autonomous AI-agents, from AutoGPT to ChaosGPT. The latter is an autonomous AI aimed at destroying humanity, which right now is Working on Control Over Humanity Through Manipulation, with limited success.
So, while I think that generative AI-agents playing a sandbox game in a closed virtual environment surely are interesting af, especially for gaming studios, in combination with autonomous AI-systems it gives me a pause: of six months, to think hard about what the hell we are creating here.
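Knowing and controlling each step could look something like the sketch below: every action with real-world side effects has to pass a human approval callback before it runs. Everything here is hypothetical and made up for illustration (the `Action` type, the `SIDE_EFFECT_KINDS` set, the `approve` callback), and it does nothing to stop a prompt injection itself. It only limits what a hijacked agent can do without somebody noticing.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    """A step the agent wants to take (hypothetical type)."""
    kind: str     # e.g. "read_page" or "purchase"
    detail: str


# Assumption: which kinds of action touch the real world.
SIDE_EFFECT_KINDS = {"purchase", "send_email"}


def run_with_approval(
    actions: list[Action],
    approve: Callable[[Action], bool],
) -> list[tuple[str, Action]]:
    """Log every action; run risky ones only if `approve` says yes."""
    log = []
    for action in actions:
        if action.kind in SIDE_EFFECT_KINDS and not approve(action):
            log.append(("blocked", action))
            continue
        # A real agent would perform the action here.
        log.append(("executed", action))
    return log


proposed = [
    Action("read_page", "cinema listings"),
    Action("purchase", "40 tickets, premium seats"),
]

# A human who says no to everything that costs money.
log = run_with_approval(proposed, approve=lambda action: False)
```

With a real person behind `approve`, the 40 tickets would at least show up as a question before they show up on the credit card invoice.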
Join me on Notes
I’m playing around with Substack Notes at the moment. I’m not sure how I’ll use that feature, but I like it overall, although some essentials still seem to be missing: how do I follow people without subscribing to their newsletter? Is the main timeline made up of all Notes-users, and if so, how is it curated?
Besides that, this looks like a nice add-on, it works very well as a discovery engine at this point, and it makes Elon Musk’s overreaction look even more stupid.
Here are some of my recent notes. I guess I’ll land somewhere between “put emphasis on comments I leave on various Substacks” and “fun stuff mirrored from elsewhere”.