I wrote a piece recently on why I don’t believe in a flood of AI-content, not because there would be no mass of synthetic culture produced with generative AI, but because that flood of stuff just has no impact, lacks effort and emotionality to grab your attention, and is, as art and culture, just mediocre and flat.
And while I want to emphasize that I was thinking about human psychology and perception when I wrote it, I was wrong regarding systems and institutions.
I have a piece coming up in a tech magazine about how generative AI might overwhelm the systems of rights management organizations and collecting societies for copyright holders, and about how current copyright law is not up to the task, even when you can’t claim a copyright on synthetic content due to the lack of a natural person as creator.
The piece is in the editorial process right now, but there’s already a first taste of the scenario playing out:
One week ago, Clarkesworld Magazine, a long-running magazine publishing science fiction and fantasy literature, posted a piece about a concerning trend: the number of spam submissions spiked in February due to AI-generated content:
Clarke created a chart that showed the number of submitters that the magazine has had to ban by month, with February 2023 being disproportionately high, at over 500 people. A year ago, there were around 20 bans in February.
"Prior to late 2022, that was mostly plagiarism. Now it's machine-generated submissions," the magazine's Twitter account stated.
Clarke wrote in his blog that he reached out to editors of other magazines who confirmed that this is a pattern across the board, and not just a unique situation to Clarkesworld. Indeed, due to ChatGPT's free and open access, an entire cottage industry has popped up online of people using the chatbot to make money, and instructional videos and blogs giving tips on how to do so. There are hundreds of e-books on Amazon listing ChatGPT as an author or co-author, Reuters reported, including many books about how to use ChatGPT, written by ChatGPT.
Submissions to the magazine are closed for now.
What is happening at Clarkesworld might happen to any and all systems that distribute cultural expression, and while you can ban AI-generated content altogether, all those systems will have to deal with a new form of synthetic cultural spam.
Even once we are able to watermark synthetic content, those watermarks can be easily destroyed by manual editing, and maybe that can be automated too. The open source AI-detector GPTZero recently posted a case study in which they reported a 50% rate of false positives, and while those techniques will improve, I’m not positive that watermarking and AI-detection will ever be effective enough to hold back a rising tide of cultural expression in which individual pieces are impossible to differentiate from stuff created by humans.
For now, I don’t think that this contradicts my stance that an AI flood of content will not really overwhelm a human audience, because most synthetic content is just shallow bullshit, and AI-generated story spam submitted to SciFi mags and Amazon’s Kindle store fits that bill.
There were over 200 e-books in Amazon’s Kindle store as of mid-February listing ChatGPT as an author or co-author, including "How to Write and Create Content Using ChatGPT," "The Power of Homework" and poetry collection "Echoes of the Universe." And the number is rising daily. There is even a new sub-genre on Amazon: Books about using ChatGPT, written entirely by ChatGPT. (…)
One author, who goes by Frank White, showed in a YouTube video how in less than a day he created a 119-page novella called “Galactic Pimp: Vol. 1” about alien factions in a far-off galaxy warring over a human-staffed brothel. The book can be had for just $1 on Amazon’s Kindle e-book store. In the video, White says anyone with the wherewithal and time could create 300 such books a year, all using AI.
But also:
Consumer interest so far has been admittedly sleepy: Banc said sales have totaled about a dozen copies.
I mean look at this:
This is just the SEO-spam-bullshit for a new AI-age, and no real human falls for this (I hope), except other SEO-spam-bullshitters, who never were real people to begin with. It’s the first wave of synthesizing the businessman’s smile, “in which a stochastic parrot positions itself behind the customer to service the account”.
But while I am quite sure that this sort of spam poses no real threat to humans themselves, I am concerned that synthetic stuff can flood existing systems, like the aforementioned SciFi mags or, maybe more crucially, bureaucratic institutions such as the performing rights organizations BMI or GEMA for music, or the German VG Wort, which collects license fees for journalists.
The basic model of institutions like GEMA or BMI is: You create music, you register that music with them, your music gets played by a DJ and you cash in. So what happens if that DJ is replaced by a generative AI generating music in real time in the style of Nirvana? That’s one question, and it’s extremely tricky if not unsolvable: If generative AI provides a computed latent space in which you can interpolate between songs, who gets paid and how much? How is BMI supposed to calculate how much of a Nirvana song is actually in a song generated in the style of "Nirvana, Bongo Boys and MC Hammer"?
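To make that question a bit more concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the 2-D coordinates are made up, the fourth artist "someone_else" is hypothetical, and real generative models do not expose songs as neat points that get averaged. It only shows the shape of the problem: two completely different mixes can land on exactly the same spot, so the output alone doesn't tell you whom to pay.

```python
# Toy illustration only: the coordinates are invented, and real models
# are not simple weighted averages of identifiable songs.
songs = {
    "nirvana": (0.0, 0.0),
    "bongo_boys": (2.0, 0.0),
    "mc_hammer": (0.0, 2.0),
    "someone_else": (2.0, 2.0),  # hypothetical fourth artist
}


def blend(weights):
    """Return the weighted average of the hypothetical song vectors."""
    total = sum(weights.values())
    x = sum(w * songs[name][0] for name, w in weights.items()) / total
    y = sum(w * songs[name][1] for name, w in weights.items()) / total
    return (x, y)


# Two different "recipes" ...
mix_a = blend({"nirvana": 0.5, "someone_else": 0.5})
mix_b = blend({"bongo_boys": 0.5, "mc_hammer": 0.5})

# ... that end up at the same point in the toy space.
print(mix_a, mix_b, mix_a == mix_b)
```

In this toy setup, a collecting society looking only at the result has no way of telling whether Nirvana was involved at all.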
Generative AI means the dissolution of the human archive of cultural expression into explorable latent spaces. I can just train a model on the structure of a thing, say "protein folding", and it computes a multidimensional space with every possible solution to that folding, and I just have to explore it by selecting parameters. The same is true for music, art, visuals, movies, text, song lyrics, business coaching, 101 marketing tricks to boost your productivity, real estate listings, job applications, basically every form of communication that uses language in the form of text, audio, and (moving) images.
How can BMI or GEMA or any other organization of rightsholders distribute income to creatives when there's a stochastic, interpolating library featuring the whole history of human knowledge, ready for prompt-guided interpolative exploration?
The dissolution of the human archive of cultural expression into explorable latent spaces also means the dissolution of the licensing model of these institutions, which relies on identifiable works. They are unprepared for what's coming, because in an interpolatable latent space, the very concept of identification becomes fuzzy.
Clarkesworld closing its submissions is one of the first signs of this.
Links
Microsoft rolled out their unhinged BingAI in India 4 months ago in a public test. Gary Marcus has a piece featuring a tweet by yours truly.
A new paper shows how researchers “could have poisoned 0.01% of the LAION-400M or COYO-700M datasets for just $60 USD”.
The Wikipedia edit wars of the past may become the dataset poisoning wars of the future. Imagine Putin's troll army injecting false information about the war in Ukraine into AI training datasets. The coming propaganda might hide behind false accusations of 'hallucinations'.
Very interesting piece of AI-art by Eryk Salvaggio:
“Sensitive Noise (2023) draws from censored AI-generated images associated with the prompt ‘Gaussian Noise, Human Sensuality’ in Stable Diffusion. Automated content moderation systems blur images deemed inappropriate. While collecting a series of images for the prompt ‘Gaussian Noise’ — a prompt which generates abstract images as a result (or representation) of rendering errors, I noticed that many of these images of pure noise were nonetheless flagged as containing sensitive content. Collecting these images, and further prompting the model with ‘Gaussian Noise, Human Sensuality’, created a dataset of censored (blurred) abstract images. ‘Sensitive Noise’ compiles these images into a silent video, with each frame interpolated into the next. It is clear that these images represent nothing (though some suggest more than others) — nonetheless, they are ‘suggestive’ to the machine. The knowledge that these contain forbidden scenes of human sensuality prompts the viewer into seeking the sensual element of these images: whether because we know it is there, or to see these images through the eyes of the content moderation system.”
Nothing, Forever is set to return to Twitch with new guardrails in place
Infinite Big Bang Theory InfiniTV_AI. We see where this is going. In a few years, all kinds of IPs will have infinite streams you just log on to. I’m very sure that’s not what Michael Ende had in mind when he named his famed novel “The Neverending Story”.
Süddeutsche Zeitung has a long piece on the biggest case of organized cyberbullying and online manhunting: Jagd auf den Drachenlord - der Fall Rainer Winkler. I wrote extensively about this case (in German).
India's heat waves may be coming earlier as February temperatures break records: “Last year, Indian meteorologists sounded the first heat wave alert of the year in March, foreshadowing a summer that arrived unusually early — and brought some of the most extreme temperatures in India’s recorded history. This year, they are sounding the alarm even earlier. The India Meteorological Department issued the first heat wave alert of the year on Sunday, warning that parts of India’s western region will reach 98.6 degrees Fahrenheit (37C). Meanwhile, other parts of India are recording temperatures that are usually seen in mid-March and at least 9 degrees above normal.”
Thanks for the article. "the dissolution of the human archive of cultural expression into a explorable latent spaces" made me think about Luciano Floridi's article: https://link.springer.com/article/10.1007/s13347-018-0325-3.
In the article he describes how new possibilities that exist for the creation and manipulation of content provide nuances between real and fake that we do not yet have good words for. He proposes the Greek word ‘ectypes’, meaning ‘a copy that has a special relationship with its source’. He considers the original source, the archetype, the authenticity of the production process and the end result, the artefact. Floridi’s prediction is that different forms of ectypes will emerge in the near future, which will mean even more nuances in the difference between fake and real.
We discussed this thinking in depth in our book Real Fake: https://www.amazon.com/Real-Fake-Playing-Deepfakes-Metaverse-ebook/dp/B09GV4718D