The New AI-Psychedelica
The whole YT-channel of Hueman Instrumentality is quite an experience. Make sure to watch this stuff in HD. I can only imagine what Stable Diffusion will be able to do for live motion graphics at music venues and concerts in a year or two.
AI-anachronistic Faces
In Faces Through Time, researchers trained StyleGANs on faces from specific decades and can now send a face to the past, or imagine a person from the past in the present day. These AI-anachronisms are one of the most fascinating things coming out of this technology, like imaginary reincarnation with a time machine. Results are not quite there yet, but I expect a fully working generational Face-DeLorean within a year or so. More examples here.
“How can one visually characterize people in a decade? In this work, we assemble the Faces Through Time dataset, which contains over a thousand portrait images from each decade, spanning the 1880s to the present day. Using our new dataset, we present a framework for resynthesizing portrait images across time, imagining how a portrait taken during a particular decade might have looked like, had it been taken in other decades. Our framework optimizes a family of per-decade generators that reveal subtle changes that differentiate decades—such as different hairstyles or makeup—while maintaining the identity of the input portrait. Experiments show that our method is more effective in resynthesizing portraits across time compared to state-of-the-art image-to-image translation methods, as well as attribute-based and language-guided portrait editing models.”
I was betting on realtime BCI-2-Image tech arriving within five years, something that would basically let you have lucid daydreams. I think this will be here way faster: we’re already nearing low-framerate realtime image generation, and BCI language decoding is already here. See Poisson Flow Generative Models offering 10x to 20x acceleration on image generation, here’s “8 Stable Diffusion Images in less than 6 Seconds”, and Yilun Xu on Twitter: “we can further achieve 100x - 200x speedup with no loss on image quality”, which is insane. One more time: this tech is merely two months old. Next year we will have generative video AI that morphs imagery in realtime. Hook this up to speech recognition and you can command the images with your voice (a rough sketch of that loop below). Then add a Brain-Computer-Interface to that setup and output to VR. Once again: get ready for digital lucid dreaming. It’s already happening, check out this “Stable Diffusion VR - Real-time immersive latent space using depth maps”.
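For the tinkerers: a minimal sketch of the voice-to-image half of that pipeline, assuming the open-source whisper and diffusers libraries; the model names, step count and file names here are my placeholders, not anyone’s actual setup.

```python
# Hypothetical voice-to-image loop: transcribe a spoken command with
# Whisper, feed the transcript to Stable Diffusion as a prompt.
# Assumes `pip install openai-whisper diffusers transformers torch`.
import torch
import whisper
from diffusers import StableDiffusionPipeline

# Small Whisper model for fast transcription (speed over quality).
stt = whisper.load_model("base")

# Stable Diffusion v1.5 in half precision on the GPU.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def speak_to_image(audio_path: str):
    """Turn a recorded voice command into an image."""
    prompt = stt.transcribe(audio_path)["text"]
    # Fewer inference steps = closer to realtime, at some quality cost.
    image = pipe(prompt, num_inference_steps=20).images[0]
    return prompt, image

prompt, image = speak_to_image("command.wav")
image.save("daydream.png")
print(f"Rendered: {prompt}")
```

Loop that over a microphone stream and you have the crude ancestor of voice-commanded daydreaming.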
“A Proof-of-Concept of an AI Assistant Designer using Unreal Engine’s MetaHuman, Stable Diffusion, OpenAI’s Whisper and GPT3.”
Google’s Imagen video model. I expected good text2video models like this next year, and given the rapidly developing space around generative audio and 3D (see below), I guess I have to adjust my estimate of a generative movie application where you can just prompt Siri with “The Matrix but in South Carolina featuring a black Neo and a Soundtrack by Creedence” and it just does it.
Style Transfer for singing voices and AudioGen: Textually Guided Audio Generation. Siri, play Creep by Radiohead, but in the voice of Darth Vader.
Text2Music-model by MubertAI generates soundtracks for Stable Diffusion prompts. “Star Wars but underwater featuring Leia as Luke and Luke as Leia and a Soundtrack by Tycho” etc.
Human Motion Diffusion Model. You’re gonna have realistic singers over realistic audio backgrounds with realistic movements by realistic CGI models in long generative video in one year, two if things slow down. I expected an AI Matrix with an img-2-neo pipeline on prompt in 20 years. I think I have to re-evaluate.
The Laion-COCO dataset consists of 600 million synthetic, human-like image descriptions, generated for web-scraped images whose original captions were full of SEO-optimized gibberish. It’s a nice paradox that prompt engineering will have to use much more natural language thanks to synthetic data, and that synthetic machine image descriptions carry much more literary aesthetics than human descriptions optimized for the web.
“CarperAI, a new research lab within the EleutherAI research collective, aims to democratize the instruction-tuning of large language models, the same way Stable Diffusion democratized image generation.” — This will do for LLM development what Stable Diffusion did for image synthesis and speed up AI development even further. Next year will be very, very interesting. We ain’t seen nothing yet.
You know you’re old when your manual Photoshop editing skills get eaten for breakfast by natural language processing: “Imagic is a method for Text-Based Real Image editing with models like stable diffusion with just one image of a subject.”
You know you’re getting old when the “enhance” jokes just aren’t funny anymore because they came true: “the AI madness continues with the Google Pixel 7 Pro”.
You know you’re old when the blonde doesn’t marry you but the other guy: Paris Hilton and Albert Einstein wedding pictures.
Roope Rainisto “finetuned the new SD1.5 model with some realistic portrait photos, ran a quick test of quality. (…) ‘model eating ice cream’ still needs a tad more work.” No, these are perfect.
The iPhone 14 keeps calling 911 on rollercoasters: “The iPhone 14’s new Crash Detection feature, which is supposed to alert authorities when it detects you’ve been in a car accident, has an unexpected side effect: it dials 911 on rollercoasters.”
Pretty unrealistic-sounding interview of AI-Steve Jobs by AI-Joe Rogan at podcast.ai, but given the speed of innovation I expect many such conversational simulations coming up, if only to listen to X interview [legendary celebrity from the past]. GPT3 should already be able to simulate interview questions “in the style of”, and the rest is just a matter of time. Albert Einstein will interview Bruce Lee and it will be believable, that’s what I’m saying.
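To illustrate how little is needed, here’s a toy sketch of that “in the style of” question generation, assuming the openai Python client and an era-appropriate GPT-3 completion model; the prompt wording and function are mine, nothing to do with podcast.ai’s actual pipeline.

```python
# Hypothetical sketch: asking GPT-3 for interview questions
# "in the style of" a given host, via the completions API.
# Assumes `pip install openai` and an OPENAI_API_KEY env variable.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def interview_questions(host: str, guest: str, n: int = 5) -> str:
    prompt = (
        f"Write {n} interview questions in the style of {host}, "
        f"interviewing {guest}. Number them."
    )
    resp = openai.Completion.create(
        model="text-davinci-002",  # GPT-3 model of that era
        prompt=prompt,
        max_tokens=300,
        temperature=0.8,  # some variety, it's an interview
    )
    return resp.choices[0].text.strip()

print(interview_questions("Albert Einstein", "Bruce Lee"))
```

Pipe the answers through a voice clone of the guest and you’re most of the way to the Einstein-interviews-Bruce-Lee podcast.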
David Chalmers’ talk at NYU, "Are Large Language Models Sentient?"
Feels Matrix Surf 😡😊😂😛😔😢😠: a lot of generative wojaks feeling the feels.
“Individuals with nice personality traits tend to earn a lower income.” Tell me about it.
The lowest-quality photo-sharing site on the internet: every image is compressed to 1KB (1,024 bytes) or less.
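For the curious, here’s a minimal sketch of how you could squeeze an image under that budget with Pillow, by shrinking it and dropping JPEG quality until the file fits; the 1,024-byte target is from the site, the strategy is my guess, not their actual pipeline.

```python
# Hypothetical 1KB compressor: downscale and lower JPEG quality
# until the encoded image fits in 1,024 bytes.
# Assumes `pip install Pillow`.
from io import BytesIO
from PIL import Image

BUDGET = 1024  # bytes

def compress_to_budget(path: str) -> bytes:
    img = Image.open(path).convert("RGB")
    size = 128  # starting edge length; halve it if quality alone won't do
    while size >= 8:
        thumb = img.copy()
        thumb.thumbnail((size, size))
        # Try decreasing JPEG quality at this resolution first.
        for quality in range(50, 4, -5):
            buf = BytesIO()
            thumb.save(buf, format="JPEG", quality=quality)
            if buf.tell() <= BUDGET:
                return buf.getvalue()
        size //= 2  # still too big, halve the resolution
    raise ValueError("couldn't fit image into 1KB")

data = compress_to_budget("photo.png")
print(f"{len(data)} bytes")
```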
Woman accused of using bees to attack sheriff's deputies during Longmeadow eviction. Yes, a woman attacked a bunch of cops with a swarm of bees. The obligatory Izzard: I’M COVERED IN BEES!
Also: Do bumble bees play?
And: The Wildlife Photography Winner 2022 is a BALL OF BEES. All caps, yes.
New JWST photo reveals the Pillars of Creation as never before
Linus Åkesson made an accordion from two C64s, playing two SIDs as melody and rhythm section. Unreal.