Generative AI tools are cool, but they will not replace filmmakers yet.
This post is written by Daniela Nedovescu.
The Internet is a fun and overwhelming place to be. Every day there’s a new update about more AI tools coming out and it’s getting hard to keep up. Eight months ago, it seemed like only a handful of people had heard about GPT-3 and Stable Diffusion. Now, we’re already talking about school kids writing essays with ChatGPT and about its upcoming upgrades and rivals. My only wish in middle school was to have a Nokia 3110, so I could beep (couldn’t afford to call) my friends when our favorite song was playing on MTV.
Also, about eight months ago, we were invited by neuroscientist Zach Mainen to take part in an art-science-technology exhibit in Portugal at the Champalimaud Center for the Unknown. The exhibit was highly focused on the potential of using AI in digital therapeutics, by way of neuroscience. Here’s where I’ll stop with the neuro-talk because it can get overwhelming quite fast (I’m talking about myself here).
Given that this was commissioned by scientists and that we were in unexplored territory, we decided to label it an experimental project: we'd use the machine to tell us what it thinks and write a short film concept that we could use.
By the way, using terms like “it thinks,” “it creates,” or “AI is the artist” isn’t really accurate, but we’ll take a peek into that rabbit hole a bit later.
Latent Space – a short film made with AI from Mots on Vimeo.
Writing with GPT-3
We took the main topics (AI, digital therapeutics, neuroscience, psychedelics, the future) and convinced GPT-3, with a few prompts, to come up with scene outlines for a short film. We knew that we wanted to make something that is neither pretentious nor highbrow, so we may have also asked it to take into consideration that Andrei Tarkovsky is making a movie with Ingmar Bergman, because neither of them is, well, pretentious or highbrow.
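For anyone curious what that step looks like in practice, here is a minimal sketch using the OpenAI Completions API of that era. The model name, the prompt wording, and the helper functions are illustrative assumptions, not our exact prompts:

```python
# Hypothetical sketch of prompting GPT-3 for short-film scene outlines.
# Requires: pip install openai (the 0.x-era client) and an OPENAI_API_KEY.
import os

TOPICS = ["AI", "digital therapeutics", "neuroscience", "psychedelics", "the future"]

def build_outline_prompt(topics, n_scenes=5):
    """Assemble a single completion prompt asking for scene outlines."""
    return (
        f"Write {n_scenes} scene outlines for a short film about "
        + ", ".join(topics)
        + ". Assume Andrei Tarkovsky is making the movie with Ingmar Bergman. "
        + "Number each scene and keep each outline to 2-3 sentences."
    )

def request_outlines(prompt):
    """Send the prompt to a GPT-3 completion model and return the raw text."""
    import openai
    openai.api_key = os.environ["OPENAI_API_KEY"]
    response = openai.Completion.create(
        model="text-davinci-003",  # GPT-3-era completion model
        prompt=prompt,
        max_tokens=600,
        temperature=0.9,  # high temperature encourages more unexpected ideas
    )
    return response["choices"][0]["text"]
```

In practice you would call `request_outlines(build_outline_prompt(TOPICS))` a few times, keep the snippets you like, and discard the rest.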
We ended up with snippets of scene outlines combining sci-fi, meta-humor, and absurdity which wasn’t very far from the stuff that we were making. For instance, the machine created a sort of pattern of similar encounters where the main character meets different characters in the same manner, but in different locations.
These scene outlines were then combined into a plot, which was fed back to the AI to format as a screenplay. The screenplay was not bad. It had dialogue and a clear structure, but it felt mediocre. It also kept making weird spiritual references, which we initially did not see but managed to decode with the help of philosopher Razvan Sandru, who, together with Zach Mainen, found connections to threads that are common in philosophy, psychology, and neuroscience.
In prep, we changed it even more based on the locations and resources we could get our hands on, given the limited budget and time. For example, the idea that the protagonist sees herself (but thinks she is seeing God) made sense both from a production standpoint and story-wise.
Using AI for Post-Production
When we started production, we had no idea whether the technologies we were planning to use would work in a post-production pipeline. Instead, we focused on production and getting footage in the can, knowing that we could rely on our friends at Zauberberg (who co-produced the short with us) if things spun out of control. We also focused quite a lot on not getting killed by nature, both in Germany and in Portugal, where we shot on location, but that’s a story for another time.
I don’t have enough paper for this article to list how many times the phrase “AI will fix it in post” was mumbled on set…
But here are some of the many things we did to speed up post-production and the creative process:
Aging the Protagonist with an AI-Trained Model
To dive a bit into one of the fears of the protagonist, we wanted to show an aged version of her as she was looking in the mirror. Instead of waiting for Siori, our actress, to grow old or scout for an identical yet way older twin, we learned how to train an AI model to generate an aged version of the main character.
After feeding the AI only 20 pictures of the actress (on a local machine, using stills from the footage only) and spending some hours engineering the prompts, we got a few good keyframes of aged-Siori that we could use. We selected one and used EbSynth to replace the face in the mirror in the original footage.
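A rough sketch of this workflow with Hugging Face’s diffusers library might look like the following. The checkpoint directory, the “sks woman” token, and the prompt are hypothetical placeholders for a DreamBooth-style fine-tune, not our actual training setup:

```python
# Hypothetical sketch: pick training stills from the footage, then render
# "aged" keyframe candidates from a locally fine-tuned Stable Diffusion model.

def select_stills(total_frames, n=20):
    """Pick n evenly spaced frame indices from the footage to use as training stills."""
    step = total_frames / n
    return [int(i * step) for i in range(n)]

def generate_aged_keyframes(checkpoint_dir, n_frames=4, seed=42):
    """Render a few candidate keyframes; one gets picked by hand for EbSynth."""
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(checkpoint_dir)
    pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")
    generator = torch.Generator().manual_seed(seed)  # reproducible candidates
    prompt = "portrait of sks woman as an 80-year-old, wrinkled skin, cinematic lighting"
    return pipe(prompt, num_images_per_prompt=n_frames, generator=generator).images
```

EbSynth itself is a separate, GUI-driven tool: it takes the chosen keyframe plus the original clip and propagates the aged face across the shot.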
Once you have the AI model trained, it’s hard not to lose yourself in procrastination and generate an infinite number of imaginary Sioris. We even made a bunch of movie posters with it, but that’s also a story for another time.
The Floating Whale
“For this 2-second shot, we will need a floating whale” is something that a VFX producer does not want to hear on a low-budget short film.
For anyone who has dealt with CG, it’s obvious that this would not only take hours of 3D modeling and character animation, but also a lot of work to match the light, camera, and so on.
Unfortunately, it wasn’t super easy for AI to do this either, since it had to generate a weird, unrealistic concept while being constrained by the perspective of the shot. We decided to break the work down into human-made elements (an initial 3D whale mockup added into the moving shot), AI-made elements (re-painting the whale mockup on a single frame, then replicating that across the entire shot), and then some more human-made stuff (additional compositing, effects, color, etc.).
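As a sketch of the AI half of that breakdown: the single-frame repaint can be done with an img2img pass over the frame containing the mockup, and the result then gets propagated across the shot in EbSynth (a separate tool, not scriptable here). The model ID, paths, and prompt below are illustrative assumptions, not our production setup:

```python
# Hypothetical sketch: repaint one frame of the whale-mockup shot with img2img.

def keyframe_index(n_frames):
    """Pick the middle frame to repaint, so EbSynth can propagate both ways."""
    return n_frames // 2

def repaint_keyframe(frame_path, out_path, strength=0.55):
    """Re-paint the rough 3D mockup frame into a photoreal floating whale."""
    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5"
    ).to("cuda" if torch.cuda.is_available() else "cpu")

    frame = Image.open(frame_path).convert("RGB")
    result = pipe(
        prompt="a whale floating in the sky, overcast light, photorealistic",
        image=frame,
        strength=strength,  # low enough to keep the mockup's position and perspective
    ).images[0]
    result.save(out_path)
```

The `strength` parameter is the key trade-off here: too low and the frame stays a mockup, too high and the AI ignores the perspective constraints the mockup was there to enforce.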
Embracing the Weak Points and Strong Points of Stable Diffusion
Since the short film was produced with the intention to screen it at an exhibition, we decided to create multiple variants of the short film that we played in parallel on several screens, allowing the audience to explore the character’s backstory by way of images.
For example, we generated 20 different variations of the weird hands that the protagonist sees in the film. We chose to play with the hands because everyone working with Stable Diffusion knows that it still has a problem generating good-looking, realistic hands. So we wanted to embrace this flaw and see where it would take us.
On the flip side, Stable Diffusion is pretty good at inpainting – a technique where you mask a portion of your image so that the AI generates new content only for that area. If you can master the right prompts and you have a lot of patience to try and fail, you can get some really interesting results. In one shot, we replaced a cliff by the ocean with 20 different ridiculous objects.
Almost no additional human work was needed, apart from some minor color and mask fixes. The most impressive thing is how accurately the AI reproduces the light and the style of the scene, and that it only takes around 5-10 seconds to generate an image on a good graphics card. Again, for anyone working in VFX, the amount of work something like this would normally involve is insane compared to the few minutes it takes to get pretty decent results with AI.
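Batch-generating those variants can be sketched with diffusers’ inpainting pipeline: a mask marks the cliff, and one image is generated per replacement object. The object list, paths, and prompt template are illustrative assumptions, not the 20 objects we actually used:

```python
# Hypothetical sketch: inpaint one variant per replacement object over a masked cliff.

OBJECTS = ["a giant rubber duck", "a grand piano", "a telephone booth"]

def variant_prompts(objects, template="{} on the shore, overcast ocean light"):
    """One inpainting prompt per replacement object."""
    return [template.format(obj) for obj in objects]

def inpaint_variants(image_path, mask_path, objects):
    """Generate one inpainted frame per object; white mask pixels get repainted."""
    import torch
    from PIL import Image
    from diffusers import StableDiffusionInpaintPipeline

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting"
    ).to("cuda" if torch.cuda.is_available() else "cpu")

    image = Image.open(image_path).convert("RGB")
    mask = Image.open(mask_path).convert("RGB")  # white = area to repaint
    results = []
    for prompt in variant_prompts(objects):
        results.append(pipe(prompt=prompt, image=image, mask_image=mask).images[0])
    return results
```

Because the unmasked pixels pass through untouched, the light and grade of the plate stay consistent across all variants, which is exactly why the results composite so cleanly.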
Here’s another example: a moving shot we generated by letting the AI do its thing on a clip of tree bark we shot during a location scout. Abstractions like these are probably where AI really shines, because it can create these weird worlds in only a few minutes, something that would otherwise take a lot of creative time to design and execute.
Probably one of the biggest challenges of the project was to figure out how to make use of the AI tools in post-production, in the tight turnaround that we had to work with. At times it felt that we spent way too much time researching, trying stuff out, and rewiring our brains to use the tech at its full potential. But, in the end, that time investment saved us at least half of the time we would’ve otherwise needed in post, and the VFX crew could focus better on tasks that AI is still not able to do.
Have we made the best short film ever? Not even close, and it’s not just perfectionism talking.
One can still notice a lot of flaws and possible improvements. As I mentioned earlier, the idea of an experimental piece that would simply allow us to use state-of-the-art AI to see where it leads us was one of the goals of the project (or maybe we use this as an excuse for just being bad filmmakers?).
Is AI something that anyone can use for high-level stuff? Of course not. Not yet at least.
For many things you need coding skills, or at least to be comfortable using a terminal (it’s that app that expects you to type text commands when you open it, as I recently found out). The good news is that you can use AI to learn that too – or even code without being very good at it: have you heard of GitHub Copilot? But, again, I digress…
Preparing for the Future
AI has been built into a lot of editing software for a while now, helping us automate certain repetitive tasks. But the generative tools that are becoming increasingly popular and available to filmmakers are pretty amazing. They allow for a level of creativity, experimentation, and optimization that was previously impossible, and they can speed up progress.
We’ve come to a point where it’s possible to make movies with AI that are actually not bad and can serve as more than proofs of concept. We, as filmmakers, are going to make as much use of it as possible because we feel that it helps us, something that might not be true for others.
Is AI a great tool for making films? Just like any other tool, it’s as good as the painter who holds the brush or the guitar player who uses a flute to strum the guitar… But maybe what is scary for a lot of filmmakers, and artists in general, is the fact that AI can produce such a huge diversity of options at a very decent level of quality. And this almost feels like it’s taking away the few things that artists are appreciated for: creativity, uniqueness, technique, and so on…
Is AI going to take over arts and creativity because it’s just more efficient than humans? Eight months ago, I would’ve said that that’s a nice movie plot, but now it’s hard to tell what’s coming. One thing is certain: as long as AI cannot take responsibility for what it “creates”, we’re still going to have humans direct, control, and curate, and therefore use AI just like any other tool in one’s creative arsenal.
Now excuse me, but I have to go back to