Film Noir AI Video Generation
I’m a sucker for film noir… The hard shadows, the moral ambiguity, the way everything feels like it’s one cigarette away from collapsing. It just resonates with me on a level that no other genre of movie ever has. Naturally, when I started experimenting with Grok Imagine for AI video generation, noir was the first aesthetic I tried to push it into.
It turns out: film noir is a surprisingly tough nut for AI to crack.
Even though the training corpus for classic cinema isn’t small, it’s clearly overshadowed by the tidal wave of modern high‑definition footage. Models seem biased toward clean, crisp, hyper‑detailed imagery, the exact opposite of what noir demands. And if older material is in the dataset, it may be down‑weighted or filtered because of its lower resolution and analog imperfections.
That becomes obvious the moment you try to force an AI model into the 1940s…
Grok Imagine struggles with:
Artificial downsampling: it tries to mimic low‑res footage but often overshoots into mushiness.
Synthetic grain and noise: instead of organic film texture, you get digital artifacts pretending to be analog.
Low‑fidelity mono audio: noir’s audio palette becomes a kind of “AI AM radio” effect.
Lighting logic: noir’s signature hard‑key lighting (can someone in Hollywood please bring that back?) and deep shadows require intentionality that models don’t naturally reproduce.
And then there’s the human element…
Classic noir relied on actors who had to sell everything with posture, micro‑expressions, and voice, because they didn’t have modern editing tricks or VFX to lean on. AI models don’t inherently understand that kind of physical storytelling. They approximate it, sometimes well, sometimes... not.
Still, after some lengthy tweaking, the results I got from Grok Imagine were impressive, especially considering the constraints. It took effort, iteration, and a bit of wrestling with prompts, but the model can be coaxed into something that feels authentically noir‑adjacent.
Below are three short 6‑second tests that show where it shines, where it stumbles, and where the uncanny creeps in.
All video generations are based on AI-generated still images I prompted via Grok Ask.
I was focusing on compressing a stark emotional contrast into these clips. That’s why the dialogue might come across as borderline disordered, which is not exactly a far cry from many film-noir characters anyway…
Screen Test #1 (Female Subject)
I just love the raspiness of the voice, the little stutter after the subject’s sighing “oh, well”, and the trembling of the head when screaming. I was astounded by how explicit I could be in my prompts, down to the pauses of the speech pattern. However, modulating the voice was the hardest part: getting it not to sound like Sora2-esque AI-babble.
The skin texture has some very slight issues with continuity, but apart from that it’s fine. Retaining the slight yellowing of the teeth required explicit prompting, however.
Screen Test #2 (Male Subject)
The voice with this subject required very explicit prompting to not sound too much like a tin-can man.
Is the subject a realistic film-noir character? No. The subject is a person of color, and film noir existed primarily throughout the 1940s-60s. It was a tough time for people of color to thrive in Hollywood (or anywhere else), and an actor similar to the subject would never have gotten a leading role, or even a supporting one, with a portrayal this bold. But cinematic accuracy wasn’t the point here; I was really interested in how the typical chiaroscuro lighting would translate, and with this subject, it’s remarkable.
Screen Test #3 (Male Subject)
This was the toughest one and it took me 31 iterations to get to a result that still has some weaknesses that are hard for me to overlook, mainly the overpronunciation of certain phrases. Also, authentic male facial expressions are a lot harder to model… I would describe it as requiring a stoic warrior mask, but still letting true emotions shimmer through.
And I just have to share this specific outtake of an alteration I did prior to the final screen test, because it’s just really funny… Somehow the subject turned into an Austrian. But it highlights the main problem: when the prompts are loaded with explicit directives on vocal tone, timbre, etc., the sentences to be spoken need to follow a very specific rhythm so they don’t end in gibberish. The ending of the sentence is supposed to be “then I will go. Good bye.”