Film Noir AI Video Generation#

I love film noir… The hard shadows, the moral ambiguity, the way everything feels like it’s one cigarette away from collapsing. No grand resolution, everybody loses. It just resonates with me on a level that no other genre of movie ever has.

Naturally, when I started experimenting with Grok Imagine for AI video generation, noir was the first aesthetic I tried to push it into.

It turns out: film noir is a surprisingly tough nut for AI to crack.

Even though the training corpus for classic cinema isn’t small, it’s clearly overshadowed by the tidal wave of modern high‑definition footage. Models seem biased toward clean, crisp, hyper‑detailed imagery, the exact opposite of what noir demands. And if older material is in the dataset, it may be down‑weighted or filtered because of its lower resolution and analog imperfections.

That becomes obvious the moment you try to force an AI model into the 1940s…

Grok Imagine struggles with:

  • Artificial downsampling: it tries to mimic low‑res footage but often overshoots into mushiness.

  • Synthetic grain and noise: instead of organic film texture, you get digital artifacts pretending to be analog.

  • Low‑fidelity mono audio: noir’s audio palette becomes a kind of “AI AM radio” effect.

  • Lighting logic: noir’s signature hard‑key lighting (can someone in Hollywood please bring that back?) and deep shadows require intentionality that models don’t naturally reproduce.

And then there’s the human element…

Classic noir relied on actors who had to sell everything with posture, micro‑expressions, and voice, because they didn’t have modern editing tricks or VFX to lean on. AI models don’t inherently understand that kind of physical storytelling. They approximate it, sometimes well, sometimes… not.

Still, after some lengthy tweaking, the results I got from Grok Imagine were impressive, especially considering the constraints. It took effort, iteration, and a bit of wrestling with prompts, but the model can be coaxed into something that feels authentically noir‑adjacent.

Setup#

Below are three short 10‑second tests that demonstrate its capabilities.

I’m deliberately not sharing any prompts, because I don’t want them to be misused. I figured that the prompts required for these results are a good enough gate.

All video generations are based on AI-generated still images I prompted via Grok Ask.

Additionally, I’ve created post-processed screen tests to further evaluate the quality.

I was focusing on compressing a stark emotional contrast into these clips. That’s why the dialogue might come across as borderline-disordered, which is not exactly a far cry from many film-noir characters anyway…

Post-Processing#

I did some visual post-processing, emulating analog degradation as best as possible. I converted the footage to greyscale, made slight contrast and white balance adjustments, recolorized it for a slight tint, and added grain.
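To make those visual steps concrete, here is a minimal per-frame sketch in plain Python. It is not the tooling used for the clips on this page. The function name `degrade_frame`, the parameter values, and the in-memory frame layout are all assumptions for illustration, and white balance is left out to keep it short.

```python
import random


def degrade_frame(frame, contrast=1.2, tint=(1.0, 0.97, 0.90), grain=12.0, seed=0):
    """Apply greyscale, contrast, grain, and a slight tint to one frame.

    frame: list of rows, each row a list of (r, g, b) ints in 0..255.
    All default values are illustrative guesses, not measured settings.
    """
    rng = random.Random(seed)  # seeded so the grain is reproducible
    out = []
    for row in frame:
        new_row = []
        for r, g, b in row:
            # Greyscale using the common Rec. 601 luma weights.
            y = 0.299 * r + 0.587 * g + 0.114 * b
            # Contrast: stretch values away from mid-grey.
            y = (y - 128.0) * contrast + 128.0
            # Grain: Gaussian noise, identical on all channels (monochrome).
            y += rng.gauss(0.0, grain)
            # Tint: scale each channel slightly, then clamp to 0..255.
            new_row.append(tuple(max(0, min(255, round(y * t))) for t in tint))
        out.append(new_row)
    return out
```

With `tint=(1.0, 1.0, 1.0)` and `grain=0.0` the output is a plain greyscale frame, which makes it easy to judge each step in isolation before stacking them.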

For audio post-processing, I applied high- and low-pass filters to mimic the frequency range of 1940s recording equipment, plus some pitch flutter and generated white noise.
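An audio chain like this can be sketched in plain Python as well. Again, this is only an illustration under stated assumptions, not the actual processing used here: the cutoff frequencies (200 Hz and 4000 Hz), the flutter rate and depth, the noise level, and every function name are hypothetical, and the filters are simple one-pole designs rather than anything period-accurate.

```python
import math
import random


def low_pass(samples, cutoff_hz, rate):
    """One-pole low-pass filter (smooths away high frequencies)."""
    dt = 1.0 / rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)
    out, prev = [], 0.0
    for x in samples:
        prev += alpha * (x - prev)
        out.append(prev)
    return out


def high_pass(samples, cutoff_hz, rate):
    """One-pole high-pass filter (removes low rumble)."""
    dt = 1.0 / rate
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    out, prev_y, prev_x = [], 0.0, 0.0
    for x in samples:
        prev_y = alpha * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out


def flutter(samples, rate, flutter_hz=6.0, depth=4.0):
    """Pitch flutter: read the signal at a slowly wobbling position."""
    n = len(samples)
    out = []
    for i in range(n):
        pos = i + depth * math.sin(2.0 * math.pi * flutter_hz * i / rate)
        pos = max(0.0, min(n - 1.0, pos))
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        # Linear interpolation between the two neighbouring samples.
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def degrade_audio(samples, rate, low_cut=200.0, high_cut=4000.0,
                  flutter_hz=6.0, flutter_depth=4.0, noise=0.01, seed=0):
    """Band-limit, add flutter and white noise. samples: floats in -1..1."""
    rng = random.Random(seed)
    band = high_pass(low_pass(samples, high_cut, rate), low_cut, rate)
    wobbly = flutter(band, rate, flutter_hz, flutter_depth)
    # Uniform white noise, then clamp back into the valid range.
    return [max(-1.0, min(1.0, s + rng.uniform(-noise, noise))) for s in wobbly]
```

Setting `flutter_depth=0.0` and `noise=0.0` leaves only the band-pass, which is a handy way to check by ear how much of the "old recording" feel comes from the narrowed frequency range alone.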

Both audio and video have been downsampled as well.
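Downsampling can be illustrated in the same spirit. The sketch below averages blocks of pixels or samples before discarding detail, which acts as a crude low-pass and avoids the harshest aliasing. The function names and the integer-factor approach are assumptions for illustration, and a real resampler would use a proper filter.

```python
def downsample_frame(frame, factor):
    """Box-average a greyscale frame (list of rows of numbers) by an integer factor."""
    height = len(frame) // factor
    width = len(frame[0]) // factor if frame else 0
    out = []
    for by in range(height):
        row = []
        for bx in range(width):
            # Collect the factor x factor block and replace it with its mean.
            block = [frame[by * factor + dy][bx * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def downsample_audio(samples, factor):
    """Average each run of `factor` samples into one; leftover samples are dropped."""
    count = len(samples) // factor
    return [sum(samples[i * factor:(i + 1) * factor]) / factor for i in range(count)]
```

For example, `downsample_audio(samples, 4)` on 44.1 kHz audio yields roughly 11 kHz audio, and `downsample_frame(frame, 2)` halves both dimensions of a frame.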

The point is more to illustrate that only a reasonable amount of post-processing is required to get a result that’s adjacent to authentic film noir material. And honestly, it makes a lot more sense to think this way anyway: generate the scenes as they would have happened, regardless of the recording equipment, then just degrade and downsample.

Screen Tests#

Screen Test #1 (Female Subject)#

I just love the raspiness of the voice, the little stutter after the subject’s sighing “oh, well”, and the trembling of the head when screaming. I was astounded by how explicit I could be in my prompts, down to the pauses in the speech pattern. However, modulating the voice was the hardest part: keeping it from sounding like Sora2-esque AI-babble.

The skin texture has some very slight issues with continuity, but apart from that it’s fine. Retaining the slight yellowing of the teeth required explicit prompting, however.

Screen Test #2 (Male Subject)#

The voice with this subject required very explicit prompting to not sound too much like a tin-can man.

Is the subject a realistic film-noir character? No, the subject is a person of color and film-noir existed primarily throughout the 1940s-60s. It was a tough time for people of color to thrive in Hollywood (or anywhere else), and an actor similar to the subject would never have gotten a main role, or even a supporting one, with a portrayal this bold. But cinematic accuracy wasn’t the point here. I was really interested in how the typical chiaroscuro lighting would translate, and with this subject it’s remarkable.

Screen Test #3 (Male Subject)#

This was the toughest one and it took me 31 iterations to get to a result that still has some weaknesses that are hard for me to overlook, mainly the overpronunciation of certain phrases. Also, authentic male facial expressions are a lot harder to model… I would describe it as requiring a stoic warrior mask, but still letting true emotions shimmer through.

Screen Test #3 (Male Subject - Outtake)#

And I just have to share this specific outtake of an alteration I did prior to the final screen test, because it’s just really funny… Somehow the subject turned into an Austrian. But it highlights the main problem: when the prompts are loaded with explicit directives on vocal tone, timbre, etc., the sentences instructed to be spoken need to follow a very specific rhythm so they don’t end in gibberish. The ending of the sentence is supposed to be “then I will go. Good bye.”

