We set out with a simple goal: backport something from AV1 to H.264 that would provide compression improvements. Even minor gains would be worth documenting. What we got instead was a masterclass in why understanding where filters work matters more than how they work.

Spoiler: One experiment gave us a 366% increase in file size. Another gave us 82% reduction. Here's the story.

The CDEF Disaster: 366% File Size Increase

CDEF (Constrained Directional Enhancement Filter) is AV1's secret sauce for cleaning up compression artifacts. It's an in-loop filter that runs inside the encoder after quantization, smoothing out block boundaries and directional artifacts before those frames are used for motion prediction.

We thought: "What if we run CDEF as a preprocessor before encoding with H.264?"

❌ Catastrophic Failure

Test: Apply AV1 CDEF filter to The Matrix before H.264 encoding

Expected: Cleaner input = better compression

Result:

CDEF Preprocessing Results (Bigger = Worse)

Baseline (Original H.264): 100%
CDEF + H.264 (Strength 1): 309% INCREASE
CDEF + H.264 (Strength 3): 366% INCREASE

Why CDEF Failed as a Preprocessor

CDEF is designed to work post-quantization, not pre-encoding. Here's what went wrong:

The In-Loop Filter Problem

AV1's filters work inside the encoding loop:

AV1 Encoding Flow:
┌──────────┐    ┌─────────────┐    ┌──────────┐    ┌──────────┐
│  Source  │───>│ Quantization│───>│  CDEF    │───>│ Wiener   │
│  Frame   │    │  (lossy)    │    │ Filter   │    │ Filter   │
└──────────┘    └─────────────┘    └──────────┘    └──────────┘
                                           │
                                           ↓
                                    ┌──────────────┐
                                    │ Motion Pred  │
                                    │ (Next Frame) │
                                    └──────────────┘

CDEF cleans up artifacts AFTER quantization damage has occurred.
Then cleaned frames are used for motion prediction.

Our Failed Approach:
┌──────────┐    ┌──────────┐    ┌──────────────┐
│  Source  │───>│  CDEF    │───>│   H.264      │
│  Frame   │    │ Filter   │    │  Encoder     │
└──────────┘    └──────────┘    └──────────────┘
     ↑                                  │
     └──────────── WRONG! ──────────────┘

CDEF was designed for post-quantization cleanup,
not pre-encoding spatial filtering.

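Running CDEF before encoding gives it no quantization artifacts to clean up, so it acts as a plain spatial filter on the source, which is not the job it was designed for.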

"Trying to use AV1's in-loop filters as preprocessors is like using a bandage before getting injured. The timing is everything." — WINK Engineering Team

The Pivot: Film Grain Removal

After CDEF's spectacular failure, we shifted focus. AV1's Film Grain Synthesis is different — it's about removing grain before encoding and storing parameters to re-add it during playback.

But we didn't want to port 3,500 lines of AV1 grain synthesis code. Instead, we tested a simpler question:

"Can we just denoise the video using existing FFmpeg filters and see what happens?"

Enter hqdn3d: The 20-Year-Old Champion

FFmpeg's hqdn3d (High-Quality Denoise 3D) has been around since 2003. It's a spatial/temporal denoiser that's perfect for removing film grain:

ffmpeg -i The_Matrix.mkv \
  -vf "hqdn3d=4:3:6:4.5" \
  -c:v libx264 -preset medium -crf 18 \
  -tune film matrix_clean.mp4

Parameters:
  luma_spatial=4     (moderate spatial smoothing)
  chroma_spatial=3   (light spatial smoothing)
  luma_tmp=6         (aggressive temporal denoising)
  chroma_tmp=4.5     (moderate temporal denoising)
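
For scripted batch runs, the same command can be assembled programmatically. The sketch below is a minimal example, assuming Node.js; `buildDenoiseArgs` and its option names are illustrative helpers and not part of FFmpeg. It uses hqdn3d's named options (`luma_spatial`, `chroma_spatial`, `luma_tmp`, `chroma_tmp`) so each value is labeled in the filter string.

```javascript
// Build the ffmpeg argument list for the denoise + H.264 encode shown above.
// buildDenoiseArgs is a hypothetical helper name; the defaults mirror hqdn3d=4:3:6:4.5.
function buildDenoiseArgs(input, output, strength = {}) {
    const {
        lumaSpatial = 4,
        chromaSpatial = 3,
        lumaTmp = 6,
        chromaTmp = 4.5,
    } = strength;

    // Named-option form of the hqdn3d filter string.
    const filter =
        `hqdn3d=luma_spatial=${lumaSpatial}:chroma_spatial=${chromaSpatial}` +
        `:luma_tmp=${lumaTmp}:chroma_tmp=${chromaTmp}`;

    return [
        '-i', input,
        '-vf', filter,
        '-c:v', 'libx264', '-preset', 'medium', '-crf', '18',
        '-tune', 'film',
        output,
    ];
}

// Print the full command line for inspection.
const denoiseArgs = buildDenoiseArgs('The_Matrix.mkv', 'matrix_clean.mp4');
console.log(['ffmpeg', ...denoiseArgs].join(' '));
```

The returned array can be passed directly to `child_process.spawn('ffmpeg', denoiseArgs)`, which avoids shell quoting issues with the filter string.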

✅ Stunning Success

Test: Denoise The Matrix samples with hqdn3d before H.264 encoding

Result: an 82% average file size reduction across three samples (details below).

The Test Results: Matrix vs Bunny

We tested on two very different types of content:

Matrix Denoising Results (5-minute samples)

Sample                Original → Denoised    Size reduction
The Matrix (10 min)   730 MB → 123 MB        83%
The Matrix (40 min)   677 MB → 126 MB        81%
The Matrix (70 min)   771 MB → 130 MB        83%
Average (Matrix)                             82%

Consistent results!

Big Buck Bunny: Not So Lucky

Animated content is already very clean. Denoising it doesn't help much:

Sample          Original → Denoised    Size reduction
Bunny (1 min)   16 MB → 14 MB          12%
Bunny (4 min)   13 MB → 13 MB          0%
Bunny (7 min)   16 MB → 23 MB ⚠️       -44% (file grew)

Why Big Buck Bunny Failed

Animated content is already very clean, so there is little grain for the denoiser to remove.

Conclusion: This technique is content-specific. It works amazingly on grainy film, poorly on animation.
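
Because the gains depend so heavily on the content, one practical option is to trial-encode a short sample both ways and only keep the denoised version when it is clearly smaller. The sketch below recomputes the reductions from the sizes reported above and applies such a rule. The 10% threshold and the function names are assumptions made for illustration, not something measured or used in these experiments.

```javascript
// Percentage by which the denoised encode is smaller than the original.
// A negative value means the file grew.
function reductionPercent(originalMB, denoisedMB) {
    return (1 - denoisedMB / originalMB) * 100;
}

// Hypothetical rule: keep the denoised encode only if it saves at least
// minSavingsPercent. The 10% default is an arbitrary example value.
function shouldKeepDenoised(originalMB, denoisedMB, minSavingsPercent = 10) {
    return reductionPercent(originalMB, denoisedMB) >= minSavingsPercent;
}

// File sizes (MB) taken from the results in this article.
const samples = [
    { name: 'Matrix @ 10 min', original: 730, denoised: 123 },
    { name: 'Matrix @ 40 min', original: 677, denoised: 126 },
    { name: 'Matrix @ 70 min', original: 771, denoised: 130 },
    { name: 'Bunny @ 1 min', original: 16, denoised: 14 },
    { name: 'Bunny @ 4 min', original: 13, denoised: 13 },
    { name: 'Bunny @ 7 min', original: 16, denoised: 23 },
];

for (const s of samples) {
    const pct = reductionPercent(s.original, s.denoised).toFixed(1);
    const verdict = shouldKeepDenoised(s.original, s.denoised)
        ? 'keep denoised'
        : 'keep original';
    console.log(`${s.name}: ${pct}% reduction, ${verdict}`);
}
```

With this threshold the three Matrix samples pass easily, while the 4-minute and 7-minute Bunny samples are rejected.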

Visual Comparisons: The Complete Journey

Here's the full transformation for each sample: Original (with grain) → Denoised (grain removed) → With Grain Overlay (grain added back via Canvas). The grain overlay adds only 0.4 MB overhead!

Matrix @ 10 min - Dark Interrogation Scene

1. Original: 730 MB. HEVC 10-bit with natural film grain. All that grain costs 607 MB!
2. Denoised: 123 MB (-83% reduction). Grain removed, detail preserved. Clean but looks "too digital".
3. With Grain Overlay: 123.4 MB (-83%, grain = 0.4 MB). Film aesthetic restored! Best of both worlds ✓

Matrix @ 40 min - Agent Training Program

1. Original: 677 MB. Training sequence with Morpheus. Consistent grain throughout.
2. Denoised: 126 MB (-81% reduction). Smooth, compressed efficiently.
3. With Grain Overlay: 126.4 MB (-81% total). Grain pattern composited. User-adjustable strength.

Matrix @ 70 min - Subway Station Fight

1. Original: 771 MB. High-motion action scene. Grain + motion = expensive!
2. Denoised: 130 MB (-83% reduction). Even action scenes compress well.
3. With Grain Overlay: 130.4 MB (-83% total). Grain moves with animation. Authentic film look.

Big Buck Bunny Comparisons (Animation)

For contrast, here's how animated content behaves — very different results!

Bunny @ 1 min

1. Original: 16 MB. Already clean (animation).
2. Denoised: 14 MB (-12% only). Minimal gains on animation.
3. With Grain: 14.4 MB. Grain looks artificial on animation. Not recommended ✗

The Grain Overlay Solution

But wait — doesn't removing grain make the video look "too digital"? That's where the innovation comes in.

Instead of baking grain back into the video (which would destroy the compression), we:

  1. Store the clean denoised video (123 MB)
  2. Store a small tileable grain pattern (0.4 MB)
  3. Composite the grain during playback using HTML5 Canvas overlay

Total size: 123 MB + 0.4 MB = 123.4 MB (grain overhead is negligible!)

The Breakthrough

We created a browser-based video player that overlays film grain in real time using the HTML5 Canvas API. It works in any modern browser.

How It Works: Canvas Overlay Technique

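// Assumed setup: a <canvas id="grain-overlay"> positioned over the <video>
// (the element ID and the example values below are illustrative)
const canvas = document.getElementById('grain-overlay');
const ctx = canvas.getContext('2d');
const TILE = 512;    // Grain tile size in pixels
let opacity = 0.15;  // User-adjustable grain strength
let speed = 7;       // Grain drift in pixels per frame
let frame = 0;       // Frame counter driving the animation

// 1. Load grain pattern (once, ~0.4 MB) and start drawing when it is ready
const grainImage = new Image();
grainImage.src = 'matrix_grain_sample.png';
grainImage.onload = () => requestAnimationFrame(drawGrain);

// 2. On each frame, tile grain pattern across canvas
function drawGrain() {
    const { width, height } = canvas;

    // Clear the previous frame so grain does not accumulate
    ctx.clearRect(0, 0, width, height);
    ctx.globalAlpha = opacity; // User-adjustable

    // Animate grain position for authentic look
    const offsetX = (frame * speed) % TILE;
    const offsetY = (frame * speed * 0.7) % TILE;

    // Tile grain across video, starting one tile off-canvas so the
    // shifted pattern never leaves a gap at the top or left edge
    for (let y = -TILE; y < height + TILE; y += TILE) {
        for (let x = -TILE; x < width + TILE; x += TILE) {
            ctx.drawImage(grainImage, x + offsetX, y + offsetY);
        }
    }

    frame++;
    requestAnimationFrame(drawGrain);
}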
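
The tiling arithmetic can be checked without a browser. The sketch below pulls the offset and loop logic into a pure function and verifies that the shifted tiles always cover the whole canvas. `tileOrigins` and `coversCanvas` are illustrative names introduced here, not functions from the actual player.

```javascript
// Top-left corner of every grain tile drawn for a given frame.
// Mirrors the offset and loop bounds of the overlay code above.
function tileOrigins(width, height, frame, speed, tile = 512) {
    const offsetX = (frame * speed) % tile;
    const offsetY = (frame * speed * 0.7) % tile;
    const origins = [];
    for (let y = -tile; y < height + tile; y += tile) {
        for (let x = -tile; x < width + tile; x += tile) {
            origins.push({ x: x + offsetX, y: y + offsetY });
        }
    }
    return origins;
}

// Tiles sit on a regular grid, so the canvas is fully covered when the
// first tile starts at or before 0 and the last tile ends at or past the edge.
function coversCanvas(origins, width, height, tile = 512) {
    const xs = origins.map((o) => o.x);
    const ys = origins.map((o) => o.y);
    return (
        Math.min(...xs) <= 0 &&
        Math.max(...xs) + tile >= width &&
        Math.min(...ys) <= 0 &&
        Math.max(...ys) + tile >= height
    );
}

const origins = tileOrigins(1920, 1080, 100, 7);
console.log(`${origins.length} tiles, covered: ${coversCanvas(origins, 1920, 1080)}`);
```

For a 1920×1080 canvas this loop draws a 6 × 5 grid of tiles per frame, some of them partly or fully off-screen. That is more than strictly needed, but it keeps the loop simple and guarantees there are no gaps as the pattern drifts.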

🎬 Interactive Comparisons

Compare the original grainy footage with denoised versions. All videos are 5-minute samples from different timestamps.

Matrix @ 10 min - Original (730 MB): natural film grain visible throughout; dark interrogation scene with Agent Smith.
Matrix @ 10 min - Denoised (123 MB, -83%): grain removed, detail preserved.
Matrix @ 40 min - Original (677 MB): agent training program sequence.
Matrix @ 40 min - Denoised (126 MB, -81%): consistent compression across different scenes.
Matrix @ 70 min - Original (771 MB): subway station fight scene (high motion).
Matrix @ 70 min - Denoised (130 MB, -83%): even action scenes compress well.


Lessons Learned

What We Discovered

When to Use This Technique

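✅ Perfect for: grainy film sources, such as the live-action Matrix samples above

❌ Don't use for: animation and other already-clean content, such as Big Buck Bunny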

The Math: Why This Works

Film grain is high-frequency random noise that's extremely expensive to encode:

Original The Matrix Frame:
├── Structural content (faces, objects, backgrounds)
│   → Compresses well with motion prediction
│   → ~100 KB per frame
│
└── Film grain (random per-pixel noise)
    → No correlation between frames
    → Motion prediction fails
    → ~500 KB per frame trying to encode noise

Remove the grain:
├── Structural content: ~100 KB per frame (same)
└── Film grain: REMOVED ✓

Result: 600 KB → 100 KB per frame = 83% reduction
"Compression is about finding patterns. Film grain has no patterns. Stop trying to compress it — store it once and overlay it." — WINK Engineering
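
The principle can be demonstrated with a general-purpose compressor. The sketch below is only an analogy, assuming Node.js and its built-in `zlib` module: deflate is not H.264 and has no motion prediction, and the synthetic frame, grain amplitude, and function names are all made up for illustration. It compresses a smooth gradient with and without pseudo-random noise added and compares the resulting sizes.

```javascript
const zlib = require('zlib');

// Build one synthetic 8-bit grayscale frame: a smooth horizontal gradient
// plus optional pseudo-random "grain" of the given amplitude.
// A small LCG keeps the noise deterministic so runs are repeatable.
function makeFrame(width, height, grainAmplitude, seed = 1) {
    let state = seed >>> 0;
    const frame = Buffer.alloc(width * height);
    for (let y = 0; y < height; y++) {
        for (let x = 0; x < width; x++) {
            const base = 20 + Math.floor((x / width) * 200);
            state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
            const noise = Math.round((state / 4294967296 - 0.5) * 2 * grainAmplitude);
            frame[y * width + x] = Math.min(255, Math.max(0, base + noise));
        }
    }
    return frame;
}

// Compressed size in bytes using deflate (a stand-in for a real video codec).
function compressedSize(frame) {
    return zlib.deflateSync(frame).length;
}

const cleanBytes = compressedSize(makeFrame(256, 256, 0));
const grainyBytes = compressedSize(makeFrame(256, 256, 12));
console.log(`clean: ${cleanBytes} B, grainy: ${grainyBytes} B`);
```

The clean gradient repeats row after row, so the compressor finds long matches and the output is tiny. The grainy frame has the same underlying structure, but the per-pixel noise breaks those matches and most of the output ends up describing the noise itself.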

Comparison to AV1 Film Grain Synthesis

AV1's approach vs. our grain overlay technique:

AV1 Film Grain Synthesis

  • Store parametric grain data per frame
  • Synthesize grain in AV1 decoder
  • Requires 3,500+ lines of code
  • Works only in AV1 decoders
  • Frame-accurate grain
  • Not user-adjustable

Result: Codec-specific, complex

Our Grain Overlay Method

  • Store single tileable grain pattern
  • Composite in browser using Canvas
  • ~150 lines of JavaScript
  • Works in any browser
  • Approximate (tiled) grain
  • ✅ User-adjustable strength

Result: Universal, simple, flexible

Distribution Strategy

For end users, you can package the optimized content like this:

Matrix_Optimized/
├── matrix_clean.mp4           (123 MB - denoised video)
├── grain_overlay.png          (0.4 MB - tileable grain pattern)
├── player.html                (HTML5 player with grain controls)
└── README.txt                 (instructions)

Total: 123.4 MB vs 730 MB original (83% reduction)

Users can:
- Play in any video player (clean version)
- Use HTML player for grain overlay + adjustable controls
- Upload to web hosting for streaming
- Share via standard video platforms

Conclusion: The Right Tool for the Right Job

We set out to backport AV1 features to H.264. The CDEF experiment failed spectacularly — a 366% file size increase taught us that in-loop filters don't work as preprocessors.

But the film grain removal experiment succeeded beyond expectations — an 82% average reduction on The Matrix proved that sometimes the simplest approach (use existing denoising tools) beats complex codec-specific implementations.

Key Takeaways