Netflix's bet on advanced video encoding
Anne Aaron, Netflix's senior encoding technology director, drives bandwidth savings and quality improvements through innovative encoding methods like per-title encoding and machine learning models. Netflix's commitment to optimizing streaming quality remains strong.
Netflix's senior encoding technology director, Anne Aaron, has been instrumental in optimizing the way Netflix encodes its content over the past 13 years. Her team's work has led to significant bandwidth savings for 4K streams and contributed to industry efforts like the development of the AV1 video codec. As Netflix expands into cloud gaming and live streaming, Aaron faces new challenges in real-time encoding for live events like WWE RAW. The company's per-title encoding approach, introduced in 2015, has delivered bandwidth savings and improved streaming quality. By encoding videos shot by shot and applying different settings to each segment, Netflix aims for optimal visual quality while saving bandwidth. Aaron's team uses machine learning models to analyze video quality and determine the best encoding settings for each slice of content. Netflix's involvement in advancing video codecs like AV1, and in developing new metrics for HDR encoding, demonstrates its commitment to optimizing streaming quality. Despite challenges, such as the need for tailored testing content, Netflix continues to lead in advanced video encoding techniques to enhance the viewer experience.
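A minimal sketch of what such a per-shot encoding search could look like, assuming shot boundaries come from a separate scene-change detector and that ffmpeg is built with libx264 and libvmaf; the quality target, CRF ladder, and file names are illustrative, not Netflix's actual pipeline:

```python
# Hypothetical per-shot encoding search (NOT Netflix's actual pipeline):
# for each shot, try x264 CRF values from cheapest to most expensive and
# keep the first encode that clears a VMAF quality target.
import re
import subprocess

VMAF_TARGET = 93.0                  # illustrative quality floor
CRF_CANDIDATES = [28, 24, 20, 16]   # cheapest first, best quality last

def encode_shot(src, start, dur, crf, out):
    """Encode one shot of `src` (a seconds-based slice) with x264 at `crf`."""
    subprocess.run(
        ["ffmpeg", "-y", "-ss", str(start), "-t", str(dur), "-i", src,
         "-c:v", "libx264", "-crf", str(crf), "-an", out],
        check=True, capture_output=True)

def vmaf_score(distorted, reference, start, dur):
    """Score `distorted` against the matching slice of `reference`."""
    result = subprocess.run(
        ["ffmpeg", "-i", distorted,
         "-ss", str(start), "-t", str(dur), "-i", reference,
         "-lavfi", "libvmaf", "-f", "null", "-"],
        capture_output=True, text=True)
    m = re.search(r"VMAF score: ([\d.]+)", result.stderr)
    return float(m.group(1)) if m else 0.0

def encode_per_shot(src, shots):
    """`shots` is a list of (start_sec, duration_sec) pairs."""
    chosen = []
    for i, (start, dur) in enumerate(shots):
        for crf in CRF_CANDIDATES:
            out = f"shot{i:04d}_crf{crf}.mp4"
            encode_shot(src, start, dur, crf, out)
            if vmaf_score(out, src, start, dur) >= VMAF_TARGET:
                break               # cheapest rung that clears the bar
        chosen.append(out)          # else: fall back to best-quality try
    return chosen
```

The point is simply that easy shots clear the quality bar at a cheap CRF while complex shots get more bits; the production pipeline additionally searches over resolutions and codecs.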
Related
20x Faster Background Removal in the Browser Using ONNX Runtime with WebGPU
Using ONNX Runtime with WebGPU and WebAssembly in browsers achieves 20x speedup for background removal, reducing server load, enhancing scalability, and improving data security. ONNX models run efficiently with WebGPU support, offering near real-time performance. Leveraging modern technology, IMG.LY aims to enhance design tools' accessibility and efficiency.
Video annotator: a framework for efficiently building video classifiers
The Netflix Technology Blog presents the Video Annotator (VA) framework for efficient video classifier creation. VA integrates vision-language models, active learning, and user validation, outperforming baseline methods with an 8.3 point Average Precision improvement.
Generating audio for video
Google DeepMind introduces V2A technology for video soundtracks, enhancing silent videos with synchronized audio. The system allows users to guide sound creation, aligning audio closely with visuals for realistic outputs. Ongoing research addresses challenges like maintaining audio quality and improving lip synchronization. DeepMind prioritizes responsible AI development, incorporating diverse perspectives and planning safety assessments before wider public access.
HybridNeRF: Efficient Neural Rendering
HybridNeRF combines surface and volumetric representations for efficient neural rendering, achieving 15-30% error rate improvement over baselines. It enables real-time framerates of 36 FPS at 2K×2K resolutions, outperforming VR-NeRF in quality and speed on various datasets.
'Meridian': Why Netflix Is Helping Competitors with Content and Code (2016)
Netflix created "Meridian," a 12-minute film for testing video codecs on 4K TVs. Shared under a Creative Commons license, it aids industry collaboration and promotes open source practices in Hollywood.
I got into programming/software by encoding my *cough* well-sourced *cough* movies/TV shows to MP4 for my iPod video.
Far be it from me to nitpick, and maybe this is insufferably geeky detail, but the slow decade-long march described here ("gee, each movie is different", "gee, each scene is different"), framed as Herculean work by FAANGers insufficiently appreciated by creative types, was solved by VBR (variable bit rate) years upon years earlier.
Once you get to "we'll use ML as a VBR algorithm!", that's original, but the problems described, and their solution, were understandable and solvable by an 18-year-old non-programmer in 2007 with free software.
VBR wasn't some niche thing either; it's a very, very obvious optimization I've never seen a codec miss, from MP3 audio to MP4 video. There are no caveats here, no haughtiness, no flippant "Dropbox is rsync + my weekend" dismissiveness on my part. It wasn't news to _anyone_; it's a very obvious optimization that was applied by everyone.
I'd be veeeeery curious whether there was much contribution here beyond using x264, occasionally with patches, and then engineering a pipeline around it.
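For readers who haven't touched encoder settings: the VBR behavior the comment refers to is a couple of flags away in stock tooling. A minimal illustration using ffmpeg with libx264 (file names are placeholders):

```python
# Plain-vanilla VBR, per the comment above: stock ffmpeg + x264 flags,
# no custom tooling required.
import subprocess

# Constant-quality VBR (CRF): easy scenes automatically get fewer bits.
subprocess.run(["ffmpeg", "-i", "movie.mkv",
                "-c:v", "libx264", "-crf", "20", "vbr.mp4"], check=True)

# Capped VBR: same idea, with the bitrate ceiling a streaming ladder needs.
subprocess.run(["ffmpeg", "-i", "movie.mkv",
                "-c:v", "libx264", "-crf", "20",
                "-maxrate", "4M", "-bufsize", "8M", "capped.mp4"], check=True)
```

CRF mode spends bits where the content is complex; the capped variant adds the bandwidth ceiling that streaming delivery needs.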
Even better, they "researched" better metrics like pVMAF so they can again show how good they are, in theory.
Or are they just aggressively searching for corners to cut to save bits?
I.e., slice it not just into scenes but also into objects and allocate bitrate at that level: faces and objects in the foreground get more bits. It seems we now have pretty small models that can do that sort of thing (see the recent Apple and MS ones), so it should be feasible at scale.
I'd imagine you can also train an LLM on patterns that encoders choke on... chequered patterns, etc.
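A hedged sketch of the per-object idea above, using FFmpeg's addroi filter (available in recent builds; libx264 can honor the hint). A negative qoffset asks the encoder to spend more bits in the region. A real system would get the box from a per-shot face/object detector; the fixed center box and file names here are purely illustrative:

```python
# Region-of-interest encoding sketch: hand the encoder a quality hint for
# an (assumed) subject region via FFmpeg's addroi filter.
import subprocess

roi = "addroi=x=iw/4:y=ih/4:w=iw/2:h=ih/2:qoffset=-1/5"

subprocess.run(
    ["ffmpeg", "-i", "shot.mp4",
     "-vf", roi,                    # mark the (assumed) subject region
     "-c:v", "libx264", "-crf", "23",
     "roi_encoded.mp4"],
    check=True)
```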
- Netflix will only send you "high quality" (over 720p) streams on certain browsers (IE on Windows, Safari on OS X, ???? on Linux) that support Encrypted Media Extensions.
- Netflix also auto-scales the quality they are sending based on their understanding of your connection (roughly sketched after this comment).
- Also apparently they do per-title re-encoding passes.
This all combines, in my experience, into on average the worst and most opaque streaming quality of any service. There are debug modes you can enable to see some of this, but generally it's very hard to tell what quality you're looking at and what is preventing you from getting a better one. I also find that Netflix's "low profile" content (i.e. not max quality) looks bad: the 720p stream looks quite poor in addition to being low resolution.
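A rough sketch of the auto-scaling behavior mentioned in the second bullet above: a client picks the highest ladder rung whose bitrate fits measured throughput with some headroom. The ladder values and safety factor are made up; real players also weigh buffer level, screen size, and DRM/EME constraints:

```python
# Throughput-based quality auto-scaling sketch: pick the highest ladder
# rung whose bitrate fits within a safety margin of measured throughput.
LADDER = [  # (height, bitrate in kbit/s), best first -- illustrative only
    (2160, 15000), (1080, 5800), (720, 3000), (480, 1500), (240, 400),
]

def pick_rung(throughput_kbps, safety=0.8):
    """Return the best rung whose bitrate fits `safety` * throughput."""
    budget = throughput_kbps * safety
    for height, bitrate in LADDER:
        if bitrate <= budget:
            return height, bitrate
    return LADDER[-1]  # degrade to the lowest rung rather than stall

print(pick_rung(4000))  # 4 Mbit/s connection -> (720, 3000)
```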
Rapidly oscillating patches, blurry smears... it's only Netflix (other video streaming services are fine), and it's getting quite annoying. A few times it's been so severe I thought it was a problem with my projector.
German Court Fines Netflix €7.05 Million for Continued Infringement of Broadcom HEVC Patent (2023)
https://www.broadcom.com/company/news/product-releases/61711
You'd think Netflix could do better than the ad-hoc groups of individuals who do it for free in their spare time.
We've already crossed the line where it creates a garbage viewing experience for end users, with pixel porridge everywhere and absolutely horrendous visuals in the darks and shadows for every movie and series.
Netflix engineers are failing (on purpose) to create a decent viewing experience for their users.
Every Netflix show I've seen in the past year (n=3) has had these crazy panning drone shots of forests in winter with really high f-numbers, graphics of a million particles exploding, shots of ocean waves at dusk... basically, they are shooting video encoding stress tests and then encoding them very poorly. The result looks like dogshit.
When people say "refreshing their browser" it sounds like they're watching on a PC. That's probably the worst platform to watch on these days, from a market share point of view. Just saying.