June 25th, 2024

Texture Enhancement for Video Super-Resolution

The GitHub repository hosts the official PyTorch implementation of "EvTexture: Event-driven Texture Enhancement for Video Super-Resolution", presented at ICML 2024. It includes author details, news updates, video demos, installation and testing instructions, data preparation steps, citation information, contact details, and license acknowledgements.


10 comments
By @vessenes - 7 months
So this is surprisingly bleeding edge, at least to me. I had to go learn about some hardware and physical imaging stuff I didn’t know to get my head around it.

Upshot: event cameras are a different sort of camera in that they have an array of sensor pixels and each pixel fires only when it sees a brightness change. This has a bunch of benefits, including very high dynamic range, reduced ghosting, and high frame rates, and some downsides, like making it harder to reconstruct conventional video, and presumably others.
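For a concrete picture of that firing rule, here's a minimal sketch of what a single event-camera pixel does (my own illustration of the general principle, not code from the paper or the repo): it emits a timestamped +1/-1 event each time its log intensity has drifted past a contrast threshold since the last event.

```python
# Illustrative event-camera pixel model (not from the EvTexture code):
# an event fires whenever log intensity moves by more than `threshold`
# since the last event, with polarity +1 (brighter) or -1 (darker).
import numpy as np

def pixel_events(intensity, times, threshold=0.2):
    """intensity, times: 1-D arrays for one pixel; returns a list of (time, polarity)."""
    events = []
    ref = np.log(intensity[0] + 1e-6)            # log level at the last emitted event
    for t, value in zip(times[1:], intensity[1:]):
        log_i = np.log(value + 1e-6)
        while abs(log_i - ref) >= threshold:     # a large change can fire several events
            polarity = 1 if log_i > ref else -1
            events.append((t, polarity))
            ref += polarity * threshold
    return events

# A slowly brightening-then-dimming pixel yields a sparse stream of +/- events;
# a static pixel yields none, which is where the dynamic-range and speed wins come from.
t = np.linspace(0.0, 1.0, 1000)
print(pixel_events(0.5 + 0.4 * np.sin(2 * np.pi * t), t)[:5])
```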

The paper seems to have started out with the idea that if you had event camera output, you'd be able to reconstruct finer texture detail. And this works incredibly well: their baby model, trained for 8 days, significantly beats SOTA and looks a lot better in comparisons as well.

They then seem to have added a step where you simulate/infer event camera data from “normal” RGB video, using a different set of networks, and use that inferred event data to do the texture recovery, and … this also works.
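In code terms, that two-stage flow might look roughly like the sketch below; the function and module names are placeholders for illustration, not the actual EvTexture API.

```python
# Hypothetical sketch of the two-stage pipeline described above.
# `event_simulator` and `sr_network` are stand-ins for whatever networks
# the repo actually ships; shapes are indicative only.
import torch

def super_resolve_clip(lr_frames: torch.Tensor,
                       event_simulator: torch.nn.Module,
                       sr_network: torch.nn.Module) -> torch.Tensor:
    """lr_frames: (T, 3, H, W) low-resolution RGB clip in [0, 1]."""
    with torch.no_grad():
        # Stage 1: infer the event stream an event camera would have produced
        # while watching this clip (brightness-change events between frames).
        events = event_simulator(lr_frames)       # e.g. (T-1, bins, H, W) voxel grid
        # Stage 2: the SR network uses the events as an extra texture cue
        # alongside the RGB frames themselves.
        return sr_network(lr_frames, events)      # (T, 3, scale*H, scale*W)
```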

Pretty surprising, and interesting. Their GitHub is full of people saying "I want to try this" and then realizing it's a fairly deep stack to deploy. Even as is, it seems worth someone building a GUI around this in an app; it's quite remarkable.

By @mermerico - 7 months
If my interpretation of the paper is correct, they are using the high resolution event data in addition to the low resolution RGB data in order to do the reconstruction, so this technique won't enhance random videos on the internet. It's a new algorithm to take advantage of event-based cameras that usually record both high resolution event data and low resolution RGB.
By @dartharva - 7 months
So we finally have the magical "Enhance" button from sci-fi detective movies and shows, nice!
By @ComputerGuru - 7 months
I have an assortment of low-quality original encodes from the '90s (thousands of MPEG and FLV web videos, think DivX and co) that I’ve refrained from re-encoding in hopes that some day AI would get there and having the originals would pay off. But looking at all the “originals” in the demo, they’re all super blurry (blurry upscaling, I know, but also trademark h264 low bitrate or high deblocking). It would be ironic if I had to use h264/h265 as a deblocking upscale intermediate step before using something like this someday.
By @smusamashah - 7 months
Project page with a few different clips: https://dachunkai.github.io/evtexture.github.io/
By @pornel - 7 months
They don't explain what event-driven means, but AFAIK it's based on diffs between frames, which highlight motion and de-emphasise overall brightness/exposure:

https://github.com/uzh-rpg/rpg_vid2e?tab=readme-ov-file#read...
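As a rough stand-in for that idea (far cruder than a real simulator like the one linked above), you can count per-pixel threshold crossings of log intensity between consecutive frames, keeping positive and negative polarities separate:

```python
# Crude frame-difference "event" simulation, for illustration only:
# static pixels produce nothing; moving edges and texture produce counts.
import numpy as np

def events_from_frames(frames, threshold=0.2):
    """frames: (T, H, W) grayscale in [0, 1]; returns (T-1, 2, H, W) event counts."""
    log_frames = np.log(frames.astype(np.float64) + 1e-6)
    diff = np.diff(log_frames, axis=0)                      # per-pixel log-intensity change
    pos = np.floor(np.clip(diff, 0.0, None) / threshold)    # brightening events
    neg = np.floor(np.clip(-diff, 0.0, None) / threshold)   # darkening events
    return np.stack([pos, neg], axis=1)

# A static background produces zero events; a moving bright square produces events
# only along its path, which is why these streams emphasize motion over exposure.
clip = np.full((8, 64, 64), 0.3)
for i in range(8):
    clip[i, 10 + i : 20 + i, 10:20] = 0.9
print(events_from_frames(clip).sum(axis=(1, 2, 3)))         # events per frame pair
```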

By @the8472 - 7 months
All the sample clips have camera motion. Does it perform worse with a static camera or is there enough variation from frame to frame to still recover details?
By @bilater - 7 months
It would be great if there were an upfront metric for how long the process takes (say, per one minute of video), as it usually is a lot.
By @K0balt - 7 months
I wonder what you get if you give it extremely low resolution pictures (say, 64x64)
By @ggm - 7 months
Some historical and b&w footage might have sold the idea. We're hoarding low-resolution half-scan VHS of our family Super 8.