Show HN: Music Generation - 100x Speed Demo
The Riffusion demo generates 30 seconds of 44 kHz stereo audio in 0.3 seconds using GPU-based diffusion. Users blend genres such as angelic choir and trap beat, with clickable prompts for mixing, making it easy to experiment with music blending.
Read original article

The Riffusion demo generates 30 seconds of 44 kHz stereo audio in just 0.3 seconds, running a diffusion model on a GPU. Users drag circles to control a transition between two genres, such as an angelic choir and a trap beat, and can click prompts to vary the output. The demo offers a quick, interactive way to experiment with blending music genres and creating unique sound combinations.
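As a back-of-the-envelope check, the headline "100x" figure follows directly from the numbers in the demo: 30 seconds of audio in 0.3 seconds of wall-clock time. A tiny sketch (function names here are illustrative, not part of the demo):

```javascript
// The speedup factor is generated-audio duration over wall-clock generation time.
function realtimeFactor(audioSeconds, wallSeconds) {
  return audioSeconds / wallSeconds; // 30 / 0.3 = 100x real time
}

// Sample count for one generation: duration × sample rate × channels.
function totalSamples(seconds, sampleRate, channels) {
  return seconds * sampleRate * channels; // 30 × 44,100 × 2 = 2,646,000 samples
}
```

So each 0.3-second generation produces roughly 2.6 million samples of stereo audio.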
Related
Generating audio for video
Google DeepMind introduces V2A technology for video soundtracks, enhancing silent videos with synchronized audio. The system allows users to guide sound creation, aligning audio closely with visuals for realistic outputs. Ongoing research addresses challenges like maintaining audio quality and improving lip synchronization. DeepMind prioritizes responsible AI development, incorporating diverse perspectives and planning safety assessments before wider public access.
Show HN: I built a large JavaScript powered flipdisc display. Here's a guide
Flipdisc displays, or flip dots, use electromagnetic pulses to switch colors. A project details building a large interactive display for offices, covering construction, power, software, and design considerations. It aims to explore real-time visualizations and user interactions, hoping to make flipdisc technology more accessible.
We increased our rendering speeds by 70x using the WebCodecs API
Revideo, a TypeScript video framework, sped up rendering 70x by moving video encoding into the browser with the WebCodecs API. Limitations around audio processing and browser compatibility remain.
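The core of the WebCodecs approach is handing raw frames to a hardware-backed encoder in the browser instead of piping pixels to a server-side encoder. A minimal sketch, assuming a browser that exposes `VideoEncoder`; the helper names and config values are illustrative, not Revideo's actual API:

```javascript
// Example encoder config: avc1.42001f is H.264 Baseline, level 3.1 —
// a widely supported codec string; bitrate here is an arbitrary example.
function makeEncoderConfig(width, height, framerate) {
  return { codec: "avc1.42001f", width, height, framerate, bitrate: 2_000_000 };
}

// Encode a list of VideoFrame objects, delivering each EncodedVideoChunk
// to the caller as it becomes available.
async function encodeFrames(frames, onChunk) {
  const encoder = new VideoEncoder({
    output: onChunk,               // called once per encoded chunk
    error: (e) => console.error(e),
  });
  encoder.configure(makeEncoderConfig(1280, 720, 30));
  for (const frame of frames) {
    encoder.encode(frame);
    frame.close();                 // release the frame's backing memory promptly
  }
  await encoder.flush();           // wait for all pending chunks to be emitted
  encoder.close();
}
```

The speedup comes from the encoder running natively (often on dedicated hardware) while the page only shuffles frames and chunks.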
Real-time audio programming 101: time waits for nothing
Real-time audio code must avoid blocking operations (locks, allocation, I/O) in the audio callback and be engineered around worst-case, not average, execution time, since a single missed deadline is an audible glitch. Buffer sizes of 1-5 ms are typical in low-latency software.
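The buffer size is what sets the callback's hard deadline: a buffer of n frames at sample rate sr must be refilled every n/sr seconds, every time. A small sketch of that arithmetic (illustrative only):

```javascript
// Time budget for one audio callback: buffer length divided by sample rate.
function bufferLatencyMs(frames, sampleRate) {
  return (frames / sampleRate) * 1000;
}

// Typical low-latency buffer sizes at 44.1 kHz:
//   64 frames  ≈ 1.45 ms
//   128 frames ≈ 2.90 ms
//   256 frames ≈ 5.80 ms
// Everything the callback does — worst case included — must fit in that window.
```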