Show HN: Music Generation - 100x Speed Demo
The Riffusion demo generates 30 seconds of 44 kHz stereo audio in 0.3 seconds using GPU-based diffusion. Users blend genres such as angelic choir and trap beat, with clickable prompts for mixing, making it easy to experiment with music blending.
Read original article

The Riffusion demo generates 30 seconds of 44 kHz stereo audio in just 0.3 seconds, running a diffusion model on a GPU. Users drag circles to control a transition between two genres, such as an angelic choir and a trap beat, and can click prompts to vary the output. The demo offers a quick, interactive way to experiment with blending music genres and creating unique sound combinations.
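As a back-of-the-envelope check, the headline "100x" figure follows directly from the numbers in the demo: 30 seconds of audio in 0.3 seconds of wall-clock time. A tiny sketch (function names here are illustrative, not part of the demo):

```javascript
// The speedup factor is generated-audio duration over wall-clock generation time.
function realtimeFactor(audioSeconds, wallSeconds) {
  return audioSeconds / wallSeconds; // 30 / 0.3 = 100x real time
}

// Sample count for one generation: duration × sample rate × channels.
function totalSamples(seconds, sampleRate, channels) {
  return seconds * sampleRate * channels; // 30 × 44,100 × 2 = 2,646,000 samples
}
```

So each 0.3-second generation produces roughly 2.6 million samples of stereo audio.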
Related
Generating audio for video
Google DeepMind introduces V2A technology for video soundtracks, enhancing silent videos with synchronized audio. The system allows users to guide sound creation, aligning audio closely with visuals for realistic outputs. Ongoing research addresses challenges like maintaining audio quality and improving lip synchronization. DeepMind prioritizes responsible AI development, incorporating diverse perspectives and planning safety assessments before wider public access.
Show HN: I built a large JavaScript powered flipdisc display. Here's a guide
Flipdisc displays, or flip dots, use electromagnetic pulses to switch colors. A project details building a large interactive display for offices, covering construction, power, software, and design considerations. It aims to explore real-time visualizations and user interactions, hoping to make flipdisc technology more accessible.
We increased our rendering speeds by 70x using the WebCodecs API
Revideo, a TypeScript video framework, sped up rendering 70x by moving video encoding into the browser with the WebCodecs API. Limitations around audio processing and browser compatibility remain.
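The core of the WebCodecs approach is handing raw frames to a hardware-backed encoder in the browser instead of piping pixels to a server-side encoder. A minimal sketch, assuming a browser that exposes `VideoEncoder`; the helper names and config values are illustrative, not Revideo's actual API:

```javascript
// Example encoder config: avc1.42001f is H.264 Baseline, level 3.1 —
// a widely supported codec string; bitrate here is an arbitrary example.
function makeEncoderConfig(width, height, framerate) {
  return { codec: "avc1.42001f", width, height, framerate, bitrate: 2_000_000 };
}

// Encode a list of VideoFrame objects, delivering each EncodedVideoChunk
// to the caller as it becomes available.
async function encodeFrames(frames, onChunk) {
  const encoder = new VideoEncoder({
    output: onChunk,               // called once per encoded chunk
    error: (e) => console.error(e),
  });
  encoder.configure(makeEncoderConfig(1280, 720, 30));
  for (const frame of frames) {
    encoder.encode(frame);
    frame.close();                 // release the frame's backing memory promptly
  }
  await encoder.flush();           // wait for all pending chunks to be emitted
  encoder.close();
}
```

The speedup comes from the encoder running natively (often on dedicated hardware) while the page only shuffles frames and chunks.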
Real-time audio programming 101: time waits for nothing
Real-time audio code must avoid blocking operations (locks, allocation, I/O) in the audio callback and be engineered around worst-case, not average, execution time, since a single missed deadline is an audible glitch. Buffer sizes of 1-5 ms are typical in low-latency software.
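The buffer size is what sets the callback's hard deadline: a buffer of n frames at sample rate sr must be refilled every n/sr seconds, every time. A small sketch of that arithmetic (illustrative only):

```javascript
// Time budget for one audio callback: buffer length divided by sample rate.
function bufferLatencyMs(frames, sampleRate) {
  return (frames / sampleRate) * 1000;
}

// Typical low-latency buffer sizes at 44.1 kHz:
//   64 frames  ≈ 1.45 ms
//   128 frames ≈ 2.90 ms
//   256 frames ≈ 5.80 ms
// Everything the callback does — worst case included — must fit in that window.
```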