August 15th, 2024

YouTube Video to Tabs and Lyrics

Fish is an AI-powered multimodal model for music information retrieval, generating musical elements like chords and lyrics, featuring advanced audio processing and a specialized architecture for enhanced functionality.

The GitHub repository for the project named Fish presents an AI-powered multimodal model designed for music information retrieval. Its primary function is to generate various musical elements such as chords, beats, lyrics, melody, and tabs for any song using a transformer-based hybrid model. Key features include chord detection, which identifies different chord types and song keys; beat detection for tracking tempo; pitch tracking for vocal melodies; and music structure analysis to label song segments. Additionally, it employs automatic speech recognition (ASR) for lyrics recognition, aligning them with audio, and generates playable sheet music with editing capabilities. The project incorporates advanced audio processing features like source separation, speed adjustment, and pitch shifting. The model architecture consists of several specialized models, including U-Net, Pitch-Net, Beat-Net, Chord-Net, and Segment-Net, with a core model called CombineNet that utilizes an encoder-decoder structure for audio processing. The repository also showcases training results for a speech sample, demonstrating the model's capabilities. For further exploration, users can visit the project's website or access the code in the repository.

- Fish is an AI model for music information retrieval, generating chords, beats, lyrics, and more.

- Key features include chord detection, beat tracking, pitch tracking, and music structure analysis.

- The project uses various specialized models for audio processing, culminating in the CombineNet architecture.

- It offers functionalities like lyrics recognition and the generation of editable sheet music.

- The repository shows training results for a speech sample as a demonstration of the model's capabilities.
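
The summary mentions beat detection for tracking tempo. As a rough illustration of how detected beats relate to tempo, the sketch below converts a list of beat timestamps into beats per minute using the median gap between beats. It is not taken from the Fish repository: the function name `estimate_bpm` and the sample timestamps are hypothetical, and the project's own Beat-Net may work quite differently.

```python
from statistics import median


def estimate_bpm(beat_times):
    """Estimate tempo (beats per minute) from beat timestamps in seconds.

    Hypothetical helper for illustration only; not the project's API.
    """
    if len(beat_times) < 2:
        raise ValueError("need at least two beat timestamps")
    # Gaps between consecutive beats, in seconds.
    intervals = [later - earlier for earlier, later in zip(beat_times, beat_times[1:])]
    if any(gap <= 0 for gap in intervals):
        raise ValueError("beat timestamps must be strictly increasing")
    # The median is less sensitive than the mean to a few missed or extra beats.
    return 60.0 / median(intervals)


# Made-up example: one beat every half second.
beats = [0.0, 0.5, 1.0, 1.5, 2.0]
print(estimate_bpm(beats))  # 120.0
```

Using the median rather than the mean is a design choice here: a single missed beat doubles one gap, which would drag a mean-based estimate down but leaves the median unchanged.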

YouTube in talks with record labels over AI music deal

YouTube is in talks with major record labels to license AI tools replicating artists' music. Some artists are wary of devaluation concerns. Negotiations aim to involve select artists for AI music generation.

Alphatab.net

The website promotes alphaTab, a versatile tool for creating music notation applications on web, desktop, and mobile. It offers responsive display, audio playback synced with notation, and a customizable API. Users can access detailed music sheet data through alphaTab APIs for tailored UI components.

Show HN: AI assisted image editing with audio instructions

The GitHub repository hosts "AAIELA: AI Assisted Image Editing with Language and Audio," a project enabling image editing via audio commands and AI models. It integrates various technologies for object detection, language processing, and image inpainting. Future plans involve model enhancements and feature integrations.

Awesome AI Tools – A Curated List of Artificial Intelligence Top Tools

The "Awesome AI Tools" GitHub repository offers a curated collection of AI tools across various categories, featuring notable models like ChatGPT and DALL·E 2, and encourages user contributions.

A C/C++ library for audio and music analysis

audioFlux is a deep learning library for audio analysis, featuring new pitch algorithms in version 0.1.8. It supports Python 3.6+, with modules for transformations, features, and music information retrieval.

24 comments

By @rwl4 - 8 months

So I'm trying to understand. Is this spam for the Lamucal service? I saw this same code posted on Reddit the other day under a different name. Here are a few repos with the exact same code under different names:

- https://github.com/DoMusic/Hybrid-Net

- https://github.com/TuneMusic/NiceMusic

- https://github.com/JoinMusic/fish

- https://github.com/Famuse/CombineNet

- https://github.com/AIAudioLab/AITabs

- https://github.com/AIMusicLab/MicroMuisc

I'm pretty sure there are more, but I'll stop there. Especially suspicious considering all the usernames.

Here's a post from yesterday on Reddit:

- https://www.reddit.com/r/coolaitools/comments/1ervthn/found_...

I'm guessing the general process here is:

- Push novelty (but unusable to most people) code to new GitHub repo

- Submit that code to Reddit/Hacker News

- People see it and are impressed by the novelty code, despite not running it due to missing the models themselves, etc. They upvote and subscribe ($$$) to actually try it.

- Repeat

I understand the desire to promote one's new service, and the product seems like it could be interesting, but this is not the way to get the word out. Reputation matters.

Edit:

Check out the user deeplover's post/comment history. One submission with the MicroMusic (see above) repo, and one comment, see below.

Also, the post by user liwei0517 is almost exactly like BigOrange688 on Reddit. See: https://www.reddit.com/r/MachineLearning/comments/1es0deh/co...

By @rrherr - 8 months

Here are the most impressive results I've seen for automated guitar transcription:

High-resolution guitar transcription via domain adaptation

Demo Videos: https://xavriley.github.io/HighResolutionGuitarTranscription...

Paper: https://arxiv.org/abs/2402.15258

> We propose the use of a high-resolution piano transcription model to train a new guitar transcription model. The resulting model obtains state-of-the-art transcription results on GuitarSet in a zero-shot context, improving on previously published methods.

By @kranner - 8 months

The "tabs" seem to be arpeggiations of the chords, which might have been of some use if the chord detection had worked well, which doesn't seem to be the case. I see chords and tabs being generated from sections which have only spoken audio, while actual guitar parts are not notated at all. The arpeggios are not consistent either and switch arbitrarily to upstrokes/downstrokes and back to arpeggios.

edit: removed a reference to a competing product

By @criddell - 8 months

I tried to get it to generate tabs for Where is my Mind by the Pixies. I see the chords, but get the NO icon (red circle with diagonal bar) when I try to click on tabs. Am I doing something wrong?

A couple of weeks ago I asked one of the AIs to teach me this song. It responded that it can't teach specifics or tell me strumming patterns because it would be a copyright violation. I told it that if I went to a human teacher, they would have no problem teaching me how to play along to the song. That was a good enough argument to get the AI to change its mind (whatever that means) and produce a chord chart and strumming pattern (which was wrong).

By @authorfly - 8 months

Nice idea, but gives the wrong chords for jazz music: https://lamucal.com/chords/emmet-cohen/after-youve-gone-patr...

E.g. Bb instead of Ebmaj7

Bb7 instead of Bm7b5

By @neilyio - 8 months

Very happy to see more tools like this. There is so much potential for interactive tabs and sheet music with YouTube videos.

I only found out about https://www.soundslice.com recently. I'm not sure how it managed to evade me for years of searching for music resources on the internet... but for anyone interested in sheet music, I can't recommend it enough.

The design of the whole platform is so minimal and beautiful, and having notation synchronized with YouTube is simply brilliant. Built by one of the co-creators of Django, too!

By @dadver - 8 months

I played around a few minutes with the various features. The voice removing was kind of impressive, though I don't know how novel that is.

I tried making some AI covers, too, which was kind of fun. For one of my tries, I submitted Nirvana's Smells Like Teen Spirit for AI voice generation to make a cover of Carola's song "Främling" (Sweden's ESC song 1983 which came in third, a very non-Nirvana-type of song). At first, I thought the voice sounded pretty much like a Swedish Kurt Cobain, then the more I listened to it, all I could hear was the Swedish artist Nordman, and it dawned upon me that they have similar voice styles. I tried lowering the pitch, and then I was certain I recognised the voice from another artist but couldn't place it. So I'm leaning towards the AI voices being trained on some not-so-unfamiliar artists rather than there being some cool AI magic happening, though I'm out of my depth here.

By @sixall - 8 months

I am an entrepreneur, and my startup team is about to fail. In response to the questions raised by rwl4, I am very sorry. All the features on our website are completely free for everyone to use. If we can help some people, it will be a small consolation for the team before the failure. I apologize again.

By @buildsjets - 8 months

Pretty cool! Is there a way to either detect or enforce alternate tunings? There is a world beyond EADGBe... I put in a few songs that I know of which have trivial chord fingerings in drop D, and it comes up with some correct but convoluted chord fingerings in standard tuning.

By @SoftMachine - 8 months

I don't often read hackernews for Lamucal AI spams but when I do, it's always nice.

By @ksr - 8 months

I'm looking to do the opposite: Given a melody in MusicXML / MIDI, generate an accompaniment "in the style of". Any pointers?

By @riiii - 8 months

Very interesting. I don't suppose this would work with instrumental music?

Anyone know of a thing that does?

By @smrtinsert - 8 months

This is going to save me a lot of time for when I actually have time to play music. Much better than whipping out Audacity and its chord estimation, manually grabbing a video etc.

By @TrackerFF - 8 months

Seems like a cool concept, but the tab function was more or less useless. Tried a bunch of different songs of varying complexity, couldn't get anything convincing.

By @ndriscoll - 8 months

Neat, it would also be awesome to package it into something like a Clone Hero/Rocksmith tool/plugin to generate charts just like Audiosurf did.

By @zerop - 8 months

Off topic: what's the best way to generate good-quality videos from a transcript? Only automation, no manual work. I can code.

By @CMLab - 8 months

How do AI cover song platforms address copyright, legal, and ethical concerns?

By @guitarlimeo - 8 months

I was expecting something like this https://www.youtube.com/watch?v=nLtlyzWuoqM

but was somewhat disappointed. The site is cool and can give you a head start when transposing a song to practice, but the chords were quite off in a few examples I tried.

By @afpx - 8 months

Wow - nice job!

By @liwei0517 - 8 months

I just trained a Taylor Swift voice model using Lamucal ( https://lamucal.com/ai-cover/share/66be2087bc3fdb000baf3cac ).

The mid-range is eerily close to Taylor's voice - there were moments when I almost thought it was really her singing. But if you listen closely, you can still catch a tiny hint of that robotic sound.

A Bar Song (Taylor Swift cover): https://lamucal.com/ai-cover/song-share/66be244cbc3fdb000e76...

Original, A Bar Song (Tipsy): https://www.youtube.com/watch?v=t7bQwwqW-Hc
