YouTube Video to Tabs and Lyrics
Fish is an AI-powered multimodal model for music information retrieval, generating musical elements like chords and lyrics, featuring advanced audio processing and a specialized architecture for enhanced functionality.
The GitHub repository for the project named Fish presents an AI-powered multimodal model designed for music information retrieval. Its primary function is to generate various musical elements such as chords, beats, lyrics, melody, and tabs for any song using a transformer-based hybrid model. Key features include chord detection, which identifies different chord types and song keys; beat detection for tracking tempo; pitch tracking for vocal melodies; and music structure analysis to label song segments. Additionally, it employs automatic speech recognition (ASR) for lyrics recognition, aligning them with audio, and generates playable sheet music with editing capabilities. The project incorporates advanced audio processing features like source separation, speed adjustment, and pitch shifting. The model architecture consists of several specialized models, including U-Net, Pitch-Net, Beat-Net, Chord-Net, and Segment-Net, with a core model called CombineNet that utilizes an encoder-decoder structure for audio processing. The repository also showcases training results for a speech sample, demonstrating the model's capabilities. For further exploration, users can visit the project's website or access the code in the repository.
- Fish is an AI model for music information retrieval, generating chords, beats, lyrics, and more.
- Key features include chord detection, beat tracking, pitch tracking, and music structure analysis.
- The project uses various specialized models for audio processing, culminating in the CombineNet architecture.
- It offers functionalities like lyrics recognition and the generation of editable sheet music.
- A demo is available showcasing the model's capabilities with training results.
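To make the idea of chord detection concrete, here is a minimal sketch of classic chroma template matching. This is not how Fish works (the summary describes a transformer-based hybrid model); it is a much simpler textbook technique swapped in for illustration. The names `chord_templates`, `guess_chord`, and the sample `frame` are hypothetical, and the input is assumed to be a 12-bin chroma vector (energy per pitch class, starting at C) computed elsewhere.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Interval patterns (semitones above the root) for two basic chord qualities.
CHORD_INTERVALS = {"maj": (0, 4, 7), "min": (0, 3, 7)}


def chord_templates():
    """Build a 12-bin binary template for every major and minor triad."""
    templates = {}
    for root in range(12):
        for quality, intervals in CHORD_INTERVALS.items():
            bins = [0.0] * 12
            for step in intervals:
                bins[(root + step) % 12] = 1.0
            suffix = "" if quality == "maj" else "m"
            templates[NOTE_NAMES[root] + suffix] = bins
    return templates


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def guess_chord(chroma):
    """Return the triad whose template is most similar to the chroma vector."""
    templates = chord_templates()
    return max(templates, key=lambda name: cosine(chroma, templates[name]))


# Hypothetical chroma frame with most of its energy on C, E and A.
frame = [0.8, 0.0, 0.1, 0.0, 0.9, 0.0, 0.0, 0.1, 0.0, 1.0, 0.0, 0.0]
print(guess_chord(frame))
```

Because the templates only cover major and minor triads, a sketch like this cannot name seventh or half-diminished chords at all, which is one reason learned models are used for richer chord vocabularies.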
Related
YouTube in talks with record labels over AI music deal
YouTube is in talks with major record labels to license AI tools replicating artists' music. Some artists are wary of devaluation concerns. Negotiations aim to involve select artists for AI music generation.
Alphatab.net
The website promotes alphaTab, a versatile tool for creating music notation applications on web, desktop, and mobile. It offers responsive display, audio playback synced with notation, and a customizable API. Users can access detailed music sheet data through alphaTab APIs for tailored UI components.
Show HN: AI assisted image editing with audio instructions
The GitHub repository hosts "AAIELA: AI Assisted Image Editing with Language and Audio," a project enabling image editing via audio commands and AI models. It integrates various technologies for object detection, language processing, and image inpainting. Future plans involve model enhancements and feature integrations.
Awesome AI Tools – A Curated List of Artificial Intelligence Top Tools
The "Awesome AI Tools" GitHub repository offers a curated collection of AI tools across various categories, featuring notable models like ChatGPT and DALL·E 2, and encourages user contributions.
A C/C++ library for audio and music analysis
audioFlux is a deep learning library for audio analysis, featuring new pitch algorithms in version 0.1.8. It supports Python 3.6+, with modules for transformations, features, and music information retrieval.
- https://github.com/DoMusic/Hybrid-Net
- https://github.com/TuneMusic/NiceMusic
- https://github.com/JoinMusic/fish
- https://github.com/Famuse/CombineNet
- https://github.com/AIAudioLab/AITabs
- https://github.com/AIMusicLab/MicroMuisc
I'm pretty sure there are more, but I'll stop there. Especially suspicious considering all the usernames.
Here's a post from yesterday on Reddit:
- https://www.reddit.com/r/coolaitools/comments/1ervthn/found_...
I'm guessing the general process here is:
- Push novelty (but unusable to most people) code to a new GitHub repo
- Submit that code to Reddit/Hacker News
- People see it and are impressed by the novelty code, despite not running it due to missing the models themselves, etc. They upvote and subscribe ($$$) to actually try it.
- Repeat
I understand the desire to promote one's new service, and the product seems like it could be interesting, but this is not the way to get the word out. Reputation matters.
Edit:
Check out the user deeplover's post/comment history. One submission with the MicroMusic (see above) repo, and one comment, see below.
Also, the post by user liwei0517 is almost exactly like BigOrange688 on Reddit. See: https://www.reddit.com/r/MachineLearning/comments/1es0deh/co...
High-resolution guitar transcription via domain adaptation
Demo Videos: https://xavriley.github.io/HighResolutionGuitarTranscription... Paper: https://arxiv.org/abs/2402.15258
> We propose the use of a high-resolution piano transcription model to train a new guitar transcription model. The resulting model obtains state-of-the-art transcription results on GuitarSet in a zero-shot context, improving on previously published methods.
edit: removed a reference to a competing product
A couple of weeks ago I asked one of the AIs to teach me this song. It responded that it can't teach specifics or tell me strumming patterns because it would be a copyright violation. I told it that if I went to a human teacher, they would have no problem teaching me how to play along to the song. That was a good enough argument to get the AI to change its mind (whatever that means) and produce a chord chart and strumming pattern (which was wrong).
E.g. Bb instead of Ebmaj7
Bb7 instead of Bm7b5
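Each of the two substitutions above shares notes with the chord it replaced, which a quick comparison makes visible. The following is a small sketch, assuming standard chord spellings entered by hand; the `CHORDS` table and the `shared_notes` helper are hypothetical names used only for illustration.

```python
# Pitch-class numbers (C = 0) used only to sort the shared notes.
NOTE_TO_PC = {
    "C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
    "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11,
}

# Standard spellings of the four chords mentioned above.
CHORDS = {
    "Bb": ["Bb", "D", "F"],
    "Ebmaj7": ["Eb", "G", "Bb", "D"],
    "Bb7": ["Bb", "D", "F", "Ab"],
    "Bm7b5": ["B", "D", "F", "A"],
}


def shared_notes(first, second):
    """Return the notes two chords have in common, ordered by pitch class."""
    common = set(CHORDS[first]) & set(CHORDS[second])
    return sorted(common, key=NOTE_TO_PC.get)


print(shared_notes("Bb", "Ebmaj7"))
print(shared_notes("Bb7", "Bm7b5"))
```

In both pairs two notes overlap while the root differs, so the detected chord is related to the correct one but still wrong for anyone playing along.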
I only found out about https://www.soundslice.com recently. I'm not sure how it managed to evade me for years of searching for music resources on the internet... but for anyone interested in sheet music, I can't recommend it enough.
The design of the whole platform is so minimal and beautiful, and having notation synchronized with YouTube is simply brilliant. Built by one of the co-creators of Django, too!
I tried making some AI covers, too, which was kind of fun. For one of my tries, I submitted Nirvana's Smells Like Teen Spirit for AI voice generation to make a cover of Carola's song "Främling" (Sweden's ESC song 1983 which came in third, a very non-Nirvana-type of song). At first, I thought the voice sounded pretty much like a Swedish Kurt Cobain, then the more I listened to it, all I could hear was the Swedish artist Nordman, and it dawned upon me that they have similar voice styles. I tried lowering the pitch, and then I was certain I recognised the voice from another artist but couldn't place it. So I'm leaning towards the AI voices being trained on some not-so-unfamiliar artists rather than there being some cool AI magic happening, though I'm out of my depth here.
Anyone know of a thing that does?
but was somewhat disappointed. The site is cool, can give you a headstart when transposing a song to practice, but the chords were quite off in a few examples I tried.
The mid-range is eerily close to Taylor's voice - there were moments when I almost thought it was really her singing. But if you listen closely, you can still catch a tiny hint of that robotic sound.
A Bar Song : (Taylor Swift Cover) https://lamucal.com/ai-cover/song-share/66be244cbc3fdb000e76... Original: A Bar Song (Tipsy) https://www.youtube.com/watch?v=t7bQwwqW-Hc