I Recreated Shazam's Algorithm with Go
NotShazam is a song recognition tool using Spotify and YouTube APIs, allowing users to identify and download songs. It requires Golang, FFmpeg, MongoDB, and NPM for installation and usage.
NotShazam is a song recognition project inspired by Shazam, utilizing Spotify and YouTube APIs for song identification and downloading. Key features include the ability to recognize songs from audio recordings, download songs from Spotify, and store fingerprints using MongoDB.
To install NotShazam, users need to have Golang, FFmpeg, MongoDB, and NPM. The installation process involves cloning the repository, installing backend dependencies with Golang, and client dependencies using NPM.
For usage, users can start the client app, serve the backend app, download songs using a Spotify URL, find matches for songs from WAV files, and delete fingerprints and songs. Example commands demonstrate how to download a song and find matches for a specific audio file.
The project also provides resources on how Shazam operates and audio fingerprinting techniques. It is authored by Chigozirim Igweamaka, who has additional projects available on GitHub and a LinkedIn profile for further connection. NotShazam is licensed under the MIT License, and more information can be found on its GitHub repository.
Related
Groqnotes: Generate structured notes from audio using Groq, Whisper, and Llama3
The GitHub project "Groqnotes" is a streamlit app utilizing Groq, Whisper, and Llama3 to create structured notes from audio content efficiently. It offers rapid transcription, markdown styling, and download options. Access online or set up locally.
Devzat – Chat over SSH, with some nice quality-of-life features
The "Devzat" GitHub project offers a unique SSH server redirecting users to a chat interface instead of a shell prompt. It supports various features like rooms, Markdown, syntax highlighting, direct messages, games, and emoji replacements. Additionally, it provides integration with Slack, Discord, and Twitter, along with a plugin API for customization.
Immich has introduced a paid licence model
"immich" is a self-hosted photo and video management project under AGPLv3. It offers features like multilingual support, documentation at immich.app/docs, a demo at demo.immich.app, and a Discord community. Repository activity and contributors are accessible.
StreamPot: Run FFmpeg as an API with fluent-FFmpeg compatibility, queues and S3
StreamPot simplifies media transformation tasks like video trimming and audio extraction, offering a client library for integration. Users can self-host or use a hosted version, with installation guides available.
An ordinary day with a Linux mobile device
The author shares their experience using a Linux mobile device with postmarketOS, focusing on non-communication tasks like web radio, news aggregation, and podcast management, highlighting its customization and reliability.
- Questions about the data source for song recognition and whether it relies on a pre-existing library.
- Concerns about potential patent issues related to Shazam's technology and the need for a name change.
- Feedback on the project's usability, including installation difficulties and missing documentation for MongoDB.
- Comparisons with other music recognition tools, such as SoundHound and Google's features, highlighting varying levels of accuracy.
- Suggestions for improvements, including adding support for WAV files and addressing vulnerabilities in the client code.
I think it is very interesting that so many of the early applications of computer technology have to do with audio: John Bardeen's music box; the first commercial application of the transistor, in hearing aids; the HP garage in Palo Alto, which originally built audio oscillators; the iPhone, which evolved from the iPod; the internet, built on copper laid to carry analog telephone calls; Bell Labs (ping!); the list goes on.
A friend of mine has the hypothesis that maybe human beings end up figuring out how to do kHz stuff before they go on to do MHz/GHz stuff. Not a perfect explanation but kind of attractive...
[1] https://github.com/cgzirim/not-shazam?tab=readme-ov-file#resources--card_file_box
- The instructions seem not to be the best to get it up and running (e.g. "cd not-shazam" and just a few lines later "cd not-shazam/client")
- MongoDB is needed, but information on how to hook it up and use it is absent (I would make the DB swappable and provide something less intrusive, like SQLite)
- If replacing MongoDB is not possible, I would provide a Dockerfile and a Docker Compose file to allow easy startup and testing.
- The client npm install reports 8 critical vulnerabilities. These might not actually matter, but they make me hesitant to continue testing
- You might not care about the patent or the copyright, but I would still change the name at the very least. GitHub itself is located in the US and will remove the project if it receives a DMCA notice.
- Last, and this might not be as important: I would add a way to add songs from WAV files. Not everything I'd want to test this with is on Spotify or YouTube.
I'm not saying this to discourage you or anything, I just think the project needs that little extra bit of polish. Minor things will cause people to discredit or ignore a project. If I get around to it I might make a PR for the project. I want to experiment with audio matching outside of the music space, and your project seems like it'll be the easiest to modify.
Edit: Formatting
Algorithms don't matter; only data matters.
https://github.com/cgzirim/not-shazam/blob/888070f3434acbc0a...