August 1st, 2024

I Recreated Shazam's Algorithm with Go

NotShazam is a song recognition tool using Spotify and YouTube APIs, allowing users to identify and download songs. It requires Golang, FFmpeg, MongoDB, and NPM for installation and usage.

NotShazam is a song recognition project inspired by Shazam, utilizing Spotify and YouTube APIs for song identification and downloading. Key features include the ability to recognize songs from audio recordings, download songs from Spotify, and store fingerprints using MongoDB.
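As a rough illustration of what a stored fingerprint can look like, here is a minimal Go sketch of the peak-pairing hash commonly described for Shazam-style systems. It is not taken from the NotShazam source: the `Peak` and `Fingerprint` types, the `fanOut` value, and the 10/10/12-bit layout are all assumptions chosen for the example, and the spectrogram peak picking that would come first is left out.

```go
package main

import "fmt"

// Peak is a hypothetical spectrogram peak: a frequency bin at a time frame.
type Peak struct {
	Frame int // index of the spectrogram frame (time)
	Bin   int // index of the frequency bin
}

// Fingerprint couples a packed hash with the frame of its anchor peak.
type Fingerprint struct {
	Hash   uint32
	Anchor int
}

const (
	fanOut   = 3         // assumption: pair each anchor with up to 3 later peaks
	maxDelta = 1<<12 - 1 // largest frame gap that fits in the 12-bit delta field
)

// packHash packs two 10-bit frequency bins and a 12-bit frame delta into 32 bits.
func packHash(f1, f2, dt int) uint32 {
	return uint32(f1&0x3FF)<<22 | uint32(f2&0x3FF)<<12 | uint32(dt&0xFFF)
}

// Fingerprints pairs every peak with the next few peaks after it.
// The peaks slice must already be sorted by Frame.
func Fingerprints(peaks []Peak) []Fingerprint {
	var out []Fingerprint
	for i, anchor := range peaks {
		for j := i + 1; j < len(peaks) && j <= i+fanOut; j++ {
			target := peaks[j]
			dt := target.Frame - anchor.Frame
			if dt > maxDelta {
				break // later peaks are even further away
			}
			out = append(out, Fingerprint{
				Hash:   packHash(anchor.Bin, target.Bin, dt),
				Anchor: anchor.Frame,
			})
		}
	}
	return out
}

func main() {
	peaks := []Peak{{Frame: 0, Bin: 10}, {Frame: 2, Bin: 20}, {Frame: 5, Bin: 30}}
	for _, fp := range Fingerprints(peaks) {
		fmt.Printf("hash=%d anchor=%d\n", fp.Hash, fp.Anchor)
	}
}
```

In a setup like the one described, each hash would be stored in the database together with its anchor frame and a song identifier, so that a later lookup by hash returns every place the pair occurs.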

To install NotShazam, users need to have Golang, FFmpeg, MongoDB, and NPM. The installation process involves cloning the repository, installing backend dependencies with Golang, and client dependencies using NPM.

For usage, users can start the client app, serve the backend app, download songs using a Spotify URL, find matches for songs from WAV files, and delete fingerprints and songs. Example commands demonstrate how to download a song and find matches for a specific audio file.
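Finding a match for a recording is usually described as a vote over time offsets: a hash shared by the recording and a stored song only counts if many such hashes agree on the same offset between the two. The Go sketch below shows that idea with an in-memory map standing in for the database. The `Entry`, `Index`, `QueryHash`, and `BestMatch` names are hypothetical and do not come from the NotShazam code.

```go
package main

import "fmt"

// Entry is a hypothetical stored occurrence of a hash: which song, at which frame.
type Entry struct {
	SongID string
	Frame  int
}

// Index maps a fingerprint hash to every place it occurs in the library.
type Index map[uint32][]Entry

// QueryHash is a hash extracted from the recording, with its frame in the recording.
type QueryHash struct {
	Hash  uint32
	Frame int
}

// BestMatch counts votes per (song, offset) pair and returns the song whose
// single most popular offset collected the most votes, plus that vote count.
// It returns "" and 0 when nothing in the index matches.
func BestMatch(idx Index, query []QueryHash) (string, int) {
	type key struct {
		song   string
		offset int
	}
	votes := make(map[key]int)
	best, bestVotes := "", 0
	for _, q := range query {
		for _, e := range idx[q.Hash] {
			k := key{song: e.SongID, offset: e.Frame - q.Frame}
			votes[k]++
			if votes[k] > bestVotes {
				best, bestVotes = e.SongID, votes[k]
			}
		}
	}
	return best, bestVotes
}

func main() {
	idx := Index{
		1: {{SongID: "a", Frame: 100}, {SongID: "b", Frame: 7}},
		2: {{SongID: "a", Frame: 105}},
		3: {{SongID: "a", Frame: 110}, {SongID: "b", Frame: 50}},
	}
	query := []QueryHash{{Hash: 1, Frame: 0}, {Hash: 2, Frame: 5}, {Hash: 3, Frame: 10}}
	song, score := BestMatch(idx, query)
	fmt.Println(song, score)
}
```

Voting on offsets rather than on raw hash hits is what keeps chance hash collisions from dominating: song "b" shares two hashes with the query here, but at inconsistent offsets, so it never collects more than one vote for any single alignment.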

The project also provides resources on how Shazam operates and audio fingerprinting techniques. It is authored by Chigozirim Igweamaka, who has additional projects available on GitHub and a LinkedIn profile for further connection. NotShazam is licensed under the MIT License, and more information can be found on its GitHub repository.

AI: What people are saying
The comments on the NotShazam article reveal several key themes and concerns regarding the project.
  • Questions about the data source for song recognition and whether it relies on a pre-existing library.
  • Concerns about potential patent issues related to Shazam's technology and the need for a name change.
  • Feedback on the project's usability, including installation difficulties and missing documentation for MongoDB.
  • Comparisons with other music recognition tools, such as SoundHound and Google's features, highlighting varying levels of accuracy.
  • Suggestions for improvements, including adding support for WAV files and addressing vulnerabilities in the client code.
24 comments
By @pjs_ - 9 months
Shazam's technology came in part out of CCRMA, which is a very cool and special place on Stanford Campus, with deep connections to early computer history.

I think it is very interesting that so many of the early applications of computer technology have to do with audio. John Bardeen's music box, the first commercial application of the transistor in hearing aids, the HP garage in Palo Alto was originally building audio oscillators, the iPhone evolved from the iPod, the internet was built on copper made to carry analog telephone calls, Bell Labs (ping!), the list goes on.

A friend of mine has the hypothesis that maybe human beings end up figuring out how to do kHz stuff before they go on to do MHz/GHz stuff. Not a perfect explanation but kind of attractive...

By @halfmatthalfcat - 9 months
FYI - If this is a true reproduction of Shazam, it’s under patent by Apple through at least March 2025[1].

[1] https://patents.google.com/patent/US7627477

By @yazmeya - 9 months
I enjoyed this talk at the DAFx17 conference by Avery Wang, co-founder of Shazam. It goes a little into the theory behind the algorithm, and looks at some of the more practical issues (background noise, etc.): https://www.youtube.com/watch?v=YVTnj3OIhwI
By @vegabook - 9 months
Recently found Shazam is less accurate - somehow soundhound is giving me better results. On Shazam I'm getting a lot of results from Asian musical traditions which is great, if it wasn't the wrong song. Maybe they need to improve the algo if they've increased the range of music they will select from? Seems now there's a lot more hash table collision[1].

  [1] https://github.com/cgzirim/not-shazam?tab=readme-ov-file#resources--card_file_box
By @Cieric - 9 months
While the project does look nice to use and modify, I'm not sure I personally would have posted it yet.

- The instructions seem not to be the best to get it up and running (e.g. "cd not-shazam" and just a few lines later "cd not-shazam/client")

- MongoDB is needed, but information on how to hook it up and use it is absent (I would make the DB swappable and provide something less intrusive like SQLite)

- If replacing MongoDB is not possible, I would provide a dockerfile and a docker compose to allow easy startup and testing.

- The client npm install reports 8 critical vulnerabilities. These might not actually matter, but they make me hesitant to continue testing

- You might not care about the patent or the copyright, but I would still change the name at the very least. GitHub itself is located in the US and will remove the project if it receives a DMCA notice.

- Last, and this might not be as important: I would add a way to add songs from WAV files. Not everything I'd want to test this with is on Spotify or YouTube.

I'm not saying this to discourage you or anything, I just think the project needs that little extra bit of polish. Minor things will cause people to discredit or ignore a project. If I get around to it I might make a PR for the project. I want to experiment with audio matching outside of the music space, and your project seems like it'll be the easiest to modify.

Edit: Formatting

By @renierbotha - 9 months
Haven't crawled through the repo (yet) but quick question - where does the data that is being searched over come from? Are you loading a library or searching some large library acquired from somewhere else?
By @strongly-typed - 9 months
This is really cool. I’ve been itching to try building this exact kind of thing as part of my bucket list.
By @bravura - 9 months
It would be quite nice if there were a community-based way of sharing fingerprints.
By @KomoD - 9 months
If you insert Spotify songs, wouldn't it make more sense to output Spotify songs too?
By @blackeyeblitzar - 9 months
I’ve heard that the Google phones have a built-in music recognition feature that is the best implementation of this stuff. Anyone know what their approach was? Apart from that, I have always felt SoundHound was better than Shazam.
By @jokoon - 9 months
This is useless unless you have all the songs on earth

Algorithms don't matter, only data matters

By @ascorbic - 9 months
This is cool, but you urgently need to change the name.
By @DandyDev - 9 months
Isn’t the whole point of Shazam that you don’t know the song and want to find it? If you don’t know the song, how can you provide a Spotify link?
By @scoot - 9 months
Shazam is historically interesting, but Google's "hum to search" algorithm is far superior, and even that is nearly four years old (since production).
By @anticristi - 9 months
I wonder how long until someone will simply smoosh a billion songs into a "large song model" and make all signal processing knowledge irrelevant.
By @johnneville - 9 months
I'd love a way to use local files instead of Spotify/YouTube to create the set of fingerprints that is searched.
By @euroderf - 9 months
Run it as a daemon that displays every song in a UI notification?
By @hactually - 9 months
really decent and nicely done Golang! I'll pull and play with it tomorrow!
By @Philip-J-Fry - 9 months
I think you've leaked your developer key here... https://github.com/cgzirim/not-shazam/blob/main/spotify/yout...
By @wmichelin - 9 months
By @msie - 9 months
I enjoyed reading the Go source. As opposed to the time I had to read some Ruby code.