Python Audio Fingerprinter
A desktop application that identifies songs from audio snippets. Built from first principles using Python and Tkinter, featuring a custom flat-file database engine.
The Problem
For my major A-Level Computer Science project, I wanted to challenge myself to build something that felt like "magic." I was fascinated by how Shazam could match audio so quickly.
The catch? This was my very first program written in Python.
I didn't just want to build a standard CRUD app; I wanted to use this opportunity to teach myself the language by tackling a complex Digital Signal Processing (DSP) problem from first principles, without relying on pre-made APIs.
The Goal
My goal was to build a standalone desktop application that could:
- Analyze Audio: Take a raw audio file (WAV/MP3) and visualize its frequencies.
- Fingerprint: Convert that audio into a unique "hash" that persists even if the audio is distorted.
- Match: Search a database to find the song title and artist.
The Solution
Technical Decisions
I chose Python for its strong mathematical libraries (NumPy) and Tkinter for the GUI, as this allowed me to keep the application self-contained and runnable on any desktop without a complex web server setup.
The "Database" Challenge: To demonstrate a deep understanding of data structures, I opted not to use a pre-made SQL database. Instead, I architected a custom flat-file database system.
- I designed a schema using structured text files to store song metadata and fingerprint hashes.
- I wrote my own parser to efficiently read, write, and index these files, essentially building a mini-database engine from scratch.
Algorithms & Challenges
The core challenge was translating raw sound waves into something searchable.
- Spectrogram Generation: I used hand-coded Fast Fourier Transforms (FFT) to break the audio waves down into their component frequencies over time.
- Peak Finding: A raw spectrogram is too noisy. I implemented a "Constellation Map" algorithm that iterates over the spectrogram grid and identifies "peak" intensities—the loudest notes that survive noise.
- Combinatorial Hashing: I created unique hashes by pairing these peaks together (anchor points) and calculating the time difference between them. This ensured that even if the recording was sped up or slowed down slightly, the relative distance between peaks remained constant.
The Result
The final project was a fully functional GUI application. Users could upload a song, view the real-time spectrogram generation, and receive an accurate match from the library.
Lessons Learned
This project was a "trial by fire" as it was my first Python codebase. I had to learn the syntax, the standard library, and concepts like Fast Fourier Transforms (FFT) simultaneously.
It taught me how to read technical whitepapers and translate mathematical concepts into code—a skill that has stuck with me far longer than the specific Python syntax I learned at the time. It also gave me a foundational respect for database management systems by forcing me to handle file locks, parsing errors, and data retrieval manually in my custom text-file database.