Hi DSP experts!

I have a question that I hope some of you can help me with. ;)

What I want to do is: given two audio clips, calculate a score for how similar they are (that is, how similar they sound).

I assume I have to apply the Fourier transform to the two clips and somehow analyze the two resulting frames (for example by comparing peaks) to see how similar they are. How should I do this?

I will be eternally grateful for any pointers (ideas, explanations, literature, websites, etc.)!

Let me just note that I'm very inexperienced with digital signal processing. I don't know much about it or about DSP terminology. All I've had was a CS course, "Introduction to Digital Audio", where I wrote an FFT algorithm, some filters, and a time-stretch.

-----------------------------------------------------

Now that I've asked the question, maybe I should briefly explain what I'm going to use it for, to give you some idea of what I'm after. This may be boring to you. In that case, just skip the rest of this post :).

I'm trying to make a program that takes a normal audio clip as input (a WAV file) and then approximates the input sound with simple waveforms (triangle, sawtooth, pulse, noise).

Why? I want the approximation of the input sound to be playable on an old computer, which can play 3 voices of these simple waveforms but is incapable of playing digitized sounds.

I use the FFT with windowing, and thus only approximate small parts of the input sound with the 3 waveforms at a time, not the entire sound. (It would of course be impossible to approximate anything but the simplest input sound with 3 simple waveforms if the waveforms didn't vary over time.)

There are some parameters of the 3 waveforms that I can vary (frequency, volume, etc.). For the frame F of each burst B of the input sound, I run through all values of these parameters to find which set of parameters best approximates the input sound.
For each of these sets of parameters, I generate the sound samples for the 3 waveforms and do the FFT on them to get the frame F'. So now I have the two frames F and F' (one for the input sound and one for the generated sound).

What I want to do is compare these two frames and get a score for how similar they are, so that I can find the set of parameters that best approximates the burst B.

I have made a simple comparator for the two frames F and F', just to test that the rest of the code works. It simply returns a score for how well the peaks in F match the peaks in F' (ignoring everything but the peaks), and the way it compares the peaks is a bit too naive and simple. This method works a bit (it can often follow tones), but as I said, it's just naive, sloppy work to see whether the rest worked.

Before I put too much work into improving it, it might be best to find out what other people have done. Is this the best approach to comparing two frames? If so, could you point me to some literature (websites, etc.) about it? If not, how should I compare the two frames instead?
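
To give you something concrete to poke at, here is a minimal sketch of the setup in Python. It is not my actual code. The sample rate, frame size and function names are made up for illustration, it searches over a single voice and a single parameter (frequency) instead of 3 voices, and the comparator is a log-spectral distance over the whole frame, used only as a stand-in for my peak matcher. I am not claiming it is the right measure; that is exactly what I'm asking about.

```python
import cmath
import math

SAMPLE_RATE = 8000   # Hz (hypothetical value for this sketch)
FRAME_SIZE = 256     # samples per burst B (hypothetical value)


def sawtooth(freq, n, rate=SAMPLE_RATE):
    """n samples of a naive (not band-limited) sawtooth in [-1, 1)."""
    return [2.0 * ((freq * i / rate) % 1.0) - 1.0 for i in range(n)]


def pulse(freq, n, rate=SAMPLE_RATE, duty=0.5):
    """n samples of a naive pulse wave with the given duty cycle."""
    return [1.0 if (freq * i / rate) % 1.0 < duty else -1.0 for i in range(n)]


def magnitude_frame(samples):
    """Hann-window a burst and return DFT magnitudes for bins 0..N/2.

    A plain O(N^2) DFT keeps the sketch dependency-free; a real FFT
    would produce the same magnitudes, only faster.
    """
    n = len(samples)
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / n))
                for i, s in enumerate(samples)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def log_spectral_distance(frame_a, frame_b, floor=1e-6):
    """RMS difference between two magnitude frames in dB (0.0 = identical).

    The floor avoids log(0) for empty bins; its value is an assumption.
    """
    total = 0.0
    for a, b in zip(frame_a, frame_b):
        diff = 20.0 * math.log10(max(a, floor)) - 20.0 * math.log10(max(b, floor))
        total += diff * diff
    return math.sqrt(total / len(frame_a))


def best_frequency(target_frame, candidates, make_wave=sawtooth):
    """Brute-force search: the candidate frequency whose frame F' is closest to F."""
    return min(
        candidates,
        key=lambda f: log_spectral_distance(
            target_frame, magnitude_frame(make_wave(f, FRAME_SIZE))),
    )
```

For example, with `target = sawtooth(440, FRAME_SIZE)`, calling `best_frequency(magnitude_frame(target), [220, 330, 440, 550, 660])` picks 440, since that candidate reproduces the target exactly and its distance is 0. Real input obviously won't match this cleanly, which is why I'm unsure whether a whole-frame distance like this or a peak-based score is the better way to go.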
How to compare two audio clips for similarity?
Started by ●July 17, 2007