(Audio DSP) Comparing and masking audio

Started by Jeremy Smith December 1, 2003
Hi!

I've been working on a program to take, say, a drum sound, then find it 
in another file (say, a song with that drum sound in it) by comparing the 
frequency spectrum of the drum to various chunks of the song, so that it 
finds a close match of the spectrum charts. It compares everything in the 
frequency spectrum by using the Fast Fourier Transform.

Once it knows where there is a duplicate of the drum sound, it has to try 
and remove it, using the other file as a frequency mask. I tried simply 
FFT'ing the bass drum, then subtracting this from the FFT of the song (at 
the correct position), but,

    	*This only works when the files are exactly aligned. If I'm out by 
say 8 bytes, there gets more and more noise in the output file.

    	*On a test file, subtracting one song (song 1) from another (song 
2, with some drums added to it) leaves the drums. However, subtracting 
the drums from song 2 doesn't seem to have any effect!

First, I thought by moving into the frequency domain that it would be 
easier to remove a specific block of frequencies. Second, I can't figure 
out why the subtraction of drums from song 2 isn't working!

I won't post my code as it won't help but all I'm using is an FFT, a 
reverse-FFT (which works), and a graphical window to display my results. 
I am subtracting the files in an FFT buffer. I might be going wrong in 
assuming that each FFT buffer has the same spectrum in the same place.

My question for the newsgroup is, is what I'm attempting possible (it 
looks like it should be at least in a graphical sense), and if so, am I 
going about it the right way?

I'll be reading the newsgroup, but any personal emails are fine.

Cheers,

Jeremy.
Jeremy Smith <jeremyalansmith@softhome.net> wrote in
news:Xns9444CAFC7EFCjeremyalansmithsofth@62.253.162.103: 

> Hi!
>
> I've been working on a program to take, say, a drum sound, then find
> it in another file (say, a song with that drum sound in it) by
> comparing the frequency spectrum of the drum to various chunks of the
> song, so that it finds a close match of the spectrum charts. It
> compares everything in the frequency spectrum by using the Fast
> Fourier Transform.

My second question, about removing the drums from a (pre-prepared) song, is solved. (I found out that my code, which checked that one FFT element was higher than the other before subtracting, was wrong because FFT elements can contain negative numbers.)

I would still like to know the answer to my other question, "is this possible", before I spend a lot of time on it.

As a postscript to that posting, I looked on the Web for papers or articles on subtracting one range of frequencies from another, and found virtually nothing. Which is why I post here!

Best,

Jeremy Smith.
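
For anyone hitting the same bug, here is a minimal sketch of why a "which is higher" guard breaks the subtraction. It assumes the guard was applied per real/imaginary component, which is a guess at the original code, and it uses a naive DFT in place of a real FFT so it runs with the standard library alone. The signals and the names `subtract_bins` and `guarded_subtract` are invented for illustration.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform (stand-in for a real FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(bins):
    """Inverse of dft(); returns the real part of each output sample."""
    n = len(bins)
    return [(sum(bins[k] * cmath.exp(2j * cmath.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]

def subtract_bins(mix_bins, drum_bins):
    """Plain complex subtraction, bin by bin, with no comparison first."""
    return [m - d for m, d in zip(mix_bins, drum_bins)]

def guarded_subtract(mix_bins, drum_bins):
    """Hypothetical reconstruction of the bug: a component is subtracted
    only when the mix value is 'higher' than the drum value."""
    out = []
    for m, d in zip(mix_bins, drum_bins):
        re = m.real - d.real if m.real > d.real else m.real
        im = m.imag - d.imag if m.imag > d.imag else m.imag
        out.append(complex(re, im))
    return out

def max_error(a, b):
    """Largest absolute sample difference between two signals."""
    return max(abs(p - q) for p, q in zip(a, b))

n = 16
song = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]  # made-up "song"
drum = [math.exp(-t / 2.0) for t in range(n)]                 # made-up "drum"
mix = [s + d for s, d in zip(song, drum)]

good = idft(subtract_bins(dft(mix), dft(drum)))
bad = idft(guarded_subtract(dft(mix), dft(drum)))
print("plain subtraction error:  ", max_error(good, song))
print("guarded subtraction error:", max_error(bad, song))
```

Because the transform is linear, the plain subtraction recovers the song to within rounding error when the drum is known exactly and aligned. The guarded version skips every component where the song's own contribution is negative, so part of the drum stays in the output.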
On Mon, 01 Dec 2003 19:55:06 GMT, Jeremy Smith
<jeremyalansmith@softhome.net> wrote:
> Hi!
>
> I've been working on a program to take, say, a drum sound, then find it
> in another file (say, a song with that drum sound in it) by comparing the
> frequency spectrum of the drum to various chunks of the song, so that it
> finds a close match of the spectrum charts. It compares everything in the
> frequency spectrum by using the Fast Fourier Transform.
>
> Once it knows where there is a duplicate of the drum sound, it has to try
> and remove it, using the other file as a frequency mask. I tried simply
> FFT'ing the bass drum, then subtracting this from the FFT of the song (at
> the correct position), but,
>
> *This only works when the files are exactly aligned. If I'm out by
> say 8 bytes, there gets more and more noise in the output file.

That would follow, yes. An inverted drum sounds pretty much like a drum. A drum has noise as an important component of the sound. What do you mean by "out by 8 bytes?" I think of timing in the digital domain as being in terms of samples, as in, "After asserting SSync on the DDR, there are exactly 156 invalid samples." If you know there's a kick drum hit on sample N, you simply (sic) subtract the kick from sample N. Subtracting from sample N+1 gets you nowhere. Obviously, the decay of the drum is necessarily going to cause complications, but it may not be necessary to deal with them for the purposes of your test.
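
To make the "sample N versus sample N+1" point concrete, here is a rough time-domain sketch. It assumes the exact drum waveform is available, finds the hit with a brute-force cross-correlation (one possible way to locate it, not necessarily what the original program does), and then subtracts at the found offset and at one sample late. All signals and function names are made up for the example.

```python
import math

def best_offset(song, drum):
    """Return the sample offset where the drum correlates most strongly."""
    best, best_score = 0, float("-inf")
    for off in range(len(song) - len(drum) + 1):
        score = sum(song[off + i] * drum[i] for i in range(len(drum)))
        if score > best_score:
            best, best_score = off, score
    return best

def subtract_at(song, drum, offset):
    """Subtract the drum waveform from the song starting at `offset`."""
    out = list(song)
    for i, d in enumerate(drum):
        out[offset + i] -= d
    return out

def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)

# Synthetic test data: a quiet sine "bed" plus one decaying drum burst.
bed = [0.05 * math.sin(2 * math.pi * i / 50.0) for i in range(128)]
drum = [math.exp(-i / 8.0) * math.cos(math.pi * i / 2.0) for i in range(32)]
hit_at = 40
mix = list(bed)
for i, d in enumerate(drum):
    mix[hit_at + i] += d

found = best_offset(mix, drum)
aligned = subtract_at(mix, drum, found)       # subtract at sample N
shifted = subtract_at(mix, drum, found + 1)   # subtract at sample N+1

err_aligned = energy([a - b for a, b in zip(aligned, bed)])
err_shifted = energy([a - b for a, b in zip(shifted, bed)])
print("found offset:", found)
print("leftover energy, aligned:", err_aligned)
print("leftover energy, one sample late:", err_shifted)
print("energy of the drum itself:", energy(drum))
```

For this particular high-frequency burst, being one sample late leaves more energy behind than the drum had to begin with, which matches the "more and more noise" symptom. A real kick drum has more low-frequency content, so the penalty per sample of misalignment would differ.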
> *On a test file, subtracting one song (song 1) from another (song
> 2, with some drums added to it) leaves the drums. However, subtracting
> the drums from song 2 doesn't seem to have any effect!

No effect? As opposed to simply adding noise? Sounds like you're adding then. Double check your math.
> First, I thought by moving into the frequency domain that it would be
> easier to remove a specific block of frequencies. Second, I can't figure
> out why the subtraction of drums from song 2 isn't working!
>

Probably wouldn't. Impulsive sounds have a lot of harmonics. Listen to a live sound check sometime: it's surprising how broad a frequency range a kick drum covers.

What you're proposing is routinely done commercially by devices called "lead zappers." They typically work on stereo input by inverting one channel and adding, either digitally or with op amps. This eliminates anything panned dead center, typically the lead vocal, bass, and kick drum. A bypass LPF around 250 Hz or so preserves enough of the bass to give you a singing foundation, unless you're trying to sing karaoke-style Russian basso profondo roles. But if you're trying to do that, you most likely already know about the Music Minus One library. :-)
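
The invert-and-add trick can be sketched in a few lines. This is only an illustration under assumptions: the low-pass is a simple one-pole filter applied to the mono sum and mixed back in, the 44.1 kHz sample rate is assumed, and real units may route the bypass differently. The function names are invented for the example.

```python
import math

def remove_center(left, right):
    """Invert one channel and add: anything identical in both cancels."""
    return [l - r for l, r in zip(left, right)]

def one_pole_lowpass(x, cutoff_hz, sample_rate):
    """Basic one-pole low-pass filter (6 dB/octave roll-off)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y, out = 0.0, []
    for v in x:
        y = (1.0 - a) * v + a * y
        out.append(y)
    return out

def zap_center(left, right, cutoff_hz=250.0, sample_rate=44100):
    """Center cancellation with a low-passed mono bypass to keep the bass."""
    side = remove_center(left, right)
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    bass = one_pole_lowpass(mono, cutoff_hz, sample_rate)
    return [s + b for s, b in zip(side, bass)]

def tone(freq_hz, n=4410, sample_rate=44100):
    """Synthetic sine tone used as test material."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]

def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)

vocal = tone(2000.0)    # panned dead center
bass = tone(50.0)       # panned dead center
guitar = tone(700.0)    # left channel only

left = [v + g for v, g in zip(vocal, guitar)]
right = list(vocal)
side_only = remove_center(left, right)   # the vocal cancels, the guitar remains

vocal_kept = energy(zap_center(vocal, vocal)) / energy(vocal)
bass_kept = energy(zap_center(bass, bass)) / energy(bass)
print("fraction of centered 2 kHz energy kept:", vocal_kept)
print("fraction of centered 50 Hz energy kept:", bass_kept)
```

Note that this only works because the two channels carry sample-identical copies of the centered material. It does nothing for a drum that exists only as a separate mono sample, which is the harder problem being asked about.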

Jeremy Smith wrote:
>
> I would still like to know the answer to my other question, "is this
> possible", before I spend a lot of time on it.
>

It is not possible to completely delete a particular drum sound by subtracting spectra. However, you can achieve some attenuation. The reason is that each hit of a drum is somewhat different from every other hit of the same drum, for essentially random reasons.

Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com
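
A rough way to see the "attenuation, not deletion" point is to subtract a template hit from a slightly different hit and measure what is left. The sketch below is a toy model: the decaying-sine drum, the amount of variation between hits, and the function names are all invented, so the figure it prints says nothing about real drums. Since subtracting complex spectra is equivalent to subtracting the waveforms, the calculation is done in the time domain.

```python
import math

def drum_hit(n, decay, freq, gain=1.0):
    """Toy drum: exponentially decaying sine, `freq` in cycles per sample."""
    return [gain * math.exp(-i / decay) * math.sin(2.0 * math.pi * freq * i)
            for i in range(n)]

def attenuation_db(actual, template):
    """Energy of the actual hit relative to the residual after subtraction."""
    residual = [a - t for a, t in zip(actual, template)]
    e_actual = sum(a * a for a in actual)
    e_residual = sum(r * r for r in residual)
    if e_residual == 0.0:
        return math.inf   # perfect cancellation, only when the hits are identical
    return 10.0 * math.log10(e_actual / e_residual)

template = drum_hit(256, decay=40.0, freq=0.050)
# A second hit of the "same" drum: slightly softer, longer, and sharper in pitch.
actual = drum_hit(256, decay=44.0, freq=0.051, gain=0.95)

att = attenuation_db(actual, template)
print("attenuation with a mismatched template (dB):", att)
print("attenuation with an identical template (dB):",
      attenuation_db(template, template))
```

Only the identical case cancels completely. Any difference in level, decay, or pitch between the template and the hit in the song leaves a residual, so the best one can expect is a finite number of dB.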