Finding start position of one audio file in another
Started by ●March 4, 2007
Just a novice question: If I have two audio files and one is, for instance, 5 min
long and the other is a 20 s segment of the first one, how would I go about finding
the starting position of the second one within the first one? I'm looking
for sort of an IndexOf, InStr type of function, but for audio files. Any ideas
on where to start, or if there are any COM or .NET components that would be
helpful?
Reply by ●March 5, 2007
Evan-
> Just a novice question: If I have two audio files and one is, for
> instance, 5min and the other is a 20s segment of the
> first one, how would I go about finding the starting position of
> the second one within the first one? I'm looking for
> sort of an IndexOf, InStr type of function, but for audio files.
> Any ideas on where to start, or if there are any COM
> or .NET components that would be helpful?
I suggest researching cross-correlation, or a special case of it sometimes referred to as "matched filtering". In this
case, your 20 sec segment acts as the filter, and you look for the position in the long data where its correlation is
highest. However, 20 sec at a 44.1 kHz sampling rate is a lot of data (882,000 samples per channel), enough to make
real-time cross-correlation out of the question even with the fastest PCs, so you will need to make some tradeoffs.
Have fun :-)
-Jeff
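To make the cross-correlation idea above concrete, here is a minimal sketch in plain Python. It is not code from this thread: the function name `find_offset` and the synthetic noise data are assumptions for illustration, and the samples are assumed to be already decoded into lists of floats (mono, same sampling rate in both files). It slides the short segment over the long one and keeps the offset with the highest normalized correlation.

```python
import math
import random

def find_offset(long_sig, short_sig):
    """Return (best_offset, best_score) by direct normalized cross-correlation.

    Hypothetical helper: both arguments are sequences of float samples.
    """
    n, m = len(long_sig), len(short_sig)
    if m == 0 or m > n:
        raise ValueError("short signal must be non-empty and fit inside the long one")
    short_energy = math.sqrt(sum(x * x for x in short_sig))
    best_offset, best_score = 0, -1.0
    for off in range(n - m + 1):
        window = long_sig[off:off + m]
        dot = sum(a * b for a, b in zip(window, short_sig))
        win_energy = math.sqrt(sum(a * a for a in window))
        denom = short_energy * win_energy
        # Normalizing keeps the score in -1..1 regardless of loudness.
        score = dot / denom if denom > 0 else 0.0
        if score > best_score:
            best_offset, best_score = off, score
    return best_offset, best_score

# Synthetic demo: white noise stands in for decoded audio samples.
random.seed(0)
long_sig = [random.uniform(-1.0, 1.0) for _ in range(2000)]
short_sig = long_sig[700:900]          # the "segment" cut from the long file
offset, score = find_offset(long_sig, short_sig)
print(offset, round(score, 6))
```

This direct form costs one multiply-add per template sample per candidate offset, so a 20 sec template searched through 5 min of 44.1 kHz audio would need on the order of 10^13 multiply-adds. That is the tradeoff mentioned above; an FFT-based correlation (for example `scipy.signal.correlate` with `method="fft"`, if SciPy is available) or a shorter template are the usual ways to cut it down.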
Reply by ●March 7, 2007
On Sun, Mar 04, 2007 at 08:07:52PM -0600, Jeff Brower wrote:
> Suggest to research cross-correlation, or a subset sometimes referred to as "matched filtering". In this case, your
> 20 sec segment will act as a filter, to which you're trying to find the highest correlation inside the long data.
> However, 20 sec at 44.1 kHz sampling rate is a lot of data, enough to make real-time cross-correlation out of the
> question even with fastest PCs, so you will need to make some tradeoffs.
You can apply a text-search approach here. Split the shorter fragment into blocks of, say, 0.1 second. Correlate
the first block against the longer file; if the value is high, append the next block and correlate again; if not,
move on to the next position in the longer file. If the material is music, downsampling by 4 (or even 8) before
correlating may also reduce the computational power required.
--
Grzegorz Kraszewski
http://teleinfo.pb.bialystok.pl/~krashan
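The block-by-block search described above could be sketched roughly as follows, again in plain Python. The names `ncc`, `decimate` and `block_search`, the 0.9 threshold, and the averaging used for downsampling are all assumptions made for illustration, not something specified in this thread. At each position only the first block is correlated; further blocks are appended only while the score stays high.

```python
import math
import random

def ncc(a, b):
    """Normalized correlation of two equal-length sequences (range -1..1)."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / denom if denom > 0 else 0.0

def decimate(sig, factor):
    """Crude downsampling: average each complete group of `factor` samples."""
    return [sum(sig[i:i + factor]) / factor
            for i in range(0, len(sig) - factor + 1, factor)]

def block_search(long_sig, short_sig, block_len, threshold=0.9):
    """Return the first offset where the growing prefix of short_sig keeps
    correlating above `threshold`, or -1 if no position qualifies."""
    n, m = len(long_sig), len(short_sig)
    if m == 0 or block_len <= 0:
        raise ValueError("need a non-empty short signal and a positive block length")
    for off in range(n - m + 1):
        matched = True
        # Start with one block, then append one more block per step.
        for end in range(block_len, m + block_len, block_len):
            end = min(end, m)
            if ncc(long_sig[off:off + end], short_sig[:end]) < threshold:
                matched = False      # poor match: move on in the longer file
                break
        if matched:
            return off
    return -1

# Synthetic demo: white noise stands in for decoded audio samples.
random.seed(1)
factor = 4                              # downsample by 4, as suggested above
long_sig = [random.uniform(-1.0, 1.0) for _ in range(4000)]
short_sig = long_sig[1200:1600]         # segment start is a multiple of `factor`
coarse = block_search(decimate(long_sig, factor), decimate(short_sig, factor),
                      block_len=25)
offset = coarse * factor if coarse >= 0 else -1
print(coarse, offset)
```

Two caveats on this sketch. The demo deliberately places the segment on a multiple of the downsampling factor; in general the coarse result is only accurate to within `factor` samples and would need a short full-rate correlation around it to refine. Also, averaging is only a stand-in for a proper low-pass filter before decimation, so this works best on material dominated by low frequencies. For real 44.1 kHz audio downsampled by 4, a 0.1 second block would be about 1,100 samples.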