Reply by Rune Allnor January 24, 20052005-01-24
robert bristow-johnson wrote:
> > in article 1106476416.358718.66140@z14g2000cwz.googlegroups.com,
Rune Allnor
> > at allnor@tele.ntnu.no wrote on 01/23/2005 05:33: > > > >> I don't know how the implementation you found works, but there is > >> basically only one way to implement the autocorrelation. > > in article KrQId.49525$re1.33287@fe2.columbus.rr.com, Mark Borgerding
at
> mark@borgerding.net wrote on 01/23/2005 11:40: > > > Not true. But your comments below indicate you already suspect
there is
> > another way. > > i can't even remember all of the variations of computing
autocorrelation.
> there is the "fast autocorrelation" by use of FFT,
magnitude-squaring, and
> iFFT. there are tricks that save you from having to compute the > autocorrelation for every lag value, if you don't need it for every
lag.
> there are ways to compute a normalized autocorrelation from the AMDF
(or
> more precisely, the ASDF). the normalized autocorrelation and half
the ASDF
> add to a constant or are supposed to, for a large window. what else?
if a
> rectangular window is used, there are ways to save partial sums from > previous autocorrelations to save computation of the present one. i
dunno.
> lotsa tricks.
Thanks for your comments, guys. Mark: I suspected the FFT could be used for these kinds of things, but I haven't actually seen a way to compute the time-domain auto correlation sequence by the FFT. Your comments indicate that I was right in that there are implementational aspects one needs to keep track of, and that the FFT is a relevant alternative at least for large amounts of data. The OP did not mention what numbers of samples are relevant for his application, but he did mention "speech codec", which makes me suspect he will use on the order of 80 - 100 samples per time sequence (10 ms frames sampled at 8 kHz). I'm not sure the faster FFT wil make up for the added overhead and book-keeping needed, in such short sequences. RBJ: I know you have been doing lot of fast implementations, while I'm more academically inclined. I used the word "implement" in my post which, of course, is wrong. I should have said "there is one way to *compute* the autocorrelation". As Mark commented, it's not strictly correct but it's not quite as bad as what I first wrote. "Computation" relates to the basic algorithm while "implementation" relates to how the computations actually are done. Rune
Reply by robert bristow-johnson January 23, 20052005-01-23
> in article 1106476416.358718.66140@z14g2000cwz.googlegroups.com, Rune Allnor > at allnor@tele.ntnu.no wrote on 01/23/2005 05:33: > >> I don't know how the implementation you found works, but there is >> basically only one way to implement the autocorrelation.
in article KrQId.49525$re1.33287@fe2.columbus.rr.com, Mark Borgerding at mark@borgerding.net wrote on 01/23/2005 11:40:
> Not true. But your comments below indicate you already suspect there is > another way.
i can't even remember all of the variations of computing autocorrelation. there is the "fast autocorrelation" by use of FFT, magnitude-squaring, and iFFT. there are tricks that save you from having to compute the autocorrelation for every lag value, if you don't need it for every lag. there are ways to compute a normalized autocorrelation from the AMDF (or more precisely, the ASDF). the normalized autocorrelation and half the ASDF add to a constant or are supposed to, for a large window. what else? if a rectangular window is used, there are ways to save partial sums from previous autocorrelations to save computation of the present one. i dunno. lotsa tricks. -- r b-j rbj@audioimagination.com "Imagination is more important than knowledge."
Reply by Mark Borgerding January 23, 20052005-01-23
Rune Allnor wrote:
> EC-AKD wrote: >> Is there any optimised method to write auto corelation of any signal
> I don't know how the implementation you found works, but there is > basically only one way to implement the autocorrelation.
Not true. But your comments below indicate you already suspect there is another way.
> If you have very long sequences you might be able to save some > time by using the FFT and do some of the computations in > frequency domain. This approach opens several new cans of worms, > as one have to deal with cyclic convolutions etc in time domain. > It is not obvious that such an approach will work. Even if it > does, the computational savings might not be relevant until the > sequences are several thousand samples long.
Autocorrelation in the frequency domain works well. The computational savings are considerable. The computational cost for autocorrelation of N lags (produces 2N+1 samples) is roughly half the cost of overlap add/save filtering for a FIR of length N. If memory serves, I implemented the algorithm in a manner that took a 3GHz Xeon only 1s to autocorrelate 50e6 samples, producing a 2049 point autocorrelation sequence. Richard Blahut's "Fast Algorithms for Digital Signal Processing" discusses a very clever implementation of autocorrelation when the required number of lags (delays) N is much less than the full-length autocorrelation sequence. The algorithm is sort of like an extension to overlap-add ... but not quite. For each block, the freq resp of a full buffer is mult'd by a half-zero buffer. There are a couple of tricks worth doing, but I can't do justice to them in a post. Read the book to find out how to a) postpone the inverse fft b) use the FFTs of two consecutive half-zero buffers to compute the FFT of their corresponding full buffer. Unfortunately, this book is out of print. Used copies can still be found. If time is precious, I can code it up for you. I've already done it once before, so I could do it faster than someone starting cold (and thus, maybe even cheaper). Email me with details if you are interested. My work email is ( I'm trying to keep it somewhat spam-free ) markb at 3db dash labs dot c0m or my home mark@borgerding.net -- already given up -- Mark Borgerding
Reply by January 23, 20052005-01-23
Dear Rune and Steve,

Thanks a lot for replies. Steve, actually the autocorelation in AMR
takes quite a few cycles, nearly 12% of the total cycles. The input
sequence is 160 samples long and the corelation is to be found for lag
ranging from 0 to about 140.
For each lag value, we have to fetch the same 160 samples of the input
signal again and again. Can we somehow prevent too much loading.
I am supposed to code this in assembly for ARM9E platform and hence was
wondering if any special instruction couild be used.
Thanks and Regards,
Anoop Deoras

Reply by Rune Allnor January 23, 20052005-01-23
EC-AKD wrote:
> Hi, > > Is there any optimised method to write auto corelation of any
signal
> in C. I was referring to AMR Speech codec's one of the auto
corelation
> function and found that it is too costly in terms of cycles it takes. > > Has anyone come up with any optimised method to deal with > autocorelation.
I don't know how the implementation you found works, but there is basically only one way to implement the autocorrelation. There are some variations where one use different scaling coefficients for the different coefficients in the autocorrelation sequence, but these are more expensive than the "naive" estimator for the autocorrelation. Another variation is known as "autocovariance", where one explicitly subtracts the mean from the sequence, and have to estimate that as well. Again, this is more expensive than the "naive" estimator. If you have very long sequences you might be able to save some time by using the FFT and do some of the computations in frequency domain. This approach opens several new cans of worms, as one have to deal with cyclic convolutions etc in time domain. It is not obvious that such an approach will work. Even if it does, the computational savings might not be relevant until the sequences are several thousand samples long. Rune
Reply by Steve Underwood January 21, 20052005-01-21
EC-AKD wrote:

>Hi, > > Is there any optimised method to write auto corelation of any signal >in C. I was referring to AMR Speech codec's one of the auto corelation >function and found that it is too costly in terms of cycles it takes. > >Has anyone come up with any optimised method to deal with >autocorelation. > >Any pointers would be appreciated. > >Thanks and Regards, >Anoop Deoras > >
The autocorrelation should be a tiny part of the total compute load in AMR. I can't imagine what you could do to make it otherwise. The autocorrelation code in the AMR reference code is pretty much how it is normally implemented. Regards, Steve
Reply by EC-AKD January 21, 20052005-01-21
Hi,

  Is there any optimised method to write auto corelation of any signal
in C. I was referring to AMR Speech codec's one of the auto corelation
function and found that it is too costly in terms of cycles it takes.

Has anyone come up with any optimised method to deal with
autocorelation.

Any pointers would be appreciated.

Thanks and Regards,
Anoop Deoras