# Autocorrelation matrix of a long sequence

Started by April 16, 2009
```Hi

I am using MATLAB to generate close to 10^8 samples of a Rayleigh fading
channel using Jakes method. Once I generate these samples, I have to then
create a 20 x 20 autocorrelation matrix. I am using MATLAB to do this.
MATLAB works perfectly for 10^6 samples, but once the number of samples
increases slightly above 50 * 10^6, MATLAB throws an "OUT of Memory" error.
I tried some techniques given in www.mathworks.com to overcome this
problem, but they dont seem to help once the number of samples goes above
10^7.

My question is, does anyone know of a technique to compute the
autocorrelation matrix for such a long sequence quickly and accurately. I
have heard that the autocorrelation can be computed using FFT. I tried to
lay my hands on "Fast Algorithms for Digital Signal Processing" book, but
could not. Can somebody guide me to any resources explaining this method or
any other method which will help me out through the above said problem.

Thanks
Aloknath

```
```On Apr 16, 5:39&#2013266080;pm, "sasuke" <a.ssjg...@gmail.com> wrote:
> Hi
>
> I am using MATLAB to generate close to 10^8 samples of a Rayleigh fading
> channel using Jakes method. Once I generate these samples, I have to then
> create a 20 x 20 autocorrelation matrix. I am using MATLAB to do this.
> MATLAB works perfectly for 10^6 samples, but once the number of samples
> increases slightly above 50 * 10^6, MATLAB throws an "OUT of Memory" error.
> I tried some techniques given inwww.mathworks.comto overcome this
> problem, but they dont seem to help once the number of samples goes above
> 10^7.
>
> My question is, does anyone know of a technique to compute the
> autocorrelation matrix for such a long sequence quickly and accurately. I
> have heard that the autocorrelation can be computed using FFT. I tried to
> lay my hands on "Fast Algorithms for Digital Signal Processing" book, but
> could not. Can somebody guide me to any resources explaining this method or
> any other method which will help me out through the above said problem.
>
>
> Thanks
> Aloknath

Best to split the problem up into batches and then average the
autocorrelation matrices.

Hardy
```
```On 16 Apr, 07:39, "sasuke" <a.ssjg...@gmail.com> wrote:
> Hi
>
> I am using MATLAB to generate close to 10^8 samples of a Rayleigh fading
> channel using Jakes method. Once I generate these samples, I have to then
> create a 20 x 20 autocorrelation matrix. I am using MATLAB to do this.
> MATLAB works perfectly for 10^6 samples, but once the number of samples
> increases slightly above 50 * 10^6, MATLAB throws an "OUT of Memory" error.
> I tried some techniques given inwww.mathworks.comto overcome this
> problem, but they dont seem to help once the number of samples goes above
> 10^7.

For real-valued data, 1e7 samples take 8e7 bytes of memory.
A 32-bit OS can accomodate at most 2^31 ~ 2e9 bytes of memory.
With matlab running at the same kernel as the OS, a couple of
dozens of other programs, and a little bit of bad luck with
memory fragmentation, you quickly approach the memory limit
with the amounts of data you have. As others already suggested,
dividing the data up in chunks goes a long way to solve the
problem.

> My question is, does anyone know of a technique to compute the
> autocorrelation matrix for such a long sequence quickly and accurately. I
> have heard that the autocorrelation can be computed using FFT. I tried to
> lay my hands on "Fast Algorithms for Digital Signal Processing" book, but
> could not. Can somebody guide me to any resources explaining this method or
> any other method which will help me out through the above said problem.

The general procedure to compute the 'usual'
biased estimate for the autocorrelation goes
something along the lines of:

1)  Allocate a 20-pt buffer for the end result,
initialized with all zeros
2)  Split the sequence up in manageable N-length frames.
What is 'manageable' is determined by balancing
speed against memory usage. Fewer frames is faster,
but requires more memory.
3)  Extract the next frame from the sequence and zero-pad
to at least 2N-1 points
4)  Compute the DFT of the zero-padded frame
5)  Compute the periodogram as the squared magnitude
of the spectrum coefficients
6)  Compute the IDFT of the periodogram
7)  *Add* the 20 first coefficients to the contents
of the buffer allocated in step 1)
8)  Handle scaling coefficients
9)  Repeat from 3) until all frames have been processed
10) If desired, rearrange the 20 correlation values
into a 20x20 Toeplitz matrix

Rune
```
```
sasuke wrote:
> Hi
>
> I am using MATLAB to generate close to 10^8 samples

First mistake. Don't use MATLAB for serious computations. A plain C++
program runs ~10 times faster.

> channel using Jakes method. Once I generate these samples, I have to then
> create a 20 x 20 autocorrelation matrix.

Second mistake. Trying to solve the problem by brute force, instead of
thinking of the elegant approach.

> I am using MATLAB to do this.

Don't use MATLAB. It is a toy for stupidents and studiots.

> MATLAB works perfectly for 10^6 samples, but once the number of samples
> increases slightly above 50 * 10^6, MATLAB throws an "OUT of Memory" error.
> I tried some techniques given in www.mathworks.com to overcome this
> problem, but they dont seem to help once the number of samples goes above
> 10^7.

Yes, of course. What else do you expect.

> My question is, does anyone know of a technique to compute the
> autocorrelation matrix for such a long sequence quickly and accurately. I
> have heard that the autocorrelation can be computed using FFT. I tried to
> lay my hands on "Fast Algorithms for Digital Signal Processing" book, but
> could not. Can somebody guide me to any resources explaining this method or
> any other method which will help me out through the above said problem.

Well. You apparently lack the basics, so there is no method which can help.

I let you know.

>
> Thanks
> Aloknath
>

VLV
```
```On 16 Apr., 22:52, Vladimir Vassilevsky <antispam_bo...@hotmail.com>
wrote:
> sasuke wrote:
> > Hi
>
> > I am using MATLAB to generate close to 10^8 samples
>
> First mistake. Don't use MATLAB for serious computations. A plain C++
> program runs ~10 times faster.

This statement is far too general. It's not easy to compare the
performances on this level. The programming style is different. If you
want your Matlab code to run fast you usually exploit all the built-in
functions (which map to ATLAS, FFTW, ....) and operate on a matrix/
vector level as opposed to touching every scalar by hand. But this may
also lead to higher memory requirements than what would be necessary
in case of C or C++.

> > I am using MATLAB to do this.
>
> Don't use MATLAB. It is a toy for stupidents and studiots.

It may be of no great use to you. That doesn't mean it's not useful.
You're basically insulting every Matlab user for no good reason.
Matlab/Octave is nice for quickly testing things / prototyping /
plotting 2D- and 3D- graphs. You can quickly import data and
manipulate it. It has a nice collection of tools for all sorts of
things and an interactive shell ... etc etc etc.

C++ is nice. But it certainly isn't the best language for every job
and you know it.

SG
```
```On 16 Apr, 23:27, SG <s.gesem...@gmail.com> wrote:
> On 16 Apr., 22:52, Vladimir Vassilevsky <antispam_bo...@hotmail.com>
> wrote:
>
> > sasuke wrote:
> > > Hi
>
> > > I am using MATLAB to generate close to 10^8 samples
>
> > First mistake. Don't use MATLAB for serious computations. A plain C++
> > program runs ~10 times faster.
>
> This statement is far too general.

Unless you have stripped the operations down to the core
BLAS/LAPACK level, Vladimir is right. There is the interpreter
which take a lot more time than people realize. More below.

> It's not easy to compare the
> performances on this level. The programming style is different. If you
> want your Matlab code to run fast you usually exploit all the built-in
> functions (which map to ATLAS, FFTW, ....) and operate on a matrix/
> vector level as opposed to touching every scalar by hand.

And you have to throw every good programming practice
down the drains. One aspect of matlab is that variables
are passed between functions by copy, not by reference,
which means that local copies of variables are found
all over the place.

This first of all consumes memory but also consumes lots
of time copying variables back and forth, where one would
use pointers or references in C or C++.

The solution is to avoid using functions, thus forefeiting
testing habits, code reusability and source code managment.

> But this may
> also lead to higher memory requirements than what would be necessary
> in case of C or C++.

Nope. It *might* not spend more memory. It *certainly* makes
the program blow up and grab all the memory it can get.

Rune
```
```> 4) &#2013266080;Compute the DFT of the zero-padded frame
> 5) &#2013266080;Compute the periodogram as the squared magnitude
> &#2013266080; &#2013266080; of the spectrum coefficients
> 6) &#2013266080;Compute the IDFT of the periodogram

>
> Rune

A little bit off-topic:

Since multiplication in frequency domain is the same as (circular)
convolving in the time domain, shouldn't one of the segments be time-
reversed before zero-padding and DFTing? in that way the auto-
correlation is re-written as a convolution, and then it is ok to
compute it in the freq domain. Am I right?
```
```On 17 Apr., 09:45, Rune Allnor <all...@tele.ntnu.no> wrote:
> On 16 Apr, 23:27, SG <s.gesem...@gmail.com> wrote:
> > This statement is far too general.
>
> Unless you have stripped the operations down to the core
> BLAS/LAPACK level, Vladimir is right.

That's what I've been saying. Matlab comes with lots of built-in
functions that operate on vectors and matrices (find, min, max,
mldivide, +, -, *, .....) which is why I said "different programming
style" and "not easy to compare". No point in arguming here.

> There is the interpreter
> which take a lot more time than people realize.
> [...]
> And you have to throw every good programming practice
> down the drains. One aspect of matlab is that variables
> are passed between functions by copy, not by reference,
> which means that local copies of variables are found
> all over the place.

Pass-by-value semantics doesn't exclude the possibility of "copy-on-
write" optimizations. It's perfectly possible to write

x = somefunc(x);

and let somefunc modify its local "copy" without any actual copying
being involved. Now, whether Matlab does this behind the scenes is
beyond me.

Anyhow, I wasn't saying Matlab is better than anything else in every
situation! Obviously it isn't. And it might not be the best tool for
the original poster except for testing some quickly-written prototype
algorithms on "toy problems". But Vladimir's response was too general
and too one-sided which deserved some correcting, IMHO.

Cheers!
SG
```
```On 17 Apr, 09:54, itakatz <itak...@gmail.com> wrote:
> > 4) &#2013266080;Compute the DFT of the zero-padded frame
> > 5) &#2013266080;Compute the periodogram as the squared magnitude
> > &#2013266080; &#2013266080; of the spectrum coefficients
> > 6) &#2013266080;Compute the IDFT of the periodogram
>
> > Rune
>
> A little bit off-topic:
>
> Since multiplication in frequency domain is the same as (circular)
> convolving in the time domain,

Convolution by DFT is circular, which is why you need
to zero-pad before computing the DFT. The convolution
of two N-length sequences gives as result a sequence
with at most 2N-1 non-zero elements. Zero-padding to
at least 2N-1 total length allocates the space needed
to avoid wrap-around effects. Depending on what
implementation of the FFT you have available, you might
zero-pad further, to the next power of 2.

> shouldn't one of the segments be time-
> reversed before zero-padding and DFTing?

Time-reversing in time domain is not sufficient to
handle wrap-around effects from circular convolution,
since you don't change the length of the input sequence.

With pen and paper your trick might work, but in the
computer you need to allocate memory and find a map
from the 'true' index space k = -N,...,N to the RAM

Essentially, you need to do some book-keeping to find
out if the rxx coefficient turns out in the index 0
position of the buffer, or in the index N position.

> in that way the auto-
> correlation is re-written as a convolution, and then it is ok to
> compute it in the freq domain. Am I right?

In time domain, reversing the direction of a sequence x[n]
is written as x[-n]. It's a standard excercise in most DSP
books to show that if

x[n] <-> X[k]

is a DFT pair, then

x[-n] <-> X'[k]

where X' is the complex conjugate of X, is also a DFT pair.
Well, this is correct at least when x[n] is real-valued,
it might not be for complex-valued x[n].

Anyway, convolving x[n] by x[-n] in time domain, which
is a common estimator for the autocorrelation function
(save some scaling coefficients), can alternatively be
computed in frequency domain as a product of the DFTs:

Sxx[k] = X[k]X'[k] = |X[k]|^2.

Hence my algorithm works, once the relevant scaling
coefficients have been accounted for.

Rune
```
```On 17 Apr, 10:21, SG <s.gesem...@gmail.com> wrote:
> On 17 Apr., 09:45, Rune Allnor <all...@tele.ntnu.no> wrote:
>
> > On 16 Apr, 23:27, SG <s.gesem...@gmail.com> wrote:
> > > This statement is far too general.
>
> > Unless you have stripped the operations down to the core
> > BLAS/LAPACK level, Vladimir is right.
>
> That's what I've been saying. Matlab comes with lots of built-in
> functions that operate on vectors and matrices (find, min, max,
> mldivide, +, -, *, .....) which is why I said "different programming
> style" and "not easy to compare". No point in arguming here.

At least C++ (I don't know 'bout C) have a few libraries
that provide data structures like trees, lists, queues etc,
that allow for far more efficient algorithms than matlab can,
which is locked into the 'straightjacket' posed by the ND
array data structure.

Here is a teaser I wrote some time ago on comp.soft-sys.matlab,

how to make matlab run fast. As you can see from the
quoted profiler report, my 'naive' matlab alternative
beat the 'elegant' matlab alternative by almost a factor 2.

Then I did the same job with C++, the only difference
being that I exploited the available data structures.
This naive C++ code, with the algorith tailored to the
problem but not optimized by me in any other way[*], ran
almost 10x faster than the fastest matlab alternative.

As I said, unless you work with core BLAS/LAPACK types
of problems, leave matlab out if run-time is a concern.

> > There is the interpreter
> > which take a lot more time than people realize.
> > [...]
> > And you have to throw every good programming practice
> > down the drains. One aspect of matlab is that variables
> > are passed between functions by copy, not by reference,
> > which means that local copies of variables are found
> > all over the place.
>
> Pass-by-value semantics doesn't exclude the possibility of "copy-on-
> write" optimizations. It's perfectly possible to write
>
> &#2013266080; &#2013266080;x = somefunc(x);
>
> and let somefunc modify its local "copy" without any actual copying
> being involved. Now, whether Matlab does this behind the scenes is
> beyond me.

I am pretty sure it does. While not as bad as worst-case, it
still represents a significant overhead compared to C or C++.

> Anyhow, I wasn't saying Matlab is better than anything else in every
> situation! Obviously it isn't. And it might not be the best tool for
> the original poster except for testing some quickly-written prototype
> algorithms on "toy problems".

That's exactly what matlab was designed to do. Despite
appearances, that's all matlab is useful for.

> But Vladimir's response was too general
> and too one-sided which deserved some correcting, IMHO.

1) "Don't use MATLAB for serious computations."
2) "A plain C++ program runs ~10 times faster."

The first I fully support, since matlab is a beast
both what 'serious' programming/development and
run-time resource demands are concerned. If you try
and make anything beyond prototypes or toy programs,
you will have to work your butt off in return for
less than mediocre performance. As demonstrated in
the post quoted above.

Vladimir's second statementis factually correct for
just about anything but core BLAS/LAPACK applications.

Rune

[*] I compiled the C++ code as a MEX routine to be
called from within matlab. Whatever optimizing
compiler directives were invoked from the MEX
compilation setupwe, were standardized and not
modified by me.
```