Reply by Marc Feuerstein March 18, 20092009-03-18
Hi guys,
I am using Matlab for a solar data statistics project. The aim of the project is to make an automatic unsupervized segmentation of EIT images coming from the SoHO satellite.

One of the approaches involves, at some point of the algorithm, to calculate a Mahalanobis distance (generalized dissimilarity distance, popular in multivariate stats).

I received the code from a person that developed it. Here is the part of the code that hangs:

function dist = mahalan(X,Mean,Cov)
% Input:
% X [dim x num_data] Input data.
% Mean [dim x 1] Vector.
% Cov [dim x dim] Matrix.
[dim, num_data] = size( X );
XC = X - repmat(Mean,1,num_data);

% We work with diagonal matrices
D = diag(1./diag(Cov));
dist = diag(XC'*D*XC);
return;
Big problemo. This code crashes with an "out of memory" error when he calculates the following matrix expression: XC'*D*XC
The main matrix (X and also XC) are 128 * 128 matrices. They don't seem that large to me.

I'm looking for an elegant solution to this problem. I already tried the /3 GB switch on my Windows XP Pro 32 bit Matlab box, to no avail.

I'm thinking about two possible strategies, and I'd like to have your advise.

1. I know there is an "official" mahal function in Matlab. Could someone help me by giving me some insight on how to use the mahal function when given X (the main matrix: 128*128), the mean vector (dims: 128*1) and the Variance matrix (dims: 128*128). I would like to replace the manual calculation of the Mahalanobis distance with the built-in Matlab function.

2. I isolated the code crash in the calculation of the XC'*D*XC expression. Maybe it could be possible to make this calculation outside Matlab (for example R, or any scientific language that could handle matrix expressions) ; write the parameters in files, call the outside procedure, have it write the results back in a file and read the file. Basically, an "out-of-process" helper.

What would you guys do ? Do you think there is a better (and also importantly: quicker) way to solve this memory crash ? Maybe add some technical instructions in the code ?

Code snippets more than appreciated.

So, if any of the list could have some idea, it's most welcome !

Marc.

PS: Possibly that other parts will crash too, but for the moment, I'm blocked there.
PPS: I realize that some of you might be tempted to reply "go ask the guy that wrote the code". Tried that. Answer: "cannot help you on that topic".