Generating Partially Correlated Random Variables
Introduction
It is often useful to be able to generate two or more signals with specific cross-correlations. Or, more generally, we would like to specify an $\left(N \times N\right)$ covariance matrix, $\mathbf{R}_{xx}$, and generate $N$ signals which will produce this covariance matrix. There are many applications in which this technique is useful. I discovered a version of this method while analysing radar systems, but the same approach can be used in a very wide range of disciplines.
In case the lengthy discussion below looks daunting, I want to point out that the real magic can be done in just two lines of MATLAB code (which I will explain below):
L = chol(Rxx, 'lower');  % Compute L, the Cholesky decomposition of Rxx
X = L*randn(N, Nsamps);  % Left-multiply N uncorrelated signals by L
If you prefer another programming language, then free source code for the "chol" (Cholesky Decomposition) is provided in dozens of languages at this Rosetta Code page. However, please note that many of those implementations require a very minor modification if you want to work with complex numbers. Using the C code example as pseudocode, the following modification is required:
// Replace this operation...
s += L[i * n + k] * L[j * n + k];
// ... with this operation
s += L[i * n + k] * conj(L[j * n + k]);
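If you port or modify an implementation, a quick sanity check is to verify that the factor reconstructs the original matrix. Here is a minimal check of that kind in MATLAB (the example matrix is just an arbitrary Hermitian positive definite matrix of my choosing):

Rxx_test = [2.0, 0.5-0.3j; 0.5+0.3j, 1.0]; % Example Hermitian positive definite matrix
L_test = chol(Rxx_test, 'lower');          % Factorise: Rxx_test = L_test*L_test'
norm(Rxx_test - L_test*L_test')            % Should be around machine precision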
Background: The Simple 2-Signal Case
It is easiest to start by discussing the 2-signal case, before generalising to $N$ signals. However, you could skip this section if you don't need the extra background information.
Let us denote the signals we are designing as $x_1(t)$ and $x_2(t)$ and assume them to be complex-valued (since it is trivial to then convert the result for real-valued signals if required). To keep the discussion simple, we will assume we are interested in zero-mean signals:
$$\textrm{Mean of }x_1(t):\quad \mu_1 \triangleq \mathcal{E}\{x_1(t)\} = 0\tag{1}$$
$$\textrm{Mean of }x_2(t): \quad \mu_2 \triangleq \mathcal{E}\{x_2(t)\} = 0\tag{2}$$
where curly $\mathcal{E}\{\cdot\}$ denotes the expectation operator. (For the record, we assume our signals to be ergodic, so we can just think of these as time averages). So, our task is to design $x_1(t)$ and $x_2(t)$ to have the following second order statistics:
$$\textrm{Power of }x_1(t):\quad P_1 \triangleq \mathcal{E}\{x_1(t)x_1^*(t)\}\tag{3}$$
$$\textrm{Power of }x_2(t):\quad P_2 \triangleq \mathcal{E}\{x_2(t)x_2^*(t)\}\tag{4}$$
$$\textrm{Cross-correlation}:\quad \rho \triangleq \frac{\mathcal{E}\{x_1(t)x_2^*(t)\}}{\sqrt{P_1P_2}}\tag{5}$$
where $^*$ denotes the complex conjugate. NOTE: the above expression for $\rho$ is simplified because we are designing zero-mean signals. If the means are non-zero, they must be subtracted as shown here: [link].
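For reference, here is how Equation (5) might be estimated from data in MATLAB. This is just a sketch of mine, assuming x1 and x2 are row vectors containing Nsamps samples of the two (zero-mean) signals, with the expectations replaced by sample averages:

% Sample estimates of P1, P2 and rho (zero-mean signals assumed)
P1_est  = (x1*x1')/Nsamps;
P2_est  = (x2*x2')/Nsamps;
rho_est = ((x1*x2')/Nsamps)/sqrt(P1_est*P2_est);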
Although it may not be immediately obvious how to obtain partially correlated signals, there are many ways of producing uncorrelated signals (i.e. where the cross-correlation is zero). Here, we'll use MATLAB's randn function, but you could use any other method if you prefer.
So, let's start by considering two zero-mean, unit-power, Normally distributed complex random variables, $s_1(t)$ and $s_2(t)$, which are uncorrelated such that:
$$\textrm{Power of }s_1(t): \quad \mathcal{E}\{s_1(t)s_1^*(t)\}=1\tag{6}$$
$$\textrm{Power of }s_2(t): \quad \mathcal{E}\{s_2(t)s_2^*(t)\}=1\tag{7}$$
$$\textrm{Cross-correlation}: \quad \mathcal{E}\{s_1(t)s_2^*(t)\}=0\tag{8}$$
To do this in MATLAB (or GNU Octave), we can write:
s1 = randn(1, Nsamps);
s2 = randn(1, Nsamps);
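Note that randn produces real-valued samples. Since Equations (6) to (8) only constrain the second order statistics, real-valued sources work perfectly well here. If you would prefer genuinely complex (circularly symmetric) Gaussian sources, one common construction is the following (an optional variation of mine, not required by the method):

s1 = (randn(1, Nsamps) + 1j*randn(1, Nsamps))/sqrt(2); % Unit power
s2 = (randn(1, Nsamps) + 1j*randn(1, Nsamps))/sqrt(2);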
A useful property of Normally distributed random variables is that when we add them together, the result is also Normally distributed. Or, as more correctly described in this article [link]:
the sum of two independent normally distributed random variables is normal, with its mean being the sum of the two means, and its variance being the sum of the two variances
Therefore, let's start by choosing: $$x_1(t)=\sqrt{P_1}s_1(t)\tag{9}$$
since this satisfies Equation (3). Then we just have to design an $x_2(t)$ which satisfies Equations (4) and (5). Given the quote above, let's consider constructing $x_2(t)$ in the following form: $$x_2(t)=A_1s_1(t)+A_2s_2(t)\tag{10}$$
Then we need to find (complex) values for $A_1$ and $A_2$ which satisfy our remaining requirements (Equations (4) and (5)). So, Equation (4) becomes:
$$\begin{eqnarray}P_2 &=& \mathcal{E}\{x_2(t)x_2^*(t)\} \\ &=&\mathcal{E}\{(A_1s_1(t)+A_2s_2(t))(A_1^*s_1^*(t)+A_2^*s_2^*(t))\} \\ &=&|A_1|^2\mathcal{E}\{s_1(t)s_1^*(t)\} + 0 + 0 +|A_2|^2\mathcal{E}\{s_2(t)s_2^*(t)\} \\ &=& |A_1|^2+|A_2|^2 \tag{11} \end{eqnarray}$$
and Equation (5) becomes:
$$\begin{eqnarray}\rho &=& \frac{\mathcal{E}\{x_1(t)x_2^*(t)\}}{\sqrt{P_1P_2}} \\ &=&\frac{\mathcal{E}\{(\sqrt{P_1}s_1(t))(A_1^*s_1^*(t)+A_2^*s_2^*(t))\}}{\sqrt{P_1P_2}} \\ &=&\frac{\sqrt{P_1}A_1^*\mathcal{E}\{s_1(t)s_1^*(t)\}+0}{\sqrt{P_1P_2}} \\ &=& \frac{A_1^*}{\sqrt{P_2}} \tag{12} \end{eqnarray}$$
An initial observation is that these conditions are independent of the complex phase of $A_2$. Therefore, we can assume $A_2$ to be purely real (so that $A_2=|A_2|$). So, solving these simultaneous equations (Equations 11 and 12) for real-valued $A_2$ gives:
$$A_2 = \sqrt{P_2(1-|\rho|^2)}$$
Finally, we rearrange Equation 12 to get:
$$A_1 = \rho^*\sqrt{P_2}$$
Putting all this together, we get our final result:
$$\begin{eqnarray}x_1(t) &=& \sqrt{P_1}s_1(t) \\ x_2(t) &=& \sqrt{P_2}\left(\rho^*s_1(t) + \sqrt{(1-|\rho|^2)}s_2(t)\right) \tag{13} \end{eqnarray}$$
Or, in MATLAB, we can write:
x1 = sqrt(P1)*s1;
x2 = sqrt(P2)*(conj(rho)*s1 + sqrt(1-abs(rho)^2)*s2);
Here is a full MATLAB code example two_sources.m:
close all; clear all; clc;

% Number of samples to generate
Nsamps = 100000;

% Set the required second order statistics
P1 = 2.3;
P2 = 0.75;
rho = 0.2 - 1j*0.3;

% 2x2 (theoretical) covariance matrix
Rxx = [P1,                     sqrt(P1*P2)*rho;
       sqrt(P1*P2)*conj(rho),  P2]

% Generate zero-mean, unit-power, uncorrelated Gaussian random signals
s1 = randn(1, Nsamps);
s2 = randn(1, Nsamps);

% Create partially correlated signals
x1 = sqrt(P1)*s1;
x2 = sqrt(P2)*(conj(rho)*s1 + sqrt(1-abs(rho)^2)*s2);

% 2x2 (practical) sample covariance matrix
Rxx_samp = [x1; x2]*[x1; x2]' / Nsamps
If you run this code, note that we never expect the sample covariance matrix to exactly match the theoretical covariance matrix. With stochastic (random) signals, sample statistics are themselves random variables, having their own probability distributions. If you want to generate signals that exactly match the theoretical covariance matrix, then you should use non-random (but still unit-power, uncorrelated) signals for $s_1(t)$ and $s_2(t)$.
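For example, one way to construct sources whose sample statistics (rather than just their ensemble statistics) are exactly ideal is to orthonormalise the rows of a random matrix. This is just one possible sketch (my own, using an economy-size QR decomposition):

% Sources whose SAMPLE statistics are exactly zero-mean, unit-power, uncorrelated
S = randn(2, Nsamps);
S = S - mean(S, 2);    % Rows now have exactly zero mean
[Q, ~] = qr(S.', 0);   % Orthonormal columns (economy-size QR)
S = sqrt(Nsamps)*Q.';  % Rows now have exactly unit power and zero cross-correlation
s1 = S(1,:);
s2 = S(2,:);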
A Generalised Method
To summarise the previous section: given a first signal, $x_1(t)$, we were able to construct a second signal, $x_2(t)$, with a desired cross-correlation, $\rho$. We constructed $x_2(t)$ from two parts: an $s_1(t)$ part, which correlates with $x_1(t)$, and an $s_2(t)$ part, which is uncorrelated with $x_1(t)$.
So, if we want to design a third signal, $x_3(t)$, then we will need a third uncorrelated signal, $s_3(t)$, in order to freely choose correlations with $x_1(t)$ and $x_2(t)$. More generally, we expect the $n^\textrm{th}$ signal, $x_n(t)$, to be a function of $n$ uncorrelated signals, $s_1(t), s_2(t)\dots,s_n(t)$. We can write this idea in matrix form:
$$\left[ \begin{array}{c} x_{1}(t) \\ x_{2}(t) \\ x_{3}(t) \\ \vdots \\ x_{n}(t) \end{array} \right] = \left[ \begin{array}{ccccc} a_{1,1} & & & & 0 \\ a_{2,1} & a_{2,2} & & & \\ a_{3,1} & a_{3,2} & \ddots & & \\ \vdots & \vdots & \ddots & \ddots & \\ a_{n,1} & a_{n,2} & \cdots & a_{n,(n-1)} & a_{n,n} \end{array} \right] \left[ \begin{array}{c} s_{1}(t) \\ s_{2}(t) \\ s_{3}(t) \\ \vdots \\ s_{n}(t) \end{array} \right] \tag{14}$$
or, defining a variable for each bracketed term:
$$\underline{x}(t) =\mathbf{A}\underline{s}(t)\tag{15}$$
where $\mathbf{A}$ is a lower triangular matrix.
In the previous section, we hand-solved for the $n=2$ case, but how can we generalise this for any $n$? One neat (and computationally efficient) way is to use the Cholesky decomposition of the desired theoretical covariance matrix.
To clarify, let's say we want our signals to have some $\left(N \times N \right)$ covariance matrix $\mathbf{R}_{xx}$:
$$\mathbf{R}_{xx} \triangleq \mathcal{E}\{\underline{x}(t)\underline{x}^H(t)\}\tag{16}$$
where $^H$ denotes the conjugate transpose. Then, since all covariance matrices are Hermitian and positive semidefinite, there must exist a lower triangular matrix, $\mathbf{L}$, such that:
$$\mathbf{R}_{xx} = \mathbf{L}\mathbf{L}^H\tag{17}$$
Equation 17 is called the Cholesky decomposition (see the "Limitations" section below for a discussion on practical implementations). We can rewrite this as:
$$\begin{eqnarray} \mathbf{R}_{xx} &=& \mathbf{L}\mathbf{I}\mathbf{L}^H \\ &=& \mathbf{L}\mathcal{E}\{\underline{s}(t)\underline{s}^H(t)\}\mathbf{L}^H \end{eqnarray} \tag{18}$$
where $\mathbf{I}$ is the $\left(N \times N\right)$ identity matrix. So, by comparing this to Equation (16), we can clearly define $\mathbf{A} = \mathbf{L}$ in Equation (15), so that:
$$\underline{x}(t) =\mathbf{L}\underline{s}(t) \tag{19}$$
Or, in MATLAB, we can write:
L = chol(Rxx, 'lower');
X = L*randn(N, Nsamps);
Here is a full MATLAB code example n_sources.m:
close all; clear all; clc;

% Number of samples to generate
Nsamps = 100000;

% Number of signals
n = 4;

% Set the required second order statistics
P1 = 2.3;  P2 = 0.75;  P3 = 3.4;  P4 = 1.23;
rho12 =  0.2 - 1j*0.3;
rho13 = -0.6 + 1j*0.1;
rho14 =       -1j*0.4;
rho23 =  0.1 + 1j*0.1;
rho24 =  0.5;
rho34 = -0.3 - 1j*0.1;

% 4x4 (theoretical) covariance matrix
Rxx = [P1,                      sqrt(P1*P2)*rho12,       sqrt(P1*P3)*rho13,       sqrt(P1*P4)*rho14;
       sqrt(P2*P1)*conj(rho12), P2,                      sqrt(P2*P3)*rho23,       sqrt(P2*P4)*rho24;
       sqrt(P3*P1)*conj(rho13), sqrt(P3*P2)*conj(rho23), P3,                      sqrt(P3*P4)*rho34;
       sqrt(P4*P1)*conj(rho14), sqrt(P4*P2)*conj(rho24), sqrt(P4*P3)*conj(rho34), P4]

% Generate zero-mean, unit-power, uncorrelated Gaussian random signals
s = randn(n, Nsamps);

% Create partially correlated signals
L = chol(Rxx, 'lower');
X = L*s;

% 4x4 (practical) sample covariance matrix
Rxx_samp = X*X' / Nsamps
If you run this code, then please note the comments above for two_sources.m (i.e. sample statistics are random variables).
Limitations
A useful limitation of the Cholesky decomposition is that it will fail to decompose invalid covariance matrices. So, if you try to specify a covariance matrix that is not physically realisable, then the algorithm will fail to complete and (in many implementations - including MATLAB's) tell you that it failed.
A less useful limitation (but one which can be easily worked around in many practical applications) is that the most common implementations of the Cholesky decomposition will also fail to decompose a special subset of valid covariance matrices.
Namely, this happens when $\mathbf{R}_{xx}$ is positive semidefinite (as opposed to positive definite). This occurs when $\mathbf{R}_{xx}$ has one or more eigenvalues that are exactly zero (up to machine precision). In terms of signal correlations, a zero-valued eigenvalue occurs when one of the signals can be written as a linear combination of the other signals (e.g. when two signals are perfectly correlated).
The reason this is usually of little concern in practice is that the difference between a positive definite and positive semidefinite matrix is merely a matter of machine precision. So, for example, the following valid covariance matrix fails to decompose using my version of MATLAB:
>> L = chol([1,0.99999999999999999;0.99999999999999999,1])
Error using chol
Matrix must be positive definite.
whereas this happily succeeds:
>> L = chol([1,0.9999999999999999;0.9999999999999999,1])
L =
    1.0000    1.0000
         0    0.0000
In many practical scenarios, we are not concerned about the difference between correlations of 0.9999999999999999 and 0.99999999999999999.
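If you really do need to factorise a covariance matrix which is only positive semidefinite, one simple workaround (a sketch of mine; see the Higham references below for more rigorous treatments) is to use an eigendecomposition instead. The resulting factor $\mathbf{A}$ is not triangular, but the method only requires that $\mathbf{R}_{xx} = \mathbf{A}\mathbf{A}^H$:

[V, D] = eig((Rxx + Rxx')/2); % Symmetrise first to suppress rounding errors
d = max(real(diag(D)), 0);    % Clip any tiny negative eigenvalues to zero
A = V*diag(sqrt(d));          % Then Rxx = A*A' (to machine precision)
X = A*randn(N, Nsamps);       % Use A in place of L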
Further Reading
If you really do need to decompose positive semidefinite covariance matrices, then an excellent discussion is presented by Nicholas Higham here: [link] or [link], with MATLAB source code here [link]. I believe LAPACK's dpstrf() subroutine may also be of interest.
An interesting article on using the Cholesky decomposition for both correlating and "uncorrelating" variables: [link].
An introductory article on a few common matrix decompositions: [link].
Comments
Dear Harry, whilst thanking you for your article, I have a very basic question which I wanted to ask.
Could you please elaborate further on applications where two cross-correlated signals would be used?
Furthermore, my understanding of cross-correlation only goes as far as knowing that two correlated signals have similar properties. In this example, in what way are they similar? Are they similar in terms of amplitude, frequency or phase shift, for example?
Pardon me if my questions are too basic; however, I never got around to understanding the exact meaning of cross-correlation from formal definitions, i.e. I haven't had any practical examples with it.
Thank you.
@lukegalea16 Thank you for your question.
I think it can be both helpful and unhelpful that so many words that we use in mathematics and engineering already had a widely-used informal meaning in English, before being assigned a specific formal meaning (or meanings) in a technical context.
"Correlation" is a good example of this. Informally, we get an idea that two correlated signals must be mutually connected ("co-related") in some way. However, as you mention, there might be countless different ways that signals could be considered "related" or "similar".
Over the ages, I am sure that numerous different mathematical techniques would have been proposed to measure or express "similarity" between signals. Some of those ideas were no good and got scrapped. Others were found to be useful and survived. Some of these surviving ideas were given the name "correlation".
The specific type of correlation discussed in the post above is the Pearson product-moment correlation coefficient, which has a specific formal definition (Equation 5, above). (Other types of correlation have been defined, but this is the one that we often find most useful in signal processing).
Personally, I find that formal definitions can be difficult to relate to, without practical context. Equation 5 would have no meaning to me whatsoever if I hadn't spent some time with practical examples.
For me personally, my big breakthrough in understanding correlation was in studying the basic techniques of CDMA communications. The theoretical concepts seemed so abstract and confusing, but when I thought about what the communication system needed to do (i.e. transfer information between multiple wireless devices, transmitting on the same frequency, at the same time, with unknown propagation delays), I realised I could understand correlation with just a few lines of MATLAB code. Perhaps I should do a short blog on this.
Another thing that really helped for me was to think about correlations geometrically. Using this perspective, each signal is a line (or "vector"). The length of each line is the magnitude of each signal. The angle $\theta$ between the lines is related to their correlation $\rho$. If two signals are perfectly correlated ($\rho=1$), they're like parallel lines ($\theta=0$°). If they're uncorrelated, then they're at right angles ($\theta=90$°). If they're only partially correlated, then the angle between the lines is somewhere in between 0° and 90°. (Specifically, $\rho$ is analogous to $\cos\theta$, the normalised inner product between the vectors).
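For instance, here is a quick numerical sketch of that geometric picture (the numbers are arbitrary examples):

% |rho| behaves like cos(theta) between the signal "vectors"
a = randn(1, 100000);
b = 0.6*a + 0.8*randn(1, 100000);    % Partially correlated with a
rho_ab = (a*b')/sqrt((a*a')*(b*b')); % Sample correlation (around 0.6)
theta = acosd(abs(rho_ab))           % Angle between the vectors (around 53 degrees)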
To answer your questions directly:
Two correlated signals may be used either on purpose (in order to meet some design goal) or by accident (due to the signal environment). For example, we deliberately use correlated signals in beamforming. By controlling the cross-correlations between different signals being transmitted simultaneously from different points in space, we can "beam" their combined energy into specific directions (and avoid other specific directions). Accidental correlation might exist between your mobile phone signal (arriving directly from the base station) and a reflection of the same signal off your neighbour's house. This is called multipath and can cause severe degradation in signal quality if not processed correctly.
Amplitude has no effect on (this type of) correlation because this is normalised by dividing by $\sqrt{P_1P_2}$ in Equation 5. A phase shift in the signal causes exactly that phase shift in $\rho$. A frequency shift is much more important. You can make two perfectly correlated signals completely uncorrelated, just by shifting one in frequency. This has many consequences in practice.
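To illustrate that last point, here is a short sketch (again, the numbers are arbitrary) showing a frequency shift making two otherwise identical signals uncorrelated:

Nsamps = 100000;
t = 0:Nsamps-1;
x = exp(1j*2*pi*0.01*t);            % A complex tone
y = x .* exp(1j*2*pi*0.02*t);       % The same tone, shifted in frequency
rho_xy = (x*y')/sqrt((x*x')*(y*y')) % Essentially zero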
@weetabixharry, thank you so much for your informative reply! I appreciate the effort you put into it and the topic is much clearer now given your condensed and practical summary.
The way you described correlation geometrically makes sense, and from what I found, it is closely related to the Pearson product-moment correlation coefficient. But basically, there are different statistical measures which define how, and by how much, two signals are similar/correlated, right?
Beamforming and multipath are two interesting examples which I should look into to further understand this topic and its uses. I don't think that I'll come across correlated signals in my work/studies - but it's always better to have at least a basic understanding of certain topics and their applications :)
basically there are different statistical measures which define how and by how much two signals are similar/correlated - right?
Yes, many different measures can be (and are) used in different contexts. For example, you can see some modified methods of computing correlations [here] and some different methods [here]. And those are just the methods called "correlation". However, by default in signal processing, we're usually interested in the type of correlation I mention in this blog post.
I don't think that I'll come across correlated signals in my work/studies
You could be right, but just to clarify, beamforming and multipath were just two examples off the top of my head. Correlation appears in a very large number of signal processing topics. So, if you ever do encounter it, don't be discouraged by the dry theory - just work through what it means in the context of your application.
Understood :) and also, in time, I wish to get into Audio DSP, so it might be the case that I encounter correlation then. I agree as regards the theory; more often than not, the theory is easier to understand after experimenting with the practical side of things.
Harry, thank you for a very stimulating article which I am just beginning to digest.
I got stuck on the second sentence, where you were talking about two (or more) signals and then brought up the (auto)correlation matrix R_{xx}, whereas I would have thought we'd be talking about R_{xy} for two different signals.
Can you please explain?
--Randy
Yes, if you have two signals, you might choose to denote them $x(t)$ and $y(t)$. Then you might denote their cross-correlation as $\rho_{xy}$ and their covariance matrix as $R_{xy}$.
However, if you have a third signal, you then need a new letter to denote it - say, $z(t)$. A fourth signal, you might then denote... $a(t)$? Eventually, we are going to run out of letters (or it's at least going to become a bother).
So, instead of denoting more and more scalar signals $x(t)$, $y(t)$, $z(t)$, $a(t)$ etc., we just denote one vector signal $\underline{x}(t)$. This vector has arbitrary length, $N$. Therefore, using my notation, $R_{xx}$ denotes the $(N \times N)$ covariance matrix describing the second order statistics of $N$ signals.
Harry,
After reading your response I must admit I am still not clear. But I have come up with a question:
Do you consider a random VARIABLE to be the equivalent of a random SIGNAL? I do not. In most of the literature, a random signal is equivalent to a random PROCESS, and a random process is more than a random variable.
Representing a group of random VARIABLES by a vector X = (X1, X2, ..., Xn) is perfectly legitimate and I can see where this article may explain how to come up with samples from each of those random VARIABLES with a given cross-correlation.
Is this what you mean? Or am I confused? :)
-Randy
Randy,
Perhaps the title of my blog was misleading. The method is not really anything to do with randomness - it's to do with generating discrete signals with specified powers and cross-correlations. I used MATLAB's randn() function as an example for my $\underline{s}(t)$. I could also, for example, have used sine waves with frequencies $f, 2f, 3f, ..., Nf$, since these would also have been uncorrelated.
However, I must admit, I don't understand your question about the nuances of random variables, signals and processes. Perhaps you could explain a little more about what the problem could be. Is it an issue with my wording, or is it that you don't consider this to be a legitimate model for a real system? (For example, we might imagine that an ideal antenna attached to an ideal ADC in the presence of signal sources emitting these correlated signals would see linear combinations of these signals. Those linear combinations would be defined by some simple channel model.).
Perhaps a further clarification is to say that the X in my MATLAB example is not a vector. It is a matrix whose rows are the discrete signals.
--Harry
@weetabixharry
Your article is very interesting. But I see that, in direction-of-arrival estimation, some people use Rxx for applying the MUSIC algorithm, as the MUSIC algorithm needs a lot of snapshots. I have also studied papers that don't need Rxx or many snapshots. Even with a single snapshot, they can estimate the direction of arrival, without calculating Rxx. If that is so, then what is the need for Rxx?
@matlab2015 Under standard array signal models, it is not possible to estimate the directions of arrival of multiple signal sources from a single snapshot (and certainly not using the MUSIC algorithm).
The MUSIC algorithm is a subspace-based direction-of-arrival estimation algorithm. This means that the algorithm works by attempting to partition the N-dimensional observation space (where N is the number of receiving antennas) into an M-dimensional signal subspace (where M is the number of signal sources) and an (N-M)-dimensional noise subspace.
This means that to estimate the directions of M sources, the rank of the NxN covariance matrix Rxx (ignoring noise) must be M. If you only have 1 snapshot, then the rank of Rxx is at most 1.
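As a quick numerical sketch of this rank argument (with an arbitrary N):

N = 10;                         % Number of antennas
x = randn(N,1) + 1j*randn(N,1); % A single snapshot
Rxx = x*x';                     % Sample covariance from one snapshot
rank(Rxx)                       % Returns 1: no room to separate M > 1 sources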
In special cases (for specific array geometries), it is possible to apply post-processing such as forward- and/or backward- spatial smoothing, which allows us to synthetically produce more snapshots. But in the general case, one snapshot is not enough.
what is the need of Rxx?
A theoretical justification for using Rxx is that, under certain assumptions, it has been proven that knowing Rxx (a little NxN matrix) is just as good as knowing all the received samples (a huge NxL matrix, where L is the number of received snapshots and typically L >> N). So using Rxx is an enormous simplification of an otherwise rather difficult (computational) problem.
You may be right. But let me attach a published paper here where they have not used many snapshots. Instead, they have estimated the DOAs of targets with a single snapshot only. These papers are reviewed by experts in the field. Does that mean they have less knowledge, or something else? I want to know this. Can you justify your statement after studying this paper?
A Super-Resolution DOA Estimation Method for Fast-Moving Targets in MIMO Radar.pdf
I am of the opinion that the experts are experts. What do you say? There are other published papers too. But for the time being, study this one.
I am of the opinion that the experts are experts.
This appears to be a tautology: x = x.
If you meant to say that "people who claim to be experts are experts", then I would not agree in general.
These papers are reviewed by experts in the field.
I have reviewed many similar papers, but I would not consider myself to be an expert. I would say instead that most of my PhD thesis was directly related to this topic, so I was certainly surrounded by experts on this topic for several years.
I suggest you consider more carefully who wrote this paper (researchers at Nanjing University, which according to my web searches is not a highly renowned university). And also who published it: Hindawi (who, in my experience, will publish more or less anything). That is not to say that this is not good research, but simply that it may not be a good starting place for you to learn the basics.
In my opinion, a better approach for you would be to read about the theory of maximum likelihood estimation and estimation error bounds, such as the Cramer-Rao bound (the books by Harry van Trees could be a reasonable place to start). Then you will see that, at a very fundamental level, more snapshots lead to better parameter estimations.
You may be right, but I think we can look at it like this: initially, there were traditional methods, like MUSIC and ESPRIT, which were enough to solve a problem. With the lapse of time, those methods/algorithms failed on problems which had no derivative or were trapped in local extrema. But later on, metaheuristic algorithms came along and easily solved all those problems which couldn't be solved by the traditional methods. Now, if the people who know the classical methods, but don't know about the modern stochastic algorithms, say that these stochastic algorithms are just gossip and don't accept them, then what will you say? Will they be correct? And similarly, what if the people who know the modern stochastic algorithms say the opposite?
In my opinion, these people have developed some sort of technique with the help of which only a single snapshot can be used to estimate the DOAs.
It's very interesting for me to discuss this with you further, just for the sake of my knowledge, if you agree, as I liked your arguments too. But I look at things in a broad domain, not a narrow domain.
It is not a case of new ideas or old ideas. If you dig deep enough in estimation theory, then at the lowest, most fundamental level, you reach probability theory.
Probability theory allows us to state absolute facts (given a particular signal model) about what is and is not possible. These facts are not debatable. They do not depend on any algorithm (new or old). They are mathematical fact.
In the academic literature, the study of parameter estimation is split into 3 parts: the detection problem (estimating the number of signal sources), the resolution problem (obtaining distinct parameter estimates for each signal source) and the estimation problem (estimating the value of the parameter).
The estimation problem has received the most interest, which is why I suggested you try to understand maximum likelihood estimation and estimation error bounds, such as the Cramer-Rao bound. Until you understand those topics, it will be difficult for you to see why your previous comment doesn't make sense.
Yes, apparently this sounds good now. You forced me to think and search. So, I was surfing the net and I found two of your codes on another site for the MUSIC algorithm. I reproduce the first of them here. If your argument is true, then why does this MUSIC fail when we take the azimuth angles very close to each other, even if we increase the snapshots to 100000 or more? I mean, when you say that many snapshots help in better estimation, then why does it fail here? Secondly, why have you taken the elevation angle to be zero? Because if we take non-zero values, even then it fails, irrespective of the increase in snapshots. I mean, if I accept your argument as true, then why does it fail here in both cases?
Code for MUSIC1D:
close all; clear all; clc;
% ======= (1) TRANSMITTED SIGNALS ======= %
% Signal source directions
az = [35;36;37];%az = [35;39;127]; % Azimuths
el = zeros(size(az)); % Simple example: assume elevations zero
% el = [63;14;57]; % Elevations % Un-comment this for 2nd case
M = length(az); % Number of sources
% Transmitted signals
L = 200; % Number of data snapshots recorded by receiver
m = randn(M,L); % Example: normally distributed random signals
% ========= (2) RECEIVED SIGNAL ========= %
% Wavenumber vectors (in units of wavelength/2)
k = pi*[cosd(az).*cosd(el), sind(az).*cosd(el), sind(el)].';
% Array geometry [rx,ry,rz]
N = 10; % Number of antennas
r = [(-(N-1)/2:(N-1)/2).',zeros(N,2)]; % Assume uniform linear array
% Matrix of array response vectors
A = exp(-1j*r*k);
% Additive noise
sigma2 = 0.01; % Noise variance
n = sqrt(sigma2)*(randn(N,L) + 1j*randn(N,L))/sqrt(2);
% Received signal
x = A*m + n;
% ========= (3) MUSIC ALGORITHM ========= %
% Sample covariance matrix
Rxx = x*x'/L;
% Eigendecompose
[E,D] = eig(Rxx);
[lambda,idx] = sort(diag(D)); % Vector of sorted eigenvalues
E = E(:,idx); % Sort eigenvectors accordingly
En = E(:,1:end-M); % Noise eigenvectors (ASSUMPTION: M IS KNOWN)
% MUSIC search directions
AzSearch = (0:1:180).'; % Azimuth values to search
ElSearch = zeros(size(AzSearch)); % Simple 1D example
% Corresponding points on array manifold to search
kSearch = pi*[cosd(AzSearch).*cosd(ElSearch), ...
sind(AzSearch).*cosd(ElSearch), sind(ElSearch)].';
ASearch = exp(-1j*r*kSearch);
% MUSIC spectrum
Z = sum(abs(ASearch'*En).^2,2);
% Plot
figure();
% plot(AzSearch,10*log10(Z));% This line gives nulls on desired angles
plot(AzSearch,-10*log10(Z.'/N));% To get peaks
title('Simple 1D MUSIC Example');
xlabel('Azimuth (degrees)');
ylabel('MUSIC spectrum (dB)');
grid on; axis tight;
why this MUSIC fails when we take the azimuth angle very close to each other
This is the resolution problem. A brilliant analysis of the resolution performance of MUSIC can be found in the 1986 paper by Kaveh and Barabell "The statistical performance of the MUSIC and the minimum-norm algorithms in resolving plane waves in noise" (and don't miss the corrections here). I highly recommend this paper - it will answer your question, and much more.
Another essential paper is Petre Stoica's 1989 article "MUSIC, maximum likelihood and Cramer-Rao bound". This covers the estimation problem. It is a long paper (and in my opinion it's not so well written as Kaveh's paper), but if you take the time to understand it, then I think virtually all of your confusion will be resolved.
If you want to learn about the detection problem, then it's useful to read about specific algorithms such as MDL (Minimum Description Length) and AIC (Akaike Information Criterion). Or, even easier, just look at the eigenvalues, diag(D), in my MATLAB code. If you can see M eigenvalues that look noticeably larger than the others, then you have solved the detection problem.
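For example, appending something like this to the code above (just a sketch) lets you eyeball the eigenvalues:

% "lambda" is sorted in ascending order; displayed largest-first, the top M
% values should stand clearly above the noise floor (near sigma2 in this simulation)
disp(flipud(lambda));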
why have you taken elevation angle zero
Take a look at the array geometry:
r = [(-(N-1)/2:(N-1)/2).',zeros(N,2)]; % Assume uniform linear array
Do you think the elevation angle has any effect whatsoever? (Hint: it doesn't).
this fails irrespective of the increase in snapshots
This is fundamentally wrong. Think about it! MUSIC is a superresolution algorithm. Do you understand what superresolution means? It means that the algorithm has asymptotically infinite resolving capability, as the number of snapshots increases.
Try to think about what you are doing. If you set these azimuths: [35;36;37], then evaluate the MUSIC spectrum at these azimuths: (0:1:180), then what do you think is going to happen? Is there any way this will produce distinct parameter estimates? (Hint: there isn't).
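To make that concrete, here is a sketch of the kind of change needed (a finer grid is necessary for distinct estimates, though, depending on SNR and snapshot count, not always sufficient):

% A 1-degree grid cannot return distinct estimates for sources at
% 35, 36 and 37 degrees, no matter how many snapshots you use
AzSearch = (30:0.05:40).'; % Fine grid around the true azimuths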
I am not as much of an expert as you are, but as I said, I look at things from a broad domain, not a narrow domain. For me, the easy thing to understand is MATLAB code. These papers will take a lot of my time. However, your statement is:
Do you think the elevation angle has any effect whatsoever? (Hint: it doesn't).
Yes, it does. Just uncomment the following line, i.e.,
% el = [63;14;57]; % Elevations % Un-comment this for 2nd case
And run the code. You will see that the angles are estimated wrongly, even if we increase the number of snapshots.
Coming to another confusion of mine. You have said in 2D MUSIC that we can estimate both angles with a 2D array, and you have given a circular array code there. I accept this. But if we take an L-shaped array (i.e., one ULA along the x-axis and one ULA along the y-axis, with one antenna at the origin), we can still estimate both angles.
The confusion is this: if I have a single ULA placed along the x-axis only, and no other array, then we can estimate the azimuth angle with it. Isn't that so? Now, if we lift this array from the x-axis and place it along the y-axis instead, then which angle can we estimate with it? Will it be elevation this time? Or will it still be azimuth? I have asked this question of many people, but they don't give me a reply in clear words. They say you need a 2D array to estimate both angles. I reiterate to them that I don't want to use a 2D array, I just want to use a single ULA. But again they say you need a 2D array for both angles. What do you say about this?
Sorry, I didn't see this comment, which shows that you were trying to set independent elevations for the signal sources:
el = [63;14;57]
This is not possible with a linear array (without enormous concessions). With a linear array, the sources must be coplanar (and, furthermore, they must be confined to a 180º azimuth sector of that plane). For example:
el = [63;63;63]
But, of course, you also need to remember to search the correct plane:
ElSearch = ones(size(AzSearch)) * el(1);
If you use a 2D (planar) array, then you can cover all 360º of azimuth, but you are still limited to only 90º of elevation. To cover the whole parameter space (360º of azimuth and 180º of elevation), you need a 3D array geometry.
The reason for all of these limitations is array ambiguity. This means that the array can receive the same signal in two totally different signal environments (i.e. with signal sources located in totally different places, the array "sees" the same thing).
Some cases of array ambiguity are simple. For example, if you rotate a linear array by 180º in azimuth, you can see that cosd(az) becomes cosd(az+180º) = -cosd(az). So the array doesn't know if it's seeing some signal x(t) arriving from azimuth theta, or -x(t) arriving from theta+180º.
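A quick numerical sketch of this (using the same manifold convention as the code above):

az = 35;                         % Example azimuth (elevation zero)
r = (-4.5:4.5).';                % ULA element x-positions, as in the code above
a1 = exp(-1j*pi*r*cosd(az));     % Response from azimuth az
a2 = exp(-1j*pi*r*cosd(az+180)); % Response from azimuth az+180
norm(a2 - conj(a1))              % Zero: cosd flips sign, so a2 = conj(a1)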
However, this topic gets much more complicated, because there are also "non-trivial" ambiguities that are (in my opinion) not intuitive at all. These ambiguities arise when manifold vectors exist that are linear combinations of other manifold vectors.
The only reference I know that handles array ambiguities in detail is this free book by Manikas. The book looks scary because it uses differential geometry to study array manifold curves/surfaces, but there is some good information even if you don't dig deep into the mathematics.
For me to understand the easy thing is MATLAB code.
I agree, but unfortunately, you will not find much MATLAB code on this topic. And almost all of what you can find will be total garbage. But this is also true with academic papers - the vast majority will be a total waste of your time. Good luck.
No, sorry, brother. This is the first time I am talking to someone who gives me such answers to my questions, while others hide their faces. I mean to say, they don't give me clear answers like you do; they beat around the bush.
Now, coming to your explanation of my confusion about the linear array. You said:
With a linear array, the sources must be coplanar (and, furthermore, they must be confined to a 180º azimuth sector of that plane). For example:
el = [63;63;63]
I am not getting this point of yours. By coplanar, we mean lying in a plane which is parallel to some other plane. So how can a source be coplanar with the array? A linear array is a line, and the sources are in space. I thought about it, but I couldn't get a visual feel for how a region of space can be coplanar with a line (a linear array). Further, you have taken el = [63;63;63], which means all three sources have the same elevation angle. How is that possible? I got further confused.
you need a 3D array geometry.
I think you have already taken a 3D array in your 2nd code. Although you have coded it only for a circular array, in general it is an arbitrary-geometry array in R3, since each sensor has three coordinates. So what further do you mean by 3D?
I agree, but unfortunately, you will not find much MATLAB code on this topic.
I do agree with you. But if people like you try to write some code together with a person who is very eager, then it will definitely be a treasure for the people who come later. Isn't that so? I see people have used some built-in MATLAB routines, which a layman like me doesn't enjoy, since we just have to memorise them without knowing what is going on inside. But the two codes you have written are really interesting, and one is triggered to go further.
If you don't get bored of my questions, let me ask something further. I changed the array code in your 2D MUSIC code from circular to 1L- and then 2L-shaped arrays. The results for the 2L array are just like the circular array, but the 1L array gives a bad result. Why is that so? Let me give the codes for the 1L and 2L arrays here. You can just insert these into your code and see for yourself.
1L array:
N = 9; % Number of antennas
% Note: One antenna is always at the origin and the remaining antennas
% are distributed equally along the two axes.
d = 0.5; % Inter-element spacing
rx = [(N-1)/2*d:-d:d, zeros(1,(N+1)/2)].';
ry = [zeros(1,(N+1)/2), d:d:(N-1)/2*d].';
r = [rx, ry, zeros(N,1)];
2L array:
N = 16; % Number of antennas
% Note: One antenna is always at the origin and the remaining antennas
% are distributed equally along all three axes.
d = 0.5; % Inter-element spacing
N_on_each_axis = (N-1)/3;
r = zeros(3*N_on_each_axis+1, 3);
v = (N_on_each_axis*d:-d:d).';
r(2:N_on_each_axis+1, 1) = v;
r(N_on_each_axis+2:2*N_on_each_axis+1, 2) = v;
r(2*N_on_each_axis+2:3*N_on_each_axis+1, 3) = v;