comp.dsp | convolving noise with noise will get Gaussian? what does that imply?

Suppose I have a segment of data, which is basically random numbers between 
0 and 1. I call it "f".

Doing "f" -- the noise -- self convolution several times, the resultant 
data, if plotted, is of Gaussian shape.

I know this is somewhat related to CLT...

But what does this mean in practice? What does this imply? Does this CLT 
fact have any implication or application in practice? Any deep thoughts, 
intuitions?

Thanks a lot!

Reply by Rune Allnor ● November 3, 2004

"kiki" <lunaliu3@yahoo.com> wrote in message news:<cm8mci$sjo$1@news.Stanford.EDU>...
> Suppose I have a segment of data, which is basically random numbers between 
> 0 and 1. I call it "f".
> 
> Doing "f" -- the noise -- self convolution several times, the resultant 
> data, if plotted, is of Gaussian shape.

Exactly what do you do, and how have you verified that your output 
is of Gaussian shape, and not, say, triangular?
 
> I know this is somewhat related to CLT...

It need not be at all. I'd put my money on what you see being a 
"stretching effect" due to accumulating end effects from doing the 
multiple convolutions. I'd expect something like this to happen even 
if you happen to perform a circular convolution.

Actually (but I may be wrong here!), I believe the CLT applies to 
convolving PDFs, not the random data themselves...

> But what does this mean in practice? What does this imply? Does this CLT 
> fact have any implication or application in practice? Any deep thoughts, 
> intuitions?

I'd expect it to be a nice learning experience with respect to 
practical analysis of data (including documentation of your algorithm), 
as well as in approaching results with a hint of critical thought. 
Other than that, I'm quite shallow these days. 

> Thanks a lot!

Y' welcome.

Rune

Reply by David C. Ullrich ● November 3, 2004

On Tue, 2 Nov 2004 11:15:30 -0800, "kiki" <lunaliu3@yahoo.com> wrote:

>Suppose I have a segment of data, which is basically random numbers between 
>0 and 1. I call it "f".
>
>Doing "f" -- the noise -- self convolution several times, the resultant 
>data, if plotted, is of Gaussian shape.
>
>I know this is somewhat related to CLT...
>
>But what does this mean in practice? What does this imply? Does this CLT 
>fact have any implication or application in practice? Any deep thoughts, 
>intuitions?

I don't have any deep thoughts or intuitions about what this means,
but since someone else has said you're simply wrong about all this
I'll just say that yes, convolving the distribution of the noise
with itself several times does lead to something roughly Gaussian.
This has a lot to do with CLT, in fact this is exactly what CLT
_says_!
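
A minimal sketch of this claim in plain Python. It assumes (the thread never states it) that the noise is uniformly distributed, and it discretizes that distribution onto ten equally likely bins; the helper names `convolve`, `self_convolve` and `gaussian_mismatch` are made up for illustration. The sketch convolves the distribution with itself and measures how far the result is from a Gaussian with the same mean and variance:

```python
import math

def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def self_convolve(pmf, n):
    """Convolve n copies of `pmf`: the PMF of a sum of n independent draws."""
    result = list(pmf)
    for _ in range(n - 1):
        result = convolve(result, pmf)
    return result

def gaussian_mismatch(pmf):
    """Largest gap between `pmf` and a Gaussian with the same mean and
    variance, expressed as a fraction of the PMF's peak value."""
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    worst = max(
        abs(p - norm * math.exp(-((k - mean) ** 2) / (2.0 * var)))
        for k, p in enumerate(pmf)
    )
    return worst / max(pmf)

# Assumed input: a uniform distribution over ten bins.
uniform = [0.1] * 10

mismatch = {n: gaussian_mismatch(self_convolve(uniform, n)) for n in (1, 2, 4, 8)}
for n, err in mismatch.items():
    print(f"{n} copies convolved: relative mismatch = {err:.4f}")
```

The mismatch shrinks as more copies are convolved, but it does not reach zero for any finite number of copies.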

>Thanks a lot!
>
>
>
>

************************

David C. Ullrich

Reply by Herman Rubin ● November 3, 2004

In article <f56893ae.0411030026.1385213f@posting.google.com>,
Rune Allnor <allnor@tele.ntnu.no> wrote:
>"kiki" <lunaliu3@yahoo.com> wrote in message news:<cm8mci$sjo$1@news.Stanford.EDU>...
>> Suppose I have a segment of data, which is basically random numbers between 
>> 0 and 1. I call it "f".

>> Doing "f" -- the noise -- self convolution several times, the resultant 
>> data, if plotted, is of Gaussian shape.

The convolution of identical distributions with second
moments is APPROXIMATELY normal, the approximation becoming
better with the number convolved.

The convolution of two distributions is never normal 
unless both are normal.

These two theorems may seem paradoxical, but they are
both true.
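
For reference, the two statements can be written side by side. The labels "classical CLT" and "Cramér's decomposition theorem" are the usual textbook names, not ones used in the post, and the notation ($X_i$, $\mu$, $\sigma$, $m$, $s$) is introduced here only for illustration. "With second moments" is read as "with finite variance".

```latex
% Classical CLT (i.i.d. form): a statement about the limit n -> infinity.
\[
X_1, X_2, \dots \ \text{i.i.d.},\quad
\mathbb{E}[X_i] = \mu,\quad
0 < \operatorname{Var}(X_i) = \sigma^2 < \infty
\;\Longrightarrow\;
\frac{X_1 + \dots + X_n - n\mu}{\sigma\sqrt{n}}
\xrightarrow{\;d\;} \mathcal{N}(0,1)
\quad (n \to \infty).
\]

% Cramer's decomposition theorem: a statement about an exact, finite sum.
\[
X,\ Y \ \text{independent},\quad
X + Y \sim \mathcal{N}(m, s^2)
\;\Longrightarrow\;
X \text{ and } Y \text{ are each normal (possibly degenerate)}.
\]
```

The first is about convergence as the number of terms grows; the second says that no finite sum of independent non-normal terms is exactly normal. That is why the two do not contradict each other.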
-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hrubin@stat.purdue.edu         Phone: (765)494-6054   FAX: (765)494-0558

Reply by Rune Allnor ● November 4, 2004

hrubin@odds.stat.purdue.edu (Herman Rubin) wrote in message news:<cmbd3d$5vsm@odds.stat.purdue.edu>...
> In article <f56893ae.0411030026.1385213f@posting.google.com>,
> Rune Allnor <allnor@tele.ntnu.no> wrote:
> >"kiki" <lunaliu3@yahoo.com> wrote in message news:<cm8mci$sjo$1@news.Stanford.EDU>...
> >> Suppose I have a segment of data, which is basically random numbers between 
> >> 0 and 1. I call it "f".
>  
> >> Doing "f" -- the noise -- self convolution several times, the resultant 
> >> data, if plotted, is of Gaussian shape.
> 
> The convolution of identical distributions with second
> moments is APPROXIMATELY normal, the approximation becoming
> better with the number convolved.
> 
> The convolution of two distributions is never normal 
> unless both are normal.
> 
> These two theorems may seem paradoxical, but they are
> both true.

I only know the CLT from statistics, where Papoulis says [1, p 214]:

  "The [CLT] states that under _certain_general_conditions_..."

and

  "The CLT can be expressed as a property of convolution: The 
   convolution of a large number of _positive_functions_ 
   is approximately a normal function."

(My emphasis in both quotes.)

So there are "conditions" for the CLT to hold, and "positive functions"
are mentioned explicitly. My personal observation is that the "CLT" is 
always mentioned in the context of "distributions", never with "data".

Now, the OP mentioned that she(?) worked with random *data* with 
a probability distribution defined on the interval [0,1]. So my 
argument is that the CLT cannot be invoked in this case, since the 
convolution operates on the *data*, not the PDF. I could stretch as 
far, though, as accepting the CLT as not entirely irrelevant for this 
particular data set, which is necessarily non-negative because of the 
distribution it is generated from. In general, however, I think the CLT 
has no place in a discussion that regards the *data* directly.

Rune  

[1] Papoulis: "Probability, Random Variables and Stochastic Processes",
      3rd ed., McGraw-Hill, 1991.

Reply by Rune Allnor ● November 5, 2004

Randy Yates <randy.yates@sonyericsson.com> wrote in message news:<xxppt2tofos.fsf@usrts005.corpusers.net>...
> allnor@tele.ntnu.no (Rune Allnor) writes:
> 
> > hrubin@odds.stat.purdue.edu (Herman Rubin) wrote in message news:<cmbd3d$5vsm@odds.stat.purdue.edu>...
> > > In article <f56893ae.0411030026.1385213f@posting.google.com>,
> > > Rune Allnor <allnor@tele.ntnu.no> wrote:
> > > >"kiki" <lunaliu3@yahoo.com> wrote in message news:<cm8mci$sjo$1@news.Stanford.EDU>...
> > > >> Suppose I have a segment of data, which is basically random numbers between 
> > > >> 0 and 1. I call it "f".
>  
> > > >> Doing "f" -- the noise -- self convolution several times, the resultant 
> > > >> data, if plotted, is of Gaussian shape.
> > > 
> > > The convolution of identical distributions with second
> > > moments is APPROXIMATELY normal, the approximation becoming
> > > better with the number convolved.
> > > 
> > > The convolution of two distributions is never normal 
> > > unless both are normal.
> > > 
> > > These two theorems may seem paradoxical, but they are
> > > both true.
> > 
> > I only know the CLT from statistics, where Papoulis says [1, p 214]:
> > 
> >   "The [CLT] states that under _certain_general_conditions_..."
> > 
> > and
> > 
> >   "The CLT can be expressed as a property of convolution: The 
> >    convolution of a large number of _positive_functions_ 
> >    is approximately a normal function."
> > 
> > (My emphasis in both quotes.)
> > 
> > So there are "conditions" for the CLT to hold, and "positive functions"
> > are mentioned explicitly. My personal observation is that the "CLT" is 
> > always mentioned in the context of "distributions", never with "data".
> 
> Rune,
> 
> In my estimation, you're either confused or blinding yourself by being
> overly pedantic.

In the past, I've been known to be both... ;)

> First of all, a PDF is *ALWAYS* a "positive function," 

Agreed.

> so your caveat
> on this point is empty. 

It is pretty obvious from the original post that what is being convolved 
here is the (positive) random data that conform to a (positive) PDF, not 
the PDF itself. Or, to rephrase the original question a bit more 
pedantically:

   [ Note that I have changed the notation to comply with "standard"
   conventions. The fact that a random variable by chance is referred 
   to by the letter "f" does *not* transform the variable into a PDF!]

   A random vector x= [x_0, x_1,...,x_N] is drawn from a random 
   process X. The process X is characterized by an (unspecified) 
   probability density function f such that each coefficient x_n 
   of x obeys

        0 < x_n < 1,   n=0,1,...N.

   The discussion relates to repeated convolutions of the random 
   vector x, not the PDF f. In fact, the PDF f has never been 
   explicitly defined in this thread. 

I am sure you agree with me in that the Gaussian N(0,1) probability 
distribution function is positive for all arguments. I am also sure 
you agree with me in that the random data generated by a stochastic 
process characterized by the N(0,1) PDF will *not* be positive.  

So where does the CLT apply? With positive PDFs? With not necessarily 
positive random data? That's the whole point here.
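
One way to probe this question numerically is sketched below in plain Python. It is not a reconstruction of the OP's experiment: the vector length (32), the seed and the number of self-convolutions (6) are arbitrary assumptions, and the helper names are invented for illustration. The idea is that a positive data vector, once rescaled to unit sum, can itself be read as a probability mass function over the sample index, so repeated self-convolution of such data falls under the "positive functions" form of the CLT. Removing the mean from the same data breaks that reading.

```python
import math
import random

def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def self_convolve(seq, n):
    """Convolve n copies of `seq` together."""
    result = list(seq)
    for _ in range(n - 1):
        result = convolve(result, seq)
    return result

def gaussian_mismatch(pmf):
    """Largest gap between `pmf` and a Gaussian of equal mean and variance,
    as a fraction of the PMF's peak. Only meaningful for a valid PMF."""
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    worst = max(
        abs(p - norm * math.exp(-((k - mean) ** 2) / (2.0 * var)))
        for k, p in enumerate(pmf)
    )
    return worst / max(pmf)

rng = random.Random(0)                      # fixed seed, arbitrary choice
data = [rng.random() for _ in range(32)]    # "random numbers between 0 and 1"

# Positive data rescaled to unit sum: a valid PMF over the index 0..31.
total = sum(data)
as_pmf = [x / total for x in data]
raw_mismatch = gaussian_mismatch(as_pmf)
positive_result = self_convolve(as_pmf, 6)
positive_mismatch = gaussian_mismatch(positive_result)

# The same data with its sample mean removed: no longer a positive function.
sample_mean = total / len(data)
centered = [x - sample_mean for x in data]
centered_result = self_convolve(centered, 6)

print(f"mismatch of raw positive data:      {raw_mismatch:.3f}")
print(f"mismatch after 6 self-convolutions: {positive_mismatch:.3f}")
print(f"most negative value, centered case: {min(centered_result):.3e}")
```

In this sketch the positive data ends up far closer to a Gaussian shape after self-convolution than it started, while the mean-removed data yields a sequence with negative entries that cannot be read as a density at all.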

> Secondly, when you add independent random variables, the distribution
> of the result is the convolution of the input variables' PDFs. 

Exactly. But the OP talked about "data", not "PDFs". One of my 
severe shortcomings in life is that I am not clairvoyant. I can't 
look into the minds of other people and see what they actually mean
to ask. I have to relate to what they express either orally or in writing. 
The OP explicitly used the term "data". So I try to discuss "data". 
If the OP meant to discuss PDFs, well, so be it. If so, she(?) phrased 
the question very poorly. And did the wrong experiment as well.
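
The relationship quoted above (adding independent random variables corresponds to convolving their distributions) can be checked exactly on a small discrete case. The sketch below uses two fair dice, an example chosen here for illustration and not taken from the thread: it builds the distribution of the sum once by convolving the two PMFs and once by enumerating all 36 outcomes.

```python
from collections import Counter
from itertools import product

def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# PMF of one fair die; list index k corresponds to face k + 1.
die = [1.0 / 6.0] * 6

# Distribution of the sum via convolution; index k corresponds to sum k + 2.
pmf_by_convolution = convolve(die, die)

# Distribution of the sum via brute-force enumeration of all outcomes.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf_by_enumeration = [counts[s] / 36.0 for s in range(2, 13)]

for s, (p, q) in enumerate(zip(pmf_by_convolution, pmf_by_enumeration), start=2):
    print(f"sum={s:2d}  convolution={p:.4f}  enumeration={q:.4f}")
```

Note what is convolved here: the two PMFs, not two lists of observed rolls. Adding the observed rolls themselves is ordinary addition.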

> What's
> the difference between adding "data" that is from two random variables
> and adding the random variables themselves? I say "nothing."

I have no problems with that. But what you say here has nothing to do 
with PDFs. The CLT as I know it, applies to PDFs, not random variables.

> Both Papoulis and Viniotis [1] state that in practice, adding about 30 
> variables results in a Gaussian distribution.

I am sure they are right. But I can't see what this has to do with 
*convolving* the random variables, which was what the OP did. The last 
time I checked, "addition" and "convolution" were two different 
operations.

Again, the CLT as I know it applies to PDFs. Please show me references 
to the CLT being extended to arbitrary non-positive functions.

> [1] Yannis Viniotis, "Probability and Random Processes for Electrical
> Engineers," McGraw-Hill, 1998.

Rune

Reply by Randy Yates ● November 5, 2004

allnor@tele.ntnu.no (Rune Allnor) writes:
> [...]
> The CLT as I know it, applies to PDFs, not random variables.

Do you mean that the convolution it speaks of is of the PDFs, not
the random variables themselves? I agree with that. However, to
say that the CLT does not apply to random variables is pretty
much false in my book - it's all about random variables. 

> [...] But I can't see what this has to do with *convolving* the
> random variables, which was what the OP did. 

I didn't read the original post clearly enough. It does appear that
is what he is saying. Of course that is a different operation than
adding the R.V.s.

> The last time I checked, "addition" and "convolution" were two
> different operations.

That comes across as extremely smart-assed, Rune. Maybe that's my
misinterpretation, but I thought I'd let you know. 

> Again, the CLT as I know it applies to PDFs. Please show me references 
> to the CLT being extended to arbitrary non-positive functions.

Again, as in the first paragraph above, the CLT does "apply" to random
variables. The proper thing to do is supply a version of the theorem
and show this from the language of the theorem, but I'm not motivated
enough at the moment.

To make a long story short, I thought the OP was talking about adding
the noise data. If that's not the operation being performed, then I
agree that the CLT doesn't apply.
-- 
%  Randy Yates                  % "Rollin' and riding and slippin' and
%% Fuquay-Varina, NC            %  sliding, it's magic."
%%% 919-577-9882                %  
%%%% <yates@ieee.org>           % 'Living' Thing', *A New World Record*, ELO
http://home.earthlink.net/~yatescr

Reply by Rune Allnor ● November 6, 2004

Randy Yates <yates@ieee.org> wrote in message news:<7jp0bmy9.fsf@ieee.org>...
> allnor@tele.ntnu.no (Rune Allnor) writes:
> > [...]
> > The CLT as I know it, applies to PDFs, not random variables.
> 
> Do you mean that the convolution it speaks of is of the PDFs, not
> the random variables themselves? 

That's exactly what I have been saying during this whole thread!

> I agree with that. 

Good.

> However, to
> say that the CLT does not apply to random variables is pretty
> much false in my book - it's all about random variables. 

OK, if we have to go nit-picking, here's my 2c: A "random process"
generates "random variables" (or "random data"), RVs,  that in some 
way are characterized by a "Probability Density Function", PDF.
In that sense, the RV and the PDF are interconnected in that both 
are associated with a random process.

The "CLT operator" takes multiple PDFs as input and produces one 
PDF as output. When I look at the inner workings of the CLT, I see 
PDFs, not RVs. I could have agreed with you if you said "it's all 
about random _processes_". You didn't.

> > [...] But I can't see what this has to do with *convolving* the
> > random variables, which was what the OP did. 
> 
> I didn't read the original post clearly enough. It does appear that
> is what he is saying. Of course that is a different operation than
> adding the R.V.s.

Good. We agree.

> > The last time I checked, "addition" and "convolution" were two
> > different operations.
> 
> That comes across as extremely smart-assed, Rune. Maybe that's my
> misinterpretation, but I thought I'd let you know. 

Too bad. Still, one ought to be aware that different words more often 
than not mean different things. More than that, it depends to a large 
extent on the context whether any particular word makes sense or not. 

My point is that mentioning the CLT only makes sense when studying 
PDFs. The OP tried to link the CLT directly to the random variable. 
A "PDF" and a "random variable" are, like it or not, two different 
things just as "addition" and "convolution" are two different things. 
If stating this makes me a smart-ass, well, so be it. Like it or not, 
but such subtle points are essential to this discussion.

> > Again, the CLT as I know it applies to PDFs. Please show me references 
> > to the CLT being extended to arbitrary non-positive functions.
> 
> Again, as in the first paragraph above, the CLT does "apply" to random
> variables. The proper thing to do is supply a version of the theorem
> and show this from the language of the theorem, but I'm not motivated
> enough at the moment.

I haven't seen that done, and based on the text I quoted a couple 
of posts ago, I doubt the CLT is as general as that. I would prefer 
to see a proof that the "CLT operator" works as well with RVs as it 
does with PDFs.

> To make a long story short, I thought the OP was talking about adding
> the noise data. If that's not the operation being performed, then I
> agree that the CLT doesn't apply.

Good. We agree.

Rune

Reply by Randy Yates ● November 6, 2004

allnor@tele.ntnu.no (Rune Allnor) writes:

> Randy Yates <yates@ieee.org> wrote in message news:<7jp0bmy9.fsf@ieee.org>...
> > allnor@tele.ntnu.no (Rune Allnor) writes:
> > > [...]
> > > The CLT as I know it, applies to PDFs, not random variables.
> > 
> > Do you mean that the convolution it speaks of is of the PDFs, not
> > the random variables themselves? 
> 
> That's exactly what I have been saying during this whole thread!

No, that's not "exactly" what you've been saying, and that is part
of my issue with you, Rune. You said, exactly, 

  The CLT as I know it, applies to PDFs, not random variables.

This statement does not mention convolution.

> > However, to
> > say that the CLT does not apply to random variables is pretty
> > much false in my book - it's all about random variables. 
> 
> OK, if we have to go nit-picking, 

If I'm nit-picking, then so is Papoulis. His section on the CLT
begins like this:

  Given n independent *RVs* x_i, we form their sum

    x = x_1 + ... + x_n

  This is an *RV* with mean ... and variance ... . ... Furthermore, if
  the *RVs* x_i are of continuous type, ... the density f(x) of x
  approaches a normal density ... . This important theorem ... .

[emphases mine]. He CLEARLY associates the CLT with RVs. 

Now it is true that he also goes on to say "The CLT can be expressed
as a property of convolutions ...", but it seems pretty clear that the
main interpretation and utility of the CLT is in association with RVs.
To divorce it from RVs and speak only of convolving "positive
functions," while theoretically accurate, robs it of its real value:
explaining why randomness in nature is often Gaussian.
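
The random-variable form of the theorem quoted above can be tried directly by simulation. The sketch below is a rough Monte Carlo check, not a proof: it assumes uniform terms on [0, 1), uses 30 terms per sum (the rule of thumb mentioned earlier in the thread), and picks the trial count and seed arbitrarily. It adds the draws and compares the spread of the sums with what a normal distribution would give.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed, arbitrary choice
N_TERMS = 30             # terms per sum
N_TRIALS = 20000         # number of independent sums (assumed, not from the thread)

# Each entry is the SUM of 30 independent uniform draws: addition, not convolution.
sums = [sum(rng.random() for _ in range(N_TERMS)) for _ in range(N_TRIALS)]

mean = statistics.mean(sums)
std = statistics.pstdev(sums)

# Theory for uniform [0, 1) terms: mean n/2 and variance n/12.
print(f"sample mean {mean:.3f}  (theory {N_TERMS / 2:.3f})")
print(f"sample std  {std:.3f}  (theory {math.sqrt(N_TERMS / 12):.3f})")

# For an exactly normal variable these fractions are about 0.683 and 0.954.
within_one_sigma = sum(abs(s - mean) <= std for s in sums) / N_TRIALS
within_two_sigma = sum(abs(s - mean) <= 2 * std for s in sums) / N_TRIALS
print(f"fraction within 1 sigma: {within_one_sigma:.3f}")
print(f"fraction within 2 sigma: {within_two_sigma:.3f}")
```

A histogram of `sums` would show the bell shape directly; the one-sigma and two-sigma fractions are a cheaper check that needs no plotting library.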

> here's my 2c: A "random process"
> generates "random variables" (or "random data"), RVs,  that in some 
> way are characterized by a "Probability Density Function", PDF.
> In that sense, the RV and the PDF are interconnected in that both 
> are associated with a random process.

Wow. Now that's rich, Rune. After two courses in Random Processes
and another two in basic probability theory, I've never heard anyone
condition the association of a RV and its PDF on an association
with a random process. I don't know where you've come up with that
idea, but it is completely unorthodox in my experience. 

> The "CLT operator" 

Huh? Since when was anyone talking about a "CLT operator"? You've just
now introduced new language. The topic of discussion thus far has been
about a theorem, the "Central Limit Theorem," NOT an operator!

> takes multiple PDFs as input and produces one 
> PDF as output. When I look at the inner workings of the CLT, I see 
> PDFs, not RVs. I could have agreed with you if you said "it's all 
> about random _processes_". You didn't.

No, I certainly did not, because the CLT (reverting to the terminology
that we've been using) at least as presented by Papoulis, is not about
a random process. It has NOTHING to do with random processes.

> My point is that mentioning the CLT only makes sense when studying 
> PDFs. 

I heartily disagree, for the reasons I've already explained above. 

> The OP tried to link the CLT directly to the random variable. 

As well he should. The only problem is, he apparently did so 
improperly (i.e., via convolution of the RVs rather than the
sum of the RVs).
-- 
Randy Yates
Sony Ericsson Mobile Communications
Research Triangle Park, NC, USA
randy.yates@sonyericsson.com, 919-472-1124

Reply by Rune Allnor ● November 8, 2004

Randy Yates <randy.yates@sonyericsson.com> wrote in message news:<xxpzn1u9xzh.fsf@usrts005.corpusers.net>...
> allnor@tele.ntnu.no (Rune Allnor) writes:
> 
> > Randy Yates <yates@ieee.org> wrote in message news:<7jp0bmy9.fsf@ieee.org>...
> > > allnor@tele.ntnu.no (Rune Allnor) writes:
> > > > [...]
> > > > The CLT as I know it, applies to PDFs, not random variables.
> > > 
> > > Do you mean that the convolution it speaks of is of the PDFs, not
> > > the random variables themselves? 
> > 
> > That's exactly what I have been saying during this whole thread!
> 
> No, that's not "exactly" what you've been saying, and that is part
> of my issue with you, Rune. You said, exactly, 
> 
>   The CLT as I know it, applies to PDFs, not random variables.
> 
> This statement does not mention convolution.

C'mon, Randy. If you read the whole thread (including your own posts),
you will find that the OP convolved the data in the first place.

In fact, you yourself wrote 

"Secondly, when you add independent random variables, the distribution
of the result is the convolution of the input variables' PDFs." 

in the post of November 4th (your first post in this thread). We agree
on the basic properties of the CLT; why do you make such a fuss about
disagreeing with me now?

> > > However, to
> > > say that the CLT does not apply to random variables is pretty
> > > much false in my book - it's all about random variables. 
> > 
> > OK, if we have to go nit-picking, 
> 
> If I'm nit-picking, then so is Papoulis. His section on the CLT
> begins like this:
> 
>   Given n independent *RVs* x_i, we form their sum
> 
>     x = x_1 + ... + x_n
> 
>   This is an *RV* with mean ... and variance ... . ... Furthermore, if
>   the *RVs* x_i are of continuous type, ... the density f(x) of x
>   approaches a normal density ... . This important theorem ... .
> 
> [emphases mine]. He CLEARLY associates the CLT with RVs. 

I have never contested that. But if you want the CLT to work and 
produce Gaussian distributions, you need to work on the PDFs.

> Now it is true that he also goes on to say "The CLT can be expressed
> as a property of convolutions ...", but it seems pretty clear that the
> main interpretation and utility of the CLT is in association with RVs.
> To divorce it from RVs and speak only of convolving "positive
> functions," while theoretically accurate,

Make up your mind. Do you agree that what is convolved to produce
results according to the CLT are PDFs, or do you not agree?

> robs it of its real value:
> explaining why randomness in nature is often Gaussian.

No. The CLT is an ad hoc excuse for the analyst to stay with the 
nice and easily tractable Gaussian distributions instead of diving
into the more tricky ones. The CLT does not "make a non-Gaussian
process Gaussian", it only provides some comfort in stating that one 
does not make a very big mistake if one chooses to work under the 
Gaussian hypothesis. 

> > here's my 2c: A "random process"
> > generates "random variables" (or "random data"), RVs,  that in some 
> > way are characterized by a "Probability Density Function", PDF.
> > In that sense, the RV and the PDF are interconnected in that both 
> > are associated with a random process.
> 
> Wow. Now that's rich, Rune. After two courses in Random Processes
> and another two in basic probability theory, I've never heard anyone
> condition the association of a RV and its PDF on an association
> with a random process. I don't know where you've come up with that
> idea, but it is completely unorthodox in my experience. 

Is a "random process" unorthodox to you? (OK, I should perhaps have used
the term "stochastic process", but I didn't want to go pedantic on you...)
Hey, Randy, this is a joke, right?

> > The "CLT operator" 
> 
> Huh? Since when was anyone talking about a "CLT operator"? You've just
> now introduced new language. The topic of discussion thus far has been
> about a theorem, the "Central Limit Theorem," NOT an operator!

I'm not introducing new language. If you take a course on linear systems
in maths, you'll find the term "operator" used all over the place, 
particularly in the context of convolution integrals.

If you express the CLT as a property of the expression 

   y_CLT = y_1 (*) y_2 (*) ... (*) y_N

where (*) means convolution and y_n are PDFs, the term "CLT operator"
makes perfect sense.
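
The chained expression above maps naturally onto a fold over a list of PMFs. The sketch below is only an illustration of the mechanics: the three input distributions (`coin`, `die`, `skewed`) are invented for the example and are deliberately not identical. With so few inputs the result is not yet close to a Gaussian; what the sketch does show is that the output is again a valid PMF and that, as for any sum of independent variables, the means and variances of the inputs add.

```python
from functools import reduce

def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def mean_var(pmf):
    """Mean and variance of a PMF defined on the indices 0, 1, 2, ..."""
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    return mean, var

# Hypothetical input PMFs y_1, y_2, y_3 (not taken from the thread).
coin = [0.5, 0.5]
die = [1.0 / 6.0] * 6
skewed = [0.7, 0.2, 0.1]
pmfs = [coin, die, skewed]

# y_CLT = y_1 (*) y_2 (*) ... (*) y_N, evaluated left to right.
y_clt = reduce(convolve, pmfs)

out_mean, out_var = mean_var(y_clt)
sum_of_means = sum(mean_var(p)[0] for p in pmfs)
sum_of_vars = sum(mean_var(p)[1] for p in pmfs)

print(f"output sums to {sum(y_clt):.6f}")
print(f"mean     {out_mean:.4f}  vs sum of input means     {sum_of_means:.4f}")
print(f"variance {out_var:.4f}  vs sum of input variances {sum_of_vars:.4f}")
```

Because convolution is associative and commutative, the order of the inputs in `pmfs` does not change `y_clt` beyond floating-point rounding.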

> > takes multiple PDFs as input and produces one 
> > PDF as output. When I look at the inner workings of the CLT, I see 
> > PDFs, not RVs. I could have agreed with you if you said "it's all 
> > about random _processes_". You didn't.
> 
> No, I certainly did not, because the CLT (reverting to the terminology
> that we've been using) at least as presented by Papoulis, is not about
> a random process. It has NOTHING to do with random processes.

Well, you may disagree with my approach to these matters and the 
exact way I interpret the problem and phrase my opinions. You should 
be very careful about how you state your objections, though. You might
find yourself in a position you cannot defend.

> > My point is that mentioning the CLT only makes sense when studying 
> > PDFs. 
> 
> I heartily disagree, for the reasons I've already explained above. 

Please, Randy, I know you don't mean this. Yes, the effect of adding 
several random variables is the reason why the CLT is interesting. 
Arguing *why* the CLT works, and *how*, requires studying 
stochastic processes and the convolution of their PDFs. Not the 
random variables. 

For the simple reason that given a random vector, you don't know 
anything about its PDF. You can make up an opinion, based on a 
histogram, but you don't know. The concept of a PDF only makes sense 
in the context of a stochastic process.

> > The OP tried to link the CLT directly to the random variable. 
> 
> As well he should. The only problem is, he apparently did so 
> improperly (i.e., via convolution of the RVs rather than the
> sum of the RVs).

The OP used random data (a single realization of a random variable) 
where a PDF should have been used. The exact nature of the PDF was 
never specified (not even an estimate through a histogram), and 
no histogram of the resulting data was used. The important difference
between a stochastic process generating random variables, and the 
random data as a realization of such a random variable, was never 
grasped. The question was phrased in a way that disagreed just enough 
with standard terminology to cause confusion (denoting the random 
variable by the symbol "f", which usually is reserved for PDFs). 

Apart from that, the OP did an excellent job in verifying the CLT.

Rune
