comp.dsp | Show me some more numbers| page 5

Reply by Cedron ●June 11, 2015

>
>There are many tricks to using computer state variables to reseed PRGs on
>the fly so any encryption based on them can be cracked.

That should be "can't be cracked".  Sorry.

Ced
---------------------------------------
Posted through http://www.DSPRelated.com

Reply by Eric Jacobsen ●June 11, 2015

On Thu, 11 Jun 2015 20:08:43 -0500, "Cedron" <103185@DSPRelated>
wrote:

>>
>>There are many tricks to using computer state variables to reseed PRGs on
>>the fly so any encryption based on them can be cracked.
>
>That should be "can't be cracked".  Sorry.
>
>Ced
>---------------------------------------
>Posted through http://www.DSPRelated.com

No, "can be cracked" was correct.    This is why there is a lot of
effort to build truly stochastic (i.e., natural, not deterministic or
reproducible) sources for seeds and similar parameters.   Even the
ring oscillators used in many applications are not considered
sufficient for others, so things like transistor/diode noise embedded
in the device are sometimes exploited, and even then a lot of specific
design considerations are made to assure the entropy  and distribution
characteristics of the results are sufficient, and the standards are
very, very high.

Intel has done a lot of work in this area since they embed a lot of
encryption and security hardware acceleration instructions these days.
Their new on-chip RNGs are very good.
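
The point about deterministic generators can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `random.Random` stands in for any deterministic PRG, the seed values are arbitrary, and `prg_stream` is a hypothetical helper name. It shows that the whole output stream follows from the seed, so the quality of the seed source is what matters.

```python
import random
import secrets

def prg_stream(seed, count=5):
    """Return `count` outputs of a deterministic PRG started from `seed`."""
    gen = random.Random(seed)
    return [gen.getrandbits(32) for _ in range(count)]

# Same seed, same stream: whoever recovers the seed recovers every value.
stream_a = prg_stream(12345)
stream_b = prg_stream(12345)
print(stream_a == stream_b)

# A seed taken from the operating system's entropy source is not derivable
# from program state, but the stream is still reproducible once it is known.
os_seed = secrets.randbits(128)
print(prg_stream(os_seed) == prg_stream(os_seed))
```

Reseeding on the fly from predictable state does not change this picture; only the unpredictability of the seed material does.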

Eric Jacobsen
Anchor Hill Communications
http://www.anchorhill.com

Reply by Cedron ●June 14, 2015

>Sorry, I don't have the program recoded yet for pattern reuse.
>
>Ced
>---------------------------------------
>Posted through http://www.DSPRelated.com

It's done now.

Here are some results using canned uncooked noise.  I went with 1,000 and
5,000 runs instead of the 10,000 and 50,000 because of the differences
between my 3 Bin Complex and Candan 2013.  Since these are mathematically
equivalent, the results should be identical, as they are with smaller run
sizes.  The 1,000 are the first part of the 5,000.  The program has been
adjusted to reanchor the three bin sets above 3.5.

You can clearly see that there appears to be a bias pattern in each
average column that repeats at the higher noise level and appears similar
in the higher run size set.  This is in contrast to the original set that
started this thread where there is no discernible pattern and no repeat of
values at the next level.  However, the standard deviation columns do show
the same pattern.  That is, for the three bin formulas the peak is in the
center, and for the two bin formula the minimum is at the center.  This is
explained by the SNR values of the bins.  Without that explanation, it
would not be clear from these data sets that it was due to the formula and
not the noise.

There is no way to tell by looking at the data if the apparent bias
pattern in the average column is due to the noise rather than the
formulas.  By using fresh noise for every row, no pattern emerges, so the
noise is the logical explanation for the variation from zero.
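
A toy model makes the "repeat at the higher noise level" easy to reproduce. The sketch below is not any of the formulas in the tables: `toy_error` is a hypothetical stand-in that is linear in the noise (the first-order picture), and the seed and weights are arbitrary. It builds one bank of 5,000 patterns, uses the first 1,000 for every frequency row, and compares two noise levels.

```python
import math
import random

SAMPLES = 10                                   # samples per pattern, as in the tables
FREQS = [3.0 + 0.1 * k for k in range(10)]     # the ten frequency rows

def make_bank(runs, seed):
    """Generate `runs` unit-variance Gaussian noise patterns once ("canned")."""
    gen = random.Random(seed)
    return [[gen.gauss(0.0, 1.0) for _ in range(SAMPLES)] for _ in range(runs)]

def toy_error(freq, noise, level):
    """Hypothetical first-order error: a frequency-dependent linear
    functional of the noise, scaled by the noise level."""
    total = sum(math.cos(2 * math.pi * freq * n / SAMPLES) * e
                for n, e in enumerate(noise))
    return level * total / SAMPLES

def row_averages(bank, level):
    """Average error per frequency row, every row facing the same patterns."""
    return [sum(toy_error(f, p, level) for p in bank) / len(bank) for f in FREQS]

bank = make_bank(5000, seed=1)
low = row_averages(bank[:1000], 0.01)     # first 1,000 patterns of the 5,000
high = row_averages(bank[:1000], 0.10)    # same patterns, ten times the level

for f, lo_avg, hi_avg in zip(FREQS, low, high):
    print(f"{f:.2f} {1000 * lo_avg:+.4f} {1000 * hi_avg:+.4f}")
```

In this linear model the averages at level 0.100 are ten times those at 0.010 for every row, because the same patterns are reused. With fresh noise per row and per level there would be no such relationship.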

Ced

=========================================

All values x1000    Sample Count = 10

Target Noise Level = 0.010  Run Count = 1000

Freq Dawg Real     Dawg 2 Bin    Dawg 3 Bin    Candan 2013
        avg    std    avg    std    avg    std    avg    std
---- ------------- ------------- ------------- -------------
3.00  0.019  1.568  0.063  2.241  0.017  1.658  0.017  1.658
3.10  0.003  1.671  0.045  1.858  0.005  1.702  0.005  1.702
3.20 -0.012  1.859  0.030  1.586 -0.006  1.815 -0.006  1.815
3.30 -0.024  2.133  0.018  1.404 -0.013  1.994 -0.013  1.994
3.40 -0.029  2.500  0.010  1.298 -0.013  2.236 -0.013  2.236
3.50 -0.026  2.982  0.009  1.266 -0.003  2.552 -0.004  2.552
3.60  0.053  1.601  0.012  1.310  0.097  2.168  0.097  2.168
3.70  0.049  1.527  0.022  1.434  0.090  1.971  0.090  1.971
3.80  0.050  1.542  0.037  1.642  0.082  1.843  0.082  1.843
3.90  0.055  1.642  0.057  1.937  0.073  1.771  0.074  1.771


Target Noise Level = 0.100  Run Count = 1000

Freq Dawg Real     Dawg 2 Bin    Dawg 3 Bin    Candan 2013
        avg    std    avg    std    avg    std    avg    std
---- ------------- ------------- ------------- -------------
3.00  0.209 15.673  0.634 22.400  0.161 16.565  0.161 16.565
3.10  0.064 16.676  0.471 18.538  0.051 16.987  0.048 16.987
3.20 -0.070 18.564  0.315 15.828 -0.052 18.137 -0.059 18.135
3.30 -0.164 21.344  0.187 14.035 -0.119 19.963 -0.131 19.961
3.40 -0.178 25.091  0.105 13.005 -0.120 22.437 -0.139 22.436
3.50 -0.067 30.002  0.081 12.687 -0.026 25.621 -0.059 25.618
3.60  0.584 16.049  0.120 13.115  0.965 21.747  0.985 21.746
3.70  0.548 15.274  0.222 14.354  0.895 19.728  0.907 19.727
3.80  0.568 15.415  0.382 16.456  0.819 18.417  0.825 18.417
3.90  0.638 16.445  0.585 19.469  0.733 17.689  0.736 17.688


Target Noise Level = 0.010  Run Count = 5000

Freq Dawg Real     Dawg 2 Bin    Dawg 3 Bin    Candan 2013
        avg    std    avg    std    avg    std    avg    std
---- ------------- ------------- ------------- -------------
3.00  0.011  1.598  0.012  2.267  0.025  1.683  0.025  1.683
3.10  0.006  1.691 -0.007  1.901  0.017  1.721  0.017  1.721
3.20  0.002  1.857 -0.021  1.631  0.009  1.819  0.009  1.819
3.30  0.002  2.105 -0.029  1.440  0.003  1.978  0.003  1.978
3.40  0.004  2.452 -0.031  1.319  0.000  2.207  0.000  2.207
3.50  0.011  2.923 -0.029  1.271  0.001  2.520  0.000  2.520
3.60 -0.010  1.597 -0.023  1.301  0.004  2.202  0.004  2.202
3.70 -0.002  1.494 -0.013  1.411  0.012  1.959  0.012  1.959
3.80  0.006  1.485  0.000  1.605  0.019  1.793  0.019  1.793
3.90  0.015  1.570  0.014  1.885  0.025  1.699  0.025  1.699


Target Noise Level = 0.100  Run Count = 5000

Freq Dawg Real     Dawg 2 Bin    Dawg 3 Bin    Candan 2013
        avg    std    avg    std    avg    std    avg    std
---- ------------- ------------- ------------- -------------
3.00  0.131 15.985  0.133 22.678  0.252 16.828  0.252 16.827
3.10  0.077 16.927 -0.074 19.030  0.161 17.225  0.158 17.224
3.20  0.048 18.611 -0.216 16.345  0.077 18.218  0.071 18.217
3.30  0.059 21.113 -0.298 14.422  0.018 19.816  0.007 19.815
3.40  0.129 24.603 -0.323 13.204 -0.004 22.104 -0.024 22.102
3.50  0.278 29.342 -0.298 12.717  0.015 25.220 -0.016 25.217
3.60 -0.037 16.002 -0.228 13.016  0.032 22.066  0.052 22.065
3.70  0.041 14.957 -0.124 14.132  0.124 19.626  0.135 19.625
3.80  0.136 14.866  0.007 16.076  0.202 17.948  0.208 17.947
3.90  0.242 15.717  0.152 18.895  0.254 16.981  0.257 16.981



Reply by Cedron ●June 14, 2015

>Cedron <103185@DSPRelated> wrote:
>
>>>>Centering and rescaling would help with that at the cost of
>>>>being less realistic.
>
>>>I think that's a pretty bad direction to go in.
>
>>I wouldn't call it good or bad.  What you are in essence doing is
>>shortcutting using a much larger runsize.  The purpose of a larger runsize
>>is to get the distributions closer to the ideal.
>
>But it's then no longer N(0,1) noise.  That to me is a very big deal.
>
[...snip...]
>
>Steve

I think your distaste for centering and rescaling the noise is unfounded. 
I have no problem with the term "cooked" to describe it.  If you think of
the situation in terms of the analytical equation:

WB(Z+E) / W(Z+E) = WBZ / WZ + (Misc terms)E + H.O.T.

(I used V by mistake last time.)

E represents the DFT of the noise.  By shifting the noise, only the DC bin
of E will be affected.  As long as your two or three bin set doesn't cover
the DC bin there will be no effect.  Rescaling the noise will rescale each
bin of E the same.  This will not alter the relative values.  If you are
fine with rescaling the noise for different noise levels, then you should
be comfortable with rescaling it to make the numbers more "well behaved"
in terms of magnitude, so the results reflect more consistent values.  By
doing so, the standard deviations become a little smaller, but if your
interest is building a model to quantify them, then rescaling them will
give you better results.  I am thinking particularly about the problem of
figuring out where the cutoff between the three and two bin formulas
is.
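
The two DFT facts used above can be checked numerically. This is a minimal sketch with an arbitrary seed and an arbitrary scale factor of 2.5; `dft` is a plain O(N^2) helper written for the illustration, not part of the program that produced the tables.

```python
import cmath
import random

def dft(x):
    """Plain O(N^2) DFT, adequate for a 10-sample illustration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

gen = random.Random(7)                          # arbitrary seed
noise = [gen.gauss(0.0, 1.0) for _ in range(10)]

mean = sum(noise) / len(noise)
centered = [v - mean for v in noise]            # shifting the noise to zero mean
scaled = [2.5 * v for v in noise]               # rescaling by one common factor

E = dft(noise)
E_centered = dft(centered)
E_scaled = dft(scaled)

# Bins whose value changed after centering (beyond rounding error).
changed = [k for k in range(10) if abs(E[k] - E_centered[k]) > 1e-9]
print(changed)

# Rescaling multiplies every bin by the same factor.
same_ratio = all(abs(E_scaled[k] - 2.5 * E[k]) < 1e-9 for k in range(10))
print(same_ratio)
```

The check covers a single pattern and a single common factor. It says nothing about how rescaling each pattern by its own factor changes the statistics of an ensemble.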

Ced


Reply by Steve Pope ●June 14, 2015

Cedron <103185@DSPRelated> wrote:

> [Pope wrote]

>>But it's then no longer N(0,1) noise.  That to me is a very big deal.

>I think your distaste for centering and rescaling the noise is unfounded. 
>I have no problem with the term "cooked" to describe it.  If you think of
>the situation in terms of the analytical equation:
>
>WB(Z+E) / W(Z+E) = WBZ / WZ + (Misc terms)E + H.O.T.
>
>(I used V by mistake last time.)
>
>E represents the DFT of the noise.  By shifting the noise, only the DC bin
>of E will be affected.  As long as your two or three bin set doesn't cover
>the DC bin there will be no effect.  

I agree so far

>Rescaling the noise will rescale each
>bin of E the same.  This will not alter the relative values.  If you are
>fine with rescaling the noise for different noise levels, then you should
>be comfortable with rescaling it to make the numbers more "well behaved"
>in terms of magnitude, so the results reflect more consistent values.  

That's completely wrong. Those noise patterns that happen
(due to the nature of the Gaussian distribution) to have large
individual components are the ones dominating the error rate;
and if you scale these back (because, you notice they are large) then
you are screwing with your error rate.

What could be more fundamental than applying AWGN to a signal,
and leaving it at that?

>doing so, the standard deviations become a little smaller, but if your
>interest is building a model to quantify them, then rescaling them will
>give you better results.  

Wronger results.


Steve

Reply by Steve Pope ●June 14, 2015

Cedron <103185@DSPRelated> wrote:

>Here are some results using canned uncooked noise.  

Thanks for running these.

>I went with 1,000 and
>5,000 runs instead of the 10,000 and 50,000 because of the differences in
>my 3 Bin Complex with Candan 2013.  Since these are mathematically the
>results should be identical as they are with smaller run sizes.  The 1,000
>are the first part of the 5,000.  The program has been adjusted to
>reanchor the three bin sets above 3.5.

>You can clearly see that there appears to be a bias pattern in each
>average column that repeats at the higher noise level and appears similar
>in the higher run size set.  This is in contrast to the original set that
>started this thread where there is no discernible pattern and no repeat of
>values at the next level.  

This is what I would expect

>There is no way to tell by looking at the data if the apparent bias
>pattern in the average column is due to the noise rather than the
>formulas.  

I am quoting a subset of your results:

>All values x1000    Sample Count = 10

>Target Noise Level = 0.010  Run Count = 1000
>
>Freq Dawg Real
>---- -------------
>3.00  0.019  1.568
>3.10  0.003  1.671
>3.20 -0.012  1.859
>3.30 -0.024  2.133
>3.40 -0.029  2.500
>3.50 -0.026  2.982
>3.60  0.053  1.601
>3.70  0.049  1.527
>3.80  0.050  1.542
>3.90  0.055  1.642

>Target Noise Level = 0.010  Run Count = 5000
>
>Freq Dawg Real     
>---- -------------
>3.00  0.011  1.598
>3.10  0.006  1.691
>3.20  0.002  1.857
>3.30  0.002  2.105
>3.40  0.004  2.452
>3.50  0.011  2.923
>3.60 -0.010  1.597
>3.70 -0.002  1.494
>3.80  0.006  1.485
>3.90  0.015  1.570

My conclusion looking at the above is that the simulation has not
converged.  That is, the run with the first 1,000 noise patterns
exhibits an apparent negative bias in the "average" column for frequencies 
3.2 through 3.5, but after running this out to 5,000 patterns total 
(including the first 1,000 as a subset) this apparent bias disappears, 
which means it was an artifact peculiar to the first 1,000 noise patterns 
and is not a real result portraying algorithm behavior.

You'd have to run it out past 5,000 to see if the averages in the run
of 5000 are accurate or are also an artifact.
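
One way to put a number on "converged" is to compare each average with its standard error, std / sqrt(runs). The sketch below does that for the 3.2 through 3.5 rows quoted above. It assumes the runs are independent and that the second column is the per-run standard deviation; the names `RUN_1000`, `RUN_5000`, and `z_scores` are made up for the illustration.

```python
import math

# (average, standard deviation) pairs copied from the "Dawg Real" column,
# noise level 0.010, all values x1000.
RUN_1000 = {3.2: (-0.012, 1.859), 3.3: (-0.024, 2.133),
            3.4: (-0.029, 2.500), 3.5: (-0.026, 2.982)}
RUN_5000 = {3.2: (0.002, 1.857), 3.3: (0.002, 2.105),
            3.4: (0.004, 2.452), 3.5: (0.011, 2.923)}

def standard_error(std, runs):
    """Standard error of a mean taken over `runs` independent runs."""
    return std / math.sqrt(runs)

def z_scores(table, runs):
    """Average divided by its standard error, per frequency."""
    return {f: avg / standard_error(std, runs) for f, (avg, std) in table.items()}

for runs, table in ((1000, RUN_1000), (5000, RUN_5000)):
    for freq, z in z_scores(table, runs).items():
        print(f"runs={runs} freq={freq:.1f} average/standard_error={z:+.2f}")
```

Under those assumptions every ratio is well below 1 in magnitude at both run counts, so none of these averages can be told apart from zero on the numbers alone.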

Do you have a different conclusion?

Steve

Reply by Cedron ●June 14, 2015

[...snip...]
>
>>Rescaling the noise will rescale each
>>bin of E the same.  This will not alter the relative values.  If you are
>>fine with rescaling the noise for different noise levels, then you should
>>be comfortable with rescaling it to make the numbers more "well behaved"
>>in terms of magnitude, so the results reflect more consistent values.
>
>That's completely wrong. Those noise patterns that happen
>(due to the nature of the Gaussian distribution) to have large
>individual components are the ones dominating the error rate;
>and if you scale these back (because, you notice they are large) then
>you are screwing with your error rate.
>

There is no error rate because there is no tolerance specification.  So it
is just a matter of tightening up the variance.  There are just as many
expected cases that need to be increased as those that need to be
decreased.  So by "cooking" the numbers you can expect values that will be
more similar to making a larger set of runs.  However, it is not a true
short cut in that the sizes of the E values are more likely to be varied
than if you ran longer runs.

>What could be more fundamental than applying AWGN to a signal,
>and leaving it at that?
>

That's one way of testing, it shouldn't exclude others.

>>doing so, the standard deviations become a little smaller, but if your
>>interest is building a model to quantify them, then rescaling them will
>>give you better results.  
>
>Wronger results.
>
>
>Steve

This isn't a matter of right and wrong.  When I said "better" I meant in
regard to getting numbers that more closely represent what an analytical
solution would provide, with fewer runs.  Unfortunately, upping the run
count has introduced a clear precision error.

As long as you understand the conditions of the test you shouldn't have
trouble interpreting the results.  And as we have previously agreed, the
most important aspect when doing a side by side comparison is that all the
formulas face the same test cases.

Ced

Reply by Cedron ●June 14, 2015

>Cedron <103185@DSPRelated> wrote:
>
>>Here are some results using canned uncooked noise.  
>
>Thanks for running these.
>

You're welcome.  It was worth it to have a clear set of numbers that
reflected the statements I have made.

[...snip...]
>
>I am quoting a subset of your results:
>
>>All values x1000    Sample Count = 10
>
>>Target Noise Level = 0.010  Run Count = 1000
>>
>>Freq Dawg Real
>>---- -------------
>>3.00  0.019  1.568
>>3.10  0.003  1.671
>>3.20 -0.012  1.859
>>3.30 -0.024  2.133
>>3.40 -0.029  2.500
>>3.50 -0.026  2.982
>>3.60  0.053  1.601
>>3.70  0.049  1.527
>>3.80  0.050  1.542
>>3.90  0.055  1.642
>
>>Target Noise Level = 0.010  Run Count = 5000
>>
>>Freq Dawg Real     
>>---- -------------
>>3.00  0.011  1.598
>>3.10  0.006  1.691
>>3.20  0.002  1.857
>>3.30  0.002  2.105
>>3.40  0.004  2.452
>>3.50  0.011  2.923
>>3.60 -0.010  1.597
>>3.70 -0.002  1.494
>>3.80  0.006  1.485
>>3.90  0.015  1.570
>
>My conclusion looking at the above is that the simulation has not
>converged.  That is, the run with the first 1,000 noise patterns
>exhibits an apparent negative bias in the "average" column for frequencies
>3.2 through 3.5, but after running this out to 5,000 patterns total
>(including the first 1,000 as a subset) this apparent bias disappears,
>which means it was an artifact peculiar to the first 1,000 noise patterns
>and is not a real result portraying algorithm behavior.
>
>You'd have to run it out past 5,000 to see if the averages in the run
>of 5000 are accurate or are also an artifact.
>
>Do you have a different conclusion?
>
>
>Steve

The other cases weren't as clear cut.  I still maintain that it is much
easier to discern what was caused by the noise and what was caused by the
formula by using fresh noise for each row.  In those cases, there was no
pattern in the averages, and the average values in the increased noise
case were not nearly proportional to the next noise level.  However, the
fact that the standard deviation columns did present a clear pattern that
survived across different noise cases means (most likely) that the pattern
was due to the formulas.

So, in conclusion, fresh noise for each row does a better job of
distinguishing what effects are due to the noise and which are due to the
formula.  I'm not saying your approach can't do it, it just doesn't do it
as well.

Ced

Reply by Steve Pope ●June 14, 2015

Cedron <103185@DSPRelated> wrote:

> Pope says,

>> Rescaling the noise will rescale each bin of E the same.
>> This will not alter the relative values.  If you are fine with
>> rescaling the noise for different noise levels, then you should
>> be comfortable with rescaling it to make the numbers more
>> "well behaved" in terms of magnitude, so the results reflect
>> more consistent values.

>>That's completely wrong. Those noise patterns that happen
>>(due to the nature of the Gaussian distribution) to have large
>>individual components are the ones dominating the error rate;
>>and if you scale these back (because, you notice they are large) then
>>you are screwing with your error rate.

>There is no error rate because there is no tolerance specification.  

Then replace "error rate" by "average error" (i.e., your first column
of data).

> So it is just a matter of tightening up the variance.  There
> are just as many expected cases that need to be increased as
> those that need to be decreased.  So by "cooking" the numbers
> you can expect values that will be more similar to making a
> larger set of runs.

I don't think so.

>>What could be more fundamental than applying AWGN to a signal,
>>and leaving it at that?

> That's one way of testing, it shouldn't exclude others.
> [..] This isn't a matter of right and wrong.  When I said "better" 
> I meant in regard to getting numbers that closer represent what an 
> analytical solution would provide with fewer runs.  

By rescaling each individual noise pattern to have the same sigma, you
are destroying your results.

The purpose of creating an ensemble of 1,000 (or 10,000, or as 
many as is necessary) noise patterns is so that you can see how the 
statistics of added white Gaussian noise affect system performance.  
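
What per-pattern rescaling does to an ensemble can be seen directly. The sketch below is an illustration only: the seed is arbitrary, `cook` is a hypothetical helper that centers a pattern and rescales it to unit sigma, and 10 samples by 1,000 runs mirrors the sizes quoted in this thread.

```python
import random
import statistics

SAMPLES, RUNS = 10, 1000
gen = random.Random(3)                          # arbitrary seed
patterns = [[gen.gauss(0.0, 1.0) for _ in range(SAMPLES)] for _ in range(RUNS)]

def cook(pattern):
    """Center a pattern and rescale it to unit (population) standard deviation."""
    mean = statistics.fmean(pattern)
    sigma = statistics.pstdev(pattern)
    return [(v - mean) / sigma for v in pattern]

raw_sigmas = [statistics.pstdev(p) for p in patterns]
cooked_sigmas = [statistics.pstdev(cook(p)) for p in patterns]

# Raw 10-sample patterns vary a good deal in their individual sigma;
# cooked patterns all have exactly the same sigma.
print(f"raw    min={min(raw_sigmas):.3f} max={max(raw_sigmas):.3f}")
print(f"cooked min={min(cooked_sigmas):.3f} max={max(cooked_sigmas):.3f}")
```

With only 10 samples per pattern the raw sigmas spread widely around 1, and cooking removes that spread entirely, including the unusually large patterns.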

As to whether doing this "excludes" other tests, of course it does
not, but if you want to discuss the performance of a system
in AWGN, and compare results with those of other investigations /
investigators, you really must not do this rescaling business.  

Steve

Reply by Cedron ●June 14, 2015

>
>Then replace "error rate" by "average error" (i.e., your first column
>of data).
>
>> So it is just a matter of tightening up the variance.  There
>> are just as many expected cases that need to be increased as
>> those that need to be decreased.  So by "cooking" the numbers
>> you can expect values that will be more similar to making a
>> larger set of runs.
>
>I don't think so.
>

By variance here I meant the variance of the average error values.  In
your own words, cooking the noise will make the average error values more
optimistic, meaning closer to zero.  This is what a larger set of runs is
expected to do as well.

[...snip...]
>
>By rescaling each individual noise pattern to have the same sigma, you
>are destroying your results.
>

"Modifying" is not "destroying".

>The purpose of creating an ensemble of 1,000 (or 10,000, or as 
>many as is necessary) noise patterns is so that you can see how the 
>statistics of added white Gaussian noise affect system performance.  
>

Unfortunately, the precision issue messes with this.

>As to whether doing this "excludes" other tests, of course it does
>not, but if you want to discuss the performance of a system
>in AWGN, and compare results with those of other investigations /
>investigators, you really must not do this rescaling business.  
>
>Steve

Well, I never claimed AWGN; all I claimed was "near Gaussian".  My purpose
was to back my assertion that my formula would react to noise in a similar
manner to Jacobsen's estimator because his estimator is an approximation
of my formula.  I also said quite clearly in the Matlab Beginner thread
that I was not the best person to do standard testing.  My tests did back
my assertions, and they seemed to prompt both Julien and Jacobsen to run
independent tests.  I am more impressed by Julien's tests because he
tested all the formulas against both real and complex signals whereas
Jacobsen, though he mentions my formula was derived for real signals,
tested only against complex signals.  I am still surprised and pleased at
how well it does in the complex signal case.

The bottom line is that my formula is a significant advance at the
theoretical level and quite the contender at the pragmatic level.  Martin
Vicanek's approach still has the possibility of bettering it in noisy
cases and I am working on an improvement which is much more calculation
intensive, but may offer better results even yet.

Stay tuned.

Ced