
Show me some more numbers

Started by Cedron June 4, 2015
>On Saturday, June 6, 2015 at 12:05:21 PM UTC-7, Cedron wrote:
>
>...
>
>For the same signal duration, increasing the sample frequency and
>transform size will separate the two components of the real signal and
>reduce the "self interference" of a real signal with a complex estimation
>algorithm.

No it won't.  Bin 3 is still the same distance from bin N-3.  To separate
them you need to increase the sampling interval, i.e. the frame size.  For
a real signal you should use a real signal formula, not a complex signal
formula; this is the point I've been trying to make.
>Windowing can accomplish the same result. Note that the most accurate
>estimators will be designed for the window applied.

Kind of stupid if they aren't.  Likewise, real signal formulas should be
used with real signals and complex signal formulas should be used with
complex signals.
>A transform size of 10 is rarely used in instrumentation. Values in the
>hundreds are common. Your choice makes a real signal an unrealistic error
>source for people who might mistakenly apply your tables to how things
>work in the real world.

I never claimed my tests represented "the real world".  You're the one who
seems to be application fixated.  I've claimed a theoretical advance.  The
fact that it whups butt is just icing on the cake.  I used a small sample
size to demonstrate the superiority of an exact equation over an
approximation that is bin number dependent, e.g. Jacobsen's and Candan's
2011.  This notion of distinguishing "exact" from "very precise" still
seems difficult for some to grasp.
>> In either case, the unwindowed results of a real signal formula is going
>> to be better than the windowed or not version of a complex signal
>> formula.
>
>This statement does not take into account the general cases where
>transform sizes are chosen based on signal characteristics beyond just
>frequency, where there is interference as well as noise, windows are
>chosen to cope with interference and algorithms are designed to match the
>windows. You are still a long ways from the real world. You persist in
>making strong statements about regions you have not yet explored. When
>that leads to false claims, don't be surprised if people talk.

That statement will be true no matter what your parameters are.  I'm a
space cowboy, what do you think about that?  Please cite any false claim
I've made.  So far they have all been borne out.  Shall I list them?
>Crap is claiming to implement an algorithm as published and then posting
>the results of something else.

I have done no such thing.  I've explained completely how I ran the tests.
Why aren't you jumping on Jacobsen's case for doing a comparison of my
real valued signal formula against complex valued signal formulas with
complex valued signals?  It still did quite well, didn't it?
>Please make your tables correct for everyone.

"Correct" according to what standard?
>> I am evaluating Martin Vicanek's formula from the link he provided in
>> this post.  It looks good so far.  It's a mighty fine day outside
>> weatherwise so this won't get done till this evening at the earliest.
>>
>> I will do as you wish, would you prefer 2.5 to 3.5 or 3.0 to 3.9?

Looks like tomorrow.  It's late already.
>It doesn't matter which. It does matter that it is done consistently with
>your description.

Once again, I described what I did in detail.  I have provided the
formulas; if you don't like how I tested them, test them yourself, just
like Jacobsen and Julien.
>Dale B. Dalrymple

Ced
>Cedron <103185@DSPRelated> wrote:
>
>> Pope wrote
>
>>>dbd <d.dalrymple@sbcglobal.net> wrote:
[...snip...]
>>Alright, I misinterpreted what Dale said.  You want me to use the same
>>set of noise additions for every frequency for every noise level.
>
>Well... were we working on a project together, that is what I
>would want.  :)
Well, if we were, I would still argue against it. In the complex signal set with the complex formulas that I started this thread with, it is pretty clear that all the formulas are unbiased. As noise is added the average is no longer zero, but close to zero. When measured in standard deviations, the average is very small. Therefore the non-zero values can be interpreted as being an artefact of an imperfect noise model. Since a fresh set of noise is applied for every row there is no apparent pattern in the average values. On the other hand, if the same noise set was used for every row, and amplified for every noise level, the biases would appear to be one sided and form a pattern. When you jumped to the next noise level, the same pattern would appear multiplied by ten. It would be impossible to distinguish whether the apparent bias was caused by the noise or was due to the formula.
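A minimal sketch of the two noise policies (the trivial stand-in "estimator" below is made up purely for illustration and is not one of the formulas under test):

    import numpy as np

    rng = np.random.default_rng(0)
    N = 100                      # samples per record (arbitrary for the sketch)
    noise_levels = [0.01, 0.1]
    runs = 1000

    def stand_in_estimator(x):
        # Made-up stand-in: returns the record mean, which is zero for the
        # noiseless "signal" below, so its output is purely a noise artifact.
        return float(np.mean(x))

    signal = np.zeros(N)

    # Policy A: fresh noise drawn for every row and every level.
    for level in noise_levels:
        errs = [stand_in_estimator(signal + level * rng.standard_normal(N))
                for _ in range(runs)]
        print("fresh noise,  level", level, "average error", np.mean(errs))

    # Policy B: one canned set of patterns, rescaled for each level.  The
    # per-row errors are identical up to the scale factor, so any apparent
    # "bias" pattern repeats, multiplied by ten, at the next level.
    canned = rng.standard_normal((runs, N))
    for level in noise_levels:
        errs = [stand_in_estimator(signal + level * canned[k]) for k in range(runs)]
        print("canned noise, level", level, "average error", np.mean(errs))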
>>I'm not sure it's that important.  I'm doing 10,000 runs per row, so the
>>results should be pretty close to the theoretical statistical value.
>>Multiple runs confirm this as the numbers don't vary much.
>
>If you run it until it converges and all you are interested in is
>a statistically accurate result then you're fine.  See below
>for other possible cases of interest.
In the complex case, because the averages are basically random numbers, it is the standard deviations column that tells the interesting story. With the real signal, the complex formulas (including mine) exhibit a clear bias. The average values reflect this bias with very small variation at all noise levels.
>>Standard deviation and RMS are the same when your average is zero.  So the
>>proportionality between the noise levels and the resulting standard
>>deviations makes perfect sense.  There are enough runs to see that this
>>relationship is true even if the values don't jump exactly by a factor of
>>10 between each noise level.
>
>>The formulas are all working from the same signals so the side by side
>>comparisons are still valid.
>
>A runsize of 10,000 might be, in most situations, good for finding
>the SNR operating point at which an algorithm has a 0.1% error rate,
>that is, an estimate of the SNR at which the algorithm fails to make an
>adequate estimate 0.1% of the time.
Now you are talking about a different type of testing criteria. When I was in college, as a side job I computerized the SPC (Statistical Process Control) for an automobile supplier factory. What you are now talking about is the equivalent of the difference between control lines and tolerance lines.
> >(Where "adequate" means the downstream system using the estimate >functions as opposed to does not function). Now, 0.1% would be a >reasonable spec for the marginal contribution of a frequency estimator >to an overall 1% packet-error rate spec in a receiver (this is a typical
>spec for a wireless device). > >With the runsize of 10,000 having a 0.1% error rate, you have 10 errors >out of 10,000 simulated estimates in the run, and you can state with
some
>confidence that the SNR at which this occurs is your operating point >per the above spec. > >(I hope you are still following me, I know I ramble sometimes.) >
Ramble on, sing your song. You've been helpful so far.
>Now suppose in addition to testing whether your algorithm meets
>spec, you are comparing two algorithms.  Suppose the Cedron
>algorithm, at a single SNR near the operating point, exhibited
>10 errors out of 10,000, and a second proposed algorithm exhibited
>12 errors.
>
>Suppose for the sake of argument that the 9988 datapoints for which
>the second algorithm obtained the correct answer, the Cedron algorithm
>also got the correct answer. [*]
>
>So we now have 12 datapoints at which Cedron got all 12 correct,
>but the second proposed algorithm only got 10 out of 12 correct.
>Can we assert that Cedron outperforms the second algorithm?
>Well, if both algorithms were presented with identical noise
>patterns, you can assert this.  Whereas if the two algorithms
>were presented with different noise patterns, the assertion is much
>weaker -- the 12 vs. 10 successes could just be a random effect of
>different noise patterns.
Your numbers aren't adding up, but I get the point. Correct or not correct means meeting a tolerance, or spec as you call it. I have not introduced a tolerance, all I'm doing is measuring error distributions.
>Furthermore, had you used identical noise, you could analyze in detail
>the two events in which Cedron succeeded and the second algorithm failed,
>and perhaps obtain more insight into why Cedron is better.
>With non-identical noise, you can't even perform this analysis.
The formulas do use the identical noise for each run. So looking across any row is a fair comparison. It is the standard deviation that tells you which one is going to have the most errors closer to zero (for the unbiased estimators).
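A rough sketch of that row-wise pairing, with two toy interpolators standing in for the real formulas (assumed names, not the actual implementations from this thread): both see exactly the same noisy record on every run, so any difference within a row comes from the formulas and not from the noise draw.

    import numpy as np

    rng = np.random.default_rng(1)
    N, runs, f_true, noise = 100, 1000, 3.2, 0.1
    n = np.arange(N)

    def toy_parabolic(mag):
        # Toy: parabolic interpolation of the magnitude peak.
        k = int(np.argmax(mag[:N // 2]))
        a, b, c = mag[k - 1], mag[k], mag[k + 1]
        return k + 0.5 * (a - c) / (a - 2 * b + c)

    def toy_nearest(mag):
        # Toy: nearest-bin estimate, no interpolation.
        return float(np.argmax(mag[:N // 2]))

    errors = {"parabolic": [], "nearest": []}
    for _ in range(runs):
        x = np.cos(2 * np.pi * f_true * n / N) + noise * rng.standard_normal(N)
        mag = np.abs(np.fft.fft(x))          # the identical record feeds both
        errors["parabolic"].append(toy_parabolic(mag) - f_true)
        errors["nearest"].append(toy_nearest(mag) - f_true)

    for name, e in errors.items():
        print(name, "average", np.mean(e), "std dev", np.std(e))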
>So, in my view, for the purposes of asserting the type of things you're
>trying to assert, you would be on much much firmer ground applying the
>same noise patterns to competing algorithms.
The competing algorithms do get the same noise patterns.
>Applying the same noise pattern to every SNR, for similar reasons, gets
>you an accurate curve faster.  The curve will more closely intersect your
>actual operating point with a shorter runsize.
This I disagree with as I argued above. The program runs in just a few seconds so cutting runsize is not that important for that reason. I am wondering, with the differences between my 3 bin complex formula and Candan's 2013 whether there is a precision problem with large runsizes like I found with large sample counts in the "Show me the numbers" thread.
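One standard precaution against that kind of precision loss, if the statistics are being accumulated as a running sum and sum of squares, is Welford's update.  This is a general sketch, not a diagnosis of the particular program used here:

    import random

    def running_stats(values):
        # Welford's single-pass mean/variance update, numerically stable for
        # very long runs compared with naive sum / sum-of-squares accumulation.
        count, mean, m2 = 0, 0.0, 0.0
        for x in values:
            count += 1
            delta = x - mean
            mean += delta / count
            m2 += delta * (x - mean)      # uses the updated mean
        variance = m2 / count if count else float("nan")
        return mean, variance ** 0.5

    # Example: many small errors with a tiny offset.
    random.seed(0)
    data = [1e-6 + 1e-3 * random.gauss(0.0, 1.0) for _ in range(100_000)]
    print(running_stats(data))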
>Steve
>
>[*] This "sake of argument" assumption is not true in general,
>but for those cases similar arguments can be constructed.
Thanks for your input.

Ced
Cedron <103185@DSPRelated> wrote:

>Pope wrote,
>>Cedron <103185@DSPRelated> wrote:
>>> Alright, I misinterpreted what Dale said.  You want me to use the
>>> same set of noise additions for every frequency for every noise level.
>>
>>Well... were we working on a project together, that is what I
>>would want.  :)
>Well, if we were, I would still argue against it.  In the complex signal
>set with the complex formulas that I started this thread with, it is
>pretty clear that all the formulas are unbiased.  As noise is added the
>average is no longer zero, but close to zero.  When measured in standard
>deviations, the average is very small.  Therefore the non-zero values can
>be interpreted as being an artefact of an imperfect noise model.  Since a
>fresh set of noise is applied for every row there is no apparent pattern
>in the average values.  On the other hand, if the same noise set was used
>for every row, and amplified for every noise level, the biases would
>appear to be one sided and form a pattern.  When you jumped to the next
>noise level, the same pattern would appear multiplied by ten.  It would be
>impossible to distinguish whether the apparent bias was caused by the
>noise or was due to the formula.
There's a very basic premise in science, that within practicability
you want to change as few variables in each experiment as possible.

Here you are arguing that you gain insight into the behavior of
the algorithm by randomly changing one variable (the noise pattern)
when you change another variable (the noise level).

Part of me is saying, "that can't possibly be true".

Over a runsize of 10,000, if you generate 10,000 noise patterns,
and apply each of these across algorithms, across SNR's, and across
frequencies being tested, then you are not changing more variables than
you have to.  So it is in a sense the first experiment you should run.

If this basic experiment does not expose some features you believe are
there, then design a more targeted experiment for that ... whereas simply
saying that randomizing the noise pattern over SNR gives you a better
ensemble averaging (if that is a valid restatement of what you are
saying), and using this thought to skip over doing the more basic
experiment, does not seem like the best experimental sequence.

But, there are always a host of issues when designing simulation
experiments.  So what I am saying is not a platitude or anything,
just a perspective.
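A sketch of the layout being described, with a hypothetical stand-in estimator (not one of the formulas from the thread): the noise patterns are generated once and reused for every frequency and noise level, and in a fuller version every algorithm, so the noise realization is not an extra variable changing between cells.

    import numpy as np

    rng = np.random.default_rng(2)
    N, runs = 100, 1000
    freqs = [3.0, 3.1, 3.2, 3.3, 3.4]
    noise_levels = [0.01, 0.1]
    patterns = rng.standard_normal((runs, N))   # generated once, reused everywhere
    n = np.arange(N)

    def stand_in_estimator(x):
        # Hypothetical stand-in: parabolic interpolation of the magnitude peak.
        mag = np.abs(np.fft.fft(x))[:N // 2]
        k = int(np.argmax(mag))
        return k + 0.5 * (mag[k - 1] - mag[k + 1]) / (mag[k - 1] - 2 * mag[k] + mag[k + 1])

    for f in freqs:
        clean = np.cos(2 * np.pi * f * n / N)
        for level in noise_levels:
            errs = [stand_in_estimator(clean + level * patterns[r]) - f
                    for r in range(runs)]
            print("f=%.2f  noise=%.2f  mean=%+.5f  std=%.5f"
                  % (f, level, np.mean(errs), np.std(errs)))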
>>A runsize of 10,000 might be, in most situations, good for finding
>>the SNR operating point at which an algorithm has a 0.1% error rate,
>>that is, an estimate of the SNR at which the algorithm fails to make an
>>adequate estimate 0.1% of the time.
>
>Now you are talking about a different type of testing criteria.
Yes, but I am doing it for a reason.
> When I was in college, as a side job I computerized the
> SPC (Statistical Process Control) for an automobile supplier
> factory.  What you are now talking about is the equivalent of
> the difference between control lines and tolerance lines.
I do not have that perspective, but thanks -- I have not encountered
this terminology.

The difference I see is between estimating a parameter (frequency)
and estimating one of a discrete set of outcomes (for example, whether
an estimate of frequency succeeds, or does not succeed, at obtaining
some higher-level outcome).

The reason I am introducing this is that often, a parameter estimation
is really just a proxy for an (eventual) more important discrete
estimation of system outcome.  And the perspective I am trying to
convey is that while the runsize value of 10,000 might seem enough to
obtain an accurate parameter estimate, when viewed from the discrete
point of view there might actually be a fairly low number of these
failure events, such as 10 or 12 in my previous example.

(However, you have since clarified that you're applying the same noise
pattern to the algorithms being compared, so that mostly obviates
what I was getting at with that particular example.)
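A small sketch of that discrete view: pick a tolerance (the value below is arbitrary, chosen only for illustration), count how often the error exceeds it, and note how uncertain the rate itself is when only a handful of failure events occur.

    import math, random

    random.seed(3)
    runs, tolerance = 10_000, 0.03         # tolerance chosen only for illustration
    errors = [random.gauss(0.0, 0.01) for _ in range(runs)]   # stand-in error samples

    failures = sum(1 for e in errors if abs(e) > tolerance)
    rate = failures / runs
    # With few failure events, the estimated rate is itself quite uncertain;
    # a rough 95% binomial interval makes that explicit.
    half_width = 1.96 * math.sqrt(rate * (1.0 - rate) / runs)
    print("%d failures / %d runs -> %.3f%% +/- %.3f%%"
          % (failures, runs, 100 * rate, 100 * half_width))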
>Ramble on, sing your song. You've been helpful so far.
Thanks
>The formulas do use the identical noise for each run.  So looking across
>any row is a fair comparison.  It is the standard deviation that tells you
>which one is going to have the most errors closer to zero (for the
>unbiased estimators).
>[...]
>The competing algorithms do get the same noise patterns.
Great
>> Applying the same noise pattern to every SNR, for similar reasons, gets
>> you an accurate curve faster.  The curve will more closely intersect
>> your actual operating point with a shorter runsize.
>This I disagree with as I argued above.  The program runs in just a few
>seconds so cutting runsize is not that important for that reason.
Well, okay, but it really is true that in most cases the curve of
a parameter value vs. SNR converges faster when you use the same
noise pattern across SNR's, and this reduces the risk (other
things being equal) of having a partially-converged curve that
may contain inflections that are not present in reality.
For some types of behaviors the difference in required runsize
is a factor of 10 or more.
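The mechanism is easy to demonstrate in isolation: the spread of a difference between two conditions shrinks when both conditions share the same noise draws (common random numbers).  A toy sketch, with the "estimate" reduced to a record mean purely for illustration:

    import numpy as np

    rng = np.random.default_rng(5)
    runs, N = 2000, 100
    estimate = lambda x: float(x.mean())      # toy estimate, illustration only

    # Two noise levels sharing the same underlying patterns.
    shared = rng.standard_normal((runs, N))
    diff_shared = [estimate(0.2 * shared[r]) - estimate(0.1 * shared[r])
                   for r in range(runs)]

    # The same two levels with independent patterns at each level.
    indep_a = rng.standard_normal((runs, N))
    indep_b = rng.standard_normal((runs, N))
    diff_indep = [estimate(0.2 * indep_a[r]) - estimate(0.1 * indep_b[r])
                  for r in range(runs)]

    print("std of difference, shared noise:     ", np.std(diff_shared))
    print("std of difference, independent noise:", np.std(diff_indep))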
>I am wondering, with the differences between my 3 bin complex formula and
>Candan's 2013 whether there is a precision problem with large runsizes
>like I found with large sample counts in the "Show me the numbers"
>thread.
That is always something to be on the lookout for.

Steve

Thanks

Steve
On 2015-06-07 07:27:05 +0000, Steve Pope said:

> [...snip...]
>
> There's a very basic premise in science, that within practicability
> you want to change as few variables in each experiment as possible.
You might want to take a look at the topic called "Design Of Experiments"
in Statistics and/or Industrial Engineering, where you want to control what
you change. There is a special case of One at a Time designs which are
based on Gray codes.
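For the Gray-code flavor of one-at-a-time designs, the standard binary-reflected construction already gives the key property: listing the 2^k factor combinations in Gray-code order means consecutive runs differ in exactly one factor.  A small sketch:

    def gray_order(k):
        # Binary-reflected Gray code: consecutive settings differ in one factor.
        for i in range(2 ** k):
            g = i ^ (i >> 1)
            yield tuple((g >> bit) & 1 for bit in range(k))

    for run, settings in enumerate(gray_order(3)):
        print(run, settings)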
[...snip...]
>There's a very basic premise in science, that within practicability
>you want to change as few variables in each experiment as possible.
Controls, blind, double-blind, etc. etc.
>Here you are arguing that you gain insight into the behavior of
>the algorithm by randomly changing one variable (the noise pattern)
>when you change another variable (the noise level).
>
>Part of me is saying, "that can't possibly be true".
Think of it as knowns and unknowns.  The premise of your statement is
that the noise is known and the behavior of the formula is unknown.
Therefore, when you study the combined outcome you learn something about
the formula.

But the noise is not known beyond its expected average and its expected
RMS.  If I would want to use "canned" noise patterns for all rows and all
noise levels, I would at least want to recenter and rescale them so their
average was known to be zero and their RMS known to be my target value.
Even so, the particular distribution of values may present as some pattern
in the data that would be indistinguishable from a pattern formed by the
formulas.
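The recentering and rescaling described here is a small operation; a sketch:

    import numpy as np

    def normalize_noise(pattern, target_rms):
        # Force a canned noise record to exactly zero mean and the target RMS.
        p = np.asarray(pattern, dtype=float)
        p = p - p.mean()
        rms = np.sqrt(np.mean(p * p))
        return p * (target_rms / rms)

    rng = np.random.default_rng(6)
    canned = rng.standard_normal(100)
    scaled = normalize_noise(canned, 0.01)
    print(scaled.mean(), np.sqrt(np.mean(scaled ** 2)))   # ~0.0 and 0.01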
>Over a runsize of 10,000, if you generate 10,000 noise patterns,
>and apply each of these across algorithms, across SNR's, and across
>frequencies being tested, then you are not changing more variables than
>you have to.  So it is in a sense the first experiment you should run.
Depends on what you are trying to show.  For the rough comparison I am
making, I think how I am doing it is sufficient.  For fine-grained
estimation of the relationship between noise level and standard deviation,
your approach is probably better.
>If this basic experiment does not expose some features you believe are
>there, then design a more targeted experiment for that ... whereas simply
>saying that randomizing the noise pattern over SNR gives you a better
>ensemble averaging (if that is a valid restatement of what you are
>saying), and using this thought to skip over doing the more basic
>experiment, does not seem like the best experimental sequence.
If you did use canned noise values like you suggest, you would want to
produce more than one set of results so that any patterns discerned could
be attributed to either the noise or the formulas.  You could still be
fooled.
>But, there are always a host of issues when designing simulation
>experiments.  So what I am saying is not a platitude or anything,
>just a perspective.
Yeppers, and I appreciate your constructive criticism.

[...snip...]
>The reason I am introducing this is that often, a parameter estimation
>is really just a proxy for an (eventual) more important discrete
>estimation of system outcome.  And the perspective I am trying to
>convey is that while the runsize value of 10,000 might seem enough to
>obtain an accurate parameter estimate, when viewed from the discrete
>point of view there might actually be a fairly low number of these
>failure events, such as 10 or 12 in my previous example.
The idea behind measuring the standard deviation and assuming your
distribution is Gaussian is that you can get a good estimate of how many
samples you can expect outside any threshold by using the area under a
bell curve.

In SPC, which is based on the notion that the production process is not
exact and that any errors are an accumulation of many tiny independent
errors, a Gaussian distribution is used as the model.  The control lines
are set at a specific number of standard deviations.  If the samples start
straying close to or over these lines, the process is said to be out of
control.  This is entirely independent of the tolerance lines, which are
the manufacturer's specification.  Your control lines should fall well
within your tolerance lines.  If they don't, you are doomed from the
start.
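Under the Gaussian model, the expected fraction of runs falling outside a line placed at k standard deviations follows directly from the normal tail area; a short sketch:

    import math

    def fraction_outside(k_sigma):
        # Two-sided tail probability of a standard normal beyond +/- k_sigma.
        return math.erfc(k_sigma / math.sqrt(2.0))

    for k in (1, 2, 3, 4):
        frac = fraction_outside(k)
        print("beyond %d sigma: %.6f  (about %.1f of 10,000 runs)"
              % (k, frac, frac * 10_000))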
[...snip...]

>Thanks
>
>Steve
Thank you,

Ced
[...snip...]
>>I am wondering, with the differences between my 3 bin complex formula and
>>Candan's 2013 whether there is a precision problem with large runsizes
>>like I found with large sample counts in the "Show me the numbers"
>>thread.
>
>That is always something to be on the lookout for.
>
>Steve
>
>Thanks
>
>Steve
I have prepared another run of the complex signal comparison like the
first set in this thread.  I increased the frequency granularity and kept
only the two noisier levels.  The sample and bin count has been increased
to 100, and the runsize reduced to 1,000.  The data is now a little
bumpier, but the trends are still clear.  All the differences in the
displayed values between my 3 bin complex formula and Candan 2013 are
gone.

Ced

All values x1000
(Each column pair: average error, standard deviation.)

Target Noise Level = 0.010

Freq  Dawg Real     Dawg 2 Bin    Dawg 3 Bin    Candan 2013
----  ------------  ------------  ------------  ------------
3.00   0.005 0.527   0.013 0.733   0.004 0.523   0.004 0.523
3.02   0.011 0.509   0.019 0.683   0.009 0.508   0.009 0.508
3.04   0.018 0.528   0.015 0.690   0.019 0.529   0.019 0.529
3.06  -0.008 0.502   0.002 0.632  -0.010 0.514  -0.010 0.514
3.08  -0.032 0.509  -0.048 0.640  -0.029 0.521  -0.029 0.521
3.10   0.011 0.509  -0.010 0.603   0.016 0.529   0.016 0.529
3.12  -0.021 0.505  -0.023 0.578  -0.021 0.531  -0.021 0.531
3.14   0.004 0.503   0.013 0.556   0.001 0.535   0.001 0.535
3.16  -0.008 0.488  -0.005 0.549  -0.009 0.518  -0.009 0.518
3.18   0.016 0.520   0.012 0.531   0.017 0.558   0.017 0.558
3.20   0.015 0.522   0.024 0.518   0.013 0.562   0.013 0.562
3.22  -0.008 0.496  -0.000 0.477  -0.010 0.546  -0.010 0.546
3.24  -0.010 0.530  -0.019 0.482  -0.008 0.585  -0.008 0.585
3.26  -0.013 0.540  -0.007 0.473  -0.014 0.597  -0.014 0.597
3.28   0.004 0.520  -0.002 0.467   0.007 0.574   0.007 0.574
3.30  -0.012 0.551  -0.003 0.460  -0.015 0.617  -0.015 0.617
3.32  -0.006 0.570  -0.020 0.453  -0.003 0.640  -0.003 0.640
3.34  -0.011 0.567   0.000 0.422  -0.013 0.643  -0.013 0.643
3.36   0.015 0.559   0.002 0.423   0.017 0.638   0.017 0.638
3.38   0.000 0.584   0.011 0.427  -0.002 0.667  -0.002 0.667
3.40  -0.037 0.604   0.000 0.419  -0.045 0.691  -0.045 0.691
3.42  -0.007 0.598   0.017 0.398  -0.013 0.692  -0.013 0.692
3.44  -0.020 0.622  -0.023 0.398  -0.021 0.723  -0.021 0.723
3.46  -0.005 0.631  -0.007 0.391  -0.005 0.737  -0.005 0.737
3.48   0.005 0.660   0.010 0.422   0.006 0.768   0.006 0.768

Target Noise Level = 0.100

Freq  Dawg Real     Dawg 2 Bin    Dawg 3 Bin    Candan 2013
----  ------------  ------------  ------------  ------------
3.00   0.048 5.213  -0.175 7.293   0.093 5.143   0.093 5.143
3.02   0.167 5.343   0.323 7.295   0.136 5.280   0.136 5.280
3.04  -0.100 5.023  -0.118 6.653  -0.084 5.083  -0.084 5.083
3.06  -0.077 5.073  -0.139 6.197  -0.058 5.202  -0.058 5.202
3.08   0.134 5.045   0.292 6.181   0.103 5.177   0.103 5.177
3.10  -0.156 5.060  -0.109 5.918  -0.162 5.249  -0.162 5.249
3.12  -0.134 5.040  -0.067 5.840  -0.142 5.256  -0.142 5.256
3.14  -0.227 5.136  -0.149 5.756  -0.231 5.404  -0.231 5.404
3.16   0.098 5.021  -0.061 5.407   0.141 5.348   0.141 5.348
3.18  -0.193 5.119  -0.220 5.115  -0.181 5.556  -0.181 5.556
3.20  -0.341 5.040  -0.233 5.037  -0.360 5.467  -0.360 5.467
3.22  -0.016 5.188   0.040 4.917  -0.036 5.649  -0.036 5.649
3.24  -0.042 5.404  -0.127 5.009  -0.026 5.916  -0.026 5.916
3.26  -0.125 5.345  -0.048 4.757  -0.142 5.911  -0.142 5.911
3.28   0.160 5.443   0.299 4.583   0.137 6.055   0.137 6.055
3.30   0.047 5.668  -0.099 4.470   0.086 6.378   0.086 6.378
3.32   0.020 5.596   0.030 4.322   0.011 6.365   0.011 6.365
3.34  -0.023 5.595  -0.042 4.294  -0.021 6.369  -0.021 6.369
3.36  -0.163 5.649  -0.172 4.322  -0.155 6.455  -0.155 6.455
3.38   0.055 5.921   0.143 4.179   0.031 6.790   0.031 6.790
3.40   0.259 5.999   0.203 4.209   0.284 6.881   0.284 6.881
3.42   0.005 6.191   0.005 4.080   0.005 7.181   0.005 7.181
3.44   0.311 5.871   0.147 3.944   0.361 6.782   0.361 6.782
3.46  -0.093 6.394   0.258 4.109  -0.186 7.453  -0.186 7.453
3.48  -0.036 6.884  -0.046 4.074  -0.030 8.083  -0.030 8.083
Gordon Sande  <Gordon.Sande@gmail.com> wrote:

>On 2015-06-07 07:27:05 +0000, Steve Pope said:
>> There's a very basic premise in science, that within practicability
>> you want to change as few variables in each experiment as possible.
>You might want to take a look at the topic called "Design Of Experiments"
>in Statistics and/or Industrial Engineering, where you want to control what
>you change. There is a special case of One at a Time designs which are
>based on Gray codes.
Okay thanks.  I have never heard of this.  I haven't yet looked
it up, but intuitively one might well consider changing one
variable at a time in a Gray-coded manner.

My IEOR and Statistics courses in college may have pre-dated this idea.

Steve
Cedron <103185@DSPRelated> wrote:

>>There's a very basic premise in science, that within practicability
>>you want to change as few variables in each experiment as possible.
>Controls, blind, double-blind, etc. etc.
>>Here you are arguing that you gain insight into the behavior of
>>the algorithm by randomly changing one variable (the noise pattern)
>>when you change another variable (the noise level).
>>Part of me is saying, "that can't possibly be true".
>Think of it as knowns and unknowns.  The premise of your statement is
>that the noise is known and the behavior of the formula is unknown.
>Therefore, when you study the combined outcome you learn something about
>the formula.
>But the noise is not known beyond its expected average and its expected
>RMS.  If I would want to use "canned" noise patterns for all rows and all
>noise levels, I would at least want to recenter and rescale them so their
>average was known to be zero and their RMS known to be my target value.
>Even so, the particular distribution of values may present as some pattern
>in the data that would be indistinguishable from a pattern formed by the
>formulas.
There is indeed a risk that, say, one particular noise pattern might be
pathological with respect to a particular data pattern.  In simulation,
you select a runsize (i.e. a large set of noise patterns) large enough
that these pathological combinations appear in their expected incidence --
that is, you create a large enough ensemble that the results are not
overly dominated by outliers, but include them with close to their
correct statistics.

In a sense, you are right that varying the noise patterns with SNR may
give you a larger ensemble effect.  That is, if you have six SNR's, each
with 10,000 different noise patterns, you may be able to look at the
resulting curve or column of data and get a gist of what it would have
looked like had you chosen a runsize of 60,000 in the first place.  But
it's not going to be as good a result as running the same 60,000 noise
patterns at each SNR.  It is ... sort of a half-measure, if that makes
any sense.

Also, you are saying runtime is not a problem for this experiment.

Steve
On Sun, 7 Jun 2015 20:23:38 +0000 (UTC), spope33@speedymail.org (Steve Pope) wrote:

>Gordon Sande <Gordon.Sande@gmail.com> wrote:
>
>[...snip...]
>
>>You might want to take a look at the topic called "Design Of Experiments"
>>in Statistics and/or Industrial Engineering where you want to control what
>>you change. There is a special case of One at a Time designs which are
>>based on Gray codes.
>
>Okay thanks.  I have never heard of this.  I haven't yet looked
>it up, but intuitively one might well consider changing one
>variable at a time in a Gray-coded manner.
>
>My IEOR and Statistics courses in college may have pre-dated this idea.
I took a class on Design of Experiments in grad school.  Awesome stuff.
This is particularly useful in industrial or process design where you
don't have the time, resources, or money to change only one variable per
experiment.

Basically, DoE lets you reduce the number of experiments by changing the
variables in patterns that are orthogonal to each other.  This allows all
of the effects to be separated, even though multiple variables may be
changed in every experiment.  Very useful.

Eric Jacobsen
Anchor Hill Communications
http://www.anchorhill.com
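A tiny illustration of the orthogonality point, using a full two-level factorial and a made-up response: several factors change between runs, yet each coefficient still drops out of a simple contrast because the +/- columns are mutually orthogonal.

    import itertools
    import numpy as np

    # Full 2^3 factorial: every combination of three factors at levels -1/+1.
    design = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)

    # Made-up response for illustration: y = 5 + 2*A - 1*B + 0.5*C + small noise.
    rng = np.random.default_rng(7)
    y = (5 + 2 * design[:, 0] - 1 * design[:, 1] + 0.5 * design[:, 2]
         + 0.05 * rng.standard_normal(len(design)))

    # Because the columns are orthogonal, each coefficient is recovered by a
    # normalized dot product of its column with the responses.
    for name, col in zip("ABC", design.T):
        print(name, "estimated coefficient:", np.dot(col, y) / len(y))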
On 2015-06-07 20:23:38 +0000, Steve Pope said:

> Gordon Sande <Gordon.Sande@gmail.com> wrote:
>
> [...snip...]
>
>> You might want to take a look at the topic called "Design Of Experiments"
>> in Statistics and/or Industrial Engineering where you want to control what
>> you change. There is a special case of One at a Time designs which are
>> based on Gray codes.
>
> Okay thanks.  I have never heard of this.  I haven't yet looked
> it up, but intuitively one might well consider changing one
> variable at a time in a Gray-coded manner.
>
> My IEOR and Statistics courses in college may have pre-dated this idea.
>
> Steve
When experiments are expensive and take a long time (think agriculture),
it is important to change many things at once in a disciplined fashion.
DOE is an old topic, but it is often only looked at by those with
specialized needs.