comp.dsp | Analyzing an "undersampled" sequence

Perhaps I should post this elsewhere but we speak the same language 
here.  I may have asked a similar question some time ago but now I have 
a new perspective and want to investigate.

I have a wastewater process that's being sampled periodically (uniform 
sampling for what it's worth).
The sample rate is way too low to avoid aliasing but the samples are 
real enough and the data is continuously available and very likely not 
amenable to being sampled more often (economics).

It's a bit like sampling a random series except that I "know" there is 
an underlying pattern that repeats each day with variable amplitude no 
doubt.  That, plus transients, would be the highest frequency content 
and seasonal things are the lowest frequency content which I'm not too 
worried about.  And, while I'd like to know when transients happen and 
how big they are, I'm afraid that's out of the question.
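
A minimal sketch of what the aliasing means here, assuming (purely for illustration) that the daily pattern is a sinusoid with a one-day period and that samples are taken once every 7 days at the same time of day. The mean, amplitude, and peak hour are made-up values, not plant data. Because 7 days is a whole number of cycles, the daily component folds down to DC and shows up as a constant offset that depends only on the sampling hour.

```python
import math

def daily_pattern(t_days, mean=100.0, amplitude=30.0, peak_hour=14.0):
    """Hypothetical diurnal signal: a sinusoid with a one-day period."""
    phase = 2.0 * math.pi * (t_days - peak_hour / 24.0)
    return mean + amplitude * math.cos(phase)

def weekly_samples(sample_hour, n_weeks=8):
    """Sample the pattern once every 7 days at a fixed time of day."""
    return [daily_pattern(7 * k + sample_hour / 24.0) for k in range(n_weeks)]

# Same process, two different (assumed) sampling hours.
night = weekly_samples(sample_hour=2.0)
afternoon = weekly_samples(sample_hour=14.0)

print("02:00 samples:", [round(v, 3) for v in night])
print("14:00 samples:", [round(v, 3) for v in afternoon])
```

With these assumed numbers, every sample in the first series lands on the daily trough and every sample in the second on the daily peak, so neither average sees the daily swing at all. Averaging more weeks does not remove that offset; only varying the sampling hour or using composite samples would.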

In fact, what's of value here is to estimate how much plant capacity is 
being "used up".  By my reckoning, 6 months of data during our peak 
months is a good averaging period - as it's the peak months that 
determine our capacity "use" for regulatory purposes.
In the shorter term, the numbers are used for determining charges for 
overly high concentrations, shared use, etc.

To make things a bit more complicated, the regulatory agency has us 
report the weekly data on a monthly basis (actually here there are 2 
samples per week) and average it for the month.
If there are 3 contiguous months with these averages exceeding our 
"capacity" or some large fraction of it, then we are put on notice that 
planning for future capacity must begin.  So, this is one "measure" 
that's in concrete.  But, I digress a bit .....
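
The reporting rule described above can be sketched as follows. This is only an illustration: the function names, the sample values, and the limit of 100 are hypothetical, and the run check assumes every calendar month in the period has at least one sample.

```python
from collections import defaultdict

def monthly_averages(samples):
    """samples: iterable of ((year, month), value) pairs.
    Returns a list of ((year, month), mean) sorted by month."""
    buckets = defaultdict(list)
    for month, value in samples:
        buckets[month].append(value)
    return [(m, sum(v) / len(v)) for m, v in sorted(buckets.items())]

def longest_run_above(averages, limit):
    """Length of the longest run of consecutive monthly averages above limit."""
    best = run = 0
    for _, avg in averages:
        run = run + 1 if avg > limit else 0
        best = max(best, run)
    return best

# Made-up data: two samples per month for five months.
samples = [
    ((2010, 5), 80.0), ((2010, 5), 90.0),
    ((2010, 6), 105.0), ((2010, 6), 99.0),
    ((2010, 7), 110.0), ((2010, 7), 96.0),
    ((2010, 8), 104.0), ((2010, 8), 102.0),
    ((2010, 9), 85.0), ((2010, 9), 95.0),
]
averages = monthly_averages(samples)
on_notice = longest_run_above(averages, limit=100.0) >= 3
print(averages, on_notice)
```

Note that individual samples below the limit (99 and 96 here) do not break the run, because only the monthly average is compared against the limit.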

Here is my question:

Instead of worrying about aliasing which is where I go to first of 
course, is there a statistical measure that might help me better 
understand the "quality" of our numbers or how much variation is 
"expected" given those numbers?
For example, given 4 to 8 weeks of data (4 to 8 samples), what can be 
said about the data set in a statistical sense?  How might one best put the 
answer to use in a case like this?

Where should I be looking?

Fred

Reply by jim ●July 13, 2010

I didn't see where you reveal what is being sampled.  Is it how full a tank
is? How much fluid is flowing in a pipe?

Fred Marshall wrote:

> Perhaps I should post this elsewhere but we speak the same language
> here.  I may have asked a similar question some time ago but now I have
> a new perspective and want to investigate.
>
> I have a wastewater process that's being sampled periodically (uniform
> sampling for what it's worth).
> The sample rate is way too low to avoid aliasing but the samples are
> real enough and the data is continuously available and very likely not
> amenable to being sampled more often (economics).
>
> It's a bit like sampling a random series except that I "know" there is
> an underlying pattern that repeats each day with variable amplitude no
> doubt.  That, plus transients, would be the highest frequency content
> and seasonal things are the lowest frequency content which I'm not too
> worried about.  And, while I'd like to know when transients happen and
> how big they are, I'm afraid that's out of the question.
>
> In fact, what's of value here is to estimate how much plant capacity is
> being "used up".  By my reckoning, 6 months of data during our peak
> months is a good averaging period - as it's the peak months that
> determine our capacity "use" for regulatory purposes.
> In the shorter term, the numbers are used for determining charges for
> overly high concentrations, shared use, etc.
>
> To make things a bit more complicated, the regulatory agency has us
> report the weekly data on a monthly basis (actually here there are 2
> samples per week) and average it for the month.

Depending on what is being sampled that could be a complete accounting or
an incomplete accounting of usage. If each sample records how much was used
since the last sample was taken, then when you add them together you have
complete accounting of the usage for the month.  If all that the sample is
measuring is the instantaneous usage at the instant the sample is taken
then you have a very incomplete accounting of usage and could make it mean
just about anything you want it to.

-jim



>
> If there are 3 contiguous months with these averages exceeding our
> "capacity" or some large fraction of it, then we are put on notice that
> planning for future capacity must begin.  So, this is one "measure"
> that's in concrete.  But, I digress a bit .....
>
> Here is my question:
>
> Instead of worrying about aliasing which is where I go to first of
> course, is there a statistical measure that might help me better
> understand the "quality" of our numbers or how much variation is
> "expected" given those numbers?
> For example, given 4 to 8 weeks of data (4 to 8 samples), what can be
> said about the data set in a statistical sense?  How might one best put the
> answer to use in a case like this?
>
> Where should I be looking?
>
> Fred

Reply by Steve Pope ●July 13, 2010

Fred Marshall  <fmarshallx@remove_the_xacm.org> wrote:

>Instead of worrying about aliasing which is where I go to first of 
>course, is there a statistical measure that might help me better 
>understand the "quality" of our numbers or how much variation is 
>"expected" given those numbers?
>For example, given 4 to 8 weeks of data (4 to 8 samples), what can be 
>said about the data set in a statistical sense?  How might one best put the 
>answer to use in a case like this?
>
>Where should I be looking?

Something like a Student's T test can tell you if a sample
or group of samples is out-of-line.

(I think I may have said the same thing, the last time you
asked a similar question.)
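
A minimal sketch of the closely related interval form of that idea: a 95% confidence interval for the mean of 4 to 8 samples, using Student's t critical values. It assumes the samples are independent and roughly normally distributed, which weekly samples of a process with a daily cycle and seasonal trends may not satisfy. The function name and the example values are hypothetical.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of
# freedom (n - 1), covering n = 4..8 samples. Standard table values.
T_95 = {3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365}

def mean_confidence_interval(values):
    """Return (low, high) for a 95% confidence interval on the mean.
    Only supports 4 to 8 samples, matching the table above."""
    n = len(values)
    if n - 1 not in T_95:
        raise ValueError("this sketch only tabulates n = 4..8")
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    half_width = T_95[n - 1] * sem
    return mean - half_width, mean + half_width

# Made-up example: four weekly values.
low, high = mean_confidence_interval([10.0, 12.0, 14.0, 16.0])
print(round(low, 2), round(high, 2))
```

The wide critical values for small n (3.182 for four samples versus about 1.96 for a large sample) are the quantitative answer to how much a 4-to-8-sample average can be trusted under these assumptions.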

Steve

Reply by Jerry Avins ●July 13, 2010

On 7/13/2010 12:31 PM, Fred Marshall wrote:
> Perhaps I should post this elsewhere but we speak the same language
> here.  I may have asked a similar question some time ago but now I have
> a new perspective and want to investigate.
>
> I have a wastewater process that's being sampled periodically (uniform
> sampling for what it's worth).
> The sample rate is way too low to avoid aliasing but the samples are
> real enough and the data is continuously available and very likely not
> amenable to being sampled more often (economics).
>
> It's a bit like sampling a random series except that I "know" there is
> an underlying pattern that repeats each day with variable amplitude no
> doubt. That, plus transients, would be the highest frequency content and
> seasonal things are the lowest frequency content which I'm not too
> worried about. And, while I'd like to know when transients happen and
> how big they are, I'm afraid that's out of the question.
>
> In fact, what's of value here is to estimate how much plant capacity is
> being "used up". By my reckoning, 6 months of data during our peak
> months is a good averaging period - as it's the peak months that
> determine our capacity "use" for regulatory purposes.
> In the shorter term, the numbers are used for determining charges for
> overly high concentrations, shared use, etc.
>
> To make things a bit more complicated, the regulatory agency has us
> report the weekly data on a monthly basis (actually here there are 2
> samples per week) and average it for the month.
> If there are 3 contiguous months with these averages exceeding our
> "capacity" or some large fraction of it, then we are put on notice that
> planning for future capacity must begin. So, this is one "measure"
> that's in concrete. But, I digress a bit .....
>
> Here is my question:
>
> Instead of worrying about aliasing which is where I go to first of
> course, is there a statistical measure that might help me better
> understand the "quality" of our numbers or how much variation is
> "expected" given those numbers?
> For example, given 4 to 8 weeks of data (4 to 8 samples), what can be
> said about the data set in a statistical sense? How might one best put the
> answer to use in a case like this?
>
> Where should I be looking?

Other things being equal, clustering should follow a Poisson 
distribution. If you measure flow -- a quantity that can be heavily 
influenced by rainfall -- only twice a week, how do you bill equitably?

Jerry
-- 
Engineering is the art of making what you want from things you can get.

Reply by Fred Marshall ●July 13, 2010

Jerry Avins wrote:

> 
> Other things being equal, clustering should follow a Poisson 
> distribution. If you measure flow -- a quantity that can be heavily 
> influenced by rainfall -- only twice a week, how do you bill equitably?
> 
> Jerry

Jerry,

I don't imagine that we bill entirely "equitably" - more like "agreeably".

We measure flow continuously to get the volume and concentration once or 
twice a week.

The concentration is assumed to apply for the entire measured volume 
between concentration samples.  So, one may say that we sample loading 
in that fashion.
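
As a rough sketch of that loading calculation, assuming concentration is reported in mg/L and metered volume in cubic metres (the units, function names, and numbers are illustrative assumptions, not values from the plant):

```python
def loading_kg(concentration_mg_per_l, volume_m3):
    """Mass load in kg for one interval.
    1 mg/L equals 1 g/m^3, so mg/L times m^3 gives grams."""
    return concentration_mg_per_l * volume_m3 / 1000.0

def total_loading_kg(intervals):
    """intervals: iterable of (concentration_mg_per_l, volume_m3) pairs,
    one per concentration sample, where the volume is the metered flow
    that the sample is assumed to represent."""
    return sum(loading_kg(c, v) for c, v in intervals)

# Made-up example: two sampling intervals.
intervals = [(250.0, 12000.0), (310.0, 9500.0)]
print(total_loading_kg(intervals))
```

The volume term is known well because flow is metered continuously; all of the sampling uncertainty enters through the single concentration value applied to each interval.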

I think I answered my own question to the point where I can deal with it:

We have the weekly or twice-weekly samples and have computed monthly 
averages - as the latter have some regulatory importance.
You might consider these monthly averages to be lowpassed versions of 
the samples.
Then, one can compute the distribution of outcomes and infer(?) the 
amount of loading.

My "backwards" sort of reasoning goes like this:
We take a set of samples.
We determine the distribution of those sample values over a suitably 
long time such that daily and even annual variations are included in the 
distribution.
The caution here is that trends get wiped out - so a suitable time frame 
or set of them needs to be selected that has some meaning where gross 
trends are concerned.
If we assume that the distribution represents a reasonable estimate of 
ground truth, then we can infer in quantitative terms what's happening - 
such as over-loading (i.e. loading that's above some determined threshold).
It's surely not "perfect" but it's better than nothing ... I think.
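
One simple way to put numbers on that reasoning is the empirical fraction of samples above a threshold. This is a sketch only: the values and the threshold are made up, and it inherits the assumption stated above that the collected samples are a reasonable stand-in for ground truth.

```python
def exceedance_fraction(values, threshold):
    """Fraction of observed samples strictly above the threshold."""
    if not values:
        raise ValueError("need at least one sample")
    return sum(1 for v in values if v > threshold) / len(values)

# Made-up loading samples, as a percentage of some determined threshold.
history = [82.0, 95.0, 101.0, 88.0, 110.0, 97.0, 104.0, 91.0]
print(exceedance_fraction(history, threshold=100.0))
```

With only a handful of samples this fraction is coarse (it can only move in steps of 1/n), which is one more reason to use a long enough window, subject to the caution about trends.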

Fred

Reply by Jerry Avins ●July 13, 2010

On 7/13/2010 8:28 PM, Fred Marshall wrote:
> Jerry Avins wrote:
>
>>
>> Other things being equal, clustering should follow a Poisson
>> distribution. If you measure flow -- a quantity that can be heavily
>> influenced by rainfall -- only twice a week, how do you bill equitably?
>>
>> Jerry
>
> Jerry,
>
> I don't imagine that we bill entirely "equitably" - more like "agreeably".
>
> We measure flow continuously to get the volume and concentration once or
> twice a week.
>
> The concentration is assumed to apply for the entire measured volume
> between concentration samples. So, one may say that we sample loading in
> that fashion.
>
> I think I answered my own question to the point where I can deal with it:
>
> We have the weekly or twice-weekly samples and have computed monthly
> averages - as the latter have some regulatory importance.
> You might consider these monthly averages to be lowpassed versions of
> the samples.
> Then, one can compute the distribution of outcomes and infer(?) the
> amount of loading.
>
> My "backwards" sort of reasoning goes like this:
> We take a set of samples.
> We determine the distribution of those sample values over a suitably
> long time such that daily and even annual variations are included in the
> distribution.
> The caution here is that trends get wiped out - so a suitable time frame
> or set of them needs to be selected that has some meaning where gross
> trends are concerned.
> If we assume that the distribution represents a reasonable estimate of
> ground truth, then we can infer in quantitative terms what's happening -
> such as over-loading (i.e. loading that's above some determined threshold).
> It's surely not "perfect" but it's better than nothing ... I think.

If your samples are taken at times of unusually high I&I, the dilution 
can make the measured concentrations uncharacteristically low.
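
The arithmetic behind that can be sketched as below, with made-up numbers and assumed units (kg, cubic metres, mg/L). The same pollutant mass is spread over more water when I&I adds volume, so the measured concentration drops even though the load has not changed.

```python
def concentration_mg_per_l(mass_kg, volume_m3):
    """Concentration in mg/L (equivalently g/m^3) for a mass in a volume."""
    return mass_kg * 1000.0 / volume_m3

mass_kg = 3000.0           # hypothetical pollutant mass
dry_volume_m3 = 12000.0    # hypothetical sanitary flow alone
ii_volume_m3 = 6000.0      # hypothetical extra volume from I&I

dry = concentration_mg_per_l(mass_kg, dry_volume_m3)
wet = concentration_mg_per_l(mass_kg, dry_volume_m3 + ii_volume_m3)
print(dry, round(wet, 1))
```

If a concentration taken during a wet spell is then applied to volume that mostly passed in dry weather, the computed loading for that interval would come out low.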

Jerry
-- 
Engineering is the art of making what you want from things you can get.

Reply by Fred Marshall ●July 14, 2010

Jerry Avins wrote:

> If your samples are taken at times of unusually high I&I, the dilution 
> can make the measured concentrations uncharacteristically low.
> 
> Jerry

Yes, I know but the sample times are set for a number of reasons. 
Actually, our concern right now is why the concentrations are so darned 
high!  So, in these parts where there's nearly 100 inches of rain each 
year, we're used to seeing and fixing I&I.  Right now it's not a big 
concern.

Fred

Reply by Greg Heath ●July 14, 2010

On Jul 13, 9:29 pm, Jerry Avins <j...@ieee.org> wrote:
> On 7/13/2010 8:28 PM, Fred Marshall wrote:
>
> > Jerry Avins wrote:
>
> >> Other things being equal, clustering should follow a Poisson
> >> distribution. If you measure flow -- a quantity that can be heavily
> >> influenced by rainfall -- only twice a week, how do you bill equitably?
>
> >> Jerry
>
> > Jerry,
>
> > I don't imagine that we bill entirely "equitably" - more like "agreeably".
>
> > We measure flow continuously to get the volume and concentration once or
> > twice a week.
>
> > The concentration is assumed to apply for the entire measured volume
> > between concentration samples. So, one may say that we sample loading in
> > that fashion.
>
> > I think I answered my own question to the point where I can deal with it:
>
> > We have the weekly or twice-weekly samples and have computed monthly
> > averages - as the latter have some regulatory importance.
> > You might consider these monthly averages to be lowpassed versions of
> > the samples.
> > Then, one can compute the distribution of outcomes and infer(?) the
> > amount of loading.
>
> > My "backwards" sort of reasoning goes like this:
> > We take a set of samples.
> > We determine the distribution of those sample values over a suitably
> > long time such that daily and even annual variations are included in the
> > distribution.
> > The caution here is that trends get wiped out - so a suitable time frame
> > or set of them needs to be selected that has some meaning where gross
> > trends are concerned.
> > If we assume that the distribution represents a reasonable estimate of
> > ground truth, then we can infer in quantitative terms what's happening -
> > such as over-loading (i.e. loading that's above some determined threshold).
> > It's surely not "perfect" but it's better than nothing ... I think.
>
> If your samples are taken at times of unusually high I&I, the dilution
> can make the measured concentrations uncharacteristically low.

Duh, what's, I & I?

Greg

Reply by Jerry Avins ●July 14, 2010

On 7/14/2010 5:55 AM, Greg Heath wrote:

> ... what's, I&  I?

Infiltration and inflow, which force sewage plants to process rainwater. 
Infiltration occurs when leaky mains are lower than the water table. 
Inflow is often illegal pump connections to the sanitary sewer. When 
streets become submerged, rainwater can pour in through manhole covers.

Jerry
-- 
Engineering is the art of making what you want from things you can get.
