Maybe Jerry will comment on this:

We have a wastewater plant where we measure flow on a daily basis and biological oxygen demand and total suspended solids on a weekly basis. I'm sure that all of these measurements fail to meet any reasonable signal processing sampling criterion. But, it is what it is. Except for flow, the measurements are taken manually and are somewhat expensive as things go. So, a higher sample rate is unlikely to be seen.

Because the plant capacity is shared we need a way to determine who is using how much of the capacity for each of these 3 measures - looking for excess usage.

I've done a sample-hold on the weekly data and done 30-day and 90-day averages of them to get an indicator of capacity use.

On the one hand it's easy to say that the measurements are bogus because they don't meet the sampling criterion - the bandwidth is higher than the sample rate by far.
On the other hand, it's really long-term demand or long-term peak demand that's of importance.

I'm wondering if there isn't some sensible thing to do with the data under these circumstances? Things that come to mind are:
- Consider the peaks over some suitable time epoch. Because the data is undersampled, the measured peaks can only be too small and the missed maxima can only be of "relatively" short duration. Use a suitable measure on the peaks as a measure of the demand for the period. (There are no anomalous components like sinusoids in the data - it's quite random).
- Consider that there can't be any strong aliasing because there are no spectral lines at high frequencies - only temporal spikes which are very broadband. Thus, any aliasing must be mostly noiselike in appearance.

Fred
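For readers who want to try the sample-hold and 30-day/90-day averaging described above, here is a minimal sketch. It is not Fred's actual procedure: the function names, the trailing (rather than centered) window, and the weekly readings are all made up for illustration.

```python
from datetime import date, timedelta

def sample_and_hold(samples, start, end):
    """Expand sparse (date, value) samples into a daily series by holding
    each value until the next sample arrives."""
    samples = sorted(samples)
    daily, current, idx = [], None, 0
    day = start
    while day <= end:
        # Pick up the most recent sample taken on or before this day.
        while idx < len(samples) and samples[idx][0] <= day:
            current = samples[idx][1]
            idx += 1
        if current is not None:
            daily.append((day, current))
        day += timedelta(days=1)
    return daily

def trailing_mean(daily, window_days):
    """Trailing average over the last `window_days` daily values
    (the window is shorter at the start, where fewer days exist)."""
    values = [v for _, v in daily]
    out = []
    for i, (day, _) in enumerate(daily):
        window = values[max(0, i - window_days + 1):i + 1]
        out.append((day, sum(window) / len(window)))
    return out

# Hypothetical weekly readings (units arbitrary), one per week.
weekly = [(date(2007, 1, 1) + timedelta(weeks=k), v)
          for k, v in enumerate([210, 250, 190, 300, 220, 240, 260, 230])]
daily = sample_and_hold(weekly, weekly[0][0], weekly[-1][0])
avg30 = trailing_mean(daily, 30)
print(avg30[-1])
```

Because each weekly value is held for seven days, every reading carries the same weight in the average as long as the sampling interval stays constant; an irregular schedule would weight some readings more than others.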
Undersampled Data
Started by ●September 10, 2007
Reply by ●September 10, 2007
Fred Marshall wrote:

(snip regarding undersampled data)

> I'm wondering if there isn't some sensible thing to do with the data under these circumstances? Things that come to mind are:
> - Consider the peaks over some suitable time epoch. Because the data is undersampled, the measured peaks can only be too small and the missed maxima can only be of "relatively" short duration. Use a suitable measure on the peaks as a measure of the demand for the period. (There are no anomalous components like sinusoids in the data - it's quite random).

Is it really random? The thing I wonder about is systematic error. While I agree that there aren't likely large sinusoids, it would seem likely that there are periodic components. Many of them may be averaged out on the way to the plant.

> - Consider that there can't be any strong aliasing because there are no spectral lines at high frequencies - only temporal spikes which are very broadband. Thus, any aliasing must be mostly noiselike in appearance.

To me, if it is really (random) noise-like then statistically you should have a good enough sample, on average, to measure peak and/or average demand. If your sampling is always at the same time of the day, though, and some large users also have high flow rates at certain times of the day, it seems that you might miss something.

-- glen
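The concern about always sampling at the same time of day can be illustrated with a small simulation. This is only a sketch under invented assumptions: the flow is a constant mean plus a pure 24-hour cosine peaking at 08:00, and the numbers (`TRUE_MEAN`, `SWING`, the 10:00 sampling hour) are hypothetical, not plant data.

```python
import math
import random

TRUE_MEAN = 100.0   # hypothetical long-term mean flow (arbitrary units)
SWING = 30.0        # hypothetical amplitude of the daily cycle

def flow(t_hours):
    """Hypothetical flow with a 24-hour cycle that peaks at 08:00."""
    return TRUE_MEAN + SWING * math.cos(2.0 * math.pi * (t_hours - 8.0) / 24.0)

def weekly_mean(pick_hour, weeks=520):
    """Average of one sample per week, taken at the hour pick_hour() returns."""
    samples = [flow(7 * 24 * week + pick_hour()) for week in range(weeks)]
    return sum(samples) / len(samples)

rng = random.Random(1)  # fixed seed so the run is repeatable
mean_fixed = weekly_mean(lambda: 10.0)                     # always at 10:00
mean_random = weekly_mean(lambda: rng.uniform(0.0, 24.0))  # random hour each week

print(f"true mean {TRUE_MEAN:.1f}, "
      f"fixed-hour estimate {mean_fixed:.1f}, "
      f"random-hour estimate {mean_random:.1f}")
```

In this toy model the fixed-hour average never converges to the true mean no matter how many weeks are collected, because every sample lands on the same phase of the daily cycle. Randomizing the hour turns that systematic error into ordinary scatter that shrinks with more samples.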
Reply by ●September 10, 2007
Fred Marshall wrote:

> Maybe Jerry will comment on this:
>
> We have a wastewater plant where we measure flow on a daily basis and biological oxygen demand and total suspended solids on a weekly basis. I'm sure that all of these measurements fail to meet any reasonable signal processing sampling criterion. But, it is what it is. Except for flow, the measurements are taken manually and are somewhat expensive as things go. So, a higher sample rate is unlikely to be seen.

Our main plant serves three municipalities. The piping is such that we need three flow meters. (We have more, but that's another story.) The primary meters are notched weirs, and there is no difficulty recording continuously from a stilling chamber associated with each weir. Essentially, flow isn't sampled, but measured continuously. Integrators record totals. BOD is recorded twice daily. We had one high-BOD industrial user who was careful to dump his holding tanks between 2:00 and 4:00 AM. We caught him anyway and he settled out of court by paying two years' worth of BOD surcharge. Weekly won't cut it, especially if your users know your sampling schedule.

> Because the plant capacity is shared we need a way to determine who is using how much of the capacity for each of these 3 measures - looking for excess usage.

You have a bunch of users writing checks on the same account with no records being kept. Good luck!

> I've done a sample-hold on the weekly data and done 30-day and 90-day averages of them to get an indicator of capacity use.
>
> On the one hand it's easy to say that the measurements are bogus because they don't meet the sampling criterion - the bandwidth is higher than the sample rate by far.
> On the other hand, it's really long-term demand or long-term peak demand that's of importance.

You're probably right, but you need to make some unannounced short-term measurements to be sure that no one games the system. I could block an interceptor for part of a day and let it dump and run freely the rest of the time. If I did that in sync with your known sampling times, I could drastically cut my apparent measured flow.

> I'm wondering if there isn't some sensible thing to do with the data under these circumstances? Things that come to mind are:
> - Consider the peaks over some suitable time epoch. Because the data is undersampled, the measured peaks can only be too small and the missed maxima can only be of "relatively" short duration. Use a suitable measure on the peaks as a measure of the demand for the period. (There are no anomalous components like sinusoids in the data - it's quite random).
> - Consider that there can't be any strong aliasing because there are no spectral lines at high frequencies - only temporal spikes which are very broadband. Thus, any aliasing must be mostly noiselike in appearance.

Noise, yes, but I'm sure you know about noise shaping.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
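A rough way to see why unannounced measurements matter: the sketch below estimates the chance that randomly timed spot samples land inside a daily dumping window like the 2:00 to 4:00 AM one mentioned above. The assumptions are mine, not Jerry's: the dump happens every day, each spot sample is instantaneous and uniformly random over the day, and samples are independent. The function name is hypothetical.

```python
def detection_probability(window_hours, n_samples, day_hours=24.0):
    """Chance that at least one of n independent, uniformly random spot
    samples falls inside a daily dumping window of the given length."""
    p_hit = window_hours / day_hours          # chance one sample hits the window
    return 1.0 - (1.0 - p_hit) ** n_samples   # complement of "all samples miss"

# A 2-hour nightly window, with a growing number of surprise samples.
for n in (1, 12, 26, 52):
    print(n, round(detection_probability(2.0, n), 3))
```

A fixed, known sampling time outside the window gives a detection probability of exactly zero, however many weeks of data are collected, which is the point about users knowing the schedule.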
Reply by ●September 11, 2007
"Fred Marshall" <fmarshallx@remove_the_x.acm.org> wrote in message news:srKdnTCl_uucBnjbnZ2dnUVZ_oytnZ2d@centurytel.net...

> Maybe Jerry will comment on this:
>
> We have a wastewater plant where we measure flow on a daily basis and biological oxygen demand and total suspended solids on a weekly basis. I'm sure that all of these measurements fail to meet any reasonable signal processing sampling criterion. But, it is what it is. Except for flow, the measurements are taken manually and are somewhat expensive as things go. So, a higher sample rate is unlikely to be seen.
>
> Because the plant capacity is shared we need a way to determine who is using how much of the capacity for each of these 3 measures - looking for excess usage.
>
> I've done a sample-hold on the weekly data and done 30-day and 90-day averages of them to get an indicator of capacity use.
>
> On the one hand it's easy to say that the measurements are bogus because they don't meet the sampling criterion - the bandwidth is higher than the sample rate by far.
> On the other hand, it's really long-term demand or long-term peak demand that's of importance.
>
> I'm wondering if there isn't some sensible thing to do with the data under these circumstances? Things that come to mind are:
> - Consider the peaks over some suitable time epoch. Because the data is undersampled, the measured peaks can only be too small and the missed maxima can only be of "relatively" short duration. Use a suitable measure on the peaks as a measure of the demand for the period. (There are no anomalous components like sinusoids in the data - it's quite random).
> - Consider that there can't be any strong aliasing because there are no spectral lines at high frequencies - only temporal spikes which are very broadband. Thus, any aliasing must be mostly noiselike in appearance.
>

I can't make too much of what you're saying. It seems that you are trying to pinpoint who is using how much?

i.e.,

> Because the plant capacity is shared we need a way to determine who is using how much of the capacity for each of these 3 measures - looking for excess usage.

(To be honest I don't understand what your problem is. If you could go into more detail it might help.)

If that's the case and you know the times of who is actively using the system then you can easily measure how much of the capacity was used by measuring the initial and final capacities. You do not need to sample in between. That is, unless you charge not by capacity but by flow rate (which I'm confused about, as it seems you're measuring flow but then you talk about capacity (unless you are talking about flow capacity? ;)).

Obviously if Joe uses the system from time t1 to t2 and the capacity changed from C1 to C2 then the capacity change is C2 - C1. Doesn't matter exactly when he did what because whatever the change is, the total change is what matters.

If the sharing is done simultaneously then obviously you cannot measure the capacity because then it is simultaneously shared too.

This is just an idea and might not be applicable to your problem, but you can turn simultaneous sharing partially into time sharing by having each user have their own small capacity. This is probably impractical for your problem but what it essentially does is average out the flow to some degree (there are some constraint issues involved besides the other problems).

Really though, probably the only way to get good data on how much flow capacity an individual user is using in a shared system is going to be to increase the sampling rate or somehow limit flow (which I suppose isn't possible, but you could average it out using the above idea to some degree (which might end up creating more problems than it's worth)).

This sounds like a similar problem that water co's have. If so, then suppose that their flow meters could not measure flow past a certain rate. Then what a user could do is surge the water so that it goes past what the meter can handle, and then use my idea above to level it out before they use it, so it stops the surging.

e.g., suppose the flow meter cannot measure past 100 ga/s and caps off at this point.

User x creates a large tank, say 1000 ga, and attaches some logic and a strong pump. When the user needs water (tank gets low) he turns on the pump, which pulls 1000 ga/s from the water co. It takes 1 second to fill up the tank but the water co thinks he used only 100 ga. So he has gained 900 ga of water for free. He then uses a smaller pump to pump the water for his needs.

This is a similar problem to what you have, it seems (but not exactly analogous). The water company only has two choices: to increase the flow meter's measuring capacity or introduce some way to limit the flow to the meter (i.e., increase sampling rate or limit bandwidth).

If this is a serious problem and you have to increase sampling rate then the best bet is to get some flow meters or, if possible, just limit the flow rate.

Anyways, hopefully I'm close ;)

Jon
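The tank example above can be checked with a few lines of arithmetic. This sketch uses the figures from the post (a meter that caps at 100 ga/s and a 1000 ga/s surge lasting 1 second); the function names and the 10-second "levelled" profile are illustrative additions.

```python
def true_volume(rates, dt):
    """Volume actually delivered: integrate the real flow rate."""
    return sum(rate * dt for rate in rates)

def metered_volume(rates, dt, meter_max):
    """Volume a saturating meter records: it clips any rate above meter_max."""
    return sum(min(rate, meter_max) * dt for rate in rates)

DT = 1.0            # seconds per step
METER_MAX = 100.0   # ga/s, the cap from the example

surge = [1000.0]            # 1000 ga/s for one second
levelled = [100.0] * 10     # same 1000 ga drawn at the meter's limit

for name, profile in (("surge", surge), ("levelled", levelled)):
    actual = true_volume(profile, DT)
    seen = metered_volume(profile, DT, METER_MAX)
    print(name, "actual:", actual, "metered:", seen, "unmetered:", actual - seen)
```

The same 1000 ga is fully recorded when drawn slowly and mostly missed when drawn as a surge, which mirrors the choice between raising the measuring range and limiting the flow.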
Reply by ●September 11, 2007
Jon Slaughter wrote:

> "Fred Marshall" <fmarshallx@remove_the_x.acm.org> wrote in message news:srKdnTCl_uucBnjbnZ2dnUVZ_oytnZ2d@centurytel.net...
>> Maybe Jerry will comment on this:
>>
>> We have a wastewater plant where we measure flow on a daily basis and biological oxygen demand and total suspended solids on a weekly basis. I'm sure that all of these measurements fail to meet any reasonable signal processing sampling criterion. But, it is what it is. Except for flow, the measurements are taken manually and are somewhat expensive as things go. So, a higher sample rate is unlikely to be seen.
>>
>> Because the plant capacity is shared we need a way to determine who is using how much of the capacity for each of these 3 measures - looking for excess usage.
>>
>> I've done a sample-hold on the weekly data and done 30-day and 90-day averages of them to get an indicator of capacity use.
>>
>> On the one hand it's easy to say that the measurements are bogus because they don't meet the sampling criterion - the bandwidth is higher than the sample rate by far.
>> On the other hand, it's really long-term demand or long-term peak demand that's of importance.
>>
>> I'm wondering if there isn't some sensible thing to do with the data under these circumstances? Things that come to mind are:
>> - Consider the peaks over some suitable time epoch. Because the data is undersampled, the measured peaks can only be too small and the missed maxima can only be of "relatively" short duration. Use a suitable measure on the peaks as a measure of the demand for the period. (There are no anomalous components like sinusoids in the data - it's quite random).
>> - Consider that there can't be any strong aliasing because there are no spectral lines at high frequencies - only temporal spikes which are very broadband. Thus, any aliasing must be mostly noiselike in appearance.
>>
>
> I can't make too much of what you're saying. It seems that you are trying to pinpoint who is using how much?
>
> i.e.,
>
>> Because the plant capacity is shared we need a way to determine who is using how much of the capacity for each of these 3 measures - looking for excess usage.
>
> (To be honest I don't understand what your problem is. If you could go into more detail it might help.)
>
> If that's the case and you know the times of who is actively using the system then you can easily measure how much of the capacity was used by measuring the initial and final capacities. You do not need to sample in between. That is, unless you charge not by capacity but by flow rate (which I'm confused about, as it seems you're measuring flow but then you talk about capacity (unless you are talking about flow capacity? ;)).
>
> Obviously if Joe uses the system from time t1 to t2 and the capacity changed from C1 to C2 then the capacity change is C2 - C1. Doesn't matter exactly when he did what because whatever the change is, the total change is what matters.
>
> If the sharing is done simultaneously then obviously you cannot measure the capacity because then it is simultaneously shared too.
>
> This is just an idea and might not be applicable to your problem, but you can turn simultaneous sharing partially into time sharing by having each user have their own small capacity. This is probably impractical for your problem but what it essentially does is average out the flow to some degree (there are some constraint issues involved besides the other problems).
>
> Really though, probably the only way to get good data on how much flow capacity an individual user is using in a shared system is going to be to increase the sampling rate or somehow limit flow (which I suppose isn't possible, but you could average it out using the above idea to some degree (which might end up creating more problems than it's worth)).
>
> This sounds like a similar problem that water co's have. If so, then suppose that their flow meters could not measure flow past a certain rate. Then what a user could do is surge the water so that it goes past what the meter can handle, and then use my idea above to level it out before they use it, so it stops the surging.
>
> e.g., suppose the flow meter cannot measure past 100 ga/s and caps off at this point.
>
> User x creates a large tank, say 1000 ga, and attaches some logic and a strong pump. When the user needs water (tank gets low) he turns on the pump, which pulls 1000 ga/s from the water co. It takes 1 second to fill up the tank but the water co thinks he used only 100 ga. So he has gained 900 ga of water for free. He then uses a smaller pump to pump the water for his needs.
>
> This is a similar problem to what you have, it seems (but not exactly analogous). The water company only has two choices: to increase the flow meter's measuring capacity or introduce some way to limit the flow to the meter (i.e., increase sampling rate or limit bandwidth).
>
> If this is a serious problem and you have to increase sampling rate then the best bet is to get some flow meters or, if possible, just limit the flow rate.
>
> Anyways, hopefully I'm close ;)

Jon,

I don't think you're close. I assume that Fred's system is similar to mine. Three municipalities contribute domestic and industrial flow to a single plant. Any toilet can be flushed anywhere at any time, and flow from the far reaches of the system can take the better part of a day to reach the plant.

Fred's system is evidently smaller than the one I'm associated with (http://sbrsa.org/) but I imagine that many of our ground rules are similar. Participants' charges are based not only on the aggregate flow for the year but also on capital costs apportioned according to the fraction of total capacity necessarily reserved for them. (An averaging scheme smooths the year-to-year fluctuations related to weather.) We have records of daily total flows going back to the 1970s.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
Reply by ●September 11, 2007
Hello,

I think this is not so much about signal processing but about statistics: When you say you are "undersampling" I agree: It is not possible to reconstruct the actual "waveform" from the samples without loss of information.

Still, statistics will be able to extract useful information from the data.
My point is simply: The answers won't be in a signal processing textbook, but in one on statistics. Well, probably :)

Cheers

Markus
Reply by ●September 11, 2007
"mnentwig" <mnentwig@elisanet.fi> wrote in message news:buqdnZiD6J1LqXvbnZ2dnUVZ_sKqnZ2d@giganews.com...

> Hello,
>
> I think this is not so much about signal processing but about statistics: When you say you are "undersampling" I agree: It is not possible to reconstruct the actual "waveform" from the samples without loss of information.
>
> Still, statistics will be able to extract useful information from the data.
> My point is simply: The answers won't be in a signal processing textbook, but in one on statistics. Well, probably :)
>
> Cheers
>
> Markus

Thanks everyone for the thoughtful replies.

Yes Jerry, we have similar situations it appears. Your point about surging and gaming the system is well taken.

Markus suggests using statistics, which is what I was driving at regarding peaks I suppose.

Here's a description of the data / source:

The plant costs (and someday the need for added capacity) are allocated using 3 parameters: flow, BOD and TSS.
The actual values of each vary throughout each day and likely have weekly trends.
The measurements are taken once a week - as I mentioned before.
BOD and TSS are measures of what is *in* the flow and flow is measured in gallons per day.
There is no time sharing - just composite sharing (well, except for surges Jerry!).
There are only two users so shares are calculated by measuring the total at the plant and measuring one of the users and taking the difference. Not ideal but probably OK for our purposes. The measurement points are physically separated by 1/2 mile. There is no place to measure the other user individually (that's us) because of mixing ahead of the plant where the total is taken.

Capacity is a constant determined from design parameters and operational results, and is deemed by the state dept. of ecology.
Capacity is shared among the two users on an allocated basis (due to investment and continuing payments).

So, the objectives of measurement and analysis are threefold:

1) Determine over the long term how much of the deemed capacity is being used. This would lead to investment in additional capacity when the time comes.

2) Determine each month how to allocate variable costs based on share of flow, BOD and TSS plugged into a suitable formula. This is the most interesting from an analysis point of view. Jerry's comments seem to suggest that perhaps calculation of payments monthly is too frequent....

3) Determine over some reasonable time frame if one user has encroached on the capacity of the other. This is the next most interesting. It suggests averages of at least 3 months and possibly as long as a year. At present, no user has exceeded their allocated share of capacity using a 90-day average.

If one were to use statistical tools in #2, what might be suggested?

Fred
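The "measure the total, measure one user, take the difference" scheme described above can be sketched as follows. Several things here are assumptions rather than facts from the thread: that BOD and TSS are reported as concentrations in mg/L, that flow is expressed in millions of gallons per day (MGD), and that shares are taken on mass load. The 8.34 factor is the standard conversion from MGD times mg/L to lb/day; the function names and sample numbers are hypothetical.

```python
LB_PER_MGD_MG_PER_L = 8.34  # lb/day per (MGD * mg/L), standard conversion

def mass_load_lb_per_day(flow_mgd, conc_mg_per_l):
    """Mass load carried by a stream. Loads add across streams;
    concentrations do not, so differences must be taken on loads."""
    return LB_PER_MGD_MG_PER_L * flow_mgd * conc_mg_per_l

def shares_by_difference(total_flow, total_conc, user_flow, user_conc):
    """Shares for the measured user and for the other user, whose figures
    are inferred as (plant total) minus (measured user)."""
    total_load = mass_load_lb_per_day(total_flow, total_conc)
    user_load = mass_load_lb_per_day(user_flow, user_conc)
    other_load = total_load - user_load  # can go negative with noisy data
    return {
        "flow": (user_flow / total_flow, (total_flow - user_flow) / total_flow),
        "load": (user_load / total_load, other_load / total_load),
    }

# Hypothetical readings: plant total 1.0 MGD at 200 mg/L BOD,
# measured user 0.4 MGD at 300 mg/L BOD.
print(shares_by_difference(1.0, 200.0, 0.4, 300.0))
```

One caution with this design: the inferred user's numbers absorb the measurement error of both meters and both lab results, so their share is noisier than the directly measured one.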
Reply by ●September 11, 2007
Fred Marshall wrote:

...

> So, the objectives of measurement and analysis are threefold:
>
> 1) Determine over the long term how much of the deemed capacity is being used. This would lead to investment in additional capacity when the time comes.
>
> 2) Determine each month how to allocate variable costs based on share of flow, BOD and TSS plugged into a suitable formula. This is the most interesting from an analysis point of view. Jerry's comments seem to suggest that perhaps calculation of payments monthly is too frequent....
>
> 3) Determine over some reasonable time frame if one user has encroached on the capacity of the other. This is the next most interesting. It suggests averages of at least 3 months and possibly as long as a year. At present, no user has exceeded their allocated share of capacity using a 90-day average.
>
> If one were to use statistical tools in #2, what might be suggested?

SBRSA measures far more frequently than you, as I wrote earlier. We compile flow data for a year and also track operating costs for that year. Estimated charges for each municipality are assessed in time for the next year's municipal budgets. Payments are due monthly. At the end of the year, actual charges for the past year are computed. Excess payments are refunded immediately, and shortfalls are added to the new year's estimates.

Since each user is expected to have paid a total of the capital costs based on that user's actual flow, year-to-year variations due to weather-related I&I (inflow and infiltration to the uninitiated) would cause large cash transfers between municipalities now that the plant is over 25 years old. We used to average for a shorter period, but some years ago we negotiated a new scheme that averages over seven years and addresses the problem of the last year of bond repayment. For more details, John Kantorek, the Executive Director, or Toni Pichola, the Manager of Engineering, would probably be polite if you tell them I sent you :-). Better yet, Pat Carlino, the Office Manager, will direct you wherever you can be best served. You don't need an extension for Pat. The others are listed on http://www.sbrsa.com/contact.html, along with the phone.

Jerry
--
Engineering is the art of making what you want from things you can get.
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
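The estimate-then-settle cycle and the multi-year averaging described above might be sketched like this. It is an illustration only, not SBRSA's actual formula: the function names, the pooled-flow form of the average, and every figure are hypothetical.

```python
def averaged_share(yearly_flows, user, years=7):
    """User's share of total flow pooled over the most recent `years`
    entries. Each entry is a dict mapping user name to annual flow."""
    recent = yearly_flows[-years:]
    user_total = sum(year[user] for year in recent)
    grand_total = sum(sum(year.values()) for year in recent)
    return user_total / grand_total

def year_end_true_up(paid, shares, actual_total_cost):
    """Compare what each user paid on estimates with the actual charge.
    Returns (refunds, carried): overpayments to refund now, and
    shortfalls to add to next year's estimate."""
    actual = {u: shares[u] * actual_total_cost for u in paid}
    refunds = {u: max(0.0, paid[u] - actual[u]) for u in paid}
    carried = {u: max(0.0, actual[u] - paid[u]) for u in paid}
    return refunds, carried

# Hypothetical two-user example.
flows = [{"A": 1.0, "B": 3.0}, {"A": 3.0, "B": 1.0}]
shares = {u: averaged_share(flows, u) for u in ("A", "B")}
refunds, carried = year_end_true_up({"A": 600.0, "B": 400.0}, shares, 1000.0)
print(shares, refunds, carried)
```

Pooling flow over several years before taking the share is one way to keep a single wet year from swinging the allocation; other averaging rules would serve the same purpose.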
Reply by ●September 11, 2007
"Jerry Avins" <jya@ieee.org> wrote in message news:46E7009E.3020800@ieee.org...

> Fred Marshall wrote:
>
> ...
>
>> So, the objectives of measurement and analysis are threefold:
>>
>> 1) Determine over the long term how much of the deemed capacity is being used. This would lead to investment in additional capacity when the time comes.
>>
>> 2) Determine each month how to allocate variable costs based on share of flow, BOD and TSS plugged into a suitable formula. This is the most interesting from an analysis point of view. Jerry's comments seem to suggest that perhaps calculation of payments monthly is too frequent....
>>
>> 3) Determine over some reasonable time frame if one user has encroached on the capacity of the other. This is the next most interesting. It suggests averages of at least 3 months and possibly as long as a year. At present, no user has exceeded their allocated share of capacity using a 90-day average.
>>
>> If one were to use statistical tools in #2, what might be suggested?
>
> SBRSA measures far more frequently than you, as I wrote earlier. We compile flow data for a year and also track operating costs for that year. Estimated charges for each municipality are assessed in time for the next year's municipal budgets. Payments are due monthly. At the end of the year, actual charges for the past year are computed. Excess payments are refunded immediately, and shortfalls are added to the new year's estimates.
>
> Since each user is expected to have paid a total of the capital costs based on that user's actual flow, year-to-year variations due to weather-related I&I (inflow and infiltration to the uninitiated) would cause large cash transfers between municipalities now that the plant is over 25 years old. We used to average for a shorter period, but some years ago we negotiated a new scheme that averages over seven years and addresses the problem of the last year of bond repayment. For more details, John Kantorek, the Executive Director, or Toni Pichola, the Manager of Engineering, would probably be polite if you tell them I sent you :-). Better yet, Pat Carlino, the Office Manager, will direct you wherever you can be best served. You don't need an extension for Pat. The others are listed on http://www.sbrsa.com/contact.html, along with the phone.
>
> Jerry

Jerry,

Thanks. I may well follow up with them.

In our situation, the capacity shares are specified and paid for up front. Each pays to maintain their share of capacity and to operate it. Each pays variable costs according to loading.

Then, it's only a matter of how to pay for encroachments into another's capacity. If, when and how much. In one contract there is the notion of leasing such capacity but the terms aren't spelled out. This has never happened. We considered using the current value of the plant with time value of money to determine the transfer price of capacity. This approach was a bit too complicated for the boards to deal with. So in a later draft there is the notion of "returning" capacity upon new construction. That's actually a pretty good idea if the conditions warrant because ......

In both cases the drafters envisioned long-term, permanent capacity needs. In practice, capacity encroachments happen from time to time and depend on averaging epochs. This leads one to think about renting capacity based on an estimated plant life and pay-for-use per unit time - or to just ignore transient overuse as no harm is done.

Whatever, it's fertile ground for lawyers.....

Fred
Reply by ●September 12, 2007
On 11 Sep, 20:23, "Fred Marshall" <fmarshallx@remove_the_x.acm.org> wrote:

> Here's a description of the data / source:

Fred,

Only having browsed the thread very quickly, my impression is that the traditional tools for DSP are not the best for this sort of problem.

If I were asked to analyze this sort of problem I would have tried to use the Kalman filter. Check out the book by Durbin and Koopman on Kalman filters in econometric applications.

Rune
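To give a flavor of the Kalman filter suggestion, here is a minimal scalar sketch for a local-level model (a slowly drifting level observed with noise). It is not a recommendation from the thread for specific settings: the model choice, the variances, the starting values, and the readings are all assumptions picked for illustration. Days without a measurement are passed as `None`, so only the prediction step runs and the uncertainty grows until the next weekly reading.

```python
def local_level_filter(observations, process_var, obs_var, init_mean, init_var):
    """Scalar Kalman filter for: level[t] = level[t-1] + process noise,
    y[t] = level[t] + measurement noise. `None` marks a missing observation.
    Returns a list of (estimated level, estimate variance) per step."""
    mean, var = init_mean, init_var
    estimates = []
    for y in observations:
        # Predict: the level is a random walk, so uncertainty increases.
        var = var + process_var
        if y is not None:
            # Update: blend prediction and measurement by their variances.
            gain = var / (var + obs_var)
            mean = mean + gain * (y - mean)
            var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates

# Hypothetical weekly readings placed on a daily grid (None on other days).
weekly_values = [210.0, 250.0, 190.0, 300.0]
daily_obs = []
for value in weekly_values:
    daily_obs.extend([value] + [None] * 6)

for day, (level, variance) in enumerate(
        local_level_filter(daily_obs, process_var=4.0, obs_var=400.0,
                           init_mean=200.0, init_var=1000.0)):
    if day % 7 == 0:
        print(day, round(level, 1), round(variance, 1))
```

Unlike a sample-and-hold average, the filter reports a variance alongside each estimate, which gives a direct handle on how much a sparse sampling schedule can be trusted at any point in time.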






