> Richard Owlett wrote:
>
>> I've sort of been following several threads concerned with irregular
>> sampling of data and various noise effects.
>>
>> I think I might benefit from a *QUALITATIVE* discussion of how to
>> approach a problem which APPEARS to be outside realm of DSP.
>> [The problem is "real" but ... ]
>>
>> QUESTION:
>> What is fuel mileage of a class of vehicles.
>>
>> AVAILABLE DATA (and possible error sources)
>> [source is credit card purchase record]
>> Date - 0 error
>> Time - 0 error
>> Gallons - 0 error
>> Vehicle Odometer - entered by [ possibly careless ] human :{
>> transposed digits - you don't have to be dyslectic to have problem
>> careless entry - hopefully low frequency
>>
>> POSSIBLY AVAILABLE DATA:
>> very few known good date/time/gallons/odometer data points
>>
>> I can see how to approach problem if dependent variable(gallons) is
>> error prone. But what to do if independent variable(odometer) is
>> unreliable?
>>
>> Secondary question.
>> Can you end up in this kind of mess in DSP?
>>
>> For any replies - thanks.
>
>
> Ignoring that different driving (and engine) conditions affect mileage,
> dividing total miles by total gallons gives the number you want. The
> intermediate data points have no effect at all. An erroneous record of
> miles at fill-up will make one leg longer and the other shorter. Two
> differences determine the final result unless you have cause to discard
> an end point from consideration.
>
> Jerry
Which goes to demonstrate once again why many regulars routinely caution
newbies to state their question carefully or risk a correct but not useful answer.
I was trying to understand how to deal with two similar problems:
1. irregular sampling intervals.
2. the sampling abscissa (the independent variable) is noisy, as well as the sampled data.
I chose fuel mileage as a trivial physical problem that I understood.
Fred's and Rune's replies in particular pointed out how to "think".
Reply by Jerry Avins●April 13, 2005
Richard Owlett wrote:
> I've sort of been following several threads concerned with irregular
> sampling of data and various noise effects.
>
> I think I might benefit from a *QUALITATIVE* discussion of how to
> approach a problem which APPEARS to be outside realm of DSP.
> [The problem is "real" but ... ]
>
> QUESTION:
> What is fuel mileage of a class of vehicles.
>
> AVAILABLE DATA (and possible error sources)
> [source is credit card purchase record]
> Date - 0 error
> Time - 0 error
> Gallons - 0 error
> Vehicle Odometer - entered by [ possibly careless ] human :{
> transposed digits - you don't have to be dyslectic to have problem
> careless entry - hopefully low frequency
>
> POSSIBLY AVAILABLE DATA:
> very few known good date/time/gallons/odometer data points
>
> I can see how to approach problem if dependent variable(gallons) is
> error prone. But what to do if independent variable(odometer) is
> unreliable?
>
> Secondary question.
> Can you end up in this kind of mess in DSP?
>
> For any replies - thanks.
Ignoring that different driving (and engine) conditions affect mileage,
dividing total miles by total gallons gives the number you want. The
intermediate data points have no effect at all. An erroneous record of
miles at fill-up will make one leg longer and the other shorter. Two
differences determine the final result unless you have cause to discard
an end point from consideration.
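A minimal sketch of why the intermediate readings drop out, assuming the tank is filled at every purchase so that each purchase equals the fuel burned since the previous one. The function name and the sample log are made up for illustration; they are not from the thread.

```python
def overall_mpg(odometer, gallons):
    """Total miles divided by total gallons.

    odometer[i] is the reading at fill-up i; gallons[i] is the fuel bought at
    fill-up i, taken to be the fuel burned since fill-up i-1 (assumption: the
    tank is filled every time). The first purchase is excluded because the
    miles driven before it are unknown.
    """
    total_miles = odometer[-1] - odometer[0]
    total_gallons = sum(gallons[1:])
    return total_miles / total_gallons


good = [10000, 10300, 10600, 10900]
fuel = [9.0, 10.0, 10.0, 10.0]
# Same log with the digits of the second reading transposed (10300 -> 10030).
bad = [10000, 10030, 10600, 10900]

mpg_good = overall_mpg(good, fuel)
mpg_bad = overall_mpg(bad, fuel)
# Per-leg figures for the bad log: one leg comes out short, the next one long.
legs_bad = [(b - a) / g for a, b, g in zip(bad, bad[1:], fuel[1:])]
```

The per-leg differences telescope, so only the first and last odometer readings enter the total; an error in either end point would still corrupt the result.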
Jerry
--
Engineering is the art of making what you want from things you can get.
Reply by Richard Owlett●April 13, 2005
Fred Marshall wrote:
> "Richard Owlett" <rowlett@atlascomm.net> wrote in message
> news:115ogtrkdp7jj7f@corp.supernews.com...
>
>>I've sort of been following several threads concerned with irregular
>>sampling of data and various noise effects.
>>
>>I think I might benefit from a *QUALITATIVE* discussion of how to approach
>>a problem which APPEARS to be outside realm of DSP.
>>[The problem is "real" but ... ]
>>
>>QUESTION:
>>What is fuel mileage of a class of vehicles.
>>
>>AVAILABLE DATA (and possible error sources)
>>[source is credit card purchase record]
>> Date - 0 error
>> Time - 0 error
>> Gallons - 0 error
>> Vehicle Odometer - entered by [ possibly careless ] human :{
>> transposed digits - you don't have to be dyslectic to have problem
>> careless entry - hopefully low frequency
>>
>>POSSIBLY AVAILABLE DATA:
>> very few known good date/time/gallons/odometer data points
>>
>>I can see how to approach problem if dependent variable(gallons) is error
>>prone. But what to do if independent variable(odometer) is unreliable?
>>
>>Secondary question.
>>Can you end up in this kind of mess in DSP?
>>
>
>
> Sure. This suggests plotting the data in some form.
>
> Since you have time accurately, one notion would be to record calculated
> miles per gallon as a function of time. Then you can use a variety of
> methods to get rid of bad data points known as "outliers". Then, if you are
> willing to assume that gas mileage is a constant, you can fit a flat,
> straight line to the remaining data using a least squares fit.
>
> Fred
>
>
Thanks. Once I see the answer it becomes obvious ;) Looks like my basic
problem was making the problem more complicated than it was.
This kick-starts me into thinking about other facts I know that allow
me to treat bad data points appropriately.
1. It's unlikely that one day's MPG would vary more than x% from average.
2. For >90% of cases, total mileage for a particular day WILL BE
     A + i*B + j*C + k*D
   where i and j can be 0|1|2
         k can be 0|1
         B|C << A
         D << B|C
In the simplest case, a bad odometer/fuel record can be deleted and the fuel
used can be added to the next day's record. If LMS is deemed necessary
rather than a simple average, that point can be given 'double weight'.
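A rough sketch of that bookkeeping, under the assumption that the bad records have already been flagged by some other means (for example the x% rule above). The function name, the record layout, and the sample data are hypothetical.

```python
def merge_bad_records(records, is_bad):
    """Drop records flagged bad, carrying their fuel into the next kept record.

    records: list of (odometer, gallons) tuples in time order.
    is_bad:  list of bools, True where the odometer entry is not trusted.
    Returns a list of (odometer, gallons, weight) tuples, where weight counts
    how many original records the entry stands for (2 = 'double weight').
    Flagged records at the very end of the log are simply dropped.
    """
    merged = []
    carried_gallons = 0.0
    carried_count = 0
    for (odo, gal), bad in zip(records, is_bad):
        if bad:
            # Keep the fuel, forget the untrusted odometer reading.
            carried_gallons += gal
            carried_count += 1
            continue
        merged.append((odo, gal + carried_gallons, 1 + carried_count))
        carried_gallons = 0.0
        carried_count = 0
    return merged


records = [(10000, 9.0), (10030, 10.0), (10600, 10.0), (10900, 10.0)]
is_bad = [False, True, False, False]
merged = merge_bad_records(records, is_bad)
```

The weight column is only one way to express the 'double weight' idea; weighting each leg by its gallons would be another.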
So this is exercise in THINKING
not dsp :)
I'll quit babbling.
Thank you Fred, Peter, Rune
Reply by Rune Allnor●April 13, 2005
Richard Owlett wrote:
> I've sort of been following several threads concerned with irregular
> sampling of data and various noise effects.
>
> I think I might benefit from a *QUALITATIVE* discussion of how to
> approach a problem which APPEARS to be outside realm of DSP.
> [The problem is "real" but ... ]
>
> QUESTION:
> What is fuel mileage of a class of vehicles.
>
> AVAILABLE DATA (and possible error sources)
> [source is credit card purchase record]
> Date - 0 error
> Time - 0 error
> Gallons - 0 error
> Vehicle Odometer - entered by [ possibly careless ] human :{
> transposed digits - you don't have to be dyslectic to have problem
> careless entry - hopefully low frequency
>
> POSSIBLY AVAILABLE DATA:
> very few known good date/time/gallons/odometer data points
>
> I can see how to approach problem if dependent variable(gallons) is
> error prone. But what to do if independent variable(odometer) is
> unreliable?
>
> Secondary question.
> Can you end up in this kind of mess in DSP?
>
> For any replies - thanks.
This looks to me like a Total Least Squares (TLS) problem.
The difference between TLS and the "usual" Least Mean Squares (LMS)
problem is that LMS inherently assumes there is uncertainty/error
only in the dependent variable, and not in the independent variable
(the parameter). The TLS method, on the other hand, allows for the
parameter being "noisy" as well.
The computations in TLS can be a bit tricky, though, and are
based on concepts from linear algebra. Check out section 12.3 of
Golub & Van Loan: "Matrix Computations", 3rd ed.
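For the special case of a single straight line, the TLS fit has a closed form (orthogonal regression), so a small sketch is possible without the SVD machinery of the general method. This is not the algorithm from the book; it assumes equal error variance on both axes (which for miles and gallons would need the data to be scaled first) and a non-zero covariance. All names and numbers are illustrative.

```python
import math


def _sums(x, y):
    """Means and centred sums of squares/products for two sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return mx, my, sxx, syy, sxy


def ols_line(x, y):
    """Ordinary least squares: errors assumed in y only."""
    mx, my, sxx, _, sxy = _sums(x, y)
    b = sxy / sxx
    return my - b * mx, b


def tls_line(x, y):
    """Orthogonal regression: minimises perpendicular distances to the line."""
    mx, my, sxx, syy, sxy = _sums(x, y)
    b = (syy - sxx + math.hypot(syy - sxx, 2.0 * sxy)) / (2.0 * sxy)
    return my - b * mx, b


x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 2.0, 1.0, 3.0]
a_ols, b_ols = ols_line(x, y)
a_tls, b_tls = tls_line(x, y)
# Swapping the axes inverts the TLS slope; OLS does not have this symmetry.
_, b_tls_swapped = tls_line(y, x)
```

On this toy data the ordinary fit gives a shallower slope than the orthogonal one, which is the usual effect of ignoring noise in the independent variable.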
Rune
Reply by Peter K.●April 12, 2005
Richard Owlett wrote:
> I can see how to approach problem if dependent variable(gallons) is
> error prone. But what to do if independent variable(odometer) is
> unreliable?
If one is known to be reliable, just make it the independent variable.
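One way to read this suggestion, as a sketch only: take cumulative gallons (assumed exact) as x and the odometer reading as y, and fit a straight line by ordinary least squares, so the slope comes out in miles per gallon. The function names and the sample log are made up, and the tank is assumed to be filled at every purchase.

```python
def ols_slope(x, y):
    """Slope of the ordinary least-squares line of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def mpg_by_regression(odometer, gallons):
    """Regress odometer readings on cumulative fuel burned since the first fill-up."""
    cumulative = [0.0]
    for g in gallons[1:]:
        cumulative.append(cumulative[-1] + g)
    return ols_slope(cumulative, odometer)


odometer = [10000, 10300, 10600, 10900]
fuel = [9.0, 10.0, 10.0, 10.0]
slope = mpg_by_regression(odometer, fuel)
```

A plain least-squares fit like this is still pulled around by gross errors such as transposed digits, so it would normally be combined with some outlier screening.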
> Secondary question.
> Can you end up in this kind of mess in DSP?
Sure! The image processing problem of trying to find straight lines
in a noisy set of points has the same sorts of problem --- except
that there is noise on both the "x" and "y" axes (independent and
dependent variables).
Ciao,
Peter K.
Reply by Fred Marshall●April 12, 2005
"Richard Owlett" <rowlett@atlascomm.net> wrote in message
news:115ogtrkdp7jj7f@corp.supernews.com...
> I've sort of been following several threads concerned with irregular
> sampling of data and various noise effects.
>
> I think I might benefit from a *QUALITATIVE* discussion of how to approach
> a problem which APPEARS to be outside realm of DSP.
> [The problem is "real" but ... ]
>
> QUESTION:
> What is fuel mileage of a class of vehicles.
>
> AVAILABLE DATA (and possible error sources)
> [source is credit card purchase record]
> Date - 0 error
> Time - 0 error
> Gallons - 0 error
> Vehicle Odometer - entered by [ possibly careless ] human :{
> transposed digits - you don't have to be dyslectic to have problem
> careless entry - hopefully low frequency
>
> POSSIBLY AVAILABLE DATA:
> very few known good date/time/gallons/odometer data points
>
> I can see how to approach problem if dependent variable(gallons) is error
> prone. But what to do if independent variable(odometer) is unreliable?
>
> Secondary question.
> Can you end up in this kind of mess in DSP?
>
Sure. This suggests plotting the data in some form.
Since you have time accurately, one notion would be to record calculated
miles per gallon as a function of time. Then you can use a variety of
methods to get rid of bad data points known as "outliers". Then, if you are
willing to assume that gas mileage is a constant, you can fit a flat,
straight line to the remaining data using a least squares fit.
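The steps above can be sketched as follows. The least-squares fit of a flat line is just the mean of the retained points; the outlier rule used here (distance from the median, measured in median absolute deviations, with a threshold k) is only one of the "variety of methods" and is an assumption of this sketch, as are the names and the sample values.

```python
from statistics import mean, median


def flat_fit_without_outliers(mpg, k=3.0):
    """Reject outliers by a median/MAD rule, then fit a constant.

    mpg: per-fill-up miles-per-gallon values (time order does not matter
         once mileage is assumed constant).
    k:   rejection threshold in units of the median absolute deviation.
    Returns (level, kept), where level is the least-squares constant,
    i.e. the mean of the kept values.
    """
    med = median(mpg)
    mad = median([abs(v - med) for v in mpg])
    if mad == 0:
        kept = list(mpg)  # no spread to judge by; keep everything
    else:
        kept = [v for v in mpg if abs(v - med) <= k * mad]
    return mean(kept), kept


# Two values spoiled by one bad odometer entry (3.0 and 57.0).
mpg_per_fill = [29.0, 31.0, 3.0, 57.0, 30.0, 30.5, 29.5]
level, kept = flat_fit_without_outliers(mpg_per_fill)
```

Note that a single bad odometer reading spoils two consecutive mpg values, one low and one high, so both get rejected here.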
Fred
Reply by Richard Owlett●April 12, 2005
I've sort of been following several threads concerned with irregular
sampling of data and various noise effects.
I think I might benefit from a *QUALITATIVE* discussion of how to
approach a problem which APPEARS to be outside realm of DSP.
[The problem is "real" but ... ]
QUESTION:
What is fuel mileage of a class of vehicles.
AVAILABLE DATA (and possible error sources)
[source is credit card purchase record]
Date - 0 error
Time - 0 error
Gallons - 0 error
Vehicle Odometer - entered by [ possibly careless ] human :{
transposed digits - you don't have to be dyslectic to have problem
careless entry - hopefully low frequency
POSSIBLY AVAILABLE DATA:
very few known good date/time/gallons/odometer data points
I can see how to approach problem if dependent variable(gallons) is
error prone. But what to do if independent variable(odometer) is unreliable?
Secondary question.
Can you end up in this kind of mess in DSP?
For any replies - thanks.