DSPRelated.com

Possible repost - Using least squares cubic spline fitting as a "filter", what is analogous to Nyquist criterion?

Started by Richard Owlett November 10, 2009
Back in September, Zevv titled a post "Isolating semi-periodic 
waveforms in a signal" in which he gave a link to 
http://tinyurl.com/pmgraph (a plot of household power usage). He 
was kind enough to send me a week's worth of raw data (sampled 
every 2 seconds).

As I am visually oriented, the first thing I did was to plot the 
data. I saw at least two distinct types of noise:
      1. apparently random small fluctuations
      2. large spikes associated with state changes

As these spikes were very large, independent of size of following 
level shift, and *EXACTLY* of 1 sample duration, I arbitrarily 
replaced them with the value of the following data point.
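
A minimal Python sketch of that spike replacement follows. The function name, the `threshold` value and the sample numbers are made up for illustration; the only rule taken from the description above is "a spike lasts exactly one sample and is replaced by the following data point". Treating a sample as a spike when it sticks out past both neighbours by more than `threshold` is an assumption about how such spikes could be detected.

```python
def replace_spikes(samples, threshold):
    """Replace one-sample spikes with the value of the following sample.

    A sample is treated as a spike when it lies above BOTH neighbours,
    or below BOTH neighbours, by more than `threshold` (an assumed rule).
    """
    cleaned = list(samples)
    for i in range(1, len(samples) - 1):
        from_prev = samples[i] - samples[i - 1]
        from_next = samples[i] - samples[i + 1]
        sticks_up = from_prev > threshold and from_next > threshold
        sticks_down = from_prev < -threshold and from_next < -threshold
        if sticks_up or sticks_down:
            cleaned[i] = samples[i + 1]
    return cleaned


# Hypothetical readings: a spike of 2000 at a state change from ~100 to ~600.
power = [100, 101, 99, 2000, 600, 602, 598, 601]
cleaned = replace_spikes(power, threshold=300)
print(cleaned)
```

A plain level shift (100, 600, 600) is left alone, because the middle sample differs from only one of its neighbours.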

My next iteration was to replace all samples between a pair of 
state changes with the average during that period. That was 
useful to point out what would have to be taken into account for 
a better approximation.
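
That averaging step could look roughly like the Python sketch below. How a "state change" is detected is not stated above, so the rule used here (a jump larger than `jump` between consecutive samples starts a new state) is an assumption, as are the function name and the numbers.

```python
def segment_means(samples, jump):
    """Replace every sample by the mean of the state it belongs to.

    A new state is assumed to start wherever two consecutive samples
    differ by more than `jump`.
    """
    levels = []
    start = 0
    for i in range(1, len(samples) + 1):
        at_end = i == len(samples)
        if at_end or abs(samples[i] - samples[i - 1]) > jump:
            segment = samples[start:i]
            mean = sum(segment) / len(segment)
            levels.extend([mean] * len(segment))
            start = i
    return levels


# Hypothetical de-spiked readings with one state change.
readings = [100, 102, 98, 600, 604, 596, 600]
levels = segment_means(readings, jump=200)
print(levels)
```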

The next thing was to consider doing a running average over a set 
of n samples between state changes. Two problems:
    1. how to choose n
    2. what to do within n samples of start/end of current state
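
One possible answer to the second problem, sketched in Python under stated assumptions: use a centered window and shrink it symmetrically near the start and end of a state so it never reaches across a state change. This is only one of several choices (padding or one-sided windows are others), and it leaves the very first and last samples unsmoothed. The function name and the data are hypothetical, and `n` is still left for the user to pick.

```python
def running_average(segment, n):
    """Centered running mean over the samples of ONE state.

    Near either end the window shrinks symmetrically, so it never
    uses samples from outside the segment.
    """
    half = n // 2
    smoothed = []
    for i in range(len(segment)):
        # Largest symmetric reach that still fits inside the segment.
        reach = min(half, i, len(segment) - 1 - i)
        window = segment[i - reach:i + reach + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical samples from a single state.
state = [10, 12, 14, 13, 11, 12]
smooth = running_average(state, n=3)
print(smooth)
```

Keeping the window symmetric means the smoothed curve is not shifted in time relative to the raw data.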

If this were "the good old days", I would grab graph paper, 
french curve and a straight edge to do some calibrated eyeball 
curve fitting.

A later post to another group showed up, but this one did not,
so "if at first you don't succeed ..."


But it's not "the good old days" and I want a less tedious and 
more reproducible method. I looked at tools available in Scilab 
and came across "lsq_splin" which, given m data points and n 
breakpoints (m > n, with m >> n implied), generates *a* curve of 
m points which is a least squares fit.

I've done some playing/experimenting and demonstrated that too many 
breakpoints is as poor a solution as too few (surprise surprise).
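
One way to put a number on "too many vs. too few" is to fit on half of the samples and measure the error on the other half (a hold-out check). The Python sketch below is not Scilab's lsq_splin: it is a small stand-in that fits a least squares cubic spline with evenly spaced breakpoints using a truncated power basis, chosen only because it is short. That basis becomes badly conditioned as breakpoints are added (library routines generally use B-splines instead), so keep the counts small. All names and the test signal are hypothetical.

```python
import math
import random


def basis_row(u, knots):
    """Cubic truncated-power basis at u: 1, u, u^2, u^3, (u - k)^3 for u > k."""
    return [1.0, u, u * u, u ** 3] + [max(u - k, 0.0) ** 3 for k in knots]


def solve(a, b):
    """Solve a*c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for j in range(col, n + 1):
                m[r][j] -= factor * m[col][j]
    c = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * c[j] for j in range(i + 1, n))
        c[i] = (m[i][n] - tail) / m[i][i]
    return c


def fit_spline(xs, ys, n_breaks):
    """Least squares cubic spline; n_breaks evenly spaced breakpoints
    (the two end points plus n_breaks - 2 interior ones). Returns a function."""
    lo, hi = min(xs), max(xs)

    def scale(x):
        return (x - lo) / (hi - lo)  # map the data range onto [0, 1]

    knots = [i / (n_breaks - 1) for i in range(1, n_breaks - 1)]
    rows = [basis_row(scale(x), knots) for x in xs]
    p = len(rows[0])
    # Normal equations: (B^T B) c = B^T y
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    coef = solve(gram, rhs)
    return lambda x: sum(c * b for c, b in zip(coef, basis_row(scale(x), knots)))


def holdout_rms(xs, ys, n_breaks):
    """Fit on even-indexed samples, return RMS error on odd-indexed ones."""
    f = fit_spline(xs[0::2], ys[0::2], n_breaks)
    errs = [(f(x) - y) ** 2 for x, y in zip(xs[1::2], ys[1::2])]
    return math.sqrt(sum(errs) / len(errs))


# Hypothetical signal: slow drift plus small random fluctuations, 2 s sampling.
random.seed(0)
xs = [2.0 * i for i in range(120)]
ys = [50 + 20 * math.sin(x / 40.0) + random.gauss(0, 2) for x in xs]
for n_breaks in (2, 4, 8):
    print(n_breaks, round(holdout_rms(xs, ys, n_breaks), 3))
```

If the hold-out error stops improving, or starts rising, as breakpoints are added, the extra breakpoints are most likely fitting the noise rather than the underlying curve.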

What guidelines are there for choosing number and location of 
breakpoints?

What are good search terms to use so that Google would show 
informative pages?

I'm explicitly looking to do _piecewise_ approximations, as 
creating an analytical function to represent discontinuous data 
is a fool's errand. I tilt at windmills enough already ;/

{P.S. This student hasn't been in math class for ~50 yrs]

TIA


Richard Owlett <rowlett@pcnetinc.com> writes:
> [...]
> {P.S. This student hasn't been in math class for ~50 yrs]

Today is the first day of the rest of your life...

Seriously, nothing's stopping you from jumping back in the game. I do it
regularly (whether it's been 5 months, 5 years, or 5 decades...).
-- 
Randy Yates                      % "...the answer lies within your soul
Digital Signal Labs              %       'cause no one knows which side
mailto://yates@ieee.org          %                   the coin will fall."
http://www.digitalsignallabs.com % 'Big Wheels', *Out of the Blue*, ELO

Randy Yates wrote:
> Richard Owlett <rowlett@pcnetinc.com> writes:
>> [...]
>> {P.S. This student hasn't been in math class for ~50 yrs]
>
> Today is the first day of the rest of your life...
>
> Seriously, nothing's stopping you from jumping back in the game. I do it
> regularly (whether it's been 5 months, 5 years, or 5 decades...).

ARGHHH! *LOL*
The fee for a youngster harassing his elder is an answer to the question ;)

Seriously, I added the PS to indicate that I KNEW I was ignorant.
I have NEITHER inclination nor resources to pursue further FORMAL
education.

Hints to questions I did ask? ;)

{ Wish I could remember daffynition of Freshman, SophisMoris, etc etc }

On 10 Nov, 21:20, Richard Owlett <rowl...@pcnetinc.com> wrote:

> What guidelines are there for choosing number and location of
> breakpoints?

You have stumbled upon the 'art' part of 'the art of data
analysis'. Data analysis is a bit deceptive, as it relies
extensively but not entirely on mathematics, which is a
quantitative science.

The deception lies in the fact that when it comes to making
the decisions that are *not* governed by the maths you are
more or less on your own. There might be an established 'best
practice' within a field or user community, but if such a
'best practice' exists at all, it will be based on one or more
qualitative factors like empirical evidence, user experience,
mutual agreements among users and/or clients, and convenience.

Unless you happen to stumble upon a user community that happens
to use *exactly* the same methods as you to answer *exactly* the
same question as you - and who are willing to let you in on their
experiences - the best you can do to get an answer is to play
with your data while keeping some key questions in mind:

1) What do I attempt to achieve?
2) Why do I expect any one particular method to produce
   the results I want?
3) What does it take to implement / apply the method?
4) How well did the method work?
5) How well did the results meet my expectations?
6a) Why did the method work as expected? Did I as user /
    analyst use prior knowledge about the test data to
    set up idealized input, or did I stop the method at
    a point where I knew the result was close to the
    known answer?
6b) Why did the method not work? Did it rely on data
    or information I could not possibly have? Was there
    noise or other imperfections in the data that undermined
    the workings of the method? Was the desired result
    discernible from mere noise?
7) How much prior knowledge about a data set, the generating
   process and the inner workings of the method does it take
   for a user to obtain useful results?

And so on. I know, it's a long list of questions (and quite
a few of them require some discipline to ask oneself, particularly
when stakes are high or when working alone), but there are no
other ways to learn.

Rune
Rune Allnor wrote:
> [...]

I hadn't set out exactly those questions. But brick walls of
reality effectively required me to answer them.

My initial goal may have been too ambitious for my abilities.
So I attack smaller problems that come to light.

Retirement is for learning what you never learned in school ;)