[A later post to another group showed up, but this one did not, so "if at first you don't succeed .... .. . "]

Back in September, Zevv posted "Isolating semi-periodic waveforms in a signal", in which he gave a link to http://tinyurl.com/pmgraph (a plot of household power usage). He was kind enough to send me a week's worth of raw data (sampled every 2 seconds).

As I am visually oriented, the first thing I did was to plot the data. I saw at least two distinct types of noise:
1. apparently random small fluctuations
2. large spikes associated with state changes

As these spikes were very large, independent of the size of the following level shift, and *EXACTLY* 1 sample in duration, I arbitrarily replaced them with the value of the following data point.

My next iteration was to replace all samples between a pair of state changes with the average during that period. That was useful for pointing out what would have to be taken into account for a better approximation.

The next thing was to consider doing a running average over a set of n samples between state changes. Two problems:
1. how to choose n
2. what to do within n samples of the start/end of the current state

If this were "the good old days", I would grab graph paper, a French curve and a straight edge to do some calibrated eyeball curve fitting. But it's not "the good old days", and I want a less tedious and more reproducible method.

I looked at the tools available in Scilab and came across "lsq_splin" which, given m data points and n breakpoints (m>n, >> implied), generates *a* curve of m points which is a least squares fit. I've done some playing/experimenting and demonstrated that too many breakpoints is as poor a solution as too few (surprise surprise).

What guidelines are there for choosing the number and location of breakpoints?
What are good search terms to use so that Google would show informative pages?

I'm explicitly looking to do _piecewise_ approximations, as creating an analytical function to represent discontinuous data is a fool's errand. I tilt at windmills enough already ;/

{P.S. This student hasn't been in math class for ~50 yrs}

TIA
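For anyone who wants to try the first two steps described above (swap each one-sample spike for the following value, then replace each stretch between state changes with its average), here is a minimal Python sketch. It is not the code that was run on the original data: the function names `despike` and `segment_means`, both thresholds, and the toy readings are assumptions made up for illustration.

```python
def despike(samples, spike_threshold):
    """Replace isolated one-sample spikes with the value of the following sample.

    A sample counts as a spike (assumption) when it sticks out from BOTH
    neighbours, in the same direction, by more than spike_threshold.
    """
    out = list(samples)
    for i in range(1, len(samples) - 1):
        prev_v, cur, next_v = samples[i - 1], samples[i], samples[i + 1]
        above = cur - prev_v > spike_threshold and cur - next_v > spike_threshold
        below = prev_v - cur > spike_threshold and next_v - cur > spike_threshold
        if above or below:
            out[i] = next_v
    return out


def segment_means(samples, step_threshold):
    """Replace every run between state changes with the mean of that run.

    A state change is assumed wherever two consecutive samples differ by
    more than step_threshold.
    """
    out = []
    start = 0
    for i in range(1, len(samples) + 1):
        at_end = i == len(samples)
        if at_end or abs(samples[i] - samples[i - 1]) > step_threshold:
            run = samples[start:i]
            out.extend([sum(run) / len(run)] * len(run))
            start = i
    return out


# Toy readings (made up): small fluctuations, one spike at a state change.
raw = [100, 102, 98, 900, 300, 304, 296, 100, 100]
cleaned = despike(raw, spike_threshold=200)
levels = segment_means(cleaned, step_threshold=50)
print(cleaned)
print(levels)
```

Both thresholds have to be picked by looking at the data, which is the same kind of judgment call as choosing n for a running average.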
Possible repost - Using least squares cubic spline fitting as a "filter", what is analogous to Nyquist criterion?
Started by ●November 10, 2009
Reply by ●November 10, 2009
Richard Owlett <rowlett@pcnetinc.com> writes:
> [...]
> {P.S. This student hasn't been in math class for ~50 yrs}

Today is the first day of the rest of your life...

Seriously, nothing's stopping you from jumping back in the game. I do it regularly (whether it's been 5 months, 5 years, or 5 decades...).

--
Randy Yates                      % "...the answer lies within your soul
Digital Signal Labs              %  'cause no one knows which side
mailto://yates@ieee.org          %  the coin will fall."
http://www.digitalsignallabs.com % 'Big Wheels', *Out of the Blue*, ELO
Reply by ●November 10, 2009
Randy Yates wrote:
> Richard Owlett <rowlett@pcnetinc.com> writes:
>> [...]
>> {P.S. This student hasn't been in math class for ~50 yrs}
>
> Today is the first day of the rest of your life...
>
> Seriously, nothing's stopping you from jumping back in the game. I do it
> regularly (whether it's been 5 months, 5 years, or 5 decades...).

ARGHHH! *LOL*
The fee for a youngster harassing his elder is an answer to the question ;)

Seriously, I added the PS to indicate that I KNEW I was ignorant.
I have NEITHER the inclination nor the resources to pursue further FORMAL education.

Hints to the questions I did ask? ;)

{ Wish I could remember the daffynition of Freshman, SophisMoris, etc etc }
Reply by ●November 12, 2009
On 10 Nov, 21:20, Richard Owlett <rowl...@pcnetinc.com> wrote:
> What guidelines are there for choosing number and location of
> breakpoints?

You have stumbled upon the 'art' part of 'the art of data analysis'. Data analysis is a bit deceptive, as it relies extensively but not entirely on mathematics, which is a quantitative science.

The deception lies in the fact that when it comes to making the decisions that are *not* governed by the maths, you are more or less on your own. There might be an established 'best practice' within a field or user community, but if such a 'best practice' exists at all, it will be based on one or more qualitative factors like empirical evidence, user experience, mutual agreements among users and/or clients, and convenience.

Unless you happen to stumble upon a user community that happens to use *exactly* the same methods as you to answer *exactly* the same question as you (and who are willing to let you in on their experiences), the best you can do to get an answer is to play with your data while keeping some key questions in mind:

1) What do I attempt to achieve?
2) Why do I expect any one particular method to produce the results I want?
3) What does it take to implement / apply the method?
4) How well did the method work?
5) How well did the results meet my expectations?
6a) Why did the method work as expected? Did I as user / analyst use prior knowledge about the test data to set up idealized input, or did I stop the method at a point where I knew the result was close to the known answer?
6b) Why did the method not work? Did it rely on data or information I could not possibly have? Was there noise or were there other imperfections in the data that undermined the workings of the method? Was the desired result discernible from mere noise?
7) How much prior knowledge about a data set, the generating process and the inner workings of the method does it take for a user to obtain useful results?

And so on. I know, it's a long list of questions (and quite a few of them require some discipline to ask oneself, particularly when stakes are high or when working alone), but there are no other ways to learn.

Rune
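As one way of "playing with the data" on the breakpoint question, here is a small Python sketch that fits a least-squares cubic spline with too few, a moderate number of, and too many breakpoints. Everything in it is an assumption for illustration: it is not Scilab's `lsq_splin` but a hand-rolled fit on a truncated-power basis, the breakpoints are simply evenly spaced, and the data are a synthetic sine plus noise rather than the power readings. Because the clean curve is known here (exactly the situation question 6a warns about), it only illustrates the too-few / too-many effect and says nothing about where the right number lies for real data.

```python
import numpy as np


def cubic_design(x, knots):
    """Truncated-power basis: 1, x, x^2, x^3, plus (x - k)^3_+ per breakpoint."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)


def lsq_cubic_spline(x, y, n_breaks):
    """Least-squares cubic spline with n_breaks evenly spaced interior breakpoints.

    Returns the fitted curve evaluated at the same x values.
    """
    knots = np.linspace(x[0], x[-1], n_breaks + 2)[1:-1]
    design = cubic_design(x, knots)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return design @ coef


# Synthetic test signal (assumption): four sine cycles plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 400)
clean = np.sin(8 * np.pi * x)
noisy = clean + rng.normal(0.0, 0.3, x.size)

# RMS distance between each fit and the known clean curve.
rms_error = {}
for n_breaks in (1, 20, 100):
    fit = lsq_cubic_spline(x, noisy, n_breaks)
    rms_error[n_breaks] = float(np.sqrt(np.mean((fit - clean) ** 2)))
    print(n_breaks, "breakpoints -> RMS error vs clean curve:", round(rms_error[n_breaks], 3))
```

With one breakpoint the spline cannot follow the four cycles at all, and with 100 it starts chasing the noise, so the middle setting should come out closest to the clean curve in this example.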
Reply by ●November 13, 2009
Rune Allnor wrote:
> On 10 Nov, 21:20, Richard Owlett <rowl...@pcnetinc.com> wrote:
>
>> What guidelines are there for choosing number and location of
>> breakpoints?
>
> You have stumbled upon the 'art' part of 'the art of data
> analysis'. Data analysis is a bit deceptive, as it relies
> extensively but not entirely on mathematics, which is a
> quantitative science.
>
> The deception lies in the fact that when it comes to making
> the decisions that are *not* governed by the maths you are
> more or less on your own. There might be an established 'best
> practice' within a field or user community, but if such a
> 'best practice' exists at all, it will be based on one or more
> qualitative factors like empirical evidence, user experience,
> mutual agreements among users and/or clients, and convenience.
>
> Unless you happen to stumble upon a user community that happens
> to use *exactly* the same methods as you to answer *exactly* the
> same question as you - and who are willing to let you in on their
> experiences - the best you can do to get an answer is to play
> with your data while keeping some key questions in mind:
>
> 1) What do I attempt to achieve?
> 2) Why do I expect any one particular method to produce
>    the results I want?
> 3) What does it take to implement / apply the method?
> 4) How well did the method work?
> 5) How well did the results meet my expectations?
> 6a) Why did the method work as expected? Did I as user /
>     analyst use prior knowledge about the test data to
>     set up idealized input, or did I stop the method at
>     a point where I knew the result was close to the
>     known answer?
> 6b) Why did the method not work? Did it rely on data
>     or information I could not possibly have? Was there
>     noise or were there other imperfections in the data that
>     undermined the workings of the method? Was the desired
>     result discernible from mere noise?
> 7) How much prior knowledge about a data set, the generating
>    process and the inner workings of the method does it take
>    for a user to obtain useful results?
>
> And so on. I know, it's a long list of questions (and quite
> a few of them require some discipline to ask oneself, particularly
> when stakes are high or when working alone), but there are no
> other ways to learn.
>
> Rune

I hadn't set out exactly those questions. But brick walls of reality effectively required me to answer them.

My initial goal may have been too ambitious for my abilities. So I attack smaller problems that come to light.

Retirement is for learning what you never learned in school ;)