"Scott Hemphill" <hemphill@hemphills.net> wrote in message news:m31x7eyhwk.fsf@pearl.local...> Richard Owlett <rowlett@atlascomm.net> writes: > >> Fred Marshall wrote: >> >> > "Richard Owlett" <rowlett@atlascomm.net> wrote in message >> > news:11a9f08hv1v19fd@corp.supernews.com... >> > >> >>Assume that if sampling DOES occur it will be more frequent than >> >>once/minute. >> >> >> >>Assume also that underlying phenomena have periods of days, months, or >> >>years. >> >> >> >>Assume data recording outages on order of hours to days. >> >> >> >>What can be done? >> >>what questions should I be asking? >> > >> > >> > The underlying phenomena have periods of days (or more) >> > So, one question you should ask is: "what is the smallest number of >> > days >> > possible?" >> > >> > [snip] >> > What's more important is the >> > minimum sample rate defined by the outages. >> > >> >> I was suspecting that. Does it make any difference that I have some 'a >> priori' knowledge of the rep rate of underlying phenomena? > > Yes. It would help a lot if you would give some specifics. For example, > one problem that fits your abstract description is tide prediction. The > tidal constituents have periods that vary considerably--the more important > ones have periods from about 3 hours to 0.5 years. Outages don't affect > analysis of tidal data much because the underlying phenomena are well > understood (and periodic!). The preferred technique for analysis is LSHA, > or Least Squares Harmonic Analysis. This works because each constituent > "j" is known to have the form: > > a_j cos(sum[k=1,5; n_{jk} omega_k t] + theta_j) > > where the a_j's represent amplitudes, the theta_j,s represent phases, the > n_{jk}'s are small integers and the omega_k's are five fundamental > angular velocities resulting from the earth's and the moon's orbits. > The n_{jk}'s identify the constituent, and a least squares analysis solves > for the a_j and theta_j which best fit the data. 
> > ScottIf the "underlying phenomena" are understood then there is more data available than implied by the samples. It skirts the issue it seems. Fred
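The least squares harmonic analysis Scott describes can be sketched in a few lines. This is only an illustration under stated assumptions, not a tide-prediction package: the function name `harmonic_fit`, the two periods (12.42 h and 24 h), the amplitudes, phases, and outage windows are all made up for the example, and each constituent's angular velocity (the sum of n_{jk} omega_k) is assumed to be known in advance. Because the frequencies are known, the fit is linear in the cosine and sine coefficients, and nothing requires the sample times to be evenly spaced.

```python
import numpy as np

def harmonic_fit(t, y, omegas):
    """Least-squares amplitudes and phases for KNOWN angular frequencies.

    Uses a*cos(w*t + theta) = c*cos(w*t) + s*sin(w*t)
    with c = a*cos(theta) and s = -a*sin(theta).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    omegas = np.asarray(omegas, dtype=float)
    phase = np.outer(t, omegas)                       # shape (samples, constituents)
    design = np.hstack([np.cos(phase), np.sin(phase)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    c, s = coef[:omegas.size], coef[omegas.size:]
    return np.hypot(c, s), np.arctan2(-s, c)          # amplitudes, phases

# Hypothetical record: 10-minute samples over 30 days, with three outages cut out.
hours = np.arange(0.0, 30 * 24.0, 1.0 / 6.0)
keep = np.ones(hours.size, dtype=bool)
for start, stop in [(50.0, 80.0), (200.0, 290.0), (500.0, 512.0)]:
    keep &= ~((hours >= start) & (hours < stop))
t = hours[keep]

omegas = 2 * np.pi / np.array([12.42, 24.0])          # assumed known periods, in hours
true_amp = np.array([1.0, 0.3])
true_theta = np.array([0.4, -1.1])
y = sum(a * np.cos(w * t + th) for a, w, th in zip(true_amp, omegas, true_theta))

amp, theta = harmonic_fit(t, y, omegas)
print(amp, theta)
```

The outages only remove rows from the design matrix; as long as the remaining samples still distinguish the constituents from one another, the solution is unaffected by where the gaps fall.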
Analyzing irregularly sampled data -- a neophyte question
Started by ●June 6, 2005
Reply by ●June 7, 2005
Reply by ●June 8, 2005
In article <-NqdndQYjoH8TznfRVn-vQ@centurytel.net>,
Fred Marshall <fmarshallx@remove_the_x.acm.org> wrote:
>"Richard Owlett" <rowlett@atlascomm.net> wrote in message
>news:11a9f08hv1v19fd@corp.supernews.com...
>> Assume that if sampling DOES occur it will be more frequent than
>> once/minute.
>>
>> Assume also that underlying phenomena have periods of days, months, or
>> years.
>>
>> Assume data recording outages on order of hours to days.
>>
>> What can be done?
>> what questions should I be asking?
>
>The underlying phenomena have periods of days (or more)
>So, one question you should ask is: "what is the smallest number of days
>possible?"
>
>Data recording outages occur on the order of hours to days.
>So, another question you should ask is: "what is the maximum outage?"
>
>If the maximum outage is greater than 1/2 the smallest number of days
>possible in the underlying phenomena being sampled then you have a problem
>because the sample rate is inadequate.

Maximum outage? Or average outage? If the average outage was small enough with respect to the underlying frequency of interest, then interpolation might still yield some very good statistical info about the phenomena, even with gaps that occasionally span more than the proper Nyquist sampling period.

This seems similar to the problem of (click | pop | dust speck | sample-with-a-parity-error) removal in audio/video reconstruction.

IMHO. YMMV.

--
Ron Nicholson rhn AT nicholson DOT com http://www.nicholson.com/rhn/
#include <canonical.disclaimer> // only my own opinions, etc.
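Both the "maximum outage" test and the "how often does it happen" question can be checked directly against a list of sample times. A minimal sketch, assuming timestamps in the same units as the shortest period of interest; the function name `outage_report` and its field names are hypothetical, chosen only for this example.

```python
import numpy as np

def outage_report(t, shortest_period):
    """Compare gaps between samples with half the shortest period of interest."""
    t = np.sort(np.asarray(t, dtype=float))
    gaps = np.diff(t)                       # spacing between consecutive samples
    limit = shortest_period / 2.0           # the "1/2 the smallest period" threshold
    over = gaps > limit
    return {
        "limit": limit,
        "max_gap": float(gaps.max()),
        "n_over_limit": int(over.sum()),
        # share of the whole record that falls inside over-long gaps
        "time_fraction_over_limit": float(gaps[over].sum() / (t[-1] - t[0])),
    }

# Hypothetical record in days: regular sampling with one long outage.
report = outage_report([0, 1, 2, 3, 10, 11, 12], shortest_period=4.0)
print(report)
```

A record can fail the strict maximum-gap test while only a small fraction of its span lies inside over-long gaps, which is the case where interpolation or model fitting may still give useful results.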
Reply by ●June 8, 2005
"Ronald H. Nicholson Jr." <rhn@mauve.rahul.net> wrote in message news:d85tfo$15h$1@blue.rahul.net...> In article <-NqdndQYjoH8TznfRVn-vQ@centurytel.net>, > Fred Marshall <fmarshallx@remove_the_x.acm.org> wrote: >>"Richard Owlett" <rowlett@atlascomm.net> wrote in message >>news:11a9f08hv1v19fd@corp.supernews.com... >>> Assume that if sampling DOES occur it will be more frequent than >>> once/minute. >>> >>> Assume also that underlying phenomena have periods of days, months, or >>> years. >>> >>> Assume data recording outages on order of hours to days. >>> >>> What can be done? >>> what questions should I be asking? >> >>The underlying phenomena have periods of days (or more) >>So, one question you should ask is: "what is the smallest number of days >>possible?" >> >>Data recording outages occur on the order of hours to days. >>So, another question you should ask is: "what is the maximum outage?" >> >>If the maximum outage is greater than 1/2 the smallest number of days >>possible in the underlying phenomena being sampled then you have a problem >>because the sample rate is inadequate. > > Maximum outage? Or average outage? If the average outage was small > enough with respect to the underlying frequency of interest, then > interpolation might still yield some very good statistical info about > the phenomena even with gaps that occasionally span greater than the > proper nyquist sampling period. > > This seems similar to the problem of (click | pop | dust spec | sample- > -with-a-parity-error) removal in audio/video reconstruction.Yep. If an outage is small enough then some interpolation is possible. This might be viewed as follows: Perfectly lowpass the incoming data to the greatest degree possible (narrowest bandwidth). This defines the widest possible sinc that can be used to interpolate. So, interpolation will work only out to maybe 1/2 the sinc interval. After that, there's no data to base the sincs on. 
Or: If there's a gap, conceptually lowpass even further and use a wider sinc or the interpolation method of your choice. Either way, the "instantaneous" bandwidth is lower than the actual bandwidth you're working with.

In the limit, match the end points of the gap with their values and with their first derivatives. That's maybe a 3rd order interpolation? It has not much ability to wiggle in the gap (thus low bandwidth).

Or another method in the limit: use at least two wider and wider sincs weighted by the two end points, and perhaps more such wide sincs weighted by adjacent points, until the gap is adequately filled.

Then figure that there will be very little "useful" information in the gap. It may sound much better if the record is music, but it doesn't add anything from a data analysis point of view.

Fred
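Matching the values and first derivatives at both ends of a gap is what a cubic Hermite interpolant does, so the "3rd order" guess above can be tried out directly. A minimal sketch, assuming the end-point slopes are already available (in practice they would have to be estimated from the samples on either side of the gap); the function name `hermite_gap_fill` and the test signal are hypothetical.

```python
import numpy as np

def hermite_gap_fill(t0, y0, d0, t1, y1, d1, t):
    """Cubic that matches value and slope at both ends of the gap [t0, t1]."""
    t = np.asarray(t, dtype=float)
    h = t1 - t0
    s = (t - t0) / h                        # normalized position, 0..1 across the gap
    h00 = 2 * s**3 - 3 * s**2 + 1           # standard cubic Hermite basis functions
    h10 = s**3 - 2 * s**2 + s
    h01 = -2 * s**3 + 3 * s**2
    h11 = s**3 - s**2
    return h00 * y0 + h10 * h * d0 + h01 * y1 + h11 * h * d1

# Hypothetical case: a 10-day sinusoid with a 1-day outage from day 3 to day 4.
w = 2 * np.pi / 10.0
t0, t1 = 3.0, 4.0
inside = np.linspace(t0, t1, 101)
filled = hermite_gap_fill(t0, np.sin(w * t0), w * np.cos(w * t0),
                          t1, np.sin(w * t1), w * np.cos(w * t1), inside)
max_error = np.max(np.abs(filled - np.sin(w * inside)))
print(max_error)
```

The fill is good here only because the gap is short compared with the period. A cubic has at most two turning points, so it cannot follow anything that oscillates faster inside the gap, which is the low-bandwidth limitation described above.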
Reply by ●June 8, 2005
Richard Owlett wrote:
> Assume that if sampling DOES occur it will be more frequent than
> once/minute.
>
> Assume also that underlying phenomena have periods of days, months, or
> years.
>
> Assume data recording outages on order of hours to days.
>
> What can be done?
> what questions should I be asking?

Thank you one and all.

The problem is not reality based, although it was triggered by a "real" situation. I was following some thoughts to (and perhaps beyond) "reductio ad absurdum".

I like the tide analogy.

Assume a cloud covered planet with large oceans and multiple moons orbiting in multiple planes whose periods are relatively prime. The residents are super genius mathematicians. They have centuries of tide data for the prime seaport. Unfortunately there are many gaps ranging from hours to days to months.

Do they have any useful chance at analyzing the data?

[ Apologies to Mr. Asimov -- any correlation to "Nightfall" is intentional ;]
Reply by ●June 8, 2005
Richard Owlett <rowlett@atlascomm.net> writes:

> Richard Owlett wrote:
>
> > Assume that if sampling DOES occur it will be more frequent than
> > once/minute.
> >
> > Assume also that underlying phenomena have periods of days, months, or
> > years.
> >
> > Assume data recording outages on order of hours to days.
> >
> > What can be done?
> > what questions should I be asking?
>
> Thank you one and all.
>
> Problem is not reality based although triggered by "real" situation.
> I was following some thoughts to ( and perhaps beyond ) "reductio ad
> absurdum".
>
> I like the tide analogy.
>
> Assume a cloud covered planet with large oceans and multiple moons
> orbiting in multiple planes whose periods are relatively prime. The
> residents are super genius mathematicians.

I assume the clouds are to prevent the population from knowing about their solar system.

> They have centuries of tide data for the prime seaport.
> Unfortunately there are many gaps ranging from hours to days to months.
>
> Do they have any useful chance at analyzing data?

Sure. It's just fitting data to a model. When you fit a regression line to a cloud of points, there's nothing that requires the points be equally spaced. In this case, they'll notice some of the important periods by inspection. They can fit the data to A*cos(omega*t+theta), then subtract the fit from the data to find the residual. The residual will have more periodic components. If they find enough of these periodic components, they may notice relationships between them. If they've discovered Newton's Law of Gravity, they may even have enough information to deduce the existence of their moons, and calculate some of the parameters of their orbits. (This could be considered analogous to spectral lines, which allowed us to deduce the structure of electron orbitals in atoms.)

By the way, the centuries of data are less useful than mere decades of data from multiple seaports. The shape of the sea bottom causes nonlinearities in tide propagation that attenuate and retard the different components by differing amounts. So a seaport can potentially reveal components that are not noticed at other seaports. Shallow water (not the prime seaport) tends to reveal even more components than deep water. The shape of the sea bottom also changes over time, so the fit for current data might not work as well for centuries-old data. I believe the U.S. now uses 19-year periods for data analysis.

> [ Apologies to Mr. Asimov -- any correlation to "Nightfall" is
> intentional ;]

I don't believe I've read that one. I'll have to look it up.

Scott
--
Scott Hemphill hemphill@alumni.caltech.edu
"This isn't flying. This is falling, with style." -- Buzz Lightyear
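The fit-and-subtract procedure described in this post (fit A*cos(omega*t+theta), subtract, look for more periodic components in the residual) can be tried on synthetic data. This is a rough sketch under stated assumptions, not an established tidal method: the function names `fit_sinusoid` and `peel_components`, the grid of trial periods, the two "true" periods (29 and 7 days), and the outage windows are all invented for the example, and "noticing a period by inspection" is replaced by a brute-force search over the trial grid.

```python
import numpy as np

def fit_sinusoid(t, y, omega):
    """Least-squares fit of c*cos(omega*t) + s*sin(omega*t); returns (model, (c, s))."""
    basis = np.column_stack([np.cos(omega * t), np.sin(omega * t)])
    coef, *_ = np.linalg.lstsq(basis, y, rcond=None)
    return basis @ coef, coef

def peel_components(t, y, omega_grid, n_components):
    """Greedy search: fit the best single sinusoid, subtract it, repeat on the residual."""
    residual = np.asarray(y, dtype=float).copy()
    found = []
    for _ in range(n_components):
        best = None
        for omega in omega_grid:
            model, coef = fit_sinusoid(t, residual, omega)
            err = np.sum((residual - model) ** 2)
            if best is None or err < best[0]:
                best = (err, omega, model, coef)
        _, omega, model, coef = best
        found.append((float(omega), float(np.hypot(coef[0], coef[1]))))
        residual = residual - model
    return found, residual

# Hypothetical record: irregular sample times over 400 days, with two outages removed.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 400.0, 2000))
t = t[~(((t > 100) & (t < 130)) | ((t > 250) & (t < 260)))]
y = 1.0 * np.cos(2 * np.pi * t / 29.0 + 0.5) + 0.4 * np.cos(2 * np.pi * t / 7.0 - 1.0)

omega_grid = 2 * np.pi / np.arange(2.0, 41.0)      # trial periods of 2..40 days
found, residual = peel_components(t, y, omega_grid, 2)
for omega, amplitude in found:
    print(2 * np.pi / omega, amplitude)
```

Nothing in the fit depends on even spacing, so the outages only reduce the number of rows in each least-squares problem. Fitting all the discovered frequencies jointly afterwards would generally give better amplitudes than the one-at-a-time subtraction shown here.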






