On Jun 10, 1:38 pm, "Fred Marshall" <fmarshallx@remove_the_x.acm.org>
wrote:
> So, I might take 1024 or 4096 or .... you choose the number ... and compute
> an FFT on just those contiguous samples. You might do this for various such
> epochs along the total sequence. While the resolution will be limited, the
> entire frequency range will be covered each time.
An FFT of 1024 contiguous samples won't tell you much
about the spectral content of frequencies on the rough
order of one-hundredth to one-billionth of the FFT's first
bin, which seems to be where the OP is looking (the first
10^6 DFT bins out of circa 10^13 possible?).
It still seems to me that the way to look at such huge
potential data sets (the weight of every rodent in North
America by longitude, or some such) is to start with
some statistical sampling. What I'm wondering is if
there is a name for the procedure of taking a bunch of
randomly spaced samples and doing a regression fit of
those samples against a set of orthogonal sinusoidal
basis vectors.
IMHO. YMMV.
--
rhn A.T nicholson d.0.t C-o-M
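The procedure Ron asks about — regressing randomly spaced samples onto a set of sinusoidal basis vectors — is usually called least-squares spectral analysis (the Lomb-Scargle periodogram is a well-known variant). A minimal sketch in Python, with made-up sample times and a made-up test tone:

```python
import numpy as np

rng = np.random.default_rng(0)

# 200 randomly spaced sample times over a 10-second record (hypothetical data)
t = np.sort(rng.uniform(0.0, 10.0, 200))
f_true = 1.7  # Hz, the tone buried in the samples
x = np.sin(2 * np.pi * f_true * t) + 0.1 * rng.standard_normal(t.size)

# For each candidate frequency, regress the samples onto a cos/sin pair
# and record the power explained by that pair.
freqs = np.linspace(0.1, 5.0, 500)
power = np.empty(freqs.size)
for i, f in enumerate(freqs):
    A = np.column_stack([np.cos(2 * np.pi * f * t),
                         np.sin(2 * np.pi * f * t)])
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    power[i] = coef @ coef  # amplitude^2 of the fitted sinusoid

f_est = freqs[np.argmax(power)]
print(f"estimated frequency: {f_est:.2f} Hz")  # peak should land near f_true
```

Unlike interpolate-then-FFT, this uses the irregular sample times directly; the cost is a least-squares solve per candidate frequency instead of one FFT.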
Reply by Fred Marshall●June 11, 2007
"John E. Hadstate" <jh113355@hotmail.com> wrote in message
news:DL8bi.26179$dy1.22507@bigfe9...
>
> "Fred Marshall" <fmarshallx@remove_the_x.acm.org> wrote in message
> news:aqCdnboPO6bS_PHbnZ2dnUVZ_q2pnZ2d@centurytel.net...
>
>>
>> So, I might take 1024 or 4096 or .... you choose the number ... and
>> compute an FFT on just those contiguous samples. You might do this for
>> various such epochs along the total sequence. While the resolution will
>> be limited, the entire frequency range will be covered each time.
>> If the results are quite different then you know that the spectral
>> character of the samples is varying from segment to segment.
>> If the results are rather similar then the opposite.
>
> Consider what you would see if you computed a short DFT on a
> low-baud-rate, highly-oversampled FSK signal. If the DFT is short enough
> relative to the baud rate, you will see the Mark and Space frequencies in
> separate DFT windows. Would you then conclude that the spectral character
> of the signal is varying or would you conclude that the varying spectrum
> characterizes the signal?
John,
Yes, of course I would. :-)
Fred
Reply by John E. Hadstate●June 11, 2007
"Fred Marshall" <fmarshallx@remove_the_x.acm.org> wrote in
message
news:aqCdnboPO6bS_PHbnZ2dnUVZ_q2pnZ2d@centurytel.net...
>
> So, I might take 1024 or 4096 or .... you choose the
> number ... and compute an FFT on just those contiguous
> samples. You might do this for various such epochs along
> the total sequence. While the resolution will be limited,
> the entire frequency range will be covered each time.
> If the results are quite different then you know that the
> spectral character of the samples is varying from segment
> to segment.
> If the results are rather similar then the opposite.
Consider what you would see if you computed a short DFT on a
low-baud-rate, highly-oversampled FSK signal. If the DFT is
short enough relative to the baud rate, you will see the
Mark and Space frequencies in separate DFT windows. Would
you then conclude that the spectral character of the signal
is varying or would you conclude that the varying spectrum
characterizes the signal?
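That thought experiment is easy to reproduce numerically. A sketch with hypothetical parameters (sample rate, tones, and baud rate are all invented; the DFT is much shorter than a symbol, so each window sees only the Mark or only the Space tone):

```python
import numpy as np

fs = 8000.0                 # sample rate, Hz (hypothetical)
baud = 10.0                 # low baud rate -> heavy oversampling
f_mark, f_space = 1000.0, 2000.0
bits = np.array([0, 1, 0, 1, 1, 0])

# Build the FSK signal: one constant tone per symbol.
spb = int(fs / baud)        # samples per bit (800)
tones = np.where(bits, f_mark, f_space)
t = np.arange(spb) / fs
x = np.concatenate([np.sin(2 * np.pi * f * t) for f in tones])

# Short DFTs (256 << 800 samples) see a single tone at a time,
# so the apparent spectrum flips between Mark and Space per window.
N = 256
for start in range(0, x.size - N, spb):   # one window per symbol
    X = np.abs(np.fft.rfft(x[start:start + N]))
    peak_hz = X.argmax() * fs / N
    print(f"window at sample {start}: peak near {peak_hz:.0f} Hz")
```

Each window reports only one of the two tones, which is exactly John's point: the segment-to-segment variation *is* the signal, not evidence of a changing process.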
> You should be asking yourself this question:
> Even though I have a huge number of samples, what is the
> frequency resolution that I require? The frequency
> resolution is the reciprocal of the temporal epoch that
> you choose to analyze.
>
A huge number of samples might be the result of
oversampling. It might also be the result of long-term
observation of a phenomenon that is undersampled. It
sounds like the OP is not sure which case applies to his
data.
Reply by prad●June 10, 2007
Fred and Ron,
Thanks for your input. I'll try your suggestions.
Prad.
>prad,
>
>If someone suggested it, then I've missed it.... here's what I would do:
>
>Because you have so many data points the frequency resolution could be
>quite a bit better than you need. The fewer contiguous points you use,
>the coarser the frequency resolution.
>
>So, I might take 1024 or 4096 or .... you choose the number ... and
>compute an FFT on just those contiguous samples. You might do this for
>various such epochs along the total sequence. While the resolution will
>be limited, the entire frequency range will be covered each time.
>If the results are quite different then you know that the spectral
>character of the samples is varying from segment to segment.
>If the results are rather similar then the opposite.
>
>Also, you'll be able to see the actual important bandwidth of the
>information - so you might be able to decide that some decimation is OK
>to do without aliasing.
>
>You should be asking yourself this question:
>Even though I have a huge number of samples, what is the frequency
>resolution that I require? The frequency resolution is the reciprocal of
>the temporal epoch that you choose to analyze.
>
>Example:
>
>If you have 1 second worth of samples and the sample rate is 1MHz, then
>you have 10^6 samples. If you FFT the whole sequence, you will have 1Hz
>resolution over a range of 0.5MHz (fs/2). Maybe 1Hz resolution is
>overkill for your application.
>
>So, 0.1secs of data would be 100,000 samples with 10Hz resolution... and
>so forth.
>
>Pick the temporal length that gives suitable resolution.
>
>I hope this helps.
>
>Fred
Reply by Fred Marshall●June 10, 2007
prad,
If someone suggested it, then I've missed it.... here's what I would do:
Because you have so many data points the frequency resolution could be quite
a bit better than you need. The fewer contiguous points you use, the
coarser the frequency resolution.
So, I might take 1024 or 4096 or .... you choose the number ... and compute
an FFT on just those contiguous samples. You might do this for various such
epochs along the total sequence. While the resolution will be limited, the
entire frequency range will be covered each time.
If the results are quite different then you know that the spectral character
of the samples is varying from segment to segment.
If the results are rather similar then the opposite.
Also, you'll be able to see the actual important bandwidth of the
information - so you might be able to decide that some decimation is OK to
do without aliasing.
You should be asking yourself this question:
Even though I have a huge number of samples, what is the frequency
resolution that I require? The frequency resolution is the reciprocal of
the temporal epoch that you choose to analyze.
Example:
If you have 1 second worth of samples and the sample rate is 1MHz, then you
have 10^6 samples. If you FFT the whole sequence, you will have 1Hz
resolution over a range 0.5MHz (fs/2). Maybe 1Hz resolution is overkill for
your application.
So, 0.1secs of data would be 100,000 samples with 10Hz resolution... and so
forth.
Pick the temporal length that gives suitable resolution.
I hope this helps.
Fred
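Fred's segment-and-compare procedure can be sketched in a few lines. The signal below is a random stand-in and the segment length is arbitrary; the point is the shape of the computation: pick a few epochs, FFT each, and compare the magnitude spectra across epochs:

```python
import numpy as np

fs = 1_000_000                # 1 MHz sample rate, as in Fred's example
x = np.random.default_rng(1).standard_normal(fs)   # 1 s of stand-in data

N = 4096                      # segment length -> fs/N ≈ 244 Hz per bin
n_seg = 8                     # epochs spread along the record
starts = np.linspace(0, x.size - N, n_seg).astype(int)

# One windowed FFT per epoch.
win = np.hanning(N)
spectra = np.array([np.abs(np.fft.rfft(win * x[s:s + N])) for s in starts])

# Crude stationarity check: spread of the spectra across segments.
# Small spread -> spectral character is similar from epoch to epoch.
spread = spectra.std(axis=0).mean() / spectra.mean()
print(f"resolution: {fs / N:.0f} Hz per bin, relative spread: {spread:.2f}")
```

For white noise the spread stays moderate and roughly constant; a signal whose spectral character drifts from segment to segment would show a much larger spread in particular bins.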
Reply by Ron N.●June 9, 2007
On Jun 7, 7:11 pm, "prad" <pradeep.ferna...@gmail.com> wrote:
> Ron:
> Randomized Statistical Sampling is another good idea. Initially I was
> thinking along another line involving random sampling. I was thinking
> about producing random data samples and then performing FFT on these
> samples. In fact, I did it. But since I am not that familiar with DSP and
> FFT, could not really figure out how to interpret the FFT results. In
> fact, most of the links I found on FFT with non-uniformly spaced samples
> were interpolating to find the equally spaced samples and then performing
> FFT. Is this the standard technique for FFT with non-uniformly spaced
> samples? Thanks Ron for this new idea. I will investigate it further.
Actually, this might be a place where a randomly sampled
low-pass filter might be better than nothing.
Essentially, create your low-pass filter waveform (say, a
windowed sinc of some period and width), and then use that
filter waveform as the weighting in a weighted random number
generator. Use those weighted random numbers to select
sub-samples centered around the neighborhood of a sample
point of interest. If, after a sufficient number of
sub-samples, the mean and variance seem to be converging,
then the mean might approximate the value of a decimated
sample of the bandlimited signal, perhaps within some
statistical confidence interval.
Does this type of procedure have a name?
IMHO. YMMV.
--
rhn A.T nicholson d.0.t C-o-M
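Ron's idea amounts to Monte-Carlo (importance-sampled) evaluation of one output sample of an FIR low-pass filter. A rough sketch, with the filter length, cutoff, and data all invented; tap offsets are drawn with probability proportional to |h|, and each draw is weighted so the running mean is an unbiased estimate of the full convolution at the point of interest:

```python
import numpy as np

rng = np.random.default_rng(2)

# Windowed-sinc low-pass taps (length and cutoff are hypothetical).
M = 2001                                     # 1000 taps each side of center
k = np.arange(M) - M // 2
h = np.sinc(k / 8.0) / 8.0 * np.hamming(M)   # cutoff well below Nyquist

x = rng.standard_normal(1_000_000)           # stand-in for the huge record
n = 500_000                                  # sample point of interest

# Importance sampling: pick tap offsets with probability |h|/sum|h|;
# the sign/scale weighting makes the mean estimate sum_j h[j]*x[n-j].
p = np.abs(h) / np.abs(h).sum()
draws = rng.choice(M, size=5000, p=p)
est = np.mean(np.sign(h[draws]) * np.abs(h).sum() * x[n - k[draws]])

exact = np.dot(h, x[n - k])                  # full filter output, for reference
print(f"estimate {est:+.4f} vs exact {exact:+.4f}")
```

Only a few thousand of the record's samples are touched, which is the attraction when the record is enormous; the price is a statistical error bar on each decimated sample rather than an exact value.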
Reply by Vladimir Vassilevsky●June 9, 2007
Now I am sure that what you are doing is sheer nonsense. Besides the
dumb brute forcing, one obvious sign is the cluelessness; the other
obvious sign is the secrecy. The sooner you dismiss your priceless
ideas, the better.
When people do serious research, they don't ask stupid questions
in newsgroups. Instead, they learn the subject themselves and/or
seek professional assistance.
VLV
prad wrote:
> I am sorry that you think that. This is for a research subject and not
> homework. I am not sharing details about the data as it would give away
> the novel modeling I am trying to do. Thanks to all those who gave useful
> information and helped me.
>
> VLV: Please think before you send a comment like that.
>
>
>
> Pradeep.
>
>>
>>Don't you know? This is homework. A stupident is generating huge data
>>by software and then trying to make use of that data. Numerical nonsense
>>instead of thinking of better approaches, as usual.
>>
>>VLV
>>
Reply by glen herrmannsfeldt●June 9, 2007
Rune Allnor wrote:
(snip)
> Why don't you get into details about how these data
> come about and what you try to do with them?
He seems to want to keep the data source proprietary,
in which case I believe our answers also should be.
Now, if he wants to pay for answers, that is different.
-- glen
Reply by glen herrmannsfeldt●June 9, 2007
jim wrote:
(snip)
> What good is a 1000 point moving average filter going to be? If he is
> downsampling from 10^13 samples to 10^6 samples to prevent aliasing he's
> going to need a low pass filter that has millions and millions of
> coefficients.
Maybe it should be more than 1000, maybe less. I haven't seen any
numbers related to the frequency range of interest.
> It's hard to imagine why, if the data contains only important info at
> such a low frequency, it is being sampled at such a high sample rate in
> the first place.
It seems that it is generated, and not from a natural source. It might
be the output of a linear congruential or linear feedback shift
register, for example. It might be a 43 bit LFSR, and one bit samples.
Note that the number of bits per sample still hasn't been mentioned.
I believe with either linear congruential or LFSR you can rewrite it
to generate every Nth point of the original series.
-- glen
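glen's point that a linear generator can be re-derived to emit every Nth output follows from the LFSR update being linear over GF(2): one step is multiplication by a fixed bit matrix T, so N steps is multiplication by T^N, computable once by repeated squaring. A sketch (register size and taps chosen for illustration only):

```python
import numpy as np

# A small Fibonacci LFSR over GF(2): state advances as s <- T @ s (mod 2).
# Taps correspond to x^16 + x^14 + x^13 + x^11 + 1 (a maximal-length choice).
BITS, TAPS = 16, [15, 13, 12, 10]   # 0-indexed tap positions

T = np.zeros((BITS, BITS), dtype=np.uint8)
T[0, TAPS] = 1                      # new bit = XOR of tapped bits
for i in range(1, BITS):
    T[i, i - 1] = 1                 # shift the remaining bits down

def mat_pow(M, n):
    """M^n over GF(2) by repeated squaring."""
    R = np.eye(BITS, dtype=np.uint8)
    while n:
        if n & 1:
            R = (R @ M) % 2
        M = (M @ M) % 2
        n >>= 1
    return R

# To take every Nth output of the original series, advance with T^N once
# instead of stepping T a total of N times.
N = 1000
TN = mat_pow(T.copy(), N)

s = np.zeros(BITS, dtype=np.uint8); s[0] = 1   # arbitrary nonzero seed
stepped = s.copy()
for _ in range(N):
    stepped = (T @ stepped) % 2
jumped = (TN @ s) % 2
print("match:", np.array_equal(stepped, jumped))
```

The same trick works for linear congruential generators (compose a*x+c with itself N times modulo m), so the decimated stream can be produced directly without materializing the 10^13-sample original.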
Reply by Rune Allnor●June 9, 2007
On 9 Jun, 02:39, "prad" <pradeep.ferna...@gmail.com> wrote:
> I am sorry that you think that.
Don't top post when commenting on previous posts.
> This is for a research subject and not
> homework. I am not sharing details about the data as it would give away
> the novel modeling I am trying to do.
Those who have followed my posts here for a couple of years
would know that I sometimes make a point of saying that I
am an engineer, not a researcher, despite my misfortune
of having obtained a PhD degree. The difference is that
the engineer knows what he is doing whereas the researcher [*]
does not.
Your project is flawed. There is no need whatsoever to
come up with those sorts of data in any but the biggest
survey projects.
There is a vanishing (though still non-zero) chance that
you really are onto something, but even so, your timing is
wrong. These days, the sheer logistics of what you are trying
to do is out of reach for anyone but the largest computing centres.
That will change with time. When I started playing with
computers 20 years ago, it was specialist's work to handle
more than 64 kilobytes of data at any one time. When I first
played with certain sonar data processing ideas some 10 years
ago, anything beyond 1 MByte was, for practical purposes,
outside my reach. These days my computer's RAM is the
limitation (it has only 512 MBytes) and I plan my programs
for use with, say, 10 GBytes of data once I can get my hands
on that sort of computer. It will be affordable by the
time I finish my program.
So times change, and maybe you or I will look up this
thread in five or ten years time and smile at the old
days when a mere 20 TB of data represented an insurmountable
obstacle.
However, at the time of writing, June 2007, 20 TB of data
*is* an insurmountable obstacle. If you end up with the
need to process that sort of data, there is such a huge
discrepancy between what you want and what is within
your abilities to do that you might as well do something
else in the meantime, waiting for the required technology
to become available to you.
If you do this as part of an employment, circulate your
resume. Whoever assigned you to this task has no idea
what he or she is doing.
> VLV: Please think before you send a comment like that.
Vladimir is, as is his habit, perfectly on the mark.
Rune
[*] In Norwegian, "researcher" is translated to "forsker",
"one who does research", whereas "scientist" is translated
to "vitenskapsmann", which means "one who knows the
sciences." The difference might be subtle, but anyone
can embark on some sort of research, while insight into
the sciences requires more, usually obtained
through decades of dedicated study. Needless to say,
the world is full of researchers, with hardly one
scientist alive, world-wide.