Hi all, I am a newbie to DSP. I have a huge number of data samples that I want to perform FFT on. But due to the enormous amount of the data samples, I'd like to use only 1M samples and perform FFT. I know that this is severe undersampling and that it does not satisfy the Nyquist rate. But all I want is to prevent aliasing so that I can get correct information about the low frequency content in the data. I cannot use an analog anti-alias filter as the data is inherently digitized. Is there any time-domain method to help with my problem? I read about a half-band digital filter that eliminates the frequencies higher than half the sampling frequency. Can this filter help my problem as I read a lot of articles that the high frequencies have to be eliminated before digitization? If so, can someone point me to some link that discusses the design of such a filter? Thanks, Pradeep.

# Beginner's question on undersampling and aliasing

Hi all, I am a newbie to DSP. I have a huge number of data samples that I want to perform FFT on. But due to the enormous amount of the data samples, I'd like to use only 1M samples and perform FFT. I know that this is severe undersampling and that it does not satisfy the Nyquist rate. But all I want is to prevent aliasing so that I can get correct information about the low frequency content in the data. I cannot use an analog anti-alias filter as the data is inherently digitized. Is there any time-domain method to help with my problem? I read about a half-band digital filter that eliminates the frequencies higher than half the sampling frequency. Can this filter help my problem as I read a lot of articles that the high frequencies have to be eliminated before digitization? If so, can someone point me to some link that discusses the design of such a filter? Thanks, Pradeep.

"prad" <pradeep.fernando@gmail.com> wrote in message news:D_mdneMc2IHBxfrbnZ2dnUVZ_s2vnZ2d@giganews.com...> Hi all, > > I am a newbie to DSP. I have a huge number of data samples that I want > to perform FFT on. But due to the enormous amount of the data samples, I'd > like to use only 1M samples and perform FFT. I know that this is severe > undersampling and that it does not satisfy the Nyquist rate. But all I > want is to prevent aliasing so that I can get correct information about > the low frequency content in the data. I cannot use an analog anti-alias > filter as the data is inherently digitized. Is there any time-domain > method to help with my problem? I read about a half-band digital filter > that eliminates the frequencies higher than half the sampling frequency. > Can this filter help my problem as I read a lot of articles that the high > frequencies have to be eliminated before digitization? If so, can someone > point me to some link that discusses the design of such a filter?It's not clear what you mean. The sample rate and the number of samples are two different things. You could use any sequence of samples to FFT and just keep the sample rate the same. Have you tried that? Fred

prad wrote:> Hi all, > > I am a newbie to DSP. I have a huge number of data samples that I want > to perform FFT on. But due to the enormous amount of the data samples, I'd > like to use only 1M samples and perform FFT. I know that this is severe > undersampling and that it does not satisfy the Nyquist rate. But all I > want is to prevent aliasing so that I can get correct information about > the low frequency content in the data. I cannot use an analog anti-alias > filter as the data is inherently digitized. Is there any time-domain > method to help with my problem? I read about a half-band digital filter > that eliminates the frequencies higher than half the sampling frequency. > Can this filter help my problem as I read a lot of articles that the high > frequencies have to be eliminated before digitization? If so, can someone > point me to some link that discusses the design of such a filter?You don't say what samples you expect to remove to pare your data down to a mere million. If they are contiguous, the paring will not create aliases. If the bandwidth of the original signal exceeded half the sample rate, the aliases are already there. If the original sampled data are clean, you can extract every Nth by a process called decimation, and filtering before decimating is required. Ask more after googling for "decimation". ry -- Engineering is the art of making what you want from things you can get. ¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

On 7 Jun, 02:59, "prad" <pradeep.ferna...@gmail.com> wrote:> Hi all, > > I am a newbie to DSP. I have a huge number of data samples that I want > to perform FFT on.Why do you want to compute the DFT? What do you hope to achieve?> But due to the enormous amount of the data samples, I'd > like to use only 1M samples and perform FFT. I know that this is severe > undersampling and that it does not satisfy the Nyquist rate.Then you are in deep trouble. If the data are aliased, you face a question on the form "Given x+y = 1, find x and y." There exists one true answer, but there is no way you can find it, let lone tell it from all the other combinations of x and y which satisfy x+y=1.> But all I > want is to prevent aliasing so that I can get correct information about > the low frequency content in the data. I cannot use an analog anti-alias > filter as the data is inherently digitized.What do you mean? If you sample an analog process, then you have no choise but to use an analog anti-alias filter. If you have data from some discrete-time process, aliasing is not an issue.> Is there any time-domain > method to help with my problem?As far as filtering the data is concered, only the analog anti-alias filter. Provided, of course, you are actually sampling an analog process.> I read about a half-band digital filter > that eliminates the frequencies higher than half the sampling frequency. > Can this filter help my problem as I read a lot of articles that the high > frequencies have to be eliminated before digitization?Doesn't apply here. Analog anti-alias filtering is the way to go.> If so, can someone > point me to some link that discusses the design of such a filter?I am sure someone can. However, it won't help you. Did I mention that analog anti-alias filters are useful? Rune

>On 7 Jun, 02:59, "prad" <pradeep.ferna...@gmail.com> wrote: >> Hi all, >> >> I am a newbie to DSP. I have a huge number of data samples that Iwant>> to perform FFT on. > >Why do you want to compute the DFT? What do you hope >to achieve? > >> But due to the enormous amount of the data samples, I'd >> like to use only 1M samples and perform FFT. I know that this issevere>> undersampling and that it does not satisfy the Nyquist rate. > >Then you are in deep trouble. If the data are aliased, you >face a question on the form > >"Given x+y = 1, find x and y." > >There exists one true answer, but there is no way you can >find it, let lone tell it from all the other combinations >of x and y which satisfy x+y=1. > >> But all I >> want is to prevent aliasing so that I can get correct informationabout>> the low frequency content in the data. I cannot use an analoganti-alias>> filter as the data is inherently digitized. > >What do you mean? If you sample an analog process, then you >have no choise but to use an analog anti-alias filter. >If you have data from some discrete-time process, aliasing >is not an issue. > >> Is there any time-domain >> method to help with my problem? > >As far as filtering the data is concered, only the analog >anti-alias filter. Provided, of course, you are actually >sampling an analog process. > >> I read about a half-band digital filter >> that eliminates the frequencies higher than half the samplingfrequency.>> Can this filter help my problem as I read a lot of articles that thehigh>> frequencies have to be eliminated before digitization? > >Doesn't apply here. Analog anti-alias filtering is the way >to go. > >> If so, can someone >> point me to some link that discusses the design of such a filter? > >I am sure someone can. However, it won't help you. >Did I mention that analog anti-alias filters are useful? > >Rune > >Dear all, Thanks for all your replies but I am still a bit confused. Let me try rephrasing my question so that everyone gets a better picture of what I am trying to do. I have a large number (order of 10^13) of discretized data samples. I have to obtain the frequency spectrum of this data so that I can predict minima in the discretized data. I am using the FFTW C library to obtain the Frequency spectrum. But the maximum size of my data array is limited to 2^32 in C. So I am picking a smaller subset of 1M samples from the original set of 10^13 data points. Jerry is right about the fact that what I am looking for is subsampling or decimation without running into aliasing problems. But due to the large amount of data, I cannot perform decimation on the entire data set. Also, I'd like to point out that my data is inherently discretized and so using an analog anti-aliasing filter to remove all the high frequency components that I am not interested in is NOT an option for me. So some scheme that resembles what John mentioned is of great interest to me. But I am not sure if that will work. An ideal scheme for me would be like the following: 1. Collect N=1M equidistant samples out of the n=10^13 data points. 2. Filter out the high frequency components present in the samples using some kind of LPF, i.e. For each sample "s_i" in the 1M samples i. Collect some constant number (say 10) samples before and after the sample "s_i" ii. Perform a Digital Low Pass Filtering operation on these 21 samples. iii. Pick one of these 21 samples (say the median of the 21 samples) to replace "s_i" Will the above procedure eliminate or atleast drastically reduce aliasing (assuming of course that the original 10^13 data points were sampled without aliasing)? Can I somehow tell which is the lowest frequency that was significantly filtered out (so that I know which of the lower frequencies contain aliases)? Some other basic questions regarding interpretation of the N-pt FFT results : 1. The frequency bin "i" corresponds to frequency fi = (i*fs/N), right? So bin N/2 contains frequency fs/2 while bin 1 contains frequency fs/N . 2. Assuming that the phase angles lie in [0, 2pi], if the phase angle of a frequency is 0, does that mean that the first positive peak of that frequency is at t=0? (Is this the case for FFTW's FFT results?) Thanks, Pradeep.

On Jun 6, 5:59 pm, "prad" <pradeep.ferna...@gmail.com> wrote:> Hi all, > > I am a newbie to DSP. I have a huge number of data samples that I want > to perform FFT on. But due to the enormous amount of the data samples, I'd > like to use only 1M samples and perform FFT. I know that this is severe > undersampling and that it does not satisfy the Nyquist rate. But all I > want is to prevent aliasing so that I can get correct information about > the low frequency content in the data. I cannot use an analog anti-alias > filter as the data is inherently digitized. Is there any time-domain > method to help with my problem? I read about a half-band digital filter > that eliminates the frequencies higher than half the sampling frequency. > Can this filter help my problem as I read a lot of articles that the high > frequencies have to be eliminated before digitization? If so, can someone > point me to some link that discusses the design of such a filter?Sounds like you simply need to low pass filter your samples so that there no spectral content above your required noise or accuracy level left at or above 0.5M cycles per total set data length. Then you can decimate or interpolate M samples out of your original data set. Any sufficiently quality digital low pass filter will do, depending on your needs and constraints (performance limits, maintaining relative phase information, etc.) This assumes that any aliasing in your original enormous sample set was below your required noise or accuracy floor. IMHO. YMMV. -- rhn A.T nicholson d.0.t C-o-M

Hi Are you implying that I first pick 1M samples, then LPF the samples and use them? Will that work? Or are you saying that I have to LPF the entire digitized data and then do decimation? The second method is not feasible for me as the data set is too large and will require an enormous amount of time. Please advise. Thanks Pradeep.>On Jun 6, 5:59 pm, "prad" <pradeep.ferna...@gmail.com> wrote: >> Hi all, >> >> I am a newbie to DSP. I have a huge number of data samples that Iwant>> to perform FFT on. But due to the enormous amount of the data samples,I'd>> like to use only 1M samples and perform FFT. I know that this issevere>> undersampling and that it does not satisfy the Nyquist rate. But all I >> want is to prevent aliasing so that I can get correct informationabout>> the low frequency content in the data. I cannot use an analoganti-alias>> filter as the data is inherently digitized. Is there any time-domain >> method to help with my problem? I read about a half-band digitalfilter>> that eliminates the frequencies higher than half the samplingfrequency.>> Can this filter help my problem as I read a lot of articles that thehigh>> frequencies have to be eliminated before digitization? If so, cansomeone>> point me to some link that discusses the design of such a filter? > >Sounds like you simply need to low pass filter your samples >so that there no spectral content above your required noise >or accuracy level left at or above 0.5M cycles per total set >data length. Then you can decimate or interpolate M samples >out of your original data set. Any sufficiently quality digital low >pass filter will do, depending on your needs and constraints >(performance limits, maintaining relative phase information, >etc.) > >This assumes that any aliasing in your original enormous >sample set was below your required noise or accuracy floor. > > >IMHO. YMMV. >-- >rhn A.T nicholson d.0.t C-o-M > >

"prad" <pradeep.fernando@gmail.com> writes:> Hi > > Are you implying that I first pick 1M samples, then LPF the samples and > use them? Will that work? Or are you saying that I have to LPF the entire > digitized data and then do decimation? The second method is not feasible > for me as the data set is too large and will require an enormous amount of > time. Please advise. > > Thanks > Pradeep.Pradeep, I ran some rough computations and I have a method that could perform this decimation in around two or three days using a modern PC. So yes it would take a lot of time, but it's certainly possible. Email me directly if you're interested. -- % Randy Yates % "I met someone who looks alot like you, %% Fuquay-Varina, NC % she does the things you do, %%% 919-577-9882 % but she is an IBM." %%%% <yates@ieee.org> % 'Yours Truly, 2095', *Time*, ELO http://home.earthlink.net/~yatescr

prad wrote:> Are you implying that I first pick 1M samples, then LPF the samples and > use them? Will that work? Or are you saying that I have to LPF the entire > digitized data and then do decimation? The second method is not feasible > for me as the data set is too large and will require an enormous amount of > time. Please advise.The exact answer depends much on the ratio of sampling rate to the frequency you are interested in. Most likely it is possible to come up with a relatively fast filter to get what you want. Consider the moving average filter. You will find that DSP people don't like it very much because its properties are not very good, but it is easy to describe and fast to implement. You might, for example, do a 1000 point moving average where the first output value is the mean of the first 1000 input samples, the second is the mean of 2 through 1001, etc. This can be done very fast with a circular buffer of length 1000 to update the sum appropriately. It might be that a moving average filter followed by a decimation by 10, (output only every 10th point) would be a good initial filter, and reduce your problem by a factor of ten. Then another filter could finish off the job. I have not done the calculations to show that the above is actually a good way to proceed, but only offer it to show that there are fast ways to filter large amounts of data, and that in some cases they might be useful. Also, that cascaded filters might be faster than doing the whole thing in one step. -- glen