Forums

impact of noise in zero-padded region of filtered time trend data

Started by all4dsp October 17, 2012
I included a Matlab program below you can run to illustrate my question (or
just look at the output I placed in the comments below the code). Note that
I'm using a small set of data below to simplify this question (the real app
uses much more data). 

This code:

1. zero pads an example (all real) 10-element array
2. takes its FFT
3. creates a Laplace (s-domain) filter and applies it to the FFT data
4. takes an IFFT
5. observes the time-domain (filtered) result

Looking at the time trend resulting output, I expected the filtered data
corresponding to the original zero-padded portion of the time trend to be
pretty close to zero. That is, if I have 10 original time-domain samples
and 64-10=54 zero-padded samples, I expected the last 54 samples of the
time-trend data to be, say, at least 10 orders of magnitude lower than the
filtered time data in the first 10 samples.  I suspect some of this relates
to filter ramping up and down in the zero-padded portion, but does this
ramping up/down entirely explain this, or is there something else going
on?

My application doesn't use the zero-padded portion of the time trend data.
My only concern is if the magnitude of noise that I see in the zero-padded
portion of the filtered time trend data ALSO exists in the non-zero padded
portion of the filtered time trend data (e.g. samples 1 to 10).

Could anyone comment on whether I messed something up in my analysis below,
and/or if there's reason to expect the noise seen in the zero-padded
portion will also appear in the non-zero padded portion of the filtered
time trend data? Thanks in advance for any comments.

-----Matlab program-----

clear; format compact; format long e; close all;
N=64; % number samples to use for FFT and IFFT

% create example zero-mean data containing 10 elements; sample every 1 ms
time=(0:0.001:(0.001*(10-1))); % time axis without zero padding
time_zeropad = (N/1000)*linspace(0,1,N)'; % time axis with zero padding
data_y = sin(2*pi*100*time) + sin(2*pi*400*time); % contains 100 Hz and 400
Hz
data_y = data_y - mean(data_y); % remove mean

% with sample rate Fs = 1K Samples/second
% and N analysis points, the analysis frequencies = m * Fs / N
freq_x = (0: (1000/N): (1000*(N-1)/N)); % analysis frequency (Hz)

% zero-pad then perform FFT
ffty = fft(data_y, N);
figure(1); plot(freq_x(1:N/2+1), abs(ffty(1:N/2+1)));
xlabel('Frequency (Hz)'); ylabel('FFT Magnitude'); title('Input data with
zero padding');

% create s-domain (Laplace) low-pass filter (LPF) with corner freq = 200
Hz
ss  = 1i*2*pi*freq_x; % s-domain values
H_complex = 200 ./ (ss+200); % s-domain LPF
figure(2); plot(freq_x(1:N/2+1), abs(H_complex(1:(N/2+1))));
xlabel('Frequency (Hz)'); ylabel('FFT Magnitude'); title('Filter
spectrum');

% apply filter in freq domain to half spectrum, including Nyquist bin
ffty_filtered = ffty(1:(N/2+1)) .* H_complex(1:(N/2+1)); % populate half of
spectrum

% populate other half spectrum using conjugate
bins=(2:N/2);
ffty_filtered(N-bins+2) = conj(ffty_filtered(bins));

% perform ifft
timetrend = ifft(ffty_filtered, N);
figure(3); plot(time, data_y,'b-o',time_zeropad, real(timetrend), 'r-+');
xlabel('Time (s)'); ylabel('Amplitude (s)'); title('Original zero-padded
data before (blue) and after (red) filtering');
real(timetrend)'

%{
real(timetrend)'
ans =
    -2.505535006057165e-02
     1.219166276197586e-01
     1.916111148307225e-01
     3.155071088051060e-01
     4.999766490699722e-01
     3.123552274218518e-01
     3.531539974032563e-01
     3.478600275025787e-02
    -1.171590340583797e-01
    -1.961192397191824e-01
    -2.949814680126898e-01  <-- I expected the following samples to be very
close to 0
    -2.056611741817437e-01
    -1.894223480924002e-01
    -1.406300177369708e-01
    -1.258911492483028e-01
    -9.465927771342163e-02
    -8.432003838287264e-02
    -6.335424652106726e-02
    -5.670645728944172e-02
    -4.223597250313536e-02
    -3.826744471255336e-02
    -2.804388754179695e-02
    -2.592376872638898e-02
    -1.852549178380533e-02
    -1.764846576766347e-02
    -1.214948503691726e-02
    -1.209554234726473e-02
    -7.881666616421402e-03
    -8.367552152736088e-03
    -5.025761517920641e-03
    -5.864868379353669e-03
    -3.113753952665312e-03
    -4.186438486258784e-03
    -1.831266312979044e-03
    -3.063964441311864e-03
    -9.670754458511732e-04
    -2.318099120110578e-03
    -3.790440931364381e-04
    -1.829261166003460e-03
     2.886726847434185e-05
    -1.518215002944249e-03
     3.221597471673027e-04
    -1.333222750709490e-03
     5.464285907102531e-04
    -1.241684023138803e-03
     7.348121038171784e-04
    -1.224950352986393e-03
     9.135376797942685e-04
    -1.275590252356906e-03
     1.106646959079052e-03
    -1.396946407229788e-03
     1.341113445572546e-03
    -1.605623458577374e-03
     1.654328892160460e-03
    -1.939197745498285e-03
     2.108356413570513e-03
    -2.475923795870458e-03
     2.823178637476942e-03
    -3.388181723048556e-03
     4.070822961638798e-03
    -5.114841703537848e-03
     6.619967098659654e-03
    -9.126133643749153e-03
     1.374217428191916e-02
%}
On Wed, 17 Oct 2012 14:13:51 -0500, all4dsp wrote:

> I included a Matlab program below you can run to illustrate my question > (or just look at the output I placed in the comments below the code). > Note that I'm using a small set of data below to simplify this question > (the real app uses much more data). > > This code: > > 1. zero pads an example (all real) 10-element array 2. takes its FFT > 3. creates a Laplace (s-domain) filter and applies it to the FFT data 4. > takes an IFFT > 5. observes the time-domain (filtered) result > > Looking at the time trend resulting output, I expected the filtered data > corresponding to the original zero-padded portion of the time trend to > be pretty close to zero. That is, if I have 10 original time-domain > samples and 64-10=54 zero-padded samples, I expected the last 54 samples > of the time-trend data to be, say, at least 10 orders of magnitude lower > than the filtered time data in the first 10 samples. I suspect some of > this relates to filter ramping up and down in the zero-padded portion, > but does this ramping up/down entirely explain this, or is there > something else going on? > > My application doesn't use the zero-padded portion of the time trend > data. My only concern is if the magnitude of noise that I see in the > zero-padded portion of the filtered time trend data ALSO exists in the > non-zero padded portion of the filtered time trend data (e.g. samples 1 > to 10). > > Could anyone comment on whether I messed something up in my analysis > below, and/or if there's reason to expect the noise seen in the > zero-padded portion will also appear in the non-zero padded portion of > the filtered time trend data? Thanks in advance for any comments.
<< code and data snipped >> The reason they're called infinite impulse response filters is because their response is -- well -- infinite. It looks like what you're calling "noise" is the exponential * sine settling of the filter, which is exactly what one would expect given what you're doing. -- My liberal friends think I'm a conservative kook. My conservative friends think I'm a liberal kook. Why am I not happy that they have found common ground? Tim Wescott, Communications, Control, Circuits & Software http://www.wescottdesign.com
Tim Wescott <tim@seemywebsite.com> wrote:
> On Wed, 17 Oct 2012 14:13:51 -0500, all4dsp wrote:
>> I included a Matlab program below you can run to illustrate my question >> (or just look at the output I placed in the comments below the code). >> Note that I'm using a small set of data below to simplify this question >> (the real app uses much more data).
>> This code:
>> 1. zero pads an example (all real) 10-element array 2. takes its FFT >> 3. creates a Laplace (s-domain) filter and applies it to the FFT data 4. >> takes an IFFT >> 5. observes the time-domain (filtered) result
>> Looking at the time trend resulting output, I expected the filtered data >> corresponding to the original zero-padded portion of the time trend to >> be pretty close to zero.
(even more snipped than before)
> << code and data snipped >>
> The reason they're called infinite impulse response filters is because > their response is -- well -- infinite.
(With a special definition for infinite in the periodic case.)
> It looks like what you're calling "noise" is the exponential * sine > settling of the filter, which is exactly what one would expect given what > you're doing.
So, to the OPs question, it is deterministic noise, not stochastic noise. That doesn't make it less noisy, but you should understand the difference. -- glen
On Wed, 17 Oct 2012 22:01:03 +0000, glen herrmannsfeldt wrote:

> Tim Wescott <tim@seemywebsite.com> wrote: >> On Wed, 17 Oct 2012 14:13:51 -0500, all4dsp wrote: > >>> I included a Matlab program below you can run to illustrate my >>> question (or just look at the output I placed in the comments below >>> the code). Note that I'm using a small set of data below to simplify >>> this question (the real app uses much more data). > >>> This code: > >>> 1. zero pads an example (all real) 10-element array 2. takes its FFT >>> 3. creates a Laplace (s-domain) filter and applies it to the FFT data >>> 4. takes an IFFT >>> 5. observes the time-domain (filtered) result > >>> Looking at the time trend resulting output, I expected the filtered >>> data corresponding to the original zero-padded portion of the time >>> trend to be pretty close to zero. > > (even more snipped than before) > >> << code and data snipped >> > >> The reason they're called infinite impulse response filters is because >> their response is -- well -- infinite. > > (With a special definition for infinite in the periodic case.) > >> It looks like what you're calling "noise" is the exponential * sine >> settling of the filter, which is exactly what one would expect given >> what you're doing. > > So, to the OPs question, it is deterministic noise, not stochastic > noise. > > That doesn't make it less noisy, but you should understand the > difference.
I think I'm in an argumentative mood today: your "deterministic noise" is my "unwanted signal". Not that what you call it makes a qualitative difference to what the system is doing, but it whether its stochastic or deterministic certainly makes a BIG difference in how you should treat the whatever-you-call-it (and how surprised you should be when it shows up in your FFT results). -- My liberal friends think I'm a conservative kook. My conservative friends think I'm a liberal kook. Why am I not happy that they have found common ground? Tim Wescott, Communications, Control, Circuits & Software http://www.wescottdesign.com
On Wed, 17 Oct 2012 16:03:38 -0500, Tim Wescott wrote:

> On Wed, 17 Oct 2012 14:13:51 -0500, all4dsp wrote: > >> I included a Matlab program below you can run to illustrate my question >> (or just look at the output I placed in the comments below the code). >> Note that I'm using a small set of data below to simplify this question >> (the real app uses much more data). >> >> This code: >> >> 1. zero pads an example (all real) 10-element array 2. takes its FFT 3. >> creates a Laplace (s-domain) filter and applies it to the FFT data 4. >> takes an IFFT >> 5. observes the time-domain (filtered) result >> >> Looking at the time trend resulting output, I expected the filtered >> data corresponding to the original zero-padded portion of the time >> trend to be pretty close to zero. That is, if I have 10 original >> time-domain samples and 64-10=54 zero-padded samples, I expected the >> last 54 samples of the time-trend data to be, say, at least 10 orders >> of magnitude lower than the filtered time data in the first 10 samples. >> I suspect some of this relates to filter ramping up and down in the >> zero-padded portion, but does this ramping up/down entirely explain >> this, or is there something else going on? >> >> My application doesn't use the zero-padded portion of the time trend >> data. My only concern is if the magnitude of noise that I see in the >> zero-padded portion of the filtered time trend data ALSO exists in the >> non-zero padded portion of the filtered time trend data (e.g. samples 1 >> to 10). >> >> Could anyone comment on whether I messed something up in my analysis >> below, and/or if there's reason to expect the noise seen in the >> zero-padded portion will also appear in the non-zero padded portion of >> the filtered time trend data? Thanks in advance for any comments. > > << code and data snipped >> > > The reason they're called infinite impulse response filters is because > their response is -- well -- infinite. > > It looks like what you're calling "noise" is the exponential * sine > settling of the filter, which is exactly what one would expect given > what you're doing.
I never answered the last part of your question: to the extent that you're calling the expected tail of the LTI system response "noise", yes, you should expect the "noise" from the expected start-up transient of the LTI system to be present as well. Perhaps if you back up a few steps, and instead of telling us about a few misplaced trees, you tell us about the forest you're dealing with and what you want to do with it: given your surprise at the behavior of your system, it may be a good idea for us to look at your whole approach. -- My liberal friends think I'm a conservative kook. My conservative friends think I'm a liberal kook. Why am I not happy that they have found common ground? Tim Wescott, Communications, Control, Circuits & Software http://www.wescottdesign.com
>On Wed, 17 Oct 2012 16:03:38 -0500, Tim Wescott wrote: > >> On Wed, 17 Oct 2012 14:13:51 -0500, all4dsp wrote: >> >>> I included a Matlab program below you can run to illustrate my
question
>>> (or just look at the output I placed in the comments below the code). >>> Note that I'm using a small set of data below to simplify this
question
>>> (the real app uses much more data). >>>
<snipped>
>Perhaps if you back up a few steps, and instead of telling us about a few
>misplaced trees, you tell us about the forest you're dealing with and >what you want to do with it: given your surprise at the behavior of your >system, it may be a good idea for us to look at your whole approach. >
Thanks Tim, the Matlab code pretty much summarizes my "whole approach": I have a time sequence of data that I need to accurately filter, but the filter I'm given is specified in the s (e.g. Laplace) domain. As shown, I simply zero-pad then FFT the time data to get it in the frequency domain. Then I create an s-domain filter using 100% of the FFT bins, perform multiplication (e.g. I believe this is called frequency convolution), then take the IFFT and review the results. I'm only interested in the filtered time-domain signal outside of the zero-padded region (the noise in the zero-padded region is a "don't care," but if similar noise exists in the non-zero padded region, that would be good to know; I can live with say 1% error, but if it approaches, say, 10%, that's a problem). Basically, I'm trying to figure out if this an accurate procedure or not (e.g. for example, does circular convolution, or other theory, lead to noise everywhere and corrupt the data to the point it's inaccurate, etc.). I'm not an expert on DSP. I'm trying to get a sense of any noise that appears in the non-zero-padded region of the filtered time domain signal, to understand if this approach is OK to use. When I saw the zero-padded region of the filtered time domain signal higher than expected, it led me to ask here whether or not I should expect to see the same noise everywhere. As an aside, since I'm not that experienced in DSP, I wonder if anyone could comment if this Matlab procedure I'm using to filter a time sequence using s-domain filter is something you've seen people use successfully in the past, or am I totally making up something unconventional here? I'd like to follow convention as much as possible, but I'm stuck with the filter being defined in the s-domain, my input signal in time domain, and the output signal in the time domain.
On Wed, 17 Oct 2012 18:32:03 -0500, all4dsp wrote:

>>On Wed, 17 Oct 2012 16:03:38 -0500, Tim Wescott wrote: >> >>> On Wed, 17 Oct 2012 14:13:51 -0500, all4dsp wrote: >>> >>>> I included a Matlab program below you can run to illustrate my > question >>>> (or just look at the output I placed in the comments below the code). >>>> Note that I'm using a small set of data below to simplify this > question >>>> (the real app uses much more data). >>>> > <snipped> >>Perhaps if you back up a few steps, and instead of telling us about a >>few > >>misplaced trees, you tell us about the forest you're dealing with and >>what you want to do with it: given your surprise at the behavior of your >>system, it may be a good idea for us to look at your whole approach. >> >> > Thanks Tim, the Matlab code pretty much summarizes my "whole approach":
Well, with all due respect a "whole approach" description would start out something like "I am filtering altitude information from an airplane to determine if it's going to crash into a mountain" or "I have an audio recording from which I am trying to remove..." etc.
> I have a time sequence of data that I need to accurately filter, but the > filter I'm given is specified in the s (e.g. Laplace) domain. As shown, > I simply zero-pad then FFT the time data to get it in the frequency > domain. Then I create an s-domain filter using 100% of the FFT bins, > perform multiplication (e.g. I believe this is called frequency > convolution), then take the IFFT and review the results. > > I'm only interested in the filtered time-domain signal outside of the > zero-padded region (the noise in the zero-padded region is a "don't > care," but if similar noise exists in the non-zero padded region, that > would be good to know; I can live with say 1% error, but if it > approaches, say, 10%, that's a problem).
I really want to help you, but given that you don't seem to be getting the difference between a transient and noise, and since you're not letting us onto what you're actually _doing_ (no one except for students just filter sequences of numbers, after all), I can't give you any informed opinion. 1% error from what? Your input signal? Your desired signal? What is your desired signal? Etc.
> Basically, I'm trying to figure out if this an accurate procedure or not > (e.g. for example, does circular convolution, or other theory, lead to > noise everywhere and corrupt the data to the point it's inaccurate, > etc.). I'm not an expert on DSP. I'm trying to get a sense of any noise > that appears in the non-zero-padded region of the filtered time domain > signal, to understand if this approach is OK to use. When I saw the > zero-padded region of the filtered time domain signal higher than > expected, it led me to ask here whether or not I should expect to see > the same noise everywhere.
You will see the _transient_ from the filter continue on after the original signal has gone to zero. Its envelope will taper off exponentially. You've already used the key word here: it's _circular_ convolution. That means (among other things) that the transient from the filter will wrap around into your desired output data. If you take your frequency-domain filter representation and do an IFFT on it, you'll have the filter's time-domain impulse response. From this, you can estimate whether the impulse response dies out in the space of your IFFT, and if so, how much time it takes. Once you know how much time the filter's transient takes to die off, put that many zeros after your data (you don't need to zero-pad in front: it's _circular_ convolution, so front-padding or end-padding just changes where the output starts, not what bleeds into your desired signal). When you see a filtered output that dies out to zero _before_ it gets to the end of your zero-padded region, then you know that it won't be wrapping around.
> > As an aside, since I'm not that experienced in DSP, I wonder if anyone > could comment if this Matlab procedure I'm using to filter a time > sequence using s-domain filter is something you've seen people use > successfully in the past, or am I totally making up something > unconventional here? I'd like to follow convention as much as possible, > but I'm stuck with the filter being defined in the s-domain, my input > signal in time domain, and the output signal in the time domain.
It is a bit unconventional, and there are pitfalls to what you're doing (mostly involving the fact that your sampled time sequence input and output aren't what a continuous-time filter either sees or spits out). As long as your sampling rate is high enough compared to your filter (and signal) bandwidth there's a good chance that you're OK. -- My liberal friends think I'm a conservative kook. My conservative friends think I'm a liberal kook. Why am I not happy that they have found common ground? Tim Wescott, Communications, Control, Circuits & Software http://www.wescottdesign.com
Tim Wescott <tim@seemywebsite.com> wrote:

(snip)

>>>> Looking at the time trend resulting output, I expected the filtered >>>> data corresponding to the original zero-padded portion of the time >>>> trend to be pretty close to zero.
(even more snipped than before) (snip)
>>> It looks like what you're calling "noise" is the exponential * sine >>> settling of the filter, which is exactly what one would expect given >>> what you're doing.
>> So, to the OPs question, it is deterministic noise, not stochastic >> noise.
>> That doesn't make it less noisy, but you should understand the >> difference.
> I think I'm in an argumentative mood today: your "deterministic noise" is > my "unwanted signal".
First, some here might be interested, and not know about this yet: http://work.caltech.edu/telecourse.html It is a little late for this session, but will likely start again next year. One of the lectures covers stochastic and deterministic noise. (In physics, they are random and systematic error. One can be reduced by averaging, the other can't be.)
> Not that what you call it makes a qualitative difference to what the > system is doing, but it whether its stochastic or deterministic certainly > makes a BIG difference in how you should treat the whatever-you-call-it > (and how surprised you should be when it shows up in your FFT results).
Yes. Well, pretty often in signal processing you get random noise. The low bits of your 32 bit ADC are likely pretty random. (Does anyone make one yet?) In this case, if you do the same transform again, you will get the same values. -- glen
Tim Wescott <tim@seemywebsite.com> wrote:
> On Wed, 17 Oct 2012 18:32:03 -0500, all4dsp wrote:
(snip)
> Well, with all due respect a "whole approach" description would start out > something like "I am filtering altitude information from an airplane to > determine if it's going to crash into a mountain" or "I have an audio > recording from which I am trying to remove..." etc.
>> I have a time sequence of data that I need to accurately filter, but the >> filter I'm given is specified in the s (e.g. Laplace) domain. As shown, >> I simply zero-pad then FFT the time data to get it in the frequency >> domain. Then I create an s-domain filter using 100% of the FFT bins, >> perform multiplication (e.g. I believe this is called frequency >> convolution), then take the IFFT and review the results.
>> I'm only interested in the filtered time-domain signal outside of the >> zero-padded region (the noise in the zero-padded region is a "don't >> care," but if similar noise exists in the non-zero padded region, that >> would be good to know; I can live with say 1% error, but if it >> approaches, say, 10%, that's a problem).
Well, the problem with filter transients is general. For an RLC filter, the inititial voltage on the capacitor and current in the inductor (likely zero, but you never know) cause a transient. You have to filter long enough for that to decay away. (If you want the steady-state solution.)
> I really want to help you, but given that you don't seem to be getting > the difference between a transient and noise, and since you're not > letting us onto what you're actually _doing_ (no one except for students > just filter sequences of numbers, after all), I can't give you any > informed opinion.
> 1% error from what? Your input signal? Your desired signal? What is > your desired signal? Etc.
>> Basically, I'm trying to figure out if this an accurate procedure or not >> (e.g. for example, does circular convolution, or other theory, lead to >> noise everywhere and corrupt the data to the point it's inaccurate, >> etc.).
(snip)
> You will see the _transient_ from the filter continue on after the > original signal has gone to zero. Its envelope will taper off > exponentially.
> You've already used the key word here: it's _circular_ convolution. That > means (among other things) that the transient from the filter will wrap > around into your desired output data.
Yes. Unlike an RLC analog filter, or IIR or FIR digital filter, the transient in this case isn't causal. It can affect samples before, not just including the circular convolution. Well, more specifically, non-causality is a feature of this type of filter.
> If you take your frequency-domain filter representation and do an IFFT on > it, you'll have the filter's time-domain impulse response. From this, > you can estimate whether the impulse response dies out in the space of > your IFFT, and if so, how much time it takes.
Yes, you should do that.
> Once you know how much time the filter's transient takes to die off, put > that many zeros after your data (you don't need to zero-pad in front: > it's _circular_ convolution, so front-padding or end-padding just changes > where the output starts, not what bleeds into your desired signal).
> When you see a filtered output that dies out to zero _before_ it gets to > the end of your zero-padded region, then you know that it won't be > wrapping around.
>> As an aside, since I'm not that experienced in DSP, I wonder if anyone >> could comment if this Matlab procedure I'm using to filter a time >> sequence using s-domain filter is something you've seen people use >> successfully in the past, or am I totally making up something >> unconventional here? I'd like to follow convention as much as possible, >> but I'm stuck with the filter being defined in the s-domain, my input >> signal in time domain, and the output signal in the time domain.
> It is a bit unconventional, and there are pitfalls to what you're doing > (mostly involving the fact that your sampled time sequence input and > output aren't what a continuous-time filter either sees or spits out). > As long as your sampling rate is high enough compared to your filter (and > signal) bandwidth there's a good chance that you're OK.
I agree. -- glen
>> It is a bit unconventional, and there are pitfalls to what you're doing
>> (mostly involving the fact that your sampled time sequence input and >> output aren't what a continuous-time filter either sees or spits out).
>> As long as your sampling rate is high enough compared to your filter
(and
>> signal) bandwidth there's a good chance that you're OK. > >I agree. > >-- glen >
I understand one wants the sampling rate higher than the signal bandwidth to prevent aliasing. But, what problem arrises as the sampling rate approaches the filter bandwidth? And, generally, what would be a good minimum ratio of sampling rate to filter bandwidth?