DSPRelated.com
Forums

cross correlation and FFT problem

Started by signal July 4, 2005
Hi my friends,
I can't find a solution for this, so I'll give you an example.
I have two signals:
x=[1 0 1 1 2 3 4 4 1 0 1 0 1];
y=[1 1 1 1 1 1 2 3 4 4 1 0 1 0 1];
I calculated the cross correlation with xcorr in MATLAB:
z=xcorr(x,y);   % length(z) = 2*max(length(x),length(y))-1
[m,I]=max(z)
m=51, I=17
I'm trying to compare this to rxy computed with the FFT:
L=max(length(x),length(y));
NFFT=2*L;
xx=fft(x,NFFT);
yy=fft(y,NFFT);
Sxy=xx.*conj(yy);
rxy=fftshift(real(ifft(Sxy)));
[k,J]=max(rxy)
k=51
J=14
My question is: why is I different from J?
Maybe the NFFT is not correct. Can you help me please?
Thanks a lot


This message was sent using the Comp.DSP web interface on www.DSPRelated.com

Hi,

Be careful about the length of the FFT; you need a length of at least
the sum of the lengths of the two sequences minus one. The reason is
that the frequency domain operation is really a circular convolution,
so a too-short FFT makes the result wrap around in time.
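
For the example in the original post, here is a minimal sketch (MATLAB/Octave)
of a zero-padded FFT cross-correlation that should line up with xcorr; the
nextpow2 choice and the lag reordering are just one reasonable way to do it:

x = [1 0 1 1 2 3 4 4 1 0 1 0 1];
y = [1 1 1 1 1 1 2 3 4 4 1 0 1 0 1];
L    = max(length(x), length(y));
N    = length(x) + length(y) - 1;     % minimum FFT length for the linear correlation
NFFT = 2^nextpow2(N);                 % any NFFT >= N avoids the circular wrap-around
Sxy  = fft(x, NFFT) .* conj(fft(y, NFFT));
r    = real(ifft(Sxy));               % r(1) is lag 0; negative lags are wrapped to the end
rxy  = [r(NFFT-L+2:NFFT), r(1:L)];    % reorder to lags -(L-1) ... (L-1), the xcorr ordering
[k, J] = max(rxy)                     % J should now match the I returned by [m,I]=max(xcorr(x,y))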

If you end up requiring a weird FFT length, you may want to take a look
at my FFT routine, available from my homepage.

Best regards,
Jens J. Nielsen
http://home.get2net.dk/jjn


Jens Jørgen Nielsen wrote:
> Be careful about the length of the FFT; you need a length of at least
> the sum of the lengths of the two sequences minus one. The reason is
> that the frequency domain operation is really a circular convolution.

A rather useless operation IMHO. :-)


Bob
--
"Things should be described as simply as possible, but no simpler." A. Einstein
Are there any good examples (like commercial products or standards)
using the frequency domain for efficient convolution/filtering?

I guess that's what Bob's comment is about!?

Jens

I tend to agree with Bob's comment about doing convolution in the
frequency domain. Even though it's theoretically more efficient than
the time domain operation in certain cases, I'm not familiar with any
practical implementations.

I disagree with the conclusion.  Jens, at least you are more diplomatic 
in your opinions.  Bob, you shouldn't claim something is useless, just 
because you can't find a use for it.

I can see how you'd form your opinion *if* you've mainly worked in the 
fixed-point world.

I assume everyone here knows that fast convolution filtering becomes 
increasingly more efficient for longer filter lengths.  Below some 
crossover point, direct convolution is more efficient. Above that point, 
fast convolution is quicker. I will not spend time talking about this 
point, since I doubt it is contested.
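
A rough way to locate that crossover point on a given machine is simply to
time both paths. A minimal sketch (MATLAB/Octave), assuming fftfilt (the
overlap-add FFT filter from the Signal Processing Toolbox) is available:

x = randn(1, 1e6);                          % long test signal
for M = [16 32 64 128 256 512 1024]         % filter lengths to try
    h = randn(1, M);
    tic; y1 = filter(h, 1, x);  t_direct = toc;   % direct (time-domain) FIR
    tic; y2 = fftfilt(h, x);    t_fast   = toc;   % fast convolution (overlap-add)
    fprintf('%5d taps: direct %.3f s, fast %.3f s\n', M, t_direct, t_fast);
end
% y1 and y2 should agree to within numerical precision.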


For a fixed point DSP processor, fast convolution has a couple of 
strikes against it:
1) Lack of dynamic range in a fixed-point FFT. This prohibits the large
FFTs required for large filters.
2) DSP chips are already optimized for direct FIR computation/correlation.
   Most (all) DSP chips have a single-cycle multiply-accumulate 
operation.  This moves the crossover point for direct vs. fast 
convolution much higher than the theoretical 25-30 taps.
3) Such systems are usually memory-poor.
4) For the applications built on fixed-point DSPs, latency requirements 
are just as important as throughput requirements.

These factors act to squeeze down the useful range of fast convolution. 
The filters cannot be too big, because the FFT turns to mush.  For a 
small filter, you'd be better off using direct computation. ON A FIXED 
POINT PLATFORM!

On the other hand:
For floating point on a desktop/server processor, fast convolution is my 
filtering tool of choice.  Unless you are using REALLY short filters, 
you'd probably be better off with fast convolution.

1) The dynamic range of a decent floating point FFT is not really an issue.
2) Efficient SIMD FFT libraries such as FFTW or IPP make the crossover 
point for direct vs fast filtering much lower than the theoretical 25-30 
taps, more like 16 taps.
3) The extra memory required for fast convolution is negligible on a 
desktop/server.
4) Desktop/server DSP applications are often file-based. For these, 
latency is a non-issue.


-- Mark



Jens Jørgen Nielsen wrote:
> Are there any good examples (like commercial products or standards)
> using the frequency domain for efficient convolution/filtering?
>
> I guess that's what Bob's comment is about!?
>
> I tend to agree with Bob's comment about doing convolution in the
> frequency domain. Even though it's theoretically more efficient than
> the time domain operation in certain cases, I'm not familiar with any
> practical implementations.
[]

Oops! My comment was with respect to doing non-extended, circular
convolution. Doing convolution in the frequency domain with overlap
add/save is _very_ efficient for long kernels compared to doing it in
the time domain.


Bob
--
"Things should be described as simply as possible, but no simpler." A. Einstein
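
For reference, a minimal overlap-add sketch (MATLAB/Octave) along the lines
Bob describes; the block length (4*M) and the FFT size are illustrative
choices, not anything specified in the thread:

function y = ola_fir(h, x)
% Overlap-add FIR filtering: returns conv(h, x), computed block-by-block with FFTs.
  M    = length(h);                        % kernel length
  L    = 4*M;                              % new input samples per block (a common rule of thumb)
  NFFT = 2^nextpow2(L + M - 1);            % large enough for the linear convolution of one block
  H    = fft(h(:).', NFFT);                % kernel spectrum, computed once
  x    = x(:).';
  y    = zeros(1, length(x) + M - 1);
  for n = 1:L:length(x)
    seg    = x(n : min(n+L-1, length(x)));        % next block of input
    yblk   = real(ifft(fft(seg, NFFT) .* H));     % linear convolution of this block (real signals assumed)
    idx    = n : min(n+NFFT-1, length(y));
    y(idx) = y(idx) + yblk(1:length(idx));        % overlap-add the block's result into the output
  end
end
% Usage: ola_fir(h, x) should match conv(h, x) to within numerical precision.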

Mark Borgerding wrote:
> I disagree with the conclusion. Jens, at least you are more diplomatic
> in your opinions. Bob, you shouldn't claim something is useless, just
> because you can't find a use for it.

Sorry, my terse comment was ambiguous. See my response to Jens.


Bob
--
"Things should be described as simply as possible, but no simpler." A. Einstein
Mark Borgerding wrote:
> For a fixed point DSP processor, fast convolution has a couple of
> strikes against it:
[]

Thanks for a thoughtful posting, Mark. I would add one more item to your
strike list:

5) Optimized FFT libraries for fixed-point DSP chips are usually limited
to powers of two up to 1k or so.

I would also point out that modern fixed-point DSP chips can directly
convolve in less than one clock per tap, either by using specialized
instructions for symmetric coefficients or parallel MAC units.

If there is a rate change in the FIR, the crossover point moves yet
again, because the linearly reduced (increased) MAC load has to be
compared to the reduced (increased) IFFT size.

John
john wrote:

> Thanks for a thoughtful posting, Mark. I would add one more item to
> your strike list:
>
> 5) Optimized FFT libraries for fixed-point DSP chips are usually
> limited to powers of two up to 1k or so.
The power-of-2 limitation is not really an issue for fast convolution. But the overall size limitation certainly sets an upper bound on filter lengths.
> I would also point out that modern fixed-point DSP chips can directly
> convolve in less than one clock per tap, either by using specialized
> instructions for symmetric coefficients or parallel MAC units.
Good point. I should've said, "single cycle multiply-accumulate *or* *better*"
> If there is a rate change in the FIR, the crossover point moves yet
> again, because the linearly reduced (increased) MAC load has to be
> compared to the reduced (increased) IFFT size.
Interesting. This effect would be observed regardless of floating vs.
fixed point.

It can be mitigated by the fact that the forward & inverse FFTs don't
need to be the same size. For details, keep an eye out for my article in
next May's issue of IEEE Signal Processing Magazine. </PLUG>

If I am thinking clearly, the direction that the crossover point moves
would depend on the direction of the rate change (upsampling vs.
downsampling). Upsampling should favor fast convolution, while
downsampling should favor direct convolution.

Let me try to spell out my thoughts. The following compares the CPU cost
(work) involved in non-rate-changing filtering vs. a 1-stage (not
multirate) interpolator or decimator (complete with anti-aliasing), with
either direct or fast convolution.

With direct convolution:
  decimation by D
    anti-aliasing filter is designed at the higher (input) rate
    scales the processor work by 1/D (computed for each output sample)
  interpolation by I
    anti-aliasing filter is designed at the higher (output) rate
    scales the processor work by I (again, for each output sample)

With fast convolution:
  decimation by D
    anti-aliasing filter is designed at the higher (input) rate
    forward FFT is the same size (as with no rate change), inverse FFT is 1/D the size
    scales the work by roughly 0.5 + 0.5/D,
    i.e. you can *never* get rid of the first half of the work
  interpolation by I
    anti-aliasing filter only needs to be designed at the *lower* (input) rate
    forward FFT is the same size, inverse FFT is I times bigger (zero padded)
    scales the work by roughly 0.5 + 0.5*I
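
To put rough numbers on those scale factors (back-of-the-envelope only,
using nothing beyond the approximations above):

D = [2 4 8];                      % decimation factors
I = [2 4 8];                      % interpolation factors
direct_dec = 1 ./ D               % direct convolution: work scales by 1/D
fast_dec   = 0.5 + 0.5 ./ D       % fast convolution: work scales by ~0.5 + 0.5/D
direct_int = I                    % direct convolution: work scales by I
fast_int   = 0.5 + 0.5 .* I       % fast convolution: work scales by ~0.5 + 0.5*I
% e.g. for D = 4 the direct work drops to 0.25x but fast convolution only to 0.625x,
% so decimation shifts the crossover point toward direct convolution; for I = 4 the
% direct work grows to 4x but fast convolution only to 2.5x, so interpolation
% shifts it toward fast convolution.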
> John

Good to hear from you!  Drop me a line.  I only have your work email.

--
Mark Borgerding
Mark Borgerding wrote:
> The power-of-2 limitation is not really an issue for fast convolution.
> But the overall size limitation certainly sets an upper bound on
> filter lengths.

For a non-power-of-2 rate change, the IFFT might not be available, though.

> With direct convolution:
>   decimation by D
>     anti-aliasing filter is designed at the higher (input) rate
>     scales the processor work by 1/D (computed for each output sample)
>   interpolation by I
>     anti-aliasing filter is designed at the higher (output) rate
>     scales the processor work by I (again, for each output sample)
[]

The work scales down in both the D and I cases (I misspoke on this
point). In the former case, every Dth input is clocked into a
full-length MAC loop. In the latter case, all inputs are clocked into a
MAC loop that is reduced by I by eliminating the 0*tap multiplies (using
the appropriate tap subset each time).

As you know, the convolution method can be attractive where a fractional
rate change is desired.

John
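
A tiny sketch (MATLAB/Octave) of the tap-subset idea john describes, i.e. a
polyphase interpolator; the function below is purely illustrative:

function y = polyphase_interp(h, x, I)
% Interpolate x by I: conceptually insert I-1 zeros between samples and filter
% with h, but skip the 0*tap multiplies by splitting h into I polyphase branches.
  M  = ceil(length(h) / I);
  hp = reshape([h(:); zeros(M*I - length(h), 1)], I, M);  % branch p holds taps h(p), h(p+I), ...
  y  = zeros(1, I * length(x));
  for p = 1:I
    y(p : I : end) = filter(hp(p, :), 1, x(:).');         % each output phase is a short FIR on the input
  end
end
% Usage: polyphase_interp(h, x, I) should match filter(h, 1, upsample(x, I))
% (where upsample is available), at a fraction of the multiplies.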