I searched the groups for fast convolutions, but did not find an answer relevant to my research, so here is the question.

I use a Gaussian to smooth the image. Making the filter separable (in row and column) is the normal approach (unless you have a lot of time to process a full 2D Gaussian). I am NOT doing recursive filtering, just filtering once. Some of the approaches I came across are:

1. FFTs/DFTs/overlap-add/overlap-save methods -> don't think it's worth the pain, as the Gaussian kernel is small (less than 10 samples). Maybe more time would be spent transforming the signals; I don't know, but I did not try it yet. I also remember something taught in school, that it would not be a good idea to transform the signals for small sequences.

2. SIMD/MIMD approaches. Sounds reasonable and logical. I use Intel MMX / Windows -> VC and want to get some references for SIMD/MIMD if this is a good approach.

3. Specialized hardware/FPGAs/DSPs. I cannot afford it.

Are there any other approaches that I am missing? If yes, please also point me to some references.

My other thought was this: the Gaussian is symmetrical, say g[-3], g[-2], g[-1], g[0], g[1], g[2], g[3], i.e. g[i] = g[-i]. Say the data sequence is data[0], data[1], ..., data[250], and data[i] is the current point under consideration. data[i-1] is multiplied by g[1] to give, say, A, and data[i+1] is multiplied by g[-1] to give, say, B. When the current point shifts to data[i+2], data[i+1] is multiplied by g[1], which (since g[1] = g[-1]) is the same product as B, so this multiplication need not be performed. So essentially only about half the multiplications, roughly (N+1)/2 instead of N per point, are performed for every point that is convolved, because of the symmetry of the Gaussian. (The same holds true for the derivative of the Gaussian, but with a sign change, since it is antisymmetric.)

Even though my idea is amorphous at this point, I don't want to re-invent the wheel. Does anybody have any references, or a better method, for faster convolution?

Thanks for all your help,
Ram
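A minimal sketch of the halved-multiplication idea described in the post above: instead of caching products across output positions, the same multiply count can be reached by adding the two mirrored samples first and multiplying the pair by the shared coefficient once. The function name, the `g[0..r]` half-kernel layout, and the clamped (edge-replicating) border handling are my own choices, not from the post.

```c
#include <stddef.h>

/* Smooth one row with a symmetric kernel given as its right half
 * g[0..r] (g[0] is the centre tap, g[j] stands for both g[j] and g[-j]).
 * Pairing mirrored samples gives r+1 multiplications per output sample
 * instead of 2r+1. Borders are clamped (edge samples replicated). */
void smooth_symmetric(const float *in, float *out, size_t n,
                      const float *g, int r)
{
    for (size_t i = 0; i < n; i++) {
        float acc = g[0] * in[i];                       /* centre tap */
        for (int j = 1; j <= r; j++) {
            size_t lo = (i >= (size_t)j) ? i - j : 0;   /* clamp left  */
            size_t hi = (i + (size_t)j < n) ? i + j : n - 1; /* clamp right */
            acc += g[j] * (in[lo] + in[hi]);  /* one multiply per pair */
        }
        out[i] = acc;
    }
}
```

For a separable 2D smoothing this would be run once over each row and once over each column of the transposed (or strided) result.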
Fast Convolution
Started by ●July 22, 2004
Reply by ●July 22, 2004
ip4ram@yahoo.com writes:

> 1. FFTs/DFTs/overlap-add/overlap-save methods -> don't think it's
> worth the pain, as the Gaussian kernel is small (less than 10
> samples). Maybe more time would be spent transforming the signals;
> I don't know, but I did not try it yet. I also remember something
> taught in school, that it would not be a good idea to transform the
> signals for small sequences.
>
> Are there any other approaches that I am missing? If yes, please
> also point me to some references.

Ok, here is a simple idea for a kernel of about 3*k samples in size: approximate the Gaussian with 3 piecewise quadratic curves. You can do that in the following manner:

    s0 = s1 = s2 = 0;  /* actually, this works even with arbitrary initialization! */
    for (i = 0; i < n; i++)
        a[i] = s2 += s1 += s0 += a[i];
    for (i = 0; i < n - 3*k; i++)
        a[i] = a[i+3*k] - 3*a[i+2*k] + 3*a[i+k] - a[i];

You definitely need to use some integer or other exact data type to do this (or the exploding s0, s1, s2 terms will not cancel out numerically), and the result has to be scaled down again afterwards.

This same "sparse" approximation technique can be used for arbitrarily shaped kernels that are approximated with B-splines of some fixed degree. The first for-loop basically convolves the input with the base B-spline; the second one then convolves with a sparse pulse train that generates the kernel shape from the basic B-spline.

Actually, this is my own idea: I have not found it in the literature. But it is a fairly obvious trick, so that is probably just for lack of looking.

-- David Kastrup, Kriemhildstr. 15, 44793 Bochum
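A self-contained sketch of the repeated-running-sum trick from the reply above, under my own naming and buffer conventions (not from the post). Three cascaded running sums followed by a sparse third difference at stride k convolve the signal with a piecewise-quadratic B-spline pulse of width 3k, a rough Gaussian. Exact integer arithmetic lets the growing partial sums cancel; the result carries a DC gain of k*k*k that the caller must scale out.

```c
#include <stddef.h>
#include <stdint.h>

/* Smooth in[0..n-1] with a width-3k quadratic B-spline (three cascaded
 * width-k box filters). out must have room for n entries; only the first
 * n - 3*k entries hold valid (unscaled, gain k^3) results. */
void bspline_smooth(const int32_t *in, int64_t *out, size_t n, size_t k)
{
    int64_t s0 = 0, s1 = 0, s2 = 0;   /* cascaded accumulators */
    int64_t *tmp = out;               /* reuse out as scratch  */

    /* triple running sum: 1 / (1 - z^-1)^3 */
    for (size_t i = 0; i < n; i++)
        tmp[i] = (s2 += (s1 += (s0 += in[i])));

    /* sparse third difference (1 - z^-k)^3 cancels the three summations;
     * writing at i while reading at i+k, i+2k, i+3k keeps this in-place */
    for (size_t i = 0; i + 3 * k < n; i++)
        out[i] = tmp[i + 3 * k] - 3 * tmp[i + 2 * k] + 3 * tmp[i + k] - tmp[i];
}
```

On a constant input every valid output equals k^3, and the impulse response for k = 2 is the expected {1, 3, 3, 1} pulse (the width-2 box convolved with itself three times), which is what makes the rescaling factor k^3 = 8.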